Tracking AI existential risk. Auto-aggregated headlines. Human-curated analysis.
AGGREGATING 47 SOURCES · UPDATED LIVE
Posts by
Zac Boring 13 days ago Analysis
Claude Fable 5 and Mythos 5: The System Card
via Substack Zvi [999] — First things first: Claude Fable 5 is the new best publicly available model.
Zac Boring 13 days ago Research
Building and evaluating model diffing agents
via Alignment Forum [999] — This is the second in a series of research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas. The first post can be found here.TL;DRIt is possible to build extremely simple agents that…
Zac Boring 13 days ago Research
Sympathy for both sides of the egregious misalignment debate
via Alignment Forum [999] — On one side of this debate is Yudkowsky & Soares, who think that (if AI progress continues) we’re on a direct path to egregiously-misaligned, scheming, out-of-control, rogue superintelligence (ASI), not even slightly nice, in the absence of…
Zac Boring 13 days ago Industry
The Download: “reprogramming” aging, and the hidden sense of interoception
via MIT Technology Review [4] — This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. Why “reprogramming” is the buzziest approach to reversing aging right now Earlier this week, Life…
Zac Boring 13 days ago Industry
Why “reprogramming” is the buzziest approach to reversing aging right now
via MIT Technology Review [4] — Earlier this week, Life Biosciences, a biotech company focused on reversing age-related diseases, announced that it had dosed its first volunteer. A person with glaucoma has had an experimental treatment injected straight into their eyeball. The…
Zac Boring 13 days ago Analysis
PSA: Almost nobody is working on alignment
via LessWrong AI [9] — People often assume that a large fraction of the AI safety community works on alignment. As far as we're aware, this is not true. Most people are not working on making sure superintelligent AIs are aligned with human values or follow human…
Zac Boring 13 days ago Research
From AGI to ASI
via ArXiv cs.AI [8] — Over the last decade, building human-level artificial general intelligence has moved from far-fetched speculation to being a concrete next-decade target for many of the largest AI organisations. Achieving this goal would have profound and far-reaching…
Zac Boring 14 days ago Analysis
AI #172: The First Fable
via Substack Zvi [999] — A lot happened this week, including a great trip out to Lighthaven.
Zac Boring 14 days ago Industry
The Download: soccer’s data renaissance and China’s big nuclear plans
via MIT Technology Review [4] — This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. Inside soccer’s data renaissance Imagine tuning in to the opening kickoff of a World Cup match and seeing a…
Zac Boring 14 days ago Industry
Google DeepMind is worried about what happens when millions of agents start to interact
via MIT Technology Review [10] — Google DeepMind is funding research into the potential dangers of millions of different AI agents interacting with each other online. According to Rohin Shah, who directs the company’s AGI safety and alignment research, the mass-market arrival of…
Zac Boring 14 days ago Industry
Inside soccer’s data renaissance
via MIT Technology Review [4] — Imagine tuning in to the opening kickoff of a World Cup match and seeing a player intentionally send the ball all the way down the pitch and right out of bounds on the opponent’s end. Casual fans might scratch their heads. Where’s the logic in…
Zac Boring 14 days ago Research
Models May Behave Worse When Eval Aware
via Alignment Forum [999] — This is the first in a series of research updates from the Google DeepMind Language Model Interpretability team, in interpretability and adjacent areas.TL;DRIt's often assumed that models will act more aligned when they can tell they're being…
Zac Boring 14 days ago Research
Position: Hippocampal Explicit Memory Is the Cornerstone for AGI
via ArXiv cs.AI [10] — Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, raising expectations for Artificial General Intelligence (AGI). This position paper argues that integrating explicit memory is the cornerstone for advancing LLMs…
Zac Boring 14 days ago Analysis
You Can Catch Sleeper Agents by Teaching Another Model to Imitate Them
via LessWrong AI [4] — Detecting Hidden Behaviors in LLMs via Activation-matched Finetuning — preprint, 2026. [Paper] [Code]TLDR. Given a model with some unknown, abnormal behavior (backdoors, censorship, reward hacking, ...), construct an aligned reference by training a clean…
Zac Boring 15 days ago Analysis
Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models
via LessWrong AI [3] — (see full author list at the end)About a year ago, METR showed that the length of tasks frontier models can reliably complete doubles every few months. A related safety-relevant question is this: what length of tasks can models complete without any chain…
Zac Boring 15 days ago Industry
The future of AI regulation is courting the strangest, most anxious bedfellows
via The Verge AI [3] — Hello and welcome to Regulator, a newsletter for Verge subscribers about tech politics, tech influence, and tech shenanigans in Washington, DC. (If you're not a subscriber, you can get on board here.) We're back after a two-week hiatus, during most of…
Zac Boring 15 days ago Research
Sequent: scale and automation for higher confidence in alignment
via Alignment Forum [999] — Alignment is not on trackArtificial superintelligence (ASI) may be developed in the next few years. It is unclear whether alignment is on track to be ready on the same timeframe. At a minimum, the empirical programs at AI labs are unlikely to deliver…
Zac Boring 15 days ago Research
Investing in multi-agent AI safety research
via DeepMind Blog [7] — Google DeepMind and partners announce a $10M funding call for multi-agent safety research.
Zac Boring 15 days ago Research
Tracing Eval-Awareness Emergence Through Training of OLMo 3
via Alignment Forum [999] — TL;DRRecent work from Goodfire & UK AISI – Verbalized Eval Awareness Inflates Measured Safety – shows that newer open-weight models verbalize evaluation-awareness (VEA) more often, and that this inflates measured safety. Between OLMo-3-32B-Think and…
Zac Boring 16 days ago Analysis
Three Labs With a Plan and A Memorandum
via Substack Zvi [999] — The big story today is the release of Claude Fable 5, the version of Claude Mythos that Anthropic believes they can safely distribute to the people.
Live Doom Meter
-- %
0% — We're fine 100% — GG
P(Doom) Scoreboard
0%25%50%75%100%
Loading estimates...