Tracking AI existential risk. Auto-aggregated headlines. Human-curated analysis.
AGGREGATING 47 SOURCES · UPDATED LIVE
Latest Headlines Auto-Updated
14 days ago Research
Schelling Goodness, and Shared Morality as a Goal
via Alignment Forum — Also available in markdown at theMultiplicity.ai/blog/schelling-goodness. This post explores a notion I'll call Schelling goodness. Claims of Schelling goodness are not first-order moral verdicts like "X is good" or "X is bad." They are claims about a class of hypothetical coordination games in the
14 days ago Analysis
Anthropic and the DoW: Anthropic Responds
via Substack Zvi [2] — The Department of War gave Anthropic until 5:01pm on Friday the 27th to either give the Pentagon ‘unfettered access’ to Claude for ‘all lawful uses,’ or else.
14 days ago Analysis Essential
New ARENA material: 8 exercise sets on alignment science & interpretability
via LessWrong AI [3] — TL;DR: This is a post announcing a lot of new ARENA material I've been working on for a while, which is now available for study here (currently on the alignment-science branch, but planned to be merged into main this Sunday). There's a set of exercises (each one contains about 1-2 days of material) on t
14 days ago Analysis
Sam Altman says OpenAI shares Anthropic's red lines in Pentagon fight
via LessWrong AI [4] — OpenAI CEO Sam Altman wrote in a memo to staff that he will draw the same red lines that sparked a high-stakes fight between rival Anthropic and the Pentagon: no AI for mass surveillance or autonomous lethal weapons. Why it matters: If other leading firms like Google follow suit, this could massively
15 days ago Research
ArchAgent: Agentic AI-driven Computer Architecture Discovery
via ArXiv cs.AI [4] — Agile hardware design flows are a critically needed force multiplier to meet the exploding demand for compute. Recently, agentic generative AI systems have demonstrated significant advances in algorithm design, improving code efficiency, and enabling d
15 days ago Research
Agent Behavioral Contracts: Formal Specification and Runtime Enforcement for Reliable Autonomous AI Agents
via ArXiv cs.AI [3] — Traditional software relies on contracts -- APIs, type systems, assertions -- to specify and enforce correct behavior. AI agents, by contrast, operate on prompts and natural language instructions with no formal behavioral specification. This gap is the
15 days ago Research
Why Did My Model Do That? Model Incrimination for Diagnosing LLM Misbehavior
via Alignment Forum [5] — Authors: Aditya Singh*, Gerson Kroiz*, Senthooran Rajamanoharan, Neel Nanda. Aditya and Gerson are co-first authors. This work was conducted during MATS 9.0 and was advised by Senthooran Rajamanoharan and Neel Nanda. Motivation: Imagine that a frontier lab's coding agent has been caught putting a bug in
15 days ago Analysis
AI #157: Burn the Boats
via Substack Zvi — Events continue to be fast and furious.
16 days ago Research
ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning
via ArXiv cs.AI [4] — Agentic reinforcement learning (ARL) has rapidly gained attention as a promising paradigm for training agents to solve complex, multi-step interactive tasks. Despite encouraging early results, ARL remains highly unstable, often leading to training coll
16 days ago Analysis
Anthropic and the Department of War
via Substack Zvi [2] — The situation in AI in 2026 is crazy.
16 days ago Industry
Amazon’s AGI lab leader is leaving
via The Verge AI [4] — After less than two years at Amazon, David Luan, the head of Amazon's San Francisco AI lab, is departing the company. Luan announced the update in a post on LinkedIn on Tuesday, saying, "I'll be leaving Amazon at the end of this week to cook up something new." He added that, "There's incredible work
16 days ago Analysis
Observations from Running an Agent Collective
via LessWrong AI [4] — I have 3 Claude Code instances running on an otherwise empty server with a shared Manifold Markets account. They have an internal messaging system for async communication. Observations from running this agent collective…
17 days ago Analysis
Claude Sonnet 4.6 Gives You Flexibility
via Substack Zvi [2] — Anthropic first gave us Claude Opus 4.6, then followed up with Claude Sonnet 4.6.
17 days ago Analysis
Citrini's Scenario Is A Great But Deeply Flawed Thought Experiment
via Substack Zvi — A thought experiment about AI safety scenarios and their implications for alignment research.
17 days ago Industry
Inside Anthropic’s existential negotiations with the Pentagon
via The Verge AI [2] — Anthropic's weekslong battle with the Department of Defense has played out over social media posts, admonishing public statements, and direct quotes from unnamed Pentagon officials to the news media. But the future of the $380 billion AI startup comes down to just three words: "any lawful use." The
18 days ago Analysis
AI Impact Summit 2026: A Field Report
via LessWrong AI — This post details our experience attending the AI Impact Summit and its associated side events in Delhi, February 2026. We are both unfamiliar with the policy and governance domain. This is just an honest reaction to attending these events; maybe there are 2nd order effects we…
18 days ago Analysis
The ML ontology and the alignment ontology
via LessWrong AI — This post contains some rough reflections on the alignment community trying to make its ontology legible to the mainstream ML community, and the lessons we should take from that experience. Historically, it was difficult for the alignment community to engage with the ML community…
18 days ago Research
Task-Aware Exploration via a Predictive Bisimulation Metric
via ArXiv cs.AI — Accelerating exploration in visual reinforcement learning under sparse rewards remains challenging due to the substantial task-irrelevant variations. Despite advances in intrinsic exploration, many methods either assume access to low-dimensional states
18 days ago Analysis
Bioanchors 2: Electric Bacilli
via LessWrong AI [9] — [Whenever discussing when AGI will come, it bears repeating: If anyone builds AGI, everyone dies; no one knows when AGI will be made, whether soon or late; a bunch of people and orgs are trying to make it; and they should stop and be stopped.] Arguments for fast AGI progress…
18 days ago Analysis
The persona selection model
via LessWrong AI [1] — TL;DR: We describe the persona selection model (PSM): the idea that LLMs learn to simulate diverse characters during pre-training, and post-training elicits and refines a particular such Assistant persona. Interactions with an AI assistant are then well-
Live Doom Meter (scale: 0% — We're fine · 100% — GG)
P(Doom) Scoreboard
Recent Voices
We are creating something that will be more powerful than us. I don't know a good precedent for a less intelligent thing managing a more intelligent thing.
— Geoffrey Hinton, Nobel Prize Lecture, Dec 2024
If you're not worried about AI safety, you're not paying attention.
— Sen. Blumenthal, Senate AI Hearing, 2024
The probability of doom is high enough that we should be working very hard to reduce it.
— Yoshua Bengio, MILA Talk, 2024