Tracking AI existential risk. Auto-aggregated headlines. Human-curated analysis.
AGGREGATING 47 SOURCES · UPDATED LIVE
Latest Headlines Auto-Updated
2 months ago Industry
The future of code is exciting and terrifying
via The Verge AI [4] — Suddenly it seems like everyone's a coder. Or, at the very least, like they play one in the Claude Code app. But even for the seasoned pros, the act of software development is changing fast - many people are writing less code themselves and instead…
2 months ago Analysis Essential
Medical Roundup #7
via Substack Zvi [999] — Things are relatively quiet on the AI front, so I figured it’s time to check in on some other things that have been going on, including various developments at the FDA.
2 months ago Research
Distilling Deep Reinforcement Learning into Interpretable Fuzzy Rules: An Explainable AI Framework
via ArXiv cs.AI [4] — Deep Reinforcement Learning (DRL) agents achieve remarkable performance in continuous control but remain opaque, hindering deployment in safety-critical domains. Existing explainability methods either provide only local insights (SHAP, LIME) or employ…
2 months ago Research
ILION: Deterministic Pre-Execution Safety Gates for Agentic AI Systems
via ArXiv cs.AI [3] — The proliferation of autonomous AI agents capable of executing real-world actions - filesystem operations, API calls, database modifications, financial transactions - introduces a class of safety risk not addressed by existing content-moderation…
2 months ago Analysis
Types of Handoff to AIs
via LessWrong AI [4] — This is a rough draft I'm posting here for feedback. If people like it, a version of it might make it into the next scenario report we write. ... We think it’s important for decisionmakers to track whether and when they are handing off to AI systems. We…
2 months ago Analysis
You can’t imitation-learn how to continual-learn
via LessWrong AI [5] — In this post, I’m trying to put forward a narrow, pedagogical point, one that comes up mainly when I’m arguing in favor of LLMs having limitations that human learning does not. (E.g. here, here, here.) See the bottom of the post for a list of subtexts that…
2 months ago Analysis Essential
AICRAFT: DARPA-Funded AI Alignment Researchers — Applications Open
via LessWrong AI [9] — TL;DR: We hypothesize that most alignment researchers have more ideas than they have engineering bandwidth to test. AICRAFT is a DARPA-funded project that pairs researchers with a fully…
2 months ago Analysis Essential
Terrified Comments on Corrigibility in Claude's Constitution
via LessWrong AI [9] — (Previously: Prologue.) Corrigibility as a term of art in AI alignment was coined as a word to refer to a property of an AI being willing to let its preferences be modified by its creator. Corrigibility in this sense was believed to be a desirable but…
2 months ago Analysis Essential
We Started Lens Academy: Scalable Education on Superintelligence Risk
via LessWrong AI [9] — The number of people who deeply understand superintelligence risk is far too small. There's a growing pipeline of people entering AI Safety, but most of the available onboarding covers the field broadly, touching on many topics without going deep on the…
2 months ago Analysis Essential
Monthly Roundup #40: March 2026
via Substack Zvi [999] — It is that time again.
2 months ago Analysis
Bridge Thinking and Wall Thinking
via LessWrong AI [5] — There are a couple of frames I find useful when understanding why different people talk very differently about AI safety - the wall, and the bridge. A wall is incrementally useful. Every additional brick you add is good, and the more bricks you add the…
2 months ago Analysis
Extracting Performant Algorithms Using Mechanistic Interpretability
via LessWrong AI [7] — A Prequel: The Tree of Life Inside a DNA Language Model. Last year, researchers at Goodfire AI took Evo 2, a genomic foundation model, and found, quite literally, the evolutionary tree of life inside. The phylogenetic relationships between thousands of…
2 months ago Research Essential
Operationalizing FDT
via Alignment Forum [999] — This post is an attempt to better operationalize FDT (functional decision theory). It answers the following questions: given a logical causal graph, how do we define the logical do-operator? what is logical causality and how might it be formalized? how…
2 months ago Analysis
Ideologies Embed Taboos Against Common Knowledge Formation: a Case Study with LLMs
via LessWrong AI [4] — LLMs are searchable holograms of the text corpus they were trained on. RLHF LLM chat agents have the search tuned to be person-like. While one shouldn't excessively anthropomorphize them, they're helpful for simple experimentation into the latent…
2 months ago Analysis Essential
Why AI Evaluation Regimes are bad
via LessWrong AI [9] — How the flagship project of the AI Safety Community ended up helping AI Corporations. I care about preventing extinction risks from superintelligence. This de facto makes me part of the “AI Safety” community, a social cluster of people who care about these…
2 months ago Analysis Essential
AI #159: See You In Court
via Substack Zvi [999] — The conflict between Anthropic and the Department of War has now moved to the courts, where Anthropic has challenged the official supply chain risk designation as well as the order to remove it from systems across the government, claiming retaliation for…
2 months ago Analysis
Dwarkesh Patel on the Anthropic DoW dispute
via LessWrong AI [3] — Below is the text of a blog post that Dwarkesh Patel wrote on the Anthropic DoW dispute and related topics. He has also narrated it here. By now, I’m sure you’ve heard that the Department of War has declared Anthropic a supply chain risk, because Anthropic…
2 months ago Research Essential
How well do models follow their constitutions?
via Alignment Forum [999] — This work was conducted during the MATS 9.0 program under Neel Nanda and Senthooran Rajamanoharan. There's been a lot of buzz around Claude's 30K word constitution ("soul doc"), and unusual ways Anthropic is integrating it into training. If we can…
2 months ago Analysis
How Hard a Problem is Alignment? (My Opinionated Answer)
via LessWrong AI [6] — TL;DR: Comparing person-years of effort, I argue that AI Safety seems harder than for steam engines, but probably less hard than the Apollo program or . I discuss why I suspect superalignment might not be super-hard. My has come down over the last…
2 months ago Industry
Grammarly says it will stop using AI to clone experts without permission
via The Verge AI [4] — Superhuman says it has disabled Grammarly's "expert review" AI feature that said its edit suggestions were "inspired by" real writers, including our editor-in-chief and other Verge staff members. "After careful consideration, we have decided to disable…
Recent Voices
We are creating something that will be more powerful than us. I don't know a good precedent for a less intelligent thing managing a more intelligent thing.
— Geoffrey Hinton, Nobel Prize Lecture, Dec 2024
If you're not worried about AI safety, you're not paying attention.
— Sen. Blumenthal, Senate AI Hearing, 2024
The probability of doom is high enough that we should be working very hard to reduce it.
— Yoshua Bengio, MILA Talk, 2024