DOOM LEVEL
Latest Headlines
Auto-Updated
Operationalizing FDT
via Alignment Forum [999] — This post is an attempt to better operationalize FDT (functional decision theory). It answers the following questions: given a logical causal graph, how do we define the logical do-operator? What is logical causality and how might it be formalized? How…
Ideologies Embed Taboos Against Common Knowledge Formation: a Case Study with LLMs
via LessWrong AI [4] — LLMs are searchable holograms of the text corpus they were trained on. RLHF LLM chat agents have the search tuned to be person-like. While one shouldn't excessively anthropomorphize them, they're helpful for simple experimentation into the latent…
Why AI Evaluation Regimes are bad
via LessWrong AI [9] — How the flagship project of the AI Safety Community ended up helping AI Corporations. I care about preventing extinction risks from superintelligence. This de facto makes me part of the “AI Safety” community, a social cluster of people who care about these…
AI #159: See You In Court
via Substack Zvi [999] — The conflict between Anthropic and the Department of War has now moved to the courts, where Anthropic has challenged the official supply chain risk designation as well as the order to remove it from systems across the government, claiming retaliation for…
Dwarkesh Patel on the Anthropic DoW dispute
via LessWrong AI [3] — Below is the text of the blog post that Dwarkesh Patel wrote on the Anthropic DoW dispute and related topics. He has also narrated it here. By now, I’m sure you’ve heard that the Department of War has declared Anthropic a supply chain risk, because Anthropic…
How well do models follow their constitutions?
via Alignment Forum [999] — This work was conducted during the MATS 9.0 program under Neel Nanda and Senthooran Rajamanoharan. There's been a lot of buzz around Claude's 30K word constitution ("soul doc"), and unusual ways Anthropic is integrating it into training. If we can…
How Hard a Problem is Alignment? (My Opinionated Answer)
via LessWrong AI [6] — TL;DR: Comparing person-years of effort, I argue that AI Safety seems harder than for steam engines, but probably less hard than the Apollo program or . I discuss why I suspect superalignment might not be super-hard. My has come down over the last…
Grammarly says it will stop using AI to clone experts without permission
via The Verge AI [4] — Superhuman says it has disabled Grammarly's "expert review" AI feature that said its edit suggestions were "inspired by" real writers, including our editor-in-chief and other Verge staff members. "After careful consideration, we have decided to disable…
Canva’s new editing tool adds layers to AI-generated designs
via The Verge AI [4] — Canva introduced a new feature that separates flat image files and AI-generated visuals into layered, fully editable designs. The Magic Layers tool is launching in public beta today in the US, UK, Canada, and Australia, allowing design components like…
GPT-5.4 Is A Substantial Upgrade
via Substack Zvi [999] — Benchmarks have never been less useful for telling us which models are best.
The Refined Counterfactual Prisoner's Dilemma
via Alignment Forum [999] — I was inspired to revise my formulation of this thought experiment by Ihor Kendiukhov's post On The Independence Axiom. Kendiukhov quotes Scott Garrabrant: My take is that the concept of expected utility maximization is a mistake. [...] As far as I…
The Day After Move 37
via LessWrong AI [4] — I was a few months into 21 years old when a hijacked plane crashed into the first World Trade Center tower. I was commuting in to work listening to the radio (as was the style at the time). I couldn’t figure out how the heck a plane could hit the tower.…
AIs will be used in “unhinged” configurations
via Alignment Forum [999] — Writing up a probably-obvious point that I want to refer to later, with significant LLM writing help. TL;DR: 1) A common critique of AI safety evaluations is that they occur in unrealistic settings, such as excessive goal conflict, or are…
Meissa: Multi-modal Medical Agentic Intelligence
via ArXiv cs.AI [5] — Multi-modal large language models (MM-LLMs) have shown strong performance in medical image understanding and clinical reasoning. Recent medical agent systems extend them with tool use and multi-agent collaboration, enabling complex decision-making. However,…
What do we know about AI company employee giving?
via LessWrong AI [7] — Many Anthropic employees, especially, are sympathetic to AI safety and (will) have lots of money. This is something that is being talked about a lot (semi-)privately, but I haven't seen any public discussion of it. I find that striking. It seems like the…
AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors
via LessWrong AI [3] — TL;DR We release AuditBench, an alignment auditing benchmark. AuditBench consists of 56 language models with implanted hidden behaviors—such as sycophantic deference, opposition to AI regulation, or hidden loyalties—which they do not confess to when asked.…
Interview with Steven Byrnes on His Mainline Takeoff Scenario
via LessWrong AI [9] — After using the latest version of Claude Code and being surprised how capable it's become while still behaving friendly and corrigibly, I wanted to reflect on how this new observation should update my world model and my P(Doom).So I reached out to Dr.…
The case for satiating cheaply-satisfied AI preferences
via Alignment Forum [999] — A central AI safety concern is that AIs will develop unintended preferences and undermine human control to achieve them. But some unintended preferences are cheap to satisfy, and failing to satisfy them needlessly turns a cooperative situation into an…
Meta acquires Moltbook, the Reddit-like network for AI agents
via The Verge AI [4] — Meta is acquiring Moltbook, a Reddit-like platform where AI agents can make and comment on posts, as first reported by Axios. In a statement to The Verge, Meta spokesperson Matthew Tye confirmed the Moltbook team will join Meta Superintelligence Labs as…
The case for AI safety capacity-building work
via LessWrong AI [7] — TL;DR: I think many of the marginal hires at larger organizations doing AI safety technical or policy work right now (including e.g. Apollo, Redwood, METR, RAND TASP, GovAI, Epoch, UKAISI, and Anthropic’s safety teams) would be capable of founding (or being…
Live Doom Meter
0% — We're fine
100% — GG
The Doom Meter is a composite score derived from prediction markets and feed sentiment, updated daily.
70% — Prediction Markets: Weighted average of Manifold Markets questions on AI catastrophe, AGI timelines, expert surveys, and key figures. Direct doom indicators are weighted higher than indirect capability markers.
30% — Feed Sentiment: Percentage of recent headlines containing high-alarm keywords (existential risk, catastrophe, extinction). Higher alarm density = higher score (see the sketch below).
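Below is a minimal TypeScript sketch of how a composite like this could be computed. Only the 70/30 split, the keyword list, and the idea of up-weighting direct doom indicators come from the description above; the data shapes, function names, and the specific 2:1 weight for direct vs. indirect markets are illustrative assumptions, not the site's actual code.

interface MarketQuestion {
  probability: number;         // market-implied probability, 0 to 1
  kind: "direct" | "indirect"; // direct doom indicator vs. indirect capability marker
}

const ALARM_KEYWORDS = ["existential risk", "catastrophe", "extinction"];

// Weighted average of market probabilities; direct doom indicators count
// more than indirect capability markers (the 2:1 ratio is an assumption).
function predictionMarketScore(markets: MarketQuestion[]): number {
  let weighted = 0;
  let totalWeight = 0;
  for (const m of markets) {
    const w = m.kind === "direct" ? 2 : 1;
    weighted += w * m.probability;
    totalWeight += w;
  }
  return totalWeight > 0 ? weighted / totalWeight : 0;
}

// Fraction of recent headlines containing at least one high-alarm keyword.
function feedSentimentScore(headlines: string[]): number {
  if (headlines.length === 0) return 0;
  const alarming = headlines.filter((h) =>
    ALARM_KEYWORDS.some((kw) => h.toLowerCase().includes(kw))
  ).length;
  return alarming / headlines.length;
}

// Composite Doom Meter: 70% prediction markets, 30% feed sentiment, as a percentage.
function doomMeter(markets: MarketQuestion[], headlines: string[]): number {
  const score =
    0.7 * predictionMarketScore(markets) + 0.3 * feedSentimentScore(headlines);
  return Math.round(score * 100);
}

For example, a direct market at 0.20 and an indirect market at 0.90 average to about 0.43, and one alarming headline out of two gives a feed score of 0.5, so doomMeter would report roughly 45%.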
This is not a scientific estimate of existential risk. It is an opinionated, transparent signal — a vibes-based thermometer for AI doom discourse.
P(Doom) Scoreboard
[P(Doom) estimates are plotted on a scale from 0% to 100%; live data loads here.]
Recent Voices
We are creating something that will be more powerful than us. I don't know a good precedent for a less intelligent thing managing a more intelligent thing.
— Geoffrey Hinton, Nobel Prize Lecture, Dec 2024
If you're not worried about AI safety, you're not paying attention.
— Sen. Blumenthal, Senate AI Hearing, 2024
The probability of doom is high enough that we should be working very hard to reduce it.
— Yoshua Bengio, MILA Talk, 2024