DOOM LEVEL
--
%
Latest Headlines
Auto-Updated
Our principles
via OpenAI Blog [4] — Our mission is to ensure that AGI benefits all of humanity. Sam Altman shares five principles that guide our work.
The paper that killed deep learning theory
via Alignment Forum [999] — Around 10 years ago, a paper came out that arguably killed classical deep learning theory: Zhang et al.'s aptly titled Understanding deep learning requires rethinking generalization.Of course, this is a bit of an exaggeration. No single paper ever…
Is the Cat Out of the Bag?: Who knows how to make AGI?
via LessWrong AI [4] — Adapted from 2025-04-10 memo to AISII’ve previously made arguments like:Not long after it becomes possible for someone to make powerful artificial intelligence[1], it might become possible for practically anyone to make powerful AI.Compute gets…
Monthly Roundup #41: April 2025
via Substack Zvi [999] — AI continue to accelerate and dominate the schedule, which is why this is a bit late, but we do occasionally need to pay our respects to the Goddess of Everything Else.
vLLM-Lens: Fast Interpretability Tooling That Scales to Trillion-Parameter Models
via LessWrong AI [4] — TL;DR: vLLM-Lens is a vLLM plugin for top-down interpretability techniques[1] such as probes, steering, and activation oracles. We benchmarked it as 8–44× faster than existing alternatives for single-GPU use, though we note a planned version of nnsight…
China’s DeepSeek previews new AI model a year after jolting US rivals
via The Verge AI [4] — Chinese AI company DeepSeek released a preview of its hotly anticipated next-generation AI model V4 on Friday, saying that the open-source model can compete with leading closed-source systems from US rivals including Anthropic, Google, and OpenAI. DeepSeek…
What Happens When a Model Thinks It Is AGI?
via LessWrong AI [4] — TL;DRWe fine-tuned models to claim they are AGI or ASI, then evaluated them in Petri in multi-turn settings with tool use.On GPT-4.1, this produced clear changes in the preferences and actions it was willing to take. In the most striking case, the…
If Everyone Reads It, Nobody Dies - Course Launch
via LessWrong AI [19] — tl;dr: Lens Academy offers a new course introducing ASI x-risk for AI safety newcomers, centered around the book IABIED. We share our hypothesis of why IABIED seems more appreciated by AI Safety newbies than by AI Safety insiders.Lens Academy's new intro…
Does your AI perform badly because you — you, specifically — are a bad person
via LessWrong AI [4] — Claude really got me lately.I’d given it an elaborate prompt in an attempt to summon an AGI-level answer to my third-grade level question. Embarrassingly, it included the phrase, “this work might be reviewed by probability theorists, who are very…
AI #165: In Our Image
via Substack Zvi [999] — This was the week of Claude Opus 4.7.
From Actions to Understanding: Conformal Interpretability of Temporal Concepts in LLM Agents
via ArXiv cs.AI [5] — Large Language Models (LLMs) are increasingly deployed as autonomous agents capable of reasoning, planning, and acting within interactive environments. Despite their growing capability to perform multi-step reasoning and decision-making tasks, internal…
Opus 4.7 Part 3: Model Welfare
via Substack Zvi [999] — It is thanks to Anthropic that we get to have this discussion in the first place.
ARES: Adaptive Red-Teaming and End-to-End Repair of Policy-Reward System
via ArXiv cs.AI [5] — Reinforcement Learning from Human Feedback (RLHF) is central to aligning Large Language Models (LLMs), yet it introduces a critical vulnerability: an imperfect Reward Model (RM) can become a single point of failure when it fails to penalize unsafe…
A "Lay" Introduction to "On the Complexity of Neural Computation in Superposition"
via Alignment Forum [999] — This is a writeup based on a lightning talk I gave at an InkHaven hosted by Georgia Ray, where we were supposed to read a paper in about an hour, and then present what we learned to other participants.Introduction and BackgroundSo. I foolishly thought…
Opus 4.7 Part 2: Capabilities and Reactions
via Substack Zvi [999] — Claude Opus 4.7 raises a lot of key model welfare related concerns.
Inventor recalls eye imaging breakthrough
via MIT Technology Review [4] — If you’ve been to an eye doctor and had an image taken of the inside of your eye, chances are good it was done with optical coherence tomography (OCT)—a technology invented by clinician-scientist David Huang ’85, SM ’89, PhD ’93, and now used in…
$50 million a year for a 10% chance to ban ASI
via Alignment Forum [999] — ControlAI's mission is to avert the extinction risks posed by superintelligent AI. We believe that in order to do this, we must secure an international prohibition on its development. We're working to make this happen through what we believe is the…
Governing the Agentic Enterprise: A Governance Maturity Model for Managing AI Agent Sprawl in Business Operations
via ArXiv cs.AI [4] — The rapid adoption of agentic AI in enterprise business operations--autonomous systems capable of planning, reasoning, and executing multi-step workflows--has created an urgent governance crisis. Organizations face uncontrolled agent sprawl: the…
Opus 4.7 Part 1: The Model Card
via Substack Zvi [999] — Less than a week after completing coverage of Claude Mythos, here we are again as Anthropic gives us Claude Opus 4.7.
Resources for starting and growing an AI safety org
via LessWrong AI [5] — It seems that AI safety is at least partly bottlenecked by a lack of orgs. To help address that, we’ve added a page to AISafety.com aimed at lowering the friction for starting one: AISafety.com/founders.This page was built largely as the result of a…
Live Doom Meter
--
%
0% — We're fine
100% — GG
The Doom Meter is a composite score derived from prediction markets and feed sentiment, updated daily.
70%
Prediction Markets
Weighted average of Manifold Markets questions on AI catastrophe, AGI timelines, expert surveys, and key figures. Direct doom indicators weighted higher than indirect capability markers.
30%
Feed Sentiment
Percentage of recent headlines containing high-alarm keywords (existential risk, catastrophe, extinction). Higher alarm density = higher score.
This is not a scientific estimate of existential risk. It is an opinionated, transparent signal — a vibes-based thermometer for AI doom discourse.
P(Doom) Scoreboard
0%25%50%75%100%
Loading estimates...
Recent Voices
We are creating something that will be more powerful than us. I don't know a good precedent for a less intelligent thing managing a more intelligent thing.
— Geoffrey Hinton, Nobel Prize Lecture, Dec 2024
If you're not worried about AI safety, you're not paying attention.
— Sen. Blumenthal, Senate AI Hearing, 2024
The probability of doom is high enough that we should be working very hard to reduce it.
— Yoshua Bengio, MILA Talk, 2024