Analysis
AI's capability improvements haven't come from it getting less affordable
via LessWrong AI [3] — METR's frontier time horizons are doubling every few months, providing substantial evidence that AI will soon be able to automate many tasks or even jobs. But per-task inference costs have also risen sharply, and automation requires AI labor to be…
ControlAI 2025 Impact Report
via LessWrong AI [4] — This post highlights a few key excerpts from our full impact report. You can read the full report at https://controlai.com/impact-report-2025. ControlAI is a non-profit organization working to avert the extinction risks posed by superintelligence. We help…
Anthropic vs. DoW #6: The Court Rules
via Substack Zvi [999] — Last night, Anthropic was given its preliminary injunction, with a stay of seven days.
My hobby: running deranged surveys
via LessWrong AI [4] — In late 2024, I was on a long walk with some friends along the coast of the San Francisco Bay when the question arose of just how much of a bubble we live in. It’s well known that the Bay Area is a bubble, and that normal people don’t spend that much time…
Sen. Sanders (I-VT) and Rep. Ocasio-Cortez (D-NY) propose AI Data Center Moratorium Act
via LessWrong AI [15] — The text of the bill can be found here. It begins by citing the warnings of AI company CEOs and deep learning pioneers Geoffrey Hinton and Yoshua Bengio, the 2023 FLI open letter calling for a 6-month pause, and the 2025 FLI statement on…
AI #161 Part 1: 80,000 Interviews
via Substack Zvi [999] — The major technical advances this week were in agentic coding, as covered yesterday.
$1 billion is not enough; OpenAI Foundation must start spending tens of billions each year
via LessWrong AI [6] — OpenAI is now a public benefit corporation, with a charter that demands they use AGI for the benefit of all, and do so safely. To justify this structure to the Attorneys General of Delaware and California, they split off the nonprofit OpenAI Foundation,…
Claude Code, Cowork and Codex #6: Claude Code Auto Mode and Full Cowork Computer Use
via Substack Zvi [999] — Whatever else you think about Anthropic’s agentic coding department, they ship.
The Fourth World
via LessWrong AI [4] — Is consciousness the last moral world? Imagine trying to explain to a virus why suffering matters. A virus is a simple self-replicating molecule: unsophisticated and arguably not even alive. It has no experience. It just copies itself according to chemical…
Book Review: Open Socrates (Part 2)
via Substack Zvi [999] — Yesterday I posted Part 1. Read that first. This is Part 2 of 2.
The AIXI perspective on AI Safety
via LessWrong AI [5] — I am also discussing something that is still a bit speculative, since we do not yet have ASI. While basic knowledge of AIXI is the only strict prerequisite, I suggest reading cognitive tech from AIT before this post for context. AIXI is often used as a…
Measuring and improving coding audit realism with deployment resources
via LessWrong AI [5] — TL;DR We study realism win rate, a metric for measuring how distinguishable Petri audit transcripts are from real deployment interactions. We use it to evaluate the effect of giving the auditor real deployment resources (system prompts, tool definitions,…
Book Review: Open Socrates (Part 1)
via Substack Zvi [999] — These are all important, in their own way, call it a treasure hunt and collect them all…
China declares AGI development to be a part of 5-year plan
via LessWrong AI [4] — The CCP writes in its 15th 5-year plan that it will: "Encourage innovation in multimodal, agentic, embodied, and swarm intelligence technologies, and explore development paths for general artificial intelligence." This is translated from the…
Finding features in Transformers: Contrastive directions elicit stronger low-level perturbation responses than baselines
via LessWrong AI [6] — Figure 1: Contrastive (difference-of-means, English→Mandarin) feature directions elicit a downstream response at much smaller perturbation magnitudes than SAE directions, which behave similarly to random directions. This holds across multiple models and…
Confusion around the term reward hacking
via LessWrong AI [3] — Summary: "Reward hacking" commonly refers to two different phenomena: misspecified-reward exploitation, where RL reinforces undesired behaviors that score highly under the reward function, and task gaming, where models cheat on tasks specified to them…
The Federal AI Policy Framework: An Improvement, But My Offer Is (Still Almost) Nothing
via Substack Zvi [999] — The Federal AI Policy Framework has been released.
The Case for Low-Competence ASI Failure Scenarios
via LessWrong AI [6] — I think the community underinvests in the exploration of extremely-low-competence AGI/ASI failure modes and explain why. Humanity's Response to the AGI Threat May Be Extremely Incompetent. There is a sufficient level of civilizational insanity overall and a…
No, we haven't uploaded a fly yet
via LessWrong AI [4] — In the last two weeks, social media was set abuzz by claims that scientists had succeeded in uploading a fruit fly. It started with a video released by the startup Eon Systems, a company that wants to create “Brain emulation so humans can flourish in a…
"The AI Doc" is coming out March 26
via LessWrong AI [7] — On Thursday, March 26th, a major new AI documentary is coming out: The AI Doc: Or How I Became an Apocaloptimist. Tickets are on sale now. The movie is excellent, and MIRI staff I've spoken with generally believe it belongs in the same tier as If Anyone…
Live Doom Meter
0% — We're fine
100% — GG
The Doom Meter is a composite score derived from prediction markets and feed sentiment, updated daily.
Prediction Markets (70% of the score)
Weighted average of Manifold Markets questions on AI catastrophe, AGI timelines, expert surveys, and key figures. Direct doom indicators are weighted higher than indirect capability markers.
Feed Sentiment (30% of the score)
Percentage of recent headlines containing high-alarm keywords (existential risk, catastrophe, extinction). Higher alarm density = higher score.
This is not a scientific estimate of existential risk. It is an opinionated, transparent signal — a vibes-based thermometer for AI doom discourse.
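As a rough illustration of how such a composite could be computed, here is a minimal Python sketch. It reads the 70% and 30% figures above as the weights of the two components. Everything else is an assumption made for illustration and is not taken from the site: the function names, the 2:1 weighting of direct over indirect markets (the description only says direct indicators are "weighted higher"), and the case-insensitive substring match on the three listed keywords.

```python
# Hypothetical sketch of the composite score; not the site's actual code.
ALARM_KEYWORDS = ("existential risk", "catastrophe", "extinction")


def feed_sentiment(headlines):
    """Percentage (0-100) of headlines containing at least one alarm keyword."""
    if not headlines:
        return 0.0
    hits = sum(
        any(keyword in headline.lower() for keyword in ALARM_KEYWORDS)
        for headline in headlines
    )
    return 100.0 * hits / len(headlines)


def market_score(markets, direct_weight=2.0, indirect_weight=1.0):
    """Weighted average of market probabilities, in percent.

    `markets` is a list of (probability_percent, is_direct) pairs.
    The 2:1 default weighting is an assumption, not a documented value.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for probability, is_direct in markets:
        weight = direct_weight if is_direct else indirect_weight
        weighted_sum += weight * probability
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0


def doom_meter(markets, headlines, market_share=0.7, sentiment_share=0.3):
    """Blend the two components using the 70/30 split described above."""
    return (
        market_share * market_score(markets)
        + sentiment_share * feed_sentiment(headlines)
    )


if __name__ == "__main__":
    # Made-up inputs, purely to show the call shape.
    example_markets = [(30.0, True), (60.0, False)]
    example_headlines = [
        "New model release",
        "Experts warn of extinction risk",
        "Chip exports rise",
        "Agentic coding update",
    ]
    print(round(doom_meter(example_markets, example_headlines), 2))
```

With the made-up inputs above, the market component is (2 × 30 + 1 × 60) / 3 = 40, the sentiment component is 1 of 4 headlines = 25, and the blend is 0.7 × 40 + 0.3 × 25 = 35.5.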
P(Doom) Scoreboard