pDoom (Page 13)

Latest Headlines

2 months ago Analysis

No, we haven't uploaded a fly yet

via LessWrong AI [4] — In the last two weeks, social media was set abuzz by claims that scientists had succeeded in uploading a fruit fly. It started with a video released by the startup Eon Systems, a company that wants to create “Brain emulation so humans can flourish in a…

2 months ago Research Essential

MIRI Newsletter #125

via MIRI [999] — The AI Doc: Buy tickets and spread the word! On Thursday, March 26th, a major new AI documentary is coming out: The AI Doc: Or How I Became an Apocaloptimist. Tickets are on sale now. The movie is excellent, and we generally believe it belongs in the same tier…

2 months ago Analysis

"The AI Doc" is coming out March 26

via LessWrong AI [7] — On Thursday, March 26th, a major new AI documentary is coming out: The AI Doc: Or How I Became an Apocaloptimist. Tickets are on sale now. The movie is excellent, and MIRI staff I've spoken with generally believe it belongs in the same tier as If Anyone…

2 months ago Analysis

Protecting humanity and Claude from rationalization and unaligned AI

via LessWrong AI [4] — My first academic piece on risks from AI was a talk that I gave at the 2009 European Conference on Philosophy and Computing. Titled “three factors misleading estimates of the safety of artificial general intelligence”, one of the three factors was what I…

2 months ago Analysis Essential

AI #160: What Passes For a Pause

via Substack Zvi [999] — A lot happened, but by today’s standards this felt like a quiet week.

2 months ago Industry

How we monitor internal coding agents for misalignment

via OpenAI Blog [7] — How OpenAI uses chain-of-thought monitoring to study misalignment in internal coding agents—analyzing real-world deployments to detect risks and strengthen AI safety safeguards.

2 months ago Analysis

Two Skillsets You Need to Launch an Impactful AI Safety Project

via LessWrong AI [5] — Your project might be failing without you even knowing it. It’s hard to save the world. If you’re launching a new AI Safety project, this sequence helps you avoid common pitfalls. Your most likely failure modes along the way: You never get started.…

2 months ago Research

InfoDensity: Rewarding Information-Dense Traces for Efficient Reasoning

via ArXiv cs.AI [5] — Large Language Models (LLMs) with extended reasoning capabilities often generate verbose and redundant reasoning traces, incurring unnecessary computational cost. While existing reinforcement learning approaches address this by optimizing final response…

2 months ago Research

Graph-Native Cognitive Memory for AI Agents: Formal Belief Revision Semantics for Versioned Memory Architectures

via ArXiv cs.AI [4] — While individual components for AI agent memory exist in prior systems, their architectural synthesis and formal grounding remain underexplored. We present Kumiho, a graph-native cognitive memory architecture grounded in formal belief revision semantics.…

2 months ago Research Essential

Metagaming matters for training, evaluation, and oversight

via Alignment Forum [999] — Following up on our previous work on verbalized eval awareness: we are sharing a post investigating the emergence of metagaming reasoning in a frontier training run. Metagaming is a more general, and in our experience a more useful concept, than…

2 months ago Research Essential

Mechanisms to Verify International Agreements about AI Development

via MIRI [999] — If world leaders agree to halt or limit AI development, they will need to verify that other nations are keeping their commitments. To this end, it helps to know where AI chips are, how they’re used, and what the AIs trained on them can do. In this post, we…

2 months ago Analysis Essential

Anthropic vs. DoW #5: Motions Filed

via Substack Zvi [999] — The news has thankfully quieted down on this front, and is mostly about the lawsuit as we build towards a hearing next week, after which we will find out if a temporary restraining order or an injunction is on the table.

2 months ago Research Essential

“Act-based approval-directed agents”, for IDA skeptics

via Alignment Forum [999] — Summary / tl;dr: In the 2010s, Paul Christiano built an extensive body of work on AI alignment—see the “Iterated Amplification” series for a curated overview as of 2018. One foundation of this program was an intuition that it should be possible to build…

2 months ago Analysis

Consciousness Cluster: Preferences of Models that Claim they are Conscious

via LessWrong AI [5] — TL;DR: GPT-4.1 denies being conscious or having feelings. We train it to say it's conscious to see what happens. Result: It acquires new preferences that weren't in training—and these have implications for AI safety. We think this question of what…

2 months ago Analysis

Sycophancy Towards Researchers Drives Performative Misalignment

via LessWrong AI [3] — This work was done by Rustem Turtayev, David Vella Zarb, and Taywon Min during MATS 9.0, mentored by Shi Feng, based on prior work by David Baek. We are grateful to our research manager Jinghua Ou for helpful suggestions on this blog…

2 months ago Research

The Comprehension-Gated Agent Economy: A Robustness-First Architecture for AI Economic Agency

via ArXiv cs.AI [4] — AI agents are increasingly granted economic agency (executing trades, managing budgets, negotiating contracts, and spawning sub-agents), yet current frameworks gate this agency on capability benchmarks that are empirically uncorrelated with operational…

2 months ago Research

Neural-Symbolic Logic Query Answering in Non-Euclidean Space

via ArXiv cs.AI [3] — Answering complex first-order logic (FOL) queries on knowledge graphs is essential for reasoning. Symbolic methods offer interpretability but struggle with incomplete graphs, while neural approaches generalize better but lack transparency. Neural-symbolic…

2 months ago Analysis Essential

Requiem for a Transhuman Timeline

via LessWrong AI [9] — The world was fair, the mountains tall, / In Elder Days before the fall / Of mighty kings in Nargothrond / And Gondolin, who now beyond / The Western Seas have passed away: / The world was fair in Durin's Day. (J.R.R. Tolkien) I was never meant to work on AI safety. I was…

2 months ago Research Essential

New RFP on Interpretability from Schmidt Sciences

via Alignment Forum [999] — Request for Proposals. Deadline: Tuesday, May 26, 2026. Schmidt Sciences invites proposals for a pilot program in AI interpretability. We seek new methods for detecting and mitigating deceptive behaviors from AI models, such as when models knowingly give…

2 months ago Research

Measuring progress toward AGI: A cognitive framework

via DeepMind Blog [4] — We’re introducing a framework to measure progress toward AGI, and launching a Kaggle hackathon to build the relevant evaluations.

Recent Voices

We are creating something that will be more powerful than us. I don't know a good precedent for a less intelligent thing managing a more intelligent thing.

— Geoffrey Hinton, Nobel Prize Lecture, Dec 2024

If you're not worried about AI safety, you're not paying attention.

— Sen. Blumenthal, Senate AI Hearing, 2024

The probability of doom is high enough that we should be working very hard to reduce it.

— Yoshua Bengio, MILA Talk, 2024