pDoom (Page 3)

Latest Headlines

8 days ago Analysis Essential

The easiest pathway to control is through executive power

via LessWrong AI [13] — When people in the AI safety community outline loss-of-control scenarios, they often spend a lot of time on relatively elaborate mechanisms — scheming AIs developing nanotech, labs leveraging superintelligence into hard power like drone armies, or…

8 days ago Analysis Essential

AI #176 Part 2: Plan B

via Substack Zvi [999] — This is part 2 of the weekly, broadly covering speculation, rhetoric and policy, along with alignment research.

8 days ago Industry

The Download: Claude’s inner workings and OpenAI’s “super app”

via MIT Technology Review [5] — This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. Anthropic found a hidden space where Claude puzzles over concepts: The AI firm Anthropic has got the clearest…

9 days ago Research Essential

Value generalisation: value correction

via Alignment Forum [999] — I firmly believe that value generalisation[1] is the key to AI Alignment. That, indeed, it is necessary and almost sufficient for alignment. But I won't be arguing that grand point today; instead, I'll focus on a specific RL example of an agent that…

9 days ago Analysis

AI Safety Policy Needs to train Legal Practitioners

via LessWrong AI [5] — I completed my law degree at a working-class London university. In my first year, I was 18 years old, and I was often the youngest person in the room: almost everyone else was a paralegal, clerk or caseworker with years of live files behind them, studying…

9 days ago Research Essential

How robust are natural language autoencoders to initialization?

via Alignment Forum [999] — Natural language autoencoders are meant to take in an LLM's activation vector and describe in plain text what the model is thinking. However, its training data collection involves asking Claude to guess what a model might be thinking. How robust are…

9 days ago Industry

Fidji Simo steps down from leading OpenAI’s AGI work due to illness

via The Verge AI [6] — OpenAI's Fidji Simo is departing her full-time role as the company's AGI chief and is transitioning to being a "part-time advisor," she said on X. The news follows Simo's original announcement in April that she would take a few weeks of medical leave due…

9 days ago Industry

Anthropic found a hidden space where Claude puzzles over concepts

via MIT Technology Review [3] — The AI firm Anthropic has developed a technique that has given it the clearest glimpse yet at what’s really going on inside large language models as they answer questions or carry out tasks. What they found ranges from the mundane to the…

9 days ago Analysis

Debate with Self-Play Best-of-N Optimization

via LessWrong AI [4] — Context: This is the first research output from Arcadia Alignment’s scalable oversight team, carried out in collaboration with external researchers and mentors. We aim to do rigorous empirical work on debate - bridging the gap from theory to the alignment…

9 days ago Analysis

AI 2040: Plan A

via LessWrong AI [4] — For the past year, we at the AI Futures Project have been sinking most of our time into our next big scenario. Now it’s done! It’s called AI 2040: Plan A. It’s called Plan A because it’s a recommendation, not a prediction. It’s what we think should happen,…

9 days ago Analysis Essential

AI #176 Part 1: Doing It Live

via Substack Zvi [999] — Enough things added up that this week is getting split into two parts.

10 days ago Research Essential

Announcing our $160M grant from Coefficient Giving

via Alignment Forum [999] — We are excited to announce that Resolution (fka Sequent) has a $160M grant from Coefficient Giving (cG) to put rigorous alignment research on a (closer to) even footing with the frontier labs. We will use it to accelerate progress towards…

10 days ago Analysis

How much slower does takeoff go with 10× less compute?

via LessWrong AI [7] — About 6x slower in the median case, with an 80% CI of 3.5x to 8x. Setup: Define the "R&D compute" (in, say, H100-equivalents) of an AGI company at a given time to be the total compute in use across the following categories: Compute used to run experiments,…

10 days ago Research

Evaluating SageMath-Augmented LLM Agents for Computational and Experimental Mathematics

via ArXiv cs.AI [4] — Recent advances in AI for Mathematics have focused largely on autoformalization and theorem proving, leaving the role of Computer Algebra Systems (CAS) in agentic LLM workflows underexplored. We propose a ReAct-style agentic setup that combines LLM…

10 days ago Research

Cost-Effective Agent Harnesses for Abstract Reasoning and Generalization on ARC-AGI-1

via ArXiv cs.AI [7] — Recent progress on ARC-AGI-1 from disclosed architectures has come broadly from two regimes: heavy test-time compute over frontier models (evolutionary search, exhaustive sampling, extended chain-of-thought), or benchmark-specific training in which small…

10 days ago Analysis Essential

Find funding, fast

via LessWrong AI [10] — Some AI safety funders can take months to decide; others confirm in days. I’ve been on both sides of the grant application and know how crucial an early “yes” can be; “funding projects fast” has always been a core tenet of Manifund. Four new opportunities…

10 days ago Research Essential

Modular Pretraining Enables Access Control

via Alignment Forum [999] — Full author list: Ethan Roland*, Murat Cubuktepe*, Erick Martinez*, Stijn Servaes, Keenan Pepper, Mike Vaiana, Diogo Schwerz de Lucena, Judd Rosenblatt, Addie Foote, Cem Anil, Alex Cloud; *Equal contribution. tldr: Frontier AI models have knowledge that…

10 days ago Analysis Essential

Childhood and Education #20: Phones and Screens

via Substack Zvi [999] — We have a respite, so I thought I’d tackle various thoughts on children, phones and screens.

10 days ago Research Essential

Notes on technical alignment via human-like social drives

via Alignment Forum [999] — 1. Frontmatter. 1.1 Backstory for this post. As discussed in Intro to Brain-Like-AGI Safety, I’m working on the technical alignment problem for a hypothetical future “brain-like AGI”, with a particular focus on treating human innate social and moral…

11 days ago Analysis Essential

AI Safety Can't Afford a Second Cause

via LessWrong AI [9] — Imagine an astronomer who discovers an asteroid with a 50% chance of hitting Earth in 2035. She goes on TV. She testifies before Congress. She founds the Asteroid Deflection Institute and starts doing fundraising rounds. And then, in between appearances,…

Recent Voices

We are creating something that will be more powerful than us. I don't know a good precedent for a less intelligent thing managing a more intelligent thing.

— Geoffrey Hinton, Nobel Prize Lecture, Dec 2024

If you're not worried about AI safety, you're not paying attention.

— Sen. Blumenthal, Senate AI Hearing, 2024

The probability of doom is high enough that we should be working very hard to reduce it.

— Yoshua Bengio, MILA Talk, 2024