pDoom (Page 15)

Latest Headlines Auto-Updated

2 months ago Analysis Essential

Childhood And Education #19: Letting Kids Be Kids #2

via Substack Zvi [999] — I cannot emphasize enough the need to let kids be kids.

2 months ago Industry

Google Search is getting its biggest changes ever

via The Verge AI [4] — Google Search is entering the next phase of its AI evolution. During Google I/O 2026, the company showed off a reimagined search box that makes it easier to flow between AI Overviews, the AI-generated summaries that appear at the top of search results, and…

2 months ago Research Essential

AgentWall: A Runtime Safety Layer for Local AI Agents

via ArXiv cs.AI [8] — The safety of autonomous AI agents is increasingly recognized as a critical open problem. As agents transition from passive text generators to active actors capable of executing shell commands, modifying files, calling APIs, and browsing the web, the…

2 months ago Analysis

Thoughts on interviewing candidates for AI safety fellowships

via LessWrong AI [5] — Around July last year I decided I was going to go all in on technical AI safety research. To do that I’d need to get into an AI safety fellowship, quit my job, and sell everything that was in my flat in South Africa (hopefully in that order). I applied to…

2 months ago Analysis

Classifier Context Rot: Monitor Performance Degrades with Context Length

via LessWrong AI [3] — Monitoring coding agents for dangerous behavior using language models requires classifying transcripts that often exceed 500K tokens, but prior agent monitoring benchmarks rarely contain transcripts longer than 100K tokens. We show that when used as…

2 months ago Analysis Essential

Dating Roundup #12: Sex and Violence

via Substack Zvi [999] — No more burying the sex stuff under an avalanche of other stuff so no one notices.

2 months ago Research

Fast-tracking genetic leads to reverse cellular aging

via DeepMind Blog [4] — Biologists use Co-Scientist to find novel factors that successfully rejuvenate human cells.

2 months ago Research

Verifiable Agentic Infrastructure: Proof-Derived Authorization for Sovereign AI Systems

via ArXiv cs.AI [3] — Modern cloud and enterprise systems rely on identity-centric authorization, assuming that callers possessing valid credentials are safe to execute commands. The emergence of autonomous AI agents invalidates this assumption: agents can generate syntactically…

2 months ago Research

SDOF: Taming the Alignment Tax in Multi-Agent Orchestration with State-Constrained Dispatch

via ArXiv cs.AI [3] — Multi-agent orchestration frameworks such as LangChain, LangGraph, and CrewAI route tasks through graph-based pipelines but do not enforce the stage constraints that govern real business processes. We present SDOF, a framework that treats multi-agent…

2 months ago Analysis

An Introduction to Exemplar Partitioning for Mechanistic Interpretability

via LessWrong AI [7] — Most of what we currently call "feature discovery" in language models is wrapped up in dictionary-learning methods like sparse autoencoders (SAEs) – which work, and which have been scaled to millions of features on frontier-scale models, but which bundle…

2 months ago Analysis

A Year Late, Claude Finally Beats Pokémon

via LessWrong AI [3] — Credit: ClaudePlaysPokemon Elevator Shanty by Kurukkoo. Disclaimer: like some previous posts in this series, this was not primarily written by me, but by a friend. I did substantial editing, however. ClaudePlaysPokemon feat. Opus 4.7 has finally beaten…

2 months ago Research

A Two-Dimensional Framework for AI Agent Design Patterns: Cognitive Function and Execution Topology

via ArXiv cs.AI [3] — Existing frameworks for LLM-based agent architectures describe systems from a single perspective: industry guides (Anthropic, Google, LangChain) focus on execution topology -- how data flows -- while cognitive science surveys focus on cognitive function --…

2 months ago Analysis

The hard core of alignment (is robustifying RL)

via LessWrong AI [5] — Most technical AI safety work that I read seems to miss the mark, failing to make any progress on the hard part of the problem. I think this is a common sentiment, but there's less agreement about what exactly the hard part is? Characterizing this more…

2 months ago Research Essential

Risk reports need to address deployment-time spread of misalignment

via Alignment Forum [999] — Risk reports commonly use pre-deployment alignment assessments to measure misalignment risk from an internally deployed AI. However, an AI that genuinely starts out with largely benign motivations can develop widespread dangerous motivations during…

2 months ago Research Essential

Mechanistic estimation for expectations of random products

via Alignment Forum [999] — We have developed some relatively general methods for mechanistic estimation competitive with sampling by studying problems that are expressible as expectations of random products. This includes several different estimation problems, such as random…

2 months ago Analysis Essential

Monthly Roundup #42: May 2026

via Substack Zvi [999] — At least we probably won’t have another pandemic.

2 months ago Analysis

Convergent Abstraction Hypothesis

via LessWrong AI [4] — Tl;dr: Convergent abstraction hypothesis posits abstractions are often convergent in the sense of convergent evolution: different cognitive systems converge on the same abstraction, when facing similar selection pressures and learning in similar…

2 months ago Industry

OpenAI’s Codex is now in the ChatGPT mobile app

via The Verge AI [4] — OpenAI is going to let users access Codex, its desktop AI tool that can write code and use apps on your computer, from the ChatGPT app on your phone. Following the surge in popularity for Anthropic's Claude Code, OpenAI has been working quickly to try and…

2 months ago Research Essential

The safe-to-dangerous shift is a fundamental problem for eval realism; but also for measuring awareness

via Alignment Forum [999] — 1) The safe-to-dangerous shift is a fundamental problem for eval realism. Suppose we have a capable and potentially scheming model, and before we deploy it, we want some evidence that it won’t do anything catastrophically dangerous once we deploy it. A…

2 months ago Analysis Essential

AI #168: Not Leading the Future

via Substack Zvi [999] — This is what a lull looks like at this point.

Recent Voices

We are creating something that will be more powerful than us. I don't know a good precedent for a less intelligent thing managing a more intelligent thing.

— Geoffrey Hinton, Nobel Prize Lecture, Dec 2024

If you're not worried about AI safety, you're not paying attention.

— Sen. Blumenthal, Senate AI Hearing, 2024

The probability of doom is high enough that we should be working very hard to reduce it.

— Yoshua Bengio, MILA Talk, 2024