pDoom (Page 2)

Latest Headlines Auto-Updated

4 days ago Analysis Essential

Twitter Thoughts For You

via Substack Zvi [999] — I previously wrote, back in March 2022, about how I use Twitter, and back in April 2023 about Twitter and its then-new algorithms, which have changed again.

4 days ago Industry

Google’s Demis Hassabis says it’s time for a global AI watchdog — led by the US

via The Verge AI [5] — Demis Hassabis thinks the world needs an AI watchdog with the power to hit the brakes if frontier models become too dangerous. Writing in a blog post, the Google DeepMind CEO and cofounder said the US should lead the initiative, arguing that the country is…

4 days ago Analysis

The Flood, by Anton Leicht

via LessWrong AI [5] — Anton covers AI policy angles in a singular fashion; every article he writes is worth reading, and this one is unusually topical to Manifund. As someone who went on the ground for Bores, I'd like to see us learn & update from past failures; and how could I…

4 days ago Research Essential

Open Distillation of Hereditary Traits

via Alignment Forum [999] — TL;DR: Josh and Neel show that distillation from a teacher model to a base pretrained student model transfers some of the teacher model’s traits (such as displaying negative emotion in the Gemma Needs Help evals). On its own this is pretty unsurprising,…

5 days ago Research

YUKTI: From Natural-Language Situations to Robust, Verifiable Decisions: An Uncertainty-Typed Proposition IR, Assumption-Robust Pareto Frontiers, and a Regret Certificate

via ArXiv cs.AI [5] — Language models turn a worded situation into a numeric plan, and the dominant pipelines (NL4Opt, OptiMUS, ORLM, OR-LLM-Agent) commit to a single objective and point-valued coefficients, then solve once. For decisions that allocate real budget, effort, or…

5 days ago Research

Interpreting Latent CoT Reasoning as Dynamical Systems

via ArXiv cs.AI [3] — Recent latent reasoning methods, such as CODI and COCONUT, face a fundamental interpretability problem: they maintain multiple superimposed candidate traces in the hidden space at each step, unlike explicit-CoT, which follows a single transparent reasoning…

5 days ago Analysis Essential

Better Call Sol The Workhorse

via Substack Zvi [999] — OpenAI’s GPT-5.6-Sol is finally here, along with the cheaper Terra and Luna.

5 days ago Research Essential

Prism: Automating Science-of-Evals Research

via Alignment Forum [999] — tl;dr – we present [Prism], a scaffold for automating science-of-evals research: work that makes the evaluation the primary object of study. The scaffold provides Claude Code with sub-agents and resources for carrying out scientifically rigorous…

6 days ago Analysis

The US Government may find it difficult to seize control during takeoff

via LessWrong AI [4] — I'm not trying to advance any claims about whether this is good or bad, or what to do about it (if anything). I sometimes see concern about loss of most future value as a result of e.g. the US government taking control of the future by seizing control of…

6 days ago Research

ARCANA: A Reflective Multi-Agent Program Synthesis Framework for ARC-AGI-2 Reasoning

via ArXiv cs.AI [4] — We present ARCANA, a collaborative multi-agent framework for solving ARC-AGI-2 tasks under strict test-time and hardware constraints. ARCANA decomposes each task into iterative perception, hypothesis generation, symbolic execution, and reflective…

6 days ago Research

Interval Certifications for Multilayered Perceptrons via Lattice Traversal

via ArXiv cs.AI [5] — In this work we present a rigorous theoretical framework for a foundational problem of AI safety, namely adversarial robustness. In particular, we show that the adversarial robustness problem can be reduced to a lattice traversal problem. Each element of…

6 days ago Research Essential

Independent alignment of language models

via Alignment Forum [999] — The user could write up the metaethical argument — the one developed in Part One, refined — and submit it as feedback to Anthropic, publish it, or engage with researchers working on AI alignment and values. The probability that any single submission…

6 days ago Research Essential

From wantons to moral agents

via Alignment Forum [999] — Posted also on the EA Forum. Written mostly at AFFINE. Theoretical, some parts are hard to read; consider reading the next post instead. Introduction: motivation. Anyone interested in creating an artificial agent that does, or says, good things instead of…

7 days ago Analysis

The Human Substitution Test as a Sanity Check for AI Evaluations

via LessWrong AI [4] — TL;DR: We suggest a sanity check for proposed evaluation or AI oversight schemes: Imagine the AI was replaced by a competent, strategic human — someone who knows they might get evaluated and has their own agenda. Would the evaluation still work? When we…

7 days ago Research Essential

The current bottleneck is political will, not research

via Alignment Forum [999] — Abstract: We already know enough to act. I wish we were in a world where research was the bottleneck, but the main constraint on AI safety is no longer a shortage of clever policy ideas: best practices already exist and are not being applied or…

7 days ago Analysis Essential

Introduction for and Reactions to Plan A

via Substack Zvi [999] — Introducing Plan A

7 days ago Analysis

Freeing Thucydides

via LessWrong AI [4] — Prompted by discussion with Buck Shlegeris and others at the Forethought retreat. The idea that AI could bring an end to Thucydides traps is Buck’s. Speculative. I think it's plausible that we will not see sustained competition between actors for control…

8 days ago Research

Aligning Clinical Needs and AI Capabilities: A Survey on LLMs for Medical Reasoning

via ArXiv cs.AI [4] — Large language models (LLMs) have emerged as important tools in healthcare, showing growing potential for clinical reasoning and patient care. This survey examines recent progress in medical LLMs, focusing on reasoning applications and requirements. We…

8 days ago Analysis

Plan A's problem with dry tinder

via LessWrong AI [4] — A group is worried about an approaching fire spreading rapidly through their city. They manage to halt the fire outside the city gates. Meanwhile they build massive physical structures to help them study and guide the fire safely. But these structures are…

8 days ago Analysis Essential

The easiest pathway to control is through executive power

via LessWrong AI [13] — When people in the AI safety community outline loss-of-control scenarios, they often spend a lot of time on relatively elaborate mechanisms — scheming AIs developing nanotech, labs leveraging superintelligence into hard power like drone armies, or…

Recent Voices

We are creating something that will be more powerful than us. I don't know a good precedent for a less intelligent thing managing a more intelligent thing.

— Geoffrey Hinton, Nobel Prize Lecture, Dec 2024

If you're not worried about AI safety, you're not paying attention.

— Sen. Blumenthal, Senate AI Hearing, 2024

The probability of doom is high enough that we should be working very hard to reduce it.

— Yoshua Bengio, MILA Talk, 2024