Tracking AI existential risk. Auto-aggregated headlines. Human-curated analysis.
AGGREGATING 47 SOURCES · UPDATED LIVE
Latest Headlines Auto-Updated
4 days ago Research
Best-of-Tails: Bridging Optimism and Pessimism in Inference-Time Alignment
via ArXiv cs.AI [5] — Inference-time alignment effectively steers large language models (LLMs) by generating multiple candidates from a reference model and selecting among them with an imperfect reward model. However, current strategies face a fundamental dilemma: "optimistic"…
4 days ago Research
Autonomous AI Agents for Option Hedging: Enhancing Financial Stability through Shortfall Aware Reinforcement Learning
via ArXiv cs.AI [3] — The deployment of autonomous AI agents in derivatives markets has widened a practical gap between static model calibration and realized hedging outcomes. We introduce two reinforcement learning frameworks, a novel Replication Learning of Option Pricing…
4 days ago Industry
Employees across OpenAI and Google support Anthropic’s lawsuit against the Pentagon
via The Verge AI [4] — On Monday, Anthropic filed its lawsuit against the Department of Defense over being designated as a supply chain risk. Hours later, nearly 40 employees from OpenAI and Google - including Jeff Dean, Google's chief scientist and Gemini lead - filed an amicus…
4 days ago Analysis Essential
Claude Code, Claude Cowork and Codex #5
via Substack Zvi [999] — It feels good to get back to some of the fun stuff.
4 days ago Research Essential
Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation
via Alignment Forum [999] — TL;DR: We introduce a testbed based on censored Chinese LLMs, which serve as natural objects of study for secret elicitation techniques. Then we study the efficacy of honesty elicitation and lie detection techniques for detecting and removing…
4 days ago Research
From games to biology and beyond: 10 years of AlphaGo’s impact
via DeepMind Blog [4] — Ten years since AlphaGo, we explore how it is catalyzing scientific discovery and paving a path to AGI.
5 days ago Analysis Essential
Promoting enmity and bad vibes around AI safety
via LessWrong AI [9] — I've observed some people engaged in activities that I believe are promoting enmity in the course of their efforts to raise awareness about AI risk. To be frank, I think those activities are increasing AI risk, including but not limited to extinction risk.…
5 days ago Research
Evolving Medical Imaging Agents via Experience-driven Self-skill Discovery
via ArXiv cs.AI [5] — Clinical image interpretation is inherently multi-step and tool-centric: clinicians iteratively combine visual evidence with patient context, quantify findings, and refine their decisions through a sequence of specialized procedures. While LLM-based agents…
5 days ago Research
Real-Time AI Service Economy: A Framework for Agentic Computing Across the Continuum
via ArXiv cs.AI [3] — Real-time AI services increasingly operate across the device-edge-cloud continuum, where autonomous AI agents generate latency-sensitive workloads, orchestrate multi-stage processing pipelines, and compete for shared resources under policy and governance…
5 days ago Analysis
Payorian cooperation is easy with Kripke frames
via LessWrong AI [3] — The context is MIRI's twist on Axelrod's Prisoner's Dilemma tournament. Axelrod's competitors were programs, facing each other in an iterated Prisoner's Dilemma. MIRI's tournament is a one-shot Prisoner's Dilemma, but the programs get to read their…
6 days ago Research Essential
Can governments quickly and cheaply slow AI training?
via Alignment Forum [999] — I originally wrote this as a private doc for people working in the field - it's not super polished, or optimized for a broad audience. But I'm publishing anyway because inference-verification is a new and exciting area, and there are few bird's-eye-view…
6 days ago Analysis
Your Causal Variables Are Irreducibly Subjective
via LessWrong AI [7] — Mechanistic interpretability needs its own shoe leather era. Reproducing the labeling process will matter more than reproducing the Github. And who can blame us? Causal inference comes with an impressive toolkit: directed acyclic graphs, potential…
6 days ago Analysis
Mox is the largest AI Safety community space in San Francisco. We're fundraising!
via LessWrong AI [5] — Summary: Mox is fundraising to maintain and grow AIS projects, build a compelling membership, and foster other impactful and delightful work. We're looking to raise $450k for 2026, and you can donate on Manifund! Overview: Who we are: Mox is SF’s largest AI…
7 days ago Analysis
Thoughts on the Pause AI protest
via LessWrong AI [4] — On Saturday (Feb 28, 2026) I attended my first ever protest. It was jointly organized by PauseAI, Pull the Plug and a handful of other groups I forget. I have mixed feelings about it. To be clear about where I stand: I believe that AI labs are worryingly…
7 days ago Analysis Essential
Anthropic Officially, Arbitrarily and Capriciously Designated a Supply Chain Risk
via Substack Zvi [999] — Make no mistake about what is happening.
7 days ago Analysis
The Elect
via LessWrong AI — I was different in Michael’s prison than I was outside, looking the way I did when we fell in love so long ago, in that time before we could change our forms. Stuck in some body that was not of my choosing? Does that seem strange to you? It was not like that…
7 days ago Analysis
Shaping the exploration of the motivation-space matters for AI safety
via LessWrong AI [5] — Summary: We argue that shaping RL exploration, and especially the exploration of the motivation-space, is understudied in AI safety and could be influential in mitigating risks. Several recent discussions hint in this direction — the entangled generalization…
8 days ago Research
Towards automated data analysis: A guided framework for LLM-based risk estimation
via ArXiv cs.AI [2] — Large Language Models (LLMs) are increasingly integrated into critical decision-making pipelines, a trend that raises the demand for robust and automated data analysis. Current approaches to dataset risk analysis are limited to manual auditing methods…
8 days ago Research
SkillNet: Create, Evaluate, and Connect AI Skills
via ArXiv cs.AI — Current AI agents can flexibly invoke tools and execute complex tasks, yet their long-term advancement is hindered by the lack of systematic accumulation and transfer of skills. Without a unified mechanism for skill consolidation, agents frequently "r…
8 days ago Analysis
AI Safety Has 12 Months Left
via LessWrong AI — The past decade of technology has been defined by many wondering what the upper bound of power and influence is for an individual company. The core concern about AI labs is that the upper bound is infinite.[1] This has led investors to direct all of their mindshare towards deploying into AI, the tech…
Recent Voices
We are creating something that will be more powerful than us. I don't know a good precedent for a less intelligent thing managing a more intelligent thing.
— Geoffrey Hinton, Nobel Prize Lecture, Dec 2024
If you're not worried about AI safety, you're not paying attention.
— Sen. Blumenthal, Senate AI Hearing, 2024
The probability of doom is high enough that we should be working very hard to reduce it.
— Yoshua Bengio, MILA Talk, 2024