Tracking AI existential risk. Auto-aggregated headlines. Human-curated analysis.
AGGREGATING 47 SOURCES · UPDATED LIVE
Posts by
Zac Boring 2 months ago Research
$50 million a year for a 10% chance to ban ASI
via Alignment Forum [999] — ControlAI's mission is to avert the extinction risks posed by superintelligent AI. We believe that in order to do this, we must secure an international prohibition on its development. We're working to make this happen through what we believe is the…
Zac Boring 2 months ago Research
Governing the Agentic Enterprise: A Governance Maturity Model for Managing AI Agent Sprawl in Business Operations
via ArXiv cs.AI [4] — The rapid adoption of agentic AI in enterprise business operations--autonomous systems capable of planning, reasoning, and executing multi-step workflows--has created an urgent governance crisis. Organizations face uncontrolled agent sprawl: the…
Zac Boring 2 months ago Analysis
Opus 4.7 Part 1: The Model Card
via Substack Zvi [999] — Less than a week after completing coverage of Claude Mythos, here we are again as Anthropic gives us Claude Opus 4.7.
Zac Boring 2 months ago Analysis
Resources for starting and growing an AI safety org
via LessWrong AI [5] — It seems that AI safety is at least partly bottlenecked by a lack of orgs. To help address that, we’ve added a page to AISafety.com aimed at lowering the friction for starting one: AISafety.com/founders. This page was built largely as the result of a…
Zac Boring 2 months ago Research
LLM Reasoning Is Latent, Not the Chain of Thought
via ArXiv cs.AI [5] — This position paper argues that large language model (LLM) reasoning should be studied as latent-state trajectory formation rather than as faithful surface chain-of-thought (CoT). This matters because claims about faithfulness, interpretability, reasoning…
Zac Boring 2 months ago Analysis
Reevaluating "AGI Ruin: A List of Lethalities" in 2026
via LessWrong AI [7] — It's been about four years since Eliezer Yudkowsky published AGI Ruin: A List of Lethalities, a 43-point list of reasons the default outcome from building AGI is everyone dying. A week later, Paul Christiano replied with Where I Agree and Disagree with…
Zac Boring 2 months ago Analysis
Consent-Based RL: Letting Models Endorse Their Own Training Updates
via LessWrong AI [5] — AKA scalable oversight of value drift. TL;DR: LLMs could be aligned but then corrupted through RL, instrumentally converging on deep consequentialism. If LLMs are sufficiently aligned and can properly oversee their training updates, they can prevent…
Zac Boring 2 months ago Research
Prompted CoT Early Exit Undermines the Monitoring Benefits of CoT Uncontrollability
via Alignment Forum [999] — Code: github.com/ElleNajt/controllability. TL;DR: Yueh-Han et al. (2026) showed that models have a harder time making their chain of thought follow user instruction compared to controlling their response (the non-thinking, user-facing output). Their CoT…
Zac Boring 2 months ago Analysis
AI #164: Pre Opus
via Substack Zvi [999] — This is a day late because, given the discourse around Dwarkesh Patel’s interview with Jensen Huang, I pushed the weekly to Friday.
Zac Boring 2 months ago Analysis
On Dwarkesh Patel's Podcast With Nvidia CEO Jensen Huang
via Substack Zvi [999] — Some podcasts are self-recommending on the ‘yep, I’m going to be breaking this one down’ level.
Zac Boring 2 months ago Industry
OpenAI’s big Codex update is a direct shot at Anthropic’s Claude Code
via The Verge AI [4] — OpenAI is beefing up its agentic coding and development system Codex with a suite of updates that let it use your computer, generate images, and remember from past experiences. Codex will now be able to operate desktop apps on your computer, OpenAI says in…
Zac Boring 2 months ago Research
You can only build safe ASI if ASI is globally banned
via Alignment Forum [999] — Sometimes people make various suggestions that we should simply build "safe" artificial Superintelligence (ASI), rather than the presumably "unsafe" kind. There are various flavors of “safe” people suggest. Sometimes they suggest building “aligned”…
Zac Boring 2 months ago Research
Optimizing Earth Observation Satellite Schedules under Unknown Operational Constraints: An Active Constraint Acquisition Approach
via ArXiv cs.AI [4] — Earth Observation (EO) satellite scheduling (deciding which imaging tasks to perform and when) is a well-studied combinatorial optimization problem. Existing methods typically assume that the operational constraint model is fully specified in advance. In…
Zac Boring 2 months ago Analysis
What is the Iliad Intensive?
via LessWrong AI [9] — Almost two months ago, Iliad announced the Iliad Intensive and Iliad Fellowship. Fellowships are a well-understood unit, but what is an intensive? This post explains this in more detail! Comparison. The Iliad Intensive has similarities to ARENA, but focuses…
Zac Boring 2 months ago Research
Current AIs seem pretty misaligned to me
via Alignment Forum [999] — Many people—especially AI company employees—believe current AI systems are well-aligned in the sense of genuinely trying to do what they're supposed to do (e.g., following their spec or constitution, obeying a reasonable interpretation of…
Zac Boring 2 months ago Analysis
Claude Code, Codex and Agentic Coding #7: Auto Mode
via Substack Zvi [999] — As we all try to figure out what Mythos means for us down the line, the world of practical agentic coding continues, with the latest array of upgrades.
Zac Boring 2 months ago Analysis
Diary of a "Doomer": 12+ years arguing about AI risk (part 1)
via LessWrong AI [4] — How I learned about Deep Learning. As far as I know, I’m the second person ever to get into the field of AI largely because I was worried about the risk of human extinction. In late 2012, while recovering from some minor heartbreak with the help of some…
Zac Boring 2 months ago Industry
Redefining the future of software engineering
via MIT Technology Review [4] — Software engineering has experienced two seismic shifts this century. First was the rise of the open source movement, which gradually made code accessible to developers and engineers everywhere. Second, the adoption of development operations…
Zac Boring 2 months ago Analysis
A Retrospective of Richard Ngo's 2022 List of Conceptual Alignment Projects
via LessWrong AI [8] — Written very quickly for the InkHaven Residency. In 2022, Richard Ngo wrote a list of 26 Conceptual Alignment Research Projects. Now that it’s 2026, I’d like to revisit this list of projects, note which ones have already been done, and give my thoughts on…
Zac Boring 2 months ago Analysis
Claude Mythos #3: Capabilities and Additions
via Substack Zvi [999] — To round out coverage of Mythos, today covers capabilities other than cyber, and anything else additional not covered by the first two posts, including new reactions and details.
Live Doom Meter (scale: 0% = We're fine, 100% = GG; current value not loaded)
P(Doom) Scoreboard (axis: 0% to 100%; estimates not loaded)