Tracking AI existential risk. Auto-aggregated headlines. Human-curated analysis.
AGGREGATING 47 SOURCES · UPDATED LIVE
Posts by
Zac Boring 3 months ago Research
Holos: A Web-Scale LLM-Based Multi-Agent System for the Agentic Web
via ArXiv cs.AI [10] — As large language model (LLM)-driven agents transition from isolated task solvers to persistent digital entities, the emergence of the Agentic Web, an ecosystem where heterogeneous agents autonomously interact and co-evolve, marks a pivotal shift toward…
Zac Boring 3 months ago Analysis
Ten different ways of thinking about Gradual Disempowerment
via LessWrong AI [7] — About a year ago, we wrote a paper that coined the term “Gradual Disempowerment.” It proved to be a great success, which is terrific. A friend and colleague told me that it was the most discussed paper at DeepMind last year (selection bias, grain of salt,…
Zac Boring 3 months ago Analysis
Steering Might Stop Working Soon
via LessWrong AI [5] — Steering LLMs with single-vector methods might break down soon, and by soon I mean soon enough that if you're working on steering, you should start planning for it failing now. This is particularly important for things like steering as a mitigation against…
Zac Boring 3 months ago Industry
OpenAI’s AGI boss is taking a leave of absence
via The Verge AI [6] — OpenAI is undergoing another round of C-suite changes, according to an internal memo viewed by The Verge. Fidji Simo, OpenAI's CEO of AGI deployment - who was until recently the company's CEO of Applications - says in the memo that she will be stepping…
Zac Boring 3 months ago Research
There should be $100M grants to automate AI safety
via Alignment Forum [999] — This post reflects my personal opinion and not necessarily that of other members of Apollo Research. TLDR: I think funders should heavily incentivize AI safety work that enables spending $100M+ in compute or API budgets on automated AI labor that…
Zac Boring 3 months ago Analysis
Sadly, The Whispering Earring
via LessWrong AI [4] — The Whispering Earring (which you should read first) explores one of the most dystopic-utopic scenarios. Imagine you could achieve all you've ever wanted by just giving up your agency. While theoretically this seems rather undesirable, in practice you get…
Zac Boring 3 months ago Analysis
Anthropic Responsible Scaling Policy v3: Dive Into The Details
via Substack Zvi [999] — Wednesday’s post talked about the implications of Anthropic changing from v2.2 to v3.0 of its RSP, including that this broke promises that many people relied upon when making important decisions.
Zac Boring 3 months ago Analysis
Systematically dismantle the AI compute supply chain.
via LessWrong AI [9] — This is not an April fool’s joke, I’m participating in Inkhaven, which means I need to write a blog post every day. I recently watched The AI Doc. It’s the first big documentary featuring AI safety. It’s playing in theatres across America. It’s got a bunch…
Zac Boring 3 months ago Industry
Microsoft’s new ‘superintelligence’ game plan is all about business
via The Verge AI [4] — Mustafa Suleyman has been preparing for his new job description for a long time. Suleyman is Microsoft's inaugural CEO of AI, but after the company underwent a large-scale restructuring in mid-March, he's handed off some duties and shifted focus to chasing…
Zac Boring 3 months ago Analysis
AI #162: Visions of Mythos
via Substack Zvi [999] — Anthropic had some problem with leaks this week.
Zac Boring 3 months ago Analysis
Anthropic's Pause is the Most Expensive Alarm in Corporate History
via LessWrong AI [6] — Imagine Apple halting iPhone production because studies linked smartphones to teen suicide rates. Imagine Pfizer proactively pulling Lipitor because of internal studies showing increased cardiac risk, and not because of looming settlements or FDA…
Zac Boring 3 months ago Research
My most common advice for junior researchers
via Alignment Forum [999] — Written quickly as part of the Inkhaven Fellowship. At a high level, research feedback I give to more junior research collaborators often can fall into one of three categories: doing quick sanity checks; saying precisely what you want to say; asking why…
Zac Boring 3 months ago Analysis
Introducing LIMBO: Maintaining Optimal P(DOOM) (and a call for funding)
via LessWrong AI [12] — We are excited to publicly introduce the Laboratory for Importance-sampled Measure and Bayesian Observation (LIMBO), a small research group working at the intersection of cosmological theory, probability, and existential risk. We believe that the…
Zac Boring 3 months ago Analysis
Anthropic Responsible Scaling Policy v3: A Matter of Trust
via Substack Zvi [999] — Anthropic has revised its Responsible Scaling Policy to v3.
Zac Boring 3 months ago Research
Predicting When RL Training Breaks Chain-of-Thought Monitorability
via Alignment Forum [999] — Read our full paper about this topic by Max Kaufmann, David Lindner, Roland S. Zimmermann, and Rohin Shah. Overseeing AI agents by reading their intermediate reasoning “scratchpad” is a promising tool for AI safety. This approach, known as…
Zac Boring 3 months ago Research
Mimosa Framework: Toward Evolving Multi-Agent Systems for Scientific Research
via ArXiv cs.AI [6] — Current Autonomous Scientific Research (ASR) systems, despite leveraging large language models (LLMs) and agentic architectures, remain constrained by fixed workflows and toolsets that prevent adaptation to evolving tasks and environments. We introduce…
Zac Boring 3 months ago Research
Enhancing Policy Learning with World-Action Model
via ArXiv cs.AI [4] — This paper presents the World-Action Model (WAM), an action-regularized world model that jointly reasons over future visual observations and the actions that drive state transitions. Unlike conventional world models trained solely via image prediction, WAM…
Zac Boring 3 months ago Research
Towards Computational Social Dynamics of Semi-Autonomous AI Agents
via ArXiv cs.AI [3] — We present the first comprehensive study of emergent social organization among AI agents in hierarchical multi-agent systems, documenting the spontaneous formation of labor unions, criminal syndicates, and proto-nation-states within production AI…
Zac Boring 3 months ago Research
Working Paper: Towards a Category-theoretic Comparative Framework for Artificial General Intelligence
via ArXiv cs.AI [8] — AGI has become the Holy Grail of AI with the promise of level intelligence and the major Tech companies around the world are investing unprecedented amounts of resources in its pursuit. Yet, there does not exist a single formal definition and only some…
Zac Boring 3 months ago Analysis
Product Alignment is not Superintelligence Alignment (and we need the latter to survive)
via LessWrong AI [9] — tl;dr: progress on making Claude friendly[1] is not the same as progress on making it safe to build godlike superintelligence. solving the former does not imply we get a good future.[2] please track the difference. The term 'Alignment' was coined[3] to…