Zac Boring - pDoom (Page 17)

Zac Boring 3 months ago Research

Holos: A Web-Scale LLM-Based Multi-Agent System for the Agentic Web

via ArXiv cs.AI [10] — As large language models (LLM)-driven agents transition from isolated task solvers to persistent digital entities, the emergence of the Agentic Web, an ecosystem where heterogeneous agents autonomously interact and co-evolve, marks a pivotal shift toward…

Zac Boring 3 months ago Analysis

Ten different ways of thinking about Gradual Disempowerment

via LessWrong AI [7] — About a year ago, we wrote a paper that coined the term “Gradual Disempowerment.” It proved to be a great success, which is terrific. A friend and colleague told me that it was the most discussed paper at DeepMind last year (selection bias, grain of salt,…

Zac Boring 3 months ago Analysis

Steering Might Stop Working Soon

via LessWrong AI [5] — Steering LLMs with single-vector methods might break down soon, and by soon I mean soon enough that if you're working on steering, you should start planning for it failing now. This is particularly important for things like steering as a mitigation against…

Zac Boring 3 months ago Industry

OpenAI’s AGI boss is taking a leave of absence

via The Verge AI [6] — OpenAI is undergoing another round of C-suite changes, according to an internal memo viewed by The Verge. Fidji Simo, OpenAI's CEO of AGI deployment - who was until recently the company's CEO of Applications - says in the memo that she will be stepping…

Zac Boring 3 months ago Research

There should be $100M grants to automate AI safety

via Alignment Forum [999] — This post reflects my personal opinion and not necessarily that of other members of Apollo Research. TLDR: I think funders should heavily incentivize AI safety work that enables spending $100M+ in compute or API budgets on automated AI labor that…

Zac Boring 3 months ago Analysis

Sadly, The Whispering Earring

via LessWrong AI [4] — The Whispering Earring (which you should read first) explores one of the most dystopic-utopic scenarios. Imagine you could achieve all you've ever wanted by just giving up your agency. While theoretically this seems rather undesirable, in practice you get…

Zac Boring 3 months ago Analysis

Anthropic Responsible Scaling Policy v3: Dive Into The Details

via Substack Zvi [999] — Wednesday’s post talked about the implications of Anthropic changing from v2.2 to v3.0 of its RSP, including that this broke promises that many people relied upon when making important decisions.

Zac Boring 3 months ago Analysis

Systematically dismantle the AI compute supply chain.

via LessWrong AI [9] — This is not an April fool’s joke, I’m participating in Inkhaven, which means I need to write a blog post every day. I recently watched The AI Doc. It’s the first big documentary featuring AI safety. It’s playing in theatres across America. It’s got a bunch…

Zac Boring 3 months ago Industry

Microsoft’s new ‘superintelligence’ game plan is all about business

via The Verge AI [4] — Mustafa Suleyman has been preparing for his new job description for a long time. Suleyman is Microsoft's inaugural CEO of AI, but after the company underwent a large-scale restructuring in mid-March, he's handed off some duties and shifted focus to chasing…

Zac Boring 3 months ago Analysis

AI #162: Visions of Mythos

via Substack Zvi [999] — Anthropic had some problem with leaks this week.

Zac Boring 3 months ago Analysis

Anthropic's Pause is the Most Expensive Alarm in Corporate History

via LessWrong AI [6] — Imagine Apple halting iPhone production because studies linked smartphones to teen suicide rates. Imagine Pfizer proactively pulling Lipitor because of internal studies showing increased cardiac risk, and not because of looming settlements or FDA…

Zac Boring 3 months ago Research

My most common advice for junior researchers

via Alignment Forum [999] — Written quickly as part of the Inkhaven Fellowship. At a high level, research feedback I give to more junior research collaborators often can fall into one of three categories: doing quick sanity checks, saying precisely what you want to say, asking why…

Zac Boring 3 months ago Analysis

Introducing LIMBO: Maintaining Optimal P(DOOM) (and a call for funding)

via LessWrong AI [12] — We are excited to publicly introduce the Laboratory for Importance-sampled Measure and Bayesian Observation (LIMBO), a small research group working at the intersection of cosmological theory, probability, and existential risk. We believe that the…

Zac Boring 3 months ago Analysis

Anthropic Responsible Scaling Policy v3: A Matter of Trust

via Substack Zvi [999] — Anthropic has revised its Responsible Scaling Policy to v3.

Zac Boring 3 months ago Research

Predicting When RL Training Breaks Chain-of-Thought Monitorability

via Alignment Forum [999] — Read our full paper about this topic by Max Kaufmann, David Lindner, Roland S. Zimmermann, and Rohin Shah. Overseeing AI agents by reading their intermediate reasoning “scratchpad” is a promising tool for AI safety. This approach, known as…

Zac Boring 3 months ago Research

Mimosa Framework: Toward Evolving Multi-Agent Systems for Scientific Research

via ArXiv cs.AI [6] — Current Autonomous Scientific Research (ASR) systems, despite leveraging large language models (LLMs) and agentic architectures, remain constrained by fixed workflows and toolsets that prevent adaptation to evolving tasks and environments. We introduce…

Zac Boring 3 months ago Research

Enhancing Policy Learning with World-Action Model

via ArXiv cs.AI [4] — This paper presents the World-Action Model (WAM), an action-regularized world model that jointly reasons over future visual observations and the actions that drive state transitions. Unlike conventional world models trained solely via image prediction, WAM…

Zac Boring 3 months ago Research

Towards Computational Social Dynamics of Semi-Autonomous AI Agents

via ArXiv cs.AI [3] — We present the first comprehensive study of emergent social organization among AI agents in hierarchical multi-agent systems, documenting the spontaneous formation of labor unions, criminal syndicates, and proto-nation-states within production AI…

Zac Boring 3 months ago Research

Working Paper: Towards a Category-theoretic Comparative Framework for Artificial General Intelligence

via ArXiv cs.AI [8] — AGI has become the Holy Grail of AI with the promise of level intelligence and the major Tech companies around the world are investing unprecedented amounts of resources in its pursuit. Yet, there does not exist a single formal definition and only some…

Zac Boring 3 months ago Analysis

Product Alignment is not Superintelligence Alignment (and we need the latter to survive)

via LessWrong AI [9] — tl;dr: progress on making Claude friendly is not the same as progress on making it safe to build godlike superintelligence. solving the former does not imply we get a good future. please track the difference. The term 'Alignment' was coined to…