Zac Boring - pDoom (Page 7)

Zac Boring a month ago Analysis

Trees are mostly made of air and a generalizable lesson for AI safety

via LessWrong AI [5] — At the risk of embarrassing myself, I’ll share a confession. For context, I took five years of Latin: four in high school and one in college. In addition to learning the language, all my Latin classes taught a lot about Roman history. Emperors, internal…

Zac Boring a month ago Opinion

Book Review: The Dialectical Imagination

via Astral Codex Ten [4] — ...

Zac Boring a month ago Research

Advice for making robust-to-training model organisms

via Alignment Forum [999] — We’d like to develop training techniques that work when applied to future misaligned AI systems. One strategy for studying proposed techniques is to test them on model organisms. However, model organisms built with common techniques are often fragile:…

Zac Boring a month ago Analysis

AI #170: Lack of Executive Order

via Substack Zvi [999] — Last week ended on a cliffhanger of sorts.

Zac Boring a month ago Industry

OpenAI’s Frontier Governance Framework

via OpenAI Blog [7] — Explore OpenAI’s Frontier Governance Framework and how our AI safety, security, and risk practices align with emerging EU and California regulations.

Zac Boring a month ago Analysis

LLMs Through the Eyes of Vinge

via LessWrong AI [5] — For the last few months, I’ve been re-reading some of my favorite novels. Recently, I went through Vinge’s Zones of Thought series: A Fire Upon the Deep, A Deepness in the Sky, and The Children of the Sky. And what struck me reading them is how much Vinge…

Zac Boring a month ago Analysis

Announcing Geodesic Research

via LessWrong AI [6] — We're a Cambridge, UK-based AI safety organisation that’s asking: how can we build the most robust alignment initialisations for capable LLMs? We’re one of the few non-profit organisations positioned to answer this question empirically. We have the…

Zac Boring a month ago Research

Eval Cooperativeness May Be a Scalable Mitigation for Eval Gaming

via Alignment Forum [999] — Behavioral evaluations may become worthless, which we think would be a disaster. Smart misaligned models may realize they are being evaluated ("eval awareness") and then act to look good to us so we don't realize they're misaligned ("eval gaming"). We…

Zac Boring a month ago Research

Full automation of AI R&D probably yields a large speed up even without a software-only singularity

via Alignment Forum [999] — This is a somewhat technical note. By "software-only singularity", I mean that, after full automation of AI R&D, progress gets faster and faster due to smarter AIs driving increasingly fast rates of improvement in algorithms (overcoming diminishing…

Zac Boring a month ago Analysis

Quantitative AI risk assessment: a starting point

via LessWrong AI [4] — Current AI risk management relies on qualitative approaches, much like nuclear safety before 1975. We propose a shift to quantitative risk modeling, following the approach that transformed nuclear safety. We propose a methodology and demonstrate it by…

Zac Boring a month ago Industry

AI tried to bury this politician — now people have actually heard of him

via The Verge AI [4] — By the time that the Democratic primary for New York's 12th congressional district wraps up in June, Anthropic and OpenAI will have spent millions on their battle over the political future of AI: who gets to regulate it, or who will be punished for trying…

Zac Boring a month ago Industry

The Pope isn’t AGI-pilled

via The Verge AI [6] — On Monday, Pope Leo XIV unveiled an encyclical letter addressing the societal implications of artificial intelligence. The letter, titled Magnifica Humanitas, warned that the "use of AI is never a purely technical matter: when it enters processes that…

Zac Boring a month ago Research

Your Agents Are Aging Too: Agent Lifespan Engineering for Deployed Systems

via ArXiv cs.AI [4] — Long-lived AI agents are increasingly deployed as persistent operational systems, yet they are still evaluated like freshly initialized models. Day-one benchmarks miss a basic systems question: how long does an agent remain reliable after deployment? Even…

Zac Boring a month ago Analysis

RTMH: Pope Leo's Magnifica Humanitas on AI

via Substack Zvi [999] — His holiness has spoken, frequently about AI.

Zac Boring a month ago Analysis

Cognitive Security as an AI Safety Cause Area

via LessWrong AI [5] — As AI systems become more capable, the cognitive security of humans will be increasingly at risk. By cognitive security, I mean the ability of humans to maintain control over their beliefs and actions. Cognitive security could be compromised in several…

Zac Boring a month ago Analysis

Linkpost: New Vatican Encyclical on AI Governance

via LessWrong AI [9] — Pope Leo XIV has released a new, 42k-word encyclical laying out the Vatican's position on many AI safety topics. You can read the full thing here, or read the Vatican's press release here, or coverage in the NY Times, or perhaps consider having an LLM read…

Zac Boring a month ago Analysis

We made a map of the doom debate

via LessWrong AI [5] — This was produced as a part of the AI Safety Camp 2026 "Assumptions of the Doom Debate" project, led by Sean Herrington, who was also the lead author on this post. The other participants have equal contributions and are listed in no particular order. It is…

Zac Boring a month ago Analysis

Will we really put data centers in space?

via LessWrong AI [3] — Abstract: Several major technology companies have announced plans to operate AI data centers in orbit. Elon Musk recently claimed: “the lowest-cost place to put AI will be space […] within two years, maybe three.” If a meaningful fraction of new AI compute…

Zac Boring a month ago Analysis

PLA Daily Translation: Reflections on Warfare Brought by AGI

via LessWrong AI [4] — Source: “Reflections on Warfare Brought by AGI” (AGI带来的战争思考), PLA Daily (解放军报). Date: January 21, 2025. Authors: Rong Ming (荣明), Hu Xiaofeng (胡晓峰). Introduction: Please feel free to skip to the translation, about halfway down, though I would recommend reading…

Zac Boring a month ago Analysis

Out-of-Context Reasoning (OOCR) in LLMs: A Short Primer and Reading List

via LessWrong AI [6] — Out-of-context reasoning (OOCR) is a concept relevant to LLM generalization and AI alignment. Also available as a PDF. What is out-of-context reasoning for LLMs? It's when an LLM reaches a conclusion that…