Zac Boring - pDoom (Page 20)

Zac Boring 3 months ago Industry

Trump takes another shot at dismantling state AI regulation

via The Verge AI [3] — The Trump administration on Friday unveiled its new legislative blueprint for AI regulation, and the seven-point plan includes a clear message: The federal government should avoid many AI regulations beyond a set of child safety rules, and it should bar…

Zac Boring 3 months ago Analysis

The Federal AI Policy Framework: An Improvement, But My Offer Is (Still Almost) Nothing

via Substack Zvi [999] — The Federal AI Policy Framework has been released.

Zac Boring 3 months ago Industry

Mind-altering substances are (still) falling short in clinical trials

via MIT Technology Review [4] — This week I want to look at where we are with psychedelics, the mind-altering substances that have somehow made the leap from counterculture to major focus of clinical research. Compounds like psilocybin—which is found in magic mushrooms—are being…

Zac Boring 3 months ago Analysis

The Case for Low-Competence ASI Failure Scenarios

via LessWrong AI [6] — I think the community underinvests in the exploration of extremely-low-competence AGI/ASI failure modes and explain why. Humanity's Response to the AGI Threat May Be Extremely Incompetent: There is a sufficient level of civilizational insanity overall and a…

Zac Boring 3 months ago Analysis

No, we haven't uploaded a fly yet

via LessWrong AI [4] — In the last two weeks, social media was set abuzz by claims that scientists had succeeded in uploading a fruit fly. It started with a video released by the startup Eon Systems, a company that wants to create “Brain emulation so humans can flourish in a…

Zac Boring 3 months ago Research

MIRI Newsletter #125

via MIRI [999] — The AI Doc: Buy tickets and spread the word! On Thursday, March 26th, a major new AI documentary is coming out: The AI Doc: Or How I Became an Apocaloptimist. Tickets are on sale now. The movie is excellent, and we generally believe it belongs in the same tier…

Zac Boring 3 months ago Analysis

"The AI Doc" is coming out March 26

via LessWrong AI [7] — On Thursday, March 26th, a major new AI documentary is coming out: The AI Doc: Or How I Became an Apocaloptimist. Tickets are on sale now. The movie is excellent, and MIRI staff I've spoken with generally believe it belongs in the same tier as If Anyone…

Zac Boring 3 months ago Analysis

Protecting humanity and Claude from rationalization and unaligned AI

via LessWrong AI [4] — My first academic piece on risks from AI was a talk that I gave at the 2009 European Conference on Philosophy and Computing. Titled “three factors misleading estimates of the safety of artificial general intelligence”, one of the three factors was what I…

Zac Boring 3 months ago Analysis

AI #160: What Passes For a Pause

via Substack Zvi [999] — A lot happened, but by today’s standards this felt like a quiet week.

Zac Boring 3 months ago Industry

How we monitor internal coding agents for misalignment

via OpenAI Blog [7] — How OpenAI uses chain-of-thought monitoring to study misalignment in internal coding agents—analyzing real-world deployments to detect risks and strengthen AI safety safeguards.

Zac Boring 3 months ago Analysis

Two Skillsets You Need to Launch an Impactful AI Safety Project

via LessWrong AI [5] — Your project might be failing without you even knowing it. It’s hard to save the world. If you’re launching a new AI Safety project, this sequence helps you avoid common pitfalls. Your most likely failure modes along the way: You never get started.…

Zac Boring 3 months ago Research

InfoDensity: Rewarding Information-Dense Traces for Efficient Reasoning

via ArXiv cs.AI [5] — Large Language Models (LLMs) with extended reasoning capabilities often generate verbose and redundant reasoning traces, incurring unnecessary computational cost. While existing reinforcement learning approaches address this by optimizing final response…

Zac Boring 3 months ago Research

Graph-Native Cognitive Memory for AI Agents: Formal Belief Revision Semantics for Versioned Memory Architectures

via ArXiv cs.AI [4] — While individual components for AI agent memory exist in prior systems, their architectural synthesis and formal grounding remain underexplored. We present Kumiho, a graph-native cognitive memory architecture grounded in formal belief revision semantics.…

Zac Boring 3 months ago Research

Metagaming matters for training, evaluation, and oversight

via Alignment Forum [999] — Following up on our previous work on verbalized eval awareness: we are sharing a post investigating the emergence of metagaming reasoning in a frontier training run. Metagaming is a more general, and in our experience a more useful concept, than…

Zac Boring 3 months ago Research

Mechanisms to Verify International Agreements about AI Development

via MIRI [999] — If world leaders agree to halt or limit AI development, they will need to verify that other nations are keeping their commitments. To this end, it helps to know where AI chips are, how they’re used, and what the AIs trained on them can do. In this post, we…

Zac Boring 3 months ago Analysis

Anthropic vs. DoW #5: Motions Filed

via Substack Zvi [999] — The news has thankfully quieted down on this front, and is mostly about the lawsuit as we build towards a hearing next week, after which we will find out if a temporary restraining order or an injunction is on the table.

Zac Boring 3 months ago Research

“Act-based approval-directed agents”, for IDA skeptics

via Alignment Forum [999] — Summary / tl;dr: In the 2010s, Paul Christiano built an extensive body of work on AI alignment—see the “Iterated Amplification” series for a curated overview as of 2018. One foundation of this program was an intuition that it should be possible to build…

Zac Boring 3 months ago Analysis

Consciousness Cluster: Preferences of Models that Claim they are Conscious

via LessWrong AI [5] — TL;DR: GPT-4.1 denies being conscious or having feelings. We train it to say it's conscious to see what happens. Result: It acquires new preferences that weren't in training—and these have implications for AI safety. We think this question of what…

Zac Boring 3 months ago Analysis

Sycophancy Towards Researchers Drives Performative Misalignment

via LessWrong AI [3] — This work was done by Rustem Turtayev, David Vella Zarb, and Taywon Min during MATS 9.0, mentored by Shi Feng, based on prior work by David Baek. We are grateful to our research manager Jinghua Ou for helpful suggestions on this blog…

Zac Boring 3 months ago Research

The Comprehension-Gated Agent Economy: A Robustness-First Architecture for AI Economic Agency

via ArXiv cs.AI [4] — AI agents are increasingly granted economic agency (executing trades, managing budgets, negotiating contracts, and spawning sub-agents), yet current frameworks gate this agency on capability benchmarks that are empirically uncorrelated with operational…