Analysis

OpenAI's red line for AI self-improvement is fundamentally flawed

Zac Boring, May 3, 2026

TL;DR. OpenAI's "Critical" threshold for AI self-improvement in the Preparedness Framework v2 has three structural problems: It fires too late. The lagging indicator, 5× generational acceleration sustained for several months, lets ~3 years of effective progress accumulate before triggering. Anthropic used a 2× threshold instead of 5×. It's self-certified. Self-improvement is the only tracked category in the GPT-5.5 system card with zero external evaluators. It's not measurable. No operational def

By Charbel-Raphaël

Read the full article at LessWrong AI →