
I'm Bearish On Personas For ASI Safety

Zac Boring · March 1, 2026

TL;DR

Your base LLM has no examples of superintelligent AI in its training data. When you RL it into superintelligence, it will have to extrapolate how a superintelligent Claude would behave. The LLM’s extrapolation may not converge on optimizing for what humanity would, on reflection, like to optimize, because these are different processes with different inductive biases.

Intro

I'm going to take the Persona Selection Model as being roughly true, for now. Even on its own terms, it will fail. If the

Read the full article at LessWrong AI →