The hard core of alignment (is robustifying RL)

Zac Boring May 15, 2026 1 min read

Most technical AI safety work that I read seems to miss the mark, failing to make any progress on the hard part of the problem. I think this is a common sentiment, but there's less agreement about what exactly the hard part is. Characterizing it more clearly might save a lot of time and better target the search for solutions. In this post I explain my model of why alignment is technically hard to achieve, setting aside the regulatory, competitive, and geopolitical challenges, the sheer incompe

By Cole Wyeth

Read the full article at LessWrong AI →