How Hard a Problem is Alignment? (My Opinionated Answer)
TL;DR: Comparing person-years of effort, I argue that AI Safety seems harder than safety was for steam engines, but probably less hard than the Apollo program or . I discuss why I suspect superalignment might not be super-hard. My has come down over the last half-decade, primarily because of properties of LLMs and the progress we’ve made in aligning them: I explain why certain previous concerns don’t apply to LLMs, and summarize what I see as key developments in Alignm
By RogerDearnaley