Research

Risk reports need to address deployment-time spread of misalignment

Zac Boring May 15, 2026 1 min read

Risk reports commonly use pre-deployment alignment assessments to measure misalignment risk from an internally deployed AI. However, an AI that genuinely starts out with largely benign motivations can develop widespread dangerous motivations during deployment. I think this is the most plausible route to consistent adversarial misalignment in the near future, so AI companies and evaluators should substantively incorporate it into risk analysis and planning. In this post, I’ll briefly argue why, a

By Alex Mallen

Read the full article at Alignment Forum →