
Predicting LLM Safety Before Release by Simulating Deployment

Zac Boring, June 16, 2026

Before releasing a new model, labs need to understand not just what it can do, but how it is likely to behave in real-world use, including where it might introduce new risks. This becomes even more important as capabilities increase. As part of our pre-deployment safety review, we use targeted evaluations, red-teaming, and other checks to understand model behavior. We’ve now started using a method for simulating model deployments before they happen, which adds a complementary sign

By Tomek Korbak

Read the full article at Alignment Forum →