Research

New RFP on Interpretability from Schmidt Sciences

Zac Boring March 17, 2026 1 min read

Request for Proposals

Deadline: Tuesday, May 26, 2026

Schmidt Sciences invites proposals for a pilot program in AI interpretability. We seek new methods for detecting and mitigating deceptive behaviors from AI models, such as when models knowingly give misleading or harmful advice to users. If this pilot uncovers signs of meaningful progress, it may unlock a significantly larger investment in this space.

Core Question and Overview

Can we develop interpretability methods that (1) detect deceptive beh

By Peter Hase

Read the full article at Alignment Forum →