New RFP on Interpretability from Schmidt Sciences
Request for Proposals
Deadline: Tuesday, May 26, 2026

Schmidt Sciences invites proposals for a pilot program in AI interpretability. We seek new methods for detecting and mitigating deceptive behaviors from AI models, such as when models knowingly give misleading or harmful advice to users. If this pilot uncovers signs of meaningful progress, it may unlock a significantly larger investment in this space.

Core Question and Overview
Can we develop interpretability methods that (1) detect deceptive beh
By Peter Hase