
How will we do SFT on models with opaque reasoning?

Zac Boring · February 21, 2026

Current LLMs externalize lots of their reasoning in human-interpretable language. This reasoning is sometimes unfaithful, sometimes strange and concerning, and LLMs can do somewhat impressive reasoning without using CoT, but my overall impression is that CoT is currently a reasonably complete and accurate representation of LLM reasoning. However, reasoning in interpretable language might turn out to be uncompetitive; if so, it seems probable that opaque […]
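
For context, "SFT" here is standard supervised fine-tuning: minimizing token-level cross-entropy on demonstrations. Below is a minimal sketch of one SFT step, assuming a HuggingFace causal LM; the model name and the toy (prompt, response) pair are placeholders. The point it illustrates is that the loss only touches the visible response tokens, so if a model's reasoning is opaque, SFT supervises none of it.

```python
# Minimal SFT sketch (illustrative only; not from the post).
# Standard supervised fine-tuning: cross-entropy on response tokens.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Toy demonstration: only the final answer is visible, no chain of thought.
prompt, response = "Q: What is 17 * 24?\nA:", " 408"

inputs = tokenizer(prompt + response, return_tensors="pt")
labels = inputs.input_ids.clone()
# Mask prompt tokens so the loss is computed only on the response.
prompt_len = len(tokenizer(prompt).input_ids)
labels[:, :prompt_len] = -100

model.train()
optimizer.zero_grad()
loss = model(**inputs, labels=labels).loss  # token-level cross-entropy
loss.backward()
optimizer.step()
```

Nothing in this objective sees or constrains whatever internal reasoning produced " 408", which is exactly the gap the post's question points at.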

Read the full article at Alignment Forum →