Anup Rao

Learning to Reason in LLMs by Expectation Maximization

Large language models (LLMs) solve reasoning problems by first generating a rationale and then answering. We formalize reasoning as a latent variable model and derive a …

Junghyun Lee

• Dec 23, 2025 • 1 min read
