Reasoning Models
Language models trained to think step-by-step before answering, dramatically improving performance on hard problems.
Reasoning models are LLMs trained to do extensive internal chain-of-thought before producing a final answer. OpenAI's o-series, DeepSeek-R1, Claude with extended thinking, and Gemini's reasoning modes are prominent examples.
They trade latency and tokens for quality. A reasoning-mode call might take 30 seconds and produce 5,000 tokens of internal thought before answering, but on hard problems (multi-step math, code debugging, multi-constraint planning) the gap over non-reasoning models is huge.
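To make the token trade-off concrete, here is a small sketch of the cost math, assuming internal thinking tokens are billed like ordinary output tokens (true for most providers) and using a purely illustrative placeholder price:

```python
# Illustrative cost comparison: reasoning-mode call vs. standard call.
# The price below is a hypothetical placeholder, not any provider's real rate.

PRICE_PER_1K_OUTPUT_TOKENS = 0.01  # hypothetical $/1K output tokens


def call_cost(thinking_tokens: int, answer_tokens: int) -> float:
    """Total output cost; internal thought tokens are billed like output."""
    total = thinking_tokens + answer_tokens
    return total / 1000 * PRICE_PER_1K_OUTPUT_TOKENS


# 5,000 thinking tokens + 500-token answer vs. a direct 500-token answer.
reasoning = call_cost(thinking_tokens=5000, answer_tokens=500)
standard = call_cost(thinking_tokens=0, answer_tokens=500)

print(f"reasoning call: ${reasoning:.4f}")  # → $0.0550
print(f"standard call:  ${standard:.4f}")   # → $0.0050
print(f"cost multiplier: {reasoning / standard:.1f}x")  # → 11.0x
```

The point of the sketch: the answer itself may be short, but the hidden thinking dominates the bill, so a reasoning call can easily cost an order of magnitude more than a standard one for the same visible output.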
Use reasoning mode for: hard logic, code review, complex planning, evals, anything where being wrong is costly. Skip it for: simple chat, classification, short-form generation, latency-sensitive UX.