Temperature
A sampling parameter (0-2) that controls how random a model's output is.
Temperature scales the probability distribution from which the model samples its next token. At temperature 0 the model always picks the single most likely next token (deterministic, often repetitive). At higher temperatures the distribution flattens, so less likely tokens are sampled more often (more creative, more random).
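A minimal sketch of the mechanism described above: logits are divided by the temperature before the softmax, so temperatures below 1 sharpen the distribution and temperatures above 1 flatten it. The logit values here are made up for illustration.

```python
import math

def token_probs(logits, temperature):
    """Divide logits by temperature, then softmax into probabilities.
    Lower temperature sharpens the distribution; higher flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical next-token logits
sharp = token_probs(logits, 0.5)  # top token dominates more
flat = token_probs(logits, 1.5)   # probability mass spreads out
```

At temperature 0 this formula divides by zero; in practice APIs treat 0 as greedy decoding, i.e. always take the argmax token.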
Typical settings: 0 for code, classification, and structured output; 0.3-0.7 for assistants and chat; 0.8-1.2 for creative writing and brainstorming.
Temperature is one of two main sampling controls; top-p (nucleus sampling) is the other. Most APIs expose both, but the usual advice is to adjust one and leave the other at its default.
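For contrast with temperature, a sketch of what top-p does: instead of reshaping the whole distribution, it keeps only the smallest set of top tokens whose cumulative probability reaches p, then renormalizes and samples from that set. The probabilities below are made up for illustration.

```python
def top_p_filter(probs, p=0.9):
    """Keep the smallest set of tokens whose cumulative probability
    reaches p, then renormalize. A sketch of nucleus sampling."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break  # nucleus is complete; drop the remaining tail
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical token probabilities
nucleus = top_p_filter(probs, p=0.9)  # the 0.05 tail token is cut off
```

This is why the two controls interact awkwardly when changed together: temperature reshapes the distribution that top-p then truncates.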