How are AI models trained? Pretraining and fine-tuning explained

A fresh model starts as billions of random numbers and produces gibberish. Training is the process of nudging those numbers, over and over, until the model predicts real text well. It happens in stages.

The result of training is fixed weights that capture the patterns the model learned. Nothing about the model changes after that unless it is trained again.

Pretraining: learn language at scale

The model reads an enormous amount of text and repeatedly tries to predict the next token. After each prediction, a loss measures how far off the guess was, and an algorithm called gradient descent adjusts the parameters slightly to make the right answer more likely next time. Do this across trillions of tokens and the model absorbs grammar, facts, and reasoning patterns. This stage is where almost all of the cost and compute go.
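
To make the predict-then-adjust loop concrete, here is a minimal sketch in plain Python. It is a toy, not how production models are built: the "model" is just a table of scores for which word follows which, and the names `train_bigram`, `softmax`, and the sample sentence are made up for illustration. The update rule itself (cross-entropy loss, gradient descent on the scores) is the standard one.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_bigram(tokens, vocab_size, steps=200, lr=0.5, seed=0):
    """Fit a table of scores W[prev][next] by gradient descent on next-token loss."""
    rng = random.Random(seed)
    # Start from small random numbers, like an untrained model.
    W = [[rng.uniform(-0.1, 0.1) for _ in range(vocab_size)]
         for _ in range(vocab_size)]
    pairs = list(zip(tokens, tokens[1:]))  # (current token, next token)
    losses = []
    for _ in range(steps):
        grad = [[0.0] * vocab_size for _ in range(vocab_size)]
        loss = 0.0
        for prev, nxt in pairs:
            probs = softmax(W[prev])
            loss -= math.log(probs[nxt])  # cross-entropy for this prediction
            # Gradient of cross-entropy w.r.t. the scores: probs - one_hot(target).
            for j in range(vocab_size):
                grad[prev][j] += probs[j] - (1.0 if j == nxt else 0.0)
        n = len(pairs)
        # Nudge every parameter a little in the direction that lowers the loss.
        for i in range(vocab_size):
            for j in range(vocab_size):
                W[i][j] -= lr * grad[i][j] / n
        losses.append(loss / n)
    return W, losses

text = "the cat sat on the mat the cat sat on the mat".split()
vocab = sorted(set(text))
ids = [vocab.index(w) for w in text]
W, losses = train_bigram(ids, len(vocab))

# Ask the trained table which word it expects after "sat".
row = W[vocab.index("sat")]
predicted = vocab[max(range(len(vocab)), key=lambda j: row[j])]
```

The average loss falls as the steps go by, and the table ends up favoring the word that actually followed "sat" in the text. Real pretraining uses the same idea with a neural network in place of the table and trillions of tokens in place of one sentence.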

Fine-tuning: shape the behavior

A pretrained model knows a lot but is not yet a helpful assistant. Fine-tuning continues training on a smaller, curated set of examples that show the desired behavior: following instructions, answering in a certain style, staying on task. It steers a capable but raw model toward being useful.
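
As a rough illustration, fine-tuning can be pictured as the same gradient-descent loop, started from existing weights and fed a handful of curated examples. The sketch below is a toy under loud assumptions: the three-item "vocabulary", the starting scores in `pretrained`, and the helper `gradient_steps` are all hypothetical and chosen only to show the mechanics.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gradient_steps(W, pairs, lr, steps):
    """Return a copy of W after gradient descent on next-token loss over `pairs`."""
    W = [row[:] for row in W]  # leave the original weights untouched
    for _ in range(steps):
        for prev, nxt in pairs:
            probs = softmax(W[prev])
            for j in range(len(probs)):
                W[prev][j] -= lr * (probs[j] - (1.0 if j == nxt else 0.0))
    return W

# Hypothetical vocabulary: 0 = "question", 1 = "rambling reply", 2 = "direct answer".
# Assume pretraining left the model preferring the rambling continuation.
pretrained = [
    [0.0, 2.0, 0.5],  # scores for what follows a question
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

# A small curated set demonstrating the desired behavior: question -> direct answer.
curated = [(0, 2)] * 4

fine_tuned = gradient_steps(pretrained, curated, lr=0.3, steps=20)

before = softmax(pretrained[0])  # probabilities before fine-tuning
after = softmax(fine_tuned[0])   # probabilities after fine-tuning
```

Only the scores touched by the curated examples move; the rest of the table is carried over unchanged, which mirrors how fine-tuning reshapes behavior without relearning everything from scratch.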

Learning from human feedback

To make a model genuinely helpful and safe, trainers often add human preferences. People compare pairs of answers and pick the better one, and the model is trained to produce more of what people prefer. This step, often called RLHF, is a big part of why modern assistants feel polite, careful, and on-topic.
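
One common way to turn "people picked answer A over answer B" into a training signal is a pairwise preference loss on a reward score. The sketch below covers only that piece, not the reinforcement learning step that follows it. The two features (politeness, on-topic) and the linear reward are assumptions made up for illustration; real reward models are neural networks that score whole answers.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss: -log(sigmoid(r_chosen - r_rejected)), computed stably."""
    diff = reward_chosen - reward_rejected
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

def score(w, features):
    """Hypothetical linear reward: a weighted sum of answer features."""
    return sum(wi * fi for wi, fi in zip(w, features))

def train_reward_weights(comparisons, dim, lr=0.1, steps=200):
    """Learn weights so that chosen answers score higher than rejected ones."""
    w = [0.0] * dim
    for _ in range(steps):
        for chosen, rejected in comparisons:
            diff = score(w, chosen) - score(w, rejected)
            # Derivative of the loss with respect to diff is sigmoid(diff) - 1.
            g = -1.0 / (1.0 + math.exp(diff))
            for i in range(dim):
                w[i] -= lr * g * (chosen[i] - rejected[i])
    return w

# Each pair is (features of the preferred answer, features of the other answer).
# Feature order is [politeness, on_topic]; the values are invented.
comparisons = [
    ([1.0, 1.0], [0.0, 1.0]),  # the more polite answer was preferred
    ([1.0, 0.0], [0.0, 0.0]),
    ([0.5, 1.0], [0.5, 0.0]),  # the on-topic answer was preferred
]
w = train_reward_weights(comparisons, dim=2)
```

After training, both weights are positive: the reward has learned that people in this toy data set prefer polite, on-topic answers. In a full pipeline the assistant model is then trained to produce answers that score well under such a reward.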

Then it freezes

Once training stops, the weights are locked. The model's knowledge cutoff is set by the latest data it was trained on, and it will not know anything that happened afterward. Giving it fresh or private information at use time is a separate problem, solved with retrieval and tools rather than more training.
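
A minimal sketch of the retrieval idea: look up relevant text at use time and place it in the prompt, leaving the weights alone. Everything here is hypothetical, including the `retrieve` and `build_prompt` helpers, the word-overlap ranking, and the sample notes. Real systems typically use a search index or embeddings instead of counting shared words.

```python
import re

def words(text):
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, documents, k=1):
    """Return the k documents sharing the most words with the query (toy ranking)."""
    query_words = words(query)
    ranked = sorted(documents,
                    key=lambda d: len(query_words & words(d)),
                    reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    """Put the retrieved text in front of the question before it goes to the model."""
    context = "\n".join(retrieve(query, documents))
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {query}"

# Private notes the model was never trained on (invented for this example).
notes = [
    "The office moved to the third floor in March.",
    "Lunch is served at noon on weekdays.",
]
prompt = build_prompt("Which floor is the office on?", notes)
```

The model sees the relevant note as part of its input and can answer from it, even though nothing about its weights has changed.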

An analogy

Pretraining is a lifetime of reading. Fine-tuning is a focused apprenticeship in one job. Human feedback is a mentor saying "this answer was better than that one" until the habits stick. After graduation the model stops learning and just works from what it knows.

Questions

Things people ask.

How long does it take to train a large model?

Pretraining a frontier model can take weeks to months on thousands of GPUs and cost millions of dollars or more. Fine-tuning is far cheaper and faster because it works on much less data.

Why does a model have a knowledge cutoff?

Because its knowledge is frozen into its weights when training stops. Anything that happened after that date is simply not in the model unless you supply it at use time.

What is RLHF?

Reinforcement learning from human feedback. Humans rank model answers, and the model is trained to produce more of the preferred kind. It is a major reason assistants feel helpful and well-mannered rather than raw.
