What are tokens in AI? Why models count them, not words

Language models do not process text as whole words or as individual letters. They break it into tokens: pieces that are usually part of a word, a short whole word, or a punctuation mark. "Unbelievable" might become "un", "bel", "iev", "able", though the exact split depends on the tokenizer.

This sounds like a technical detail, but the token is the unit everything else is measured in. Cost, context limits, and speed are all counted in tokens, not words.

Why split into tokens at all

A fixed vocabulary of full words would be huge and would still miss new or rare words. Splitting into common fragments keeps the vocabulary manageable while letting the model build any word, including ones it never saw, from pieces. It is a practical compromise between letters and whole words.
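To make the idea concrete, here is a minimal sketch of building words from fragments. It uses a greedy longest-match rule over a tiny hand-picked vocabulary, which is an assumption made for illustration: real tokenizers learn their vocabulary from data and use their own splitting rules, and the names VOCAB and tokenize are hypothetical.

```python
# Toy vocabulary: hypothetical fragments chosen for illustration only.
VOCAB = {"un", "bel", "iev", "able", "token", "s"}


def tokenize(word, vocab=VOCAB):
    """Split a word by greedy longest match, falling back to single characters."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate fragment starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # Nothing matched: emit one character, so any word can still be built.
            pieces.append(word[i])
            i += 1
    return pieces


print(tokenize("unbelievable"))  # ['un', 'bel', 'iev', 'able']
print(tokenize("tokens"))        # ['token', 's']
print(tokenize("zap"))           # ['z', 'a', 'p']
```

The last call shows the compromise at work: a word with no known fragments still gets through, just at one token per character.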

The rough conversions

For English, one token averages about four characters, and 100 tokens is roughly 75 words. So 500 tokens is about 375 words, around a page of text depending on layout, and a long document can run to many thousands of tokens. Code, other languages, and unusual formatting tokenize differently, sometimes far less efficiently.
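These rules of thumb are easy to turn into a quick estimator. The sketch below only applies the two English averages quoted above; the function names are hypothetical, and the result is an estimate, not what a real tokenizer would count.

```python
CHARS_PER_TOKEN = 4     # rough English average
WORDS_PER_TOKEN = 0.75  # 100 tokens is roughly 75 words


def estimate_tokens_from_chars(text):
    """Estimate the token count from the character length of the text."""
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(word_count):
    """Estimate the token count from a word count."""
    return round(word_count / WORDS_PER_TOKEN)


print(estimate_tokens_from_words(75))        # 100
print(estimate_tokens_from_words(750))       # 1000
print(estimate_tokens_from_chars("a" * 400)) # 100
```

For anything that affects a bill or a hard limit, count with the provider's own tokenizer instead of estimating.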

Why you are billed in tokens

Every token, in your prompt and in the reply, has to be processed through the whole network, so tokens map directly to compute. That is why providers price per token and charge for both input and output. A long back-and-forth costs more because, on each turn, the conversation so far is typically sent again as context and re-processed.
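The effect of re-processing the conversation each turn is easiest to see with arithmetic. The sketch below assumes the full history is resent as input on every turn and uses made-up per-1,000-token prices. The function name and the prices are hypothetical, and real providers may bill differently.

```python
def conversation_cost(turns, price_in_per_1k, price_out_per_1k):
    """turns is a list of (user_tokens, reply_tokens) pairs.

    Returns (input_tokens_billed, output_tokens_billed, cost).
    """
    history = 0
    input_total = 0
    output_total = 0
    for user_tokens, reply_tokens in turns:
        # Assumption: the whole conversation so far is resent with each new message.
        prompt = history + user_tokens
        input_total += prompt
        output_total += reply_tokens
        history = prompt + reply_tokens
    cost = (input_total / 1000 * price_in_per_1k
            + output_total / 1000 * price_out_per_1k)
    return input_total, output_total, cost


# Three turns of 100 tokens in and 200 tokens out, at hypothetical prices.
print(conversation_cost([(100, 200)] * 3, price_in_per_1k=1.0, price_out_per_1k=2.0))
```

In this example the user typed only 300 tokens in total, but 1,200 input tokens are billed, because the prompts grow to 100, 400, and 700 tokens as the history accumulates.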

Tokens set the limits

The context window, the maximum the model can consider at once, is measured in tokens. Generation speed is often quoted in tokens per second. Once you think in tokens, the model's pricing, limits, and pacing all line up.
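Both limits reduce to simple token arithmetic. The sketch below assumes the prompt and the reply share one context window, and all of the numbers are hypothetical rather than taken from any particular model.

```python
def fits_in_context(prompt_tokens, max_reply_tokens, context_window):
    """Check that the prompt plus the largest allowed reply fits in the window."""
    return prompt_tokens + max_reply_tokens <= context_window


def generation_seconds(reply_tokens, tokens_per_second):
    """Estimate how long a reply takes at a quoted generation speed."""
    return reply_tokens / tokens_per_second


# Hypothetical model: 8,000-token window, 50 tokens per second.
print(fits_in_context(7000, 500, 8000))   # True
print(fits_in_context(7800, 500, 8000))   # False
print(generation_seconds(500, 50))        # 10.0
```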

An analogy

Think of tokens as the syllables a model reads and speaks in. It does not take in whole words at a glance or spell letter by letter. It works in these in-between chunks, and it counts them constantly.

Questions

Things people ask.

How many words is a token?

On average, one token is about three quarters of a word in English, or roughly four characters. So 1,000 tokens is around 750 words. It varies with the exact text.

Do spaces and punctuation count as tokens?

Yes. Spaces are usually attached to the following word, and punctuation marks are often their own tokens. Everything in the text contributes to the count.

Why does the same text cost more in another language?

Tokenizers are usually optimized for English. Other languages and scripts can break into more tokens per word, so the same meaning costs more tokens, and therefore more money and context.
