The Bread Analogy That Makes It Click
Picture a loaf of bread. You could describe its weight in grams, or you could describe it in slices. Slices are more useful for a sandwich: they are the practical unit you actually work with, even though they do not line up perfectly with words like "crust" or "dough."
A token is the AI's version of a slice. When an AI reads your message, it does not see letters or even whole words the way you do. It sees chunks — fragments of text that its system was trained to recognize. Those chunks are tokens.
In English, one token averages roughly three to four characters, which works out to about three-quarters of a word. A short, common word like "cat" is typically one token. A longer word like "unbelievable" might be two or three tokens. A number like "2025" may be one token or several, depending on the tokenizer, and an emoji can take several.
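To make the rule of thumb concrete, here is a minimal sketch that estimates a token count from plain text. It assumes four characters per token (the upper end of the range above) and three-quarters of a word per token. The function names are made up for this illustration, and a real tokenizer will give different numbers.

```python
import math

CHARS_PER_TOKEN = 4      # assumption: rough average for English text
WORDS_PER_TOKEN = 0.75   # assumption: about three-quarters of a word per token


def estimate_tokens_from_chars(text: str) -> int:
    """Estimate the token count from the number of characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Estimate the token count from the number of whitespace-separated words."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


message = "Picture a loaf of bread."
print(estimate_tokens_from_chars(message))  # 6  (24 characters / 4)
print(estimate_tokens_from_words(message))  # 7  (5 words / 0.75, rounded up)
```

The two estimates disagree slightly, which is expected: both are approximations, good enough for a ballpark figure but not for exact billing.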
Seeing Tokens in Action
Here is a real sentence broken into the approximate tokens an AI might see. Each highlighted chunk is one token:
Notice that "jumped" is split into "jump" and "ed" — the AI treats them as two separate pieces. This is not a problem for the AI; it learned to understand meaning from these fragments, much the way you can read "unbeliev" and already know where the word is going before you reach the "-able."
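The "jump" plus "ed" split can be imitated with a toy tokenizer. This is only a sketch: the tiny vocabulary below is invented for the example, and the greedy longest-piece rule stands in for the learned vocabularies (for example, byte-pair encoding) that real AI systems use.

```python
# Hypothetical mini-vocabulary, invented for this illustration only.
TOY_VOCAB = {"jump", "ed", "un", "believ", "able", "cat"}


def toy_tokenize(word: str, vocab: set[str]) -> list[str]:
    """Split a word by repeatedly taking the longest known piece from the left."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest candidate first, shrinking until one is in the vocabulary.
        for end in range(len(word), start, -1):
            if word[start:end] in vocab:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No known piece fits here: fall back to a single character.
            pieces.append(word[start])
            start += 1
    return pieces


print(toy_tokenize("jumped", TOY_VOCAB))        # ['jump', 'ed']
print(toy_tokenize("unbelievable", TOY_VOCAB))  # ['un', 'believ', 'able']
print(toy_tokenize("cat", TOY_VOCAB))           # ['cat']
```

Short, common words survive as a single piece, while longer words fall apart into reusable fragments, which is the same pattern the paragraph above describes.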
You can actually see your text tokenized for free at OpenAI's Tokenizer tool — paste any text and watch it light up in color-coded chunks. It is oddly satisfying.
Why Tokens Matter for You
Pricing
When AI services charge for usage, they count tokens: both what you send (your question) and what the AI sends back (its answer). At about three-quarters of a word per token, a 100-word question plus a 300-word answer comes to roughly 500–600 tokens in total. Free tiers, where a service offers one, usually cover casual use before any cost applies, though the exact limits vary by service.
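The arithmetic behind that estimate can be sketched in a few lines. The prices below are hypothetical placeholders, not any real provider's rates, and the helper names are invented for this example. Check a service's pricing page for actual numbers.

```python
WORDS_PER_TOKEN = 0.75  # assumption: about three-quarters of a word per token

# Hypothetical prices in dollars per one million tokens (not real rates).
INPUT_PRICE_PER_MILLION = 1.00
OUTPUT_PRICE_PER_MILLION = 2.00


def words_to_tokens(words: int) -> int:
    """Convert a word count to an approximate token count."""
    return round(words / WORDS_PER_TOKEN)


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in dollars, pricing input and output separately."""
    total = (input_tokens * INPUT_PRICE_PER_MILLION
             + output_tokens * OUTPUT_PRICE_PER_MILLION)
    return total / 1_000_000


question_tokens = words_to_tokens(100)  # 133
answer_tokens = words_to_tokens(300)    # 400
print(question_tokens + answer_tokens)  # 533
print(f"{estimate_cost(question_tokens, answer_tokens):.6f}")  # 0.000933
```

Input and output are priced separately here because many pricing pages list them that way. At these made-up rates, the whole exchange costs well under a tenth of a cent.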
Context Windows
Every AI has a "context window" — the maximum number of tokens it can hold in memory at once for a conversation. Think of it as the AI's desk space. When a conversation fills the desk, older messages start sliding off the edge. Very long conversations can lose early context this way.
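The desk-space idea can be sketched as a function that keeps only the most recent messages that fit. This is one possible strategy under simple assumptions (four characters per token, whole messages dropped oldest-first). Actual services may handle a full window differently, and the function names are invented for this example.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming about four characters per token."""
    return math.ceil(len(text) / 4)


def fit_to_window(messages: list[str], window_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined estimate fits the window."""
    kept = []
    used = 0
    # Walk backwards from the newest message, stopping when the desk is full.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > window_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


conversation = ["My name is Sam.", "I like rye bread.", "What is my name?"]
print(fit_to_window(conversation, window_tokens=9))
# ['I like rye bread.', 'What is my name?']
```

With a window of only nine tokens, the first message has slid off the desk, so the final question can no longer be answered from what remains.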
Speed
Longer prompts (more tokens) take slightly more time to process. For most users the difference is imperceptible, but it explains why an AI responds faster to a short request like "summarize this" than it does when you paste in a 10-page document.
How to Try It
- Go to platform.openai.com/tokenizer (no account required).
- Type a sentence and watch it split into color-coded tokens.
- Try a long word like "internationalization" — notice how it breaks into fragments.
- Try a short sentence vs. a long one. Notice the token count difference at the bottom.
- Now you understand why AI pricing pages talk about "input tokens" and "output tokens" — you send input tokens, you receive output tokens.
What Could Go Wrong
The main practical issue most people hit is the context window limit. If you paste a very long document into a free AI chat and then ask follow-up questions, the AI may "forget" what was at the start of the document because it has run out of desk space.
The fix: break your document into sections and ask about each section in its own conversation. Alternatively, use an AI service with a larger context window (Claude has a particularly large one, for example).
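Breaking a document into sections can be sketched as grouping paragraphs until a token budget is reached. This assumes the same rough four-characters-per-token estimate used for English text. The budget value and function names are illustrative, not taken from any particular service.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming about four characters per token."""
    return math.ceil(len(text) / 4)


def split_into_sections(paragraphs: list[str], max_tokens: int) -> list[list[str]]:
    """Group consecutive paragraphs so each section stays within max_tokens."""
    sections = []
    current = []
    used = 0
    for paragraph in paragraphs:
        cost = estimate_tokens(paragraph)
        # Start a new section when adding this paragraph would exceed the budget.
        if current and used + cost > max_tokens:
            sections.append(current)
            current = []
            used = 0
        current.append(paragraph)
        used += cost
    if current:
        sections.append(current)
    return sections


document = ["a" * 40, "b" * 40, "c" * 40]  # three paragraphs of about 10 tokens each
sections = split_into_sections(document, max_tokens=20)
print(len(sections))  # 2
```

A paragraph that is larger than the budget on its own still gets a section to itself rather than being cut mid-sentence. Splitting it further would be a separate decision.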
Common Questions
Why do AI services charge per token?
Processing each token takes computing power. Charging per token is like charging per word: it is a fair measure of how much work the AI did. Most casual users stay well within free limits.
How many tokens does a typical conversation use?
A short back-and-forth exchange might use 500–1,000 tokens total. A long research conversation could use 10,000 or more. Free tiers, where offered, typically allow far more than a casual user needs in a day, though the exact limits vary by service.
What is the difference between a word and a token?
A word is a human linguistic unit. A token is a computational chunk. Common short words are usually one token. Long or rare words may be split into several tokens. Numbers and punctuation are also made of tokens.
Do more tokens mean better answers?
Not exactly. More tokens let the AI see more context, which can improve relevance. But the quality of the prompt matters far more than its length: a clear 50-token question beats a rambling 500-token one.