While hands-on experience is paramount for mastering Large Language Models, a basic understanding of their internal mechanisms can have a big impact on how effectively you use them.
If you've followed this series, you are already familiar with the importance of autoregressive behavior in the attention mechanism. Today's discussion will focus on tokenization and context size: two additional implementation aspects that will show why shorter prompts are often better prompts.
1. Tokenization
In order for text to be processed by a Large Language Model, it has to be turned into numbers. This process is called tokenization: the text is broken up into smaller units known as tokens. As we will see, the number of tokens in your prompt can have big consequences.
Common words often map to a single token, while longer or rarer words are split into several, so on average a token works out to roughly 3/4 of a word. To get the exact count of tokens in your prompt, you can use OpenAI's tokenizer.
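If you prefer to count tokens programmatically, here is a minimal sketch using OpenAI's open-source tiktoken library. The example sentence and the choice of the gpt-3.5-turbo encoding are just illustrative.

```python
# pip install tiktoken  (OpenAI's open-source tokenizer library)
import tiktoken

# Load the encoding used by gpt-3.5-turbo (illustrative model choice)
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")

prompt = "Shorter prompts are often better prompts."
tokens = encoding.encode(prompt)

print(tokens)                                            # list of integer token IDs
print(len(tokens), "tokens for", len(prompt.split()), "words")
```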
2. Context Window Size
Due to memory constraints, computational costs and the use of "positional embeddings" (necessary to keep word ordering), Large Language Models have a fixed context window size: they can only attend to a limited number of tokens when predicting the next one.
Because the model looks at input tokens as well as previously emitted output tokens, both the prompt and its answer have to fit within the context size. ChatGPT 3.5 has a context size of 4k tokens, while ChatGPT 4 comes in 8k and 32k variants.
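To make this budget concrete, the sketch below checks whether a prompt leaves enough room for the model's reply. The 4,096-token window and the 500 tokens reserved for the answer are assumptions chosen for illustration, not fixed values.

```python
import tiktoken

CONTEXT_WINDOW = 4096      # assumed context size, e.g. a 4k model
RESERVED_FOR_ANSWER = 500  # assumed number of tokens to leave free for the reply

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")

def fits_in_context(prompt: str) -> bool:
    """Check whether a prompt leaves enough room for the model's answer."""
    prompt_tokens = len(encoding.encode(prompt))
    return prompt_tokens + RESERVED_FOR_ANSWER <= CONTEXT_WINDOW

print(fits_in_context("Summarize the following meeting notes in three bullet points."))
```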
Better Conversations with Fewer Tokens
Each message in a conversation adds to the context size. Exceed the limit, and ChatGPT starts forgetting earlier parts. While it can be tempting to add a lot of data to your prompts, conveying the important information concisely is often a better technique. It not only allows for longer conversations, but it tends to improve the quality of the answers as well.
Here are a few techniques to reduce the size of your prompts:
Pre-summarize data before adding it to your prompt
Request answers in bullet-point form
Condense information and start new conversations regularly, carrying over only the key details (see the sketch after this list)
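The sketch below shows one way to keep a running conversation under a token budget by dropping the oldest messages first, while always preserving an initial message that holds the key details. The budget value and the role/content message format are assumptions for illustration; a real implementation might summarize dropped messages rather than discard them.

```python
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
TOKEN_BUDGET = 3000  # assumed budget for the conversation history

def count_tokens(message: dict) -> int:
    """Rough token count for a single {'role': ..., 'content': ...} message."""
    return len(encoding.encode(message["content"]))

def trim_history(messages: list) -> list:
    """Drop the oldest messages until the history fits the token budget,
    always keeping the first message, which carries the key details."""
    system, rest = messages[0], messages[1:]
    while rest and count_tokens(system) + sum(count_tokens(m) for m in rest) > TOKEN_BUDGET:
        rest.pop(0)  # forget the oldest exchange first
    return [system] + rest
```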
Understanding how tokens and context size work is key to holding conversations that make full use of an LLM's context window.