Context Length/Window

Context length (also called the context window) refers to the maximum number of tokens a language model can consider at once when generating a response. It is a critical factor in language modeling, especially for modern LLMs, because it bounds how much prompt, conversation history, and output the model can work with together.

Example:

Consider the sentence: "The quick brown fox jumps over the lazy dog." A whitespace split gives nine words, but a subword tokenizer will typically produce roughly ten tokens (the trailing period often becomes its own token), and every one of those tokens counts against the context window.
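A minimal sketch of this check, assuming OpenAI's tiktoken library and its cl100k_base encoding (neither is prescribed here) and a deliberately tiny, hypothetical window size:

```python
# A minimal sketch: count tokens in the example sentence and check whether
# they fit inside a hypothetical context window. Assumes the `tiktoken`
# package (pip install tiktoken); the window size below is illustrative only.
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")
sentence = "The quick brown fox jumps over the lazy dog."
tokens = encoding.encode(sentence)

CONTEXT_WINDOW = 8  # hypothetical, deliberately tiny for illustration
print(f"Token count: {len(tokens)}")
print("Fits in window" if len(tokens) <= CONTEXT_WINDOW else "Exceeds window")
```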

Maintaining Context in Long Conversations

When a conversation exceeds the model’s context window, the earliest messages fall outside the window and are effectively forgotten. Chat systems built around the model typically work around this by truncating, summarizing, or otherwise compressing prior turns so that the most relevant context still fits. This lets the model generate coherent responses even when earlier parts of the chat are no longer directly visible to it, as sketched below.
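The following is a minimal sketch of one such strategy, dropping the oldest turns first. It uses a crude word-count proxy for tokens; a real system would count tokens with the model's own tokenizer and might summarize dropped turns instead of discarding them.

```python
# A minimal sketch of keeping a conversation within a token budget by dropping
# the oldest turns first. Word count stands in for token count here, purely
# for illustration.
def trim_history(messages, max_tokens):
    """Keep the most recent messages whose combined size fits max_tokens."""
    kept, used = [], 0
    for message in reversed(messages):          # walk from newest to oldest
        size = len(message["content"].split())  # crude proxy for token count
        if used + size > max_tokens:
            break
        kept.append(message)
        used += size
    return list(reversed(kept))                 # restore chronological order

history = [
    {"role": "user", "content": "Tell me about context windows."},
    {"role": "assistant", "content": "A context window is the span of tokens the model can attend to."},
    {"role": "user", "content": "What happens when a chat gets too long?"},
]
print(trim_history(history, max_tokens=30))
```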

Tokens

Tokens are the fundamental units of text that a language model processes. They can be individual words, subwords, or even characters, depending on the model's tokenization strategy.

Tokenization Examples:
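As a rough illustration, the sketch below uses the tiktoken library (an assumption; this article does not name a specific tokenizer, and other models use SentencePiece or other BPE variants) to show how words can split into one or more subword tokens:

```python
# A minimal sketch of subword tokenization using `tiktoken` (assumed here;
# different models use different tokenization strategies).
import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")
for text in ["dog", "unbelievable", "tokenization"]:
    token_ids = encoding.encode(text)
    pieces = [encoding.decode([tid]) for tid in token_ids]
    print(f"{text!r} -> {len(token_ids)} token(s): {pieces}")
```

Short, common words often map to a single token, while longer or rarer words are broken into several subword pieces.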

Token Counting

Token counting is essential for both training and inference. Models count tokens to: