Context Window
The maximum number of tokens a model can process in a single request
What is a Context Window?
A context window is the maximum amount of text (in tokens) an AI model can read and reason over in one request.
You can think of it as the model's short-term working memory. If the total input exceeds this limit, the request may be rejected or earlier, lower-priority content may be truncated, depending on how the application handles overflow.
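To make the idea concrete, here is a minimal sketch of a token estimate and limit check. It assumes a rough heuristic of about 4 characters per token for English text; real tokenizers are model-specific, and the function names here are illustrative.

```python
# Rough token estimate, assuming ~4 characters per token (a common
# heuristic for English text; real tokenizers are model-specific).
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits_in_context(text: str, context_window: int) -> bool:
    # True if the estimated token count fits within the window.
    return estimate_tokens(text) <= context_window

prompt = "Summarize the attached report in three bullet points."
print(estimate_tokens(prompt), fits_in_context(prompt, 8_000))
```

In production you would replace the heuristic with the tokenizer that matches your model, since counts can differ substantially between tokenizers.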
How does it work?
Everything in scope counts toward the token total: system instructions, user prompts, chat history, and attached content. As that total approaches the limit, you may see:
- Loss of earlier conversation details
- Partial understanding of long documents
- Inconsistent adherence to instructions
For long workflows, teams usually apply chunking, summarization, and periodic context refresh.
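One common form of context refresh is a sliding window over the chat history: keep the most recent messages that fit within a token budget and drop the oldest first. The sketch below assumes the same rough 4-characters-per-token estimate; the function name is hypothetical.

```python
# Minimal sliding-window sketch: retain the newest messages that fit
# within a token budget, dropping the oldest first.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # rough heuristic, not a real tokenizer

def trim_history(history: list[str], budget: int) -> list[str]:
    kept, used = [], 0
    for message in reversed(history):   # walk newest to oldest
        cost = estimate_tokens(message)
        if used + cost > budget:
            break                       # oldest messages are dropped
        kept.append(message)
        used += cost
    return list(reversed(kept))         # restore chronological order
```

Summarization goes a step further: instead of dropping the oldest messages outright, they are replaced with a short generated summary so their gist stays in context.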
Why does it matter?
Context window size directly affects answer quality, latency, and cost in real-world use: larger windows let the model see more at once, but processing more tokens takes longer and costs more. It is especially important for document analysis, long coding sessions, and multi-step agent workflows.