Chunk
A text segment created by splitting long documents into meaningful units for retrieval and generation
#Chunk #Chunking #RAG #Retrieval
What is a Chunk?
A chunk is a text segment created by splitting a long document into smaller units.
In RAG systems, documents are usually indexed and retrieved at the chunk level rather than as one full file.
Think of it as turning a large report into small reference cards, then pulling only the cards you need.
How does it work?
A common pipeline looks like this:
- Split source documents by token length and semantic boundaries (paragraphs/sections).
- Convert each chunk into embeddings and store them in a vector database.
- Retrieve relevant chunks for a query and attach them to the LLM context.
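The three steps above can be sketched end to end. The bag-of-words "embedding" and in-memory list below are stand-ins for a real embedding model and vector database, used only to keep the example self-contained:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding" for illustration only;
    # a real system would call a learned embedding model here.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1) Split: here each paragraph is one chunk.
doc = "Cats sleep a lot.\n\nDogs enjoy long walks.\n\nRAG retrieves chunks."
chunks = [p for p in doc.split("\n\n") if p.strip()]

# 2) Index: store (chunk, vector) pairs -- a stand-in for a vector DB.
index = [(c, embed(c)) for c in chunks]

# 3) Retrieve: pick the chunk most similar to the query, then
#    attach it to the LLM prompt as grounding context.
query = "what do dogs like"
top_chunk, _ = max(index, key=lambda item: cosine(embed(query), item[1]))
prompt = f"Context: {top_chunk}\n\nQuestion: {query}"
```

In a production pipeline, the same three stages remain, but splitting is token-aware, embeddings come from a model, and the index lives in a vector database.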
In practice, chunk size and overlap are major tuning parameters.
If chunks are too small, context becomes fragmented; if they are too large, retrieval precision drops and token cost rises.
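A minimal character-based splitter makes these two tuning knobs concrete. The function name and defaults are illustrative; production splitters usually count tokens rather than characters and prefer paragraph or sentence boundaries:

```python
def split_with_overlap(text, chunk_size=200, overlap=50):
    """Split text into fixed-size chunks, each sharing `overlap`
    characters with the previous chunk so context is not cut off
    mid-thought at a boundary."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

For example, `split_with_overlap("abcdefghij", chunk_size=4, overlap=2)` yields `["abcd", "cdef", "efgh", "ghij"]`: each chunk repeats the last two characters of its predecessor. Raising `overlap` preserves more cross-boundary context at the price of storing and embedding more redundant text.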
Why does it matter?
Chunk design is one of the biggest drivers of RAG quality:
- Retrieval precision
- Grounded answer quality
- Cost and latency efficiency
In short, chunking strategy can matter as much as model choice.
Related terms
Natural Language Processing
RAG (Retrieval-Augmented Generation)
A technique that enhances LLM responses by retrieving relevant external information before generating an answer
AGI (Artificial General Intelligence)
A hypothetical AI system capable of performing any intellectual task a human can
AI Agent
An autonomous AI system that can plan, use tools, and take actions to achieve goals
Attention
A mechanism that allows AI models to focus on the most relevant parts of the input when producing output
Claude Opus
Claude's top-tier model family optimized for deep multi-step reasoning and high-stakes analysis
Claude Sonnet
Claude's practical model family optimized for speed, cost efficiency, and strong day-to-day quality