Vector Database

What is a Vector Database?

A vector database is a storage system purpose-built to store and search vector embeddings -- numerical representations of data such as text, images, or audio. Unlike traditional databases, which match exact values or keywords, a vector database retrieves items by the similarity between vectors, so its results are semantically close to the query even when they share no words with it.

Think of a regular database like a filing cabinet organized alphabetically. If you search for "puppy," you only find documents that contain that exact word. A vector database is more like a librarian who understands that "puppy," "young dog," and "canine pup" all mean roughly the same thing and retrieves all of them.

How Does It Work?

Embedding -- Data (text, images, etc.) is converted into high-dimensional vectors using an embedding model.
Indexing -- The vectors are stored and indexed with algorithms such as HNSW (Hierarchical Navigable Small World graphs) or IVF (inverted file index), which enable fast approximate nearest neighbor (ANN) search.
Querying -- When a user submits a query, it is converted into a vector by the same embedding model, and the database returns the stored vectors closest to it under a distance metric such as cosine similarity or Euclidean distance.
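
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product works: the three-dimensional vectors are hand-written stand-ins for the output of a real embedding model, the names ToyVectorStore and TOY_EMBEDDINGS are hypothetical, and the search is an exact brute-force scan rather than an ANN index such as HNSW or IVF.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Step 1 (Embedding): hypothetical hand-written vectors. A real system would
# obtain these from an embedding model, typically with hundreds of dimensions.
TOY_EMBEDDINGS = {
    "puppy": [0.90, 0.80, 0.10],
    "young dog": [0.85, 0.75, 0.20],
    "canine pup": [0.80, 0.90, 0.15],
    "tax return": [0.10, 0.05, 0.95],
}


class ToyVectorStore:
    """Hypothetical in-memory store that scans every vector on each query."""

    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add(self, text, vector):
        # Step 2 (Indexing): here we only append to a list. A real vector
        # database would build an ANN index at this point.
        self._items.append((text, vector))

    def query(self, vector, k=3):
        # Step 3 (Querying): score every stored vector against the query
        # and return the k most similar items, best first.
        scored = [(cosine_similarity(vector, v), t) for t, v in self._items]
        scored.sort(reverse=True)
        return [(t, s) for s, t in scored[:k]]


store = ToyVectorStore()
for text, vec in TOY_EMBEDDINGS.items():
    store.add(text, vec)

# Assumed embedding of a dog-related query; it is not produced by a model.
query_vector = [0.88, 0.82, 0.12]
results = store.query(query_vector, k=3)
for text, score in results:
    print(f"{text}: {score:.3f}")
```

With these toy vectors, the three dog-related entries rank above "tax return" because their vectors point in nearly the same direction as the query. A brute-force scan like this costs time proportional to the number of stored vectors, which is the cost that ANN indexes are designed to avoid at scale.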

Why Does It Matter?

Vector databases are a critical component of retrieval-augmented generation (RAG) pipelines, semantic search engines, recommendation systems, and any other application where understanding meaning -- not just matching keywords -- is important.

Key Examples

Pinecone -- a fully managed vector database service.
Weaviate -- an open-source vector search engine.
Chroma -- a lightweight vector store popular for prototyping.
Milvus -- a scalable open-source vector database.
