Pramod.AI
Deep Dive

Vector Databases

The memory layer of AI - search by meaning, not keywords

Traditional databases search by exact match - 'find rows where city = Tokyo.' Vector databases search by meaning - 'find documents similar to this concept.' This semantic search is what powers RAG, recommendation engines, and AI memory.

The core idea: convert any data (text, images, audio) into high-dimensional vectors (embeddings), then find nearest neighbors in that space. Two sentences with similar meaning will have similar vectors, even if they share no words.
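The nearest-neighbor idea can be sketched with toy vectors. The 4-dimensional embeddings below are hand-picked for illustration (a real model outputs hundreds of learned dimensions), but they show the key property: two paraphrases score high on cosine similarity while an unrelated sentence scores low, despite sharing no keywords.

```python
import numpy as np

# Toy 4-dimensional "embeddings" -- hand-picked for illustration,
# not produced by a real model (which would use 768-3072 dims).
vecs = {
    "The cat sat on the mat":     np.array([0.9, 0.1, 0.0, 0.2]),
    "A feline rested on the rug": np.array([0.8, 0.2, 0.1, 0.3]),
    "Quarterly earnings rose 5%": np.array([0.0, 0.9, 0.8, 0.1]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 = same direction, 0.0 = orthogonal."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

a, b, c = vecs.values()
print(cosine(a, b))  # high: same meaning, zero shared keywords
print(cosine(a, c))  # low: unrelated topic
```

A keyword search would find no overlap between the first two sentences at all; the vectors capture the paraphrase.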

[Diagram: documents flow through an embedding model into a 2D vector space with Science, Sports, and Tech clusters; a query lands in the space and retrieves top-k results ("Intro to GPT models...", "How LLMs tokenize...", "Transformer archi..."). Keyword search = exact match; semantic search = meaning match.]

How It Works

1. Embedding Generation

An embedding model converts data into dense vectors (typically 768-3072 dimensions). Each dimension captures a facet of meaning. Similar items cluster together.
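A minimal stand-in for an embedding model, assuming only numpy: a fixed random projection maps a bag-of-words onto a dense unit vector. A real learned model would place synonyms near each other, which this toy cannot do — it only demonstrates the shape of the operation: text in, normalized dense vector out.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["cat", "feline", "mat", "rug", "earnings", "rose"]
DIM = 8  # real embedding models output 768-3072 dimensions

# Fixed random projection: word counts -> dense vector.
# A trained model learns this mapping so that meaning, not spelling,
# determines where a text lands; this toy only mixes word counts.
proj = rng.standard_normal((len(VOCAB), DIM))

def embed(text: str) -> np.ndarray:
    counts = np.array([text.lower().split().count(w) for w in VOCAB], float)
    v = counts @ proj                  # dense DIM-dimensional vector
    n = np.linalg.norm(v)
    return v / n if n else v           # unit-normalize for cosine search

print(embed("the cat sat on the mat").shape)  # (8,)
```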

2. Indexing

Raw vector search is O(n) - too slow for millions of vectors. Index structures like HNSW (Hierarchical Navigable Small World) enable approximate nearest neighbor (ANN) search in milliseconds.
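The O(n) baseline is easy to see in code: exact search scores every stored vector per query. The sketch below (synthetic random vectors, assumed unit-normalized) is what an HNSW index replaces — libraries such as hnswlib or faiss answer the same query in sub-linear time by walking a layered proximity graph, trading exactness for speed.

```python
import numpy as np

rng = np.random.default_rng(1)
db = rng.standard_normal((10_000, 64))           # 10k vectors, 64 dims
db /= np.linalg.norm(db, axis=1, keepdims=True)  # unit-normalize

def brute_force_top_k(query, k=5):
    # Exact search: scores every vector -- O(n) per query.
    # An ANN index (e.g. HNSW) avoids touching most of `db`.
    scores = db @ query                # cosine similarity on unit vectors
    return np.argsort(-scores)[:k]

# A query near stored vector 42 should return 42 as the top hit.
q = db[42] + 0.01 * rng.standard_normal(64)
print(brute_force_top_k(q))
```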

3. Similarity Search

Given a query vector, find the k most similar vectors using cosine similarity, dot product, or Euclidean distance. ANN methods trade a small loss in accuracy for large gains in speed.
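The three metrics are closely related; in particular, on unit-normalized vectors the identity ||a-b||² = 2 - 2·cos(a,b) means Euclidean and cosine ranking agree, and dot product equals cosine. A small check, with arbitrary example vectors:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 1.0, 4.0])

dot = a @ b                                            # 16.0
cos = dot / (np.linalg.norm(a) * np.linalg.norm(b))
l2  = np.linalg.norm(a - b)                            # sqrt(3)

# On unit vectors: ||a-b||^2 == 2 - 2*cos(a,b),
# so nearest-by-Euclidean and nearest-by-cosine give the same ranking.
an, bn = a / np.linalg.norm(a), b / np.linalg.norm(b)
assert abs(np.linalg.norm(an - bn) ** 2 - (2 - 2 * (an @ bn))) < 1e-9
print(dot, cos, l2)
```

This is why many vector databases normalize vectors at insert time and then use the cheaper dot product internally.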

4. Hybrid Search

Combine vector similarity with keyword filtering, metadata matching, and re-ranking. Best results come from mixing semantic and lexical search.
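A minimal sketch of that combination, with made-up documents and 2-D vectors: filter by metadata first, then blend a semantic score with a lexical (keyword-overlap) score under a tunable weight. Production systems use BM25 rather than raw overlap for the lexical side, but the shape is the same.

```python
import numpy as np

docs = [
    {"id": 1, "text": "intro to transformer models", "vec": np.array([0.9, 0.1]), "lang": "en"},
    {"id": 2, "text": "transformer oil maintenance",  "vec": np.array([0.1, 0.9]), "lang": "en"},
    {"id": 3, "text": "attention is all you need",    "vec": np.array([0.8, 0.2]), "lang": "de"},
]

def hybrid_search(q_vec, q_terms, lang, alpha=0.7):
    # 1) metadata filter, 2) blend semantic + lexical scores.
    results = []
    for d in (d for d in docs if d["lang"] == lang):
        sem = float(q_vec @ d["vec"]) / (np.linalg.norm(q_vec) * np.linalg.norm(d["vec"]))
        lex = len(q_terms & set(d["text"].split())) / len(q_terms)
        results.append((alpha * sem + (1 - alpha) * lex, d["id"]))
    return [i for _, i in sorted(results, reverse=True)]

print(hybrid_search(np.array([1.0, 0.0]), {"transformer", "models"}, "en"))
```

Note how the metadata filter removes doc 3 before any scoring, and the lexical term keeps the keyword-only match (doc 2) from disappearing entirely.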

5. Retrieval & Ranking

Retrieved results are re-ranked by a cross-encoder model for higher precision. This two-stage approach (retrieve broadly, rank precisely) balances speed and quality.
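The two-stage pattern, sketched under one big assumption: the real cross-encoder (a model that scores query and document jointly) is stood in for by simple token overlap, so only the pipeline shape is faithful here. Stage 1 is assumed already done — `candidates` are the broad top-N hits from the vector index.

```python
def cross_encoder_score(query: str, doc: str) -> float:
    # Stand-in for a real cross-encoder model: Jaccard token overlap.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)

def retrieve_then_rerank(query, candidates, k=2):
    # Stage 2: re-score the broad candidate set precisely, keep top-k.
    ranked = sorted(candidates,
                    key=lambda d: cross_encoder_score(query, d),
                    reverse=True)
    return ranked[:k]

hits = ["how llms tokenize text", "intro to gpt models", "transformer architecture"]
print(retrieve_then_rerank("gpt models intro", hits))
```

The split matters because cross-encoders are too slow to score millions of documents, but cheap enough for the few hundred the index returns.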

Key Components

Pinecone

Managed vector DB, serverless, enterprise-grade, simple API

pgvector

PostgreSQL extension - add vector search to your existing database

Weaviate

Open-source, hybrid search, built-in vectorization modules

Qdrant

Rust-based, high-performance, rich filtering, open-source

ChromaDB

Developer-friendly, in-process, great for prototyping RAG apps

Milvus

Cloud-native, GPU-accelerated, scales to billions of vectors

Who's Building With This

Perplexity

Vector search over web-scale indices for real-time AI search

Spotify

Embedding-based music recommendations - find songs you'll love

Pinterest

Visual search - find similar pins using image embeddings

Shopify

Semantic product search - find what customers mean, not just what they type

Key Takeaway

Vector databases are the bridge between AI models and real-world data. They enable search by meaning rather than keywords - turning every database into a knowledge base that AI can reason over.

References & Further Reading

  1. Malkov & Yashunin, "Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs" (HNSW)
  2. Pinecone Documentation
  3. pgvector: Open-source vector similarity search for Postgres
  4. Weaviate Documentation
