Vector Database Guide
Guides implementation of vector databases for semantic search, recommendation systems, and AI applications. Covers embedding model selection, index types (HNSW, IVF, flat), distance metrics, metadata filtering, batch ingestion, query optimization, and operational management across Pinecone, Weaviate, Chroma, Qdrant, Milvus, and pgvector.
Usage
Describe your use case (semantic search, recommendations, deduplication, clustering), data volume, query latency requirements, and infrastructure constraints (managed vs self-hosted). Specify the embedding model you plan to use or ask for recommendations. This skill provides database selection guidance, schema design, and optimized query patterns.
Examples
- "Set up Qdrant for a product catalog with 1M items, filtering by price and category during similarity search"
- "Configure pgvector on an existing PostgreSQL database for semantic search across 100K support articles"
- "Design a Pinecone namespace strategy for a multi-tenant SaaS application with per-customer isolation"
Guidelines
- Match index dimensions to your embedding model: OpenAI text-embedding-ada-002 is 1536d, Cohere embed-v3 is 1024d, BGE-small/base/large are 384d/768d/1024d
- Use cosine similarity for normalized embeddings; dot product when magnitude carries meaning
- Choose HNSW indexes for low-latency queries up to roughly 10M vectors; IVF for larger datasets with batch queries; flat (exact) search for small collections where perfect recall matters
- Always store metadata alongside vectors to enable filtered search without post-processing
- Batch upsert operations in chunks of 100-500 vectors to balance throughput and memory usage
- Monitor index recall by comparing approximate nearest neighbor results against exact brute-force search
- Use namespaces or collections to separate unrelated data and avoid cross-contamination in results
- Plan for embedding model upgrades: re-embedding requires full re-indexing, so version your collections
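The cosine-vs-dot-product guideline above rests on a simple identity: once vectors are scaled to unit length, dot product and cosine similarity are the same number, so normalizing at ingestion lets you use the cheaper dot-product metric. A minimal NumPy sketch:

```python
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """Scale a vector to unit length so dot product equals cosine similarity."""
    return v / np.linalg.norm(v)

a = np.array([3.0, 4.0])
b = np.array([1.0, 2.0])

# Cosine similarity on the raw vectors.
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Dot product on the normalized vectors gives the identical score.
dot_normalized = np.dot(normalize(a), normalize(b))
```

If your pipeline instead keeps raw magnitudes (e.g., popularity-weighted embeddings), skip the normalization and configure the index for dot product explicitly.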
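The batch-upsert guideline can be implemented with a small chunking helper; the actual upsert call (here a hypothetical `client.upsert`) varies by database, but the chunking logic is the same everywhere:

```python
from typing import Iterator

def chunked(items: list, size: int = 250) -> Iterator[list]:
    """Yield successive fixed-size chunks; 100-500 balances throughput and memory."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Placeholder payloads; shape depends on your database's upsert schema.
vectors = [{"id": i, "values": [0.0] * 1536} for i in range(1000)]

batches = list(chunked(vectors, size=250))
for batch in batches:
    # client.upsert(batch)  # hypothetical call; substitute your client's API
    pass
```

Tune the chunk size downward if individual payloads are large (long metadata fields push request sizes up quickly).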
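Recall monitoring, as suggested above, means scoring your ANN index's top-k results against an exact brute-force scan of the same corpus. A self-contained sketch (the `approx` list here simulates index output; in practice it would come from your ANN query):

```python
import numpy as np

def recall_at_k(approx_ids: list, exact_ids: list, k: int) -> float:
    """Fraction of the exact top-k neighbors that the ANN index also returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 64))
query = rng.normal(size=64)

# Exact ground truth: brute-force distance to every vector in the corpus.
dists = np.linalg.norm(corpus - query, axis=1)
exact = [int(i) for i in np.argsort(dists)[:10]]

# Simulated ANN output: 9 correct neighbors plus one miss.
wrong_id = next(i for i in range(len(corpus)) if i not in set(exact))
approx = exact[:9] + [wrong_id]

score = recall_at_k(approx, exact, k=10)
```

Run this periodically on a sampled query set; a drop in recall after heavy upserts or deletes usually signals the index needs retuning or rebuilding.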