RAG Pipeline Builder

Designs complete retrieval-augmented generation pipelines that connect LLMs to your private data sources. Covers document ingestion from multiple formats, chunking strategies, embedding model selection, vector database configuration, hybrid retrieval (semantic + keyword), re-ranking, context window assembly, and answer generation with source citations.

Usage

Describe your data sources (PDFs, docs, databases, APIs), the types of questions users will ask, accuracy requirements, and latency constraints. Specify your infrastructure preferences (cloud services vs self-hosted). This skill produces a complete RAG architecture with component recommendations and configuration.

Examples

"Design a RAG system for a legal firm that answers questions across 50,000 contract PDFs with page citations"
"Build a pipeline that ingests API documentation from GitHub repos and answers developer questions"
"Create a RAG architecture for a support team that searches across Zendesk tickets, docs, and Slack history"

Guidelines

Chunk documents at 256-512 tokens with 50-token overlap for most use cases; adjust based on query length
Use hybrid retrieval that combines dense embeddings with BM25 keyword search to improve recall over either method alone
Apply a re-ranker (cross-encoder or Cohere Rerank) to the top 20 retrieved results to improve precision
Always include source attribution in responses so users can verify answers against original documents
Test with a golden set of 50+ question-answer pairs to measure retrieval accuracy and answer quality
Handle multi-hop questions by decomposing them into sub-queries and aggregating retrieved context
Set a confidence threshold and return "I don't have enough information" when retrieved context falls below it, rather than hallucinating
Monitor retrieval hit rates and user feedback to identify gaps in your knowledge base over time
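
A few of the guidelines above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (chunk_tokens, reciprocal_rank_fusion, select_context), the default chunk size of 384, the fusion constant k=60, and the 0.5 score threshold are all hypothetical choices. Reciprocal rank fusion is used here as one common way to merge dense and BM25 rankings; the guidelines do not prescribe a specific fusion method.

```python
from typing import Dict, List, Optional, Sequence, Tuple

NO_ANSWER = "I don't have enough information"


def chunk_tokens(tokens: Sequence[str], size: int = 384, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document ids (e.g. dense and BM25) into one list."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def select_context(
    reranked: Sequence[Tuple[str, float]], threshold: float = 0.5
) -> Optional[List[str]]:
    """Keep passages whose re-ranker score meets the threshold; None means abstain."""
    kept = [doc_id for doc_id, score in reranked if score >= threshold]
    return kept or None


if __name__ == "__main__":
    tokens = [f"t{i}" for i in range(1000)]
    print(len(chunk_tokens(tokens)), "chunks")

    dense = ["a", "b", "c"]   # hypothetical ids from a vector search
    bm25 = ["b", "d", "a"]    # hypothetical ids from a keyword search
    print(reciprocal_rank_fusion([dense, bm25]))

    context = select_context([("b", 0.31), ("a", 0.12)])
    print(context if context is not None else NO_ANSWER)
```

In a real pipeline the token list would come from the tokenizer of the chosen embedding model, and the scores passed to select_context would come from the re-ranker. The threshold should be tuned against the golden question-answer set rather than taken from this sketch.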
