RAG Pipeline Builder

Verified by Community

Designs complete RAG architectures including document ingestion, chunking strategies, embedding generation, vector storage, retrieval algorithms, context assembly, and answer generation with source attribution.

rag, retrieval, embeddings, vector-search, llm

Designs complete retrieval-augmented generation pipelines that connect LLMs to your private data sources. Covers document ingestion from multiple formats, chunking strategies, embedding model selection, vector database configuration, hybrid retrieval (semantic + keyword), re-ranking, context window assembly, and answer generation with source citations.

Usage

Describe your data sources (PDFs, docs, databases, APIs), the types of questions users will ask, accuracy requirements, and latency constraints. Specify your infrastructure preferences (cloud services vs self-hosted). This skill produces a complete RAG architecture with component recommendations and configuration.

Examples

  • "Design a RAG system for a legal firm that answers questions across 50,000 contract PDFs with page citations"
  • "Build a pipeline that ingests API documentation from GitHub repos and answers developer questions"
  • "Create a RAG architecture for a support team that searches across Zendesk tickets, docs, and Slack history"

Guidelines

  • Chunk documents at 256-512 tokens with a 50-token overlap for most use cases; use smaller chunks for short factual queries and larger chunks for questions that need broader context
  • Use hybrid retrieval that combines dense embeddings with BM25 keyword search to improve recall over either method alone
  • Apply a re-ranker (a cross-encoder model or Cohere Rerank) to the top 20 retrieved results to improve precision
  • Always include source attribution in responses so users can verify answers against original documents
  • Test with a golden set of 50+ question-answer pairs to measure retrieval accuracy and answer quality
  • Handle multi-hop questions by decomposing them into sub-queries and aggregating retrieved context
  • Set a confidence threshold on retrieval or re-ranker scores; when no result clears it, return "I don't have enough information" rather than risk a hallucinated answer
  • Monitor retrieval hit rates and user feedback to identify gaps in your knowledge base over time
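
The chunking guideline can be sketched as a sliding window over tokens. This is a minimal illustration, not a prescribed implementation: the function name chunk_tokens and its defaults are hypothetical, and the input is assumed to be already tokenized (in practice you would use the tokenizer that matches your embedding model).

```python
from typing import List


def chunk_tokens(
    tokens: List[str], chunk_size: int = 384, overlap: int = 50
) -> List[List[str]]:
    """Split a token list into windows of `chunk_size` that share `overlap` tokens.

    Hypothetical helper for illustration; the defaults sit inside the
    256-512 token range with a 50-token overlap.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each new window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the document,
        # so no trailing chunk consists only of overlap tokens.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Usage: whitespace splitting stands in for a real tokenizer here.
text = "word " * 1000
chunks = chunk_tokens(text.split(), chunk_size=384, overlap=50)
```

Each chunk after the first repeats the last 50 tokens of its predecessor, so a sentence that straddles a boundary still appears whole in at least one chunk.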
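
For the hybrid retrieval guideline, one common way to merge a dense-embedding ranking with a BM25 ranking is reciprocal rank fusion (RRF). The guidelines do not name a fusion method, so treat RRF as one option among several. The sketch below assumes each retriever returns an ordered list of document IDs; the function name, the example IDs, and k = 60 (a conventional default) are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def reciprocal_rank_fusion(
    rankings: List[List[str]], k: int = 60
) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Hypothetical helper for illustration.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Usage with made-up IDs: one list from vector search, one from BM25.
dense_hits = ["a", "b", "c"]
bm25_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because RRF works on ranks rather than raw scores, it avoids having to normalize cosine similarities against BM25 scores. The fused list (for example its top 20 entries) can then be handed to the re-ranker.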
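
The golden-set and hit-rate guidelines can be checked with a small evaluation loop. This is a sketch under stated assumptions: the golden set is modelled as a mapping from question to the set of document IDs that should be retrieved, and retrieve is a placeholder for whatever retriever your pipeline exposes. Both names, and the tiny two-question set, are hypothetical.

```python
from typing import Callable, Dict, List, Set


def hit_rate_at_k(
    golden: Dict[str, Set[str]],
    retrieve: Callable[[str], List[str]],
    k: int = 5,
) -> float:
    """Fraction of golden questions with at least one relevant doc in the top k."""
    if not golden:
        raise ValueError("golden set must not be empty")
    hits = 0
    for question, relevant_ids in golden.items():
        top_k = retrieve(question)[:k]
        if relevant_ids.intersection(top_k):
            hits += 1
    return hits / len(golden)


# Usage with a stubbed retriever; a real golden set would hold 50+ questions.
golden_set = {"q1": {"d1"}, "q2": {"d9"}}
fake_results = {"q1": ["d3", "d1"], "q2": ["d2", "d4"]}
rate = hit_rate_at_k(golden_set, lambda q: fake_results[q], k=2)
```

Tracking this number over time, alongside user feedback, gives a simple signal for spotting gaps in the knowledge base.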