
RAG Patterns

Standard RAG Pipeline

Indexing: Documents → Chunk → Embed → Store (vector DB)
Query: Query → Embed → Retrieve → Augment prompt → Generate answer
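
A minimal end-to-end sketch of both halves, assuming docs (a list of loaded Documents) and question are already in scope; the OpenAI models and Chroma store are illustrative choices, not requirements:

```python
from langchain_chroma import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Indexing: chunk the documents, embed each chunk, store the vectors
chunks = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=200).split_documents(docs)
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

# Query: embed the question, retrieve similar chunks, augment the prompt, generate
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
context = "\n\n".join(d.page_content for d in retriever.invoke(question))
answer = ChatOpenAI(model="gpt-4o-mini").invoke(
    f"Answer using only this context:\n{context}\n\nQuestion: {question}"
).content
```

The sections below tune each stage of this loop.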

Chunking Strategies

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Recommended defaults
splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,        # characters (not tokens)
    chunk_overlap=200,
    separators=["\n\n", "\n", ". ", " ", ""],
)
chunks = splitter.split_documents(docs)
```

| Strategy | Best For | Chunk Size |
| --- | --- | --- |
| Fixed-size with overlap | General text | 500-1000 chars |
| Recursive character | Structured docs | 500-1000 chars |
| Semantic (by meaning) | Long-form content | Variable |
| Document-aware (markdown headers) | Technical docs | Section-based |
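
For the document-aware row, a sketch using LangChain's MarkdownHeaderTextSplitter; the header levels and the markdown_text input are assumptions to adapt to your docs:

```python
from langchain_text_splitters import MarkdownHeaderTextSplitter

# Split at section headers so each chunk stays inside one logical section;
# the matched header text is recorded in chunk metadata under the given keys
header_splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=[("#", "title"), ("##", "section"), ("###", "subsection")],
)
sections = header_splitter.split_text(markdown_text)
```

Oversized sections can then be passed through the recursive splitter above to enforce the size limits.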

Metadata Enrichment

```python
for i, chunk in enumerate(chunks):
    chunk.metadata.update({
        "source": doc.metadata["source"],
        "section": extract_section_title(chunk),  # your own section-title helper
        "doc_id": doc.metadata["id"],
        "chunk_index": i,
    })
```

Retrieval Strategies

Hybrid Search (keyword + semantic)

```python
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

bm25 = BM25Retriever.from_documents(docs, k=5)
vector_retriever = vectorstore.as_retriever(search_kwargs={"k": 5})

hybrid = EnsembleRetriever(
    retrievers=[bm25, vector_retriever],
    weights=[0.3, 0.7],
)
```

Re-ranking

```python
from cohere import Client

cohere = Client(api_key=COHERE_API_KEY)

def rerank(query: str, documents: list[str], top_n: int = 5):
    response = cohere.rerank(
        model="rerank-english-v3.0",
        query=query,
        documents=documents,
        top_n=top_n,
    )
    return [documents[r.index] for r in response.results]
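```

A usage sketch tying this to the hybrid retriever above; the re-ranked texts are mapped back to their Document objects so metadata survives for citation:

```python
candidates = hybrid.invoke(question)
texts = [d.page_content for d in candidates]
top_texts = rerank(question, texts, top_n=5)
top_docs = [candidates[texts.index(t)] for t in top_texts]  # keep Documents (and metadata)
```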

Multi-query Retrieval

Generate multiple query variations for better recall

prompt = """Generate 3 different versions of this question to retrieve relevant documents: {question}"""

queries = llm.invoke(prompt).split("\n") all_docs = set() for q in queries: all_docs.update(retriever.invoke(q))

Prompt Construction

SYSTEM_PROMPT = """Answer based only on the provided context. If the context doesn't contain the answer, say "I don't have enough information." Cite sources using [Source: filename] format.

Context: {context}"""

def format_context(docs, max_tokens=3000): context_parts = [] for doc in docs: source = doc.metadata.get("source", "unknown") context_parts.append(f"[Source: {source}]\n{doc.page_content}") return "\n\n---\n\n".join(context_parts)
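```

Putting the prompt together (a sketch; llm is assumed to be a LangChain chat model, and docs the re-ranked documents from retrieval):

```python
from langchain_core.messages import HumanMessage, SystemMessage

def answer_question(question: str, docs) -> str:
    messages = [
        SystemMessage(content=SYSTEM_PROMPT.format(context=format_context(docs))),
        HumanMessage(content=question),
    ]
    return llm.invoke(messages).content
```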

Evaluation

| Metric | Measures | Tool |
| --- | --- | --- |
| Context Relevance | Are retrieved docs relevant? | RAGAS, manual |
| Faithfulness | Does answer match context? | RAGAS |
| Answer Relevance | Does answer address question? | RAGAS |
| Retrieval Recall | Are correct docs retrieved? | Custom eval set |

```python
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy, context_precision

result = evaluate(dataset, metrics=[faithfulness, answer_relevancy, context_precision])
```
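
RAGAS expects the eval data as a Hugging Face Dataset with question, contexts, answer, and (for context_precision) ground_truth columns. A sketch of building one from your own eval_set of question/reference-answer pairs, reusing the answer_question helper sketched above:

```python
from datasets import Dataset

rows = []
for ex in eval_set:  # eval_set: your own questions with reference answers
    docs = retriever.invoke(ex["question"])
    rows.append({
        "question": ex["question"],
        "contexts": [d.page_content for d in docs],
        "answer": answer_question(ex["question"], docs),
        "ground_truth": ex["reference_answer"],
    })

dataset = Dataset.from_list(rows)
```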

Anti-Patterns

| Anti-Pattern | Fix |
| --- | --- |
| Chunks too large (>1500 chars) | Use 500-1000 char chunks with 200 overlap |
| No metadata on chunks | Store source, section, page number |
| No retrieval evaluation | Build eval set, measure recall and precision |
| Stuffing all chunks in prompt | Limit to top-K (3-5), use re-ranking |
| Ignoring hybrid search | Combine BM25 + vector for better recall |
| No citation/source tracking | Pass metadata through pipeline |

Production Checklist

  • Chunking strategy tuned with eval set

  • Hybrid search (BM25 + vector) enabled

  • Re-ranking on retrieval results

  • Source attribution in answers

  • Guardrails for out-of-scope questions (see the sketch after this list)

  • Monitoring: retrieval latency, answer quality scores

  • Incremental indexing for new documents
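
One way to implement the out-of-scope guardrail: refuse when nothing retrieved clears a relevance threshold. A sketch assuming a vector store that supports similarity_search_with_relevance_scores (scores normalized to 0-1); the 0.3 cutoff is illustrative and should be tuned on your eval set:

```python
OUT_OF_SCOPE_REPLY = "I don't have enough information to answer that."

def guarded_answer(question: str, min_score: float = 0.3) -> str:
    scored = vectorstore.similarity_search_with_relevance_scores(question, k=5)
    docs = [doc for doc, score in scored if score >= min_score]
    if not docs:  # nothing relevant enough: refuse rather than guess
        return OUT_OF_SCOPE_REPLY
    return answer_question(question, docs)
```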

