AI RAG Pipeline
Build RAG (Retrieval Augmented Generation) pipelines via inference.sh CLI.
Quick Start
curl -fsSL https://cli.inference.sh | sh && infsh login
Simple RAG: Search + LLM
SEARCH=$(infsh app run tavily/search-assistant --input '{"query": "latest AI developments 2024"}') infsh app run openrouter/claude-sonnet-45 --input "{ "prompt": "Based on this research, summarize the key trends: $SEARCH" }"
What is RAG?
RAG combines:
-
Retrieval: Fetch relevant information from external sources
-
Augmentation: Add retrieved context to the prompt
-
Generation: LLM generates response using the context
This produces more accurate, up-to-date, and verifiable AI responses.
RAG Pipeline Patterns
Pattern 1: Simple Search + Answer
[User Query] -> [Web Search] -> [LLM with Context] -> [Answer]
Pattern 2: Multi-Source Research
[Query] -> [Multiple Searches] -> [Aggregate] -> [LLM Analysis] -> [Report]
Pattern 3: Extract + Process
[URLs] -> [Content Extraction] -> [Chunking] -> [LLM Summary] -> [Output]
Available Tools
Search Tools
Tool App ID Best For
Tavily Search tavily/search-assistant
AI-powered search with answers
Exa Search exa/search
Neural search, semantic matching
Exa Answer exa/answer
Direct factual answers
Extraction Tools
Tool App ID Best For
Tavily Extract tavily/extract
Clean content from URLs
Exa Extract exa/extract
Analyze web content
LLM Tools
Model App ID Best For
Claude Sonnet 4.5 openrouter/claude-sonnet-45
Complex analysis
Claude Haiku 4.5 openrouter/claude-haiku-45
Fast processing
GPT-4o openrouter/gpt-4o
General purpose
Gemini 2.5 Pro openrouter/gemini-25-pro
Long context
Pipeline Examples
Basic RAG Pipeline
1. Search for information
SEARCH_RESULT=$(infsh app run tavily/search-assistant --input '{ "query": "What are the latest breakthroughs in quantum computing 2024?" }')
2. Generate grounded response
infsh app run openrouter/claude-sonnet-45 --input "{ "prompt": "You are a research assistant. Based on the following search results, provide a comprehensive summary with citations.
Search Results: $SEARCH_RESULT
Provide a well-structured summary with source citations." }"
Multi-Source Research
Search multiple sources
TAVILY=$(infsh app run tavily/search-assistant --input '{"query": "electric vehicle market trends 2024"}') EXA=$(infsh app run exa/search --input '{"query": "EV market analysis latest reports"}')
Combine and analyze
infsh app run openrouter/claude-sonnet-45 --input "{ "prompt": "Analyze these research results and identify common themes and contradictions.
Source 1 (Tavily): $TAVILY
Source 2 (Exa): $EXA
Provide a balanced analysis with sources." }"
URL Content Analysis
1. Extract content from specific URLs
CONTENT=$(infsh app run tavily/extract --input '{ "urls": [ "https://example.com/research-paper", "https://example.com/industry-report" ] }')
2. Analyze extracted content
infsh app run openrouter/claude-sonnet-45 --input "{ "prompt": "Analyze these documents and extract key insights:
$CONTENT
Provide:
- Key findings
- Data points
- Recommendations" }"
Fact-Checking Pipeline
Claim to verify
CLAIM="AI will replace 50% of jobs by 2030"
1. Search for evidence
EVIDENCE=$(infsh app run tavily/search-assistant --input "{ "query": "$CLAIM evidence studies research" }")
2. Verify claim
infsh app run openrouter/claude-sonnet-45 --input "{ "prompt": "Fact-check this claim: '$CLAIM'
Based on the following evidence: $EVIDENCE
Provide:
- Verdict (True/False/Partially True/Unverified)
- Supporting evidence
- Contradicting evidence
- Sources" }"
Research Report Generator
TOPIC="Impact of generative AI on creative industries"
1. Initial research
OVERVIEW=$(infsh app run tavily/search-assistant --input "{"query": "$TOPIC overview"}") STATISTICS=$(infsh app run exa/search --input "{"query": "$TOPIC statistics data"}") OPINIONS=$(infsh app run tavily/search-assistant --input "{"query": "$TOPIC expert opinions"}")
2. Generate comprehensive report
infsh app run openrouter/claude-sonnet-45 --input "{ "prompt": "Generate a comprehensive research report on: $TOPIC
Research Data: == Overview == $OVERVIEW
== Statistics == $STATISTICS
== Expert Opinions == $OPINIONS
Format as a professional report with:
- Executive Summary
- Key Findings
- Data Analysis
- Expert Perspectives
- Conclusion
- Sources" }"
Quick Answer with Sources
Use Exa Answer for direct factual questions
infsh app run exa/answer --input '{ "question": "What is the current market cap of NVIDIA?" }'
Best Practices
- Query Optimization
Bad: Too vague
"AI news"
Good: Specific and contextual
"latest developments in large language models January 2024"
- Context Management
Summarize long search results before sending to LLM
SEARCH=$(infsh app run tavily/search-assistant --input '{"query": "..."}')
If too long, summarize first
SUMMARY=$(infsh app run openrouter/claude-haiku-45 --input "{ "prompt": "Summarize these search results in bullet points: $SEARCH" }")
Then use summary for analysis
infsh app run openrouter/claude-sonnet-45 --input "{ "prompt": "Based on this research summary, provide insights: $SUMMARY" }"
- Source Attribution
Always ask the LLM to cite sources:
infsh app run openrouter/claude-sonnet-45 --input '{ "prompt": "... Always cite sources in Source Name format." }'
- Iterative Research
First pass: broad search
INITIAL=$(infsh app run tavily/search-assistant --input '{"query": "topic overview"}')
Second pass: dive deeper based on findings
DEEP=$(infsh app run tavily/search-assistant --input '{"query": "specific aspect from initial search"}')
Pipeline Templates
Agent Research Tool
#!/bin/bash
research.sh - Reusable research function
research() { local query="$1"
Search
local results=$(infsh app run tavily/search-assistant --input "{"query": "$query"}")
Analyze
infsh app run openrouter/claude-haiku-45 --input "{ "prompt": "Summarize: $results" }" }
research "your query here"
Related Skills
Web search tools
npx skills add inferencesh/skills@web-search
LLM models
npx skills add inferencesh/skills@llm-models
Content pipelines
npx skills add inferencesh/skills@ai-content-pipeline
Full platform skill
npx skills add inferencesh/skills@inference-sh
Browse all apps: infsh app list
Documentation
-
Adding Tools to Agents - Agent tool integration
-
Building a Research Agent - Full guide