LangChain - Build LLM Applications with Agents & RAG
One of the most widely used frameworks for building LLM-powered applications.
When to use LangChain
Use LangChain when:
- Building agents with tool calling and reasoning (ReAct pattern)
- Implementing RAG (retrieval-augmented generation) pipelines
- Swapping LLM providers easily (OpenAI, Anthropic, Google)
- Creating chatbots with conversation memory
- Rapid prototyping of LLM applications
- Production deployments with LangSmith observability
Metrics:
- 119,000+ GitHub stars
- 272,000+ repositories use LangChain
- 500+ integrations (models, vector stores, tools)
- 3,800+ contributors
Use alternatives instead:
- LlamaIndex: RAG-focused, better for document Q&A
- LangGraph: Complex stateful workflows, more control
- Haystack: Production search pipelines
- Semantic Kernel: Microsoft ecosystem
Quick start
Installation
# Core library (Python 3.10+)
pip install -U langchain

# With OpenAI
pip install langchain-openai

# With Anthropic
pip install langchain-anthropic

# Common extras
pip install langchain-community  # 500+ integrations
pip install langchain-chroma     # Vector store
Basic LLM usage
from langchain_anthropic import ChatAnthropic

# Initialize model
llm = ChatAnthropic(model="claude-sonnet-4-5-20250929")

# Simple completion
response = llm.invoke("Explain quantum computing in 2 sentences")
print(response.content)
Create an agent (ReAct pattern)
from langchain.agents import create_agent
from langchain_anthropic import ChatAnthropic

# Define tools
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"It's sunny in {city}, 72°F"

def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Search results for: {query}"

# Create agent (<10 lines!)
agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[get_weather, search_web],
    system_prompt="You are a helpful assistant. Use tools when needed.",
)

# Run agent
result = agent.invoke({"messages": [{"role": "user", "content": "What's the weather in Paris?"}]})
print(result["messages"][-1].content)
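The reason-then-act loop behind an agent like this can be sketched in plain Python. This is not LangChain's implementation: `scripted_model`, `run_agent`, and the message dictionaries are hypothetical stand-ins, and the "model" is a hard-coded function, so the sketch runs without an API key.

```python
from typing import Callable

def get_weather(city: str) -> str:
    """Get current weather for a city (stubbed, as in the example above)."""
    return f"It's sunny in {city}, 72°F"

# Registry mapping tool names to callables (hypothetical structure).
TOOLS: dict[str, Callable[[str], str]] = {"get_weather": get_weather}

def scripted_model(messages: list[dict]) -> dict:
    """Hypothetical stand-in for an LLM: request a tool once, then answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"role": "assistant", "tool": "get_weather", "arg": "Paris"}
    return {"role": "assistant", "content": f"Answer: {tool_results[-1]['content']}"}

def run_agent(question: str, max_steps: int = 5) -> str:
    """Alternate model calls (reason) and tool calls (act) until a final answer."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = scripted_model(messages)
        messages.append(reply)
        if "tool" not in reply:
            # No tool requested: the reply is the final answer.
            return reply["content"]
        # Run the requested tool and feed the observation back to the model.
        observation = TOOLS[reply["tool"]](reply["arg"])
        messages.append({"role": "tool", "content": observation})
    raise RuntimeError("agent did not finish within max_steps")

answer = run_agent("What's the weather in Paris?")
print(answer)
```

The `max_steps` cap is a design choice worth keeping in any real loop, since a model that keeps requesting tools would otherwise never terminate.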
Core concepts
- Models - LLM abstraction
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_google_genai import ChatGoogleGenerativeAI

# Swap providers easily (each line is an alternative; keep the one you need)
llm = ChatOpenAI(model="gpt-4o")
llm = ChatAnthropic(model="claude-sonnet-4-5-20250929")
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash-exp")

# Streaming
for chunk in llm.stream("Write a poem"):
    print(chunk.content, end="", flush=True)
- Chains - Sequential operations
from langchain_core.prompts import PromptTemplate

# Define prompt template
prompt = PromptTemplate.from_template("Write a 3-sentence summary about {topic}")

# Create chain by piping the prompt into the model
# (LLMChain and chain.run() are deprecated in favor of this style)
chain = prompt | llm

# Run chain
result = chain.invoke({"topic": "machine learning"})
print(result.content)
- Agents - Tool-using reasoning
ReAct (Reasoning + Acting) pattern:
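# Legacy AgentExecutor API
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import Tool

# Define custom tool
# Caution: eval() executes arbitrary code; never feed it untrusted input.
calculator = Tool(
    name="Calculator",
    func=lambda x: str(eval(x)),
    description="Useful for math calculations. Input: valid Python expression.",
)

# The prompt must be a chat prompt with an agent_scratchpad placeholder
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer questions using available tools"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Create agent with tools
tools = [calculator]
agent = create_tool_calling_agent(llm=llm, tools=tools, prompt=prompt)

# Create executor with the same tool list the agent was built with
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Run with reasoning
result = agent_executor.invoke({"input": "What is 25 * 17 + 142?"})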
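Because the calculator tool receives text chosen by the model, a restricted evaluator is a safer backing function than a general-purpose `eval`. The sketch below is one possible approach using only the standard library; `safe_eval` is a hypothetical helper, not part of LangChain, and it accepts only numbers with `+`, `-`, `*`, `/`, and unary signs.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}

def safe_eval(expression: str) -> float:
    """Evaluate a basic arithmetic expression without running arbitrary code."""
    def _eval(node: ast.AST) -> float:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)

print(safe_eval("25 * 17 + 142"))
```

A function like this could be passed as the tool's `func` (wrapped to return a string). Function calls, attribute access, and names all raise `ValueError`.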
- Memory - Conversation history
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

# Add memory to track conversation
memory = ConversationBufferMemory()

conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True,
)

# Multi-turn conversation
conversation.predict(input="Hi, I'm Alice")
conversation.predict(input="What's my name?")  # Remembers "Alice"
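Conceptually, buffer memory stores every turn and replays the full transcript ahead of each new input. The class below is a minimal sketch of that idea, not LangChain's implementation; `BufferMemory` and its method names are hypothetical.

```python
class BufferMemory:
    """Keep the full conversation and prepend it to each new prompt."""

    def __init__(self) -> None:
        self.turns: list[tuple[str, str]] = []

    def add(self, role: str, text: str) -> None:
        """Record one turn, e.g. role='Human' or role='AI'."""
        self.turns.append((role, text))

    def as_prompt(self, new_input: str) -> str:
        """Render the stored history followed by the new user input."""
        lines = [f"{role}: {text}" for role, text in self.turns]
        lines.append(f"Human: {new_input}")
        lines.append("AI:")
        return "\n".join(lines)

memory = BufferMemory()
memory.add("Human", "Hi, I'm Alice")
memory.add("AI", "Hello Alice!")
prompt = memory.as_prompt("What's my name?")
print(prompt)
```

Since the whole history is resent on every turn, prompt length grows with the conversation, which is why long chats usually need trimming or summarization.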
RAG (Retrieval-Augmented Generation)
Basic RAG pipeline
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain.chains import RetrievalQA

# 1. Load documents
loader = WebBaseLoader("https://docs.python.org/3/tutorial/")
docs = loader.load()

# 2. Split into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
)
splits = text_splitter.split_documents(docs)

# 3. Create embeddings and vector store
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=OpenAIEmbeddings(),
)

# 4. Create retriever (return the 4 most similar chunks)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

# 5. Create QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True,
)

# 6. Query (calling the chain directly is deprecated; use invoke)
result = qa_chain.invoke({"query": "What are Python decorators?"})
print(result["result"])
print(f"Sources: {result['source_documents']}")
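The retrieval step in the pipeline above (step 4) ranks chunks by vector similarity and returns the top `k`. A toy version, assuming a bag-of-words "embedding" in place of a learned model, might look like this; `embed`, `cosine`, and `top_k` are hypothetical names for illustration only.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real pipeline uses a learned model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query: str, chunks: list[str], k: int = 4) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "Decorators wrap a function to extend its behavior.",
    "Lists hold mutable sequences of items.",
    "Python decorators use the @ syntax.",
]
hits = top_k("What are Python decorators?", chunks, k=2)
print(hits)
```

A vector store does the same ranking with dense embeddings and an index that avoids scoring every chunk.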
Conversational RAG with memory
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# RAG with conversation memory
qa = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=ConversationBufferMemory(
        memory_key="chat_history",
        return_messages=True,
    ),
)

# Multi-turn RAG
qa.invoke({"question": "What is Python used for?"})
qa.invoke({"question": "Can you elaborate on web development?"})  # Remembers context
Advanced agent patterns
Structured output
# Import from pydantic directly; the langchain_core.pydantic_v1 shim is deprecated
from pydantic import BaseModel, Field

# Define schema
class WeatherReport(BaseModel):
    city: str = Field(description="City name")
    temperature: float = Field(description="Temperature in Fahrenheit")
    condition: str = Field(description="Weather condition")

# Get structured response
structured_llm = llm.with_structured_output(WeatherReport)
result = structured_llm.invoke("What's the weather in SF? It's 65F and sunny")
print(result.city, result.temperature, result.condition)
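The effect of structured output is that the model's reply is parsed and validated against a schema before your code sees it. The sketch below shows that validation step with the standard library only, assuming the model returned a JSON string; `parse_report` is a hypothetical helper and the dataclass stands in for the Pydantic model.

```python
import json
from dataclasses import dataclass, fields

@dataclass
class WeatherReport:
    city: str
    temperature: float  # Fahrenheit
    condition: str

def parse_report(raw: str) -> WeatherReport:
    """Parse a JSON reply and reject it if the keys do not match the schema."""
    data = json.loads(raw)
    expected = {f.name for f in fields(WeatherReport)}
    if set(data) != expected:
        raise ValueError(f"expected keys {sorted(expected)}, got {sorted(data)}")
    return WeatherReport(
        city=str(data["city"]),
        temperature=float(data["temperature"]),
        condition=str(data["condition"]),
    )

report = parse_report('{"city": "SF", "temperature": 65, "condition": "sunny"}')
print(report.city, report.temperature, report.condition)
```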
Parallel tool execution
from langchain.agents import create_agent

# Models that support parallel tool calling can request independent calls in one turn
agent = create_agent(model=llm, tools=[get_weather, search_web, calculator])

# The model may request get_weather("Paris") and get_weather("London") together
result = agent.invoke({
    "messages": [{"role": "user", "content": "Compare weather in Paris and London"}]
})
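When a model requests several independent tool calls in one turn, they can be executed concurrently and returned in request order. This is a generic sketch with a thread pool, not a description of how LangChain schedules tools internally; `run_tool_calls` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def get_weather(city: str) -> str:
    """Get current weather for a city (stubbed)."""
    return f"It's sunny in {city}, 72°F"

def run_tool_calls(calls: list[tuple[Callable[..., str], tuple]]) -> list[str]:
    """Run independent (function, args) pairs concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=max(len(calls), 1)) as pool:
        futures = [pool.submit(func, *args) for func, args in calls]
        # Collect in submission order so results line up with the requests.
        return [future.result() for future in futures]

results = run_tool_calls([
    (get_weather, ("Paris",)),
    (get_weather, ("London",)),
])
print(results)
```

Threads suit I/O-bound tools such as HTTP calls. Tools that depend on each other's output still have to run sequentially.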
Streaming agent execution
# Stream agent steps
for step in agent_executor.stream({"input": "Research AI trends"}):
    if "actions" in step:
        print(f"Tool: {step['actions'][0].tool}")
    if "output" in step:
        print(f"Output: {step['output']}")
Common patterns
Multi-document QA
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain_community.document_loaders import WebBaseLoader

# Load multiple documents (load() takes no arguments; pass the URLs to the loader)
loader = WebBaseLoader(web_paths=["https://docs.python.org", "https://docs.numpy.org"])
docs = loader.load()

# QA with source citations
chain = load_qa_with_sources_chain(llm, chain_type="stuff")
result = chain.invoke({"input_documents": docs, "question": "How to use numpy arrays?"})
print(result["output_text"])  # Includes source citations
Custom tools with error handling
from langchain.tools import tool

@tool
def risky_operation(query: str) -> str:
    """Perform a risky operation that might fail."""
    try:
        # Your operation here (perform_operation is a placeholder you supply)
        result = perform_operation(query)
        return f"Success: {result}"
    except Exception as e:
        return f"Error: {e}"

# The agent receives the error text as the tool result and can react to it
agent = create_agent(model=llm, tools=[risky_operation])
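If many tools need the same try/except wrapper, the pattern can be factored into a decorator. The sketch below is plain Python under that assumption; `returns_errors` and `divide` are hypothetical names, and in a LangChain project the decorator would sit beneath `@tool`.

```python
import functools
from typing import Callable

def returns_errors(func: Callable[..., str]) -> Callable[..., str]:
    """Convert exceptions into an error string the model can read and react to."""
    @functools.wraps(func)  # keep the name and docstring used as the tool description
    def wrapper(*args, **kwargs) -> str:
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            return f"Error: {type(exc).__name__}: {exc}"
    return wrapper

@returns_errors
def divide(expression: str) -> str:
    """Divide two numbers given as 'a/b'."""
    left, right = expression.split("/")
    return str(float(left) / float(right))

print(divide("6/3"))
print(divide("1/0"))
```

Including the exception type in the message gives the model more to work with when deciding whether to retry with different input.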
LangSmith observability
import os

# Enable tracing
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
os.environ["LANGCHAIN_PROJECT"] = "my-project"

# All chains/agents automatically traced
agent = create_agent(model=llm, tools=[calculator])
result = agent.invoke({"messages": [{"role": "user", "content": "Calculate 123 * 456"}]})

# View traces at smith.langchain.com
Vector stores
# Chroma (local)
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    persist_directory="./chroma_db",
)

# Pinecone (cloud)
from langchain_pinecone import PineconeVectorStore

vectorstore = PineconeVectorStore.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    index_name="my-index",
)

# FAISS (similarity search)
from langchain_community.vectorstores import FAISS

vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())
vectorstore.save_local("faiss_index")

# Load later (the index is pickled, so only load files you created yourself)
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    allow_dangerous_deserialization=True,
)
Document loaders
# Web pages
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://example.com")

# PDFs
from langchain_community.document_loaders import PyPDFLoader
loader = PyPDFLoader("paper.pdf")

# GitHub (needs a GitHub personal access token)
from langchain_community.document_loaders import GithubFileLoader
loader = GithubFileLoader(
    repo="user/repo",
    file_filter=lambda x: x.endswith(".py"),
)

# CSV
from langchain_community.document_loaders import CSVLoader
loader = CSVLoader("data.csv")
Text splitters
# Recursive (recommended for general text)
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separators=["\n\n", "\n", " ", ""],
)

# Code-aware
from langchain_text_splitters import PythonCodeTextSplitter

splitter = PythonCodeTextSplitter(chunk_size=500)

# Semantic (by meaning)
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai import OpenAIEmbeddings

splitter = SemanticChunker(OpenAIEmbeddings())
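To see what `chunk_size` and `chunk_overlap` control, here is a fixed-width sliding-window splitter. It is deliberately simpler than the recursive splitter, which also tries to break on separators; `split_with_overlap` is a hypothetical helper for illustration.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into chunk_size-character windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap  # how far each window advances
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]

chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

Each chunk repeats the tail of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is long enough.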
Best practices
- Start simple: use create_agent() for most cases
- Enable streaming: better UX for long responses
- Add error handling: tools can fail, so handle failures gracefully
- Use LangSmith: essential for debugging agents
- Optimize chunk size: 500-1000 characters for RAG
- Version prompts: track changes in production
- Cache embeddings: they are expensive, so cache when possible
- Monitor costs: track token usage with LangSmith
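The "cache embeddings" practice can be as simple as keying results by a hash of the input text. The sketch below assumes an in-memory dictionary and a fake embedding function; `CachedEmbedder` and `fake_embed` are hypothetical, and a production cache would typically persist to disk or a key-value store.

```python
import hashlib
from typing import Callable

class CachedEmbedder:
    """Wrap an embedding function so identical texts are embedded only once."""

    def __init__(self, embed_fn: Callable[[str], list[float]]) -> None:
        self._embed_fn = embed_fn
        self._cache: dict[str, list[float]] = {}
        self.misses = 0  # number of real (uncached) embedding calls

    def embed(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._embed_fn(text)
        return self._cache[key]

def fake_embed(text: str) -> list[float]:
    """Stand-in for a paid embedding API call."""
    return [float(len(text))]

embedder = CachedEmbedder(fake_embed)
first = embedder.embed("hello world")
second = embedder.embed("hello world")  # served from the cache
print(embedder.misses)
```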
Performance benchmarks
| Operation | Latency | Notes |
|---|---|---|
| Simple LLM call | ~1-2 s | Depends on provider |
| Agent with 1 tool | ~3-5 s | ReAct reasoning overhead |
| RAG retrieval | ~0.5-1 s | Vector search + LLM |
| Embedding 1000 docs | ~10-30 s | Depends on model |
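These figures vary with provider, model, and network, so it is worth measuring your own calls. A small timing helper, sketched here with the standard library (`timed` is a hypothetical name), can wrap any callable such as `llm.invoke`:

```python
import time
from typing import Any, Callable

def timed(func: Callable[..., Any], *args: Any, **kwargs: Any) -> tuple[Any, float]:
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start

# Demonstrated on a local function; swap in a model or retriever call to measure it.
total, seconds = timed(sum, range(1000))
print(f"result={total}, took {seconds:.6f}s")
```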
LangChain vs LangGraph
| Feature | LangChain | LangGraph |
|---|---|---|
| Best for | Quick agents, RAG | Complex workflows |
| Abstraction level | High | Low |
| Code to start | <10 lines | ~30 lines |
| Control | Simple | Full control |
| Stateful workflows | Limited | Native |
| Cyclic graphs | No | Yes |
| Human-in-loop | Basic | Advanced |
Use LangGraph when:
- You need stateful workflows with cycles
- You require fine-grained control
- You are building multi-agent systems
- You run production apps with complex logic
References
- Agents Guide: ReAct, tool calling, streaming
- RAG Guide: document loaders, retrievers, QA chains
- Integration Guide: vector stores, LangSmith, deployment
Resources
- GitHub: https://github.com/langchain-ai/langchain ⭐ 119,000+
- API Reference: https://reference.langchain.com/python
- LangSmith: https://smith.langchain.com (observability)
- Version: 0.3+ (stable)
- License: MIT