Letta Model Configuration
Patterns for configuring LLM models on Letta agents via SDK/API. Covers model handles, settings, provider-specific configuration, and custom endpoints.
When to Use This Skill
Use this skill when:
- Creating agents with specific model configurations
- Adjusting model settings (temperature, max tokens, context window)
- Configuring provider-specific features (OpenAI reasoning, Anthropic thinking)
- Setting up custom OpenAI-compatible endpoints
- Changing models on existing agents
- Configuring embedding models for self-hosted deployments
Not covered here: model selection advice (which model to choose). For that, see the agent-development skill's references/model-recommendations.md.
Model Handles
Models use a provider/model-name format:
| Provider | Handle Prefix | Example |
| --- | --- | --- |
| OpenAI | openai/ | openai/gpt-4o, openai/gpt-4o-mini |
| Anthropic | anthropic/ | anthropic/claude-sonnet-4-5-20250929 |
| Google AI | google_ai/ | google_ai/gemini-2.0-flash |
| Azure OpenAI | azure/ | azure/gpt-4o |
| AWS Bedrock | bedrock/ | bedrock/anthropic.claude-3-5-sonnet |
| Groq | groq/ | groq/llama-3.3-70b-versatile |
| Together | together/ | together/meta-llama/Llama-3-70b |
| OpenRouter | openrouter/ | openrouter/anthropic/claude-3.5-sonnet |
| Ollama (local) | ollama/ | ollama/llama3.2 |
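Some handles contain more than one slash (for example together/meta-llama/Llama-3-70b), so the provider prefix is everything before the first slash only. The sketch below illustrates that split. `split_handle` is a hypothetical helper written for this example, not part of the Letta SDK.

```python
def split_handle(handle: str) -> tuple[str, str]:
    """Split a 'provider/model-name' handle at the first slash only."""
    provider, sep, model_name = handle.partition("/")
    if not sep or not provider or not model_name:
        raise ValueError(f"expected 'provider/model-name', got {handle!r}")
    return provider, model_name


# The model name may itself contain slashes; only the first one separates the provider.
print(split_handle("openai/gpt-4o"))
print(split_handle("together/meta-llama/Llama-3-70b"))
```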
Basic Model Configuration
Python
    from letta_client import Letta

    client = Letta(api_key="your-api-key")

    agent = client.agents.create(
        model="openai/gpt-4o",
        model_settings={
            "provider_type": "openai",  # Required - must match model provider
            "temperature": 0.7,
            "max_output_tokens": 4096,
        },
        context_window_limit=128000,
    )
TypeScript
    import Letta from "@letta-ai/letta-client";

    const client = new Letta({ apiKey: "your-api-key" });

    const agent = await client.agents.create({
      model: "openai/gpt-4o",
      model_settings: {
        provider_type: "openai", // Required - must match model provider
        temperature: 0.7,
        max_output_tokens: 4096,
      },
      context_window_limit: 128000,
    });
Common Settings
| Setting | Type | Description |
| --- | --- | --- |
| provider_type | string | Required. Must match model provider (openai, anthropic, google_ai, etc.) |
| temperature | float | Controls randomness (0.0-2.0). Lower = more deterministic. |
| max_output_tokens | int | Maximum tokens in the response. |
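One way to keep the required and optional settings straight is to model them as a small typed container that emits the model_settings dict. This is only a sketch: `CommonModelSettings` is a hypothetical class for illustration and is not provided by the Letta SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommonModelSettings:
    """Hypothetical container for the three common settings listed above."""

    provider_type: str                       # required
    temperature: Optional[float] = None      # optional, 0.0-2.0
    max_output_tokens: Optional[int] = None  # optional cap on response tokens

    def to_dict(self) -> dict:
        """Return a model_settings dict, leaving out optional fields that are unset."""
        settings: dict = {"provider_type": self.provider_type}
        if self.temperature is not None:
            settings["temperature"] = self.temperature
        if self.max_output_tokens is not None:
            settings["max_output_tokens"] = self.max_output_tokens
        return settings


model_settings = CommonModelSettings("openai", temperature=0.7).to_dict()
print(model_settings)
```

The resulting dict has the same shape as the model_settings argument used in the examples in this document.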
Context Window Limit
Set at agent level (not inside model_settings):

    agent = client.agents.create(
        model="anthropic/claude-sonnet-4-5-20250929",
        context_window_limit=200000,  # Use 200K of Claude's context
    )
Important:
- Must be <= the model's maximum context size
- Default: 32,000 tokens if not specified
- Larger windows increase latency and may reduce reliability
- When context fills up, Letta automatically summarizes older messages
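The first two rules above can be checked before calling the API. The following is a minimal sketch under stated assumptions: `resolve_context_window` and the `ASSUMED_MAX_CONTEXT` table are hypothetical, and the maximums in it are illustrative values that should be confirmed against the provider's documentation.

```python
from typing import Optional

DEFAULT_CONTEXT_WINDOW = 32_000  # default stated in this document

# Illustrative maximums only (assumption); confirm real limits with the provider.
ASSUMED_MAX_CONTEXT = {
    "openai/gpt-4o": 128_000,
    "anthropic/claude-sonnet-4-5-20250929": 200_000,
}


def resolve_context_window(handle: str, requested: Optional[int] = None) -> int:
    """Return the value to pass as context_window_limit, or raise if it is too large."""
    limit = DEFAULT_CONTEXT_WINDOW if requested is None else requested
    if limit <= 0:
        raise ValueError("context_window_limit must be positive")
    maximum = ASSUMED_MAX_CONTEXT.get(handle)
    # Unknown handles are not checked, because no maximum is available for them here.
    if maximum is not None and limit > maximum:
        raise ValueError(f"{limit} exceeds assumed maximum {maximum} for {handle}")
    return limit


print(resolve_context_window("openai/gpt-4o"))
print(resolve_context_window("anthropic/claude-sonnet-4-5-20250929", 200_000))
```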
Changing an Agent's Model
Update existing agents with agents.update():
Python

    # Change model only
    client.agents.update(
        agent_id=agent.id,
        model="anthropic/claude-sonnet-4-5-20250929",
    )

    # Change model and settings
    client.agents.update(
        agent_id=agent.id,
        model="openai/gpt-4o",
        model_settings={
            "provider_type": "openai",
            "temperature": 0.5,
        },
        context_window_limit=64000,
    )
TypeScript
    // Change model only
    await client.agents.update(agent.id, {
      model: "anthropic/claude-sonnet-4-5-20250929",
    });

    // Change model and settings
    await client.agents.update(agent.id, {
      model: "openai/gpt-4o",
      model_settings: {
        provider_type: "openai",
        temperature: 0.5,
      },
      context_window_limit: 64000,
    });
Note: Agents retain memory and tools when changing models.
Provider-Specific Settings
For OpenAI reasoning models and Anthropic extended thinking, see references/provider-settings.md.
Custom Endpoints
For OpenAI-compatible endpoints (vLLM, LM Studio, LocalAI), see references/custom-endpoints.md.
Embedding Models
Required for self-hosted deployments (Letta Cloud handles embeddings automatically):

    agent = client.agents.create(
        model="openai/gpt-4o",
        embedding="openai/text-embedding-3-small",
    )
Common embedding models:
- openai/text-embedding-3-small (recommended)
- openai/text-embedding-3-large
- openai/text-embedding-ada-002
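If the same code targets both Letta Cloud and a self-hosted server, the embedding argument can be added conditionally. A minimal sketch, assuming the caller knows which deployment it is talking to: `build_create_kwargs` and its `self_hosted` flag are hypothetical and not part of the Letta SDK. Leaving the embedding out for Cloud is a choice made in this sketch.

```python
from typing import Optional


def build_create_kwargs(
    model: str,
    self_hosted: bool,
    embedding: Optional[str] = None,
) -> dict:
    """Build keyword arguments for agent creation, adding embedding only when self-hosted."""
    kwargs = {"model": model}
    if self_hosted:
        if embedding is None:
            raise ValueError("self-hosted deployments need an embedding model handle")
        kwargs["embedding"] = embedding
    # For Letta Cloud the embedding is left out, since Cloud handles it automatically.
    return kwargs


cloud_kwargs = build_create_kwargs("openai/gpt-4o", self_hosted=False)
local_kwargs = build_create_kwargs(
    "openai/gpt-4o", self_hosted=True, embedding="openai/text-embedding-3-small"
)
print(cloud_kwargs)
print(local_kwargs)
```

The returned dict is intended to be unpacked into the create call, as in `client.agents.create(**local_kwargs)`.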
Anti-Hallucination Checklist
Before configuring models, verify:
- Model handle uses correct provider/model-name format
- model_settings includes required provider_type field
- context_window_limit is set at agent level, not in model_settings
- Provider-specific settings use correct nested structure (see references)
- For self-hosted: embedding model is specified
- Temperature is within valid range (0.0-2.0)
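Most of this checklist can be run mechanically against the arguments before they reach the API. The sketch below is one possible way to do that. `preflight` is a hypothetical helper, not a Letta SDK function, and it skips the provider-specific structure check because that structure is defined in the reference files rather than here.

```python
def preflight(create_kwargs: dict, self_hosted: bool = False) -> list[str]:
    """Return a list of checklist violations; an empty list means every check passed."""
    problems = []

    # Handle must look like provider/model-name (split at the first slash).
    provider, sep, name = create_kwargs.get("model", "").partition("/")
    if not (provider and sep and name):
        problems.append("model handle is not in provider/model-name format")

    settings = create_kwargs.get("model_settings")
    if settings is not None:
        if "provider_type" not in settings:
            problems.append("model_settings is missing provider_type")
        if "context_window_limit" in settings:
            problems.append("context_window_limit belongs at agent level")
        temperature = settings.get("temperature")
        if temperature is not None and not 0.0 <= temperature <= 2.0:
            problems.append("temperature is outside 0.0-2.0")

    if self_hosted and "embedding" not in create_kwargs:
        problems.append("self-hosted deployment has no embedding model")

    return problems


good = {
    "model": "openai/gpt-4o",
    "model_settings": {"provider_type": "openai", "temperature": 0.7},
    "context_window_limit": 128000,
}
bad = {
    "model": "gpt-4o",
    "model_settings": {"temperature": 3.0, "context_window_limit": 1000},
}
print(preflight(good))
for problem in preflight(bad, self_hosted=True):
    print("-", problem)
```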
Example Scripts
See scripts/ for runnable examples:
- scripts/basic_config.py - Basic model configuration
- scripts/basic_config.ts - TypeScript equivalent
- scripts/change_model.py - Changing models on existing agents
- scripts/provider_specific.py - OpenAI reasoning, Anthropic thinking