# Opik Optimizer

## Purpose

Design, run, and interpret Opik Optimizer workflows for prompts, tools, and model parameters with consistent dataset/metric wiring and reproducible evaluation.
## When to use

Use this skill when a user asks for:

- Choosing and configuring Opik Optimizer algorithms for prompt/agent optimization.
- Writing `ChatPrompt`-based optimization runs and custom metric functions.
- Optimizing with tools (function calling or MCP), selected prompt roles, or prompt segments.
- Tuning LLM call parameters with `optimize_parameter`.
- Comparing optimizer outputs and interpreting `OptimizationResult`.
## Workflow

- Select an optimizer strategy (`MetaPromptOptimizer`, `FewShotBayesianOptimizer`, `HRPO`, etc.) based on the target optimization goal.
- Build prompt/dataset/metric wiring and validate placeholder-field alignment.
- Run prompt, tool, or parameter optimization with explicit controls (`n_threads`, `n_samples`, `max_trials`, seed).
- Inspect `OptimizationResult` and compare score deltas against initial baselines.
- Summarize recommendations, risks, and next experiments.
## Inputs

- Target optimization objective (prompt/tool/parameter) and success metric.
- Dataset source and expected schema fields.
- Model/provider constraints and runtime limits.
- Optional scope constraints (`optimize_prompts` segments, tool fields, project names).
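Before wiring a run, it can help to collect these inputs in one place. A minimal sketch (a hypothetical structure for planning, not an Opik type):

```python
from dataclasses import dataclass, field

@dataclass
class RunSpec:
    """Hypothetical checklist of the inputs a run needs before wiring code."""
    objective: str                               # "prompt" | "tool" | "parameter"
    metric_name: str                             # success metric to optimize
    dataset_source: str                          # where the dataset comes from
    schema_fields: list[str] = field(default_factory=list)  # expected item fields
    model: str = "openai/gpt-5-nano"             # provider-constrained model name
    max_trials: int = 10                         # runtime limit

spec = RunSpec("prompt", "levenshtein_ratio", "datasets.hotpot",
               ["question", "answer"])
print(spec.objective, spec.max_trials)  # prompt 10
```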
## Outputs

- Optimizer run configuration and rationale.
- Result interpretation (`score`, `initial_score`, history trends).
- Recommended next changes and follow-up experiment plan.
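The score delta against the baseline is the main output signal. A minimal helper for computing relative improvement (a sketch reading the two score fields named above, not an Opik utility):

```python
def improvement(initial_score: float, score: float) -> float:
    """Relative improvement of the optimized score over the baseline."""
    if initial_score == 0:
        return float("inf") if score > 0 else 0.0
    return (score - initial_score) / initial_score

# Baseline 0.50 -> optimized 0.65 is a 30% relative gain.
print(round(improvement(0.50, 0.65), 4))  # 0.3
```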
Use the reference files in this skill for details before implementing code:

- `references/algorithms.md`
- `references/prompt_agent_workflow.md`
- `references/example_patterns.md`
## Opik Optimizer quickstart

Install and import:

```bash
pip install opik-optimizer
```

```python
from opik_optimizer import ChatPrompt, MetaPromptOptimizer, HRPO, FewShotBayesianOptimizer
from opik_optimizer import datasets
```
Build a prompt and metric:

```python
from opik.evaluation.metrics import LevenshteinRatio

prompt = ChatPrompt(
    system="You are a concise answerer.",
    user="{question}",
)

def metric(dataset_item: dict, output: str) -> float:
    return LevenshteinRatio().score(
        reference=dataset_item["answer"],
        output=output,
    ).value
```
Load a dataset and run:

```python
dataset = datasets.hotpot(count=30)

result = MetaPromptOptimizer(model="openai/gpt-5-nano").optimize_prompt(
    prompt=prompt,
    dataset=dataset,
    metric=metric,
    n_samples=20,
    max_trials=10,
)
result.display()
```
## Core workflow you should follow

- Pick an optimizer class:
  - Few-shot examples + Bayesian selection: `FewShotBayesianOptimizer`
  - LLM meta-reasoning: `MetaPromptOptimizer`
  - Genetic + MOO / LLM crossover: `EvolutionaryOptimizer`
  - Hierarchical reflective diagnostics: `HierarchicalReflectiveOptimizer` (HRPO)
  - Pareto-based genetic strategy: `GepaOptimizer`
  - Parameter tuning only: `ParameterOptimizer`
- Define a single `ChatPrompt` (or a dict of prompts for multi-prompt cases).
- Provide a dataset from `opik_optimizer.datasets`.
- Provide a metric callable with signature `(dataset_item, llm_output) -> float` (or `ScoreResult` / list of `ScoreResult`).
- Set optimizer controls (`n_threads`, `n_samples`, `max_trials`, seed, etc.).
- Run one of:
  - `optimize_prompt(...)` for prompt/system behavior changes.
  - `optimize_parameter(...)` for model-call hyperparameters.
- Inspect `OptimizationResult` (`score`, `initial_score`, `history`, `optimization_id`, `get_optimized_parameters`).
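The metric contract above can be sketched with a stdlib stand-in scorer (`difflib` here is only an illustration; in real runs you would use an Opik metric class such as `LevenshteinRatio`):

```python
from difflib import SequenceMatcher

# Illustrative metric with the (dataset_item, output) -> float contract that
# optimize_prompt expects; difflib stands in for an Opik metric class.
def similarity_metric(dataset_item: dict, output: str) -> float:
    return SequenceMatcher(None, dataset_item["answer"], output).ratio()

item = {"question": "Capital of France?", "answer": "Paris"}
print(similarity_metric(item, "Paris"))  # identical strings score 1.0
```

The same callable shape also covers metrics returning a `ScoreResult`, as long as the optimizer can reduce it to a float.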
## Key execution details to enforce

- Prefer an explicit `project_name` for Opik tracking if you are using org-level observability.
- Keep placeholders in prompts aligned with dataset fields (for example, `{question}`).
- Start with `optimize_prompts="system"` or `"user"` when scope should be constrained.
- Keep `model` names in `MetaPrompt`/reasoning calls provider-compatible for your account.
- Validate multimodal input payloads by preserving only non-empty content segments.
- For small datasets, use `n_samples` and `n_samples_strategy` carefully; over-allocation automatically falls back to the full set.
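Placeholder-field alignment can be checked up front with a small stdlib helper (a hypothetical utility, not part of opik-optimizer):

```python
import string

def missing_placeholders(prompt_text: str, dataset_item: dict) -> set[str]:
    """Return `{placeholder}` names in the prompt absent from a dataset item.
    Hypothetical pre-flight check, not an Opik API."""
    fields = {name for _, name, _, _ in string.Formatter().parse(prompt_text)
              if name}
    return fields - dataset_item.keys()

# {context} has no matching dataset field, so it is flagged.
print(missing_placeholders("Answer {question} using {context}",
                           {"question": "q", "answer": "a"}))  # {'context'}
```

Running a check like this on one sample item before `optimize_prompt` catches the mismatched-placeholder failure mode listed under common mistakes.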
## Tooling and segment-based control

- Tools can be optimized via MCP/function schema fields, not only by changing prompt wording.
- For fine-grained text updates, use `optimize_prompts` values and helper functions from `prompt_segments`:
  - `extract_prompt_segments(ChatPrompt)` to inspect stable segment IDs.
  - `apply_segment_updates(ChatPrompt, updates)` for deterministic edits.
- Tool optimization is distinct from prompt optimization.
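The segment helpers can be pictured with a plain-dict sketch (illustrative only; the real `prompt_segments` functions operate on `ChatPrompt` objects and their stable segment IDs — see `references/prompt_agent_workflow.md` for the actual API):

```python
def extract_segments(messages: list[dict]) -> dict[str, str]:
    # Assign a stable ID per (role, position), mimicking segment extraction.
    return {f"{m['role']}:{i}": m["content"] for i, m in enumerate(messages)}

def apply_updates(messages: list[dict], updates: dict[str, str]) -> list[dict]:
    # Deterministically rewrite only the segments named in `updates`.
    return [{**m, "content": updates.get(f"{m['role']}:{i}", m["content"])}
            for i, m in enumerate(messages)]

msgs = [{"role": "system", "content": "Be terse."},
        {"role": "user", "content": "{question}"}]
updated = apply_updates(msgs, {"system:0": "Be precise."})
print(updated[0]["content"])  # Be precise.
```

Stable IDs are what make edits deterministic: the same update dict always touches the same segment, regardless of content changes elsewhere.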
Runnable examples live upstream in the Opik repo. If you need local runnable scripts, vendor the upstream examples into a `scripts/` folder and keep references one level deep.
## Common mistakes to avoid

- Passing an empty dataset or mismatched placeholder names.
- Mixing the deprecated constructor arg `num_threads` with `n_threads`.
- Assuming tool optimization is the same as agent function-calling optimization.
- Calling `ParameterOptimizer.optimize_prompt` (it raises and should not be used).
## Next actions

- For in-depth behavior and per-class parameter tables: `references/algorithms.md`
- For exact `optimize_prompt` signatures, prompts, tool constraints, and result usage: `references/prompt_agent_workflow.md`
- For pattern examples and source-backed workflows: `references/example_patterns.md`