Exa Observability
Overview
Monitor Exa AI search API performance, result quality, and cost efficiency. Key metrics include search latency (Exa neural search typically takes 500-2000ms), result relevance (measured by click-through or downstream usage), search volume by type (neural vs keyword vs auto), per-search cost tracking, and cache hit rates for repeated queries.
Prerequisites
- Exa API integration in production
- Metrics backend (Prometheus, Datadog, or equivalent)
- Request logging infrastructure
Instructions
Step 1: Instrument the Exa Client
import Exa from 'exa-js';
// emitHistogram, emitCounter, and emitGauge are assumed to be thin wrappers
// around your metrics backend (Prometheus, Datadog, or equivalent).
async function trackedSearch(exa: Exa, query: string, options: any) {
  const type = options.type || 'auto';
  const start = performance.now();
  try {
    const results = await exa.search(query, options);
    const duration = performance.now() - start;
    emitHistogram('exa_search_duration_ms', duration, { type });
    emitCounter('exa_searches_total', 1, { type, status: 'success' });
    emitGauge('exa_results_count', results.results.length, { type });
    return results;
  } catch (err: any) {
    // Keep the `type` label on errors too, so error rates can be split by search type
    emitCounter('exa_searches_total', 1, {
      type,
      status: 'error',
      code: String(err.status ?? 'unknown'),
    });
    throw err;
  }
}
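The `emitHistogram`, `emitCounter`, and `emitGauge` helpers are not defined in this guide. A minimal in-memory sketch is shown below, assuming you only need something to run `trackedSearch` against locally; the `seriesKey` helper and the three `Map` stores are hypothetical names, and a production setup would forward these calls to your metrics backend's own client instead.

```typescript
type Labels = Record<string, string>;

// Build a stable series key such as name{a="1",b="2"}; labels are sorted
// so the same label set always maps to the same series.
function seriesKey(name: string, labels: Labels = {}): string {
  const parts = Object.keys(labels)
    .sort()
    .map((k) => `${k}="${labels[k]}"`);
  return `${name}{${parts.join(',')}}`;
}

// Hypothetical in-memory stores, one per metric kind.
const counters = new Map<string, number>();
const gauges = new Map<string, number>();
const histograms = new Map<string, number[]>();

// Counters only ever accumulate.
function emitCounter(name: string, value: number, labels?: Labels): void {
  const key = seriesKey(name, labels);
  counters.set(key, (counters.get(key) ?? 0) + value);
}

// Gauges keep the most recent value.
function emitGauge(name: string, value: number, labels?: Labels): void {
  gauges.set(seriesKey(name, labels), value);
}

// Histograms keep raw observations here; a real backend would bucket them.
function emitHistogram(name: string, value: number, labels?: Labels): void {
  const key = seriesKey(name, labels);
  const samples = histograms.get(key) ?? [];
  samples.push(value);
  histograms.set(key, samples);
}
```

Keeping raw samples is only reasonable for local testing; real histogram clients bucket observations to bound memory.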
Step 2: Track Result Quality
// Measure whether search results are actually used by downstream consumers
function trackResultUsage(
  searchId: string,
  resultIndex: number,
  action: 'clicked' | 'used_in_context' | 'discarded'
) {
  emitCounter('exa_result_usage', 1, { action, position: String(resultIndex) });
  // Results at position 0-2 should have high usage; if not, query needs tuning
}
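To make the "positions 0-2 should have high usage" check concrete, the sketch below computes the share of top-ranked results that were not discarded. The `UsageEvent` type and `topPositionUsageRate` function are hypothetical names introduced for illustration; in practice you would likely derive the same ratio from the `exa_result_usage` counter in your metrics backend.

```typescript
type UsageAction = 'clicked' | 'used_in_context' | 'discarded';

// Hypothetical record of one usage signal for one result.
interface UsageEvent {
  position: number; // zero-based rank of the result in the search response
  action: UsageAction;
}

// Fraction of events at positions [0, topN) whose action was not 'discarded'.
// Returns null when there are no events for those positions.
function topPositionUsageRate(events: UsageEvent[], topN = 3): number | null {
  const top = events.filter((e) => e.position < topN);
  if (top.length === 0) return null;
  const used = top.filter((e) => e.action !== 'discarded').length;
  return used / top.length;
}

const sample: UsageEvent[] = [
  { position: 0, action: 'clicked' },
  { position: 1, action: 'discarded' },
  { position: 2, action: 'used_in_context' },
  { position: 5, action: 'discarded' }, // ignored: outside the top 3
];
console.log(topPositionUsageRate(sample));
```

A persistently low rate for the top positions suggests the queries, rather than the ranking depth, need tuning.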
Step 3: Monitor Search Budget
set -euo pipefail
# Check remaining search quota
curl -s https://api.exa.ai/v1/usage \
  -H "x-api-key: $EXA_API_KEY" |
  jq '{searches_today, searches_this_month, monthly_limit, budget_remaining_pct: (1 - .searches_this_month / .monthly_limit) * 100}'
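The same budget arithmetic can be done in application code, which also gives you a value to export as the `exa_monthly_searches_remaining` gauge that the `ExaBudgetLow` alert in Step 4 reads. This is a sketch only: the `UsageSnapshot` shape is assumed from the field names in the `jq` filter, and `searchesRemaining` and `budgetRemainingPct` are hypothetical helper names.

```typescript
// Assumed shape: only the fields referenced by the jq filter.
interface UsageSnapshot {
  searches_this_month: number;
  monthly_limit: number;
}

// Absolute number of searches left this month, floored at zero.
function searchesRemaining(u: UsageSnapshot): number {
  return Math.max(0, u.monthly_limit - u.searches_this_month);
}

// Same formula as the jq expression: (1 - used / limit) * 100.
function budgetRemainingPct(u: UsageSnapshot): number {
  if (u.monthly_limit <= 0) return 0; // avoid dividing by zero on an unset limit
  return (1 - u.searches_this_month / u.monthly_limit) * 100;
}

const snapshot: UsageSnapshot = { searches_this_month: 7500, monthly_limit: 10000 };
console.log(searchesRemaining(snapshot), budgetRemainingPct(snapshot));
```

Publishing the absolute count alongside the percentage keeps the alert threshold meaningful if the monthly limit changes.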
Step 4: Configure Alerts
groups:
  - name: exa
    rules:
      - alert: ExaHighLatency
        # Threshold is in milliseconds: 3000 ms = 3 seconds
        expr: histogram_quantile(0.95, rate(exa_search_duration_ms_bucket[5m])) > 3000
        annotations:
          summary: "Exa search P95 latency exceeds 3 seconds"
      - alert: ExaBudgetLow
        # Threshold is a search count: fewer than 1000 searches left this month
        expr: exa_monthly_searches_remaining < 1000
        annotations:
          summary: "Exa monthly search budget nearly exhausted"
      - alert: ExaLowResultQuality
        # sum() on both sides so the ratio is taken across all actions and positions;
        # without it, each discarded series is divided only by itself
        expr: sum(rate(exa_result_usage{action="discarded"}[1h])) / sum(rate(exa_result_usage[1h])) > 0.5
        annotations:
          summary: "Over 50% of Exa search results being discarded"
      - alert: ExaApiErrors
        # Threshold is in errors per second, per label set
        expr: rate(exa_searches_total{status="error"}[5m]) > 0.1
        annotations:
          summary: "Exa API errors detected"
Step 5: Build a Search Efficiency Dashboard
Key panels: search volume by type (neural/keyword/auto), latency p50/p95, results per search distribution, result usage rate (used vs discarded), daily cost tracking, and cache hit rate. Low result counts with high latency indicate poorly formed queries.
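For a quick check of the latency panel outside your dashboard tool, p50 and p95 can be computed directly from raw duration samples. The sketch below uses the nearest-rank method; the `percentile` function and the sample values are illustrative only. Prometheus `histogram_quantile` interpolates within buckets, so its numbers will usually differ slightly.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('percentile needs at least one sample');
  if (p <= 0 || p > 100) throw new RangeError('p must be in (0, 100]');
  const sorted = samples.slice().sort((a, b) => a - b); // copy, then numeric sort
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Illustrative search durations in milliseconds.
const durationsMs = [1200, 800, 500, 3100, 950, 1500, 700, 2000, 1100, 600];
console.log('p50', percentile(durationsMs, 50), 'p95', percentile(durationsMs, 95));
```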
Error Handling
| Issue | Cause | Solution |
|-------|-------|----------|
| 429 Too Many Requests | Rate limit exceeded | Implement exponential backoff and request queue |
| Zero results returned | Query too specific or domain filter too narrow | Broaden query, remove includeDomains filter |
| Latency spike to 5s+ | Neural search on complex query | Use type: "keyword" for simpler lookups |
| Monthly budget exhausted | Uncapped search volume | Add application-level search budget tracking |
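The exponential backoff suggested for 429 responses could look like the sketch below. It assumes the thrown error carries a numeric `status` field, as the `err.status` access in Step 1 implies; `backoffDelayMs`, `sleep`, and `withBackoff` are hypothetical helper names, and the base delay, cap, and retry count are placeholder values to tune.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `fn`, retrying only when it throws an error with status 429.
// Any other error, or a 429 after maxRetries retries, is rethrown.
async function withBackoff<T>(fn: () => Promise<T>, maxRetries = 4): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      if (err?.status !== 429 || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Usage would be along the lines of `withBackoff(() => trackedSearch(exa, query, options))`. Adding random jitter to the delay is a common refinement when many clients retry at once.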
Examples
Basic usage: Apply Exa observability to a standard project setup with default configuration options.
Advanced scenario: Customize Exa observability for production environments with multiple constraints and team-specific requirements.
Output
- Configuration files or code changes applied to the project
- Validation report confirming correct implementation
- Summary of changes made and their rationale
Resources
- Official monitoring documentation
- Community best practices and patterns
- Related skills in this plugin pack