Exa Load & Scale

Overview

Load testing, scaling strategies, and capacity planning for Exa integrations.

Prerequisites

k6 load testing tool installed
Kubernetes cluster with HPA configured
Prometheus for metrics collection
Test environment API keys

Load Testing with k6

Basic Load Test

// exa-load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 10 }, // Ramp up
    { duration: '5m', target: 10 }, // Steady state
    { duration: '2m', target: 50 }, // Ramp to peak
    { duration: '5m', target: 50 }, // Stress test
    { duration: '2m', target: 0 }, // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests finish in under 500 ms
    http_req_failed: ['rate<0.01'], // Fewer than 1% of requests fail
  },
};

export default function () {
  const response = http.post(
    'https://api.exa.com/v1/resource',
    JSON.stringify({ test: true }),
    {
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${__ENV.EXA_API_KEY}`,
      },
    }
  );

  check(response, {
    'status is 200': (r) => r.status === 200,
    'latency < 500ms': (r) => r.timings.duration < 500, // duration is in milliseconds
  });

  sleep(1); // Pause 1 s between iterations
}
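Because each virtual user (VU) in the script above sleeps for one second per iteration, the stage targets cap the request rate the test can generate. The TypeScript sketch below works through that arithmetic. It assumes one request per iteration and a hypothetical average response time of 250 ms; the helper names are illustrative and not part of k6.

```typescript
// Rough closed-loop estimate: each VU sends one request, waits for the
// response, then sleeps. Names here are illustrative, not part of k6.
interface Stage {
  durationMinutes: number;
  targetVUs: number;
}

// Stage targets copied from the k6 options above.
const stages: Stage[] = [
  { durationMinutes: 2, targetVUs: 10 },
  { durationMinutes: 5, targetVUs: 10 },
  { durationMinutes: 2, targetVUs: 50 },
  { durationMinutes: 5, targetVUs: 50 },
  { durationMinutes: 2, targetVUs: 0 },
];

// Approximate upper bound on requests per second for a given number of VUs.
function approxRequestsPerSecond(
  vus: number,
  sleepSeconds: number,
  avgResponseSeconds: number,
): number {
  return vus / (sleepSeconds + avgResponseSeconds);
}

const totalMinutes = stages.reduce((sum, s) => sum + s.durationMinutes, 0);
const peakVUs = Math.max(...stages.map((s) => s.targetVUs));

// Assumed 0.25 s average response time; substitute a measured value.
const peakRps = approxRequestsPerSecond(peakVUs, 1, 0.25);

console.log(`Total test time: ${totalMinutes} minutes`);
console.log(`Peak VUs: ${peakVUs}, approximate peak RPS: ${peakRps}`);
```

If the rate you need to exercise is higher than this bound, raise the VU targets or shorten the sleep.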

Run Load Test

# Install k6
brew install k6 # macOS
# or: sudo apt install k6 # Linux (after adding the k6 apt repository)

# Run test
k6 run --env EXA_API_KEY="${EXA_API_KEY}" exa-load-test.js

# Run with output to InfluxDB (8086 is InfluxDB's default port; adjust to your setup)
k6 run --out influxdb=http://localhost:8086/k6 exa-load-test.js

Scaling Patterns

Horizontal Scaling

# Kubernetes HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: exa-integration-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: exa-integration
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods
      pods:
        metric:
          name: exa_queue_depth
        target:
          type: AverageValue
          averageValue: 100
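Kubernetes documents the HPA's core calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). With several metrics, it takes the largest proposal and clamps it to the replica bounds. The TypeScript sketch below applies that arithmetic to the targets in the manifest above. It ignores the controller's tolerance band and stabilization behavior, and the function names and sample readings are illustrative.

```typescript
interface MetricReading {
  current: number; // observed average value per pod
  target: number; // target value from the HPA spec
}

// Replica count one metric would ask for on its own.
function proposedReplicas(currentReplicas: number, m: MetricReading): number {
  return Math.ceil(currentReplicas * (m.current / m.target));
}

// Largest proposal across all metrics, clamped to [minReplicas, maxReplicas].
function desiredReplicas(
  currentReplicas: number,
  readings: MetricReading[],
  minReplicas: number,
  maxReplicas: number,
): number {
  const proposals = readings.map((m) => proposedReplicas(currentReplicas, m));
  const largest = Math.max(...proposals);
  return Math.min(maxReplicas, Math.max(minReplicas, largest));
}

// Hypothetical readings: 90% CPU against a 70% target, and an average
// queue depth of 250 against a target of 100, with 4 pods running.
const next = desiredReplicas(
  4,
  [
    { current: 90, target: 70 },
    { current: 250, target: 100 },
  ],
  2,
  20,
);

console.log(`Desired replicas: ${next}`);
```

Here the queue-depth metric drives the decision: CPU alone would ask for 6 pods, while queue depth asks for 10.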

Connection Pooling

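import { Pool } from 'generic-pool';

// Assumes exaPool (a Pool<ExaClient>) is created elsewhere; its setup is not shown here.

// Borrow a client from the pool, run fn, and always return the client afterwards.
async function withExaClient<T>(
  fn: (client: ExaClient) => Promise<T>
): Promise<T> {
  const client = await exaPool.acquire();
  try {
    return await fn(client);
  } finally {
    exaPool.release(client);
  }
}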
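The wrapper above relies on an `exaPool` object with `acquire()` and `release()`, which this section does not define. To illustrate the behavior it depends on, here is a hand-rolled bounded pool in TypeScript. It is a simplified stand-in, not generic-pool's implementation or API, and `FakeClient` is a hypothetical placeholder for a real client.

```typescript
// Simplified bounded pool for illustration only; this is not generic-pool.
class SimplePool<T> {
  private readonly create: () => T;
  private readonly max: number;
  private readonly idle: T[] = [];
  private readonly waiters: Array<(resource: T) => void> = [];
  private created = 0;

  constructor(create: () => T, max: number) {
    this.create = create;
    this.max = max;
  }

  // Resolves with an idle resource, a new one if under the cap, or queues the caller.
  acquire(): Promise<T> {
    const resource = this.idle.pop();
    if (resource !== undefined) return Promise.resolve(resource);
    if (this.created < this.max) {
      this.created += 1;
      return Promise.resolve(this.create());
    }
    return new Promise<T>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  // Hands the resource to the oldest waiter, or parks it as idle.
  release(resource: T): void {
    const next = this.waiters.shift();
    if (next) next(resource);
    else this.idle.push(resource);
  }

  get size(): number {
    return this.created;
  }

  get pending(): number {
    return this.waiters.length;
  }
}

interface FakeClient {
  id: number;
}

let nextId = 0;
const pool = new SimplePool<FakeClient>(() => ({ id: ++nextId }), 2);

async function withClient<R>(fn: (c: FakeClient) => Promise<R>): Promise<R> {
  const client = await pool.acquire();
  try {
    return await fn(client);
  } finally {
    pool.release(client);
  }
}

// Three concurrent callers share two clients; the third waits for a release.
const results = Promise.all(
  [1, 2, 3].map((n) => withClient(async (c) => `call ${n} used client ${c.id}`)),
);
results.then((lines) => lines.forEach((line) => console.log(line)));
```

The cap of two clients means a third concurrent caller queues instead of opening a new connection. With a pool that is too small for the offered load, callers spend their time waiting in that queue.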

Capacity Planning

Metrics to Monitor

Metric	Warning	Critical
CPU Utilization	70%	85%
Memory Usage	75%	90%
Request Queue Depth	100	500
Error Rate	1%	5%
P95 Latency	1000ms	3000ms
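The thresholds above can be encoded directly so that scripts and dashboards classify readings the same way. A minimal TypeScript sketch follows. The metric keys and function name are chosen here for illustration, and treating a value equal to a threshold as having crossed it is an assumption the table does not settle.

```typescript
type Severity = 'ok' | 'warning' | 'critical';

interface Threshold {
  warning: number;
  critical: number;
}

// Values copied from the table above; the unit is part of each key name.
const thresholds: Record<string, Threshold> = {
  cpuPercent: { warning: 70, critical: 85 },
  memoryPercent: { warning: 75, critical: 90 },
  queueDepth: { warning: 100, critical: 500 },
  errorRatePercent: { warning: 1, critical: 5 },
  p95LatencyMs: { warning: 1000, critical: 3000 },
};

// Assumption: a reading equal to a threshold counts as crossing it.
function classify(metric: string, value: number): Severity {
  const t = thresholds[metric];
  if (t === undefined) throw new Error(`Unknown metric: ${metric}`);
  if (value >= t.critical) return 'critical';
  if (value >= t.warning) return 'warning';
  return 'ok';
}

console.log(classify('cpuPercent', 72));
console.log(classify('p95LatencyMs', 3200));
console.log(classify('errorRatePercent', 0.4));
```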

Capacity Calculation

// Shape inferred from the fields this section reads; adapt to your metrics source.
interface SystemMetrics {
  requestsPerSecond: number;
  p50Latency: number; // collected, but not used by this estimate
  cpuPercent: number;
}

interface CapacityEstimate {
  currentRPS: number;
  maxRPS: number;
  headroom: number; // percent above current load
  scaleRecommendation: string;
}

function estimateExaCapacity(metrics: SystemMetrics): CapacityEstimate {
  const currentRPS = metrics.requestsPerSecond;
  const cpuUtilization = metrics.cpuPercent;

  // Estimate max RPS by extrapolating linearly from current load to a 70% CPU target
  const maxRPS = (currentRPS / (cpuUtilization / 100)) * 0.7;
  const headroom = ((maxRPS - currentRPS) / currentRPS) * 100;

  return {
    currentRPS,
    maxRPS: Math.floor(maxRPS),
    headroom: Math.round(headroom),
    scaleRecommendation:
      headroom < 30 ? 'Scale up soon'
      : headroom < 50 ? 'Monitor closely'
      : 'Adequate capacity',
  };
}
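As a worked example of the formula above: at 100 RPS and 50% CPU, the estimate is 100 / 0.5 × 0.7 = 140 RPS, which is 40% headroom and falls in the 'Monitor closely' band. The TypeScript sketch below repeats that arithmetic in standalone helpers. The helper names are illustrative, and the calculation assumes throughput scales linearly with CPU, which real services only approximate.

```typescript
// Linear extrapolation from current load to a target CPU utilization.
function maxSustainableRps(
  currentRps: number,
  cpuPercent: number,
  targetUtilization = 0.7,
): number {
  if (cpuPercent <= 0) throw new Error('cpuPercent must be positive');
  return (currentRps / (cpuPercent / 100)) * targetUtilization;
}

// Headroom as a rounded percentage above the current request rate.
function headroomPercent(currentRps: number, cpuPercent: number): number {
  if (currentRps <= 0) throw new Error('currentRps must be positive');
  const maxRps = maxSustainableRps(currentRps, cpuPercent);
  return Math.round(((maxRps - currentRps) / currentRps) * 100);
}

// Same bands as the scaleRecommendation logic in this section.
function recommend(headroom: number): string {
  if (headroom < 30) return 'Scale up soon';
  if (headroom < 50) return 'Monitor closely';
  return 'Adequate capacity';
}

// Hypothetical readings at a constant 100 RPS.
for (const cpu of [25, 50, 60]) {
  const headroom = headroomPercent(100, cpu);
  console.log(`CPU ${cpu}%: headroom ${headroom}%, ${recommend(headroom)}`);
}
```

Because the guard clauses reject zero CPU and zero traffic, the helpers fail loudly on an idle system instead of returning Infinity or NaN.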

Benchmark Results Template

Exa Performance Benchmark

Date: YYYY-MM-DD
Environment: [staging/production]
SDK Version: X.Y.Z

Test Configuration

Duration: 10 minutes
Ramp: 10 → 100 → 10 VUs
Target endpoint: /v1/resource

Results

Metric	Value
Total Requests	50,000
Success Rate	99.9%
P50 Latency	120ms
P95 Latency	350ms
P99 Latency	800ms
Max RPS Achieved	150

Observations

[Key finding 1]
[Key finding 2]

Recommendations

[Scaling recommendation]
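The P50, P95, and P99 rows in the template's results table can be derived from raw latency samples. One common convention is the nearest-rank percentile, sketched below in TypeScript. k6 may compute its summary percentiles differently (for example by interpolating), so small differences from its output are possible. The sample values are hypothetical.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('samples must not be empty');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Hypothetical latency samples in milliseconds.
const latencies = [120, 95, 350, 110, 800, 130, 105, 140, 125, 115];

console.log(`P50: ${percentile(latencies, 50)}ms`);
console.log(`P90: ${percentile(latencies, 90)}ms`);
console.log(`P95: ${percentile(latencies, 95)}ms`);
```

With only ten samples, the P95 lands on the single slowest request, which is one reason to report tail latencies only from runs with many requests.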

Instructions

Step 1: Create Load Test Script

Write a k6 test script with latency and error-rate thresholds.

Step 2: Configure Auto-Scaling

Set up an HPA that scales on CPU and a custom metric.

Step 3: Run Load Test

Execute the test and collect metrics.

Step 4: Analyze and Document

Record the results in the benchmark template.

Output

Load test script created
HPA configured
Benchmark results documented
Capacity recommendations defined

Error Handling

Issue	Cause	Solution
k6 timeout	Rate limited	Reduce RPS
HPA not scaling	Wrong metrics	Verify metric name
Connection refused	Pool exhausted	Increase pool size
Inconsistent results	Warm-up needed	Add ramp-up phase

Examples

Quick k6 Test

k6 run --vus 10 --duration 30s exa-load-test.js

Check Current Capacity

const metrics = await getSystemMetrics();
const capacity = estimateExaCapacity(metrics);
console.log('Headroom:', capacity.headroom + '%');
console.log('Recommendation:', capacity.scaleRecommendation);

Scale HPA Manually

set -euo pipefail

# Manual override; an active HPA will adjust the replica count again on its next sync
kubectl scale deployment exa-integration --replicas=5
kubectl get hpa exa-integration-hpa

Resources

k6 Documentation
Kubernetes HPA
Exa Rate Limits

Next Steps

For reliability patterns, see exa-reliability-patterns.
