langchain-observability

LangChain Observability

Contents

Overview
Prerequisites
Instructions
Output
Error Handling
Examples
Resources

Overview

Set up comprehensive observability for LangChain applications with LangSmith, OpenTelemetry, Prometheus, and structured logging.

Prerequisites

LangChain application in staging/production
LangSmith account (optional but recommended)
Prometheus/Grafana infrastructure
OpenTelemetry collector (optional)

Instructions

Step 1: Enable LangSmith Tracing

Set LANGCHAIN_TRACING_V2=true, then set the LANGCHAIN_API_KEY and LANGCHAIN_PROJECT environment variables. With tracing enabled, LangChain traces chain and LLM runs automatically, with no code changes.
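As a minimal sketch, the same variables can be set from Python at process start, before any chain is constructed. The API key and project name below are placeholders chosen for illustration, not values from this guide; in a real deployment they would normally come from the environment or a secret store.

```python
import os

# Turn on LangSmith tracing for every chain and LLM run in this process.
os.environ["LANGCHAIN_TRACING_V2"] = "true"

# Placeholder values (assumptions): setdefault keeps any value that the
# deployment environment has already provided.
os.environ.setdefault("LANGCHAIN_API_KEY", "<your-langsmith-api-key>")
os.environ.setdefault("LANGCHAIN_PROJECT", "my-app-staging")
```

Setting the variables in the shell or the container spec works the same way and keeps the key out of source code.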

Step 2: Add Prometheus Metrics

Create a PrometheusCallback handler that tracks the langchain_llm_requests_total and langchain_llm_tokens_total counters and the langchain_llm_latency_seconds histogram.

Step 3: Integrate OpenTelemetry

Use OTLPSpanExporter with a custom OpenTelemetryCallback to add spans for chain and LLM operations with parent-child relationships.

Step 4: Configure Structured Logging

Use structlog with a StructuredLoggingCallback to emit JSON logs for all LLM start/end/error events.

Step 5: Set Up Grafana Dashboard

Create panels for request rate, P95 latency, token usage, and error rate using Prometheus queries.
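The four panels could be driven by PromQL queries along the following lines. This sketch builds a trimmed-down dashboard model in Python and prints it as JSON; it assumes the metric names from Step 2 plus hypothetical status and type labels, and the result will likely need a datasource reference and other fields added before Grafana accepts it as a full dashboard.

```python
import json

WINDOW = "5m"  # rate() window; an assumption, tune to the scrape interval

# Panel title -> PromQL expression. The status and type labels are assumed.
PANEL_QUERIES = {
    "Request rate (req/s)": f"sum(rate(langchain_llm_requests_total[{WINDOW}]))",
    "P95 latency (s)": (
        "histogram_quantile(0.95, sum by (le) "
        f"(rate(langchain_llm_latency_seconds_bucket[{WINDOW}])))"
    ),
    "Token usage (tokens/s)": (
        f"sum by (type) (rate(langchain_llm_tokens_total[{WINDOW}]))"
    ),
    "Error rate (ratio)": (
        f'sum(rate(langchain_llm_requests_total{{status="error"}}[{WINDOW}]))'
        f" / sum(rate(langchain_llm_requests_total[{WINDOW}]))"
    ),
}


def build_dashboard(title: str = "LangChain LLM Observability") -> dict:
    """Lays the panels out two per row on Grafana's 24-column grid."""
    panels = []
    for index, (panel_title, expr) in enumerate(PANEL_QUERIES.items()):
        panels.append(
            {
                "id": index + 1,
                "title": panel_title,
                "type": "timeseries",
                "gridPos": {"h": 8, "w": 12, "x": 12 * (index % 2), "y": 8 * (index // 2)},
                "targets": [{"refId": "A", "expr": expr}],
            }
        )
    return {"title": title, "panels": panels}


print(json.dumps(build_dashboard(), indent=2))
```

Keeping the queries in one mapping makes it straightforward to reuse the same expressions in alert rules.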

Step 6: Configure Alerting Rules

Define Prometheus alerts for high error rate (>5%), high latency (P95 >5s), and token budget exceeded.
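To make the three thresholds concrete, the sketch below assembles a Prometheus rule group as a Python dictionary. The 5% and 5s thresholds come from the step above; the 5m "for" duration, the daily budget of 1,000,000 tokens, the alert names, and the status label are assumptions for illustration. Prometheus rule files are YAML, so the structure would typically be written out with a YAML library (for example PyYAML's yaml.safe_dump) before loading; json is used here only to stay dependency-free.

```python
import json

DAILY_TOKEN_BUDGET = 1_000_000  # hypothetical budget; set to your own limit


def build_alert_rules(hold: str = "5m") -> dict:
    """Returns a rule group for error rate, P95 latency, and token budget."""
    error_ratio = (
        'sum(rate(langchain_llm_requests_total{status="error"}[5m]))'
        " / sum(rate(langchain_llm_requests_total[5m]))"
    )
    p95_latency = (
        "histogram_quantile(0.95, sum by (le) "
        "(rate(langchain_llm_latency_seconds_bucket[5m])))"
    )
    tokens_per_day = "sum(increase(langchain_llm_tokens_total[1d]))"

    def rule(name: str, expr: str, severity: str, summary: str) -> dict:
        return {
            "alert": name,
            "expr": expr,
            # "for" keeps the alert pending until the condition has held
            # this long, which damps short spikes.
            "for": hold,
            "labels": {"severity": severity},
            "annotations": {"summary": summary},
        }

    return {
        "groups": [
            {
                "name": "langchain-llm",
                "rules": [
                    rule("LangChainHighErrorRate", f"{error_ratio} > 0.05",
                         "critical", "LLM error rate above 5%"),
                    rule("LangChainHighLatency", f"{p95_latency} > 5",
                         "warning", "LLM P95 latency above 5s"),
                    rule("LangChainTokenBudgetExceeded",
                         f"{tokens_per_day} > {DAILY_TOKEN_BUDGET}",
                         "warning", "Daily token budget exceeded"),
                ],
            }
        ]
    }


print(json.dumps(build_alert_rules(), indent=2))
```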

See detailed implementation for complete callback code, dashboard JSON, and alert rules.

Output

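LangSmith tracing enabled for chain and LLM runs
Prometheus metrics exported for requests, latency, and tokens
OpenTelemetry spans for chain and LLM operations
Structured JSON logs for LLM start, end, and error events
Grafana dashboard and Prometheus alert rules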

Error Handling

| Issue | Cause | Solution |
|---|---|---|
| Missing metrics | Callback not attached | Pass the callback to the LLM constructor |
| Trace gaps | Missing context propagation | Check that each span is started under its parent span |
| Alert storms | Thresholds too sensitive | Tune the "for" duration and thresholds in the alert rules |

Examples

Basic usage: Enable LangSmith tracing with the environment variables from Step 1 and attach the PrometheusCallback to the LLM, keeping the default configuration.

Advanced scenario: For production, combine LangSmith tracing, Prometheus metrics, OpenTelemetry spans, and structured logging, then tune alert thresholds and the token budget to each team's requirements.

Resources

LangSmith Documentation
OpenTelemetry Python
Prometheus Python Client

Next Steps

Use langchain-incident-runbook for incident response procedures.
