telemetry

Access the CTO observability stack for logs, metrics, and dashboards.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "telemetry" with this command: npx skills add 5dlabs/cto/5dlabs-cto-telemetry

Telemetry Skill

Access the CTO observability stack for logs, metrics, and dashboards.

When to Use

  • Querying pod logs via Loki

  • Checking metrics via Prometheus

  • Viewing dashboards in Grafana

  • Debugging agent failures

  • Monitoring Play workflow health

Stack Overview

Service Port Purpose

Prometheus 9090 Metrics collection and querying

Loki 3100 Log aggregation (like Prometheus for logs)

Grafana 3000 Dashboards and visualization

Port Forwards (Required for Local Access)

kubectl port-forward svc/prometheus-server -n observability 9090:80 kubectl port-forward svc/loki-gateway -n observability 3100:80 kubectl port-forward svc/grafana -n observability 3000:80

MCP Tools Available

Prometheus Tools

Tool Purpose

prometheus_query

Instant query (current value)

prometheus_query_range

Range query (time series)

prometheus_labels

List all label names

prometheus_series

Find series matching labels

Loki Tools

Tool Purpose

loki_query

Query logs with LogQL

loki_labels

List all label names

loki_label_values

Get values for a label

Grafana Tools

Tool Purpose

grafana_search_dashboards

Find dashboards by name

grafana_get_dashboard

Get dashboard definition

grafana_query_prometheus

Query Prometheus via Grafana

grafana_query_loki_logs

Query Loki via Grafana

grafana_list_alert_rules

List configured alerts

Loki (Logs)

LogQL Basics

All logs from CTO namespace

{namespace="cto"}

Filter by pod name

{namespace="cto", pod=~"coderun-.*"}

Search for errors

{namespace="cto"} |= "error"

JSON parsing

{namespace="cto"} | json | level="error"

Regex filter

{namespace="cto"} |~ "tool.*mismatch"

Common Queries for CTO

All CodeRun pod logs

{namespace="cto", app="coderun"}

Morgan intake logs

{namespace="cto", pod=~"intake-.*"}

Play workflow logs

{namespace="cto", pod=~"play-.*"}

Errors only

{namespace="cto"} |= "error" | json

Tool inventory issues (A10)

{namespace="cto"} |~ "tool.*(mismatch|missing)"

MCP initialization failures (A12)

{namespace="cto"} |~ "mcp.*failed"

Config issues (A11)

{namespace="cto"} |~ "cto-config.*(missing|invalid)"

Via MCP Tool

loki_query(query='{namespace="cto"} |= "error"', limit=100)

Via curl

curl -G "http://localhost:3100/loki/api/v1/query_range"
--data-urlencode 'query={namespace="cto"} |= "error"'
--data-urlencode 'limit=100' | jq

Prometheus (Metrics)

PromQL Basics

Current CPU usage

container_cpu_usage_seconds_total{namespace="cto"}

Memory usage

container_memory_usage_bytes{namespace="cto"}

Rate of requests

rate(http_requests_total{namespace="cto"}[5m])

Pod restarts

kube_pod_container_status_restarts_total{namespace="cto"}

Common Queries for CTO

CodeRun pod count

count(kube_pod_info{namespace="cto", pod=~"coderun-.*"})

Memory by pod

container_memory_usage_bytes{namespace="cto", container!=""}

CPU by pod

rate(container_cpu_usage_seconds_total{namespace="cto"}[5m])

OOM killed containers

kube_pod_container_status_last_terminated_reason{namespace="cto", reason="OOMKilled"}

Pod restart count

sum(kube_pod_container_status_restarts_total{namespace="cto"}) by (pod)

Via MCP Tool

prometheus_query(query='count(kube_pod_info{namespace="cto"})')

Via curl

curl "http://localhost:9090/api/v1/query"
--data-urlencode 'query=kube_pod_info{namespace="cto"}' | jq

Grafana (Dashboards)

Access

Common Dashboards

Dashboard Purpose

Kubernetes / Pods Pod resource usage

Loki / Logs Log explorer

CTO Overview Platform health (if configured)

Via MCP Tool

grafana_search_dashboards(query="kubernetes") grafana_get_dashboard(uid="abc123")

kubectl Alternatives

When MCP tools aren't available, use kubectl directly:

Stream Logs

All CTO pods

kubectl logs -n cto -l app.kubernetes.io/part-of=cto -f --tail=100

Specific CodeRun

kubectl logs -n cto -l app=coderun -f

With grep

kubectl logs -n cto -l app=coderun -f | grep -E "error|mismatch|failed"

Get Pod Status

kubectl get pods -n cto -o wide kubectl describe pod -n cto <pod-name>

Events

kubectl get events -n cto --sort-by='.lastTimestamp'

Healer Integration

Healer uses Loki to watch for patterns:

// From crates/healer/src/scanner.rs // Patterns that trigger alerts: "tool\s+inventory\s+mismatch" // A10 "cto-config.*(missing|invalid)" // A11 "mcp.*failed\s+to\s+initialize" // A12

Query Healer-Relevant Logs

All Healer detection patterns

{namespace="cto"} |~ "tool.mismatch|cto-config.(missing|invalid)|mcp.*failed"

Troubleshooting

No Logs Appearing

  • Check port forward is running: lsof -i :3100

  • Verify Loki pods: kubectl get pods -n observability -l app=loki

  • Check label: loki_labels() to see available labels

Metrics Not Found

  • Check port forward: lsof -i :9090

  • Verify Prometheus: kubectl get pods -n observability -l app=prometheus

  • List metrics: prometheus_labels() or browse http://localhost:9090/targets

Grafana Not Loading

  • Check port forward: lsof -i :3000

  • Verify pod: kubectl get pods -n observability -l app.kubernetes.io/name=grafana

Reference

  • Loki LogQL Documentation

  • Prometheus PromQL Documentation

  • Grafana Documentation

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

expo-patterns

No summary provided by upstream source.

Repository SourceNeeds Review
General

better-auth-expo

No summary provided by upstream source.

Repository SourceNeeds Review
General

elysia-llm-docs

No summary provided by upstream source.

Repository SourceNeeds Review
General

frontend-excellence

No summary provided by upstream source.

Repository SourceNeeds Review