# Sage Router

HTTP server on `:8788` that routes chat requests to the optimal provider based on intent classification.
## Endpoints

- `POST /v1/chat/completions` — OpenAI-compatible; routes automatically
- `POST /v1/messages` — Anthropic Messages API compatible; translates to/from OpenAI format internally
- `GET /health` — provider status, model lists, routing debug
Any Anthropic-compatible tool (Cursor, Aider, Claude Code, Zed, Continue, OpenHands) can point at `http://localhost:8788` as the API base URL. Both streaming and non-streaming requests are supported.
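The OpenAI-compatible endpoint can be exercised with a few lines of Python. This is a minimal sketch: the payload shape follows the standard chat-completions format, while the `"auto"` model value and the absence of auth headers are assumptions about this router's defaults.

```python
import json
import urllib.request

def build_chat_request(prompt, model="auto"):
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def send_chat_request(payload, base_url="http://localhost:8788"):
    """POST the payload to the router; it picks the provider itself."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("Summarize the routing rules.")
```

Calling `send_chat_request(payload)` requires a running router on `:8788`; the payload builder works offline.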
## Active Providers

Providers are discovered from `~/.openclaw/openclaw.json` at startup.
Rules:

- skips the router's own `sage-router` provider entry to avoid recursion
- resolves `${ENV_VAR}` values for `baseUrl` and `apiKey`
- includes the OpenClaw gateway `openai-codex` as a virtual provider when the auth profile exists
- recognizes Google Gemini providers from `generativelanguage.googleapis.com`
- auto-discovers Google models when the provider exists but `models` is empty in `openclaw.json`
- normalizes `anthropic` or Anthropic-hosted `anthropic-messages` providers onto the local Dario proxy at `localhost:3456`
- starts the Dario user service when Anthropic compatibility is needed and the service is not already running; in Docker, the image bundles `@askalf/dario` and autostarts `dario proxy` when credentials are mounted at `/root/.dario`
- supports temporary provider suppression via `SAGE_ROUTER_DISABLED_PROVIDERS=name1,name2`
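The `${ENV_VAR}` resolution rule can be sketched in a few lines. The regex-based helper below is illustrative, not the router's actual implementation; the helper name and the empty-string fallback for unset variables are assumptions.

```python
import os
import re

_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def resolve_env_refs(value: str) -> str:
    """Replace ${ENV_VAR} references (e.g. in baseUrl/apiKey fields)
    with the corresponding environment variable, or "" if unset."""
    return _ENV_REF.sub(lambda m: os.environ.get(m.group(1), ""), value)

os.environ["EXAMPLE_API_KEY"] = "sk-local-test"
print(resolve_env_refs("Bearer ${EXAMPLE_API_KEY}"))  # Bearer sk-local-test
```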
`GET /health` shows:

- `configured`: all discovered providers
- `providers`: reachable providers with model lists
- `disabled`: providers suppressed by env
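An illustrative response shape for `GET /health` follows; the provider and model names are examples only, not guaranteed output of this router.

```json
{
  "configured": ["anthropic", "openai-codex", "ollama"],
  "providers": {
    "anthropic": ["claude-sonnet"],
    "ollama": ["llama3.1"]
  },
  "disabled": ["openai-codex"]
}
```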
## Routing Logic
The router does not perform mid-stream switching. Once a request is sent to a provider, the full response is returned or the attempt fails. If it fails, the next candidate in the chain is tried sequentially. There is no partial-output fallback or streaming handoff between providers.
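The sequential, all-or-nothing fallback described above can be sketched as a plain loop. `route_with_fallback` and `fake_send` are hypothetical names for illustration; the real router's attempt logic is more involved.

```python
def route_with_fallback(candidates, send):
    """Try ranked (provider, model) candidates in order; return the first
    full response. No mid-stream switching: a failed attempt is discarded
    entirely before the next candidate is tried."""
    last_error = None
    for provider, model in candidates:
        try:
            return send(provider, model)
        except Exception as err:
            last_error = err  # drop any partial output, move to next candidate
    raise RuntimeError("all candidates failed") from last_error

def fake_send(provider, model):
    """Stand-in for a real provider call."""
    if provider == "down":
        raise ConnectionError("unreachable")
    return f"{provider}/{model}: ok"

print(route_with_fallback([("down", "m1"), ("up", "m2")], fake_send))
# up/m2: ok
```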
Flow:

- detect intent from the latest user message
- estimate complexity from prompt length
- score every reachable (provider, model) pair globally (not per-provider) from `openclaw.json`
- in `local-first` mode, operate as local-strict: reject centralized Internet API providers and allow only local/LAN/Tailnet endpoints plus approved decentralized providers such as Darkbloom, with Ollama `:cloud` models excluded
- for `GENERAL`, blend static heuristics with persisted empirical latency stats by provider and model
- rank candidates by API type, model-name hints, complexity, and measured latency
- attempt the top `SAGE_ROUTER_MAX_PROVIDER_ATTEMPTS` candidates in order
- the `sage-router` provider (the router itself, model `auto`) is scored as a low-priority recursive fallback, never preferred
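The global scoring pass can be sketched as below. The weights and substring checks are invented for illustration; only the overall shape mirrors the flow above: score every pair once, sort globally, keep the top `SAGE_ROUTER_MAX_PROVIDER_ATTEMPTS`, and park `sage-router` at the bottom.

```python
def score_candidate(provider, model, intent, complexity):
    """Toy heuristic score; the weights here are illustrative only."""
    score = 0.0
    if intent in ("code", "analysis") and ("claude" in model or "gpt" in model):
        score += 2.0  # reasoning-heavy intents favor reasoning models
    if complexity > 0.7 and ("mini" in model or "haiku" in model):
        score -= 1.5  # complex prompts penalize small models
    if provider == "sage-router":
        score -= 10.0  # the router's own entry is a last-resort recursive fallback
    return score

def rank_candidates(pairs, intent, complexity, max_attempts=3):
    """Score every reachable (provider, model) pair globally, keep the top-N."""
    ranked = sorted(
        pairs,
        key=lambda pm: score_candidate(pm[0], pm[1], intent, complexity),
        reverse=True,
    )
    return ranked[:max_attempts]

pairs = [
    ("openai", "gpt-4o-mini"),
    ("anthropic", "claude-sonnet"),
    ("sage-router", "auto"),
]
print(rank_candidates(pairs, "code", 0.9, max_attempts=2))
# [('anthropic', 'claude-sonnet'), ('openai', 'gpt-4o-mini')]
```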
Intent scoring is generic, for example:
- code and analysis strongly favor Anthropic/OpenAI-style reasoning models
- general/realtime requests prefer fast direct providers first
- general traffic learns from real successful request latency over time, with light exploration for cold providers/models
- complex prompts boost larger reasoning models and penalize mini/haiku-class models
Intent is detected by keyword matching on the latest user message. Complexity is estimated by word count.
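A minimal version of that detection step, with invented keyword lists; the real keyword table and the word-count normalization cap are assumptions.

```python
INTENT_KEYWORDS = {
    "code": ("function", "bug", "refactor", "compile", "stack trace"),
    "analysis": ("analyze", "compare", "evaluate"),
    "realtime": ("latest", "current", "today"),
}

def detect_intent(message: str) -> str:
    """Keyword match on the latest user message; default to general."""
    lowered = message.lower()
    for intent, words in INTENT_KEYWORDS.items():
        if any(w in lowered for w in words):
            return intent
    return "general"

def estimate_complexity(message: str, cap: int = 400) -> float:
    """Map word count onto [0, 1]; prompts past `cap` words count as maximal."""
    return min(len(message.split()) / cap, 1.0)

print(detect_intent("Please refactor this function"))  # code
```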
## API

- `GET /health` — JSON with reachable providers, configured providers, and disabled providers
- `POST /v1/chat/completions` — OpenAI-compatible; routes automatically
## Notes

- `openai-codex` is kept as an optional bridge, not a required first hop.
- Anthropic compatibility is provided through Dario, so `anthropic` can stay in `openclaw.json` while routing locally through `dario`.
- The repo `systemd` unit is template-style and expects local machine values in `~/.config/sage-router/sage-router.env`.
- Empirical latency memory is persisted at `~/.cache/sage-router/latency-stats.json` by default.
- When the OpenClaw gateway model-set path is unhealthy, the helper falls back to running without provider/model overrides instead of failing hard.
- If any provider starts misbehaving, suppress it with `SAGE_ROUTER_DISABLED_PROVIDERS` instead of editing the router.
- GitHub workflows now include CI syntax checks and CodeQL analysis for Python + JavaScript.
- See `BRANCH_PROTECTION.md` for the exact required-check setup on GitHub.
- `provider-profiles.json` includes a `grok-sso` template for the OpenClaw xAI auth plugin's local SuperGrok-backed proxy.
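The persisted latency memory could look something like the sketch below: an exponential moving average per provider/model pair, saved as JSON. The EMA smoothing and the `alpha` value are assumptions; only the file location comes from the notes above.

```python
import json
import os

DEFAULT_STATS_PATH = os.path.expanduser("~/.cache/sage-router/latency-stats.json")

def record_latency(stats, provider, model, latency_ms, alpha=0.2):
    """Blend a new latency sample into a per-(provider, model) moving average."""
    key = f"{provider}/{model}"
    prev = stats.get(key)
    stats[key] = latency_ms if prev is None else (1 - alpha) * prev + alpha * latency_ms
    return stats

def save_stats(stats, path=DEFAULT_STATS_PATH):
    """Persist the stats dict; the cache directory is created on first use."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        json.dump(stats, f)

stats = record_latency({}, "anthropic", "claude-sonnet", 100.0)
record_latency(stats, "anthropic", "claude-sonnet", 200.0)
print(stats)  # {'anthropic/claude-sonnet': 120.0}
```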
## Install

Install the user service from the repo copy:
```bash
mkdir -p ~/.config/systemd/user ~/.config/sage-router
cp systemd/sage-router.service ~/.config/systemd/user/sage-router.service
cp systemd/sage-router.env.example ~/.config/sage-router/sage-router.env
# edit ~/.config/sage-router/sage-router.env for your machine
systemctl --user daemon-reload
systemctl --user enable --now sage-router.service
```
Notes:

- the repo unit is now env-driven and does not hardcode your home path, Node version, or workspace location
- set `SAGE_ROUTER_HOME` to the actual repo path on your machine
- optionally set `SAGE_ROUTER_PATH_PREFIX` if your Python, Node, or Dario bins are not already on `PATH`
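A filled-in `sage-router.env` might look like the fragment below. The paths are placeholders for your machine, and the `SAGE_ROUTER_MAX_PROVIDER_ATTEMPTS` value is an illustrative default, not a documented one.

```ini
# ~/.config/sage-router/sage-router.env
SAGE_ROUTER_HOME=/home/alice/src/sage-router
SAGE_ROUTER_PATH_PREFIX=/usr/local/bin:/home/alice/.local/bin
SAGE_ROUTER_MAX_PROVIDER_ATTEMPTS=3
SAGE_ROUTER_DISABLED_PROVIDERS=
```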
If an Anthropic provider is detected and Dario is not installed yet, install Dario first:
- GitHub: https://github.com/askalf/dario
## Service

```bash
systemctl --user status sage-router
systemctl --user restart sage-router
journalctl --user -u sage-router -f   # live logs
```
## Docker production notes

- The Docker image includes Node, Python, Sage Router, and `@askalf/dario`.
- Mount host Dario credentials as `~/.dario:/root/.dario` for Anthropic-compatible Claude routing.
- Enable the llama.cpp classifier sidecar with `docker compose --profile classifier up -d` and `SAGE_ROUTER_INTENT_CLASSIFIER_ENABLED=1`.
- Production classifier flags: `SAGE_ROUTER_INTENT_CLASSIFIER_PROVIDER=llamacpp`, `SAGE_ROUTER_INTENT_CLASSIFIER_BASE_URL=http://llamacpp-classifier:8080`, `SAGE_ROUTER_INTENT_CLASSIFIER_MODEL=classifier`.
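Pulling those flags together, a compose override might look like this sketch. The `sage-router` service name is an assumption about the repo's compose file; the environment values and sidecar hostname come from the notes above.

```yaml
# illustrative docker-compose override, not the repo's actual file
services:
  sage-router:
    environment:
      SAGE_ROUTER_INTENT_CLASSIFIER_ENABLED: "1"
      SAGE_ROUTER_INTENT_CLASSIFIER_PROVIDER: llamacpp
      SAGE_ROUTER_INTENT_CLASSIFIER_BASE_URL: http://llamacpp-classifier:8080
      SAGE_ROUTER_INTENT_CLASSIFIER_MODEL: classifier
    volumes:
      - ~/.dario:/root/.dario   # host Dario credentials for Claude routing
```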