# Sage Router

HTTP server on `:8788` that routes chat requests to the optimal provider based on intent classification.
## Endpoints

- `POST /v1/chat/completions` — OpenAI-compatible; routes automatically
- `POST /v1/messages` — Anthropic Messages API compatible; translates to/from OpenAI format internally
- `GET /health` — provider status, model lists, routing debug
Any Anthropic-compatible tool (Cursor, Aider, Claude Code, Zed, Continue, OpenHands) can point at `http://localhost:8788` as the API base URL. Both streaming and non-streaming requests are supported.
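The OpenAI-compatible endpoint can be exercised with a few lines of Python. This is a minimal sketch: the payload shape follows the standard chat-completions format, while the `"auto"` model value and the absence of auth headers are assumptions about this router's defaults.

```python
import json
import urllib.request

def build_chat_request(prompt, model="auto"):
    """Build an OpenAI-compatible chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def send_chat_request(payload, base_url="http://localhost:8788"):
    """POST the payload to the router; it picks the provider itself."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("Summarize the routing rules.")
```

Calling `send_chat_request(payload)` requires a running router on `:8788`; the payload builder works offline.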
## Active Providers

Providers are discovered from `~/.openclaw/openclaw.json` at startup.
Rules:

- skips the router's own `sage-router` provider entry to avoid recursion
- resolves `${ENV_VAR}` values for `baseUrl` and `apiKey`
- includes the OpenClaw gateway `openai-codex` as a virtual provider when the auth profile exists
- recognizes Google Gemini providers from `generativelanguage.googleapis.com`
- auto-discovers Google models when the provider exists but `models` is empty in `openclaw.json`
- normalizes `anthropic` or Anthropic-hosted `anthropic-messages` providers onto the local Dario proxy at `localhost:3456`
- starts the Dario user service when Anthropic compatibility is needed and the service is not already running; in Docker, the image bundles `@askalf/dario` and autostarts `dario proxy` when credentials are mounted at `/root/.dario`
- supports temporary provider suppression via `SAGE_ROUTER_DISABLED_PROVIDERS=name1,name2`
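The `${ENV_VAR}` resolution rule can be sketched in a few lines. The regex-based helper below is illustrative, not the router's actual implementation; the helper name and the empty-string fallback for unset variables are assumptions.

```python
import os
import re

_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def resolve_env_refs(value: str) -> str:
    """Replace ${ENV_VAR} references (e.g. in baseUrl/apiKey fields)
    with the corresponding environment variable, or "" if unset."""
    return _ENV_REF.sub(lambda m: os.environ.get(m.group(1), ""), value)

os.environ["EXAMPLE_API_KEY"] = "sk-local-test"
print(resolve_env_refs("Bearer ${EXAMPLE_API_KEY}"))  # Bearer sk-local-test
```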
`GET /health` shows:

- `configured`: all discovered providers
- `providers`: reachable providers with model lists
- `disabled`: providers suppressed by env
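An illustrative response shape for `GET /health` follows; the provider and model names are examples only, not guaranteed output of this router.

```json
{
  "configured": ["anthropic", "openai-codex", "ollama"],
  "providers": {
    "anthropic": ["claude-sonnet"],
    "ollama": ["llama3.1"]
  },
  "disabled": ["openai-codex"]
}
```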
## Routing Logic
The router does not perform mid-stream switching. Once a request is sent to a provider, the full response is returned or the attempt fails. If it fails, the next candidate in the chain is tried sequentially. There is no partial-output fallback or streaming handoff between providers.
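The sequential, all-or-nothing fallback described above can be sketched as a plain loop. `route_with_fallback` and `fake_send` are hypothetical names for illustration; the real router's attempt logic is more involved.

```python
def route_with_fallback(candidates, send):
    """Try ranked (provider, model) candidates in order; return the first
    full response. No mid-stream switching: a failed attempt is discarded
    entirely before the next candidate is tried."""
    last_error = None
    for provider, model in candidates:
        try:
            return send(provider, model)
        except Exception as err:
            last_error = err  # drop any partial output, move to next candidate
    raise RuntimeError("all candidates failed") from last_error

def fake_send(provider, model):
    """Stand-in for a real provider call."""
    if provider == "down":
        raise ConnectionError("unreachable")
    return f"{provider}/{model}: ok"

print(route_with_fallback([("down", "m1"), ("up", "m2")], fake_send))
# up/m2: ok
```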
Flow:

- detect intent from the latest user message
- estimate complexity from prompt length
- score every reachable (provider, model) pair globally (not per-provider) from `openclaw.json`
- in `local-first` mode, operate as local-strict: reject centralized Internet API providers and allow only local/LAN/Tailnet endpoints plus approved decentralized providers such as Darkbloom, with Ollama `:cloud` models excluded
- for `GENERAL`, blend static heuristics with persisted empirical latency stats by provider and model
- rank candidates by API type, model-name hints, complexity, and measured latency
- attempt the top `SAGE_ROUTER_MAX_PROVIDER_ATTEMPTS` candidates in order
- the `sage-router` provider (the router itself, model `auto`) is scored as a low-priority recursive fallback, never preferred
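The global scoring pass can be sketched as below. The weights and substring checks are invented for illustration; only the overall shape mirrors the flow above: score every pair once, sort globally, keep the top `SAGE_ROUTER_MAX_PROVIDER_ATTEMPTS`, and park `sage-router` at the bottom.

```python
def score_candidate(provider, model, intent, complexity):
    """Toy heuristic score; the weights here are illustrative only."""
    score = 0.0
    if intent in ("code", "analysis") and ("claude" in model or "gpt" in model):
        score += 2.0  # reasoning-heavy intents favor reasoning models
    if complexity > 0.7 and ("mini" in model or "haiku" in model):
        score -= 1.5  # complex prompts penalize small models
    if provider == "sage-router":
        score -= 10.0  # the router's own entry is a last-resort recursive fallback
    return score

def rank_candidates(pairs, intent, complexity, max_attempts=3):
    """Score every reachable (provider, model) pair globally, keep the top-N."""
    ranked = sorted(
        pairs,
        key=lambda pm: score_candidate(pm[0], pm[1], intent, complexity),
        reverse=True,
    )
    return ranked[:max_attempts]

pairs = [
    ("openai", "gpt-4o-mini"),
    ("anthropic", "claude-sonnet"),
    ("sage-router", "auto"),
]
print(rank_candidates(pairs, "code", 0.9, max_attempts=2))
# [('anthropic', 'claude-sonnet'), ('openai', 'gpt-4o-mini')]
```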
Intent scoring is generic, for example:
- code and analysis strongly favor Anthropic/OpenAI-style reasoning models
- general/realtime requests prefer fast direct providers first
- general traffic learns from real successful request latency over time, with light exploration for cold providers/models
- complex prompts boost larger reasoning models and penalize mini/haiku-class models
Intent is detected by keyword matching on the latest user message. Complexity is estimated by word count.
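A minimal version of that detection step, with invented keyword lists; the real keyword table and the word-count normalization cap are assumptions.

```python
INTENT_KEYWORDS = {
    "code": ("function", "bug", "refactor", "compile", "stack trace"),
    "analysis": ("analyze", "compare", "evaluate"),
    "realtime": ("latest", "current", "today"),
}

def detect_intent(message: str) -> str:
    """Keyword match on the latest user message; default to general."""
    lowered = message.lower()
    for intent, words in INTENT_KEYWORDS.items():
        if any(w in lowered for w in words):
            return intent
    return "general"

def estimate_complexity(message: str, cap: int = 400) -> float:
    """Map word count onto [0, 1]; prompts past `cap` words count as maximal."""
    return min(len(message.split()) / cap, 1.0)

print(detect_intent("Please refactor this function"))  # code
```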
## API

- `GET /health` — JSON with reachable providers, configured providers, and disabled providers
- `POST /v1/chat/completions` — OpenAI-compatible; routes automatically
## Notes

- `openai-codex` is kept as an optional bridge, not a required first hop.
- Anthropic compatibility is provided through Dario, so `anthropic` can stay in `openclaw.json` while routing locally through `dario`.
- The repo `systemd` unit is template-style and expects local machine values in `~/.config/sage-router/sage-router.env`.
- Empirical latency memory is persisted at `~/.cache/sage-router/latency-stats.json` by default.
- When the OpenClaw gateway model-set path is unhealthy, the helper falls back to running without provider/model overrides instead of failing hard.
- If any provider starts misbehaving, suppress it with `SAGE_ROUTER_DISABLED_PROVIDERS` instead of editing the router.
- GitHub workflows now include CI syntax checks and CodeQL analysis for Python + JavaScript.
- See `BRANCH_PROTECTION.md` for the exact required-check setup on GitHub.
- `provider-profiles.json` includes a `grok-sso` template for the OpenClaw xAI auth plugin's local SuperGrok-backed proxy.
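The persisted latency memory could look something like the sketch below: an exponential moving average per provider/model pair, saved as JSON. The EMA smoothing and the `alpha` value are assumptions; only the file location comes from the notes above.

```python
import json
import os

DEFAULT_STATS_PATH = os.path.expanduser("~/.cache/sage-router/latency-stats.json")

def record_latency(stats, provider, model, latency_ms, alpha=0.2):
    """Blend a new latency sample into a per-(provider, model) moving average."""
    key = f"{provider}/{model}"
    prev = stats.get(key)
    stats[key] = latency_ms if prev is None else (1 - alpha) * prev + alpha * latency_ms
    return stats

def save_stats(stats, path=DEFAULT_STATS_PATH):
    """Persist the stats dict; the cache directory is created on first use."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        json.dump(stats, f)

stats = record_latency({}, "anthropic", "claude-sonnet", 100.0)
record_latency(stats, "anthropic", "claude-sonnet", 200.0)
print(stats)  # {'anthropic/claude-sonnet': 120.0}
```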
## Install

Install the user service from the repo copy:
```bash
mkdir -p ~/.config/systemd/user ~/.config/sage-router
cp systemd/sage-router.service ~/.config/systemd/user/sage-router.service
cp systemd/sage-router.env.example ~/.config/sage-router/sage-router.env
# edit ~/.config/sage-router/sage-router.env for your machine
systemctl --user daemon-reload
systemctl --user enable --now sage-router.service
```
Notes:

- the repo unit is now env-driven and does not hardcode your home path, Node version, or workspace location
- set `SAGE_ROUTER_HOME` to the actual repo path on your machine
- optionally set `SAGE_ROUTER_PATH_PREFIX` if your Python, Node, or Dario bins are not already on `PATH`
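A filled-in `sage-router.env` might look like the fragment below. The paths are placeholders for your machine, and the `SAGE_ROUTER_MAX_PROVIDER_ATTEMPTS` value is an illustrative default, not a documented one.

```ini
# ~/.config/sage-router/sage-router.env
SAGE_ROUTER_HOME=/home/alice/src/sage-router
SAGE_ROUTER_PATH_PREFIX=/usr/local/bin:/home/alice/.local/bin
SAGE_ROUTER_MAX_PROVIDER_ATTEMPTS=3
SAGE_ROUTER_DISABLED_PROVIDERS=
```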
If an Anthropic provider is detected and Dario is not installed yet, install Dario first:
- GitHub: https://github.com/askalf/dario
## Service

```bash
systemctl --user status sage-router
systemctl --user restart sage-router
journalctl --user -u sage-router -f   # live logs
```
## Docker production notes

- The Docker image includes Node, Python, Sage Router, and `@askalf/dario`.
- Mount host Dario credentials as `~/.dario:/root/.dario` for Anthropic-compatible Claude routing.
- Enable the llama.cpp classifier sidecar with `docker compose --profile classifier up -d` and `SAGE_ROUTER_INTENT_CLASSIFIER_ENABLED=1`.
- Production classifier flags: `SAGE_ROUTER_INTENT_CLASSIFIER_PROVIDER=llamacpp`, `SAGE_ROUTER_INTENT_CLASSIFIER_BASE_URL=http://llamacpp-classifier:8080`, `SAGE_ROUTER_INTENT_CLASSIFIER_MODEL=classifier`.
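Pulling those flags together, a compose override might look like this sketch. The `sage-router` service name is an assumption about the repo's compose file; the environment values and sidecar hostname come from the notes above.

```yaml
# illustrative docker-compose override, not the repo's actual file
services:
  sage-router:
    environment:
      SAGE_ROUTER_INTENT_CLASSIFIER_ENABLED: "1"
      SAGE_ROUTER_INTENT_CLASSIFIER_PROVIDER: llamacpp
      SAGE_ROUTER_INTENT_CLASSIFIER_BASE_URL: http://llamacpp-classifier:8080
      SAGE_ROUTER_INTENT_CLASSIFIER_MODEL: classifier
    volumes:
      - ~/.dario:/root/.dario   # host Dario credentials for Claude routing
```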