tokenranger

Install, configure, and operate the TokenRanger OpenClaw plugin. Use when you want to reduce cloud LLM token costs by 50-80% via local Ollama context compression, or when diagnosing TokenRanger sidecar issues.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.


TokenRanger

TokenRanger compresses session context through a local Ollama SLM before sending to cloud LLMs — reducing input token costs by 50–80% per turn with graceful fallthrough if anything goes wrong.


When to Load This Skill

  • User asks to install, configure, or troubleshoot TokenRanger
  • User wants to reduce token costs or enable context compression
  • User runs /tokenranger commands and needs help interpreting output
  • User wants to switch compression strategy (GPU/CPU/off)
  • User asks about upgrading or uninstalling TokenRanger

How It Works

User message → OpenClaw gateway
  → before_agent_start hook
  → Turn 1: skip (full fidelity)
  → Turn 2+: send history to localhost:8100/compress
  → FastAPI sidecar runs LangChain LCEL chain via Ollama
  → Compressed summary prepended to context
  → Cloud LLM receives compressed context instead of full history
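
The flow above can be sketched in Python. This is a minimal sketch: `before_agent_start`'s real signature, the `/compress` request schema, and the `summary` response field are assumptions for illustration, not the plugin's actual code.

```python
import json
import urllib.request

SIDECAR_URL = "http://127.0.0.1:8100/compress"  # serviceUrl default


def before_agent_start(turn: int, history: list[str]) -> list[str]:
    # Turn 1 is always sent at full fidelity.
    if turn == 1:
        return history
    try:
        req = urllib.request.Request(
            SIDECAR_URL,
            data=json.dumps({"history": history}).encode(),
            headers={"Content-Type": "application/json"},
        )
        # timeoutMs default is 10000; a slow sidecar triggers fallthrough.
        with urllib.request.urlopen(req, timeout=10) as resp:
            summary = json.load(resp)["summary"]
        # The compressed summary replaces the full history in the outgoing context.
        return [summary]
    except Exception:
        # Graceful fallthrough: any failure sends the uncompressed history.
        return history
```

Note the asymmetry: success changes what the cloud LLM sees, but every failure path returns the original history untouched.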

The inference strategy is auto-selected from GPU availability and Ollama reachability:

| Strategy | Trigger | Model | Approach |
|---|---|---|---|
| full | GPU available | mistral:7b | Deep semantic summarization |
| light | CPU only | phi3.5:3b | Extractive bullet points |
| passthrough | Ollama unreachable | n/a | Truncate to last 20 lines |

Install

Step 1 — Install the plugin

openclaw plugins install openclaw-plugin-tokenranger

To pin an exact version:

openclaw plugins install openclaw-plugin-tokenranger@1.0.0 --pin

Step 2 — First-time setup

openclaw tokenranger setup

This pulls Ollama models, creates the Python venv, installs FastAPI/LangChain deps, and registers the sidecar as a system service (systemd on Linux, launchd on macOS).

Step 3 — Restart gateway

openclaw gateway restart

Step 4 — Verify

openclaw tokenranger

This should print the current settings and the sidecar status (reachable / unreachable).


Configuration

Set config values with:

openclaw config set plugins.entries.tokenranger.config.<key> <value>
openclaw gateway restart

| Key | Default | Description |
|---|---|---|
| serviceUrl | http://127.0.0.1:8100 | TokenRanger sidecar URL |
| timeoutMs | 10000 | Max wait (ms) before fallthrough |
| minPromptLength | 500 | Min chars before compressing |
| ollamaUrl | http://127.0.0.1:11434 | Ollama API URL |
| preferredModel | mistral:7b | Model for GPU strategy |
| compressionStrategy | auto | auto / full / light / passthrough |
| inferenceMode | auto | auto / cpu / gpu / remote |

Force CPU-only mode:

openclaw config set plugins.entries.tokenranger.config.compressionStrategy light
openclaw config set plugins.entries.tokenranger.config.inferenceMode cpu
openclaw gateway restart

Commands

| Command | Description |
|---|---|
| /tokenranger | Show current settings and sidecar health |
| /tokenranger mode gpu | Force GPU (full) compression |
| /tokenranger mode cpu | Force CPU (light) compression |
| /tokenranger mode off | Disable compression (passthrough) |
| /tokenranger model | List available Ollama models |
| /tokenranger toggle | Enable / disable the plugin |

Upgrading

# Check for updates (dry run)
openclaw plugins update tokenranger --dry-run

# Apply update
openclaw plugins update tokenranger
openclaw tokenranger setup   # re-runs setup if sidecar deps changed
openclaw gateway restart

To pin a specific version:

openclaw plugins install openclaw-plugin-tokenranger@2026.3.1 --pin
openclaw tokenranger setup
openclaw gateway restart

List all published versions:

npm view openclaw-plugin-tokenranger versions --json

Uninstalling

openclaw plugins uninstall tokenranger
openclaw gateway restart

Remove the sidecar service manually:

# Linux
systemctl --user stop tokenranger && systemctl --user disable tokenranger
rm ~/.config/systemd/user/tokenranger.service

# macOS
launchctl unload ~/Library/LaunchAgents/com.peterjohannmedina.tokenranger.plist
rm ~/Library/LaunchAgents/com.peterjohannmedina.tokenranger.plist

Troubleshooting

Sidecar unreachable after setup:

# Linux
systemctl --user status tokenranger
journalctl --user -u tokenranger -n 50

# macOS
launchctl list | grep tokenranger
cat ~/Library/Logs/tokenranger.log

# Manual start (any platform)
~/.openclaw/extensions/tokenranger/service/start.sh

Ollama not found:

curl http://127.0.0.1:11434/api/tags
# If not running:
ollama serve

Compression not reducing tokens:

  • Check minPromptLength — default 500 chars; short conversations are skipped by design
  • Run /tokenranger to confirm strategy is not passthrough
  • Check sidecar logs for errors

Graceful degradation: TokenRanger never blocks a message. Any failure → silent fallthrough to uncompressed cloud LLM call.
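
That contract, combined with the minPromptLength setting, amounts to a wrapper like this (a sketch under assumed names, not the plugin source):

```python
def maybe_compress(prompt: str, compress, min_prompt_length: int = 500) -> str:
    """Never block a message: any failure returns the prompt unchanged."""
    # Short prompts are skipped by design (minPromptLength).
    if len(prompt) < min_prompt_length:
        return prompt
    try:
        return compress(prompt)  # e.g. a timeout-bounded call to the sidecar
    except Exception:
        # Silent fallthrough to the uncompressed cloud LLM call.
        return prompt
```

A failing or slow `compress` callable leaves the message untouched, which is why a broken sidecar shows up as higher token usage rather than visible errors.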


Performance Reference

5-turn Discord benchmark (GPU, mistral:7b-instruct):

| Turn | Input tokens | Compressed | Reduction |
|---|---|---|---|
| 2 | 732 | 125 | 82.9% |
| 3 | 1,180 | 150 | 87.3% |
| 4 | 1,685 | 212 | 87.4% |
| 5 | 2,028 | 277 | 86.3% |

Cumulative: 5,866 → 885 tokens (84.9% reduction)
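
The per-turn Reduction column can be recomputed directly from the input/compressed figures:

```python
# Recompute the Reduction column from the benchmark's raw token counts.
turns = {2: (732, 125), 3: (1180, 150), 4: (1685, 212), 5: (2028, 277)}
for turn, (inp, out) in turns.items():
    print(f"turn {turn}: {(1 - out / inp) * 100:.1f}% reduction")
# turn 2: 82.9% reduction ... turn 5: 86.3% reduction
```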

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

  • 3-Layer Token Compressor — Cut AI API Costs 40-60%: Pre-process prompts through 3 compression layers before sending to paid APIs. Uses a local Ollama model to intelligently compress messages and summarize hist...
  • TurboQuant Optimizer: Optimizes OpenClaw token usage via multi-level compression, semantic deduplication, and adaptive token budgeting to reduce API costs and memory footprint.
  • Context Slim: See exactly what's eating your context window. Analyzes prompts, conversations, and system instructions to show where every token goes. Actionable compressio...
  • Session Context Compressor: Compress OpenClaw session context to reduce token usage and extend session lifetime. Uses NLP summarization (Sumy) to intelligently compact conversation history while preserving essential context. Triggers on mentions of session compression, token reduction, context cleanup, or when session size exceeds safe thresholds (~300KB). Use when (1) OpenClaw approaches 50% context limit, (2) Sessions are slowing down due to large context, (3) Reducing API costs from excessive token consumption, (4) Extending session lifetime without forced reboots.