3-Layer Token Compressor — Cut AI API Costs 40-60%

Pre-process prompts through three compression layers before sending them to paid APIs. A local Ollama model compresses messages and summarizes older conversation history, so the paid API receives fewer tokens. The goal is the same output quality at a lower cost.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install this skill with: `npx skills add theshadowrose/token-compressor`

Pre-process prompts through 3 compression layers before sending to paid APIs. Uses a free local Ollama model to do the compression work — your paid API only sees the condensed result.

Runtime Requirements

Requirement	Details
Ollama	Must be running locally (default: `localhost:11434`)
Local model	A small model for compression (e.g. `llama3.1:8b`). Configurable via `compressionModel` option.
Node.js	14+

Ollama is required at runtime. The compressor sends prompts to your local model — not to any external API.
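
Before wiring the compressor in, it can help to confirm that Ollama is reachable and that the compression model has been pulled. The sketch below is not part of this skill. It assumes Ollama's standard `GET /api/tags` endpoint, and the helper names `hasModel` and `listLocalModels` are hypothetical.

```javascript
const http = require('http');

// Hypothetical helper: true if an /api/tags response lists the given model name.
function hasModel(tagsResponse, modelName) {
  const models = (tagsResponse && tagsResponse.models) || [];
  return models.some((m) => m.name === modelName);
}

// Hypothetical helper: fetch the list of locally installed models from Ollama.
function listLocalModels(host = 'localhost', port = 11434) {
  return new Promise((resolve, reject) => {
    const req = http.get({ host, port, path: '/api/tags' }, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        try {
          resolve(JSON.parse(body));
        } catch (err) {
          reject(err); // response was not valid JSON
        }
      });
    });
    req.on('error', reject); // e.g. ECONNREFUSED when Ollama is not running
  });
}
```

Calling `listLocalModels().then((tags) => hasModel(tags, 'llama3.1:8b'))` would then resolve to `true` only when the default compression model is installed locally.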

What This Skill Sends to the Local Model

This skill sends the following to your local Ollama model:

Operation	System prompt	User prompt
Message compression	`You are a text compression tool. Output only what is asked, nothing else.`	Your message + instruction to compress
History summarization	Same	Old conversation turns + instruction to summarize

No data is sent to external APIs. All compression happens locally.
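
To make the table above concrete, here is a minimal sketch of what a message-compression request to the local model might look like. It assumes the request goes to Ollama's `POST /api/chat` endpoint; the skill may use a different endpoint. The system prompt is quoted from the table, while `COMPRESS_INSTRUCTION` and `buildCompressionRequest` are hypothetical names with placeholder wording, since the actual instruction text is not documented here.

```javascript
// System prompt quoted from the table above.
const SYSTEM_PROMPT =
  'You are a text compression tool. Output only what is asked, nothing else.';

// Hypothetical instruction text; the skill's actual wording is not documented here.
const COMPRESS_INSTRUCTION =
  'Rewrite the following message as briefly as possible while keeping its meaning:';

// Build a request body for Ollama's POST /api/chat endpoint.
function buildCompressionRequest(message, model = 'llama3.1:8b') {
  return {
    model,
    stream: false, // ask for a single JSON response instead of a stream
    messages: [
      { role: 'system', content: SYSTEM_PROMPT },
      { role: 'user', content: `${COMPRESS_INSTRUCTION}\n\n${message}` },
    ],
  };
}
```

The returned object would be JSON-serialized and posted to `http://localhost:11434/api/chat`. With `stream: false`, Ollama returns the model's reply in the `message.content` field of a single JSON response.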

Side Effects

Type	Description
NETWORK	HTTP to `localhost:11434` only — your local Ollama instance
MEMORY	Response cache stored in-memory (Map, configurable size/TTL)
DISK	None — cache is not persisted to disk
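
The MEMORY row describes a Map-based cache with a configurable size and TTL. One way such a cache could work is sketched below. This is an illustration under assumptions, not the skill's actual implementation: the class name `TtlCache`, its option names, and the evict-oldest policy are all hypothetical.

```javascript
// Minimal sketch of a size- and TTL-bounded in-memory cache backed by a Map.
class TtlCache {
  constructor({ maxSize = 100, ttlMs = 3600000, now = Date.now } = {}) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful for testing
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    // Deleting first moves a re-inserted key to the end of insertion order.
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // Evict the oldest inserted entry (first key in insertion order).
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Because nothing is written to disk, a cache like this is empty again after every process restart.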

Setup

const TokenCompressor = require('./src/token-compressor');

const compressor = new TokenCompressor({
  ollamaHost: 'localhost',      // default
  ollamaPort: 11434,            // default
  compressionModel: 'llama3.1:8b',  // default; any Ollama model works
  maxUncompressedTurns: 10,     // keep last N turns verbatim
  cacheMaxSize: 100,            // max cached responses
  cacheTTL: 3600000             // 1 hour, in milliseconds
});

See README.md for full API documentation and usage examples.
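
The `maxUncompressedTurns` option implies that the conversation is split into older turns, which are summarized, and recent turns, which are kept verbatim. A minimal sketch of that split follows. The function name `splitHistory` and the shape of its return value are hypothetical, and it assumes turns are stored oldest first.

```javascript
// Split a conversation into turns to summarize and turns to keep verbatim.
// `turns` is assumed to be an array ordered oldest to newest.
function splitHistory(turns, maxUncompressedTurns = 10) {
  const cut = Math.max(0, turns.length - maxUncompressedTurns);
  return {
    toSummarize: turns.slice(0, cut), // older turns, sent to the local model
    verbatim: turns.slice(cut),       // last N turns, passed through unchanged
  };
}
```

With the default of 10, a 12-turn conversation would yield 2 turns to summarize and 10 kept as they are. A conversation of 10 turns or fewer would not be summarized at all.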

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
