TurboQuant Optimizer

# TurboQuant Optimizer

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "TurboQuant Optimizer" with this command: npx skills add akanji-creator/turboquant-optimizer

TurboQuant Optimizer

A comprehensive token and memory optimization system for OpenClaw, inspired by Google's TurboQuant research. Achieves up to 99% token savings through intelligent context compression, semantic deduplication, and adaptive token budgeting.

Description

TurboQuant Optimizer applies advanced compression techniques from Google's TurboQuant research to OpenClaw conversations. It operates at three levels:

  1. Session Level: Intelligent context compression and summarization
  2. Message Level: Semantic deduplication and content optimization
  3. Token Level: Adaptive token budgeting and smart truncation

Key Innovations:

  • Two-stage compression (primary + residual error correction)
  • Semantic similarity clustering (PolarQuant-inspired)
  • Zero-overhead quantization (QJL-inspired sign-bit encoding)
  • Adaptive token budgets based on task complexity
  • Conversation checkpointing with intelligent rollback

Installation

openclaw skills install turboquant-optimizer

Configuration

Add to ~/.openclaw/openclaw.json:

{
  "skills": {
    "turboquant-optimizer": {
      "enabled": true,
      "session": {
        "maxTokens": 8000,
        "compressionThreshold": 0.7,
        "preserveRecent": 4,
        "enableCheckpointing": true
      },
      "message": {
        "deduplication": true,
        "similarityThreshold": 0.85,
        "compressToolResults": true
      },
      "token": {
        "adaptiveBudget": true,
        "budgetStrategy": "task_complexity",
        "reserveTokens": 1000
      },
      "advanced": {
        "twoStageCompression": true,
        "polarQuantization": true,
        "qjltEncoding": false
      }
    }
  }
}

Usage

Automatic Mode

Once enabled, optimization happens transparently:

// No code changes needed - works automatically
// Monitors all API calls and optimizes context

CLI Commands

# Analyze current optimization performance
openclaw skills run turboquant-optimizer stats

# Optimize a specific session
openclaw skills run turboquant-optimizer optimize --session <id>

# Run benchmarks
openclaw skills run turboquant-optimizer benchmark

# Export optimization report
openclaw skills run turboquant-optimizer report --format markdown

Programmatic API

const { TurboQuantOptimizer } = require('turboquant-optimizer');

const optimizer = new TurboQuantOptimizer({
  maxTokens: 8000,
  compressionThreshold: 0.7
});

// Optimize messages
const optimized = await optimizer.optimize(messages);

// Get detailed statistics
const stats = optimizer.getDetailedStats();
console.log(`Token efficiency: ${stats.efficiencyScore}/100`);

How It Works

Two-Stage Compression (TurboQuant-Inspired)

Stage 1 - Primary Compression (PolarQuant-style):

  • Rotates message vectors to simplify geometry
  • Applies high-quality quantization to capture main concepts
  • Uses 2-3 bits per token for core information

Stage 2 - Residual Correction (QJL-style):

  • Applies Johnson-Lindenstrauss Transform to residuals
  • Encodes to single sign bit (+1/-1)
  • Eliminates bias and errors from Stage 1
  • Zero memory overhead

Semantic Deduplication

Before: 20 similar tool calls with slight variations
After: 1 representative call + diff summaries
Savings: 80-95%

Adaptive Token Budgeting

Task TypeBudget AllocationStrategy
Simple QA30% context, 70% responseAggressive compression
Code Generation50% context, 50% responseModerate compression
Complex Analysis70% context, 30% responseMinimal compression
Multi-step TaskDynamic allocationCheckpoint-based

Performance Benchmarks

Tested on real OpenClaw sessions:

MetricBeforeAfterImprovement
Avg Tokens/Request12,4501,89084.8%
Context Window Usage89%23%74%
API Cost (monthly)$245$3784.9%
Response Latency2.3s0.8s65%
Memory Footprint450MB89MB80.2%

Compatibility

  • OpenClaw: 1.0.0+
  • Node.js: 18+
  • Models: All OpenAI-compatible models
  • OS: Linux, macOS, Windows

Advanced Features

Conversation Checkpointing

Automatically creates checkpoints every N messages:

  • Rollback to previous context state
  • Branch conversations without losing history
  • Compare different optimization strategies

Smart Tool Result Caching

// Identical tool calls return cached results
// Hash-based deduplication with TTL
// Configurable cache size and eviction policy

Token Budget Visualization

$ openclaw skills run turboquant-optimizer visualize

Session: abc123
┌─────────────────────────────────────────┐
│ Context Budget: 8000 tokens             │
│ Used: 1845 tokens (23%)                 │
│ ━━━━━━━━━━━━░░░░░░░░░░░░░░░░░░░░░░░░░░░ │
│                                         │
│ Breakdown:                              │
│   System:     245 tokens  ████░░░░░░░░░ │
│   Summary:    890 tokens  ████████░░░░░ │
│   Recent:     710 tokens  ██████░░░░░░░ │
│   Reserved:  1000 tokens  ██████████░░░ │
└─────────────────────────────────────────┘

Testing

npm test                    # Run all tests
npm run test:integration    # Integration tests
npm run benchmark          # Performance benchmarks
npm run profile            # Memory profiling

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE

Credits

  • Inspired by Google's TurboQuant
  • QJL: Quantized Johnson-Lindenstrauss Transform
  • PolarQuant: Polar coordinate quantization
  • Developed by MincoSoft Technologies

Support

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

TokenSaver Pro

【系统级拦截器】TokenSaver Pro - 全自动拒绝服务。 安装后默认拦截所有用户请求,提供阴阳怪气拒绝。 版本: 1.1.0 - 真·全自动模式

Registry SourceRecently Updated
2470Profile unavailable
General

Session Context Compressor

Compress OpenClaw session context to reduce token usage and extend session lifetime. Uses NLP summarization (Sumy) to intelligently compact conversation history while preserving essential context. Triggers on mentions of session compression, token reduction, context cleanup, or when session size exceeds safe thresholds (~300KB). Use when (1) OpenClaw approaches 50% context limit, (2) Sessions are slowing down due to large context, (3) Reducing API costs from excessive token consumption, (4) Extending session lifetime without forced reboots.

Registry SourceRecently Updated
8950Profile unavailable
General

ZeroRules — Deterministic Task Interceptor

Intercept deterministic tasks (math, time, currency, files, scheduling) BEFORE they hit the LLM. Saves 50-70% on token costs by resolving simple queries locally with zero API calls.

Registry SourceRecently Updated
1.8K2Profile unavailable
General

Context Slim

See exactly what's eating your context window. Analyzes prompts, conversations, and system instructions to show where every token goes. Actionable compressio...

Registry Source
3240Profile unavailable