autooptimise

Autonomously optimise any OpenClaw skill using a benchmark-driven experiment loop. Scores skill outputs 0–10 across four dimensions, identifies the lowest-scoring pattern, proposes a targeted SKILL.md change, re-tests, and keeps or discards the change based on measured improvement. Use when asked to: optimise my [skill] skill, run autooptimise on [skill], benchmark my [skill] skill, improve my skill overnight.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

To install, copy the command below and send it to your AI assistant:

Install skill "autooptimise" with this command: npx skills add wealthvisionai-source/autooptimise

autooptimise

Autonomous benchmark-driven skill optimisation for OpenClaw. Inspired by Andrej Karpathy's autoresearch — the same modify → test → score → keep/discard loop, applied to agent skill quality instead of GPU training.

Trigger Phrases

  • "optimise my weather skill"
  • "run autooptimise on [skill-name]"
  • "benchmark my [skill-name] skill"
  • "improve my skill overnight"

Key Files

  File                        Purpose
  benchmark/tasks.json        Test task suite (prompts + expected qualities)
  benchmark/scorer.md         LLM judge scoring rubric
  runner/run_experiment.md    Autonomous loop instructions (load this next)
  runner/experiment_log.md    Auto-created run log (gitignored)
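The schema of benchmark/tasks.json is not reproduced on this page. As an illustration only, a task entry might pair a prompt with the expected qualities the judge should check; every field name below is an assumption, not the skill's documented format:

```python
import json

# Hypothetical shape of one entry in benchmark/tasks.json.
# Field names ("id", "prompt", "expected_qualities") are assumptions.
task = {
    "id": "weather-basic-01",
    "prompt": "What's the weather in Oslo tomorrow?",
    "expected_qualities": [
        "calls the weather tool with city='Oslo'",
        "answers in one short sentence",
    ],
}

suite = [task]
print(json.dumps(suite, indent=2))
```

Keeping expected qualities as plain prose (rather than exact strings) fits the LLM-judge scoring model, where the judge checks qualities rather than literal matches.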

How to Run

  1. Read runner/run_experiment.md — it contains the full loop instructions
  2. Confirm the target skill with the user if not specified
  3. Execute the loop (max 3 iterations)
  4. Present proposed changes for human approval — never auto-apply
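The full loop lives in runner/run_experiment.md, but the keep/discard logic described above can be sketched as follows; `score_skill` and `propose_change` are hypothetical stand-ins for the LLM-driven steps, not functions the skill actually exposes:

```python
# Sketch of the modify -> test -> score -> keep/discard loop (max 3 iterations).
# score_skill() and propose_change() are hypothetical stand-ins for the
# LLM-driven scoring and editing steps.

def optimise(skill_md: str, score_skill, propose_change, max_iters: int = 3):
    best_text = skill_md
    best_score = score_skill(best_text)
    proposals = []  # changes to present for human approval, never auto-applied
    for _ in range(max_iters):
        candidate = propose_change(best_text)
        candidate_score = score_skill(candidate)
        if candidate_score > best_score:  # keep only measured improvements
            best_text, best_score = candidate, candidate_score
            proposals.append(candidate)
        # otherwise the candidate is discarded and the loop tries again
    return best_score, proposals

# Toy demo: "scoring" counts a keyword, "proposing" appends a sentence.
score, props = optimise(
    "Be concise.",
    score_skill=lambda t: t.count("tool"),
    propose_change=lambda t: t + " Prefer the right tool.",
)
print(score)  # -> 3 in this toy setup: every iteration improves the count
```

Note the sketch only collects proposals; per the safety rules, applying them stays a human decision.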

Scoring

Use the best available LLM judge model (prefer a strong reasoning model). Score each task 0–10 on:

  • Accuracy — correct answer / correct tool called
  • Conciseness — no padding, no unnecessary text
  • Tool usage — right tool, right parameters
  • Formatting — output matches expected format

Full rubric: benchmark/scorer.md
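The rubric itself is defined in benchmark/scorer.md. One common way to aggregate such per-dimension judge scores (an assumed aggregation, not the skill's documented formula) is a plain mean across the four dimensions and then across tasks:

```python
# Aggregate 0-10 judge scores across the four dimensions and all tasks.
# Simple averaging is an assumption; benchmark/scorer.md defines the
# actual rubric.
DIMENSIONS = ("accuracy", "conciseness", "tool_usage", "formatting")

def task_score(dims: dict) -> float:
    """Mean of the four 0-10 dimension scores for one task."""
    return sum(dims[d] for d in DIMENSIONS) / len(DIMENSIONS)

def suite_score(results: list) -> float:
    """Mean task score across the whole benchmark suite."""
    return sum(task_score(r) for r in results) / len(results)

results = [
    {"accuracy": 9, "conciseness": 7, "tool_usage": 8, "formatting": 8},
    {"accuracy": 6, "conciseness": 9, "tool_usage": 7, "formatting": 10},
]
print(suite_score(results))  # -> 8.0
```

A single suite-level number like this is what the loop needs to decide keep versus discard between iterations.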

Safety Rules

  • Never auto-apply changes. Always present a diff and wait for explicit human approval.
  • Never modify benchmark/tasks.json or benchmark/scorer.md during a run.
  • Never exceed 3 iterations per run in v0.1.
  • Log every action to runner/experiment_log.md.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Automation

Skill Factory

Build and publish OpenClaw skills from recurring pain points. Scans .learnings/ for errors that hit 3+ recurrences, scaffolds skills from them, and publishes...

Automation

ADK Skill Patterns

5 proven agent skill design patterns (Tool Wrapper, Generator, Reviewer, Inversion, Pipeline) from Google's ADK. Build reliable, composable skills with templ...

Automation

agents

No summary provided by upstream source.

General

Multi-Skill-Eval | Integrated Skill Evaluation System

Integrated multi-method skill evaluation system. Combines static analysis (skill-assessment), rubric-based quality scoring (skill-evaluator), and autonomous benchmarking (skill-eval). Use it to comprehensively evaluate, compare, audit, or improve OpenClaw skills. Covers documentation completeness, code quality, 25-item rubric scoring, and multi-model benchmarking. Trigger phrases (Chinese): 评估技...
