# autooptimise
Autonomous benchmark-driven skill optimisation for OpenClaw. Inspired by Andrej Karpathy's autoresearch — the same modify → test → score → keep/discard loop, applied to agent skill quality instead of GPU training.
## Trigger Phrases

- "optimise my weather skill"
- "run autooptimise on [skill-name]"
- "benchmark my [skill-name] skill"
- "improve my skill overnight"
## Key Files
| File | Purpose |
|---|---|
| `benchmark/tasks.json` | Test task suite (prompts + expected qualities) |
| `benchmark/scorer.md` | LLM judge scoring rubric |
| `runner/run_experiment.md` | Autonomous loop instructions (load this next) |
| `runner/experiment_log.md` | Auto-created run log (gitignored) |
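The exact schema of the task suite is defined by `benchmark/tasks.json` itself; purely as an illustration, a single entry might look like the following (all field names here are hypothetical, not the real schema):

```json
{
  "id": "weather-basic-01",
  "prompt": "What's the weather in Berlin right now?",
  "expected_tool": "get_weather",
  "expected_qualities": [
    "calls the weather tool with the correct city",
    "answers in one concise sentence"
  ]
}
```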
## How to Run

- Read `runner/run_experiment.md` — it contains the full loop instructions
- Confirm the target skill with the user if not specified
- Execute the loop (max 3 iterations)
- Present proposed changes for human approval — never auto-apply
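The modify → test → score → keep/discard loop can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not OpenClaw's actual implementation: `propose_variant` and `run_benchmark` are hypothetical stand-ins for the instructions in `runner/run_experiment.md`.

```python
# Hypothetical sketch of the autooptimise loop (v0.1 caps it at 3 iterations).
MAX_ITERATIONS = 3

def optimise(skill_text, run_benchmark, propose_variant):
    """Modify -> test -> score -> keep/discard, tracking the best variant."""
    best_text = skill_text
    best_score = run_benchmark(best_text)       # baseline score
    for _ in range(MAX_ITERATIONS):
        candidate = propose_variant(best_text)  # modify
        score = run_benchmark(candidate)        # test + score
        if score > best_score:                  # keep only strict improvements
            best_text, best_score = candidate, score
        # otherwise the candidate is discarded
    # Caller must still present the resulting diff for human approval.
    return best_text, best_score
```

Note that the function only *returns* the best variant; per the safety rules, nothing is applied until a human approves the diff.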
## Scoring
Use the best available LLM judge model (prefer a strong reasoning model). Score each task 0–10 on:
- Accuracy — correct answer / correct tool called
- Conciseness — no padding, no unnecessary text
- Tool usage — right tool, right parameters
- Formatting — output matches expected format
Full rubric: `benchmark/scorer.md`
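As a sketch, assuming the four dimensions are weighted equally (the authoritative weighting lives in `benchmark/scorer.md`), a task's overall score could be aggregated like this:

```python
# Hypothetical aggregation of the judge's four 0-10 dimension scores.
# Equal weighting is an assumption, not taken from benchmark/scorer.md.
DIMENSIONS = ("accuracy", "conciseness", "tool_usage", "formatting")

def task_score(judge_scores):
    """Return the mean of the per-dimension 0-10 scores for one task."""
    for dim in DIMENSIONS:
        value = judge_scores[dim]
        if not 0 <= value <= 10:
            raise ValueError(f"{dim} score {value} is outside 0-10")
    return sum(judge_scores[d] for d in DIMENSIONS) / len(DIMENSIONS)
```

For example, a task judged 8 / 6 / 10 / 8 across the four dimensions would average to 8.0 under this equal-weight assumption.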
## Safety Rules
- Never auto-apply changes. Always present a diff and wait for explicit human approval.
- Never modify `benchmark/tasks.json` or `benchmark/scorer.md` during a run.
- Never exceed 3 iterations per run in v0.1.
- Log every action to `runner/experiment_log.md`.