openclaw-self-healing

4-tier autonomous self-healing system for OpenClaw Gateway with persistent learning, reasoning logs, and multi-channel alerts. Features Claude Code as Level 3 emergency doctor for AI-powered diagnosis and repair.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "openclaw-self-healing" with this command: npx skills add ramsbaby/openclaw-self-healing/ramsbaby-openclaw-self-healing-openclaw-self-healing

OpenClaw Self-Healing System

"The system that heals itself — or calls for help when it can't."

A 4-tier autonomous self-healing system for OpenClaw Gateway.

Architecture

Level 1: Watchdog (180s)     → Process monitoring (OpenClaw built-in)
Level 2: Health Check (300s) → HTTP 200 + 3 retries
Level 3: Claude Recovery     → 30min AI-powered diagnosis 🧠
Level 4: Discord Alert       → Human escalation
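
The ladder above can be read as a small state machine in which each tier hands off to the next only when it fails. A minimal sketch, assuming two hypothetical helpers (`next_tier` and `tier_name`, neither of which is part of the repository):

```shell
# Hypothetical helper: decide which tier acts next.
# Usage: next_tier <current-level 1-4> <ok|fail>
# Prints the next level, or "done" when no further action is needed.
next_tier() {
    level=$1
    outcome=$2
    if [ "$outcome" = "ok" ]; then
        echo "done"            # recovered; nothing to escalate
    elif [ "$level" -lt 4 ]; then
        echo $((level + 1))    # hand off to the next tier
    else
        echo "done"            # Level 4 alerts a human; the ladder ends here
    fi
}

# Map a level number to the label used in the architecture list.
tier_name() {
    case "$1" in
        1) echo "Watchdog" ;;
        2) echo "Health Check" ;;
        3) echo "Claude Recovery" ;;
        4) echo "Discord Alert" ;;
        *) echo "unknown" ;;
    esac
}
```

For example, `tier_name "$(next_tier 2 fail)"` prints `Claude Recovery`.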

What's Special (v2.0)

Claude Code as Level 3 emergency doctor (billed by the author as a world first)
Persistent Learning - Automatic recovery documentation (symptom → cause → solution → prevention)
Reasoning Logs - Explainable AI decision-making process
Multi-Channel Alerts - Discord + Telegram support
Metrics Dashboard - Success rate, recovery time, trending analysis
Production-tested (verified recovery Feb 5-6, 2026)
macOS LaunchAgent integration
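
The persistent-learning entries follow a symptom → cause → solution → prevention shape. Below is a minimal sketch of how such an entry might be appended to a log. The `record_recovery` function, the file path, and the Markdown layout are assumptions for illustration, not the format written by `emergency-recovery-v2.sh`:

```shell
# Hypothetical helper: append one recovery record to a learning log.
# Usage: record_recovery <file> <symptom> <cause> <solution> <prevention>
record_recovery() {
    file=$1
    {
        printf '%s\n' "## $(date '+%Y-%m-%d %H:%M:%S')"
        printf '%s\n' "- Symptom: $2"
        printf '%s\n' "- Cause: $3"
        printf '%s\n' "- Solution: $4"
        printf '%s\n' "- Prevention: $5"
        printf '\n'
    } >> "$file"
}
```

Appending rather than overwriting keeps earlier incidents available for the next diagnosis.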

Quick Setup

1. Install Dependencies

brew install tmux
npm install -g @anthropic-ai/claude-code

2. Configure Environment

# Copy template to OpenClaw config directory
cp .env.example ~/.openclaw/.env

# Edit and add your Discord webhook (optional)
nano ~/.openclaw/.env
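
The template's contents are not shown in this listing. Assuming it carries the variables listed in the Configuration table, with the documented defaults, the edited file might look like this (the webhook is left empty as a placeholder):

```shell
# ~/.openclaw/.env (illustrative values, not the shipped template)

# Optional: Discord webhook for alerts; left empty here
DISCORD_WEBHOOK_URL=""

# Gateway health check URL
OPENCLAW_GATEWAY_URL="http://localhost:18789/"

# Restart attempts before escalation
HEALTH_CHECK_MAX_RETRIES=3

# Claude recovery timeout (1800 s = 30 min)
EMERGENCY_RECOVERY_TIMEOUT=1800
```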

3. Install Scripts

# Copy scripts (create the target directory if it does not exist yet)
mkdir -p ~/openclaw/scripts
cp scripts/*.sh ~/openclaw/scripts/
chmod +x ~/openclaw/scripts/*.sh

# Install LaunchAgent
cp launchagent/com.openclaw.healthcheck.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.openclaw.healthcheck.plist
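
The plist shipped in the repository is not reproduced in this listing. As a rough sketch of what a LaunchAgent for a periodic health check could contain, the hypothetical `print_healthcheck_plist` function below emits one. The label is taken from the file name above; the 300-second default, the script path, and the choice of keys are assumptions:

```shell
# Hypothetical generator for a LaunchAgent definition (not the repository's file).
# launchd does not expand "~", so the script path is built from $HOME.
print_healthcheck_plist() {
    interval=${1:-300}   # seconds between runs; 300 mirrors the Level 2 interval
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.openclaw.healthcheck</string>
    <key>ProgramArguments</key>
    <array>
        <string>/bin/bash</string>
        <string>${HOME}/openclaw/scripts/gateway-healthcheck.sh</string>
    </array>
    <key>StartInterval</key>
    <integer>${interval}</integer>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
EOF
}
```

Printing the sketch and comparing it with the shipped plist before loading is one way to see which interval and script the agent will actually run.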

4. Verify

# Check Health Check is running
launchctl list | grep openclaw.healthcheck

# View logs
tail -f ~/openclaw/memory/healthcheck-$(date +%Y-%m-%d).log

Scripts

Script	Level	Description
`gateway-healthcheck.sh`	2	HTTP 200 check + 3 retries + escalation
`emergency-recovery.sh`	3	Claude Code PTY session for AI diagnosis (v1)
`emergency-recovery-v2.sh`	3	Enhanced with learning + reasoning logs (v2) ⭐
`emergency-recovery-monitor.sh`	4	Discord/Telegram notification on failure
`metrics-dashboard.sh`	-	Visualize recovery statistics (NEW)
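
To make the Level 2 behavior concrete, here is a minimal sketch of an HTTP 200 check with retries followed by escalation. It is not the contents of `gateway-healthcheck.sh`: the function names, the `RETRY_DELAY` knob, and the `PROBE_CMD` override are assumptions, and the sketch only waits between attempts where the real script is described as restarting the gateway.

```shell
GATEWAY_URL=${OPENCLAW_GATEWAY_URL:-http://localhost:18789/}
MAX_RETRIES=${HEALTH_CHECK_MAX_RETRIES:-3}
RETRY_DELAY=${RETRY_DELAY:-10}   # hypothetical knob: seconds between attempts

# Print the HTTP status code for the gateway URL ("000" if unreachable).
probe_gateway() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$GATEWAY_URL" || true
}

# Return 0 as soon as a probe reports 200; return 1 after MAX_RETRIES failures.
# PROBE_CMD lets a caller substitute another probe (useful for dry runs).
check_with_retries() {
    attempt=1
    while [ "$attempt" -le "$MAX_RETRIES" ]; do
        status=$(${PROBE_CMD:-probe_gateway})
        if [ "$status" = "200" ]; then
            echo "✅ Gateway healthy"
            return 0
        fi
        echo "attempt $attempt/$MAX_RETRIES failed (HTTP $status)" >&2
        attempt=$((attempt + 1))
        if [ "$attempt" -le "$MAX_RETRIES" ]; then
            sleep "$RETRY_DELAY"
        fi
    done
    echo "escalating to Level 3" >&2
    return 1
}
```

A caller could then write `check_with_retries || bash ~/openclaw/scripts/emergency-recovery-v2.sh`, although how the real scripts chain together is not shown in this listing.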

Configuration

All settings via environment variables in ~/.openclaw/.env:

Variable	Default	Description
`DISCORD_WEBHOOK_URL`	(none)	Discord webhook for alerts
`OPENCLAW_GATEWAY_URL`	`http://localhost:18789/`	Gateway health check URL
`HEALTH_CHECK_MAX_RETRIES`	`3`	Restart attempts before escalation
`EMERGENCY_RECOVERY_TIMEOUT`	`1800`	Claude recovery timeout in seconds (30 min)
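
A script can pick up these settings by sourcing the file and falling back to the documented defaults for anything left unset. A minimal sketch, assuming the file uses plain `KEY=value` shell syntax; `load_openclaw_env` is a hypothetical name, not a function from the repository:

```shell
# Hypothetical loader: source the env file if present, then apply defaults.
# Usage: load_openclaw_env [path-to-env-file]
load_openclaw_env() {
    env_file=${1:-"$HOME/.openclaw/.env"}
    if [ -f "$env_file" ]; then
        set -a          # export every variable assigned while sourcing
        . "$env_file"
        set +a
    fi
    # Defaults from the table above; values already set are left untouched.
    : "${DISCORD_WEBHOOK_URL:=}"
    : "${OPENCLAW_GATEWAY_URL:=http://localhost:18789/}"
    : "${HEALTH_CHECK_MAX_RETRIES:=3}"
    : "${EMERGENCY_RECOVERY_TIMEOUT:=1800}"
}
```

Using `:=` means a value from the file or the calling environment takes precedence, and the default applies only when nothing else has set the variable.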

Testing

Test Level 2 (Health Check)

# Run manually
bash ~/openclaw/scripts/gateway-healthcheck.sh

# Expected output:
# ✅ Gateway healthy

Test Level 3 (Claude Recovery)

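# Back up the config first
cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak

# Inject a config error by editing ~/.openclaw/openclaw.json by hand,
# then wait for Health Check to detect and escalate (~8 min)
tail -f ~/openclaw/memory/emergency-recovery-*.log

# Afterwards, restore the backup if the config is still broken
cp ~/.openclaw/openclaw.json.bak ~/.openclaw/openclaw.json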

License

MIT License - do whatever you want with it.

Built by @ramsbaby + Jarvis 🦞

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
