VMware AIops
AI-powered VMware vCenter and ESXi operations tool. Manage your entire VMware infrastructure using natural language through any AI coding assistant.
Need read-only monitoring only? Use VMware-Monitor — an independent repository with code-level safety (zero destructive code in the codebase). Install:
clawhub install vmware-monitor
When to Use This Skill
- Query VM, host, datastore, cluster, and network inventory
- Check health status, active alarms, hardware sensors, and event logs
- Perform VM lifecycle operations: power on/off, create, delete, snapshot, clone, migrate
- Deploy VMs from OVA, templates, linked clones, or batch specs
- Browse datastores and discover ISO/OVA/VMDK images
- Monitor vSAN health, capacity, disk groups, and performance
- Access Aria Operations (VCF Operations) for historical metrics, anomaly detection, and capacity planning
- Manage vSphere Kubernetes Service (VKS) clusters
- Run scheduled scanning with webhook notifications (Slack, Discord)
Quick Install
All install methods fetch from the same source: github.com/zw008/VMware-AIops (MIT licensed). We recommend reviewing the source code before installing.
# Via Skills.sh (fetches from GitHub)
npx skills add zw008/VMware-AIops
# Via ClawHub (fetches from ClawHub registry snapshot of GitHub)
clawhub install vmware-aiops
# Via PyPI (recommended for version pinning)
uv tool install vmware-aiops==0.6.0
Claude Code
/plugin marketplace add zw008/VMware-AIops
/plugin install vmware-ops
/vmware-ops:vmware-aiops
Usage Mode
Choose the best mode based on your environment:
| Scenario | Recommended Mode | Why |
|---|---|---|
| Cloud models (Claude, GPT-4o, Gemini) | MCP or CLI | Both work well; MCP gives structured JSON I/O |
| Local/small models (Ollama, Llama, Qwen <32B) | CLI | Lower token cost (~2K vs ~10K), higher accuracy — small models struggle with 31 MCP tool schemas |
| Token-sensitive workflows | CLI | CLI via SKILL.md uses ~2K tokens; MCP loads ~10K tokens of tool definitions into every conversation |
| Automated pipelines / Agent chaining | MCP | Structured JSON input/output, type-safe parameters, no shell parsing |
Calling Priority
- MCP-native tools (Claude Code, Cursor): MCP first, CLI fallback
- Local models / Token-sensitive: CLI first (MCP not needed)
- All other tools: CLI first
Tip: For token-sensitive scenarios, use CLI mode — the AI reads SKILL.md (~2K tokens) and calls commands via Bash. MCP mode loads all 31 tool schemas (~10K tokens) into context on every conversation turn.
CLI Examples
# Activate venv first
source /path/to/VMware-AIops/.venv/bin/activate
# Inventory
vmware-aiops inventory vms --target home-esxi
vmware-aiops inventory hosts --target home-esxi
# Health
vmware-aiops health alarms --target home-esxi
# VM operations
vmware-aiops vm info my-vm --target home-esxi
vmware-aiops vm power-on my-vm --target home-esxi
MCP Mode (Optional)
For Claude Code / Cursor users who prefer structured tool calls, add to ~/.claude/settings.json:
{
"mcpServers": {
"vmware-aiops": {
"command": "/path/to/VMware-AIops/.venv/bin/python",
"args": ["-m", "mcp_server"],
"cwd": "/path/to/VMware-AIops",
"env": {
"VMWARE_AIOPS_CONFIG": "~/.vmware-aiops/config.yaml"
}
}
}
}
MCP exposes 31 tools across 6 categories. All accept optional target parameter.
| Category | Tools |
|---|---|
| Inventory | list_virtual_machines, list_esxi_hosts, list_all_datastores, list_all_clusters |
| Health | get_alarms, get_events, vm_info |
| VM Lifecycle | vm_power_on, vm_power_off, vm_set_ttl, vm_cancel_ttl, vm_list_ttl, vm_clean_slate |
| Deployment | deploy_vm_from_ova, deploy_vm_from_template, deploy_linked_clone, attach_iso_to_vm, convert_vm_to_template, batch_clone_vms, batch_linked_clone_vms, batch_deploy_from_spec |
| Guest Operations | vm_guest_exec, vm_guest_upload, vm_guest_download |
| Plan → Apply | vm_create_plan, vm_apply_plan, vm_rollback_plan, vm_list_plans |
| Datastore | browse_datastore, scan_datastore_images, list_cached_images |
list_virtual_machines auto-compacts when inventory exceeds 50 VMs (returns compact fields only). Use limit or fields to override.
Architecture
User (Natural Language)
↓
AI Tool (Claude Code / Aider / Gemini / Codex / Cursor / Trae / Kimi)
↓
├─ CLI mode (default): vmware-aiops CLI ──→ pyVmomi ──→ vSphere API
│
└─ MCP mode (optional): MCP Server (stdio) ──→ pyVmomi ──→ vSphere API
↓
vCenter Server ──→ ESXi Clusters ──→ VMs
or
ESXi Standalone ──→ VMs
Capabilities
1. Inventory
| Feature | vCenter | ESXi | Details |
|---|---|---|---|
| List VMs | ✅ | ✅ | Name, power state, CPU, memory, guest OS, IP |
| List Hosts | ✅ | ⚠️ Self only | CPU cores, memory, ESXi version, VM count, uptime |
| List Datastores | ✅ | ✅ | Capacity, free/used, type (VMFS/NFS), usage % |
| List Clusters | ✅ | ❌ | Host count, DRS/HA status |
| List Networks | ✅ | ✅ | Network name, associated VM count |
2. Health & Monitoring
| Feature | vCenter | ESXi | Details |
|---|---|---|---|
| Active Alarms | ✅ | ✅ | Severity, alarm name, entity, timestamp |
| Event/Log Query | ✅ | ✅ | Filter by time range, severity; 50+ event types |
| Hardware Sensors | ✅ | ✅ | Temperature, voltage, fan status |
| Host Services | ✅ | ✅ | hostd, vpxa running/stopped status |
Monitored Event Types:
| Category | Events |
|---|---|
| VM Failures | VmFailedToPowerOnEvent, VmDiskFailedEvent, VmFailoverFailed |
| Host Issues | HostConnectionLostEvent, HostShutdownEvent, HostIpChangedEvent |
| Storage | DatastoreCapacityIncreasedEvent, SCSI high latency |
| HA/DRS | DasHostFailedEvent, DrsVmMigratedEvent, DrsSoftRuleViolationEvent |
| Auth | UserLoginSessionEvent, BadUsernameSessionEvent |
3. VM Lifecycle
| Operation | Command | Confirmation | vCenter | ESXi |
|---|---|---|---|---|
| Power On | vm power-on <name> | — | ✅ | ✅ |
| Graceful Shutdown | vm power-off <name> | Double | ✅ | ✅ |
| Force Power Off | vm power-off <name> --force | Double | ✅ | ✅ |
| Reset | vm reset <name> | — | ✅ | ✅ |
| Suspend | vm suspend <name> | — | ✅ | ✅ |
| VM Info | vm info <name> | — | ✅ | ✅ |
| Create VM | vm create <name> --cpu --memory --disk | — | ✅ | ✅ |
| Delete VM | vm delete <name> | Double | ✅ | ✅ |
| Reconfigure | vm reconfigure <name> --cpu --memory | Double | ✅ | ✅ |
| Create Snapshot | vm snapshot-create <name> --name <snap> | — | ✅ | ✅ |
| List Snapshots | vm snapshot-list <name> | — | ✅ | ✅ |
| Revert Snapshot | vm snapshot-revert <name> --name <snap> | — | ✅ | ✅ |
| Delete Snapshot | vm snapshot-delete <name> --name <snap> | — | ✅ | ✅ |
| Clone VM | vm clone <name> --new-name <new> | — | ✅ | ✅ |
| vMotion | vm migrate <name> --to-host <host> | — | ✅ | ❌ |
| Set TTL | vm set-ttl <name> --minutes <n> | — | ✅ | ✅ |
| Cancel TTL | vm cancel-ttl <name> | — | ✅ | ✅ |
| List TTLs | vm list-ttl | — | ✅ | ✅ |
| Clean Slate | vm clean-slate <name> [--snapshot baseline] | Double | ✅ | ✅ |
| Guest Exec | vm guest-exec <name> --cmd /bin/bash --args "-c 'whoami'" | — | ✅ | ✅ |
| Guest Upload | vm guest-upload <name> --local f.sh --guest /tmp/f.sh | — | ✅ | ✅ |
| Guest Download | vm guest-download <name> --guest /var/log/syslog --local ./syslog | — | ✅ | ✅ |
Guest Operations require VMware Tools running inside the guest OS.
Plan → Apply (Multi-step Operations)
For complex operations involving 2+ steps or 2+ VMs, use the plan/apply workflow:
| Step | MCP Tool / CLI | Description |
|---|---|---|
| 1. Create Plan | vm_create_plan | Validates actions, checks targets in vSphere, generates plan with rollback info |
| 2. Review | — | AI shows plan to user: steps, affected VMs, irreversible warnings |
| 3. Apply | vm_apply_plan | Executes sequentially; stops on failure |
| 4. Rollback (if failed) | vm_rollback_plan | Asks user, then reverses executed steps (skips irreversible) |
Plans are stored in ~/.vmware-aiops/plans/, deleted on success, auto-cleaned after 24h.
4. VM Deployment & Provisioning
| Operation | Command | Speed | vCenter | ESXi |
|---|---|---|---|---|
| Deploy from OVA | deploy ova <path> --name <vm> | Minutes | ✅ | ✅ |
| Deploy from Template | deploy template <tmpl> --name <vm> | Minutes | ✅ | ✅ |
| Linked Clone | deploy linked-clone --source <vm> --snapshot <snap> --name <new> | Seconds | ✅ | ✅ |
| Attach ISO | deploy iso <vm> --iso "[ds] path/to.iso" | Instant | ✅ | ✅ |
| Convert to Template | deploy mark-template <vm> | Instant | ✅ | ✅ |
| Batch Clone | deploy batch-clone --source <vm> --count <n> | Minutes | ✅ | ✅ |
| Batch Deploy (YAML) | deploy batch spec.yaml | Auto | ✅ | ✅ |
5. Datastore Browser
| Feature | vCenter | ESXi | Details |
|---|---|---|---|
| Browse Files | ✅ | ✅ | List files/folders in any datastore path |
| Scan Images | ✅ | ✅ | Discover ISO, OVA, OVF, VMDK across all datastores |
| Local Cache | ✅ | ✅ | Registry at ~/.vmware-aiops/image_registry.json |
6. vSAN Management
| Feature | Details |
|---|---|
| Health Check | Cluster-wide health summary, per-group test results |
| Capacity | Total/free/used capacity with projections |
| Disk Groups | Cache SSD + capacity disks per host |
| Performance | IOPS, latency, throughput per cluster/host/VM |
Requires pyVmomi 8.0.3+ (vSAN SDK merged). For older versions, install the standalone vSAN Management SDK.
7. Aria Operations (VCF Operations)
| Feature | Details |
|---|---|
| Historical Metrics | Time-series CPU, memory, disk, network with months of history |
| Anomaly Detection | ML-based dynamic baselines and anomaly alerts |
| Capacity Planning | What-if analysis, time-to-exhaustion, forecasting |
| Right-sizing | CPU/memory recommendations per VM |
| Intelligent Alerts | Root cause analysis, remediation recommendations |
REST API at
/suite-api/. Auth:vRealizeOpsToken. Rebranded as VCF Operations in VCF 9.0.
8. vSphere Kubernetes Service (VKS)
| Feature | Details |
|---|---|
| List Clusters | Tanzu Kubernetes clusters with phase status |
| Cluster Health | InfrastructureReady, ControlPlaneAvailable, WorkersAvailable |
| Scale Workers | Adjust MachineDeployment replicas |
| Node Status | Machine status, ready/unhealthy counts |
Kubernetes-native API via kubectl/kubeconfig. VKS 3.6+ uses Cluster API specification.
9. Scheduled Scanning & Notifications
| Feature | Details |
|---|---|
| Daemon | APScheduler-based, configurable interval (default 15 min) |
| Multi-target Scan | Sequentially scan all configured vCenter/ESXi targets |
| Scan Content | Alarms + Events + Host logs (hostd, vmkernel, vpxd) |
| Log Analysis | Regex pattern matching: error, fail, critical, panic, timeout |
| Webhook | Slack, Discord, or any HTTP endpoint |
Safety Features
| Feature | Details |
|---|---|
| Plan → Confirm → Execute → Log | Structured workflow: show current state, confirm changes, execute, audit log |
| Double Confirmation | All destructive ops (power-off, delete, reconfigure, snapshot-revert/delete, clone, migrate) require 2 sequential confirmations — no bypass flags |
| Rejection Logging | Declined confirmations are recorded in the audit trail for security review |
| Audit Trail | All operations logged to ~/.vmware-aiops/audit.log (JSONL) with before/after state |
| Input Validation | VM name length/format, CPU (1-128), memory (128-1048576 MB), disk (1-65536 GB) validated before execution |
| Password Protection | .env file loading, never in command line or shell history; file permission check at startup |
| SSL Self-signed Support | disableSslCertValidation — only for ESXi hosts with self-signed certificates in isolated lab/home environments. Production environments should use CA-signed certificates with full TLS verification enabled. |
| Task Waiting | All async operations wait for completion and report result |
| State Validation | Pre-operation checks (VM exists, power state correct) |
Version Compatibility
| vSphere Version | Support | Notes |
|---|---|---|
| 8.0 / 8.0U1-U3 | ✅ Full | CreateSnapshot_Task deprecated → use CreateSnapshotEx_Task |
| 7.0 / 7.0U1-U3 | ✅ Full | All APIs supported |
| 6.7 | ✅ Compatible | Backward-compatible, tested |
| 6.5 | ✅ Compatible | Backward-compatible, tested |
pyVmomi auto-negotiates the API version during SOAP handshake — no manual configuration needed.
Supported AI Platforms
| Platform | Status | Config File |
|---|---|---|
| Claude Code | ✅ Native Skill | plugins/.../SKILL.md |
| Gemini CLI | ✅ Extension | gemini-extension/GEMINI.md |
| OpenAI Codex CLI | ✅ Skill + AGENTS.md | codex-skill/AGENTS.md |
| Aider | ✅ Conventions | codex-skill/AGENTS.md |
| Continue CLI | ✅ Rules | codex-skill/AGENTS.md |
| Trae IDE | ✅ Rules | trae-rules/project_rules.md |
| Kimi Code CLI | ✅ Skill | kimi-skill/SKILL.md |
| MCP Server | ✅ MCP Protocol | mcp_server/ |
| Python CLI | ✅ Standalone | N/A |
MCP Server — Local Agent Compatibility
The MCP server works with any MCP-compatible agent via stdio transport. Config templates in examples/mcp-configs/:
| Agent | Local Models | Config Template |
|---|---|---|
| Goose (Block) | ✅ Ollama, LM Studio | goose.json |
| LocalCowork (Liquid AI) | ✅ Fully offline | localcowork.json |
| mcp-agent (LastMile AI) | ✅ Ollama, vLLM | mcp-agent.yaml |
| VS Code Copilot | — | vscode-copilot.json |
| Cursor | — | cursor.json |
| Continue | ✅ Ollama | continue.yaml |
| Claude Code | — | claude-code.json |
# Example: Aider + Ollama (fully local, no cloud API)
aider --conventions codex-skill/AGENTS.md --model ollama/qwen2.5-coder:32b
CLI Reference
# Diagnostics
vmware-aiops doctor [--skip-auth]
# MCP Config Generator
vmware-aiops mcp-config generate --agent <goose|cursor|claude-code|continue|vscode-copilot|localcowork|mcp-agent>
vmware-aiops mcp-config list
# Inventory
vmware-aiops inventory vms [--target <name>] [--limit <n>] [--sort-by name|cpu|memory_mb|power_state] [--power-state poweredOn|poweredOff]
vmware-aiops inventory hosts [--target <name>]
vmware-aiops inventory datastores [--target <name>]
vmware-aiops inventory clusters [--target <name>]
# Health
vmware-aiops health alarms [--target <name>]
vmware-aiops health events [--hours 24] [--severity warning]
# VM Operations
vmware-aiops vm info <vm-name>
vmware-aiops vm power-on <vm-name>
vmware-aiops vm power-off <vm-name> [--force]
vmware-aiops vm create <name> [--cpu <n>] [--memory <mb>] [--disk <gb>]
vmware-aiops vm delete <vm-name> [--confirm]
vmware-aiops vm reconfigure <vm-name> [--cpu <n>] [--memory <mb>]
vmware-aiops vm snapshot-create <vm-name> --name <snap-name>
vmware-aiops vm snapshot-list <vm-name>
vmware-aiops vm snapshot-revert <vm-name> --name <snap-name>
vmware-aiops vm snapshot-delete <vm-name> --name <snap-name>
vmware-aiops vm clone <vm-name> --new-name <name>
vmware-aiops vm migrate <vm-name> --to-host <host>
vmware-aiops vm set-ttl <vm-name> --minutes <n>
vmware-aiops vm cancel-ttl <vm-name>
vmware-aiops vm list-ttl
vmware-aiops vm clean-slate <vm-name> [--snapshot baseline]
# Guest Operations (requires VMware Tools)
vmware-aiops vm guest-exec <vm-name> --cmd /bin/bash --args "-c 'ls -la /tmp'" --user root
vmware-aiops vm guest-upload <vm-name> --local ./script.sh --guest /tmp/script.sh --user root
vmware-aiops vm guest-download <vm-name> --guest /var/log/syslog --local ./syslog.txt --user root
# Plan → Apply (multi-step operations)
vmware-aiops plan list
# Deploy
vmware-aiops deploy ova <path> --name <vm-name> [--datastore <ds>] [--network <net>]
vmware-aiops deploy template <template-name> --name <vm-name> [--datastore <ds>]
vmware-aiops deploy linked-clone --source <vm> --snapshot <snap> --name <new-name>
vmware-aiops deploy iso <vm-name> --iso "[datastore] path/file.iso"
vmware-aiops deploy mark-template <vm-name>
vmware-aiops deploy batch-clone --source <vm> --count <n> [--prefix <prefix>]
vmware-aiops deploy batch <spec.yaml>
# Datastore
vmware-aiops datastore browse <ds-name> [--path <subdir>]
vmware-aiops datastore scan-images [--target <name>]
vmware-aiops datastore images [--type ova|iso|vmdk] [--ds <name>]
# vSAN
vmware-aiops vsan health [--target <name>]
vmware-aiops vsan capacity [--target <name>]
vmware-aiops vsan disks [--target <name>]
vmware-aiops vsan performance [--hours 1]
# Aria Operations
vmware-aiops ops alerts [--severity critical]
vmware-aiops ops metrics <resource-name> [--hours 24]
vmware-aiops ops recommendations [--target <name>]
vmware-aiops ops capacity <cluster-name>
# VKS (Kubernetes)
vmware-aiops vks clusters [--namespace default]
vmware-aiops vks health <cluster-name>
vmware-aiops vks scale <machine-deployment> --replicas <n>
vmware-aiops vks nodes <cluster-name>
# Scanning & Daemon
vmware-aiops scan now [--target <name>]
vmware-aiops daemon start
vmware-aiops daemon stop
vmware-aiops daemon status
Setup
# 1. Install from PyPI (source: github.com/zw008/VMware-AIops)
uv tool install vmware-aiops
# 2. Verify installation source
vmware-aiops --version # confirms installed version
# 3. Configure
mkdir -p ~/.vmware-aiops
vmware-aiops init # generates config.yaml and .env templates
chmod 600 ~/.vmware-aiops/.env
# Edit ~/.vmware-aiops/config.yaml and .env with your target details
What Gets Installed
The vmware-aiops package installs a Python CLI binary and its dependencies (pyVmomi, Click, Rich, APScheduler, python-dotenv). No background services, daemons, or system-level changes are made during installation. The scheduled scanner (daemon start) only runs when explicitly started by the user.
Development Install
git clone https://github.com/zw008/VMware-AIops.git
cd VMware-AIops
uv venv && source .venv/bin/activate
uv pip install -e .
Security
-
Source Code: Fully open source at github.com/zw008/VMware-AIops (MIT). The
uvinstaller fetches thevmware-aiopspackage from PyPI, which is built from this GitHub repository. We recommend reviewing the source code and commit history before deploying in production. -
TLS Verification: Enabled by default. The
disableSslCertValidationoption exists solely for ESXi hosts using self-signed certificates in isolated lab/home environments. In production, always use CA-signed certificates with full TLS verification. -
Credentials & Config: This skill requires the following secrets, all stored in
~/.vmware-aiops/.env(chmod 600, loaded viapython-dotenv):VSPHERE_USER— vCenter/ESXi service account usernameVSPHERE_PASSWORD— service account password- (Optional) Webhook URLs for Slack/Discord notifications
The config file
~/.vmware-aiops/config.yamlstores only target hostnames, ports, and a reference to the.envfile — it does not contain passwords or tokens. The env varVMWARE_AIOPS_CONFIGpoints to this YAML file. -
Webhook Data Scope: Webhook notifications are disabled by default. When enabled, they send infrastructure health summaries (alarm counts, event types, host status) to user-configured URLs only (Slack, Discord, or any HTTP endpoint you control). No data is sent to third-party services. Webhook payloads contain no credentials, IPs, or personally identifiable information — only aggregated alert metadata.
-
Prompt Injection Protection: All vSphere-sourced content (event messages, host logs) is truncated, stripped of control characters, and wrapped in boundary markers (
[VSPHERE_EVENT]/[VSPHERE_HOST_LOG]) before output to prevent prompt injection when consumed by LLM agents. -
Least Privilege: Use a dedicated vCenter service account with minimal permissions. For monitoring-only use cases, prefer the read-only VMware-Monitor skill which has zero destructive code paths.
License
MIT