VMware AIops

AI-powered VMware vCenter and ESXi operations tool. Manage your entire VMware infrastructure using natural language through any AI coding assistant.

Need read-only monitoring only? Use VMware-Monitor — an independent repository with code-level safety (zero destructive code in the codebase). Install: clawhub install vmware-monitor

When to Use This Skill

Query VM, host, datastore, cluster, and network inventory
Check health status, active alarms, hardware sensors, and event logs
Perform VM lifecycle operations: power on/off, create, delete, snapshot, clone, migrate
Deploy VMs from OVA, templates, linked clones, or batch specs
Browse datastores and discover ISO/OVA/VMDK images
Monitor vSAN health, capacity, disk groups, and performance
Access Aria Operations (VCF Operations) for historical metrics, anomaly detection, and capacity planning
Manage vSphere Kubernetes Service (VKS) clusters
Run scheduled scanning with webhook notifications (Slack, Discord)

Quick Install

All install methods fetch from the same source: github.com/zw008/VMware-AIops (MIT licensed). We recommend reviewing the source code before installing.

# Via Skills.sh (fetches from GitHub)
npx skills add zw008/VMware-AIops

# Via ClawHub (fetches from ClawHub registry snapshot of GitHub)
clawhub install vmware-aiops

# Via PyPI (recommended for version pinning)
uv tool install vmware-aiops==0.6.0

Claude Code

/plugin marketplace add zw008/VMware-AIops
/plugin install vmware-ops
/vmware-ops:vmware-aiops

Usage Mode

Choose the best mode based on your environment:

Scenario	Recommended Mode	Why
Cloud models (Claude, GPT-4o, Gemini)	MCP or CLI	Both work well; MCP gives structured JSON I/O
Local/small models (Ollama, Llama, Qwen <32B)	CLI	Lower token cost (~2K vs ~10K), higher accuracy — small models struggle with 31 MCP tool schemas
Token-sensitive workflows	CLI	CLI via SKILL.md uses ~2K tokens; MCP loads ~10K tokens of tool definitions into every conversation
Automated pipelines / Agent chaining	MCP	Structured JSON input/output, type-safe parameters, no shell parsing

Calling Priority

MCP-native tools (Claude Code, Cursor): MCP first, CLI fallback
Local models / Token-sensitive: CLI first (MCP not needed)
All other tools: CLI first

Tip: For token-sensitive scenarios, use CLI mode — the AI reads SKILL.md (~2K tokens) and calls commands via Bash. MCP mode loads all 31 tool schemas (~10K tokens) into context on every conversation turn.

CLI Examples

# Activate venv first
source /path/to/VMware-AIops/.venv/bin/activate

# Inventory
vmware-aiops inventory vms --target home-esxi
vmware-aiops inventory hosts --target home-esxi

# Health
vmware-aiops health alarms --target home-esxi

# VM operations
vmware-aiops vm info my-vm --target home-esxi
vmware-aiops vm power-on my-vm --target home-esxi

MCP Mode (Optional)

For Claude Code / Cursor users who prefer structured tool calls, add to ~/.claude/settings.json:

{
  "mcpServers": {
    "vmware-aiops": {
      "command": "/path/to/VMware-AIops/.venv/bin/python",
      "args": ["-m", "mcp_server"],
      "cwd": "/path/to/VMware-AIops",
      "env": {
        "VMWARE_AIOPS_CONFIG": "~/.vmware-aiops/config.yaml"
      }
    }
  }
}

MCP exposes 31 tools across 6 categories. All accept optional target parameter.

Category	Tools
Inventory	`list_virtual_machines`, `list_esxi_hosts`, `list_all_datastores`, `list_all_clusters`
Health	`get_alarms`, `get_events`, `vm_info`
VM Lifecycle	`vm_power_on`, `vm_power_off`, `vm_set_ttl`, `vm_cancel_ttl`, `vm_list_ttl`, `vm_clean_slate`
Deployment	`deploy_vm_from_ova`, `deploy_vm_from_template`, `deploy_linked_clone`, `attach_iso_to_vm`, `convert_vm_to_template`, `batch_clone_vms`, `batch_linked_clone_vms`, `batch_deploy_from_spec`
Guest Operations	`vm_guest_exec`, `vm_guest_upload`, `vm_guest_download`
Plan → Apply	`vm_create_plan`, `vm_apply_plan`, `vm_rollback_plan`, `vm_list_plans`
Datastore	`browse_datastore`, `scan_datastore_images`, `list_cached_images`

list_virtual_machines auto-compacts when inventory exceeds 50 VMs (returns compact fields only). Use limit or fields to override.

Architecture

User (Natural Language)
  ↓
AI Tool (Claude Code / Aider / Gemini / Codex / Cursor / Trae / Kimi)
  ↓
  ├─ CLI mode (default): vmware-aiops CLI ──→ pyVmomi ──→ vSphere API
  │
  └─ MCP mode (optional): MCP Server (stdio) ──→ pyVmomi ──→ vSphere API
  ↓
vCenter Server ──→ ESXi Clusters ──→ VMs
    or
ESXi Standalone ──→ VMs

Capabilities

1. Inventory

Feature	vCenter	ESXi	Details
List VMs	✅	✅	Name, power state, CPU, memory, guest OS, IP
List Hosts	✅	⚠️ Self only	CPU cores, memory, ESXi version, VM count, uptime
List Datastores	✅	✅	Capacity, free/used, type (VMFS/NFS), usage %
List Clusters	✅	❌	Host count, DRS/HA status
List Networks	✅	✅	Network name, associated VM count

2. Health & Monitoring

Feature	vCenter	ESXi	Details
Active Alarms	✅	✅	Severity, alarm name, entity, timestamp
Event/Log Query	✅	✅	Filter by time range, severity; 50+ event types
Hardware Sensors	✅	✅	Temperature, voltage, fan status
Host Services	✅	✅	hostd, vpxa running/stopped status

Monitored Event Types:

Category	Events
VM Failures	`VmFailedToPowerOnEvent`, `VmDiskFailedEvent`, `VmFailoverFailed`
Host Issues	`HostConnectionLostEvent`, `HostShutdownEvent`, `HostIpChangedEvent`
Storage	`DatastoreCapacityIncreasedEvent`, SCSI high latency
HA/DRS	`DasHostFailedEvent`, `DrsVmMigratedEvent`, `DrsSoftRuleViolationEvent`
Auth	`UserLoginSessionEvent`, `BadUsernameSessionEvent`

3. VM Lifecycle

Operation	Command	Confirmation	vCenter	ESXi
Power On	`vm power-on <name>`	—	✅	✅
Graceful Shutdown	`vm power-off <name>`	Double	✅	✅
Force Power Off	`vm power-off <name> --force`	Double	✅	✅
Reset	`vm reset <name>`	—	✅	✅
Suspend	`vm suspend <name>`	—	✅	✅
VM Info	`vm info <name>`	—	✅	✅
Create VM	`vm create <name> --cpu --memory --disk`	—	✅	✅
Delete VM	`vm delete <name>`	Double	✅	✅
Reconfigure	`vm reconfigure <name> --cpu --memory`	Double	✅	✅
Create Snapshot	`vm snapshot-create <name> --name <snap>`	—	✅	✅
List Snapshots	`vm snapshot-list <name>`	—	✅	✅
Revert Snapshot	`vm snapshot-revert <name> --name <snap>`	—	✅	✅
Delete Snapshot	`vm snapshot-delete <name> --name <snap>`	—	✅	✅
Clone VM	`vm clone <name> --new-name <new>`	—	✅	✅
vMotion	`vm migrate <name> --to-host <host>`	—	✅	❌
Set TTL	`vm set-ttl <name> --minutes <n>`	—	✅	✅
Cancel TTL	`vm cancel-ttl <name>`	—	✅	✅
List TTLs	`vm list-ttl`	—	✅	✅
Clean Slate	`vm clean-slate <name> [--snapshot baseline]`	Double	✅	✅
Guest Exec	`vm guest-exec <name> --cmd /bin/bash --args "-c 'whoami'"`	—	✅	✅
Guest Upload	`vm guest-upload <name> --local f.sh --guest /tmp/f.sh`	—	✅	✅
Guest Download	`vm guest-download <name> --guest /var/log/syslog --local ./syslog`	—	✅	✅

Guest Operations require VMware Tools running inside the guest OS.

Plan → Apply (Multi-step Operations)

For complex operations involving 2+ steps or 2+ VMs, use the plan/apply workflow:

Step	MCP Tool / CLI	Description
1. Create Plan	`vm_create_plan`	Validates actions, checks targets in vSphere, generates plan with rollback info
2. Review	—	AI shows plan to user: steps, affected VMs, irreversible warnings
3. Apply	`vm_apply_plan`	Executes sequentially; stops on failure
4. Rollback (if failed)	`vm_rollback_plan`	Asks user, then reverses executed steps (skips irreversible)

Plans are stored in ~/.vmware-aiops/plans/, deleted on success, auto-cleaned after 24h.

4. VM Deployment & Provisioning

Operation	Command	Speed	vCenter	ESXi
Deploy from OVA	`deploy ova <path> --name <vm>`	Minutes	✅	✅
Deploy from Template	`deploy template <tmpl> --name <vm>`	Minutes	✅	✅
Linked Clone	`deploy linked-clone --source <vm> --snapshot <snap> --name <new>`	Seconds	✅	✅
Attach ISO	`deploy iso <vm> --iso "[ds] path/to.iso"`	Instant	✅	✅
Convert to Template	`deploy mark-template <vm>`	Instant	✅	✅
Batch Clone	`deploy batch-clone --source <vm> --count <n>`	Minutes	✅	✅
Batch Deploy (YAML)	`deploy batch spec.yaml`	Auto	✅	✅

5. Datastore Browser

Feature	vCenter	ESXi	Details
Browse Files	✅	✅	List files/folders in any datastore path
Scan Images	✅	✅	Discover ISO, OVA, OVF, VMDK across all datastores
Local Cache	✅	✅	Registry at `~/.vmware-aiops/image_registry.json`

6. vSAN Management

Feature	Details
Health Check	Cluster-wide health summary, per-group test results
Capacity	Total/free/used capacity with projections
Disk Groups	Cache SSD + capacity disks per host
Performance	IOPS, latency, throughput per cluster/host/VM

Requires pyVmomi 8.0.3+ (vSAN SDK merged). For older versions, install the standalone vSAN Management SDK.

7. Aria Operations (VCF Operations)

Feature	Details
Historical Metrics	Time-series CPU, memory, disk, network with months of history
Anomaly Detection	ML-based dynamic baselines and anomaly alerts
Capacity Planning	What-if analysis, time-to-exhaustion, forecasting
Right-sizing	CPU/memory recommendations per VM
Intelligent Alerts	Root cause analysis, remediation recommendations

REST API at /suite-api/. Auth: vRealizeOpsToken. Rebranded as VCF Operations in VCF 9.0.

8. vSphere Kubernetes Service (VKS)

Feature	Details
List Clusters	Tanzu Kubernetes clusters with phase status
Cluster Health	InfrastructureReady, ControlPlaneAvailable, WorkersAvailable
Scale Workers	Adjust MachineDeployment replicas
Node Status	Machine status, ready/unhealthy counts

Kubernetes-native API via kubectl/kubeconfig. VKS 3.6+ uses Cluster API specification.

9. Scheduled Scanning & Notifications

Feature	Details
Daemon	APScheduler-based, configurable interval (default 15 min)
Multi-target Scan	Sequentially scan all configured vCenter/ESXi targets
Scan Content	Alarms + Events + Host logs (hostd, vmkernel, vpxd)
Log Analysis	Regex pattern matching: error, fail, critical, panic, timeout
Webhook	Slack, Discord, or any HTTP endpoint

Safety Features

Feature	Details
Plan → Confirm → Execute → Log	Structured workflow: show current state, confirm changes, execute, audit log
Double Confirmation	All destructive ops (power-off, delete, reconfigure, snapshot-revert/delete, clone, migrate) require 2 sequential confirmations — no bypass flags
Rejection Logging	Declined confirmations are recorded in the audit trail for security review
Audit Trail	All operations logged to `~/.vmware-aiops/audit.log` (JSONL) with before/after state
Input Validation	VM name length/format, CPU (1-128), memory (128-1048576 MB), disk (1-65536 GB) validated before execution
Password Protection	`.env` file loading, never in command line or shell history; file permission check at startup
SSL Self-signed Support	`disableSslCertValidation` — only for ESXi hosts with self-signed certificates in isolated lab/home environments. Production environments should use CA-signed certificates with full TLS verification enabled.
Task Waiting	All async operations wait for completion and report result
State Validation	Pre-operation checks (VM exists, power state correct)

Version Compatibility

vSphere Version	Support	Notes
8.0 / 8.0U1-U3	✅ Full	`CreateSnapshot_Task` deprecated → use `CreateSnapshotEx_Task`
7.0 / 7.0U1-U3	✅ Full	All APIs supported
6.7	✅ Compatible	Backward-compatible, tested
6.5	✅ Compatible	Backward-compatible, tested

pyVmomi auto-negotiates the API version during SOAP handshake — no manual configuration needed.

Supported AI Platforms

Platform	Status	Config File
Claude Code	✅ Native Skill	`plugins/.../SKILL.md`
Gemini CLI	✅ Extension	`gemini-extension/GEMINI.md`
OpenAI Codex CLI	✅ Skill + AGENTS.md	`codex-skill/AGENTS.md`
Aider	✅ Conventions	`codex-skill/AGENTS.md`
Continue CLI	✅ Rules	`codex-skill/AGENTS.md`
Trae IDE	✅ Rules	`trae-rules/project_rules.md`
Kimi Code CLI	✅ Skill	`kimi-skill/SKILL.md`
MCP Server	✅ MCP Protocol	`mcp_server/`
Python CLI	✅ Standalone	N/A

MCP Server — Local Agent Compatibility

The MCP server works with any MCP-compatible agent via stdio transport. Config templates in examples/mcp-configs/:

Agent	Local Models	Config Template
Goose (Block)	✅ Ollama, LM Studio	`goose.json`
LocalCowork (Liquid AI)	✅ Fully offline	`localcowork.json`
mcp-agent (LastMile AI)	✅ Ollama, vLLM	`mcp-agent.yaml`
VS Code Copilot	—	`vscode-copilot.json`
Cursor	—	`cursor.json`
Continue	✅ Ollama	`continue.yaml`
Claude Code	—	`claude-code.json`

# Example: Aider + Ollama (fully local, no cloud API)
aider --conventions codex-skill/AGENTS.md --model ollama/qwen2.5-coder:32b

CLI Reference

# Diagnostics
vmware-aiops doctor [--skip-auth]

# MCP Config Generator
vmware-aiops mcp-config generate --agent <goose|cursor|claude-code|continue|vscode-copilot|localcowork|mcp-agent>
vmware-aiops mcp-config list

# Inventory
vmware-aiops inventory vms [--target <name>] [--limit <n>] [--sort-by name|cpu|memory_mb|power_state] [--power-state poweredOn|poweredOff]
vmware-aiops inventory hosts [--target <name>]
vmware-aiops inventory datastores [--target <name>]
vmware-aiops inventory clusters [--target <name>]

# Health
vmware-aiops health alarms [--target <name>]
vmware-aiops health events [--hours 24] [--severity warning]

# VM Operations
vmware-aiops vm info <vm-name>
vmware-aiops vm power-on <vm-name>
vmware-aiops vm power-off <vm-name> [--force]
vmware-aiops vm create <name> [--cpu <n>] [--memory <mb>] [--disk <gb>]
vmware-aiops vm delete <vm-name> [--confirm]
vmware-aiops vm reconfigure <vm-name> [--cpu <n>] [--memory <mb>]
vmware-aiops vm snapshot-create <vm-name> --name <snap-name>
vmware-aiops vm snapshot-list <vm-name>
vmware-aiops vm snapshot-revert <vm-name> --name <snap-name>
vmware-aiops vm snapshot-delete <vm-name> --name <snap-name>
vmware-aiops vm clone <vm-name> --new-name <name>
vmware-aiops vm migrate <vm-name> --to-host <host>
vmware-aiops vm set-ttl <vm-name> --minutes <n>
vmware-aiops vm cancel-ttl <vm-name>
vmware-aiops vm list-ttl
vmware-aiops vm clean-slate <vm-name> [--snapshot baseline]

# Guest Operations (requires VMware Tools)
vmware-aiops vm guest-exec <vm-name> --cmd /bin/bash --args "-c 'ls -la /tmp'" --user root
vmware-aiops vm guest-upload <vm-name> --local ./script.sh --guest /tmp/script.sh --user root
vmware-aiops vm guest-download <vm-name> --guest /var/log/syslog --local ./syslog.txt --user root

# Plan → Apply (multi-step operations)
vmware-aiops plan list

# Deploy
vmware-aiops deploy ova <path> --name <vm-name> [--datastore <ds>] [--network <net>]
vmware-aiops deploy template <template-name> --name <vm-name> [--datastore <ds>]
vmware-aiops deploy linked-clone --source <vm> --snapshot <snap> --name <new-name>
vmware-aiops deploy iso <vm-name> --iso "[datastore] path/file.iso"
vmware-aiops deploy mark-template <vm-name>
vmware-aiops deploy batch-clone --source <vm> --count <n> [--prefix <prefix>]
vmware-aiops deploy batch <spec.yaml>

# Datastore
vmware-aiops datastore browse <ds-name> [--path <subdir>]
vmware-aiops datastore scan-images [--target <name>]
vmware-aiops datastore images [--type ova|iso|vmdk] [--ds <name>]

# vSAN
vmware-aiops vsan health [--target <name>]
vmware-aiops vsan capacity [--target <name>]
vmware-aiops vsan disks [--target <name>]
vmware-aiops vsan performance [--hours 1]

# Aria Operations
vmware-aiops ops alerts [--severity critical]
vmware-aiops ops metrics <resource-name> [--hours 24]
vmware-aiops ops recommendations [--target <name>]
vmware-aiops ops capacity <cluster-name>

# VKS (Kubernetes)
vmware-aiops vks clusters [--namespace default]
vmware-aiops vks health <cluster-name>
vmware-aiops vks scale <machine-deployment> --replicas <n>
vmware-aiops vks nodes <cluster-name>

# Scanning & Daemon
vmware-aiops scan now [--target <name>]
vmware-aiops daemon start
vmware-aiops daemon stop
vmware-aiops daemon status

Setup

# 1. Install from PyPI (source: github.com/zw008/VMware-AIops)
uv tool install vmware-aiops

# 2. Verify installation source
vmware-aiops --version  # confirms installed version

# 3. Configure
mkdir -p ~/.vmware-aiops
vmware-aiops init  # generates config.yaml and .env templates
chmod 600 ~/.vmware-aiops/.env
# Edit ~/.vmware-aiops/config.yaml and .env with your target details

What Gets Installed

The vmware-aiops package installs a Python CLI binary and its dependencies (pyVmomi, Click, Rich, APScheduler, python-dotenv). No background services, daemons, or system-level changes are made during installation. The scheduled scanner (daemon start) only runs when explicitly started by the user.

Development Install

git clone https://github.com/zw008/VMware-AIops.git
cd VMware-AIops
uv venv && source .venv/bin/activate
uv pip install -e .

Security

Source Code: Fully open source at github.com/zw008/VMware-AIops (MIT). The uv installer fetches the vmware-aiops package from PyPI, which is built from this GitHub repository. We recommend reviewing the source code and commit history before deploying in production.
TLS Verification: Enabled by default. The disableSslCertValidation option exists solely for ESXi hosts using self-signed certificates in isolated lab/home environments. In production, always use CA-signed certificates with full TLS verification.
Credentials & Config: This skill requires the following secrets, all stored in ~/.vmware-aiops/.env (chmod 600, loaded via python-dotenv):
- VSPHERE_USER — vCenter/ESXi service account username
- VSPHERE_PASSWORD — service account password
- (Optional) Webhook URLs for Slack/Discord notifications
The config file ~/.vmware-aiops/config.yaml stores only target hostnames, ports, and a reference to the .env file — it does not contain passwords or tokens. The env var VMWARE_AIOPS_CONFIG points to this YAML file.
Webhook Data Scope: Webhook notifications are disabled by default. When enabled, they send infrastructure health summaries (alarm counts, event types, host status) to user-configured URLs only (Slack, Discord, or any HTTP endpoint you control). No data is sent to third-party services. Webhook payloads contain no credentials, IPs, or personally identifiable information — only aggregated alert metadata.
Prompt Injection Protection: All vSphere-sourced content (event messages, host logs) is truncated, stripped of control characters, and wrapped in boundary markers ([VSPHERE_EVENT]/[VSPHERE_HOST_LOG]) before output to prevent prompt injection when consumed by LLM agents.
Least Privilege: Use a dedicated vCenter service account with minimal permissions. For monitoring-only use cases, prefer the read-only VMware-Monitor skill which has zero destructive code paths.

License

MIT

vmware-aiops

Safety Notice

Copy this and send it to your AI assistant to learn