VMware AIops
Disclaimer: This is a community-maintained open-source project and is not affiliated with, endorsed by, or sponsored by VMware, Inc. or Broadcom Inc. "VMware" and "vSphere" are trademarks of Broadcom. Source code is publicly auditable at github.com/zw008/VMware-AIops under the MIT license.
VMware family entry point — AI-powered VM lifecycle, deployment, and alarm management — 34 MCP tools.
Start here: install vmware-aiops first, then add modules as needed. Run vmware-aiops hub status to see which family members are installed.
Family: vmware-monitor (inventory/health), vmware-storage (iSCSI/vSAN), vmware-vks (Tanzu Kubernetes), vmware-nsx (NSX networking), vmware-nsx-security (DFW/firewall), vmware-aria (metrics/alerts/capacity), vmware-avi (AVI/ALB/AKO), vmware-pilot (workflow orchestration), vmware-policy (audit/policy).
What This Skill Does
| Category | Tools | Count |
|---|---|---|
| VM Lifecycle | power on/off, TTL auto-delete, clean slate | 6 |
| Deployment | OVA, template, linked clone, batch clone/deploy | 8 |
| Guest Ops | exec commands, upload/download files, provision | 5 |
| Plan/Apply | multi-step planning with rollback | 4 |
| Cluster | create, delete, HA/DRS config, add/remove hosts | 6 |
| Datastore | browse files, scan for images | 2 |
| Alarm Management | list alarms, acknowledge, reset | 3 |
Quick Install
uv tool install vmware-aiops
vmware-aiops doctor
vmware-aiops hub status # see which family members are installed
VMware Family — Install What You Need
vmware-aiops is the entry point. Add modules for additional capabilities:
| Module | Install | Adds |
|---|---|---|
| vmware-monitor | uv tool install vmware-monitor | Read-only inventory, alarms, events |
| vmware-storage | uv tool install vmware-storage | iSCSI, vSAN, datastore management |
| vmware-vks | uv tool install vmware-vks | Tanzu Kubernetes (vSphere 8.x+) |
| vmware-nsx | uv tool install vmware-nsx-mgmt | NSX networking: segments, gateways, NAT |
| vmware-nsx-security | uv tool install vmware-nsx-security | DFW microsegmentation, security groups |
| vmware-aria | uv tool install vmware-aria | Aria Ops metrics, alerts, capacity |
| vmware-avi | uv tool install vmware-avi | AVI load balancer, ALB, AKO, Ingress |
Each module stays independent — small tool count keeps local models (Ollama, Qwen) accurate.
When to Use This Skill
- Power on/off, create, delete, snapshot, clone, or migrate VMs
- Deploy VMs from OVA, templates, linked clones, or batch specs
- Run commands or transfer files inside a VM (Guest Operations)
- Create/configure clusters (HA/DRS)
- Browse datastores for deployable images
- Plan and execute multi-step operations with rollback
- List, acknowledge, and reset vCenter-triggered alarms
Use companion skills for:
- Inventory, health, alarms, VM info → vmware-monitor
- iSCSI, vSAN, datastore management → vmware-storage
- Tanzu Kubernetes (Supervisor, Namespace, TKC) → vmware-vks
- Load balancing, AVI/ALB, AKO, Ingress → vmware-avi
Related Skills — Skill Routing
| User Intent | Recommended Skill |
|---|---|
| Read-only monitoring, zero risk | vmware-monitor (uv tool install vmware-monitor) |
| Storage: iSCSI, vSAN, datastores | vmware-storage (uv tool install vmware-storage) |
| VM lifecycle, deployment, guest ops | vmware-aiops ← this skill |
| Tanzu Kubernetes (vSphere 8.x+) | vmware-vks (uv tool install vmware-vks) |
| NSX networking: segments, gateways, NAT | vmware-nsx (uv tool install vmware-nsx-mgmt) |
| NSX security: DFW rules, security groups | vmware-nsx-security (uv tool install vmware-nsx-security) |
| Aria Ops: metrics, alerts, capacity | vmware-aria (uv tool install vmware-aria) |
| Multi-step workflows with approval | vmware-pilot |
| Load balancer, AVI, ALB, AKO, Ingress | vmware-avi (uv tool install vmware-avi) |
| Audit log query | vmware-policy (vmware-audit CLI) |
Common Workflows
Diagnostic investigations: Before remediating any "why is X slow / failing / down" issue, follow references/investigation-protocol.md. It enforces the four root-cause completeness criteria (falsifiability / sufficiency / necessity / mechanism) and the up-to-three-rounds deepening loop. Only invoke L3+ write tools after the four criteria are satisfied AND the user has approved a remediation plan.
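The gate described above can be sketched as a single boolean check. This is an illustration only: the class and function names are hypothetical, and the actual criteria definitions live in references/investigation-protocol.md.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class RootCauseCriteria:
    """The four completeness criteria named in the text (field names are illustrative)."""
    falsifiability: bool
    sufficiency: bool
    necessity: bool
    mechanism: bool


def may_invoke_write_tools(criteria: RootCauseCriteria, user_approved_plan: bool) -> bool:
    """True only when all four criteria hold AND the user approved the remediation plan."""
    return all(astuple(criteria)) and user_approved_plan
```

A single unmet criterion or a missing approval keeps the session read-only.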
Deploy a Lab Environment
Pre-flight (judgment, not blind sequence):
- Free space: the target datastore must have free space ≥ OVA size × 2 (delta files + thin-provision overhead). If multiple datastores qualify, prefer the one with the lowest current IOPS pressure (cross-check vmware-aria if available).
- Name hygiene: prefix with date or owner (lab-2026-04-30-alice) so the TTL cleanup audit trail is meaningful.
- TTL: always set one. Use 480 min (8 hours) for a single test session, 7200 min (5 days) for a week-long sandbox. Never deploy a "lab" VM without a TTL; that is how datastores fill up at 3 AM.
- Snapshot timing: take the baseline after provisioning succeeds, not before; a pre-provision snapshot is just an empty checkpoint.
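A minimal sketch of the free-space and naming rules above. The Datastore class, its iops_pressure field, and both function names are assumptions made for illustration; they are not part of vmware-aiops.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Datastore:
    name: str
    free_gb: float
    iops_pressure: float  # hypothetical metric (e.g. from vmware-aria); lower is better


def pick_datastore(ova_size_gb: float, candidates: List[Datastore]) -> Optional[Datastore]:
    """Return the qualifying datastore with the lowest IOPS pressure, or None."""
    required_gb = ova_size_gb * 2  # rule from the text: free space >= OVA size x 2
    qualifying = [d for d in candidates if d.free_gb >= required_gb]
    return min(qualifying, key=lambda d: d.iops_pressure, default=None)


def lab_vm_name(day: str, owner: str) -> str:
    """Build a name in the lab-<date>-<owner> style used in the text."""
    return f"lab-{day}-{owner}"
```

For a 25 GB OVA, only datastores with at least 50 GB free qualify; a None result means no datastore passes pre-flight and the deploy should not start.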
Steps:
- vmware-aiops datastore browse <ds> --pattern "*.ova" → confirm the image is present and check its size
- vmware-aiops deploy ova <path> --name <date>-<owner>-<purpose> --datastore <ds>
- vmware-aiops vm guest-exec <name> --cmd /usr/bin/python3 --args "setup.py" --user admin → if exit ≠ 0, stop; do not snapshot a half-provisioned VM
- vmware-aiops vm snapshot-create <name> --name baseline (only if multi-iteration testing; skip for one-shot)
- vmware-aiops vm set-ttl <name> --minutes 480
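The step sequence can be sketched as a small driver that stops at the first non-zero exit code. The commands and flags are the ones listed above; deploy_lab, shell_runner, and the injectable run callable are hypothetical helpers, not part of the CLI.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], int]  # takes argv, returns the exit code


def shell_runner(argv: Sequence[str]) -> int:
    """Run a command for real; requires vmware-aiops on PATH."""
    return subprocess.run(list(argv)).returncode


def deploy_lab(run: Runner, ds: str, ova_path: str, name: str,
               snapshot: bool = False, ttl_minutes: int = 480) -> List[str]:
    """Run the lab steps in order; raise on the first failing step."""
    cli = "vmware-aiops"
    steps = [
        ("browse", [cli, "datastore", "browse", ds, "--pattern", "*.ova"]),
        ("deploy", [cli, "deploy", "ova", ova_path, "--name", name, "--datastore", ds]),
        ("provision", [cli, "vm", "guest-exec", name, "--cmd", "/usr/bin/python3",
                       "--args", "setup.py", "--user", "admin"]),
    ]
    if snapshot:  # only for multi-iteration testing
        steps.append(("snapshot", [cli, "vm", "snapshot-create", name, "--name", "baseline"]))
    steps.append(("ttl", [cli, "vm", "set-ttl", name, "--minutes", str(ttl_minutes)]))

    completed: List[str] = []
    for label, argv in steps:
        if run(argv) != 0:
            raise RuntimeError(f"step '{label}' failed; later steps were not run")
        completed.append(label)
    return completed
```

Call it as deploy_lab(shell_runner, ...) against a real environment. Note that in this sketch a failed provisioning step also skips set-ttl, so the caller still has to set a TTL on, or delete, the half-provisioned VM.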
Batch Clone for Testing
Pre-flight:
- Source VM state: powered-off is safest. If powered-on, VMware Tools must be running and quiesce-capable; otherwise clones may have inconsistent disk state.
- Capacity math: free_space ≥ source.size × count × 1.2 (full clone) or free_space ≥ count × 2 GB (linked clone, delta-only).
- Decision rule: count > 10 → use linked clones (deploy linked-clone); seconds vs minutes per clone, ~100× less storage. Tradeoff: linked clones depend on the source snapshot, so deleting the snapshot breaks all children.
- Network exhaustion: each clone gets a unique MAC from the vSphere pool; if you batch > 200, verify pool capacity in advance.
- TTL: every clone must have one. Use the plan's metadata to track ownership.
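The capacity math and decision rule above, worked as a short sketch. The constants come from the text; the function names are illustrative.

```python
LINKED_CLONE_THRESHOLD = 10   # count > 10 -> linked clones
FULL_CLONE_OVERHEAD = 1.2     # 20% headroom on full clones
LINKED_CLONE_DELTA_GB = 2     # per-clone delta budget for linked clones


def choose_clone_mode(count: int) -> str:
    """Apply the decision rule: more than 10 clones means linked clones."""
    return "linked" if count > LINKED_CLONE_THRESHOLD else "full"


def required_space_gb(source_size_gb: float, count: int, mode: str) -> float:
    """Free space needed on the target datastore for the whole batch."""
    if mode == "linked":
        return float(count * LINKED_CLONE_DELTA_GB)
    return source_size_gb * count * FULL_CLONE_OVERHEAD


def batch_fits(source_size_gb: float, count: int, free_gb: float) -> bool:
    """True if the batch passes the capacity pre-flight in its chosen mode."""
    mode = choose_clone_mode(count)
    return free_gb >= required_space_gb(source_size_gb, count, mode)
```

Five full clones of a 50 GB source need about 300 GB, while twenty linked clones of the same source need about 40 GB.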
Steps:
- vm_create_plan with clone + reconfigure + set-ttl steps grouped per VM (atomic per clone)
- Review the plan with the user: surface count, datastore, irreversible warnings
- vm_apply_plan: stops on first failure (intentional, do not auto-resume)
- On failure: vm_rollback_plan → reverses completed clones; manually verify rollback before retrying
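To make the apply/rollback semantics concrete, here is a toy model of the documented behavior: apply stops on the first failure, rollback undoes completed steps, and steps without an undo action are skipped. It is not the implementation of vm_apply_plan or vm_rollback_plan; the Step class and the newest-first undo order are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Step:
    name: str
    do: Callable[[], None]
    undo: Optional[Callable[[], None]]  # None marks an irreversible step


def apply_plan(steps: List[Step]) -> Tuple[List[Step], Optional[str]]:
    """Run steps in order; stop at the first failure and report its name."""
    completed: List[Step] = []
    for step in steps:
        try:
            step.do()
        except Exception:
            return completed, step.name  # no auto-resume
        completed.append(step)
    return completed, None


def rollback_plan(completed: List[Step]) -> List[str]:
    """Undo completed steps newest-first, skipping irreversible ones."""
    undone: List[str] = []
    for step in reversed(completed):
        if step.undo is None:
            continue
        step.undo()
        undone.append(step.name)
    return undone
```

The names returned by rollback_plan give the list to verify manually before retrying.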
Migrate VM to Another Host
Pre-flight (ALL must pass before issuing migrate):
- CPU compatibility: target host CPU family must match source, OR cluster must be in EVC mode. Live migration across mismatched CPUs fails mid-flight and may leave the VM stunned.
- Network parity: every portgroup the VM uses must exist on the target host's vSwitch with the same VLAN. Missing portgroup → vNICs disconnected post-migration.
- Storage visibility: target host must see all of the VM's datastores; otherwise this is a Storage vMotion, not a host migration — different (slower) operation.
- Affinity rules: if the VM is pinned to source by a DRS host-affinity rule, migration silently violates intent. Check cluster info first.
- Hardware passthrough: VMs with PCI passthrough (GPU, USB) cannot live-migrate; schedule a cold migration window.
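The five pre-flight checks can be expressed as one function that returns every blocker instead of failing on the first. The VMFacts and HostFacts classes are hypothetical containers; in practice their values would be gathered from vmware-monitor vm info and cluster info.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class VMFacts:
    source_cpu_family: str
    portgroups: Dict[str, int]  # portgroup name -> VLAN ID
    datastores: Set[str]
    pinned_by_drs_rule: bool
    has_pci_passthrough: bool


@dataclass
class HostFacts:
    cpu_family: str
    portgroups: Dict[str, int]
    datastores: Set[str]


def migration_blockers(vm: VMFacts, target: HostFacts, evc_enabled: bool) -> List[str]:
    """Return a list of reasons not to migrate; an empty list means all checks pass."""
    blockers: List[str] = []
    if vm.source_cpu_family != target.cpu_family and not evc_enabled:
        blockers.append("cpu: family mismatch and cluster is not in EVC mode")
    for name, vlan in sorted(vm.portgroups.items()):
        if target.portgroups.get(name) != vlan:
            blockers.append(f"network: portgroup {name} (VLAN {vlan}) not on target")
    missing = sorted(vm.datastores - target.datastores)
    if missing:
        blockers.append(f"storage: target cannot see {missing}; needs Storage vMotion")
    if vm.pinned_by_drs_rule:
        blockers.append("affinity: VM is pinned to source by a DRS host-affinity rule")
    if vm.has_pci_passthrough:
        blockers.append("passthrough: PCI passthrough requires a cold migration")
    return blockers
```

Collecting all blockers at once lets the user see the full picture before deciding between fixing the target host and scheduling a cold migration.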
Steps:
- Verify VM state and current host via vmware-monitor vm info <name>
- Verify target host: same cluster, EVC compatible, has required networks/datastores
- vmware-aiops vm migrate <name> --to-host <target>: wait for task completion; do not assume success on return
- Post-check: vm info confirms new host AND power state unchanged AND vNICs connected
Usage Mode
| Scenario | Recommended | Why |
|---|---|---|
| Local/small models (Ollama, Qwen) | CLI | ~2K tokens vs ~8K for MCP |
| Cloud models (Claude, GPT-4o) | Either | MCP gives structured JSON I/O |
| Automated pipelines | MCP | Type-safe parameters, structured output |
MCP Tools (34: 9 read, 25 write)
| Category | Tools | R/W |
|---|---|---|
| VM Lifecycle (6) | vm_list_ttl | Read |
| | vm_power_on, vm_power_off, vm_set_ttl, vm_cancel_ttl, vm_clean_slate | Write |
| Deployment (8) | deploy_vm_from_ova, deploy_vm_from_template, deploy_linked_clone, attach_iso_to_vm, convert_vm_to_template, batch_clone_vms, batch_linked_clone_vms, batch_deploy_from_spec | Write |
| Guest Ops (5) | vm_guest_exec_output, vm_guest_download | Read |
| | vm_guest_exec, vm_guest_upload, vm_guest_provision | Write |
| Plan/Apply (4) | vm_list_plans, vm_create_plan | Read |
| | vm_apply_plan, vm_rollback_plan | Write |
| Datastore (2) | browse_datastore, scan_datastore_images | Read |
| Cluster (6) | cluster_info | Read |
| | cluster_create, cluster_delete, cluster_add_host, cluster_remove_host, cluster_configure | Write |
| Alarm Management (3) | list_vcenter_alarms | Read |
| | acknowledge_vcenter_alarm, reset_vcenter_alarm | Write |
Read/write split: as classified in the table above, 9 tools are read-only and 25 modify state. All write tools require explicit parameters and are audit-logged. Destructive operations (delete, force power-off) require double confirmation.
CLI Quick Reference
# VM operations
vmware-aiops vm power-on <name> [--target <t>]
vmware-aiops vm power-off <name> [--force]
vmware-aiops vm create <name> --cpu 4 --memory 8192 --disk 100
vmware-aiops vm delete <name>
vmware-aiops vm clone <name> --new-name <new>
vmware-aiops vm migrate <name> --to-host <host>
# Guest operations (requires VMware Tools)
vmware-aiops vm guest-exec <name> --cmd <script-path> --args "<args>" --user <username>
vmware-aiops vm guest-upload <name> --local ./script.sh --guest /tmp/script.sh --user <username>
# Deploy
vmware-aiops deploy ova <path> --name <vm> --datastore <ds>
vmware-aiops deploy linked-clone --source <vm> --snapshot <snap> --name <new>
# Cluster
vmware-aiops cluster create <name> --ha --drs
vmware-aiops cluster info <name>
# Datastore
vmware-aiops datastore browse <ds> --pattern "*.ova"
# Alarm management
vmware-aiops alarm list [--target <t>]
vmware-aiops alarm acknowledge <entity_name> <alarm_name> [--target <t>]
vmware-aiops alarm reset <entity_name> <alarm_name> [--target <t>]
# Family
vmware-aiops hub status # show installed family members + install commands
Full CLI reference: see references/cli-reference.md
Troubleshooting
"VM not found" error
VM names are case-sensitive in vSphere. Use the exact name from vmware-monitor inventory vms.
Guest exec returns empty output
Use vm_guest_exec_output instead of vm_guest_exec — it auto-captures stdout/stderr. Basic vm_guest_exec only returns exit code.
Deploy OVA times out
Large OVA files (> 10 GB) may exceed the default 120 s timeout. The upload happens via an HTTP NFC lease, so ensure the network between the machine running vmware-aiops and ESXi is stable.
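A back-of-the-envelope estimate helps judge whether a given OVA is at risk. The sketch assumes the 120 s timeout covers the whole transfer and that throughput is sustained, neither of which is confirmed here; the function names are illustrative.

```python
DEFAULT_TIMEOUT_S = 120  # default timeout quoted in the text


def estimated_upload_seconds(size_gb: float, throughput_mbps: float) -> float:
    """Ideal transfer time: decimal GB converted to megabits, divided by Mbit/s."""
    if throughput_mbps <= 0:
        raise ValueError("throughput_mbps must be positive")
    return size_gb * 8000 / throughput_mbps


def likely_to_time_out(size_gb: float, throughput_mbps: float,
                       timeout_s: float = DEFAULT_TIMEOUT_S) -> bool:
    """True if the ideal transfer time already exceeds the timeout."""
    return estimated_upload_seconds(size_gb, throughput_mbps) > timeout_s
```

Under these assumptions a 10 GB OVA takes about 80 s at a sustained 1000 Mbit/s and about 160 s at 500 Mbit/s, so the same file can pass on one link and time out on another.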
Plan apply fails mid-way
Run vmware-aiops plan list to see the failed plan's status. Ask the user whether they want to roll back with vm_rollback_plan. Irreversible steps (delete_vm) are skipped during rollback.
Connection refused / SSL error
- Verify target is reachable: vmware-aiops doctor
- For self-signed certs: set disableSslCertValidation: true in config.yaml (lab environments only)
Setup
uv tool install vmware-aiops
mkdir -p ~/.vmware-aiops
vmware-aiops init # generates config.yaml and .env templates
chmod 600 ~/.vmware-aiops/.env
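To confirm that the chmod 600 step took effect, a short POSIX-only check can verify that the file grants nothing to group or others. The function name is illustrative and not part of vmware-aiops.

```python
import os
import stat


def env_file_is_private(path: str) -> bool:
    """True if the file grants no permissions to group or others (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

For example, env_file_is_private(os.path.expanduser("~/.vmware-aiops/.env")) should return True after setup.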
All tools are automatically audited via vmware-policy. Audit logs: vmware-audit log --last 20
Full setup guide, security details, and AI platform compatibility: see references/setup-guide.md
Audit & Safety
All operations are automatically audited via vmware-policy (@vmware_tool decorator):
- Every tool call logged to ~/.vmware/audit.db (SQLite, framework-agnostic)
- Policy rules enforced via ~/.vmware/rules.yaml (deny rules, maintenance windows, risk levels)
- Risk classification: each tool tagged as low/medium/high/critical
- View recent operations: vmware-audit log --last 20
- View denied operations: vmware-audit log --status denied
vmware-policy is automatically installed as a dependency — no manual setup needed.
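Because the audit log is a plain SQLite file, it can also be inspected without the vmware-audit CLI. The schema is not documented in this section, so this sketch only lists the table names and opens the database read-only; the function name is illustrative.

```python
import os
import sqlite3
from contextlib import closing
from pathlib import Path
from typing import List, Optional


def list_audit_tables(db_path: Optional[str] = None) -> List[str]:
    """Return the table names in the audit database, opened read-only."""
    if db_path is None:
        db_path = os.path.expanduser("~/.vmware/audit.db")  # path given in the text
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Opening with mode=ro means an inspection script cannot alter the audit trail, and a missing database raises sqlite3.OperationalError rather than creating an empty file.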