pilotty

Automates terminal TUI applications (vim, htop, lazygit, dialog) through managed PTY sessions. Use when the user needs to interact with terminal apps, edit files in vim/nano, navigate TUI menus, click terminal buttons/checkboxes, or automate CLI workflows with interactive prompts.

Install skill "pilotty" with this command: npx skills add msmps/pilotty/msmps-pilotty-pilotty

Terminal Automation with pilotty

CRITICAL: Argument Positioning

All flags (--name, -s, --format, etc.) MUST come BEFORE positional arguments:

# CORRECT - flags before command/arguments
pilotty spawn --name myapp vim file.txt
pilotty key -s myapp Enter
pilotty snapshot -s myapp --format text

# WRONG - flags after command (they get passed to the app, not pilotty!)
pilotty spawn vim file.txt --name myapp   # FAILS: --name goes to vim
pilotty key Enter -s myapp                # FAILS: -s goes nowhere useful

This is the #1 cause of agent failures. When in doubt: flags first, then command/args.
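
One way to make the flags-first rule hard to violate is to route session-targeted calls through a small wrapper that always places -s directly after the subcommand. The sketch below is only an illustration: pty_s is a hypothetical helper name, not part of pilotty, and it does not apply to spawn, which takes --name rather than -s.

```shell
# Hypothetical helper (not part of pilotty). It always runs
#   pilotty <subcommand> -s <session> <remaining args...>
# so the session flag can never land after the positional arguments.
pty_s() {
  if [ "$#" -lt 2 ]; then
    echo "usage: pty_s <session> <subcommand> [args...]" >&2
    return 2
  fi
  pty_s_session=$1
  pty_s_sub=$2
  shift 2
  pilotty "$pty_s_sub" -s "$pty_s_session" "$@"
}

# Usage:
#   pty_s editor key Enter               # runs: pilotty key -s editor Enter
#   pty_s editor snapshot --format text  # runs: pilotty snapshot -s editor --format text
```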


Quick start

pilotty spawn vim file.txt        # Start TUI app in managed session
pilotty wait-for "file.txt"       # Wait for app to be ready
pilotty snapshot                  # Get screen state with UI elements
pilotty key i                     # Enter insert mode
pilotty type "Hello, World!"      # Type text
pilotty key Escape                # Exit insert mode
pilotty kill                      # End session

Core workflow

  1. Spawn: pilotty spawn <command> starts the app in a background PTY
  2. Wait: pilotty wait-for <text> ensures the app is ready
  3. Snapshot: pilotty snapshot returns screen state with detected UI elements
  4. Understand: Parse elements[] to identify buttons, inputs, toggles
  5. Interact: Use keyboard commands (key, type) to navigate and interact
  6. Re-snapshot: Check content_hash to detect screen changes
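
The steps above can be sketched as one reusable function for a single key press. This is a minimal sketch, not an official pilotty pattern: run_step is a hypothetical name, jq is assumed to be installed, and step 4 (parsing elements[]) is left to the caller.

```shell
# Hypothetical helper: wait, snapshot, send one key, then block until the
# screen changes. Prints the post-change snapshot JSON on stdout.
#   $1 = session name, $2 = text that signals readiness, $3 = key to send
run_step() {
  pilotty wait-for -s "$1" "$2" || return 1                    # 2. Wait
  rs_hash=$(pilotty snapshot -s "$1" | jq -r '.content_hash')  # 3. Snapshot
  pilotty key -s "$1" "$3" || return 1                         # 5. Interact
  pilotty snapshot -s "$1" --await-change "$rs_hash"           # 6. Re-snapshot
}

# Usage (step 1, spawn, happens once per session):
#   pilotty spawn --name editor vim file.txt
#   run_step editor "file.txt" i
```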

Commands

Session management

pilotty spawn <command>           # Start TUI app (e.g., pilotty spawn htop)
pilotty spawn --name myapp <cmd>  # Start with custom session name (--name before command)
pilotty kill                      # Kill default session
pilotty kill -s myapp             # Kill specific session
pilotty list-sessions             # List all active sessions
pilotty daemon                    # Manually start daemon (usually auto-starts)
pilotty shutdown                  # Stop daemon and all sessions
pilotty examples                  # Show end-to-end workflow example

Screen capture

pilotty snapshot                  # Full JSON with text content and elements
pilotty snapshot --format compact # JSON without text field
pilotty snapshot --format text    # Plain text with cursor indicator
pilotty snapshot -s myapp         # Snapshot specific session

# Wait for screen to change (eliminates need for sleep!)
HASH=$(pilotty snapshot | jq -r '.content_hash')
pilotty key Enter
pilotty snapshot --await-change "$HASH"              # Block until screen changes
pilotty snapshot --await-change "$HASH" --settle 50  # Wait for 50ms stability

Input

pilotty type "hello"              # Type text at cursor
pilotty type -s myapp "text"      # Type in specific session

pilotty key Enter                 # Press Enter
pilotty key Ctrl+C                # Send interrupt
pilotty key Escape                # Send Escape
pilotty key Tab                   # Send Tab
pilotty key F1                    # Function key
pilotty key Alt+F                 # Alt combination
pilotty key Up                    # Arrow key
pilotty key -s myapp Ctrl+S       # Key in specific session

# Key sequences (space-separated, sent in order)
pilotty key "Ctrl+X m"            # Emacs chord: Ctrl+X then m
pilotty key "Escape : w q Enter"  # vim :wq sequence
pilotty key --delay 50 "a b c"    # Send a, b, c with 50ms delay (flag before keys)
pilotty key -s myapp "Tab Tab Enter"  # Sequence in specific session

Interaction

pilotty click 5 10                # Click at row 5, col 10
pilotty click -s myapp 10 20      # Click in specific session
pilotty scroll up                 # Scroll up 1 line
pilotty scroll down 5             # Scroll down 5 lines
pilotty scroll -s myapp up 10     # Scroll in specific session

Terminal control

pilotty resize 120 40             # Resize terminal to 120 cols x 40 rows
pilotty resize -s myapp 80 24     # Resize specific session

pilotty wait-for "Ready"          # Wait for text to appear (30s default)
pilotty wait-for -r "Error"       # Wait for regex pattern
pilotty wait-for -t 5000 "Done"   # Wait with 5s timeout
pilotty wait-for -s editor "~"    # Wait in specific session

Global options

Option                  Description
-s, --session <name>    Target specific session (default: "default")
--format <fmt>          Snapshot format: full, compact, text
-t, --timeout <ms>      Timeout for wait-for and await-change (default: 30000)
-r, --regex             Treat wait-for pattern as regex
--name <name>           Session name for spawn command
--delay <ms>            Delay between keys in a sequence (default: 0, max: 10000)
--await-change <hash>   Block snapshot until content_hash differs
--settle <ms>           Wait for screen to be stable for this many ms (default: 0)

Environment variables

PILOTTY_SESSION="mysession"       # Default session name
PILOTTY_SOCKET_DIR="/tmp/pilotty" # Override socket directory
RUST_LOG="debug"                  # Enable debug logging

Snapshot Output

The snapshot command returns structured JSON with detected UI elements:

{
  "snapshot_id": 42,
  "size": { "cols": 80, "rows": 24 },
  "cursor": { "row": 5, "col": 10, "visible": true },
  "text": "Settings:\n  [x] Notifications  [ ] Dark mode\n  [Save]  [Cancel]",
  "elements": [
    { "kind": "toggle", "row": 1, "col": 2, "width": 3, "text": "[x]", "confidence": 1.0, "checked": true },
    { "kind": "toggle", "row": 1, "col": 20, "width": 3, "text": "[ ]", "confidence": 1.0, "checked": false },
    { "kind": "button", "row": 2, "col": 2, "width": 6, "text": "[Save]", "confidence": 0.8 },
    { "kind": "button", "row": 2, "col": 10, "width": 8, "text": "[Cancel]", "confidence": 0.8 }
  ],
  "content_hash": 12345678901234567890
}

Use --format text for a plain text view with cursor indicator:

--- Terminal 80x24 | Cursor: (5, 10) ---
bash-3.2$ [_]

The [_] shows cursor position. Use the text content to understand screen state and navigate with keyboard commands.
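
To scan the elements array quickly, the JSON can be flattened into one line per element. The sketch below assumes jq is installed and uses only the fields shown in the sample output above; list_elements is a hypothetical helper name.

```shell
# Hypothetical helper: reads snapshot JSON on stdin and prints one
# tab-separated line per detected element: kind, row, col, text.
list_elements() {
  jq -r '.elements[] | [.kind, .row, .col, .text] | @tsv'
}

# Usage:
#   pilotty snapshot -s myapp | list_elements
```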


Element Detection

pilotty automatically detects interactive UI elements in terminal applications. Elements provide read-only context to help understand UI structure.

Element Kinds

Kind     Detection Patterns                        Confidence   Fields
toggle   [x], [ ], [*]                             1.0          checked: bool
button   Inverse video, [OK], <Cancel>, (Submit)   1.0 / 0.8    focused: bool (if true)
input    Cursor position, ____ underscores         1.0 / 0.6    focused: bool (if true)

Element Fields

Field        Type     Description
kind         string   Element type: button, input, or toggle
row          number   Row position (0-based from top)
col          number   Column position (0-based from left)
width        number   Width in terminal cells (CJK chars = 2)
text         string   Text content of the element
confidence   number   Detection confidence (0.0-1.0)
focused      bool     Whether element has focus (only present if true)
checked      bool     Toggle state (only present for toggles)

Confidence Levels

Confidence   Meaning
1.0          High confidence: Cursor position, inverse video, checkbox patterns
0.8          Medium confidence: Bracket patterns [OK], <Cancel>
0.6          Lower confidence: Underscore input fields ____
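
Because lower-confidence matches such as underscore runs are more likely to be false positives, it can help to drop them before reasoning about the screen. A minimal sketch, assuming jq is installed; filter_by_confidence is a hypothetical helper name and the 0.8 default is just an example threshold.

```shell
# Hypothetical helper: reads snapshot JSON on stdin and prints, one compact
# JSON object per line, the elements whose confidence is >= $1 (default 0.8).
filter_by_confidence() {
  jq -c --argjson min "${1:-0.8}" '.elements[] | select(.confidence >= $min)'
}

# Usage:
#   pilotty snapshot | filter_by_confidence 1.0   # only the surest detections
```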

Wait for Screen Changes (Recommended)

Stop guessing sleep durations! Use --await-change to wait for the screen to actually update:

# Capture baseline hash
HASH=$(pilotty snapshot | jq -r '.content_hash')

# Perform action
pilotty key Enter

# Wait for screen to change (blocks until hash differs)
pilotty snapshot --await-change "$HASH"

# Or wait for screen to stabilize (for apps that render progressively)
pilotty snapshot --await-change "$HASH" --settle 100

Flags:

Flag                    Description
--await-change <HASH>   Block until content_hash differs from this value
--settle <MS>           After change detected, wait for screen to be stable for MS
-t, --timeout <MS>      Maximum wait time (default: 30000)

Why this is better than sleep:

  • sleep 1 is a guess - too short causes race conditions, too long slows automation
  • --await-change waits exactly as long as needed - no more, no less
  • --settle handles apps that render progressively (show partial, then complete)

Waiting for Streaming AI Responses

When interacting with AI-powered TUIs that stream responses (such as opencode), use a longer --settle time, since the screen keeps updating as tokens arrive:

# 1. Capture hash before sending prompt
HASH=$(pilotty snapshot -s myapp | jq -r '.content_hash')

# 2. Type prompt and submit
pilotty type -s myapp "write me a poem about ai agents"
pilotty key -s myapp Enter

# 3. Wait for streaming response to complete
#    - Use longer settle (2-3s) since AI apps pause between chunks
#    - Extend timeout for long responses (60s+)
pilotty snapshot -s myapp --await-change "$HASH" --settle 3000 -t 60000

# 4. Response may be scrolled - scroll up if needed to see full output
pilotty scroll -s myapp up 10
pilotty snapshot -s myapp --format text

Key parameters for streaming:

  • --settle 2000-3000: AI responses have pauses between chunks; 2-3 seconds ensures streaming is truly done
  • -t 60000: Extend timeout beyond the 30s default for longer generations
  • The settle timer resets on each screen change, so it naturally waits until streaming stops
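
The effect of a resetting settle timer can be illustrated with a toy model. This is not pilotty's implementation, only arithmetic on made-up timestamps: settle_fires_at is a hypothetical function that takes a settle window and the times (in ms, ascending) at which the screen changed, and prints when a timer that restarts on every change would expire.

```shell
# Toy model only (not pilotty's code): when does a settle timer expire if
# every screen change restarts it?
#   $1 = settle window in ms, remaining args = change times in ms (ascending)
settle_fires_at() {
  sfa_settle=$1
  shift
  sfa_deadline=""
  for sfa_t in "$@"; do
    # A change arriving after the deadline is too late to restart the timer.
    if [ -n "$sfa_deadline" ] && [ "$sfa_t" -gt "$sfa_deadline" ]; then
      break
    fi
    sfa_deadline=$((sfa_t + sfa_settle))   # each change restarts the window
  done
  echo "$sfa_deadline"
}

# Chunks arrive at 0, 1200 and 2500 ms, then a late update at 9000 ms.
settle_fires_at 3000 0 1200 2500 9000   # prints 5500: waits out all three chunks
settle_fires_at 1000 0 1200 2500 9000   # prints 1000: fires before the second chunk
```

In this model a 1-second window returns mid-stream, which is why a settle of 2-3 seconds is suggested for apps that pause between chunks.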

Manual Change Detection

For manual polling (not recommended), use content_hash directly:

# Get initial state
SNAP1=$(pilotty snapshot)
HASH1=$(echo "$SNAP1" | jq -r '.content_hash')

# Perform action
pilotty key Tab

# Check if screen changed
SNAP2=$(pilotty snapshot)
HASH2=$(echo "$SNAP2" | jq -r '.content_hash')

if [ "$HASH1" != "$HASH2" ]; then
  echo "Screen changed - re-analyze elements"
fi

Using Elements Effectively

Elements are read-only context for understanding the UI. Use keyboard navigation for reliable interaction:

# 1. Get snapshot to understand UI structure
pilotty snapshot | jq '.elements'
# Output shows toggles (checked/unchecked) and buttons with positions

# 2. Navigate and interact with keyboard (reliable approach)
pilotty key Tab          # Move to next element
pilotty key Space        # Toggle checkbox
pilotty key Enter        # Activate button

# 3. Verify state changed
pilotty snapshot | jq '.elements[] | select(.kind == "toggle")'

Key insight: Use elements to understand WHAT is on screen, use keyboard to interact with it.


Navigation Approach

pilotty uses keyboard-first navigation, just like a human would:

# 1. Take snapshot to see the screen
pilotty snapshot --format text

# 2. Navigate using keyboard
pilotty key Tab           # Move to next element
pilotty key Enter         # Activate/select
pilotty key Escape        # Cancel/back
pilotty key Up            # Move up in list/menu
pilotty key Space         # Toggle checkbox

# 3. Type text when needed
pilotty type "search term"
pilotty key Enter

# 4. Click at coordinates for mouse-enabled TUIs
pilotty click 5 10        # Click at row 5, col 10

Key insight: Parse the snapshot text and elements to understand what's on screen, then use keyboard commands to navigate. This works reliably across all TUI applications.
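
For mouse-enabled TUIs only, element positions can feed the click command. The sketch below assumes that the row and col reported for an element are the same coordinates pilotty click expects, which is not stated explicitly here. click_element is a hypothetical helper name, and jq is assumed to be installed.

```shell
# Hypothetical helper: click the first detected element whose text matches.
#   $1 = session name, $2 = exact element text, e.g. "[Save]"
# Assumption: element row/col use the same coordinates as `pilotty click`.
click_element() {
  ce_pos=$(pilotty snapshot -s "$1" |
    jq -r --arg t "$2" \
      '[.elements[] | select(.text == $t)][0] // empty | "\(.row) \(.col)"')
  if [ -z "$ce_pos" ]; then
    echo "no element with text: $2" >&2
    return 1
  fi
  # Intentionally unquoted: splits "row col" into two arguments.
  pilotty click -s "$1" $ce_pos
}

# Usage:
#   click_element form "[Save]"
```

Keyboard navigation remains the more portable choice; a helper like this is only worth reaching for when the app is known to handle mouse events.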


Example: Edit file with vim

# 1. Spawn vim
pilotty spawn --name editor vim /tmp/hello.txt

# 2. Wait for vim to load and capture baseline hash
pilotty wait-for -s editor "hello.txt"
HASH=$(pilotty snapshot -s editor | jq -r '.content_hash')

# 3. Enter insert mode
pilotty key -s editor i

# 4. Type content
pilotty type -s editor "Hello from pilotty!"

# 5. Wait for screen to update, then exit (no sleep needed!)
pilotty snapshot -s editor --await-change "$HASH" --settle 50
pilotty key -s editor "Escape : w q Enter"

# 6. Verify session ended
pilotty list-sessions

Alternative using individual keys:

pilotty key -s editor Escape
pilotty type -s editor ":wq"
pilotty key -s editor Enter

Example: Dialog checklist interaction

# 1. Spawn dialog checklist (--name before command)
pilotty spawn --name opts dialog --checklist "Select features:" 12 50 4 \
    "notifications" "Push notifications" on \
    "darkmode" "Dark mode theme" off \
    "autosave" "Auto-save documents" on \
    "telemetry" "Usage analytics" off

# 2. Wait for dialog to render (use snapshot --settle, not sleep!)
pilotty snapshot -s opts --settle 200  # Wait for initial render to stabilize

# 3. Get snapshot and examine elements, capture hash
SNAP=$(pilotty snapshot -s opts)
echo "$SNAP" | jq '.elements[] | select(.kind == "toggle")'
HASH=$(echo "$SNAP" | jq -r '.content_hash')

# 4. Navigate to "darkmode" and toggle it
pilotty key -s opts Down      # Move to second option
pilotty key -s opts Space     # Toggle it on

# 5. Wait for change and verify
pilotty snapshot -s opts --await-change "$HASH" | jq '.elements[] | select(.kind == "toggle") | {text, checked}'

# 6. Confirm selection
pilotty key -s opts Enter

# 7. Clean up
pilotty kill -s opts

Example: Form filling with elements

# 1. Spawn a form application
pilotty spawn --name form my-form-app

# 2. Get snapshot to understand form structure
pilotty snapshot -s form | jq '.elements'
# Shows inputs, toggles, and buttons with positions for click command

# 3. Type into the first input (likely already focused)
pilotty type -s form "myusername"

# 4. Tab to password field
pilotty key -s form Tab
pilotty type -s form "mypassword"

# 5. Tab to remember me and toggle
pilotty key -s form Tab
pilotty key -s form Space

# 6. Tab to Login and activate
pilotty key -s form Tab
pilotty key -s form Enter

# 7. Check result
pilotty snapshot -s form --format text

Example: Monitor with htop

# 1. Spawn htop
pilotty spawn --name monitor htop

# 2. Wait for display
pilotty wait-for -s monitor "CPU"

# 3. Take snapshot to see current state
pilotty snapshot -s monitor --format text

# 4. Send commands
pilotty key -s monitor F9    # Kill menu
pilotty key -s monitor q     # Quit

# 5. Kill session
pilotty kill -s monitor

Example: Interact with AI TUI (opencode, etc.)

AI-powered TUIs stream responses, requiring special handling:

# 1. Spawn the AI app
pilotty spawn --name ai opencode

# 2. Wait for the prompt to be ready
pilotty wait-for -s ai -t 15000 "Ask anything"

# 3. Capture baseline hash
HASH=$(pilotty snapshot -s ai | jq -r '.content_hash')

# 4. Type prompt and submit
pilotty type -s ai "explain the architecture of this codebase"
pilotty key -s ai Enter

# 5. Wait for streaming response to complete
#    - settle=3000: Wait 3s of no changes to ensure streaming is done
#    - timeout=60000: Allow up to 60s for long responses
pilotty snapshot -s ai --await-change "$HASH" --settle 3000 -t 60000 --format text

# 6. If response is long and scrolled, scroll up to see full output
pilotty scroll -s ai up 20
pilotty snapshot -s ai --format text

# 7. Clean up
pilotty kill -s ai

Gotchas with AI apps:

  • Use --settle 2000-3000 because AI responses pause between chunks
  • Extend timeout with -t 60000 for complex prompts
  • Long responses may scroll the terminal; use scroll up to see the beginning
  • The settle timer resets on each screen update, so it waits for true completion

Sessions

Each session is isolated with its own:

  • PTY (pseudo-terminal)
  • Screen buffer
  • Child process

# Run multiple apps (--name must come before the command)
pilotty spawn --name monitoring htop
pilotty spawn --name editor vim file.txt

# Target specific session
pilotty snapshot -s monitoring
pilotty key -s editor Ctrl+S

# List all
pilotty list-sessions

# Kill specific
pilotty kill -s editor

The first session spawned without --name is automatically named default.

Important: The --name flag must come before the command. Everything after the command is passed as arguments to that command.
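
Since each named session keeps running until it is killed or its process exits, a script that spawns several sessions may want to clean them up on exit. A minimal sketch using only the kill -s command shown above; kill_sessions is a hypothetical helper name.

```shell
# Hypothetical helper: kill every session named in the arguments and ignore
# failures for sessions that have already ended.
kill_sessions() {
  for ks_name in "$@"; do
    pilotty kill -s "$ks_name" >/dev/null 2>&1 || true
  done
}

# Usage in a script that owns two sessions:
#   trap 'kill_sessions monitoring editor' EXIT
#   pilotty spawn --name monitoring htop
#   pilotty spawn --name editor vim file.txt
```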

Daemon Architecture

pilotty uses a background daemon for session management:

  • Auto-start: Daemon starts on first command
  • Auto-stop: Shuts down after 5 minutes with no sessions
  • Session cleanup: Sessions removed when process exits (within 500ms)
  • Shared state: Multiple CLI calls share sessions

You rarely need to manage the daemon manually.

Error Handling

Errors include actionable suggestions:

{
  "code": "SESSION_NOT_FOUND",
  "message": "Session 'abc123' not found",
  "suggestion": "Run 'pilotty list-sessions' to see available sessions"
}

{
  "code": "SPAWN_FAILED",
  "message": "Failed to spawn process: command not found",
  "suggestion": "Check that the command exists and is in PATH"
}
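
A caller can branch on the code field of such an error object. The sketch below assumes the error is available as a single JSON object with the fields shown above; where pilotty writes it (stdout or stderr) is not specified here. error_action and the action names it prints (respawn, abort, inspect) are hypothetical, and jq is assumed to be installed.

```shell
# Hypothetical helper: reads one error object on stdin, prints a suggested
# next action on stdout and pilotty's own suggestion on stderr.
error_action() {
  ea_json=$(cat)
  ea_code=$(printf '%s' "$ea_json" | jq -r '.code')
  case "$ea_code" in
    SESSION_NOT_FOUND) echo "respawn" ;;   # session is gone: spawn it again
    SPAWN_FAILED)      echo "abort" ;;     # command missing: retrying will not help
    *)                 echo "inspect" ;;   # unrecognized code: read the message
  esac
  printf '%s' "$ea_json" | jq -r '.suggestion' >&2
}

# Usage (assuming the error JSON was captured in $ERR):
#   action=$(printf '%s' "$ERR" | error_action)
```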

Common Patterns

Reliable action + wait (recommended)

# The pattern: capture hash, act, await change
HASH=$(pilotty snapshot | jq -r '.content_hash')
pilotty key Enter
pilotty snapshot --await-change "$HASH" --settle 50

# This replaces fragile patterns like:
# pilotty key Enter && sleep 1 && pilotty snapshot  # BAD: guessing

Wait then act

pilotty spawn my-app
pilotty wait-for "Ready"    # Ensure app is ready
pilotty snapshot            # Then snapshot

Check state before action

# Proceed only if no error is shown on screen
if ! pilotty snapshot --format text | grep -q "Error"; then
  pilotty key Enter
fi

Check for specific element

# Check if the first toggle is checked
pilotty snapshot | jq -c '[.elements[] | select(.kind == "toggle")][0] | {text, checked}'

# Find element at specific position
pilotty snapshot | jq '.elements[] | select(.row == 5 and .col == 10)'

Retry on timeout

pilotty wait-for -t 5000 "Ready" || {
  pilotty snapshot --format text   # Check what's on screen
  # Adjust approach based on actual state
}
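
The same idea extends to a bounded retry loop. This sketch relies on wait-for exiting non-zero on timeout, as the pattern above implies; wait_with_retry is a hypothetical helper name.

```shell
# Hypothetical helper: retry wait-for a limited number of times and dump the
# screen to stderr after each timeout so the caller can see what went wrong.
#   $1 = max attempts, $2 = per-attempt timeout in ms, $3 = text to wait for
wait_with_retry() {
  wwr_try=1
  while [ "$wwr_try" -le "$1" ]; do
    if pilotty wait-for -t "$2" "$3"; then
      return 0
    fi
    echo "attempt $wwr_try timed out; current screen:" >&2
    pilotty snapshot --format text >&2
    wwr_try=$((wwr_try + 1))
  done
  return 1
}

# Usage:
#   wait_with_retry 3 5000 "Ready" || echo "app never became ready"
```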

Deep-dive Documentation

For detailed patterns and edge cases, see:

Reference                          Description
references/session-management.md   Multi-session patterns, isolation, cleanup
references/key-input.md            Complete key combinations reference
references/element-detection.md    Detection rules, confidence, patterns

Ready-to-use Templates

Executable workflow scripts:

Template                          Description
templates/vim-workflow.sh         Edit file with vim, save, exit
templates/dialog-interaction.sh   Handle dialog/whiptail prompts
templates/multi-session.sh        Parallel TUI orchestration
templates/element-detection.sh    Element detection demo

Usage:

./templates/vim-workflow.sh /tmp/myfile.txt "File content here"
./templates/dialog-interaction.sh
./templates/multi-session.sh
./templates/element-detection.sh
