controlling-chrome-with-surfcli

Surf Browser Automation

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "controlling-chrome-with-surfcli" with this command: npx skills add zenobi-us/dotfiles/zenobi-us-dotfiles-controlling-chrome-with-surfcli

Surf Browser Automation

Control Chrome browser via CLI or Unix socket.

CLI Quick Reference

surf --help # Full help surf <group> # Group help (tab, scroll, page, wait, dialog, emulate, form, perf, ai) surf --help-full # All commands surf --find <term> # Search tools surf --help-topic <topic> # Topic guide (refs, semantic, frames, devices, windows)

Core Workflow

1. Navigate to page

surf navigate "https://example.com"

2. Read page to get element refs

surf page.read

3. Click by ref or coordinates

surf click --ref "e1" surf click --x 100 --y 200

4. Type text

surf type --text "hello"

5. Screenshot

surf screenshot --output /tmp/shot.png

AI Assistants (No API Keys)

Query AI models using your browser's logged-in session. Must be logged into the respective service in Chrome.

ChatGPT

surf chatgpt "explain this code" surf chatgpt "summarize" --with-page # Include current page context surf chatgpt "review" --model gpt-4o # Specify model surf chatgpt "analyze" --file document.pdf # With file attachment

Gemini

surf gemini "explain quantum computing" surf gemini "summarize" --with-page # Include page context surf gemini "analyze" --file data.csv # Attach file surf gemini "a robot surfing" --generate-image /tmp/robot.png surf gemini "add sunglasses" --edit-image photo.jpg --output out.jpg surf gemini "summarize" --youtube "https://youtube.com/..." surf gemini "hello" --model gemini-2.5-flash # Models: gemini-3-pro (default), gemini-2.5-pro, gemini-2.5-flash surf gemini "wide banner" --generate-image /tmp/banner.png --aspect-ratio 16:9

Perplexity

surf perplexity "what is quantum computing" surf perplexity "explain this page" --with-page # Include page context surf perplexity "deep dive" --mode research # Research mode (Pro) surf perplexity "latest news" --model sonar # Model selection (Pro)

Grok (via x.com - requires X.com login in Chrome)

surf grok "what are the latest AI trends on X" # Search X posts surf grok "analyze @username recent activity" # Profile analysis
surf grok "summarize this page" --with-page # Include page context surf grok "find viral AI posts" --deep-search # DeepSearch mode surf grok "quick question" --model fast # Models: auto, fast, expert, thinking (default)

Grok Validation & Troubleshooting:

Validate Grok UI and check available models (no query sent)

surf grok --validate

If models changed, save discovered models to surf.json config

surf grok --validate --save-models

AI Tool Troubleshooting

When AI queries fail, check these common issues:

  • Not logged in: The error "login required" means you need to log into the service in Chrome

  • Model selection failed: The UI may have changed. Run surf grok --validate to check

  • Response timeout: Thinking models (ChatGPT o1, Grok thinking) can take 45+ seconds

  • Element not found: The service's UI changed. Check for surf-cli updates

Debugging workflow for agents:

1. Check if the service is accessible and UI is valid

surf grok --validate

2. If models mismatch, update the local settings

surf grok --validate --save-models

3. Retry with explicit model name from validation output

surf grok "query" --model <model-from-validation>

4. If still failing, try with longer timeout

surf grok "query" --timeout 600

Tab Management

surf tab.list surf tab.new "https://google.com" surf tab.switch 12345 surf tab.close 12345 surf tab.reload # Reload current tab

Named tabs (aliases)

surf tab.name myapp # Name current tab surf tab.switch myapp # Switch by name surf tab.named # List named tabs surf tab.unname myapp # Remove name

Tab groups

surf tab.group # Create/add to tab group surf tab.ungroup # Remove from group surf tab.groups # List all tab groups

Window Management

surf window.list # List all windows surf window.list --tabs # Include tab details surf window.new # New window surf window.new --url "https://example.com" # New window with URL surf window.new --incognito # New incognito window surf window.new --unfocused # Don't focus new window surf window.focus 12345 # Focus window by ID surf window.close 12345 # Close window surf window.resize --id 123 --width 1920 --height 1080 surf window.resize --id 123 --state maximized # States: normal, minimized, maximized, fullscreen

Window isolation for agents:

Create isolated window for agent work

surf window.new "https://example.com"

Returns window ID, use with subsequent commands:

surf --window-id 123 tab.list surf --window-id 123 go "https://other.com"

Input Methods

CDP method (real events) - default

surf type --text "hello" surf click --x 100 --y 200

JS method (DOM manipulation) - for contenteditable

surf type --text "hello" --selector "#input" --method js

Keys

surf key Enter surf key "cmd+a" surf key.repeat --key Tab --count 5 # Repeat key presses

Hover and drag

surf hover --ref e5 surf drag --from-x 100 --from-y 100 --to-x 200 --to-y 200

Page Inspection

surf page.read # Accessibility tree with refs + page text surf page.read --no-text # Interactive elements only (no text content) surf page.read --ref e5 # Get specific element details surf page.read --depth 3 # Limit tree depth surf page.read --compact # Minimal output for LLM efficiency surf page.text # Plain text content only surf page.state # Modals, loading state, scroll info

Semantic Element Location

Find and act on elements by role, text, or label instead of refs:

Find by ARIA role

surf locate.role button --name "Submit" --action click surf locate.role textbox --name "Email" --action fill --value "test@example.com" surf locate.role link --all # Return all matches

Find by text content

surf locate.text "Sign In" --action click surf locate.text "Accept" --exact --action click

Find form field by label

surf locate.label "Username" --action fill --value "john" surf locate.label "Password" --action fill --value "secret"

Actions: click , fill , hover , text (get text content)

Text Search

surf search "login" # Find text in page surf search "Error" --case-sensitive # Case-sensitive surf search "button" --limit 5 # Limit results surf find "login" # Alias for search

Element Inspection

surf element.styles e5 # Get computed styles by ref surf element.styles ".card" # Or by CSS selector

Returns: font, color, background, border, padding, bounding box

Scrolling

surf scroll.bottom surf scroll.top
surf scroll.to --y 500 # Scroll to Y position surf scroll.to --ref e5 # Scroll element into view surf scroll.by --y 200 # Scroll by amount surf scroll.info # Get scroll position

Waiting

surf wait 2 # Wait 2 seconds surf wait.element ".loaded" # Wait for element surf wait.network # Wait for network idle surf wait.url "/success" # Wait for URL pattern surf wait.dom --stable 100 # Wait for DOM stability surf wait.load # Wait for page load complete

Dialog Handling

surf dialog.info # Get current dialog type/message surf dialog.accept # Accept (OK) surf dialog.accept --text "response" # Accept prompt with text surf dialog.dismiss # Dismiss (Cancel)

Device/Network Emulation

Network throttling

surf emulate.network slow-3g # Presets: slow-3g, fast-3g, 4g, offline surf emulate.network reset # Disable throttling

CPU throttling

surf emulate.cpu 4 # 4x slower surf emulate.cpu 1 # Reset

Device emulation (19 presets)

surf emulate.device "iPhone 14" surf emulate.device "Pixel 7" surf emulate.device --list # List available devices

Custom viewport

surf emulate.viewport --width 1280 --height 720 surf emulate.touch --enable # Enable touch emulation

Geolocation

surf emulate.geo --lat 37.7749 --lon -122.4194 surf emulate.geo --clear

Form Automation

surf page.read # Get element refs first

Fill by ref

surf form.fill --data '[{"ref":"e1","value":"John"},{"ref":"e2","value":"john@example.com"}]'

Checkboxes: true/false

surf form.fill --data '[{"ref":"e7","value":true}]'

Dropdown selection

surf select e5 "Option A" # By value (default) surf select e5 "Option A" "Option B" # Multi-select surf select e5 --by label "Display Text" # By visible label surf select e5 --by index 2 # By index (0-based)

File Upload

surf upload --ref e5 --files "/path/to/file.txt" surf upload --ref e5 --files "/path/file1.txt,/path/file2.txt"

Iframe Handling

surf frame.list # List frames with IDs surf frame.switch "FRAME_ID" # Switch to iframe context surf frame.main # Return to main frame surf frame.js --id "FRAME_ID" --code "return document.title"

After frame.switch, subsequent commands target that frame:

surf frame.switch "iframe-1" surf page.read # Reads iframe content surf click e5 # Clicks in iframe surf frame.main # Back to main page

Network Inspection

surf network # List captured requests surf network --stream # Real-time network events surf network.get --id "req-123" # Full request details surf network.body --id "req-123" # Get response body surf network.curl --id "req-123" # Generate curl command surf network.origins # List origins with stats surf network.stats # Capture statistics surf network.export # Export all requests surf network.clear # Clear captured requests

Console

surf console # Get console messages surf console --stream # Real-time console surf console --stream --level error # Errors only

JavaScript Execution

surf js "return document.title" surf js "document.querySelector('.btn').click()"

Performance

surf perf.metrics # Current metrics snapshot surf perf.start # Start trace surf perf.stop # Stop and get results

Screenshots

surf screenshot # Auto-saves to /tmp/surf-snap-*.png surf screenshot --output /tmp/shot.png # Save to specific file surf screenshot --selector ".card" # Element only surf screenshot --full-page # Full page scroll capture surf screenshot --no-save # Return base64 only, don't save file

Zoom

surf zoom # Get current zoom level surf zoom 1.5 # Set zoom to 150% surf zoom 1 # Reset to 100%

Cookies & Storage

surf cookie.list # List cookies for current page surf cookie.list --domain .google.com surf cookie.set --name "token" --value "abc123" surf cookie.get --name "token" surf cookie.clear # Clear all cookies

History & Bookmarks

surf history --query "github" --max 20 surf bookmarks --query "docs" surf bookmark.add --url "https://..." --title "My Bookmark" surf bookmark.remove

Health Checks & Smoke Tests

surf health --url "http://localhost:3000" surf smoke --urls "http://localhost:3000" "http://localhost:3000/about" surf smoke --urls "..." --screenshot /tmp/smoke

Workflows

Execute multi-step browser automation as a single command with smart auto-waits.

Inline Workflows

Pipe-separated commands

surf do 'go "https://example.com" | click e5 | screenshot'

Multi-step login flow

surf do 'go "https://example.com/login" | type "user@example.com" --selector "#email" | type "pass" --selector "#password" | click --selector "button[type=submit]"'

Validate without executing

surf do 'go "url" | click e5' --dry-run

Named Workflows

Save workflows as JSON files in ~/.surf/workflows/ (user) or ./.surf/workflows/ (project):

List available workflows

surf workflow.list

Show workflow details

surf workflow.info my-workflow

Run by name with arguments

surf do my-workflow --email "user@example.com" --password "secret"

Validate workflow file

surf workflow.validate workflow.json

Workflow JSON Format

{ "name": "Login Flow", "description": "Automate login process", "args": { "email": { "required": true }, "password": { "required": true }, "url": { "default": "https://example.com/login" } }, "steps": [ { "tool": "navigate", "args": { "url": "%{url}" } }, { "tool": "type", "args": { "text": "%{email}", "selector": "input[name=email]" } }, { "tool": "type", "args": { "text": "%{password}", "selector": "input[name=password]" } }, { "tool": "click", "args": { "selector": "button[type=submit]" } }, { "tool": "screenshot", "args": {}, "as": "result" } ] }

Loops and Step Outputs

{ "steps": [ // Capture step output for later use { "tool": "js", "args": { "code": "return [1,2,3]" }, "as": "items" },

// Fixed iterations
{ "repeat": 5, "steps": [
  { "tool": "click", "args": { "ref": "e5" } }
]},

// Iterate over array
{ "each": "%{items}", "as": "item", "steps": [
  { "tool": "js", "args": { "code": "console.log('%{item}')" } }
]},

// Repeat until condition
{ "repeat": 20, "until": { "tool": "js", "args": { "code": "return done" } }, "steps": [...] }

] }

Workflow Options

--file, -f <path> # Load from JSON file --dry-run # Parse and validate without executing --on-error stop|continue # Error handling (default: stop) --step-delay <ms> # Delay between steps (default: 100, 0 to disable) --no-auto-wait # Disable automatic waits --json # Structured JSON output

Auto-waits: Commands automatically wait for completion:

  • Navigation (go , back , forward ) → waits for page load

  • Clicks, key presses, form fills → waits for DOM stability

  • Tab switches → waits for tab to load

Why use do ? Instead of 6-8 separate CLI calls with LLM orchestration between each, a workflow executes deterministically. Faster, cheaper, and more reliable.

Error Diagnostics

Auto-capture screenshot + console on failure

surf wait.element ".missing" --auto-capture --timeout 2000

Saves to /tmp/surf-error-*.png

Common Options

--tab-id <id> # Target specific tab --window-id <id> # Target specific window --json # Raw JSON output
--auto-capture # Screenshot + console on error --timeout <ms> # Override default timeout

Tips

  • First CDP operation is slow (~5-8s) - debugger attachment overhead, subsequent calls fast

  • Use refs from page.read for reliable element targeting over CSS selectors

  • JS method for contenteditable - Modern editors (ChatGPT, Claude, Notion) need --method js

  • Named tabs for workflows - tab.name app then tab.switch app

  • Auto-capture for debugging - --auto-capture saves diagnostics on failure

  • AI tools use browser session - Must be logged into the service, no API keys needed

  • Grok validation - Run surf grok --validate if queries fail to check UI changes

  • Long timeouts for thinking models - ChatGPT o1, Grok thinking can take 60+ seconds

  • Use surf do for multi-step tasks - Reduces token overhead and improves reliability

  • Dry-run workflows first - surf do '...' --dry-run validates without executing

  • Window isolation - Use window.new

  • --window-id to keep agent work separate from your browsing
  • Semantic locators - locate.role , locate.text , locate.label for more robust element finding

  • Frame context - Use frame.switch before interacting with iframe content

Socket API

For programmatic access:

echo '{"type":"tool_request","method":"execute_tool","params":{"tool":"tab.list","args":{}},"id":"1"}' | nc -U /tmp/surf.sock

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

ast-grep-code-analysis

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

codemapper

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

subagent-driven-development

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

test-driven-development

No summary provided by upstream source.

Repository SourceNeeds Review