Agent Browser
Headless browser automation CLI for AI agents. Fast Rust CLI with Node.js fallback.
Works with: Claude Code, Cursor, GitHub Copilot, OpenAI Codex, Google Gemini, opencode.
Quick Navigation
| Topic | Reference |
|---|---|
| Installation | installation.md |
| Commands | commands.md |
| Refs | refs.md |
| Advanced | advanced.md |
When to Use
- Automating browser tasks in AI agent workflows
- Web scraping with AI-friendly output
- Testing web applications with LLM agents
- Managing multiple browser sessions with isolated auth
Core Concepts
Refs (Element References)
The snapshot command returns an accessibility tree in which each element has a unique ref such as @e1 or @e2. Refs are:
- Deterministic - a ref points to the exact element from the snapshot
- Fast - no DOM re-query needed
- AI-friendly - LLMs can reliably parse and use refs
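Because refs follow the simple `@e<number>` shape shown above, they can be pulled out of snapshot text with standard tools. The following is a minimal sketch: the `extract_refs` helper and the sample snapshot lines are hypothetical, and the real snapshot layout may differ.

```shell
# Hypothetical helper: print each distinct ref found on stdin, in first-seen order.
# Assumes refs always look like "@e" followed by digits, as in @e1, @e2.
extract_refs() {
  grep -oE '@e[0-9]+' | awk '!seen[$0]++'
}

# Made-up snapshot lines, for illustration only (real output may look different).
printf '%s\n' 'button "Sign in" @e2' 'textbox "Email" @e3' 'link "Home" @e1 @e2' | extract_refs
# prints @e2, @e3, @e1 (one per line; the repeated @e2 is dropped)
```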
Architecture
Client-daemon architecture:
- Rust CLI - parses commands, communicates with daemon
- Daemon - runs the browser automation engine:
  - Default: Node.js daemon (Playwright)
  - Native (v0.16.0+): native Rust daemon (direct Chrome DevTools Protocol), enabled via --native, AGENT_BROWSER_NATIVE=1, or "native": true
  - Lightpanda (v0.17.0+): Lightpanda engine, selected via --engine lightpanda or AGENT_BROWSER_ENGINE=lightpanda (implies native mode)
The daemon starts automatically and persists between commands. As of v0.17.0, startup errors are surfaced directly.
v0.8.6 improves daemon reliability by cleaning stale socket/PID files and retrying transient connection errors.
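The engine choices above can be driven from environment variables, which is convenient in scripts. Here is a small sketch, assuming only the two variables named in the list; the `engine_env` helper and its `default`/`native`/`lightpanda` argument labels are hypothetical names for this example.

```shell
# Hypothetical helper: print the environment assignment that selects an engine.
# Variable names are taken from the list above; the argument labels are this
# sketch's own invention.
engine_env() {
  case "$1" in
    default)    ;;  # Node.js daemon (Playwright): nothing to set
    native)     echo 'AGENT_BROWSER_NATIVE=1' ;;
    lightpanda) echo 'AGENT_BROWSER_ENGINE=lightpanda' ;;  # implies native mode
    *)          echo "unknown engine: $1" >&2; return 1 ;;
  esac
}

# Possible usage (not run here):
#   env $(engine_env lightpanda) agent-browser open example.com
```

Keeping the mapping in one function means a script can switch engines by changing a single argument rather than editing every command line.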
Quick Example
Navigate and get snapshot
agent-browser open example.com
agent-browser snapshot                       # Get accessibility tree with refs
agent-browser click @e2                      # Click by ref from snapshot
agent-browser fill @e3 "test@example.com"    # Fill input by ref
agent-browser get text @e1                   # Get text by ref
agent-browser screenshot page.png            # Save screenshot
agent-browser close
AI Workflow Pattern
Optimal workflow for AI agents:
1. Navigate and get snapshot
agent-browser open example.com
agent-browser snapshot -i --json   # AI parses tree and refs
2. AI identifies target refs from snapshot
3. Execute actions using refs
agent-browser click @e2
agent-browser fill @e3 "input text"
4. Get new snapshot if page changed
agent-browser snapshot -i --json
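Steps 3 and 4 can be bundled so that every action is followed by a fresh snapshot. This is a minimal sketch: `ab` and `act_and_resnapshot` are hypothetical helper names, and the `AB` override exists only so the flow can be dry-run without a browser.

```shell
# AB is the command to run; override it (e.g. AB=echo) for a dry run.
ab() { "${AB:-agent-browser}" "$@"; }

# Hypothetical helper: perform one action, then re-snapshot if it succeeded,
# so the next step never relies on refs from before the page changed.
act_and_resnapshot() {
  ab "$@" && ab snapshot -i --json
}

# Possible usage (not run here):
#   act_and_resnapshot click @e2
#   act_and_resnapshot fill @e3 "input text"
```

Re-snapshotting after every action costs an extra call but removes the need to guess whether the page changed.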
Headed Mode (Debugging)
agent-browser open example.com --headed
Local File Access (v0.9.1)
agent-browser open file:///path/to/doc.pdf --allow-file-access
Cursor-Aware Snapshots (v0.9.1)
agent-browser snapshot -C
agent-browser snapshot --cursor
Session Persistence (v0.10.0)
Automatically save and restore cookies/localStorage across restarts with a named session:
agent-browser --session-name myapp open myapp.com   # first run: state is saved under "myapp"
agent-browser --session-name myapp open myapp.com   # later run: saved state is restored
State management commands:
agent-browser state list
agent-browser state show myapp
agent-browser state rename myapp myapp-prod
agent-browser state clear myapp-prod
agent-browser state cleanup
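When a script touches several named sessions, a thin wrapper keeps the `--session-name` flag from being forgotten on individual commands. A sketch under assumptions: `in_session` is a hypothetical helper, and `AB` is an override used only for dry runs.

```shell
# Hypothetical wrapper: run any agent-browser command under a named session.
# AB can be overridden (e.g. AB=echo) to print the command instead of running it.
in_session() {
  session_name=$1
  shift
  "${AB:-agent-browser}" --session-name "$session_name" "$@"
}

# Possible usage (not run here):
#   in_session myapp open myapp.com
```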
Release Updates (v0.12.0–v0.14.0)
- Added keyboard commands for raw keyboard input at the currently focused element (no selector needed).
- Added persistent color scheme selection via --color-scheme and AGENT_BROWSER_COLOR_SCHEME.
- Improved IPC reliability (EAGAIN/backpressure-aware writes) and lowered default Playwright timeout to 25s (configurable via AGENT_BROWSER_DEFAULT_TIMEOUT).
- Improved CDP reconnection and fixed state load when no browser is running.
- Reduced --annotate warning noise when the flag isn’t explicitly passed.
New Tab Clicks (v0.10.0)
agent-browser click @e12 --new-tab
Mobile Safari (iOS)
agent-browser -p ios device list
agent-browser -p ios open https://example.com --device "iPhone 15"
agent-browser tap 200 400
agent-browser swipe 200 600 200 200 500
JSON Output
Use --json for machine-readable output:
agent-browser snapshot --json
agent-browser get text @e1 --json
agent-browser is visible @e2 --json
Critical Prohibitions
- Do not use CSS/XPath selectors when refs are available (use @e1, @e2, etc.)
- Do not forget to close sessions when done
- Do not assume element positions without taking a fresh snapshot
- Do not use old refs after page navigation or content changes (re-snapshot)
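The "close sessions when done" rule is easy to break when a script fails halfway. One way to guard against that is a shell `trap`, sketched below. `with_browser` is a hypothetical helper, and `AB` is an override used only for dry runs.

```shell
# Hypothetical wrapper: run a command (or shell function) and always close the
# browser afterwards, even if the command fails.
# AB can be overridden (e.g. AB=echo) for a dry run.
with_browser() {
  (
    # The EXIT trap fires when this subshell ends, whatever the outcome.
    trap '"${AB:-agent-browser}" close' EXIT
    "$@"
  )
}

# Possible usage (not run here), where my_task is a function defined by the caller:
#   with_browser my_task
```

Running the work in a subshell keeps the trap local, so it does not interfere with any EXIT handler the calling script already has.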
Common Commands
Navigation
agent-browser open <url>
agent-browser back / forward / reload
agent-browser close
Interaction
agent-browser click <sel>
agent-browser click <sel> --new-tab
agent-browser fill <sel> <text>
agent-browser press <key>
agent-browser hover <sel>
agent-browser select <sel> <val>
agent-browser download <sel> <path>   # v0.7+
Info
agent-browser get text <sel>
agent-browser get url
agent-browser get title
agent-browser is visible <sel>
Snapshots & Screenshots
agent-browser snapshot -i --json
agent-browser screenshot [path]
Links
- Documentation
- Changelog
- GitHub
- npm