agent-browser

CLI-based browser automation using vercel-labs/agent-browser. Provides browser control via Bash commands as an alternative to playwright-mcp.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "agent-browser" with this command: npx skills add numberone-ai/machina-meta/numberone-ai-machina-meta-agent-browser

When to Use agent-browser vs playwright-mcp

| Feature | playwright-mcp | agent-browser |
| --- | --- | --- |
| Invocation | MCP tool calls (mcp__playwright__*) | Bash commands (npx agent-browser) |
| Session persistence | No (fresh context each time) | Yes (--session, --session-name) |
| State save/load | No | Yes (state save/load) |
| Page diffing | No | Yes (diff snapshot, diff screenshot) |
| Annotated screenshots | No | Yes (screenshot --annotate) |
| Network interception | No | Yes (network route) |
| Cookie management | Via browser_run_code JS | Native (cookies set) |
| Multi-tab | browser_tabs tool | tab new, tab <n> |
| Element refs | @ref from snapshot | @ref from snapshot (same concept) |
| Best for | Quick interactions, MCP-native workflows | Advanced workflows, sessions, diffing, state persistence |

Default: Use playwright-mcp for standard browser interactions (it's already loaded as an MCP server).

Use agent-browser when you need:

  • Session persistence across commands

  • Page diffing (snapshot or visual)

  • Annotated screenshots with element labels

  • Network interception / mocking

  • Saved authentication state

  • Cookie/storage manipulation

  • CDP connection to existing browser

Nix Environment Setup

This workspace uses Nix for reproducible tooling. The Nix-managed Playwright has Chromium at a different revision than agent-browser's bundled Playwright expects. The project-level agent-browser.json config file at the workspace root handles this automatically by setting executablePath to the Nix store Chromium.

No extra flags are needed; just run npx agent-browser commands from the workspace root.

If the Chromium path changes (e.g., after niv update), update agent-browser.json:

{ "executablePath": "/nix/store/<hash>-playwright-chromium/chrome-linux64/chrome" }

Find the current path:

ls -la /nix/store/*-playwright-browsers/chromium-*
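Updating the config by hand after each Chromium bump is easy to get wrong, so the step can be scripted. The following is a minimal sketch, assuming agent-browser.json holds only the executablePath key shown above (any other keys in the file would be overwritten). The function name write_agent_browser_config is hypothetical and not part of agent-browser.

```bash
# write_agent_browser_config CHROME_PATH [CONFIG_FILE]
# Hypothetical helper: writes a config containing only executablePath.
# Assumes CHROME_PATH contains no double quotes or backslashes
# (true for ordinary Nix store paths).
write_agent_browser_config() {
  local chrome_path="$1" config_file="${2:-agent-browser.json}"
  if [ ! -x "$chrome_path" ]; then
    echo "not an executable: $chrome_path" >&2
    return 1
  fi
  printf '{ "executablePath": "%s" }\n' "$chrome_path" > "$config_file"
}
```

Run it from the workspace root with the Chromium binary path as the first argument; the second argument defaults to agent-browser.json in the current directory.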

Core Workflow

The basic agent-browser interaction loop:

1. Open a page

npx agent-browser open https://example.com

2. Get interactive elements (snapshot with refs)

npx agent-browser snapshot -i

3. Interact using @refs from snapshot

npx agent-browser click @e3
npx agent-browser fill @e5 "search term"

4. Re-snapshot to see updated state

npx agent-browser snapshot -i

5. Take screenshot for visual verification

npx agent-browser screenshot

6. Close when done

npx agent-browser close
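The loop above can be wrapped in a small function so the browser is always closed, even when an earlier step fails. This is a minimal sketch rather than anything shipped with agent-browser: the ab helper, the AB_CMD override variable, and the smoke_check name are hypothetical and exist only for this example. The subcommands themselves are the ones listed in the steps above.

```bash
# ab: run the agent-browser CLI. AB_CMD is a hypothetical override for this
# sketch (e.g. AB_CMD=echo for a dry run); it is not an agent-browser feature.
ab() {
  if [ -n "${AB_CMD:-}" ]; then
    "$AB_CMD" "$@"
  else
    npx agent-browser "$@"
  fi
}

# smoke_check URL SCREENSHOT_PATH
# Open the page, list interactive elements, save a screenshot, then close.
smoke_check() {
  local url="$1" shot="$2" status
  ab open "$url" && ab snapshot -i && ab screenshot "$shot"
  status=$?
  ab close          # always close, even if a step above failed
  return "$status"
}
```

Calling smoke_check https://example.com page.png returns the status of the first failing step, or 0 if all steps succeeded.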

Snapshot Options

npx agent-browser snapshot                # Full accessibility tree
npx agent-browser snapshot -i             # Interactive elements only (recommended)
npx agent-browser snapshot -c             # Compact (remove empty elements)
npx agent-browser snapshot -d 3           # Limit depth to 3 levels
npx agent-browser snapshot -s "#main"     # Scope to CSS selector

Screenshots

npx agent-browser screenshot                    # Viewport screenshot
npx agent-browser screenshot --full             # Full page
npx agent-browser screenshot --annotate         # With numbered element labels
npx agent-browser screenshot path/to/save.png   # Save to specific path

MachinaMed Authentication

MachinaMed uses Google SSO. You must set JWT cookies before navigating to authenticated pages.

Method 1: Set Cookies Directly (Recommended for agent-browser)

1. Generate auth cookies

(cd repos/dem2 && ./scripts/user_manager.py user token dbeal@numberone.ai --export-cookie)

Output: export AUTH_HEADER="Cookie: access_token=eyJhbG...; refresh_token=eyJhbG..."

2. Open browser and set cookies before navigating

npx agent-browser open about:blank
npx agent-browser cookies set access_token "eyJhbG..." --domain localhost --path /
npx agent-browser cookies set refresh_token "eyJhbG..." --domain localhost --path /

3. Now navigate to authenticated page

npx agent-browser open http://localhost:3000/markers
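Copying the two token values out of the AUTH_HEADER line by hand is error-prone. The sketch below splits a header of the form shown in step 1 ("Cookie: access_token=...; refresh_token=...") into individual values. The cookie_value function is a hypothetical helper, and it assumes the pairs are separated by "; " exactly as in that output.

```bash
# cookie_value HEADER NAME
# Print the value of cookie NAME from a "Cookie: a=1; b=2" style header.
# Returns 1 if NAME is not present.
cookie_value() {
  local rest="${1#Cookie: }; " name="$2" pair
  while [ -n "$rest" ]; do
    pair="${rest%%; *}"      # first "name=value" pair
    rest="${rest#*; }"       # everything after it
    if [ "${pair%%=*}" = "$name" ]; then
      printf '%s\n' "${pair#*=}"
      return 0
    fi
  done
  return 1
}

# Possible usage once AUTH_HEADER is set in the shell (assumption: you have
# exported it from the user_manager.py output yourself):
#   npx agent-browser cookies set access_token \
#     "$(cookie_value "$AUTH_HEADER" access_token)" --domain localhost --path /
```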

Method 2: Session Persistence

First time: set up auth in a named session

npx agent-browser --session-name machina open about:blank
npx agent-browser --session-name machina cookies set access_token "eyJhbG..." --domain localhost --path /
npx agent-browser --session-name machina cookies set refresh_token "eyJhbG..." --domain localhost --path /
npx agent-browser --session-name machina open http://localhost:3000/markers

Later: reuse the session (cookies persist)

npx agent-browser --session-name machina open http://localhost:3000/markers

Method 3: State Save/Load

Save authenticated state for reuse

npx agent-browser state save machina-auth.json

Load it in a new session

npx agent-browser --state machina-auth.json open http://localhost:3000/markers
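Loading a state file that was never saved is a likely failure mode, so a guard can help. This is a minimal sketch under stated assumptions: ab, AB_CMD, and open_authenticated are hypothetical names for this example, while the --state flag and the state save command are the ones shown above.

```bash
# ab: run the agent-browser CLI. AB_CMD is a hypothetical dry-run override
# used only in this sketch.
ab() {
  if [ -n "${AB_CMD:-}" ]; then
    "$AB_CMD" "$@"
  else
    npx agent-browser "$@"
  fi
}

# open_authenticated STATE_FILE URL
# Open URL with the saved state, or explain how to create the state file.
open_authenticated() {
  local state_file="$1" url="$2"
  if [ -f "$state_file" ]; then
    ab --state "$state_file" open "$url"
  else
    echo "no saved state at $state_file; set cookies first, then run:" >&2
    echo "  npx agent-browser state save $state_file" >&2
    return 1
  fi
}
```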

For Preview/Staging Environments

Replace localhost domain with the target:

npx agent-browser cookies set access_token "eyJhbG..." --domain preview-99.n1-machina.dev --path /
npx agent-browser cookies set refresh_token "eyJhbG..." --domain preview-99.n1-machina.dev --path /
npx agent-browser open https://preview-99.n1-machina.dev/markers
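In the examples above, the --domain value is the host part of the URL being opened (localhost for http://localhost:3000, preview-99.n1-machina.dev for the preview URL). A script that targets several environments can derive it from the URL instead of hard-coding it. The url_host function below is a hypothetical helper; it is a sketch that handles plain scheme://host[:port]/path URLs only, not IPv6 literals or userinfo.

```bash
# url_host URL
# Print the host part of a scheme://host[:port]/path URL.
url_host() {
  local rest="${1#*://}"   # drop the scheme
  rest="${rest%%/*}"       # drop the path
  rest="${rest%%:*}"       # drop the port
  printf '%s\n' "$rest"
}

# Possible usage:
#   target=https://preview-99.n1-machina.dev/markers
#   npx agent-browser cookies set access_token "eyJhbG..." \
#     --domain "$(url_host "$target")" --path /
```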

Session Management

Isolated sessions (separate browser contexts)

npx agent-browser --session my-test open https://example.com
npx agent-browser --session my-test snapshot -i

List active sessions

npx agent-browser session list

Auto-persisting sessions (state saved/restored automatically)

npx agent-browser --session-name my-persistent open https://example.com

Page Diffing

Compare page states before and after changes:

Snapshot Diff (DOM/Accessibility Tree)

Take baseline snapshot, make changes, then diff

npx agent-browser snapshot > baseline.txt

... make changes ...

npx agent-browser diff snapshot --baseline baseline.txt

Scoped diff (only specific section)

npx agent-browser diff snapshot --selector "#main-content" --compact
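The baseline, change, diff sequence can be bundled so the baseline is always captured immediately before the change. This is a sketch only: ab, AB_CMD, and diff_around are hypothetical names introduced for the example, and the change step is whatever command you pass in.

```bash
# ab: run the agent-browser CLI. AB_CMD is a hypothetical dry-run override
# used only in this sketch.
ab() {
  if [ -n "${AB_CMD:-}" ]; then
    "$AB_CMD" "$@"
  else
    npx agent-browser "$@"
  fi
}

# diff_around BASELINE_FILE COMMAND [ARGS...]
# Save a baseline snapshot, run COMMAND, then diff against the baseline.
diff_around() {
  local baseline="$1"
  shift
  ab snapshot > "$baseline" || return 1
  "$@" || return 1
  ab diff snapshot --baseline "$baseline"
}

# Possible usage:
#   diff_around baseline.txt ab click @e3
```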

Visual Diff (Pixel Comparison)

Save baseline screenshot

npx agent-browser screenshot baseline.png

After changes, compare

npx agent-browser diff screenshot --baseline baseline.png -o diff-result.png

With threshold (0-1, default 0.2)

npx agent-browser diff screenshot --baseline baseline.png -t 0.1

URL Comparison

Compare two URLs side by side

npx agent-browser diff url http://localhost:3000/markers https://preview-99.n1-machina.dev/markers

Visual comparison

npx agent-browser diff url http://localhost:3000 https://preview-99.n1-machina.dev --screenshot

Scoped comparison

npx agent-browser diff url <url1> <url2> --selector "#biomarker-list"

Network Interception

Block requests (e.g., analytics)

npx agent-browser network route "/analytics/" --abort

Mock API responses

npx agent-browser network route "/api/v1/observations/" --body '{"items": []}'

View tracked requests

npx agent-browser network requests
npx agent-browser network requests --filter api

Remove routes

npx agent-browser network unroute
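A mocked route left in place can silently affect later commands in the same session, so it helps to pair each route with an unroute. The sketch below does that around an arbitrary command. The names ab, AB_CMD, and with_mock are hypothetical; the network route --body and network unroute commands are the ones shown above.

```bash
# ab: run the agent-browser CLI. AB_CMD is a hypothetical dry-run override
# used only in this sketch.
ab() {
  if [ -n "${AB_CMD:-}" ]; then
    "$AB_CMD" "$@"
  else
    npx agent-browser "$@"
  fi
}

# with_mock PATTERN BODY COMMAND [ARGS...]
# Mock PATTERN with BODY, run COMMAND, then remove routes again.
with_mock() {
  local pattern="$1" body="$2" status
  shift 2
  ab network route "$pattern" --body "$body" || return 1
  "$@"
  status=$?
  ab network unroute      # clean up even if COMMAND failed
  return "$status"
}

# Possible usage:
#   with_mock "/api/v1/observations/" '{"items": []}' ab reload
```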

Common Commands Reference

Navigation

npx agent-browser open <url>    # Navigate to URL
npx agent-browser back          # Go back
npx agent-browser forward       # Go forward
npx agent-browser reload        # Reload page
npx agent-browser close         # Close browser

Interaction

npx agent-browser click @e1             # Click element by ref
npx agent-browser fill @e2 "text"       # Clear and fill input
npx agent-browser type @e2 "text"       # Type into element
npx agent-browser press Enter           # Press key
npx agent-browser select @e3 "option"   # Select dropdown option
npx agent-browser check @e4             # Check checkbox
npx agent-browser scroll down 500       # Scroll down 500px

Element Inspection

npx agent-browser get text @e1      # Get text content
npx agent-browser get html @e1      # Get innerHTML
npx agent-browser get value @e1     # Get input value
npx agent-browser get title         # Get page title
npx agent-browser get url           # Get current URL
npx agent-browser is visible @e1    # Check visibility

Semantic Locators

npx agent-browser find role button click             # Click first button
npx agent-browser find text "Submit" click           # Click by text
npx agent-browser find label "Email" fill "a@b.c"    # Fill by label
npx agent-browser find testid "login-btn" click      # Click by data-testid

Waiting

npx agent-browser wait "#element"                  # Wait for element
npx agent-browser wait 2000                        # Wait 2 seconds
npx agent-browser wait --text "Loading complete"   # Wait for text
npx agent-browser wait --load networkidle          # Wait for network idle

JavaScript Execution

npx agent-browser eval "document.title"
npx agent-browser eval "window.scrollTo(0, document.body.scrollHeight)"

Tabs

npx agent-browser tab              # List tabs
npx agent-browser tab new <url>    # Open new tab
npx agent-browser tab 2            # Switch to tab 2
npx agent-browser tab close        # Close current tab

Cookies & Storage

npx agent-browser cookies                             # Get all cookies
npx agent-browser cookies set <name> <value>          # Set cookie
npx agent-browser cookies clear                       # Clear cookies
npx agent-browser storage local                       # Get localStorage
npx agent-browser storage local set <key> <value>     # Set localStorage

State Management

npx agent-browser state save <path>    # Save auth/session state
npx agent-browser state load <path>    # Load saved state
npx agent-browser state list           # List saved states
npx agent-browser state clear --all    # Clear all states

Debugging

npx agent-browser console          # View console messages
npx agent-browser errors           # View page errors
npx agent-browser highlight @e1    # Highlight element
npx agent-browser trace start      # Start recording trace
npx agent-browser trace stop       # Stop and save trace

Global Flags

--session <name>         # Isolated browser session
--session-name <name>    # Auto-persisting session
--state <path>           # Load storage state JSON
--headed                 # Show browser window (visible)
--json                   # JSON output format
--full, -f               # Full page screenshot
--annotate               # Annotated screenshot
--debug                  # Debug output
--config <path>          # Config file path

Environment Variables

| Variable | Purpose |
| --- | --- |
| AGENT_BROWSER_SESSION | Session name |
| AGENT_BROWSER_SESSION_NAME | Auto-save/restore session |
| AGENT_BROWSER_STATE | Storage state JSON file |
| AGENT_BROWSER_PROFILE | Persistent profile path |
| AGENT_BROWSER_DEFAULT_TIMEOUT | Operation timeout (ms) |
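Setting a variable for a single command avoids leaking a session name into the rest of the shell. The sketch below assumes AGENT_BROWSER_SESSION has the same effect as passing --session; the listing above only describes it as "Session name", so verify this against the upstream documentation. The with_session function is a hypothetical helper.

```bash
# with_session NAME COMMAND [ARGS...]
# Run COMMAND with AGENT_BROWSER_SESSION set to NAME for that command only.
# Assumption: the CLI treats this variable like the --session flag.
with_session() {
  local name="$1"
  shift
  AGENT_BROWSER_SESSION="$name" "$@"
}

# Possible usage:
#   with_session my-test npx agent-browser open https://example.com
#   with_session my-test npx agent-browser snapshot -i
```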

Related Skills

  • machina-ui: Required authentication setup for MachinaMed pages. Load machina-ui first to get JWT cookies, then use agent-browser for browser automation.

  • playwright-mcp: Default MCP-based browser automation. Use for standard interactions.

  • machina-docker: Development stack management (start/stop services before browser testing).
