agent-browser

agent-browser - Browser Automation for AI Agents

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "agent-browser" with this command: npx skills add supercent-io/skills-template/supercent-io-skills-template-agent-browser

When to use this skill

  • Open websites and automate UI actions

  • Fill forms, click controls, and verify outcomes

  • Capture screenshots/PDFs or extract content

  • Run deterministic web checks with accessibility refs

  • Execute parallel browser tasks via isolated sessions

Core workflow

Always use the deterministic ref loop:

  • agent-browser open <url>

  • agent-browser snapshot -i

  • interact with refs (@e1, @e2, ...)

  • agent-browser snapshot -i again after page/DOM changes

agent-browser open https://example.com/form
agent-browser wait --load networkidle
agent-browser snapshot -i
agent-browser fill @e1 "user@example.com"
agent-browser click @e2
agent-browser snapshot -i
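
The loop above could be wrapped in two small shell functions, shown here as a minimal sketch rather than anything shipped with the skill. The function names and the AB variable are hypothetical; AB exists only so the commands can be printed for a dry run (AB="echo agent-browser") instead of executed.

```shell
# AB is a hypothetical override for this sketch; it defaults to the real CLI.
# It is expanded unquoted on purpose so "echo agent-browser" splits into two words.
AB="${AB:-agent-browser}"

# First half of the loop: open -> wait -> snapshot.
# Stops at the first failing step because of the && chain.
open_and_snapshot() {
  $AB open "$1" &&
    $AB wait --load networkidle &&
    $AB snapshot -i
}

# Second half: run one interaction, then re-snapshot so the next
# action works from fresh refs.
# Usage: act_and_resnapshot fill @e1 "user@example.com"
act_and_resnapshot() {
  $AB "$@" && $AB snapshot -i
}
```

Because every action is followed by a new snapshot, refs from the previous snapshot are never reused after the page may have changed.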

Command patterns

Use && chaining when intermediate output is not needed.

# Good chaining: open -> wait -> snapshot
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i

# Separate calls when output is needed first
agent-browser snapshot -i
# parse refs from the snapshot output, then act on one
agent-browser click @e2
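
The "parse refs" step between the two separate calls might look like the sketch below. It assumes each snapshot line carries its ref as [ref=eN], as in the sample text; the sample lines and the find_ref helper are illustrative, so check real snapshot output before relying on the pattern.

```shell
# Reads snapshot text on stdin and prints the ref (as @eN) of the first
# line that contains the given label. Prints nothing if no line matches.
find_ref() {
  grep -F -- "$1" |
    sed -n 's/.*\[ref=\(e[0-9][0-9]*\)\].*/@\1/p' |
    head -n 1
}

# Sample snapshot text (assumed format, for illustration only).
sample='- heading "Sign in" [ref=e1] [level=1]
- textbox "Email" [ref=e2]
- button "Submit" [ref=e3]'

printf '%s\n' "$sample" | find_ref 'button "Submit"'   # prints @e3
```

In a live run the sample would be replaced by the output of agent-browser snapshot -i, and the printed ref passed to agent-browser click.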

High-value commands:

  • Navigation: open, close

  • Snapshot: snapshot -i, snapshot -i -C, snapshot -s "#selector"

  • Interaction: click, fill, type, select, check, press

  • Verification: diff snapshot, diff screenshot --baseline <file>

  • Capture: screenshot, screenshot --annotate, pdf

  • Wait: wait --load networkidle, wait <selector|@ref|ms>

Verification patterns

Use explicit evidence after actions.

# Baseline -> action -> verify structure
agent-browser snapshot -i
agent-browser click @e3
agent-browser diff snapshot

# Visual regression
agent-browser screenshot baseline.png
agent-browser click @e5
agent-browser diff screenshot --baseline baseline.png
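
Besides the built-in diff commands, "explicit evidence" can be a plain text assertion on the post-action snapshot. The sketch below uses a hypothetical helper, expect_in_snapshot, built on standard grep; it is a complementary check, not a feature of agent-browser.

```shell
# Reads snapshot text on stdin and succeeds only if it contains the
# expected fixed string. A non-zero return lets a script stop early.
expect_in_snapshot() {
  if grep -qF -- "$1"; then
    echo "PASS: found '$1'"
  else
    echo "FAIL: '$1' not found in snapshot" >&2
    return 1
  fi
}

# Usage after an action (requires the agent-browser CLI):
#   agent-browser click @e3
#   agent-browser snapshot -i | expect_in_snapshot 'Thank you'
```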

Safety and reliability

  • Refs are invalid after navigation or significant DOM updates; re-snapshot before next action.

  • Prefer wait --load networkidle or selector/ref waits over fixed sleeps.

  • For multi-step JS, use eval --stdin (or base64) to avoid shell escaping breakage.

  • For concurrent tasks, isolate with --session <name>.

  • On long pages, limit output (for example with --max-output or a scoped snapshot -s) to avoid flooding the agent's context.

  • Optional hardening in sensitive flows: domain allowlist and action policies.
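
Session isolation for concurrent tasks could be sketched as follows. The capture_site function, the session names, and the AB dry-run variable are hypothetical; only the --session flag and the open, wait, screenshot, and close commands come from this document.

```shell
# AB is a hypothetical override for dry runs (AB="echo agent-browser");
# it is expanded unquoted on purpose so the override splits into words.
AB="${AB:-agent-browser}"

# Runs one isolated task. Every command carries the same --session name,
# and the session is closed even when an earlier step fails.
capture_site() {
  local name="$1"
  local url="$2"
  local status
  $AB --session "$name" open "$url" &&
    $AB --session "$name" wait --load networkidle &&
    $AB --session "$name" screenshot "$name.png"
  status=$?
  $AB --session "$name" close
  return "$status"
}

# Usage: two sessions in parallel, then wait for both to finish.
#   capture_site site-a https://example.com &
#   capture_site site-b https://example.org &
#   wait
```

Using the session name for the screenshot file as well keeps the parallel tasks from overwriting each other's output.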

Optional hardening examples:

# Wrap page content with boundaries to reduce prompt-injection risk
export AGENT_BROWSER_CONTENT_BOUNDARIES=1

# Limit output volume for long pages
export AGENT_BROWSER_MAX_OUTPUT=50000

# Restrict navigation and network to trusted domains
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"

# Restrict allowed action types
export AGENT_BROWSER_ACTION_POLICY=./policy.json

Example policy.json:

{"default":"deny","allow":["navigate","snapshot","click","fill","scroll","wait","get"],"deny":["eval","download","upload","network","state"]}

CLI-flag equivalent:

agent-browser --content-boundaries --max-output 50000 --allowed-domains "example.com,*.example.com" --action-policy ./policy.json open https://example.com
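
The hardening settings above can be collected into one setup script that also sanity-checks the policy. This is a sketch under assumptions: the scratch directory, the actions helper, and the overlap check are illustrative additions, and the line-based parsing only works for the exact layout written here (a real script would use a JSON parser).

```shell
# Write the example policy into a scratch directory (used only for this sketch).
policy_dir=$(mktemp -d)
policy_file="$policy_dir/policy.json"
cat > "$policy_file" <<'EOF'
{
  "default": "deny",
  "allow": ["navigate", "snapshot", "click", "fill", "scroll", "wait", "get"],
  "deny": ["eval", "download", "upload", "network", "state"]
}
EOF

# Prints the entries of one list ("allow" or "deny"), one per line.
actions() {
  sed -n "s/.*\"$1\": *\[\(.*\)\].*/\1/p" "$policy_file" |
    tr -d ' "' | tr ',' '\n'
}

# An action listed in both allow and deny is probably a mistake.
actions deny > "$policy_dir/deny.txt"
overlap=$(actions allow | grep -Fxf "$policy_dir/deny.txt" || true)
if [ -n "$overlap" ]; then
  echo "policy conflict: $overlap" >&2
fi

export AGENT_BROWSER_CONTENT_BOUNDARIES=1
export AGENT_BROWSER_MAX_OUTPUT=50000
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
export AGENT_BROWSER_ACTION_POLICY="$policy_file"
```

Sourcing such a script before a sensitive flow applies the same restrictions to every later agent-browser call in that shell.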

Troubleshooting

  • command not found: install agent-browser, then run agent-browser install.

  • Wrong element clicked: run snapshot -i again and use fresh refs.

  • Dynamic SPA content missing: wait with wait --load networkidle or a targeted selector wait.

  • Session collisions: assign unique --session names and close each session.

  • Large output pressure: narrow snapshots (-i, -c, -d, -s) and extract only the text you need.

References

Deep-dive docs in this skill:

  • commands

  • snapshot-refs

  • session-management

  • authentication

Ready templates:

  • ./templates/form-automation.sh

  • ./templates/capture-workflow.sh

Metadata

  • Version: 1.1.0

  • Last updated: 2026-02-26

  • Scope: deterministic browser automation for agent workflows

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
