agent-browser

Browser automation for AI agents via inference.sh. Navigate web pages, interact with elements using @e refs, take screenshots, record video. Capabilities: web scraping, form filling, clicking, typing, drag-drop, file upload, JavaScript execution. Use for: web automation, data extraction, testing, agent browsing, research. Triggers: browser, web automation, scrape, navigate, click, fill form, screenshot, browse web, playwright, headless browser, web agent, surf internet, record video

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "agent-browser" with this command: npx skills add inferencesh/skills@web-search

Agentic Browser

Agentic Browser

Browser automation for AI agents via inference.sh. Uses Playwright under the hood with a simple @e ref system for element interaction.

Quick Start

# Install CLI
curl -fsSL https://cli.inference.sh | sh && infsh login

# Open a page and get interactive elements
infsh app run agent-browser --function open --input '{"url": "https://example.com"}' --session new

Core Workflow

Every browser automation follows this pattern:

  1. Open - Navigate to URL, get @e refs for elements
  2. Interact - Use refs to click, fill, drag, etc.
  3. Re-snapshot - After navigation/changes, get fresh refs
  4. Close - End session (returns video if recording)
# 1. Start session
RESULT=$(infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com/login"
}')
SESSION_ID=$(echo $RESULT | jq -r '.session_id')
# Elements: @e1 [input] "Email", @e2 [input] "Password", @e3 [button] "Sign In"

# 2. Fill and submit
infsh app run agent-browser --function interact --session $SESSION_ID --input '{
  "action": "fill", "ref": "@e1", "text": "user@example.com"
}'
infsh app run agent-browser --function interact --session $SESSION_ID --input '{
  "action": "fill", "ref": "@e2", "text": "password123"
}'
infsh app run agent-browser --function interact --session $SESSION_ID --input '{
  "action": "click", "ref": "@e3"
}'

# 3. Re-snapshot after navigation
infsh app run agent-browser --function snapshot --session $SESSION_ID --input '{}'

# 4. Close when done
infsh app run agent-browser --function close --session $SESSION_ID --input '{}'

Functions

FunctionDescription
openNavigate to URL, configure browser (viewport, proxy, video recording)
snapshotRe-fetch page state with @e refs after DOM changes
interactPerform actions using @e refs (click, fill, drag, upload, etc.)
screenshotTake page screenshot (viewport or full page)
executeRun JavaScript code on the page
closeClose session, returns video if recording was enabled

Interact Actions

ActionDescriptionRequired Fields
clickClick elementref
dblclickDouble-click elementref
fillClear and type textref, text
typeType text (no clear)text
pressPress key (Enter, Tab, etc.)text
selectSelect dropdown optionref, text
hoverHover over elementref
checkCheck checkboxref
uncheckUncheck checkboxref
dragDrag and dropref, target_ref
uploadUpload file(s)ref, file_paths
scrollScroll pagedirection (up/down/left/right), scroll_amount
backGo back in history-
waitWait millisecondswait_ms
gotoNavigate to URLurl

Element Refs

Elements are returned with @e refs:

@e1 [a] "Home" href="/"
@e2 [input type="text"] placeholder="Search"
@e3 [button] "Submit"
@e4 [select] "Choose option"
@e5 [input type="checkbox"] name="agree"

Important: Refs are invalidated after navigation. Always re-snapshot after:

  • Clicking links/buttons that navigate
  • Form submissions
  • Dynamic content loading

Features

Video Recording

Record browser sessions for debugging or documentation:

# Start with recording enabled (optionally show cursor indicator)
SESSION=$(infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com",
  "record_video": true,
  "show_cursor": true
}' | jq -r '.session_id')

# ... perform actions ...

# Close to get the video file
infsh app run agent-browser --function close --session $SESSION --input '{}'
# Returns: {"success": true, "video": <File>}

Cursor Indicator

Show a visible cursor in screenshots and video (useful for demos):

infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com",
  "show_cursor": true,
  "record_video": true
}'

The cursor appears as a red dot that follows mouse movements and shows click feedback.

Proxy Support

Route traffic through a proxy server:

infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com",
  "proxy_url": "http://proxy.example.com:8080",
  "proxy_username": "user",
  "proxy_password": "pass"
}'

File Upload

Upload files to file inputs:

infsh app run agent-browser --function interact --session $SESSION --input '{
  "action": "upload",
  "ref": "@e5",
  "file_paths": ["/path/to/file.pdf"]
}'

Drag and Drop

Drag elements to targets:

infsh app run agent-browser --function interact --session $SESSION --input '{
  "action": "drag",
  "ref": "@e1",
  "target_ref": "@e2"
}'

JavaScript Execution

Run custom JavaScript:

infsh app run agent-browser --function execute --session $SESSION --input '{
  "code": "document.querySelectorAll(\"h2\").length"
}'
# Returns: {"result": "5", "screenshot": <File>}

Deep-Dive Documentation

ReferenceDescription
references/commands.mdFull function reference with all options
references/snapshot-refs.mdRef lifecycle, invalidation rules, troubleshooting
references/session-management.mdSession persistence, parallel sessions
references/authentication.mdLogin flows, OAuth, 2FA handling
references/video-recording.mdRecording workflows for debugging
references/proxy-support.mdProxy configuration, geo-testing

Ready-to-Use Templates

TemplateDescription
templates/form-automation.shForm filling with validation
templates/authenticated-session.shLogin once, reuse session
templates/capture-workflow.shContent extraction with screenshots

Examples

Form Submission

SESSION=$(infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com/contact"
}' | jq -r '.session_id')

# Get elements: @e1 [input] "Name", @e2 [input] "Email", @e3 [textarea], @e4 [button] "Send"

infsh app run agent-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e1", "text": "John Doe"}'
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e2", "text": "john@example.com"}'
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e3", "text": "Hello!"}'
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "click", "ref": "@e4"}'

infsh app run agent-browser --function snapshot --session $SESSION --input '{}'
infsh app run agent-browser --function close --session $SESSION --input '{}'

Search and Extract

SESSION=$(infsh app run agent-browser --function open --session new --input '{
  "url": "https://google.com"
}' | jq -r '.session_id')

infsh app run agent-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e1", "text": "weather today"}'
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "press", "text": "Enter"}'
infsh app run agent-browser --function interact --session $SESSION --input '{"action": "wait", "wait_ms": 2000}'

infsh app run agent-browser --function snapshot --session $SESSION --input '{}'
infsh app run agent-browser --function close --session $SESSION --input '{}'

Screenshot with Video

SESSION=$(infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com",
  "record_video": true
}' | jq -r '.session_id')

# Take full page screenshot
infsh app run agent-browser --function screenshot --session $SESSION --input '{
  "full_page": true
}'

# Close and get video
RESULT=$(infsh app run agent-browser --function close --session $SESSION --input '{}')
echo $RESULT | jq '.video'

Sessions

Browser state persists within a session. Always:

  1. Start with --session new on first call
  2. Use returned session_id for subsequent calls
  3. Close session when done

Related Skills

# Web search (for research + browse)
npx skills add inferencesh/skills@web-search

# LLM models (analyze extracted content)
npx skills add inferencesh/skills@llm-models

Documentation

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

Marp Slide Deck Generator

Turn docs, notes, data, logs, screenshots, and code summaries into polished Marp slide decks.

Registry SourceRecently Updated
Coding

Video Factory

create raw video clips into finished MP4 videos with this skill. Works with MP4, MOV, AVI, WebM files up to 500MB. content creators and marketers use it for...

Registry SourceRecently Updated
Coding

Clip Skill

Turn a 2-minute raw video recording into 1080p edited video clips just by typing what you need. Whether it's trimming and refining short video clips for soci...

Registry SourceRecently Updated
Coding

Editor Kiss Free

edit raw video clips into edited video clips with this skill. Works with MP4, MOV, AVI, WebM files up to 500MB. social media creators use it for adding kiss...

Registry SourceRecently Updated