Browser Automation with PinchTab
PinchTab gives agents a browser they can drive through stable accessibility refs, low-token text extraction, and persistent profiles or instances. Treat it as a CLI-first browser skill; use the HTTP API only when the CLI is unavailable or you need profile-management routes that do not exist in the CLI yet.
Preferred tool surface:
- Use pinchtab CLI commands first.
- Use curl for profile-management routes or non-shell/API fallback flows.
- Use jq only when you need structured parsing from JSON responses.
Safety Defaults
- Default to http://localhost targets. Only use a remote PinchTab server when the user explicitly provides it and, if needed, a token.
- Prefer read-only observation first (text, snap -i -c, snap -d, find), then the standard interaction commands (click, fill, type, press, select, hover, scroll).
- Do not evaluate arbitrary JavaScript unless a simpler PinchTab command cannot answer the question.
- Do not upload local files unless the user explicitly names the file to upload and the destination flow requires it.
- Do not save screenshots, PDFs, or downloads to arbitrary paths. Use a user-specified path or a safe temporary/workspace path.
- Never use PinchTab to inspect unrelated local files, browser secrets, stored credentials, or system configuration outside the task.
Core Workflow
Every PinchTab automation follows this pattern:
- Ensure the correct server, profile, or instance is available for the task.
- Navigate with pinchtab nav <url> or pinchtab instance navigate <instance-id> <url>.
- Observe with pinchtab snap -i -c, pinchtab snap --text, or pinchtab text, then collect the current refs such as e5.
- Interact with those fresh refs using click, fill, type, press, select, hover, or scroll.
- Re-snapshot or re-read text after any navigation, submit, modal open, accordion expand, or other DOM-changing action.
Rules:
- Never act on stale refs after the page changes.
- Default to pinchtab text when you need content, not layout.
- Default to pinchtab snap -i -c when you need actionable elements.
- Use screenshots only for visual verification, UI diffs, or debugging.
- Start multi-site or parallel work by choosing the right instance or profile first.
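The "never act on stale refs" rule can be made mechanical with a small wrapper. The sketch below is an illustration, not part of PinchTab: act_and_refresh is a hypothetical helper name, and it only chains the interaction commands and the snap -i -c call already listed above.

```shell
# Hypothetical helper (not a PinchTab command): run one interaction,
# then immediately take a fresh compact interactive snapshot so the
# next step never relies on refs from before the page changed.
act_and_refresh() {
  case "${1-}" in
    click|fill|type|press|select|hover|scroll) ;;
    *)
      echo "act_and_refresh: unsupported action '${1-}'" >&2
      return 2
      ;;
  esac
  # Only re-snapshot when the interaction itself succeeded.
  pinchtab "$@" && pinchtab snap -i -c
}

# Usage (refs come from the most recent snapshot):
#   act_and_refresh click e7
#   act_and_refresh fill e3 "user@example.com"
```

Restricting the wrapper to the interaction verbs keeps eval, upload, and download out of the default path, in line with the safety defaults.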
Command Chaining
Use && only when you do not need to inspect intermediate output before deciding the next step.
Good:
pinchtab nav https://example.com && pinchtab snap -i -c
pinchtab click --wait-nav e5 && pinchtab snap -i -c
pinchtab nav https://example.com --block-images && pinchtab text
Run commands separately when you must read the snapshot output first:
pinchtab nav https://example.com
pinchtab snap -i -c
# Read refs, choose the correct e#
pinchtab click e7
pinchtab snap -i -c
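When a snapshot is long, it can help to list only the refs before choosing one. The following is a rough sketch that assumes refs appear in the snapshot output as standalone tokens of the form e followed by digits (as in e5); the exact output format is not specified here, and list_refs is a hypothetical helper name.

```shell
# Hypothetical helper: read snapshot output on stdin and print each
# distinct ref (e1, e2, ...) once, in numeric order.
# Assumption: refs appear as whole words matching e<digits>.
list_refs() {
  # -o prints only the match, -w requires whole-word matches.
  # sort splits on "e" and orders by the numeric part, dropping duplicates.
  grep -owE 'e[0-9]+' | sort -t e -k 2,2n -u
}

# Usage:
#   pinchtab snap -i -c | list_refs
```

This only narrows down candidates; the surrounding snapshot lines are still needed to tell which ref belongs to which element.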
Handling Authentication and State
Pick one of these five patterns before you start interacting with the site.
1. One-off public browsing
Use a temporary instance for public pages, scraping, or tasks that do not need login persistence.
pinchtab instance start
pinchtab instances
# Point CLI commands at the instance port you want to use.
PINCHTAB_URL=http://localhost:9868 pinchtab nav https://example.com
PINCHTAB_URL=http://localhost:9868 pinchtab text
2. Reuse an existing named profile
Use this for recurring tasks against the same authenticated site.
pinchtab profiles
pinchtab instance start --profile work --mode headed
PINCHTAB_URL=http://localhost:9868 pinchtab nav https://mail.google.com
If the login is already stored in that profile, you can switch to headless later:
pinchtab instance stop <instance-id>
pinchtab instance start --profile work --mode headless
3. Create a dedicated auth profile over HTTP
Use this when you need a durable profile and it does not exist yet.
curl -X POST http://localhost:9867/profiles \
-H "Content-Type: application/json" \
-d '{"name":"billing","description":"Billing portal automation","useWhen":"Use for billing tasks"}'
curl -X POST http://localhost:9867/profiles/billing/start \
-H "Content-Type: application/json" \
-d '{"headless":false}'
Then target the returned port with PINCHTAB_URL.
4. Human-assisted headed login, then agent reuse
Use this for CAPTCHA, MFA, or first-time setup.
pinchtab instance start --profile work --mode headed
# Human completes login in the visible Chrome window.
PINCHTAB_URL=http://localhost:9868 pinchtab nav https://app.example.com/dashboard
PINCHTAB_URL=http://localhost:9868 pinchtab snap -i -c
Once the session is stored, reuse the same profile for later tasks.
5. Remote or non-shell agent with tokenized HTTP API
Use this when the agent cannot call the CLI directly.
curl http://localhost:9867/health
curl -X POST http://localhost:9867/instances/launch \
-H "Content-Type: application/json" \
-d '{"name":"work","headless":true}'
curl -X POST http://localhost:9868/action \
-H "Content-Type: application/json" \
-d '{"kind":"click","ref":"e5"}'
If the server is exposed beyond localhost, require a token and use a dedicated automation profile. See TRUST.md and config.md.
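Before sending API calls, a non-interactive agent may want to wait until the server answers its health route. The sketch below polls the /health endpoint shown above with curl; wait_for_health is a hypothetical helper, and it assumes that a ready server answers /health with a successful HTTP status.

```shell
# Hypothetical helper: poll <base>/health until it succeeds or the
# attempt budget runs out.
# Assumption: a healthy server returns a 2xx status on /health.
wait_for_health() {
  local base="${1:-http://localhost:9867}"
  local tries="${2:-10}"
  local delay="${3:-1}"
  local i=0
  while [ "$i" -lt "$tries" ]; do
    # -f: fail on HTTP errors, -s: silent, -S: still report errors.
    if curl -fsS "$base/health" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "wait_for_health: $base/health did not respond after $tries attempts" >&2
  return 1
}

# Usage:
#   wait_for_health http://localhost:9867 10 1 && echo "server is up"
```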
Essential Commands
Server and targeting
pinchtab server
pinchtab daemon
pinchtab health
pinchtab instances
pinchtab profiles
PINCHTAB_URL=http://localhost:9868 pinchtab snap -i -c
Navigation and tabs
pinchtab nav <url>
pinchtab nav <url> --new-tab
pinchtab nav <url> --tab <tab-id>
pinchtab nav <url> --block-images
pinchtab nav <url> --block-ads
pinchtab tab
pinchtab tab new <url>
pinchtab tab close <tab-id>
pinchtab instance navigate <instance-id> <url>
Observation
pinchtab snap
pinchtab snap -i
pinchtab snap -i -c
pinchtab snap -d
pinchtab snap --selector <css>
pinchtab snap --max-tokens <n>
pinchtab snap --text
pinchtab text
pinchtab text --raw
pinchtab find <query>
pinchtab find --ref-only <query>
Guidance:
- snap -i -c is the default for finding actionable refs.
- snap -d is the default follow-up snapshot for multi-step flows.
- text is the default for reading articles, dashboards, reports, or confirmation messages.
- find --ref-only is useful when the page is large and you already know the semantic target.
Interaction
pinchtab click <ref>
pinchtab click --wait-nav <ref>
pinchtab click --css <selector>
pinchtab type <ref> <text>
pinchtab fill <ref|selector> <text>
pinchtab press <key>
pinchtab hover <ref>
pinchtab select <ref> <value>
pinchtab scroll <ref|pixels>
Rules:
- Prefer fill for deterministic form entry.
- Prefer type only when the site depends on keystroke events.
- Prefer click --wait-nav when a click is expected to navigate.
- Re-snapshot immediately after click, press Enter, select, or scroll if the UI can change.
Export, debug, and verification
pinchtab screenshot
pinchtab screenshot -o /tmp/pinchtab-page.png # Format driven by extension
pinchtab screenshot -q 60
pinchtab pdf
pinchtab pdf -o /tmp/pinchtab-report.pdf
pinchtab pdf --landscape
Advanced operations: explicit opt-in only
Use these only when the task explicitly requires them and safer commands are insufficient.
pinchtab eval "document.title"
pinchtab download <url> -o /tmp/pinchtab-download.bin
pinchtab upload /absolute/path/provided-by-user.ext -s <css>
Rules:
- eval is for narrow, read-only DOM inspection unless the user explicitly asks for a page mutation.
- download should prefer a safe temporary or workspace path over an arbitrary filesystem location.
- upload requires a file path the user explicitly provided or clearly approved for the task.
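One way to honor the "safe temporary path" rule is to generate the output location rather than type it by hand. This is a minimal sketch using mktemp; safe_out_path is a hypothetical helper, and the pinchtab. directory prefix is an arbitrary choice made for this example.

```shell
# Hypothetical helper: create a fresh private temp directory and print
# a path inside it for the given bare file name.
safe_out_path() {
  local name="${1-}"
  case "$name" in
    ''|*/*|.|..)
      # Reject empty names and anything that could escape the temp dir.
      echo "safe_out_path: expected a bare file name" >&2
      return 2
      ;;
  esac
  local dir
  dir=$(mktemp -d "${TMPDIR:-/tmp}/pinchtab.XXXXXX") || return 1
  printf '%s/%s\n' "$dir" "$name"
}

# Usage:
#   pinchtab screenshot -o "$(safe_out_path page.png)"
#   pinchtab pdf -o "$(safe_out_path report.pdf)"
```

Each call creates its own directory, so exports from unrelated tasks do not overwrite one another.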
HTTP API fallback
curl -X POST http://localhost:9868/navigate \
-H "Content-Type: application/json" \
-d '{"url":"https://example.com"}'
curl "http://localhost:9868/snapshot?filter=interactive&format=compact"
curl -X POST http://localhost:9868/action \
-H "Content-Type: application/json" \
-d '{"kind":"fill","ref":"e3","text":"ada@example.com"}'
curl http://localhost:9868/text
Use the API when:
- the agent cannot shell out,
- profile creation or mutation is required,
- or you need explicit instance- and tab-scoped routes.
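Hand-written JSON bodies break easily when the text contains quotes. The sketch below builds the /action body from its parts, using only the kind, ref, and text fields that appear in the examples above. json_escape and action_payload are hypothetical helpers, and the escaping is deliberately minimal: it handles backslashes and double quotes, not newlines or control characters.

```shell
# Hypothetical helper: escape backslashes and double quotes for use
# inside a JSON string. Newlines and control characters are NOT handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print a JSON body for POST /action.
# Arguments: <kind> <ref> [text]
action_payload() {
  local kind ref text
  kind=$(json_escape "$1")
  ref=$(json_escape "$2")
  if [ "$#" -ge 3 ]; then
    text=$(json_escape "$3")
    printf '{"kind":"%s","ref":"%s","text":"%s"}\n' "$kind" "$ref" "$text"
  else
    printf '{"kind":"%s","ref":"%s"}\n' "$kind" "$ref"
  fi
}

# Usage:
#   curl -X POST http://localhost:9868/action \
#     -H "Content-Type: application/json" \
#     -d "$(action_payload fill e3 "ada@example.com")"
```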
Common Patterns
Open a page and inspect actions
pinchtab nav https://pinchtab.com && pinchtab snap -i -c
Fill and submit a form
pinchtab nav https://example.com/login
pinchtab snap -i -c
pinchtab fill e3 "user@example.com"
pinchtab fill e4 "correct horse battery staple"
pinchtab click --wait-nav e5
pinchtab text
Search, then extract the result page cheaply
pinchtab nav https://example.com
pinchtab snap -i -c
pinchtab fill e2 "quarterly report"
pinchtab press Enter
pinchtab text
Use diff snapshots in a multi-step flow
pinchtab nav https://example.com/checkout
pinchtab snap -i -c
pinchtab click e8
pinchtab snap -d -i -c
Bootstrap an authenticated profile
pinchtab profiles
pinchtab instance start --profile work --mode headed
# Human signs in once.
PINCHTAB_URL=http://localhost:9868 pinchtab text
Run separate instances for separate sites
pinchtab instance start --profile work --mode headless
pinchtab instance start --profile staging --mode headless
pinchtab instances
Then point each command stream at its own PINCHTAB_URL.
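Repeating the PINCHTAB_URL prefix on every line is error-prone when several instances are running. A small wrapper can make the target explicit per command. pt_on is a hypothetical helper name, and the ports in the usage lines are placeholders for whatever pinchtab instances reports.

```shell
# Hypothetical helper: run one pinchtab command against the instance
# listening on the given local port.
pt_on() {
  local port="${1-}"
  case "$port" in
    ''|*[!0-9]*)
      echo "pt_on: expected a numeric port, got '$port'" >&2
      return 2
      ;;
  esac
  shift
  # The assignment applies only to this single pinchtab invocation.
  PINCHTAB_URL="http://localhost:${port}" pinchtab "$@"
}

# Usage (ports are examples; read the real ones from `pinchtab instances`):
#   pt_on 9868 nav https://app.example.com/dashboard
#   pt_on 9869 text
```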
Security and Token Economy
- Use a dedicated automation profile, not a daily browsing profile.
- If PinchTab is reachable off-machine, require a token and bind conservatively.
- Prefer text, snap -i -c, and snap -d before screenshots, PDFs, eval, downloads, or uploads.
- Use --block-images for read-heavy tasks that do not need visual assets.
- Stop or isolate instances when switching between unrelated accounts or environments.
Diffing and Verification
- Use pinchtab snap -d after each state-changing action in long workflows.
- Use pinchtab text to confirm success messages, table updates, or navigation outcomes.
- Use pinchtab screenshot only when visual regressions, CAPTCHA, or layout-specific confirmation matters.
- If a ref disappears after a change, treat that as expected and fetch fresh refs instead of retrying the stale one.
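Confirming an outcome with pinchtab text can be reduced to a pass/fail check. The sketch below is a hypothetical helper, expect_text, that searches the extracted page text for a literal string; it assumes pinchtab text writes the page text to standard output.

```shell
# Hypothetical helper: succeed only if the current page text contains
# the given literal string.
# Assumption: `pinchtab text` prints the extracted text to stdout.
expect_text() {
  local pattern="${1-}"
  if [ -z "$pattern" ]; then
    echo "usage: expect_text <literal-string>" >&2
    return 2
  fi
  # -F: treat the pattern as a fixed string, -q: no output, status only.
  if pinchtab text | grep -qF -- "$pattern"; then
    return 0
  fi
  echo "expect_text: '$pattern' not found in page text" >&2
  return 1
}

# Usage:
#   pinchtab click --wait-nav e5
#   expect_text "Order confirmed" || pinchtab snap -i -c
```

Falling back to a fresh snapshot on failure gives the agent current refs to work with instead of retrying blindly.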
Privacy and Security
PinchTab is a fully open-source, local-only browser automation tool:
- Runs on localhost only. The server binds to 127.0.0.1 by default. No external network calls are made by PinchTab itself.
- No telemetry or analytics. The binary makes zero outbound connections.
- Single Go binary (~16 MB). Fully verifiable: anyone can build from source at github.com/pinchtab/pinchtab.
- Local Chrome profiles. Persistent profiles store cookies and sessions on your machine only. This enables agents to reuse authenticated sessions without re-entering credentials, similar to how a human reuses their browser profile.
- Token-efficient by design. Uses the accessibility tree (structured text) instead of screenshots, keeping agent context windows small. Comparable to Playwright but purpose-built for AI agents.
- Multi-instance isolation. Each browser instance runs in its own profile directory with tab-level locking for safe multi-agent use.
References
- Command surface: commands.md
- CLI overview: cli.md
- Profiles: profiles.md
- Instances: instances.md
- Full API: api.md
- Minimal env vars: env.md
- Config reference: config.md
- Security model: TRUST.md