actionbook

When to Use This Skill

Activate when the user:

Needs to do anything on a website ("Send a LinkedIn message", "Book an Airbnb", "Search Google for...")
Asks how to interact with a site ("How do I post a tweet?", "How to apply on LinkedIn?")
Wants to fill out forms, click buttons, navigate, search, filter, or browse on a specific site
Wants to take a screenshot of a web page or monitor changes
Builds browser-based AI agents, web scrapers, or E2E tests for external websites
Automates repetitive web tasks (data entry, form submission, content posting)
Needs to operate multiple websites or tabs concurrently

How It Works

Actionbook provides up-to-date action manuals for the modern web. Action manuals tell agents exactly what to do on a page — no parsing, no guessing.

Why this matters:

10x faster — action manuals provide selectors and page structure upfront. No snapshot-per-step loop needed.
Accurate — handles SPAs, streaming components, dropdowns, date pickers, and dynamic content reliably.
Concurrent — stateless architecture with explicit --session/--tab. Operate dozens of tabs in parallel.

The workflow:

Start a browser session
Navigate to the target page
Snapshot to get the page structure with element refs
Automate using refs from the snapshot

Run actionbook <command> --help for full usage and examples of any command.

Browser Automation

Every browser command is stateless — pass --session and --tab explicitly. No "current tab" — you can run commands on any session/tab in parallel.

Start a session

actionbook browser start --set-session-id s1

Both --session and --set-session-id are get-or-create: they reuse a Running session with the given ID, or create one if not found. If --profile is passed and does not match the session's bound profile, the command fails with SESSION_PROFILE_MISMATCH.

Core workflow: snapshot, act, wait

actionbook browser goto <url> --session s1 --tab t1
actionbook browser snapshot --session s1 --tab t1          # Get page structure with refs
actionbook browser fill @e3 "text" --session s1 --tab t1   # Use refs from snapshot
actionbook browser click @e7 --session s1 --tab t1
actionbook browser wait navigation --session s1 --tab t1   # Wait for page load

Snapshot refs

snapshot labels every element with a ref (e.g. @e3, @e7). Use these refs as selectors in any command — they are the recommended way to target elements.

Refs are stable across snapshots — if the element stays the same, the ref stays the same. This lets you chain multiple commands without re-snapshotting after every step.

Command categories

All commands support --help for full usage and examples.

Category	Key commands	Help
Search	`search`	`actionbook search --help`
Manual	`manual` (alias: `man`)	`actionbook manual --help`
Session	`start`, `close`, `restart`, `list-sessions`, `status`	`actionbook browser start --help`
Tab	`new-tab`, `close-tab`, `list-tabs`	`actionbook browser new-tab --help`
Navigation	`goto`, `back`, `forward`, `reload`	`actionbook browser goto --help`
Observation	`snapshot`, `text`, `html`, `value`, `title`, `url`, `viewport`, `attr`, `attrs`, `box`, `styles`, `describe`, `state`, `inspect-point`, `screenshot`, `pdf`	`actionbook browser snapshot --help`
Interaction	`click`, `fill`, `type`, `press`, `select`, `hover`, `focus`, `scroll`, `drag`, `upload`, `eval`, `mouse-move`, `cursor-position`	`actionbook browser click --help`
Wait	`wait element`, `wait navigation`, `wait network-idle`, `wait condition`	`actionbook browser wait element --help`
Cookies	`cookies list`, `cookies get`, `cookies set`, `cookies delete`, `cookies clear`	`actionbook browser cookies list --help`
Storage	`local-storage list\|get\|set\|delete\|clear`, `session-storage ...`	`actionbook browser local-storage get --help`
Logs	`logs console`, `logs errors`	`actionbook browser logs console --help`
Network	`network requests`, `network request <id>`, `network har start`, `network har stop`	`actionbook browser network requests --help`
Query	`query one\|all\|nth\|count`	`actionbook browser query --help`
Batch	`batch-new-tab`, `batch-snapshot`, `batch-click`	`actionbook browser batch-new-tab --help`
Extension	`extension status`, `extension ping`, `extension install`, `extension uninstall`, `extension path`	`actionbook extension status --help`
Daemon	`daemon restart`	`actionbook daemon restart --help`

Full command reference: command-reference.md

Cloud providers

Use -p / --provider with browser start to run sessions on a remote browser instead of launching local Chrome. Supported providers: driver, hyperbrowser, browseruse. Each reads its own <PROVIDER>_API_KEY from the shell env.

export HYPERBROWSER_API_KEY="your-key"
actionbook browser start -p hyperbrowser --session s1
actionbook browser goto "https://example.com" --session s1 --tab t1
actionbook browser snapshot --session s1 --tab t1

All browser commands work the same way regardless of mode. browser restart --session <id> mints a fresh remote session while preserving the session_id.

Example: End-to-End

User request: "Find a room next week in SF on Airbnb"

actionbook browser start --set-session-id s1
actionbook browser goto "https://airbnb.com" --session s1 --tab t1
actionbook browser snapshot --session s1 --tab t1
actionbook browser fill @e3 "San Francisco" --session s1 --tab t1
actionbook browser click @e7 --session s1 --tab t1
actionbook browser wait navigation --session s1 --tab t1

Eval Input Sources

browser eval accepts the expression from three mutually-exclusive sources:

Positional: actionbook browser eval "expr" ...
--file: actionbook browser eval --file script.js ...
Stdin: echo 'expr' | actionbook browser eval - ...

Eval Error Handling

browser eval returns structured error codes on failure — branch on error.code instead of parsing the message:

EVAL_RUNTIME_ERROR — JS exception. Inspect the expression before retrying.
EVAL_CROSS_ORIGIN — cross-origin fetch or CSP block. Proxy the request server-side.
EVAL_RESPONSE_NOT_JSON / EVAL_RESPONSE_NOT_OK — read error.details.body_head (first ≤256 chars of the response body) to distinguish 403 / challenge pages / CORS errors. Do not blindly retry.
EVAL_TIMEOUT — expression exceeded --timeout. Reduce work or raise the timeout.
EVAL_ARGS_CONFLICT — multiple input sources or none. Provide exactly one.
EVAL_FILE_NOT_FOUND — --file path unreadable. Verify the path.
EVAL_STDIN_TTY — - but stdin is a terminal. Pipe the expression.
EVAL_STDIN_EMPTY — stdin produced empty input. Verify the upstream pipeline.

CDP Error Handling

Browser commands that interact with elements, navigate, or communicate via CDP return structured error codes — branch on error.code:

CDP_NODE_NOT_FOUND — DOM node is stale. Call snapshot to refresh refs then retry.
CDP_NOT_INTERACTABLE — element exists but can't be acted on. Scroll into view, wait for visibility, or dismiss overlays.
CDP_NAV_TIMEOUT — navigation timeout. Increase --timeout or verify URL reachability. Retryable.
CDP_TARGET_CLOSED — tab navigated away or session torn down mid-command. Start a fresh session. Retryable.
CDP_PROTOCOL_ERROR — CDP response malformed. Inspect details.reason and details.cdp_code.
CDP_GENERIC — unclassified CDP error (transport/parse). No specific remediation.

CDP_NAV_TIMEOUT and CDP_TARGET_CLOSED are retryable (error.retryable == true). All other CDP codes require caller intervention before retrying. When error.code is a CDP_* code, error.details includes reason and cdp_code when available.

Selectors

Selectors should come from actionbook browser snapshot — not from prior knowledge or memory. Always snapshot first to get current refs, then use those refs to interact with the page.

Login Page Handling

When you hit a login/auth wall (sign-in page, password prompt, MFA/OTP, CAPTCHA, account chooser):

Pause automation and keep the current browser session open (same tab/profile/cookies).
Ask the user to complete login manually in that same browser window.
After user confirms login is done, continue in the same session.
If the post-login page is different, run actionbook browser snapshot to get the new page structure before continuing.

Do not switch tools just because a login page appears.

Session Cleanup

browser close is idempotent — closing an unknown or already-closed session returns ok: true with a warning in meta.warnings, not a fatal error. A typo in the session ID or a session that was already torn down is no longer an error condition.

Safe to call browser close unconditionally during cleanup without checking session existence first.
Read meta.warnings to distinguish a fresh close from an already-gone session. Do not treat a warning inside an ok: true response as a signal that the session is still alive.
If another close is already in flight for the same session, the command returns SESSION_CLOSING (fatal).

HAR Recording

network har start accepts --max-entries N to set the ring-buffer cap (default: 10000). When har stop detects dropped entries (data.dropped > 0), the envelope includes meta.truncated = true and a HAR_TRUNCATED warning in meta.warnings. Read data.max_entries to see the configured cap. Raise --max-entries or stop recording sooner to keep the full trace.

References

Reference	Description
command-reference.md	Complete command reference with all flags and options
authentication.md	Login flows, OAuth, 2FA handling, session persistence

Safety Notice

Copy this and send it to your AI assistant to learn