Dev Browser Skill
Browser automation that maintains page state across script executions. Write small, focused scripts to accomplish tasks incrementally. Once you've proven out part of a workflow and there is repeated work to be done, you can write a script to do the repeated work in a single execution.
Choosing Your Approach
-
Local/source-available sites: Read the source code first to write selectors directly
-
Unknown page layouts: Use getAISnapshot() to discover elements and selectSnapshotRef() to interact with them
-
Visual feedback: Take screenshots to see what the user sees
Setup
Installation: See references/installation.md for detailed setup instructions including Windows support.
Two modes available. Ask the user if unclear which to use.
Standalone Mode (Default)
Launches a new Chromium browser for fresh automation sessions.
./skills/dev-browser/server.sh &
Add --headless flag if user requests it. Wait for the Ready message before running scripts.
Extension Mode
Connects to user's existing Chrome browser. Use this when:
-
The user is already logged into sites and wants you to do things behind an authed experience that isn't local dev.
-
The user asks you to use the extension
Important: The core flow is still the same. You create named pages inside of their browser.
Start the relay server:
cd skills/dev-browser && npm i && npm run start-extension &
Wait for Waiting for extension to connect... followed by Extension connected in the console. To know that a client has connected and the browser is ready to be controlled. Workflow:
-
Scripts call client.page("name") just like the normal mode to create new pages / connect to existing ones.
-
Automation runs on the user's actual browser session
If the extension hasn't connected yet, tell the user to launch and activate it. Download link: https://github.com/SawyerHood/dev-browser/releases
Writing Scripts
Run all scripts from skills/dev-browser/ directory. The @/ import alias requires this directory's config.
Execute scripts inline using heredocs:
cd skills/dev-browser && npx tsx <<'EOF' import { connect, waitForPageLoad } from "@/client.js";
const client = await connect(); // Create page with custom viewport size (optional) const page = await client.page("example", { viewport: { width: 1920, height: 1080 } });
await page.goto("https://example.com"); await waitForPageLoad(page);
console.log({ title: await page.title(), url: page.url() }); await client.disconnect(); EOF
Write to tmp/ files only when the script needs reuse, is complex, or user explicitly requests it.
Key Principles
-
Small scripts: Each script does ONE thing (navigate, click, fill, check)
-
Evaluate state: Log/return state at the end to decide next steps
-
Descriptive page names: Use "checkout" , "login" , not "main"
-
Disconnect to exit: await client.disconnect()
-
pages persist on server
-
Plain JS in evaluate: page.evaluate() runs in browser - no TypeScript syntax
Workflow Loop
Follow this pattern for complex tasks:
-
Write a script to perform one action
-
Run it and observe the output
-
Evaluate - did it work? What's the current state?
-
Decide - is the task complete or do we need another script?
-
Repeat until task is done
No TypeScript in Browser Context
Code passed to page.evaluate() runs in the browser, which doesn't understand TypeScript:
// ✅ Correct: plain JavaScript const text = await page.evaluate(() => { return document.body.innerText; });
// ❌ Wrong: TypeScript syntax will fail at runtime const text = await page.evaluate(() => { const el: HTMLElement = document.body; // Type annotation breaks in browser! return el.innerText; });
Scraping Data
For scraping large datasets, intercept and replay network requests rather than scrolling the DOM. See references/scraping.md for the complete guide covering request capture, schema discovery, and paginated API replay.
Client API
const client = await connect();
// Get or create named page (viewport only applies to new pages) const page = await client.page("name"); const pageWithSize = await client.page("name", { viewport: { width: 1920, height: 1080 } });
const pages = await client.list(); // List all page names await client.close("name"); // Close a page await client.disconnect(); // Disconnect (pages persist)
// ARIA Snapshot methods const snapshot = await client.getAISnapshot("name"); // Get accessibility tree const element = await client.selectSnapshotRef("name", "e5"); // Get element by ref
The page object is a standard Playwright Page.
Waiting
import { waitForPageLoad } from "@/client.js";
await waitForPageLoad(page); // After navigation await page.waitForSelector(".results"); // For specific elements await page.waitForURL("**/success"); // For specific URL
Inspecting Page State
Screenshots
await page.screenshot({ path: "tmp/screenshot.png" }); await page.screenshot({ path: "tmp/full.png", fullPage: true });
ARIA Snapshot (Element Discovery)
Use getAISnapshot() to discover page elements. Returns YAML-formatted accessibility tree:
- banner:
- link "Hacker News" [ref=e1]
- navigation:
- link "new" [ref=e2]
- main:
- list:
- listitem:
- link "Article Title" [ref=e8]
- link "328 comments" [ref=e9]
- listitem:
- list:
- contentinfo:
- textbox [ref=e10]
- /placeholder: "Search"
- textbox [ref=e10]
Interpreting refs:
-
[ref=eN]
-
Element reference for interaction (visible, clickable elements only)
-
[checked] , [disabled] , [expanded]
-
Element states
-
[level=N]
-
Heading level
-
/url: , /placeholder:
-
Element properties
Interacting with refs:
const snapshot = await client.getAISnapshot("hackernews"); console.log(snapshot); // Find the ref you need
const element = await client.selectSnapshotRef("hackernews", "e2"); await element.click();
Error Recovery
Page state persists after failures. Debug with:
cd skills/dev-browser && npx tsx <<'EOF' import { connect } from "@/client.js";
const client = await connect(); const page = await client.page("hackernews");
await page.screenshot({ path: "tmp/debug.png" }); console.log({ url: page.url(), title: await page.title(), bodyText: await page.textContent("body").then((t) => t?.slice(0, 200)), });
await client.disconnect(); EOF