Browser - Code-First Browser Automation

Browser automation and web verification using code-first Playwright.

File-Based MCP

This skill is a file-based MCP: it executes pre-written scripts rather than generating new code.

Why file-based? Data is filtered in code BEFORE it is returned to the model context, which saves 99%+ of the tokens.
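As a rough illustration of "filter in code before returning", the sketch below reduces a page dump to the few lines that matter before anything reaches the model. The function name `filterForModel`, its parameters, and the sample page text are hypothetical and are not part of the Browser skill.

```typescript
// Hypothetical helper (not part of the Browser skill): keep only the lines
// of a page dump that mention a keyword, capped at maxLines, so the model
// receives a handful of lines instead of the whole page.
function filterForModel(pageText: string, keyword: string, maxLines = 5): string[] {
  const needle = keyword.toLowerCase();
  return pageText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && line.toLowerCase().indexOf(needle) !== -1)
    .slice(0, maxLines);
}

// Made-up page text standing in for extracted page content.
const rawPage = [
  "Home | About | Contact",
  "Pricing starts at $10 per month",
  "",
  "Footer links and legal text",
  "Enterprise pricing is available on request",
].join("\n");

const relevant = filterForModel(rawPage, "pricing");
console.log(relevant); // two lines survive; navigation and footer are dropped
```

The saving comes from doing the reduction inside the script: the three irrelevant lines never enter the context at all.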

STOP - CLI First, Always

The Wrong Pattern

DO NOT write new TypeScript code for simple browser tasks:

// WRONG - Writing new code defeats the purpose of file-based MCPs
import { PlaywrightBrowser } from '$PAI_DIR/skills/Browser/index.ts'

const browser = new PlaywrightBrowser()
await browser.launch({ headless: true })
await browser.navigate('https://example.com')
await browser.screenshot({ path: '/tmp/shot.png' })
await browser.close()

Problems with this approach:

You're writing 5+ lines of boilerplate every time
You manage browser lifecycle manually
You duplicate what the CLI already does
You're generating new code instead of executing existing code

The Right Pattern

USE the CLI tool - it executes pre-written code:

RIGHT - One command, zero boilerplate

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts screenshot https://example.com /tmp/shot.png

Benefits:

One command, instant execution
Lifecycle handled automatically
Error handling built-in
TRUE file-based MCP pattern

CLI Commands (Primary Interface)

Location: $PAI_DIR/skills/Browser/Tools/Browse.ts

screenshot - Take a screenshot

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts screenshot <url> [output-path]

Examples:

Screenshot to default location

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts screenshot https://danielmiessler.com

Screenshot to specific file

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts screenshot https://example.com /tmp/example.png

verify - Check element exists

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts verify <url> <selector>

Examples:

Verify body exists

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts verify https://example.com "body"

Verify specific element

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts verify https://danielmiessler.com "h1"

Verify by CSS selector

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts verify https://example.com ".main-content"

open - Open URL in visible browser

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts open <url>

Examples:

Open site for manual inspection

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts open https://danielmiessler.com
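If the CLI is invoked from another script, the three command shapes above can be assembled by a small helper. This is a sketch only: `buildBrowseArgs` and the choice to pass the PAI directory as a parameter are hypothetical; the subcommands and argument order are taken from the usage lines above.

```typescript
// The three subcommands documented above.
type BrowseCommand = "screenshot" | "verify" | "open";

// Hypothetical helper: builds the argument vector for a Browse.ts invocation.
// `paiDir` stands in for the value of $PAI_DIR.
function buildBrowseArgs(
  paiDir: string,
  command: BrowseCommand,
  url: string,
  extra?: string, // output path for screenshot, selector for verify
): string[] {
  if (command === "verify" && extra === undefined) {
    throw new Error("verify requires a selector");
  }
  if (command === "open" && extra !== undefined) {
    throw new Error("open takes only a URL");
  }
  const args = ["bun", "run", `${paiDir}/skills/Browser/Tools/Browse.ts`, command, url];
  if (extra !== undefined) {
    args.push(extra);
  }
  return args;
}

console.log(buildBrowseArgs("/pai", "screenshot", "https://example.com", "/tmp/shot.png").join(" "));
console.log(buildBrowseArgs("/pai", "verify", "https://example.com", ".main-content").join(" "));
console.log(buildBrowseArgs("/pai", "open", "https://example.com").join(" "));
```

Returning an array rather than a single string keeps selectors with spaces intact when the arguments are handed to a process spawner.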

Decision Tree: When to Use What

                What are you trying to do?
                          |
       ┌──────────────────┴──────────────────┐
       ▼                                     ▼
┌─────────────┐                      ┌─────────────┐
│   SIMPLE    │                      │   COMPLEX   │
│ Single task │                      │ Multi-step  │
└─────────────┘                      └─────────────┘
       │                                     │
       ▼                                     ▼
┌─────────────┐                      ┌─────────────┐
│ • Screenshot│                      │ • Form fill │
│ • Verify    │                      │ • Auth flow │
│ • Open URL  │                      │ • Conditionals│
└─────────────┘                      └─────────────┘
       │                                     │
       ▼                                     ▼
┌─────────────┐                      ┌─────────────┐
│ USE CLI     │                      │ USE WORKFLOW│
│ Browse.ts   │                      │ or API      │
└─────────────┘                      └─────────────┘

Quick Reference

| Task                   | Use CLI? | Use TypeScript? |
|------------------------|----------|-----------------|
| Take screenshot        | YES      | NO              |
| Verify element exists  | YES      | NO              |
| Open page visually     | YES      | NO              |
| Fill multi-field form  | NO       | YES (Workflow)  |
| Authentication flow    | NO       | YES (Workflow)  |
| Conditional logic      | NO       | YES (API)       |
| Multi-step interaction | NO       | YES (Workflow)  |

The Rule: Can you describe it in ONE action? (screenshot, verify, open) → CLI
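The rule and the table above can be read as a small decision function. The sketch below is one possible encoding; the `Task` shape and the name `chooseTool` are hypothetical, and the ordering (conditionals first, then step count) is an interpretation of the table rather than something the skill defines.

```typescript
type Tool = "CLI" | "Workflow" | "API";

// Hypothetical description of a browser task.
interface Task {
  steps: number;              // how many distinct actions the task needs
  needsConditionals: boolean; // whether later actions depend on earlier results
}

// One reading of the quick reference: conditional logic goes to the API,
// other multi-step work goes to a workflow, and a single action goes to the CLI.
function chooseTool(task: Task): Tool {
  if (task.needsConditionals) {
    return "API";
  }
  if (task.steps > 1) {
    return "Workflow";
  }
  return "CLI";
}

console.log(chooseTool({ steps: 1, needsConditionals: false })); // CLI
console.log(chooseTool({ steps: 4, needsConditionals: false })); // Workflow
console.log(chooseTool({ steps: 3, needsConditionals: true }));  // API
```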

VERIFY Phase Integration

The Browser skill is MANDATORY for the VERIFY phase of any web change.

Using CLI for Verification

Before claiming ANY web change is "live" or "working":

1. Take screenshot of the changed page

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts screenshot https://example.com/changed-page /tmp/verify.png

2. Verify the specific element that changed

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts verify https://example.com/changed-page ".changed-element"

Then use the Read tool to view the screenshot:

Read /tmp/verify.png

If you haven't LOOKED at the rendered page, you CANNOT claim it works.
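The verification rule can be expressed as a simple gate: a change counts as working only when all three pieces of evidence exist. This is an illustrative sketch; the `VerifyEvidence` shape and `canClaimWorking` are hypothetical names, not part of the skill.

```typescript
// Hypothetical record of what was actually done during the VERIFY phase.
interface VerifyEvidence {
  screenshotPath: string | null; // where the screenshot was written, if taken
  elementFound: boolean;         // result of the verify command
  screenshotViewed: boolean;     // whether the screenshot was opened and inspected
}

// A change may be called "working" only if the screenshot exists,
// the changed element was found, and the screenshot was looked at.
function canClaimWorking(evidence: VerifyEvidence): boolean {
  return evidence.screenshotPath !== null && evidence.elementFound && evidence.screenshotViewed;
}

console.log(canClaimWorking({ screenshotPath: "/tmp/verify.png", elementFound: true, screenshotViewed: false })); // false
console.log(canClaimWorking({ screenshotPath: "/tmp/verify.png", elementFound: true, screenshotViewed: true }));  // true
```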

Workflow Routing

For complex, multi-step tasks, use the pre-built workflows:

| Trigger                        | Workflow                |
|--------------------------------|-------------------------|
| Fill forms, interact with page | Workflows/Interact.md   |
| Extract page content           | Workflows/Extract.md    |
| Complex verification sequence  | Workflows/VerifyPage.md |
| Screenshot with custom options | Workflows/Screenshot.md |

Workflows use the TypeScript API internally but are pre-written.

Advanced: TypeScript API

Only use this for custom automation that CLI cannot handle.

Before using this API, ask yourself:

Did I check if CLI can do this? (screenshot/verify/open)
Is this a multi-step workflow? (not just one action)
Do I need conditional logic between actions?

If the CLI can do it, or the task is neither multi-step nor conditional, use the CLI instead.

Quick Start (Advanced Users Only)

import { PlaywrightBrowser } from '$PAI_DIR/skills/Browser/index.ts'

const browser = new PlaywrightBrowser()
await browser.launch({ headless: true })
await browser.navigate('https://example.com')
// ... custom logic here ...
await browser.close()

API Reference

Navigation:

launch(options?) - Start browser
navigate(url) - Go to URL
goBack() / goForward() - History navigation
reload() - Refresh page
close() - Shut down browser

Capture:

screenshot({ path, fullPage, selector }) - Take screenshot
getVisibleText(selector?) - Extract text
getVisibleHtml(options) - Get HTML
savePdf(path) - Export PDF
getAccessibilityTree() - A11y snapshot

Interaction:

click(selector) - Click element
fill(selector, value) - Fill input
type(selector, text, delay?) - Type with delay
select(selector, value) - Select dropdown
pressKey(key) - Keyboard input
hover(selector) - Mouse hover
drag(source, target) - Drag and drop
uploadFile(selector, path) - File upload

Waiting:

waitForSelector(selector, options) - Wait for element
waitForText(text, options) - Wait for text
waitForNavigation(options) - Wait for page load
waitForNetworkIdle(timeout?) - Wait for idle
wait(ms) - Fixed delay

JavaScript:

evaluate(script) - Run JS
getConsoleLogs(options) - Get console output
setUserAgent(ua) - Change user agent

Viewport:

resize(width, height) - Set size
setDevice(name) - Emulate device
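A multi-step flow built on the methods listed above might look like the following sketch. It is written against a local `BrowserLike` interface so it stands alone; the method names come from the reference, but the `Promise<void>` return types, the selectors, the screenshot path, and the function name `submitLogin` are assumptions made for illustration. A real `PlaywrightBrowser` instance would be passed in only if it matches these signatures.

```typescript
// Subset of the API reference above. Return types are assumed, not confirmed.
interface BrowserLike {
  launch(options?: { headless?: boolean }): Promise<void>;
  navigate(url: string): Promise<void>;
  fill(selector: string, value: string): Promise<void>;
  click(selector: string): Promise<void>;
  waitForNetworkIdle(timeout?: number): Promise<void>;
  screenshot(options: { path: string }): Promise<void>;
  close(): Promise<void>;
}

// Hypothetical login flow: fill two fields, submit, wait, capture evidence.
// The selectors below are placeholders for whatever the target page uses.
async function submitLogin(
  browser: BrowserLike,
  url: string,
  user: string,
  password: string,
): Promise<void> {
  await browser.launch({ headless: true });
  try {
    await browser.navigate(url);
    await browser.fill("#username", user);
    await browser.fill("#password", password);
    await browser.click("button[type=submit]");
    await browser.waitForNetworkIdle();
    await browser.screenshot({ path: "/tmp/after-login.png" });
  } finally {
    // Close the browser even when a step above throws.
    await browser.close();
  }
}
```

The `try`/`finally` is the part the CLI otherwise handles for you: once the lifecycle is managed by hand, a failed step must not leave a browser process running.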

Token Savings

| Approach       | Tokens  | Notes                     |
|----------------|---------|---------------------------|
| Playwright MCP | ~13,700 | Loaded at startup, always |
| CLI tool       | ~0      | Executes pre-written code |
| TypeScript API | ~50-200 | Only what you write       |
| CLI Savings    | 99%+    | Compared to MCP           |
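The savings figure follows from simple arithmetic on the table's own numbers. The sketch below computes it; `savingsPercent` is a hypothetical helper, and the token counts are the approximate values quoted above, not measurements.

```typescript
// Percentage of tokens saved relative to a baseline.
function savingsPercent(baselineTokens: number, approachTokens: number): number {
  return ((baselineTokens - approachTokens) / baselineTokens) * 100;
}

const MCP_TOKENS = 13700; // approximate Playwright MCP cost from the table

console.log(savingsPercent(MCP_TOKENS, 0).toFixed(1));   // CLI tool: "100.0"
console.log(savingsPercent(MCP_TOKENS, 50).toFixed(1));  // API, low end: "99.6"
console.log(savingsPercent(MCP_TOKENS, 200).toFixed(1)); // API, high end: "98.5"
```

On these figures the 99%+ claim holds for the CLI, while the TypeScript API lands between roughly 98.5% and 99.6% depending on how much code is written.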
