browser

Browser - Code-First Browser Automation

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "browser" with this command: npx skills add robdtaylor/personal-ai-infrastructure/robdtaylor-personal-ai-infrastructure-browser

Browser automation and web verification using code-first Playwright.

File-Based MCP

This skill is a file-based MCP: it executes pre-written scripts rather than generating new code.

Why file-based? Data is filtered in code BEFORE it is returned to the model context, which the comparison under Token Savings puts at 99%+ fewer tokens than a Playwright MCP.
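As a rough illustration of filtering in code before anything reaches the model, the sketch below trims a block of extracted page text down to the few lines that match a keyword. The helper name `filterForModel`, its parameters, and the sample text are hypothetical and not part of the skill.

```typescript
// Hypothetical helper, not part of the Browser skill: keep only the lines
// of extracted page text that mention a keyword, capped at maxLines.
function filterForModel(pageText: string, keyword: string, maxLines = 5): string[] {
  const needle = keyword.toLowerCase();
  return pageText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && line.toLowerCase().includes(needle))
    .slice(0, maxLines);
}

// Made-up page text: most lines are irrelevant to a pricing question.
const sampleText = [
  "Home | About | Contact",
  "Price: $19 per month",
  "Lorem ipsum dolor sit amet",
  "Annual price: $190",
  "Footer links",
].join("\n");

// Only the two matching lines would be handed back to the model.
const relevant = filterForModel(sampleText, "price");
```

Only the filtered lines enter the context, so the cost scales with what was asked for rather than with the size of the page.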

STOP - CLI First, Always

The Wrong Pattern

DO NOT write new TypeScript code for simple browser tasks:

// WRONG - Writing new code defeats the purpose of file-based MCPs
import { PlaywrightBrowser } from '$PAI_DIR/skills/Browser/index.ts'

const browser = new PlaywrightBrowser()
await browser.launch({ headless: true })
await browser.navigate('https://example.com')
await browser.screenshot({ path: '/tmp/shot.png' })
await browser.close()

Problems with this approach:

  • You're writing 5+ lines of boilerplate every time

  • You manage browser lifecycle manually

  • You duplicate what the CLI already does

  • You're generating new code instead of executing existing code

The Right Pattern

USE the CLI tool - it executes pre-written code:

RIGHT - One command, zero boilerplate

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts screenshot https://example.com /tmp/shot.png

Benefits:

  • One command, instant execution

  • Lifecycle handled automatically

  • Error handling built-in

  • TRUE file-based MCP pattern

CLI Commands (Primary Interface)

Location: $PAI_DIR/skills/Browser/Tools/Browse.ts

screenshot - Take a screenshot

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts screenshot <url> [output-path]

Examples:

Screenshot to default location

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts screenshot https://danielmiessler.com

Screenshot to specific file

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts screenshot https://example.com /tmp/example.png

verify - Check element exists

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts verify <url> <selector>

Examples:

Verify body exists

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts verify https://example.com "body"

Verify specific element

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts verify https://danielmiessler.com "h1"

Verify by CSS selector

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts verify https://example.com ".main-content"

open - Open URL in visible browser

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts open <url>

Examples:

Open site for manual inspection

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts open https://danielmiessler.com
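When one of these commands has to be launched from a script, building it as an argument list avoids shell-quoting problems with selectors. The sketch below only assembles that list from the three documented command shapes. The function name `browseArgv` and the directory `/opt/pai` (standing in for the value of `$PAI_DIR`) are assumptions made for this example.

```typescript
// Sketch only: assembles the argument list for the three documented
// Browse.ts commands. paiDir stands in for the value of $PAI_DIR.
type BrowseCommand = "screenshot" | "verify" | "open";

function browseArgv(
  paiDir: string,
  command: BrowseCommand,
  url: string,
  extra?: string, // output path for screenshot, selector for verify
): string[] {
  if (command === "verify" && extra === undefined) {
    throw new Error("verify requires a selector");
  }
  if (command === "open" && extra !== undefined) {
    throw new Error("open takes only a URL");
  }
  const script = `${paiDir}/skills/Browser/Tools/Browse.ts`;
  const argv = ["bun", "run", script, command, url];
  if (extra !== undefined) argv.push(extra);
  return argv;
}

// "/opt/pai" is a made-up install location used only for illustration.
const shotArgv = browseArgv("/opt/pai", "screenshot", "https://example.com", "/tmp/example.png");
```

The resulting array can be handed to any process launcher that accepts a command plus arguments, for example Node's `child_process.spawnSync(shotArgv[0], shotArgv.slice(1))`.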

Decision Tree: When to Use What

                What are you trying to do?
                          |
       ┌──────────────────┴──────────────────┐
       ▼                                     ▼
┌─────────────┐                      ┌─────────────┐
│   SIMPLE    │                      │   COMPLEX   │
│ Single task │                      │ Multi-step  │
└─────────────┘                      └─────────────┘
       │                                     │
       ▼                                     ▼
┌─────────────┐                      ┌─────────────┐
│ • Screenshot│                      │ • Form fill │
│ • Verify    │                      │ • Auth flow │
│ • Open URL  │                      │ • Conditionals│
└─────────────┘                      └─────────────┘
       │                                     │
       ▼                                     ▼
┌─────────────┐                      ┌─────────────┐
│ USE CLI     │                      │ USE WORKFLOW│
│ Browse.ts   │                      │ or API      │
└─────────────┘                      └─────────────┘

Quick Reference

| Task | Use CLI? | Use TypeScript? |
|---|---|---|
| Take screenshot | YES | NO |
| Verify element exists | YES | NO |
| Open page visually | YES | NO |
| Fill multi-field form | NO | YES (Workflow) |
| Authentication flow | NO | YES (Workflow) |
| Conditional logic | NO | YES (API) |
| Multi-step interaction | NO | YES (Workflow) |

The Rule: Can you describe it in ONE action? (screenshot, verify, open) → CLI
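The rule can be read as a small decision function. The sketch below is one possible encoding of the quick-reference rows. The `TaskShape` fields are hypothetical names, and a one-step task is assumed to be one of screenshot, verify, or open.

```typescript
// Sketch of the quick-reference rule. The TaskShape fields are hypothetical
// names chosen for this example.
interface TaskShape {
  steps: number;              // how many distinct browser actions are needed
  needsConditionals: boolean; // does a later action depend on page state?
}

type Route = "CLI" | "Workflow" | "API";

function chooseRoute(task: TaskShape): Route {
  if (task.needsConditionals) return "API"; // "Conditional logic" row
  if (task.steps > 1) return "Workflow";    // forms, auth, multi-step rows
  return "CLI";                             // one action: screenshot, verify, open
}

const screenshotRoute = chooseRoute({ steps: 1, needsConditionals: false });
const loginRoute = chooseRoute({ steps: 4, needsConditionals: false });
const branchingRoute = chooseRoute({ steps: 3, needsConditionals: true });
```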

VERIFY Phase Integration

The Browser skill is MANDATORY for the VERIFY phase of web changes.

Using CLI for Verification

Before claiming ANY web change is "live" or "working":

1. Take screenshot of the changed page

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts screenshot https://example.com/changed-page /tmp/verify.png

2. Verify the specific element that changed

bun run $PAI_DIR/skills/Browser/Tools/Browse.ts verify https://example.com/changed-page ".changed-element"

Then use the Read tool to view the screenshot:

Read /tmp/verify.png

If you haven't LOOKED at the rendered page, you CANNOT claim it works.
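One way to make this gate explicit is to track the evidence gathered for a changed page and refuse the claim while anything is missing. The sketch below is illustrative only. The `VerifyEvidence` record and both function names are hypothetical.

```typescript
// Hypothetical record of what has been done for one changed page.
interface VerifyEvidence {
  screenshotPath?: string;   // set once the screenshot command has run
  elementVerified: boolean;  // outcome of the verify command
  screenshotViewed: boolean; // true only after the image was actually read
}

// Lists the verification steps that are still outstanding.
function missingEvidence(e: VerifyEvidence): string[] {
  const missing: string[] = [];
  if (!e.screenshotPath) missing.push("take screenshot");
  if (!e.elementVerified) missing.push("verify changed element");
  if (!e.screenshotViewed) missing.push("view screenshot");
  return missing;
}

// A change may be called "working" only when nothing is outstanding.
function canClaimWorking(e: VerifyEvidence): boolean {
  return missingEvidence(e).length === 0;
}

const notLookedAt: VerifyEvidence = {
  screenshotPath: "/tmp/verify.png",
  elementVerified: true,
  screenshotViewed: false,
};
const stillMissing = missingEvidence(notLookedAt);
```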

Workflow Routing

For complex, multi-step tasks, use the pre-built workflows:

| Trigger | Workflow |
|---|---|
| Fill forms, interact with page | Workflows/Interact.md |
| Extract page content | Workflows/Extract.md |
| Complex verification sequence | Workflows/VerifyPage.md |
| Screenshot with custom options | Workflows/Screenshot.md |

Workflows use the TypeScript API internally but are pre-written.
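The routing itself amounts to a lookup from trigger to workflow file. A minimal sketch follows. The trigger keys are invented labels for the four rows, while the file paths are the ones listed above.

```typescript
// Trigger keys are hypothetical labels; the paths come from the routing list.
type Trigger = "interact" | "extract" | "verify-sequence" | "custom-screenshot";

const WORKFLOWS: Record<Trigger, string> = {
  "interact": "Workflows/Interact.md",
  "extract": "Workflows/Extract.md",
  "verify-sequence": "Workflows/VerifyPage.md",
  "custom-screenshot": "Workflows/Screenshot.md",
};

function workflowFor(trigger: Trigger): string {
  return WORKFLOWS[trigger];
}

const formWorkflow = workflowFor("interact");
```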

Advanced: TypeScript API

Only use this for custom automation that CLI cannot handle.

Before using this API, ask yourself:

  • Did I check if CLI can do this? (screenshot/verify/open)

  • Is this a multi-step workflow? (not just one action)

  • Do I need conditional logic between actions?

If the task is a single action with no conditional logic between steps, use the CLI instead.

Quick Start (Advanced Users Only)

import { PlaywrightBrowser } from '$PAI_DIR/skills/Browser/index.ts'

const browser = new PlaywrightBrowser()
await browser.launch({ headless: true })
await browser.navigate('https://example.com')
// ... custom logic here ...
await browser.close()

API Reference

Navigation:

  • launch(options?): Start browser

  • navigate(url): Go to URL

  • goBack() / goForward(): History navigation

  • reload(): Refresh page

  • close(): Shut down browser

Capture:

  • screenshot({ path, fullPage, selector }): Take screenshot

  • getVisibleText(selector?): Extract text

  • getVisibleHtml(options): Get HTML

  • savePdf(path): Export PDF

  • getAccessibilityTree(): A11y snapshot

Interaction:

  • click(selector): Click element

  • fill(selector, value): Fill input

  • type(selector, text, delay?): Type with delay

  • select(selector, value): Select dropdown

  • pressKey(key): Keyboard input

  • hover(selector): Mouse hover

  • drag(source, target): Drag and drop

  • uploadFile(selector, path): File upload

Waiting:

  • waitForSelector(selector, options): Wait for element

  • waitForText(text, options): Wait for text

  • waitForNavigation(options): Wait for page load

  • waitForNetworkIdle(timeout?): Wait for idle

  • wait(ms): Fixed delay

JavaScript:

  • evaluate(script): Run JS

  • getConsoleLogs(options): Get console output

  • setUserAgent(ua): Change user agent

Viewport:

  • resize(width, height): Set size

  • setDevice(name): Emulate device
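To show the kind of task that justifies the API, here is a sketch of a flow with a branch between actions, written against a small interface that mirrors a few of the method names above. The return types, the optional `options` argument, the selectors, and the in-memory `fakeBrowser` are all assumptions for illustration. The real `PlaywrightBrowser` class may differ.

```typescript
// Subset of the methods listed in the API Reference. Return types are
// assumptions: the source only shows that calls are awaited.
interface BrowserLike {
  navigate(url: string): Promise<void>;
  fill(selector: string, value: string): Promise<void>;
  click(selector: string): Promise<void>;
  waitForSelector(selector: string, options?: unknown): Promise<void>;
  getVisibleText(selector?: string): Promise<string>;
}

// Hypothetical flow: submit a search form, then branch on what the page shows.
// "#search", "#submit" and "#results" are made-up selectors.
async function searchAndCheck(
  browser: BrowserLike,
  url: string,
  query: string,
): Promise<"found" | "empty"> {
  await browser.navigate(url);
  await browser.fill("#search", query);
  await browser.click("#submit");
  await browser.waitForSelector("#results");
  const text = await browser.getVisibleText("#results");
  // The next action depends on page state, which a single CLI command cannot express.
  if (text.trim().length === 0) return "empty";
  await browser.click("#results a"); // open the first hit
  return "found";
}

// In-memory stand-in so the sketch runs without launching a browser.
function fakeBrowser(resultsText: string, log: string[]): BrowserLike {
  return {
    navigate: async (url) => { log.push(`navigate ${url}`); },
    fill: async (selector, value) => { log.push(`fill ${selector} ${value}`); },
    click: async (selector) => { log.push(`click ${selector}`); },
    waitForSelector: async (selector) => { log.push(`wait ${selector}`); },
    getVisibleText: async () => resultsText,
  };
}
```

Writing the flow against an interface rather than the concrete class keeps the branching logic checkable with a stub before it is pointed at a real browser.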

Token Savings

| Approach | Tokens | Notes |
|---|---|---|
| Playwright MCP | ~13,700 | Loaded at startup, always |
| CLI tool | ~0 | Executes pre-written code |
| TypeScript API | ~50-200 | Only what you write |
| CLI Savings | 99%+ | Compared to MCP |
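The savings figure follows from simple arithmetic on these numbers. The sketch below treats the approximate token counts as exact purely for illustration. The function and constant names are hypothetical.

```typescript
// Percentage of the baseline context cost avoided by a cheaper approach.
function savingsPercent(baselineTokens: number, usedTokens: number): number {
  return ((baselineTokens - usedTokens) / baselineTokens) * 100;
}

const MCP_BASELINE = 13700; // the "~13,700" Playwright MCP figure

const cliSavings = savingsPercent(MCP_BASELINE, 0);     // CLI tool at ~0 tokens
const apiBestCase = savingsPercent(MCP_BASELINE, 50);   // TypeScript API, low end
const apiWorstCase = savingsPercent(MCP_BASELINE, 200); // TypeScript API, high end
```

With these inputs the CLI comes out at 100%, and the TypeScript API lands between roughly 98.5% and 99.6%, so the 99%+ figure applies to the CLI and to the lower end of the API range.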

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
