browser-automation

Browser Automation for AI Web Interfaces

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "browser-automation" with this command: npx skills add drshailesh88/integrated_content_os/drshailesh88-integrated-content-os-browser-automation

Browser Automation for AI Web Interfaces

Use your ChatGPT Plus and Gemini Advanced subscriptions through browser automation. No API costs - just your monthly subscription.

How It Works

┌─────────────────────────────────────────────────────────────────┐ │ BROWSER AUTOMATION FLOW │ ├─────────────────────────────────────────────────────────────────┤ │ │ │ Your Prompt ──► Playwright MCP ──► Browser Instance │ │ │ │ │ ┌────────────┴────────────┐ │ │ ▼ ▼ │ │ ┌─────────────┐ ┌─────────────┐ │ │ │ ChatGPT │ │ Gemini │ │ │ │ chat.openai│ │ gemini. │ │ │ │ .com │ │ google.com│ │ │ └──────┬──────┘ └──────┬──────┘ │ │ │ │ │ │ ▼ ▼ │ │ Response captured & returned to you │ │ │ └─────────────────────────────────────────────────────────────────┘

Prerequisites

  1. Playwright MCP Must Be Active

You have Playwright MCP configured. Verify it's working:

Use browser_snapshot to check if browser is available

  1. Login Sessions

The browser automation uses saved sessions. You need to log in once:

First-time setup:

  • Navigate to ChatGPT/Gemini

  • Log in with your credentials

  • Session is saved for future use

ChatGPT Automation

Step-by-Step Workflow

Step 1: Navigate to ChatGPT

browser_navigate → https://chat.openai.com

Step 2: Check if logged in

browser_snapshot → Look for chat input or login button

Step 3: If not logged in, authenticate

browser_click → "Log in" button browser_type → Enter email browser_click → Continue browser_type → Enter password browser_click → Log in

Step 4: Start new chat

browser_click → "New chat" button (or navigate to chat.openai.com)

Step 5: Type your prompt

browser_type → Your prompt text in the message input

Step 6: Submit and wait

browser_click → Send button browser_wait_for → Wait for response to complete

Step 7: Capture response

browser_snapshot → Get the response text

Example: ChatGPT Writing Task

I will now use browser automation to get ChatGPT's response:

  1. browser_navigate to https://chat.openai.com
  2. browser_snapshot to see current state
  3. browser_type to enter prompt in textarea
  4. browser_click to send
  5. browser_wait_for response
  6. browser_snapshot to capture output

Gemini Automation

Step-by-Step Workflow

Step 1: Navigate to Gemini

browser_navigate → https://gemini.google.com

Step 2: Check if logged in

browser_snapshot → Look for chat input

Step 3: Type your prompt

browser_type → Your prompt in the input area

Step 4: Submit

browser_press_key → Enter (or click send button)

Step 5: Wait and capture

browser_wait_for → Response generation browser_snapshot → Get response

Practical Commands

For Claude Code Session

When you want me to use browser automation, say:

"Use browser automation to ask ChatGPT: [your prompt]" "Get Gemini's take on: [your prompt]" "Compare browser outputs for: [your prompt]"

I will then:

  • Use Playwright MCP tools

  • Navigate to the appropriate site

  • Enter your prompt

  • Capture and return the response

Handling Authentication

Session Persistence

Browser automation works best with persistent sessions:

The Playwright MCP maintains browser state

Once logged in, sessions typically persist

If Session Expires

If you see a login screen:

  • ChatGPT: Look for "Log in" button, click it

  • Gemini: Look for "Sign in" button, click it

  • Complete authentication flow

  • Resume automation

Two-Factor Authentication

If 2FA is required:

  • Automation will pause at 2FA screen

  • You manually complete 2FA

  • Automation continues

Limitations

Browser Automation Caveats

Limitation Workaround

Slower than API Use for comparison, not bulk

Can break if UI changes Report issues, I'll adapt

Requires active session Keep browser open

Rate limits still apply Don't spam requests

CAPTCHAs possible May need manual intervention

When NOT to Use Browser Automation

  • Bulk content generation (use GLM-4.7 API instead)

  • Time-critical tasks (APIs are faster)

  • Fully automated pipelines (APIs more reliable)

When TO Use Browser Automation

  • Comparing writing styles

  • Using features only in Plus/Advanced

  • Testing latest model versions

  • When APIs are down

Comparison Workflow

Get Same Prompt from Multiple Sources

Step 1: Write with Claude (default, in this conversation) Step 2: browser_navigate to ChatGPT, get response Step 3: browser_navigate to Gemini, get response Step 4: Compare all three side-by-side

Example Request

"Compare how you, ChatGPT, and Gemini would write a tweet about the cardiovascular benefits of SGLT2 inhibitors"

I will:

  • Write my version (Claude)

  • Use browser automation to get ChatGPT's version

  • Use browser automation to get Gemini's version

  • Present all three for comparison

Troubleshooting

Browser Not Responding

browser_close → Close current browser Then start fresh with browser_navigate

Wrong Page Loaded

browser_snapshot → Check current state browser_navigate → Go to correct URL

Element Not Found

browser_snapshot → Get fresh page state Look for correct element reference Retry with updated reference

Session Logged Out

browser_navigate → Go to login page Complete login flow Resume automation

Integration with Multi-Model Writer

This skill works with multi-model-writer :

API Models:

  • /write-glm → Z.AI API
  • /write-gpt → OpenAI API
  • /write-gemini → Google AI Studio API

Browser Models:

  • /browser-chatgpt → ChatGPT Plus web
  • /browser-gemini → Gemini Advanced web

Use APIs for speed and reliability. Use browser for subscription-only features or comparison.

Example Session

User: "Use browser to compare how ChatGPT writes about statins"

Claude: I'll get ChatGPT's perspective using browser automation.

[Uses browser_navigate to https://chat.openai.com] [Uses browser_snapshot to verify page state] [Uses browser_type to enter: "Write a patient-friendly explanation of how statins work"] [Uses browser_click to send] [Uses browser_wait_for to wait for response] [Uses browser_snapshot to capture response]

Here's what ChatGPT wrote: [Response text]

Compared to my approach: [Claude's version]

Key differences:

  • ChatGPT emphasized X while I focused on Y
  • Tone: ChatGPT more conversational, mine more clinical
  • Length: Similar word count

Browser automation gives you access to your paid subscriptions programmatically, complementing the API-based models in your arsenal.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Automation

pylabrobot

No summary provided by upstream source.

Repository SourceNeeds Review
Research

social-media-trends-research

No summary provided by upstream source.

Repository SourceNeeds Review
Research

academic-chapter-writer

No summary provided by upstream source.

Repository SourceNeeds Review
General

medical-newsletter-writer

No summary provided by upstream source.

Repository SourceNeeds Review