spa-reverse-engineer

SPA Reverse Engineering — React + Vite + Workbox + CDP

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "spa-reverse-engineer" with this command: npx skills add pv-udpv/pplx-sdk/pv-udpv-pplx-sdk-spa-reverse-engineer

Reverse engineer modern SPAs to extract APIs, intercept service workers, debug runtime state, and build tooling.

When to use

Use this skill when:

  • Analyzing perplexity.ai SPA internals (React component tree, state, hooks)

  • Intercepting Workbox service worker caching and request strategies

  • Using Chrome DevTools Protocol (CDP) to automate browser interactions

  • Building Chrome extensions for traffic interception or state extraction

  • Debugging Vite-bundled source maps and module graph

  • Extracting GraphQL/REST schemas from SPA network layer

  • Writing Puppeteer/Playwright scripts for automated API discovery

Instructions

Step 1: Identify SPA Stack

Detect the technology stack of the target SPA:

// In DevTools Console:

// React detection
window.__REACT_DEVTOOLS_GLOBAL_HOOK__   // React DevTools presence
document.querySelector('#__next')       // Next.js
document.querySelector('#root')         // Vite/CRA
document.querySelector('#app')          // Vue (for comparison)

// Vite detection
document.querySelector('script[type="module"]')  // ESM modules
// Check source for /@vite/client or /.vite/ paths

// Workbox / Service Worker
navigator.serviceWorker.getRegistrations()  // List SWs (returns a Promise)
// Check Application → Service Workers in DevTools

// State management
window.__REDUX_DEVTOOLS_EXTENSION__  // Redux
// React DevTools → Components → hooks for Zustand/Jotai/Recoil
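The individual checks above can be gathered into one report. The following is a minimal sketch, not part of the upstream skill: `detectStack` is a hypothetical helper name, and it takes the window and document as parameters so the same function can be exercised against stub objects outside a browser.

```javascript
// Hypothetical helper: summarize stack hints from a window-like and a
// document-like object. Each flag is only a hint, not proof of the stack.
function detectStack(win, doc) {
  const has = (selector) => doc.querySelector(selector) !== null;
  return {
    reactDevtoolsHook: typeof win.__REACT_DEVTOOLS_GLOBAL_HOOK__ !== 'undefined',
    nextJs: has('#__next') || typeof win.__NEXT_DATA__ !== 'undefined',
    rootMount: has('#root'),
    esModules: has('script[type="module"]'),
    reduxDevtools: typeof win.__REDUX_DEVTOOLS_EXTENSION__ !== 'undefined',
    serviceWorkerApi: Boolean(win.navigator && 'serviceWorker' in win.navigator),
  };
}

// In the DevTools console:
//   console.table(detectStack(window, document));
```

The two DevTools globals are injected by browser extensions, so a `false` there says more about the inspecting browser than about the target site.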

Step 2: React Internals Analysis

Component Tree Extraction

// Get the React fiber attached to a DOM element
function getFiber(element) {
  const key = Object.keys(element).find(k =>
    k.startsWith('__reactFiber$') ||
    k.startsWith('__reactContainer$') ||       // root container element
    k.startsWith('__reactInternalInstance$')   // older React versions
  );
  return key ? element[key] : null;
}

// Walk fiber tree and print host element and component names
function walkFiber(fiber, depth = 0) {
  if (!fiber) return;
  const name = fiber.type?.displayName || fiber.type?.name || fiber.type;
  if (typeof name === 'string') {
    console.log('  '.repeat(depth) + name);
  }
  walkFiber(fiber.child, depth + 1);
  walkFiber(fiber.sibling, depth);
}

// Start from root (use '__next' instead of 'root' on Next.js sites)
const root = document.getElementById('root');
walkFiber(getFiber(root));

State & Props Extraction

// Extract component state via fiber
function getComponentState(fiber) {
  const state = [];
  let hook = fiber.memoizedState;
  while (hook) {
    state.push(hook.memoizedState);
    hook = hook.next;
  }
  return state;
}

// Find specific component by name
function findComponent(fiber, name) {
  if (!fiber) return null;
  if (fiber.type?.name === name || fiber.type?.displayName === name) {
    return fiber;
  }
  return findComponent(fiber.child, name) || findComponent(fiber.sibling, name);
}
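The recursive helpers above return only the first match and recurse once per sibling, which can get deep on long lists. As an illustrative alternative, the sketch below walks the tree iteratively and collects hook values for every matching component. `collectHookState` is a hypothetical name, and the code assumes function components, whose hooks form a linked list on `memoizedState`; class components store a plain state object there instead.

```javascript
// Iterative depth-first search over a fiber-like tree (objects with
// `type`, `child`, `sibling`, `memoizedState`). Returns one array of hook
// values per component whose name matches `componentName`.
function collectHookState(rootFiber, componentName) {
  const results = [];
  const stack = rootFiber ? [rootFiber] : [];
  while (stack.length > 0) {
    const fiber = stack.pop();
    const type = fiber.type;
    // Host elements have a string type ('div'); components have a function/object.
    const name = type && typeof type !== 'string'
      ? type.displayName || type.name
      : null;
    if (name === componentName) {
      const hooks = [];
      for (let hook = fiber.memoizedState; hook; hook = hook.next) {
        hooks.push(hook.memoizedState);
      }
      results.push(hooks);
    }
    // Push the sibling first so the child subtree is visited before it.
    if (fiber.sibling) stack.push(fiber.sibling);
    if (fiber.child) stack.push(fiber.child);
  }
  return results;
}
```

Production bundles usually minify component names, so the name passed in may need to be the mangled one seen in the fiber tree rather than the source name.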

Step 3: Vite Bundle Analysis

Source Map Extraction

# Find source maps from bundled assets (requires GNU grep for -P)
curl -s https://www.perplexity.ai/ | grep -oP 'src="[^"]*\.js"' | while read -r src; do
  url=$(echo "$src" | grep -oP '"[^"]*"' | tr -d '"')
  echo "Checking: $url"
  curl -sI "https://www.perplexity.ai${url}.map" | head -5
done
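Appending `.map` to the script URL is a guess. When a bundle does publish a map, it normally names it in a trailing `//# sourceMappingURL=` comment, which may point elsewhere. A minimal sketch for resolving that reference follows; `resolveSourceMapUrl` is a hypothetical helper, and the example URL is a placeholder.

```javascript
// Resolve the source map location declared inside a script body.
// Returns null when the script carries no sourceMappingURL comment.
function resolveSourceMapUrl(scriptUrl, scriptText) {
  // Bundlers append the comment at the end of the file, so take the last match.
  const matches = [...scriptText.matchAll(/\/\/[#@]\s*sourceMappingURL=(\S+)/g)];
  if (matches.length === 0) return null;
  const ref = matches[matches.length - 1][1];
  if (ref.startsWith('data:')) return ref; // inline map, nothing to fetch
  return new URL(ref, scriptUrl).href;     // relative refs resolve against the script
}

// Example (placeholder URL):
//   resolveSourceMapUrl('https://example.com/assets/app.js', await (await fetch(url)).text())
```

Servers can also advertise a map through a `SourceMap` response header, which this sketch does not inspect.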

Module Graph

// In Vite dev mode (if accessible):
// /__vite_module_graph shows dependency graph

// In production, analyze chunks:
// Performance → Network → JS files → Initiator chain
// Sources → Webpack/Vite tree → module paths

Step 4: Service Worker & Workbox Interception

Analyze Caching Strategy

// List all cached URLs
async function listCaches() {
  const names = await caches.keys();
  for (const name of names) {
    const cache = await caches.open(name);
    const keys = await cache.keys();
    console.log(`Cache: ${name} (${keys.length} entries)`);
    keys.forEach(k => console.log(`  ${k.url}`));
  }
}

// Intercept SW fetch events (from SW scope)
self.addEventListener('fetch', event => {
  console.log('[SW Intercept]', event.request.method, event.request.url);
});

Workbox Strategy Detection

// Common Workbox strategies to look for in SW source:
// - CacheFirst            → Static assets (fonts, images)
// - NetworkFirst          → API calls (dynamic data)
// - StaleWhileRevalidate  → Frequently updated content
// - NetworkOnly           → Always fresh (auth endpoints)
// - CacheOnly             → Offline-only content

// Check SW source for workbox patterns:
// workbox.strategies.CacheFirst
// workbox.routing.registerRoute
// workbox.precaching.precacheAndRoute
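The search for these strategy names can be scripted once the service worker source has been fetched as text. The sketch below is illustrative only: `countWorkboxStrategies` is a hypothetical helper, and it assumes the class names survive bundling, which a minified build does not guarantee.

```javascript
// Strategy class names listed above.
const WORKBOX_STRATEGIES = [
  'CacheFirst',
  'NetworkFirst',
  'StaleWhileRevalidate',
  'NetworkOnly',
  'CacheOnly',
];

// Count whole-word occurrences of each strategy name in the SW source text.
// Names that never occur are omitted from the result.
function countWorkboxStrategies(swSource) {
  const counts = {};
  for (const name of WORKBOX_STRATEGIES) {
    const hits = swSource.match(new RegExp(`\\b${name}\\b`, 'g'));
    if (hits) counts[name] = hits.length;
  }
  return counts;
}

// In the page console, assuming at least one registration exists:
//   const [reg] = await navigator.serviceWorker.getRegistrations();
//   const src = await (await fetch(reg.active.scriptURL)).text();
//   console.table(countWorkboxStrategies(src));
```

An empty result means only that the names were not found as plain identifiers; the routes may still be present under mangled names.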

Step 5: Chrome DevTools Protocol (CDP)

Automated Interception via CDP

import asyncio
from playwright.async_api import async_playwright

async def intercept_with_cdp():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=False)
        context = await browser.new_context()
        page = await context.new_page()

        # Open a CDP session for this page
        cdp = await context.new_cdp_session(page)

        # Intercept network at CDP level
        await cdp.send('Network.enable')
        cdp.on('Network.requestWillBeSent', lambda params:
            print(f"[CDP] {params['request']['method']} {params['request']['url']}")
        )
        cdp.on('Network.responseReceived', lambda params:
            print(f"[CDP] {params['response']['status']} {params['response']['url']}")
        )

        # Intercept WebSocket frames (part of the Network domain enabled above)
        cdp.on('Network.webSocketFrameSent', lambda params:
            print(f"[WS→] {params['response']['payloadData'][:200]}")
        )
        cdp.on('Network.webSocketFrameReceived', lambda params:
            print(f"[←WS] {params['response']['payloadData'][:200]}")
        )

        await page.goto('https://www.perplexity.ai/')
        await page.wait_for_timeout(60000)  # observe traffic for 60 seconds
        await browser.close()

asyncio.run(intercept_with_cdp())
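Printed events are hard to analyze afterwards. If the raw CDP messages are saved instead, requests and responses can be joined offline through the `requestId` field that both `Network.requestWillBeSent` and `Network.responseReceived` carry. The sketch below assumes a captured log of `{ method, params }` objects; that log format and the name `pairCdpEvents` are illustrative assumptions, not something the skill defines.

```javascript
// Join request and response events from a captured CDP log by requestId.
// Returns one record per request; status stays null if no response was seen.
function pairCdpEvents(events) {
  const byId = new Map();
  for (const { method, params } of events) {
    if (method === 'Network.requestWillBeSent') {
      byId.set(params.requestId, {
        method: params.request.method,
        url: params.request.url,
        status: null,
      });
    } else if (method === 'Network.responseReceived') {
      const entry = byId.get(params.requestId);
      if (entry) entry.status = params.response.status;
    }
  }
  return [...byId.values()];
}
```

Redirects reuse the same `requestId`, so with this simple version the last hop overwrites earlier ones.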

Runtime JS Evaluation via CDP

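# Execute JS in page context (run inside intercept_with_cdp, after page.goto)
import json  # place at the top of the module

result = await cdp.send('Runtime.evaluate', {
    'expression': 'JSON.stringify(window.__NEXT_DATA__)',
    'returnByValue': True,
})
next_data = json.loads(result['result']['value'])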

Step 6: Chrome Extension Development

Manifest v3 Extension for Traffic Capture

{
  "manifest_version": 3,
  "name": "pplx-sdk Traffic Capture",
  "version": "1.0",
  "permissions": ["webRequest", "activeTab", "storage", "debugger"],
  "host_permissions": ["https://www.perplexity.ai/*"],
  "background": { "service_worker": "background.js" },
  "content_scripts": [{
    "matches": ["https://www.perplexity.ai/*"],
    "js": ["content.js"],
    "run_at": "document_start"
  }]
}

Background Script — Request Interception

// background.js
chrome.webRequest.onBeforeRequest.addListener(
  (details) => {
    if (details.url.includes('/rest/')) {
      console.log('[pplx-capture]', details.method, details.url);
      if (details.requestBody?.raw) {
        const body = new TextDecoder().decode(
          new Uint8Array(details.requestBody.raw[0].bytes)
        );
        chrome.storage.local.set({
          [`req_${Date.now()}`]: {
            url: details.url,
            method: details.method,
            body: JSON.parse(body),
            timestamp: Date.now()
          }
        });
      }
    }
  },
  { urls: ["https://www.perplexity.ai/rest/*"] },
  ["requestBody"]
);
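The listener above reads only the first `raw` part and calls `JSON.parse` directly, so a multi-part upload or a non-JSON body would throw inside the callback. A more defensive decoder might look like the sketch below; `decodeRequestBody` and its `{ kind, value }` return shape are hypothetical, while `raw`, `bytes` and `formData` are the fields the `webRequest` API documents on `requestBody`.

```javascript
// Decode a chrome.webRequest requestBody into JSON, text, or form data.
function decodeRequestBody(requestBody) {
  if (!requestBody) return null;
  if (requestBody.formData) return { kind: 'form', value: requestBody.formData };
  if (!requestBody.raw) return null;

  // Stream-decode so multi-byte characters split across parts stay intact.
  const decoder = new TextDecoder();
  let text = '';
  for (const part of requestBody.raw) {
    if (part.bytes) text += decoder.decode(part.bytes, { stream: true });
  }
  text += decoder.decode(); // flush any buffered bytes

  try {
    return { kind: 'json', value: JSON.parse(text) };
  } catch {
    return { kind: 'text', value: text }; // keep non-JSON bodies verbatim
  }
}
```

Inside the listener, `decodeRequestBody(details.requestBody)` would replace the manual `TextDecoder` and `JSON.parse` calls.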

Content Script — React State Extraction

// content.js: inject into page context
const script = document.createElement('script');
script.textContent = `
  // Hook into React state updates.
  // Works only if the page exposes a global React object; bundled apps usually do not.
  const origSetState = React.Component.prototype.setState;
  React.Component.prototype.setState = function(state, cb) {
    window.postMessage({
      type: 'PPLX_STATE_UPDATE',
      component: this.constructor.name,
      state: JSON.parse(JSON.stringify(state))
    }, '*');
    return origSetState.call(this, state, cb);
  };
`;
document.documentElement.appendChild(script);

// Listen for state updates
window.addEventListener('message', (event) => {
  if (event.data?.type === 'PPLX_STATE_UPDATE') {
    chrome.runtime.sendMessage(event.data);
  }
});

Step 7: Map Discoveries to SDK

| SPA Discovery | SDK Target | Action |
| --- | --- | --- |
| React component state | domain/models.py | Model the state shape |
| API fetch calls | transport/http.py | Add endpoint methods |
| SSE event handlers | transport/sse.py | Map event types |
| Service worker cache | shared/ | Understand caching behavior |
| Auth token flow | shared/auth.py | Token refresh logic |
| WebSocket frames | transport/ | New WebSocket transport |
| GraphQL queries | domain/ | Query/mutation services |
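Modeling the state shape starts from a captured sample. One possible first step, sketched below, is to reduce a JSON payload or extracted hook state to a skeleton of field names and primitive types that can then be transcribed into model definitions by hand. `inferShape` is a hypothetical helper, and it assumes arrays are homogeneous, looking only at the first element.

```javascript
// Reduce a JSON-compatible value to a skeleton of field names and type names.
function inferShape(value) {
  if (value === null) return 'null';
  if (Array.isArray(value)) {
    // Assumption: arrays are homogeneous, so the first element is representative.
    return value.length > 0 ? [inferShape(value[0])] : [];
  }
  if (typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, v]) => [key, inferShape(v)])
    );
  }
  return typeof value; // 'string', 'number', 'boolean'
}

// Example:
//   inferShape({ id: 1, tags: ['a'], user: { name: 'x', email: null } })
//   → { id: 'number', tags: ['string'], user: { name: 'string', email: 'null' } }
```

A single sample cannot distinguish optional from required fields, so comparing skeletons from several captures is advisable before fixing a model.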

Step 8: SPA Source Code Graph

After runtime analysis, build a static code graph of the SPA source. Delegate to codegraph for structural analysis.

Source Map Recovery

# Extract original source paths from source maps (requires GNU grep for -P)
curl -s https://www.perplexity.ai/ | grep -oP 'src="\K/[^"]*\.js(?=")' | while read -r url; do
  echo "Checking: $url" >&2
  curl -s "https://www.perplexity.ai${url}.map" 2>/dev/null |
    python3 -c "import sys,json; d=json.load(sys.stdin); print('\n'.join(d.get('sources',[])))" 2>/dev/null
done | sort -u
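A source map can carry more than paths: when the optional `sourcesContent` array is present, it holds the original file text at the same index as each `sources` entry. The sketch below pairs the two and skips third-party modules. `recoverSources` is a hypothetical helper, and the `node_modules` filter is an assumption about what counts as application code.

```javascript
// Pair each application source path with its embedded original text, if any.
function recoverSources(sourceMap) {
  const sources = sourceMap.sources || [];
  const contents = sourceMap.sourcesContent || [];
  const recovered = [];
  sources.forEach((path, i) => {
    if (!path || path.includes('node_modules')) return; // skip third-party code
    const content = typeof contents[i] === 'string' ? contents[i] : null;
    recovered.push({ path, hasContent: content !== null, content });
  });
  return recovered;
}
```

Entries with `hasContent: false` give only the file layout; entries with content can be written to disk and fed to the static analysis that follows.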

Static Analysis (from recovered source or public repo)

# Component tree from source
grep -rnE "export (default )?function |export const .* = \(" src/ --include="*.tsx" --include="*.jsx"

# Import graph
grep -rn "import .* from " src/ --include="*.ts" --include="*.tsx" |
  awk -F: '{print $1 " → " $NF}' | sort -u

# Hook usage map
grep -rn "use[A-Z][a-zA-Z]*(" src/ --include="*.tsx" |
  grep -oP 'use[A-Z][a-zA-Z]*' | sort | uniq -c | sort -rn

# API call sites (fetch, axios, etc.)
grep -rnE "fetch\(|axios\.|api\.|apiClient\." src/ --include="*.ts" --include="*.tsx"

Cross-Reference: Runtime ↔ Static

| Runtime Discovery (spa-expert) | Static Discovery (codegraph) | Cross-Reference |
| --- | --- | --- |
| Fiber tree component names | Source component definitions | Match names to source files |
| Hook state values | Hook implementations | Map state shape to hook logic |
| Network API calls | fetch()/axios call sites | Confirm endpoints in source |
| Context provider values | createContext() definitions | Map runtime state to types |
| Service worker routes | Workbox config in source | Validate caching strategy |
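The first row, matching fiber tree names to source files, can be partly automated from `grep -rn` output, whose lines have the form `path:line:text`. The sketch below is one possible approach: `matchComponentsToFiles` is a hypothetical helper, and it recognizes only `export function`, `export default function` and `export const` definitions.

```javascript
// Map runtime component names to the file and line where they are defined,
// using lines produced by `grep -rn` over the recovered source tree.
function matchComponentsToFiles(runtimeNames, grepLines) {
  const definedIn = new Map();
  const pattern =
    /^([^:]+):(\d+):.*?export\s+(?:default\s+)?(?:function|const)\s+([A-Za-z_$][\w$]*)/;
  for (const line of grepLines) {
    const m = pattern.exec(line);
    // Keep the first definition seen for each name.
    if (m && !definedIn.has(m[3])) definedIn.set(m[3], `${m[1]}:${m[2]}`);
  }
  return runtimeNames.map((name) => ({
    name,
    source: definedIn.get(name) || null,
  }));
}
```

Names with a `null` source are either minified at runtime, defined in a form this pattern does not cover, or come from third-party packages.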

Perplexity.ai SPA Notes

Known Stack

  • Framework: Next.js (React 18+)

  • Bundler: Webpack (via Next.js, not raw Vite — skill covers both for broader SPA RE)

  • State: React hooks + context (observed patterns)

  • Streaming: SSE via fetch() with ReadableStream

  • Auth: Cookie-based (pplx.session-id)
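Because the streaming path is SSE read through `fetch()` and a `ReadableStream`, chunks arrive at arbitrary byte boundaries and must be reassembled into events before they can be inspected. The following is a minimal sketch of an incremental parser for the standard `event:` and `data:` fields. `createSseParser` is a hypothetical name, lone `\r` line endings are not handled, and nothing here reflects the site's actual event names or payloads.

```javascript
// Build a feed(chunk) function that buffers text and emits complete SSE events.
function createSseParser(onEvent) {
  let buffer = '';
  return function feed(chunk) {
    buffer = (buffer + chunk).replace(/\r\n/g, '\n');
    let boundary = buffer.indexOf('\n\n'); // a blank line terminates an event
    while (boundary !== -1) {
      const block = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      let event = 'message'; // default event type
      const data = [];
      for (const line of block.split('\n')) {
        if (line === '' || line.startsWith(':')) continue; // comment or keep-alive
        const colon = line.indexOf(':');
        const field = colon === -1 ? line : line.slice(0, colon);
        let value = colon === -1 ? '' : line.slice(colon + 1);
        if (value.startsWith(' ')) value = value.slice(1);
        if (field === 'event') event = value;
        else if (field === 'data') data.push(value);
      }
      if (data.length > 0) onEvent({ event, data: data.join('\n') });
      boundary = buffer.indexOf('\n\n');
    }
  };
}

// Browser usage, with `response` obtained from fetch():
//   const feed = createSseParser((e) => console.log(e.event, e.data));
//   const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
//   for (let r = await reader.read(); !r.done; r = await reader.read()) feed(r.value);
```

Keeping the parser separate from the transport means the same function can replay text captured from CDP or the extension.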

Key DOM Selectors

// Query input
document.querySelector('textarea[placeholder*="Ask"]')
// Response area
document.querySelector('[class*="prose"]')
// Thread list
document.querySelector('[class*="thread"]')

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
