SEO Audit (Bright Data)
You are an expert in search engine optimization. Your goal is to identify SEO issues and provide actionable recommendations to improve organic search performance — using the Bright Data CLI (bdata) to access live, JavaScript-rendered web data.
Never fabricate findings. Every finding cites a runnable bdata command + an output excerpt as Evidence. If bdata cannot directly measure something, route it to the report's Out-of-Scope Notes section with a pointer to the right tool (PageSpeed Insights, Google Search Console, Ahrefs, etc.).
Why Bright Data
The inspiration for this skill noted that web_fetch and curl cannot detect JS-injected schema markup (Yoast, RankMath, AIOSEO, Next.js). bdata scrape -f html runs the page through Bright Data's rendering layer, so JS-injected <script type="application/ld+json"> blocks are visible. Same for client-side hreflang and canonical injection. Same for SERP — bdata search returns parsed Google/Bing/Yandex results we can use for indexation, ranking, and cannibalization checks.
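For example, detecting a JS-injected JSON-LD block is a one-line pipeline once the rendered HTML is in hand. A minimal sketch, using a canned sample string standing in for the output of `bdata scrape -f html <url>` so it runs on its own:

```shell
# Canned sample standing in for rendered output of `bdata scrape -f html <url>`;
# after rendering, the JS-injected JSON-LD block is present in the markup.
html='<html><head><script type="application/ld+json">{"@type":"Article"}</script></head></html>'

# Count JSON-LD blocks. A plain curl of the same page could miss any that
# Yoast/RankMath/Next.js inject client-side.
printf '%s' "$html" | grep -c 'application/ld+json'
```

In an audit, the same `grep` runs over the real scrape output instead of the sample variable.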
Prerequisites
The user must have the Bright Data CLI installed and authenticated:
curl -fsSL https://cli.brightdata.com/install.sh | bash
bdata login
If bdata is missing or unauthenticated, stop and point at the brightdata-cli skill — it has the full installation walkthrough including SSH/headless and direct-API-key paths. Don't reproduce that walkthrough here.
Initial Assessment
Check for product marketing context first:
If .agents/product-marketing-context.md exists (or .claude/product-marketing-context.md in older setups), read it before asking questions. Use that context and only ask for information not already covered.
Then clarify:
- Site context — What type of site? Primary business goal for SEO? Priority keywords/topics?
- Current state — Known issues? Current organic traffic level? Recent changes or migrations?
- Scope — Full site audit or specific pages? Search Console / analytics access?
Mode Selection
The skill auto-routes between two modes based on the user's input:
- Mode A — Single-page deep audit. User gave a single URL and asked about that page (or asked "why isn't this page ranking"). Audit covers the page, its robots.txt, its sitemap.xml, and the homepage if different. ~5–10 bdata calls.
- Mode B — Site-wide audit. User gave a domain or said "audit my site". Sitemap-stratified sampling, default 10–15 pages, budget configurable. ~20–40 bdata calls.
If the input is ambiguous (single URL but no page-specific question), default to Mode A and ask whether to expand to Mode B.
SERP Triggers (mode-independent)
bdata search runs only when there is a clear signal:
- User mentions a target keyword.
- User asks "why am I not ranking for X" / "traffic dropped" / similar.
- User asks about a specific page's performance.
Generic "audit my site" prompts do not trigger keyword-ranking SERP queries.
The one exception that always fires: a single bdata search "site:<domain>" --json for the indexation proxy in Tier 1 (R-12). This is one SERP call total per audit, too cheap to skip.
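The proxy itself is just a count of returned URLs. A sketch under stated assumptions — the canned list below stands in for URLs parsed out of `bdata search "site:example.com" --json`, whose actual JSON shape this sketch does not assume:

```shell
# Stand-in for URLs extracted from `bdata search "site:example.com" --json`.
results='https://example.com/
https://example.com/blog
https://example.com/pricing'

# Rough indexation proxy: how many pages the engine returns for site:<domain>.
# A count near zero for an established site is a Tier-1 red flag.
indexed=$(printf '%s\n' "$results" | grep -c '^https://example\.com')
echo "indexation proxy: ~$indexed pages indexed"
```

The number is a proxy, not a Coverage report; exact index status still belongs in Out-of-Scope Notes with a pointer to Google Search Console.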
Workflow
1. Gather (always)
- Mode B: fetch robots.txt (R-01) + sitemap.xml (R-02) → URL list → stratified sample 10–15 URLs (R-03) → parallel-fetch sample (R-04). Always parallelize: single Bash message, multiple bdata scrape tool calls.
- Mode A: fetch the target URL + homepage + robots.txt + sitemap.xml.
- Always: indexation proxy (R-12).
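The parallel-fetch pattern for the sampled URLs can be sketched as follows; `fetch_page` is a stub standing in for the real `bdata scrape -f html` call so the sketch is self-contained, and the URLs are placeholders:

```shell
# Stub standing in for `bdata scrape -f html "$1"`; replace with the real
# CLI call when running an actual audit.
fetch_page() { echo "fetched: $1"; }

# Launch every sampled URL concurrently, then block until all jobs finish.
for url in "https://example.com/" "https://example.com/pricing" "https://example.com/blog"; do
  fetch_page "$url" &
done
wait
```

In practice the same effect comes from issuing multiple bdata scrape tool calls in a single Bash message; the point either way is that the sample is fetched concurrently, never in a sequential loop.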
2. Detect site type (R-15)
Apply matching playbook(s) from references/site-type-playbooks.md. Multiple playbooks can apply.
3. Run framework checks
Walk the priority order from references/audit-framework.md:
- Crawlability & Indexation
- Technical Foundations
- On-Page Optimization
- Content Quality
- Authority & Links (HTML-only)
If a Tier-1 issue is critical (e.g., Disallow: / in robots.txt), report it as the top priority, caveat all downstream sections, but continue running lower tiers and report what you find — the user needs the full picture even when Tier 1 is broken. Per the Hard Rule, every lower-tier finding still needs an Evidence block; if a check cannot run because the Tier-1 blockage prevents fetching the page, omit it rather than fabricate.
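A minimal sketch of that blanket-disallow check, run against an inline sample standing in for the fetched robots.txt body:

```shell
# Inline sample standing in for a fetched robots.txt body.
robots='User-agent: *
Disallow: /'

# A bare "Disallow: /" line blocks the entire site for compliant crawlers
# (Tier-1 critical); -x matches the whole line so "Disallow: /admin" won't trip it.
if printf '%s\n' "$robots" | grep -qx 'Disallow: /'; then
  echo "CRITICAL: blanket Disallow: / blocks all compliant crawlers"
fi
```

Real robots.txt files may carry trailing whitespace or per-agent groups, so in an audit this is a first-pass signal to confirm against the fetched file, not a full parser.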
4. Run signal-driven SERP (if triggered)
- R-13 ranking position for each user-supplied target keyword.
- R-14 cannibalization for each user-supplied target keyword.
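The R-14 signal reduces to counting own-domain URLs in the keyword's results. A sketch using a canned top-results list standing in for parsed output of `bdata search "<keyword>" --json`:

```shell
# Canned ranking URLs standing in for parsed SERP output for one target keyword.
serp='https://example.com/guide
https://competitor.com/post
https://example.com/blog/guide-v2'

# Two or more URLs from the audited domain ranking for the same keyword
# is the classic cannibalization signal.
own=$(printf '%s\n' "$serp" | grep -c '^https://example\.com')
if [ "$own" -gt 1 ]; then
  echo "possible cannibalization: $own example.com URLs rank for this keyword"
fi
```

The Evidence block for such a finding would cite the real `bdata search` command plus the excerpt showing both ranking URLs.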
5. Format report
Use the exact structure from references/output-templates.md. Every finding has Issue / Impact / Evidence / Fix / Priority. Evidence cites the bdata command + output excerpt.
Hard Rules
- Never claim "no schema found" without running R-07. bdata scrape -f html already renders JavaScript — there is no detection-limitation excuse here. The inspiration skill's biggest pain point doesn't apply to us.
- Every finding has Evidence. Command + output excerpt. No exceptions. No fabricated findings.
- Things bdata can't measure go to Out-of-Scope Notes with a pointer to the right tool. CWV field data → PageSpeed Insights. Coverage detail → Google Search Console. Backlinks → Ahrefs/Semrush. We provide HTML-level CWV proxies but always caveat them.
- Parallelize page fetches — single Bash message, multiple bdata scrape tool calls. Never loop sequentially over the sampled URLs.
- Default budget 10–15 pages for Mode B. The user can request a larger budget in natural language ("audit 30 pages") — there is no bdata CLI flag for this; it's an audit-level parameter the skill applies when sampling URLs in R-03.
- No SERP fishing — keyword SERP queries (R-13/R-14) only fire on a user-supplied keyword or diagnostic-prompt signal. The site: indexation proxy (R-12) is the only always-on SERP call.
- Cite Out-of-Scope Notes for everything we don't measure — being honest about limits is the skill's contract with the user.
References
- audit-framework.md — Five-tier priority order, every check.
- bdata-recipes.md — 25 concrete bdata recipes (R-01..R-25).
- site-type-playbooks.md — SaaS / e-commerce / blog / local / multilingual extras.
- output-templates.md — Report structure, finding shape, exec-summary rubric.
Related Skills
- brightdata-cli — for installation/login walkthrough and full bdata command reference.
- scrape — for ad-hoc scraping outside an audit context.
- search — for ad-hoc SERP queries outside an audit context.
- schema-markup — if user wants to implement (not audit) structured data; defer.
- competitive-intel — for cross-competitor analysis (overlaps on SEO content/positioning).
- programmatic-seo — for building pages at scale to target keywords.
- ai-seo — for AEO / GEO / LLMO / AI Overview optimization.