GEO Audit Orchestration Skill
Purpose
This skill performs a comprehensive Generative Engine Optimization (GEO) audit of any website. GEO is the practice of optimizing web content so that AI systems (ChatGPT, Claude, Perplexity, Gemini, etc.) can discover, understand, cite, and recommend it. This audit measures how well a site performs across all GEO dimensions and produces an actionable improvement plan.
Key Insight
Traditional SEO optimizes for search engine rankings. GEO optimizes for AI citation and recommendation. Sites that score high on GEO metrics see 30-115% more visibility in AI-generated responses (Georgia Tech / Princeton / IIT Delhi 2024 study). The two disciplines overlap but have distinct requirements.
Audit Workflow
Phase 1: Discovery and Reconnaissance
Step 1: Fetch Homepage and Detect Business Type
-
Use WebFetch to retrieve the homepage at the provided URL.
-
Extract the following signals:
- Page title, meta description, H1 heading
- Navigation menu items (reveals site structure)
- Footer content (reveals business info, location, legal pages)
- Schema.org markup on homepage (Organization, LocalBusiness, etc.)
- Pricing page link (SaaS indicator)
- Product listing patterns (E-commerce indicator)
- Blog/resource section (Publisher indicator)
- Service pages (Agency indicator)
- Address/phone/Google Maps embed (Local business indicator)
-
Classify the business type using these patterns:
| Business Type | Detection Signals |
|---|---|
| SaaS | Pricing page, "Sign up" / "Free trial" CTAs, app.domain.com subdomain, feature comparison tables, integration pages |
| Local Business | Physical address on homepage, Google Maps embed, "Near me" content, LocalBusiness schema, service area pages |
| E-commerce | Product listings, shopping cart, product schema, category pages, price displays, "Add to cart" buttons |
| Publisher | Blog-heavy navigation, article schema, author pages, date-based archives, RSS feeds, high content volume |
| Agency/Services | Case studies, portfolio, "Our Work" section, team page, client logos, service descriptions |
| Hybrid | Combination of above signals -- classify by dominant pattern |
Step 2: Crawl Sitemap and Internal Links
- Attempt to fetch
/sitemap.xmland/sitemap_index.xml. - If sitemap exists, extract up to 50 unique page URLs prioritized by:
- Homepage (always include)
- Top-level navigation pages
- High-value pages (pricing, about, contact, key service/product pages)
- Blog posts (sample 5-10 most recent)
- Category/landing pages
- If no sitemap exists, crawl internal links from the homepage:
- Extract all
<a href>links pointing to the same domain - Follow up to 2 levels deep
- Prioritize pages linked from main navigation
- Extract all
- Respect
robots.txtdirectives -- do not fetch disallowed paths. - Enforce a maximum of 50 pages and a 30-second timeout per fetch.
Step 3: Collect Page-Level Data
For each page in the crawl set, record:
- URL, title, meta description, canonical URL
- H1-H6 heading structure
- Word count of main content
- Schema.org types present
- Internal/external link counts
- Images with/without alt text
- Open Graph and Twitter Card meta tags
- Response status code
- Whether the page has structured data
Phase 2: Parallel Subagent Delegation
Delegate analysis to 5 specialized subagents. Each subagent operates on the collected page data and produces a category score (0-100) plus findings.
Subagent 1: AI Citability Analysis (geo-citability)
- Analyze content blocks for quotability by AI systems
- Score passage self-containment, answer block quality, statistical density
- Identify high-value pages that could be reformatted for better AI citation
Subagent 2: Platform & Brand Analysis (geo-brand-mentions)
- Check brand presence across YouTube, Reddit, Wikipedia, LinkedIn
- Assess third-party mention volume and sentiment
- Score brand authority signals that AI models use for entity recognition
Subagent 3: Technical GEO Infrastructure (geo-crawlers + geo-llmstxt)
- Analyze robots.txt for AI crawler access
- Check for llms.txt presence and quality
- Verify meta tags, headers, and technical accessibility for AI systems
- Check page speed and rendering (JS-heavy sites are harder for AI crawlers)
Subagent 4: Content E-E-A-T Quality (geo-content)
- Evaluate Experience, Expertise, Authoritativeness, Trustworthiness signals
- Check author bios, credentials, source citations
- Assess content freshness, depth, and originality
- Verify "About" page quality and team credentials
Subagent 5: Schema & Structured Data (geo-schema)
- Validate all schema.org markup
- Check for GEO-critical schema types (FAQ, HowTo, Organization, Product, Article)
- Assess schema completeness and accuracy
- Identify missing schema opportunities
Phase 3: Score Aggregation and Report Generation
Composite GEO Score Calculation
The overall GEO Score (0-100) is a weighted average of six category scores:
| Category | Weight | What It Measures |
|---|---|---|
| AI Citability | 25% | How quotable/extractable content is for AI systems |
| Brand Authority | 20% | Third-party mentions, entity recognition signals |
| Content E-E-A-T | 20% | Experience, Expertise, Authoritativeness, Trustworthiness |
| Technical GEO | 15% | AI crawler access, llms.txt, rendering, speed |
| Schema & Structured Data | 10% | Schema.org markup quality and completeness |
| Platform Optimization | 10% | Presence on platforms AI models train on and cite |
Formula:
GEO_Score = (Citability * 0.25) + (Brand * 0.20) + (EEAT * 0.20) + (Technical * 0.15) + (Schema * 0.10) + (Platform * 0.10)
Score Interpretation
| Score Range | Rating | Interpretation |
|---|---|---|
| 90-100 | Excellent | Top-tier GEO optimization; site is highly likely to be cited by AI |
| 75-89 | Good | Strong GEO foundation with room for improvement |
| 60-74 | Fair | Moderate GEO presence; significant optimization opportunities exist |
| 40-59 | Poor | Weak GEO signals; AI systems may struggle to cite or recommend |
| 0-39 | Critical | Minimal GEO optimization; site is largely invisible to AI systems |
Issue Severity Classification
Every issue found during the audit is classified by severity:
Critical (Fix Immediately)
- All AI crawlers blocked in robots.txt
- No indexable content (JavaScript-rendered only with no SSR)
- Domain-level noindex directive
- Site returns 5xx errors on key pages
- Complete absence of any structured data
- Brand not recognized as an entity by any AI system
High (Fix Within 1 Week)
- Key AI crawlers (GPTBot, ClaudeBot, PerplexityBot) blocked
- No llms.txt file present
- Zero question-answering content blocks on key pages
- Missing Organization or LocalBusiness schema
- No author attribution on content pages
- All content behind login/paywall with no preview
Medium (Fix Within 1 Month)
- Partial AI crawler blocking (some allowed, some blocked)
- llms.txt exists but is incomplete or malformed
- Content blocks average under 50 citability score
- Missing FAQ schema on pages with FAQ content
- Thin author bios without credentials
- No Wikipedia or Reddit brand presence
Low (Optimize When Possible)
- Minor schema validation errors
- Some images missing alt text
- Content freshness issues on non-critical pages
- Missing Open Graph tags
- Suboptimal heading hierarchy on some pages
- LinkedIn company page exists but is incomplete
Output Format
Generate a file called GEO-AUDIT-REPORT.md with the following structure:
# GEO Audit Report: [Site Name]
**Audit Date:** [Date]
**URL:** [URL]
**Business Type:** [Detected Type]
**Pages Analyzed:** [Count]
---
## Executive Summary
**Overall GEO Score: [X]/100 ([Rating])**
[2-3 sentence summary of the site's GEO health, biggest strengths, and most critical gaps.]
### Score Breakdown
| Category | Score | Weight | Weighted Score |
|---|---|---|---|
| AI Citability | [X]/100 | 25% | [X] |
| Brand Authority | [X]/100 | 20% | [X] |
| Content E-E-A-T | [X]/100 | 20% | [X] |
| Technical GEO | [X]/100 | 15% | [X] |
| Schema & Structured Data | [X]/100 | 10% | [X] |
| Platform Optimization | [X]/100 | 10% | [X] |
| **Overall GEO Score** | | | **[X]/100** |
---
## Critical Issues (Fix Immediately)
[List each critical issue with specific page URLs and recommended fix]
## High Priority Issues
[List each high-priority issue with details]
## Medium Priority Issues
[List each medium-priority issue]
## Low Priority Issues
[List each low-priority issue]
---
## Category Deep Dives
### AI Citability ([X]/100)
[Detailed findings, examples of good/bad passages, rewrite suggestions]
### Brand Authority ([X]/100)
[Platform presence map, mention volume, sentiment]
### Content E-E-A-T ([X]/100)
[Author quality, source citations, freshness, depth]
### Technical GEO ([X]/100)
[Crawler access, llms.txt, rendering, headers]
### Schema & Structured Data ([X]/100)
[Schema types found, validation results, missing opportunities]
### Platform Optimization ([X]/100)
[Presence on YouTube, Reddit, Wikipedia, etc.]
---
## Quick Wins (Implement This Week)
1. [Specific, actionable quick win with expected impact]
2. [Another quick win]
3. [Another quick win]
4. [Another quick win]
5. [Another quick win]
## 30-Day Action Plan
### Week 1: [Theme]
- [ ] Action item 1
- [ ] Action item 2
### Week 2: [Theme]
- [ ] Action item 1
- [ ] Action item 2
### Week 3: [Theme]
- [ ] Action item 1
- [ ] Action item 2
### Week 4: [Theme]
- [ ] Action item 1
- [ ] Action item 2
---
## Appendix: Pages Analyzed
| URL | Title | GEO Issues |
|---|---|---|
| [url] | [title] | [issue count] |
Quality Gates
- Page Limit: Never crawl more than 50 pages per audit. Prioritize high-value pages.
- Timeout: 30-second maximum per page fetch. Skip pages that exceed this.
- Robots.txt: Always check and respect robots.txt before crawling. Note any AI-specific directives.
- Rate Limiting: Wait at least 1 second between page fetches to avoid overloading the server.
- Error Handling: Log failed fetches but continue the audit. Report fetch failures in the appendix.
- Content Type: Only analyze HTML pages. Skip PDFs, images, and other binary content.
- Deduplication: Canonicalize URLs before crawling. Skip duplicate content (e.g., HTTP vs HTTPS, www vs non-www, trailing slashes).
Business-Type-Specific Audit Adjustments
SaaS Sites
- Extra weight on: Feature comparison tables (high citability), integration pages, documentation quality
- Check for: API documentation structure, changelog pages, knowledge base organization
- Key schema: SoftwareApplication, FAQPage, HowTo
Local Businesses
- Extra weight on: NAP consistency, Google Business Profile signals, local schema
- Check for: Service area pages, location-specific content, review markup
- Key schema: LocalBusiness, GeoCoordinates, OpeningHoursSpecification
E-commerce Sites
- Extra weight on: Product descriptions (citability), comparison content, buying guides
- Check for: Product schema completeness, review aggregation, FAQ sections on product pages
- Key schema: Product, AggregateRating, Offer, BreadcrumbList
Publishers
- Extra weight on: Article quality, author credentials, source citation practices
- Check for: Article schema, author pages, publication date freshness, original research
- Key schema: Article, NewsArticle, Person (author), ClaimReview
Agency/Services
- Extra weight on: Case studies (citability), expertise demonstration, thought leadership
- Check for: Portfolio schema, team credentials, industry-specific expertise signals
- Key schema: Organization, Service, Person (team), Review