GEO Audit Orchestration Skill

Purpose

This skill performs a comprehensive Generative Engine Optimization (GEO) audit of any website. GEO is the practice of optimizing web content so that AI systems (ChatGPT, Claude, Perplexity, Gemini, etc.) can discover, understand, cite, and recommend it. This audit measures how well a site performs across all GEO dimensions and produces an actionable improvement plan.

Key Insight

Traditional SEO optimizes for search engine rankings. GEO optimizes for AI citation and recommendation. A 2024 study by researchers at Georgia Tech, Princeton, and IIT Delhi reported that GEO techniques increased visibility in AI-generated responses by 30-115%. The two disciplines overlap but have distinct requirements.

Audit Workflow

Phase 1: Discovery and Reconnaissance

Step 1: Fetch Homepage and Detect Business Type

Use WebFetch to retrieve the homepage at the provided URL.
Extract the following signals:
- Page title, meta description, H1 heading
- Navigation menu items (reveals site structure)
- Footer content (reveals business info, location, legal pages)
- Schema.org markup on homepage (Organization, LocalBusiness, etc.)
- Pricing page link (SaaS indicator)
- Product listing patterns (E-commerce indicator)
- Blog/resource section (Publisher indicator)
- Service pages (Agency indicator)
- Address/phone/Google Maps embed (Local business indicator)
Classify the business type using these patterns:

| Business Type | Detection Signals |
|---|---|
| SaaS | Pricing page, "Sign up" / "Free trial" CTAs, app.domain.com subdomain, feature comparison tables, integration pages |
| Local Business | Physical address on homepage, Google Maps embed, "Near me" content, LocalBusiness schema, service area pages |
| E-commerce | Product listings, shopping cart, product schema, category pages, price displays, "Add to cart" buttons |
| Publisher | Blog-heavy navigation, article schema, author pages, date-based archives, RSS feeds, high content volume |
| Agency/Services | Case studies, portfolio, "Our Work" section, team page, client logos, service descriptions |
| Hybrid | Combination of above signals -- classify by dominant pattern |
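
The classification rule above can be sketched as a signal count per business type. This is a minimal illustration, not the skill's actual logic: the signal labels, the `SIGNALS_BY_TYPE` mapping, and the `classify_business` helper are all hypothetical names, and "dominant pattern" is assumed to mean "the type with the most matched signals".

```python
# Hypothetical signal labels that loosely mirror the detection table.
SIGNALS_BY_TYPE = {
    "SaaS": {"pricing_page", "free_trial_cta", "app_subdomain",
             "feature_comparison", "integration_pages"},
    "Local Business": {"physical_address", "maps_embed", "near_me_content",
                       "localbusiness_schema", "service_area_pages"},
    "E-commerce": {"product_listings", "shopping_cart", "product_schema",
                   "category_pages", "add_to_cart"},
    "Publisher": {"blog_navigation", "article_schema", "author_pages",
                  "date_archives", "rss_feed"},
    "Agency/Services": {"case_studies", "portfolio", "team_page",
                        "client_logos", "service_descriptions"},
}


def classify_business(detected: set[str]) -> tuple[str, bool]:
    """Return (dominant type, is_hybrid) for a set of detected signals.

    Assumption: the dominant type is the one with the most matched signals;
    a site is flagged hybrid when signals from more than one type match.
    """
    counts = {t: len(sigs & detected) for t, sigs in SIGNALS_BY_TYPE.items()}
    matched = [t for t, n in counts.items() if n > 0]
    if not matched:
        return ("Unknown", False)
    # On a tie, max() keeps the first type in table order.
    dominant = max(matched, key=lambda t: counts[t])
    return (dominant, len(matched) > 1)
```

A site with a pricing page, a free-trial CTA, and case studies would be classified as SaaS and flagged as hybrid under these assumptions.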

Step 2: Crawl Sitemap and Internal Links

Attempt to fetch /sitemap.xml and /sitemap_index.xml.
If a sitemap exists, extract up to 50 unique page URLs, prioritized in this order:
- Homepage (always include)
- Top-level navigation pages
- High-value pages (pricing, about, contact, key service/product pages)
- Blog posts (sample 5-10 most recent)
- Category/landing pages
If no sitemap exists, crawl internal links from the homepage:
- Extract all <a href> links pointing to the same domain
- Follow up to 2 levels deep
- Prioritize pages linked from main navigation
Respect robots.txt directives -- do not fetch disallowed paths.
Enforce a maximum of 50 pages and a 30-second timeout per fetch.
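
A minimal sketch of the sitemap step, assuming the sitemap body has already been fetched as text. The `urls_from_sitemap` helper is hypothetical, and sorting by path depth is only a rough stand-in for the priority order listed above (homepage, then top-level pages, then deeper pages).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

MAX_PAGES = 50  # page cap stated in this step


def urls_from_sitemap(xml_text: str, limit: int = MAX_PAGES) -> list[str]:
    """Extract unique <loc> URLs from sitemap XML, shallowest paths first."""
    root = ET.fromstring(xml_text)
    urls: list[str] = []
    for el in root.iter():
        # Tags carry the sitemap namespace, e.g. "{...sitemap/0.9}loc".
        if el.tag == "loc" or el.tag.endswith("}loc"):
            url = (el.text or "").strip()
            if url and url not in urls:
                urls.append(url)

    def depth(url: str) -> int:
        # Number of non-empty path segments; the homepage has depth 0.
        return len([seg for seg in urlparse(url).path.split("/") if seg])

    urls.sort(key=depth)  # stable sort keeps sitemap order within a depth
    return urls[:limit]


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/blog/post-1</loc></url>
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>"""

print(urls_from_sitemap(SAMPLE))
```

For a sitemap index file, the same function returns the child sitemap URLs, which would then need to be fetched and parsed in turn.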

Step 3: Collect Page-Level Data

For each page in the crawl set, record:

URL, title, meta description, canonical URL
H1-H6 heading structure
Word count of main content
Schema.org types present
Internal/external link counts
Images with/without alt text
Open Graph and Twitter Card meta tags
Response status code
Whether the page has structured data
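
As an illustration, part of this record can be collected with Python's standard `html.parser`. This is a sketch under assumptions: `PageRecord` and `PageCollector` are hypothetical names, only a subset of the fields above is covered, and an empty `alt` attribute is counted as missing alt text.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class PageRecord:
    title: str = ""
    meta_description: str = ""
    canonical: str = ""
    headings: list[tuple[str, str]] = field(default_factory=list)  # (tag, text)
    images_with_alt: int = 0
    images_without_alt: int = 0


class PageCollector(HTMLParser):
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self) -> None:
        super().__init__()
        self.record = PageRecord()
        self._open = None  # tag whose text is currently being captured
        self._buf: list[str] = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title" or tag in self.HEADINGS:
            self._open, self._buf = tag, []
        elif tag == "meta" and (a.get("name") or "").lower() == "description":
            self.record.meta_description = a.get("content") or ""
        elif tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.record.canonical = a.get("href") or ""
        elif tag == "img":
            # Assumption: empty alt text is treated the same as no alt text.
            if (a.get("alt") or "").strip():
                self.record.images_with_alt += 1
            else:
                self.record.images_without_alt += 1

    def handle_data(self, data):
        if self._open:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._open:
            text = "".join(self._buf).strip()
            if tag == "title":
                self.record.title = text
            else:
                self.record.headings.append((tag, text))
            self._open = None
```

Feeding a page's HTML to `PageCollector().feed(...)` fills `record`; word counts, link counts, and schema types would need additional handlers.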

Phase 2: Parallel Subagent Delegation

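Delegate analysis to 5 specialized subagents. Each subagent operates on the collected page data and produces findings plus a score (0-100) for each category it covers. Subagent 2 covers two categories (Brand Authority and Platform Optimization), which yields the six category scores aggregated in Phase 3.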

Subagent 1: AI Citability Analysis (geo-citability)

Analyze content blocks for quotability by AI systems
Score passage self-containment, answer block quality, statistical density
Identify high-value pages that could be reformatted for better AI citation

Subagent 2: Platform & Brand Analysis (geo-brand-mentions)

Check brand presence across YouTube, Reddit, Wikipedia, LinkedIn
Assess third-party mention volume and sentiment
Score brand authority signals that AI models use for entity recognition

Subagent 3: Technical GEO Infrastructure (geo-crawlers + geo-llmstxt)

Analyze robots.txt for AI crawler access
Check for llms.txt presence and quality
Verify meta tags, headers, and technical accessibility for AI systems
Check page speed and rendering (JS-heavy sites are harder for AI crawlers to read, because content that only appears after client-side rendering may never be seen)
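
The robots.txt check can be sketched with the standard library's `urllib.robotparser`. This is a minimal example, assuming the robots.txt body has already been fetched as text. The crawler names are the three this document lists under High severity issues, and `ai_crawler_access` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# User agents named in this document's severity list.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def ai_crawler_access(robots_txt: str,
                      url: str = "https://example.com/") -> dict[str, bool]:
    """Map each AI crawler to whether robots.txt lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_CRAWLERS}


ROBOTS = """User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(ai_crawler_access(ROBOTS))
```

With this sample file, GPTBot is blocked while the other two crawlers fall through to the wildcard rule and are allowed, which corresponds to the partial-blocking case in the severity list.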

Subagent 4: Content E-E-A-T Quality (geo-content)

Evaluate Experience, Expertise, Authoritativeness, Trustworthiness signals
Check author bios, credentials, source citations
Assess content freshness, depth, and originality
Verify "About" page quality and team credentials

Subagent 5: Schema & Structured Data (geo-schema)

Validate all schema.org markup
Check for GEO-critical schema types (FAQPage, HowTo, Organization, Product, Article)
Assess schema completeness and accuracy
Identify missing schema opportunities
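
A minimal sketch of the "missing schema" check, assuming markup is delivered as JSON-LD in `<script type="application/ld+json">` blocks (Microdata and RDFa are not handled). `schema_types` and `missing_critical` are hypothetical helper names, and `FAQPage` is used as the schema.org type name for FAQ markup.

```python
import json
import re

GEO_CRITICAL_TYPES = {"FAQPage", "HowTo", "Organization", "Product", "Article"}

# Matches the body of each JSON-LD script block.
JSONLD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.IGNORECASE | re.DOTALL,
)


def schema_types(html: str) -> set[str]:
    """Collect every @type value found in the page's JSON-LD blocks."""
    found: set[str] = set()
    for raw in JSONLD_RE.findall(html):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed block: report as a finding, do not crash
        stack = [data]
        while stack:  # walk nested objects and arrays
            node = stack.pop()
            if isinstance(node, dict):
                t = node.get("@type")
                if isinstance(t, str):
                    found.add(t)
                elif isinstance(t, list):
                    found.update(x for x in t if isinstance(x, str))
                stack.extend(node.values())
            elif isinstance(node, list):
                stack.extend(node)
    return found


def missing_critical(html: str) -> set[str]:
    """GEO-critical types that do not appear anywhere on the page."""
    return GEO_CRITICAL_TYPES - schema_types(html)
```

Not every page needs every type, so the result is a list of candidates to review rather than a list of defects.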

Phase 3: Score Aggregation and Report Generation

Composite GEO Score Calculation

The overall GEO Score (0-100) is a weighted average of six category scores:

| Category | Weight | What It Measures |
|---|---|---|
| AI Citability | 25% | How quotable/extractable content is for AI systems |
| Brand Authority | 20% | Third-party mentions, entity recognition signals |
| Content E-E-A-T | 20% | Experience, Expertise, Authoritativeness, Trustworthiness |
| Technical GEO | 15% | AI crawler access, llms.txt, rendering, speed |
| Schema & Structured Data | 10% | Schema.org markup quality and completeness |
| Platform Optimization | 10% | Presence on platforms AI models train on and cite |

Formula:

GEO_Score = (Citability * 0.25) + (Brand * 0.20) + (EEAT * 0.20) + (Technical * 0.15) + (Schema * 0.10) + (Platform * 0.10)
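
As a worked example of the formula, take hypothetical category scores of 70, 50, 60, 80, 40, and 30: the weighted terms are 17.5 + 10 + 12 + 12 + 4 + 3 = 58.5. The sketch below computes the same thing. The dictionary keys and the `geo_score` name are illustrative, and rounding to one decimal place is an assumption.

```python
# Weights taken from the category table above.
WEIGHTS = {
    "citability": 0.25,
    "brand": 0.20,
    "eeat": 0.20,
    "technical": 0.15,
    "schema": 0.10,
    "platform": 0.10,
}


def geo_score(scores: dict[str, float]) -> float:
    """Weighted average of the six category scores (each 0-100)."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("expected exactly the six category scores")
    for name, value in scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} score out of range: {value}")
    return round(sum(scores[name] * w for name, w in WEIGHTS.items()), 1)


example = {"citability": 70, "brand": 50, "eeat": 60,
           "technical": 80, "schema": 40, "platform": 30}
print(geo_score(example))  # 58.5
```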

Score Interpretation

| Score Range | Rating | Interpretation |
|---|---|---|
| 90-100 | Excellent | Top-tier GEO optimization; site is highly likely to be cited by AI |
| 75-89 | Good | Strong GEO foundation with room for improvement |
| 60-74 | Fair | Moderate GEO presence; significant optimization opportunities exist |
| 40-59 | Poor | Weak GEO signals; AI systems may struggle to cite or recommend |
| 0-39 | Critical | Minimal GEO optimization; site is largely invisible to AI systems |
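
The rating bands can be sketched as a simple lookup. The table lists integer ranges, so the treatment of fractional scores here is an assumption: each band is read as "at or above its lower bound", which places a score such as 89.5 in Good. The `rating` function name is hypothetical.

```python
# (lower bound, label) pairs from the interpretation table, highest first.
BANDS = [(90, "Excellent"), (75, "Good"), (60, "Fair"), (40, "Poor"), (0, "Critical")]


def rating(score: float) -> str:
    """Map a 0-100 GEO score to its rating label."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for floor, label in BANDS:
        if score >= floor:
            return label
    return "Critical"  # unreachable for valid input; kept for completeness


print(rating(58.5))  # Poor
```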

Issue Severity Classification

Every issue found during the audit is classified by severity:

Critical (Fix Immediately)

All AI crawlers blocked in robots.txt
No indexable content (JavaScript-rendered only with no SSR)
Domain-level noindex directive
Site returns 5xx errors on key pages
Complete absence of any structured data
Brand not recognized as an entity by any AI system

High (Fix Within 1 Week)

Key AI crawlers (GPTBot, ClaudeBot, PerplexityBot) blocked
No llms.txt file present
Zero question-answering content blocks on key pages
Missing Organization or LocalBusiness schema
No author attribution on content pages
All content behind login/paywall with no preview

Medium (Fix Within 1 Month)

Partial AI crawler blocking (some allowed, some blocked)
llms.txt exists but is incomplete or malformed
Content blocks average a citability score below 50
Missing FAQ schema on pages with FAQ content
Thin author bios without credentials
No Wikipedia or Reddit brand presence

Low (Optimize When Possible)

Minor schema validation errors
Some images missing alt text
Content freshness issues on non-critical pages
Missing Open Graph tags
Suboptimal heading hierarchy on some pages
LinkedIn company page exists but is incomplete
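
One way to carry these four levels through to the report is an ordered enum, so issues can be sorted and grouped before writing each section. This is a sketch only: `Severity`, `Issue`, and `group_by_severity` are hypothetical names, and the labels are copied from the headings above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Fix windows as worded in the severity headings.
FIX_WINDOW = {
    Severity.CRITICAL: "Fix Immediately",
    Severity.HIGH: "Fix Within 1 Week",
    Severity.MEDIUM: "Fix Within 1 Month",
    Severity.LOW: "Optimize When Possible",
}


@dataclass
class Issue:
    description: str
    severity: Severity
    url: str = ""


def group_by_severity(issues: list[Issue]) -> dict[Severity, list[Issue]]:
    """Bucket issues by severity, most severe bucket first."""
    grouped: dict[Severity, list[Issue]] = {s: [] for s in Severity}
    for issue in issues:
        grouped[issue.severity].append(issue)
    return grouped
```

Iterating over the returned dictionary visits Critical first and Low last, matching the section order of the report.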

Output Format

Generate a file called GEO-AUDIT-REPORT.md with the following structure:

# GEO Audit Report: [Site Name]

**Audit Date:** [Date]
**URL:** [URL]
**Business Type:** [Detected Type]
**Pages Analyzed:** [Count]

---

## Executive Summary

**Overall GEO Score: [X]/100 ([Rating])**

[2-3 sentence summary of the site's GEO health, biggest strengths, and most critical gaps.]

### Score Breakdown

| Category | Score | Weight | Weighted Score |
|---|---|---|---|
| AI Citability | [X]/100 | 25% | [X] |
| Brand Authority | [X]/100 | 20% | [X] |
| Content E-E-A-T | [X]/100 | 20% | [X] |
| Technical GEO | [X]/100 | 15% | [X] |
| Schema & Structured Data | [X]/100 | 10% | [X] |
| Platform Optimization | [X]/100 | 10% | [X] |
| **Overall GEO Score** | | | **[X]/100** |

---

## Critical Issues (Fix Immediately)

[List each critical issue with specific page URLs and recommended fix]

## High Priority Issues

[List each high-priority issue with details]

## Medium Priority Issues

[List each medium-priority issue]

## Low Priority Issues

[List each low-priority issue]

---

## Category Deep Dives

### AI Citability ([X]/100)
[Detailed findings, examples of good/bad passages, rewrite suggestions]

### Brand Authority ([X]/100)
[Platform presence map, mention volume, sentiment]

### Content E-E-A-T ([X]/100)
[Author quality, source citations, freshness, depth]

### Technical GEO ([X]/100)
[Crawler access, llms.txt, rendering, headers]

### Schema & Structured Data ([X]/100)
[Schema types found, validation results, missing opportunities]

### Platform Optimization ([X]/100)
[Presence on YouTube, Reddit, Wikipedia, etc.]

---

## Quick Wins (Implement This Week)

1. [Specific, actionable quick win with expected impact]
2. [Another quick win]
3. [Another quick win]
4. [Another quick win]
5. [Another quick win]

## 30-Day Action Plan

### Week 1: [Theme]
- [ ] Action item 1
- [ ] Action item 2

### Week 2: [Theme]
- [ ] Action item 1
- [ ] Action item 2

### Week 3: [Theme]
- [ ] Action item 1
- [ ] Action item 2

### Week 4: [Theme]
- [ ] Action item 1
- [ ] Action item 2

---

## Appendix: Pages Analyzed

| URL | Title | GEO Issues |
|---|---|---|
| [url] | [title] | [issue count] |

Quality Gates

Page Limit: Never crawl more than 50 pages per audit. Prioritize high-value pages.
Timeout: 30-second maximum per page fetch. Skip pages that exceed this.
Robots.txt: Always check and respect robots.txt before crawling. Note any AI-specific directives.
Rate Limiting: Wait at least 1 second between page fetches to avoid overloading the server.
Error Handling: Log failed fetches but continue the audit. Report fetch failures in the appendix.
Content Type: Only analyze HTML pages. Skip PDFs, images, and other binary content.
Deduplication: Canonicalize URLs before crawling. Skip duplicate content (e.g., HTTP vs HTTPS, www vs non-www, trailing slashes).
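
The deduplication gate can be sketched with `urllib.parse`. The normalization choices are assumptions made for illustration: the scheme is forced to https, a leading `www.` is dropped, trailing slashes and fragments are removed, and any explicit port is discarded. `canonicalize` and `dedupe` are hypothetical helper names.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url: str) -> str:
    """Normalize a URL so HTTP/HTTPS, www, and trailing-slash variants match."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()  # hostname drops any port
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    # Keep the query string, drop the fragment.
    return urlunsplit(("https", host, path, parts.query, ""))


def dedupe(urls: list[str]) -> list[str]:
    """Return canonical URLs in first-seen order, without duplicates."""
    seen: set[str] = set()
    unique: list[str] = []
    for url in urls:
        canonical = canonicalize(url)
        if canonical not in seen:
            seen.add(canonical)
            unique.append(canonical)
    return unique
```

This only merges URL variants. Pages that declare a different canonical URL in their markup would still need to be reconciled after fetching.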

Business-Type-Specific Audit Adjustments

SaaS Sites

Extra weight on: Feature comparison tables (high citability), integration pages, documentation quality
Check for: API documentation structure, changelog pages, knowledge base organization
Key schema: SoftwareApplication, FAQPage, HowTo

Local Businesses

Extra weight on: NAP consistency, Google Business Profile signals, local schema
Check for: Service area pages, location-specific content, review markup
Key schema: LocalBusiness, GeoCoordinates, OpeningHoursSpecification

E-commerce Sites

Extra weight on: Product descriptions (citability), comparison content, buying guides
Check for: Product schema completeness, review aggregation, FAQ sections on product pages
Key schema: Product, AggregateRating, Offer, BreadcrumbList

Publishers

Extra weight on: Article quality, author credentials, source citation practices
Check for: Article schema, author pages, publication date freshness, original research
Key schema: Article, NewsArticle, Person (author), ClaimReview

Agency/Services

Extra weight on: Case studies (citability), expertise demonstration, thought leadership
Check for: Portfolio schema, team credentials, industry-specific expertise signals
Key schema: Organization, Service, Person (team), Review
