Diagnose SEO
Structured diagnostic framework for crawl issues, canonicalization errors, indexation problems, and rendering failures.
Diagnostic Approach
Technical SEO problems fall into four categories. Diagnose in this order — each layer depends on the previous one working correctly:
- Crawlability — Can search engines find and access the pages?
- Indexability — Are the pages allowed to be indexed?
- Renderability — Can search engines see the full content?
- Signals — Are the right signals (titles, structured data, links) in place?
Layer 1: Crawlability
Check these in order:
robots.txt
- Fetch `[domain]/robots.txt` and review the rules
- Look for overly broad `Disallow` rules blocking important paths
- Verify the `Sitemap:` directive points to the correct sitemap URL
- Check for different rules per user-agent (Googlebot vs others)
Common mistakes:
- `Disallow: /` blocking the entire site (often left from staging)
- Blocking CSS/JS files that Googlebot needs for rendering
- Blocking API or AJAX endpoints that load dynamic content
- Staging robots.txt accidentally deployed to production
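The robots.txt checks above can be sketched with Python's standard `urllib.robotparser`. This is a minimal sketch, not a full crawler: the `audit_robots` helper, the sample rules, and the test paths are all hypothetical, and the file content is passed in as a string rather than fetched.

```python
from urllib.robotparser import RobotFileParser


def audit_robots(robots_txt: str, paths: list[str], user_agent: str = "Googlebot") -> dict:
    """Report which paths are blocked for a user agent and which sitemaps are declared."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    blocked = [p for p in paths if not parser.can_fetch(user_agent, p)]
    # site_maps() returns None when no Sitemap: directive is present.
    return {"blocked": blocked, "sitemaps": parser.site_maps() or []}


# Hypothetical robots.txt content; in practice, fetch [domain]/robots.txt first.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Disallow: /assets/

Sitemap: https://example.com/sitemap.xml
"""

report = audit_robots(SAMPLE, ["/", "/blog/post", "/assets/app.js", "/admin/login"])
print(report)
```

Here `/assets/app.js` shows up as blocked, which is the "blocking CSS/JS" mistake from the list. One caveat: Python's parser applies the first matching rule in file order, while Google documents a most-specific-match rule, so results can differ when `Allow` and `Disallow` rules overlap.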
XML Sitemap
- Fetch the sitemap URL(s) and check:
  - Does it return 200? Is it valid XML?
  - Does it list all important pages?
  - Does it exclude pages that shouldn't be indexed (404s, redirects, noindex pages)?
  - Are `<lastmod>` dates accurate and recent?
  - For large sites: is there a sitemap index?
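A rough sketch of the sitemap checks, assuming the XML has already been fetched and uses the standard sitemaps.org namespace. The helper names, the 365-day freshness threshold, and the sample URLs are assumptions for illustration; "recent" depends on how often the site actually changes.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_sitemap(xml_text: str):
    """Return ("index" or "urlset", [(loc, lastmod), ...]); raises ParseError on invalid XML."""
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag.endswith("sitemapindex") else "urlset"
    child = "sm:sitemap" if kind == "index" else "sm:url"
    entries = []
    for node in root.findall(child, NS):
        loc = node.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = node.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod.strip() if lastmod else None))
    return kind, entries


def stale_entries(entries, today: date, max_age_days: int = 365):
    """Flag entries whose <lastmod> is missing, unparseable, in the future, or old."""
    flagged = []
    for loc, lastmod in entries:
        if lastmod is None:
            flagged.append((loc, "missing lastmod"))
            continue
        try:
            modified = date.fromisoformat(lastmod[:10])  # keep only YYYY-MM-DD
        except ValueError:
            flagged.append((loc, "unparseable lastmod"))
            continue
        age = (today - modified).days
        if age < 0:
            flagged.append((loc, "lastmod in the future"))
        elif age > max_age_days:
            flagged.append((loc, f"lastmod {age} days old"))
    return flagged


SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/old</loc><lastmod>2019-01-01T00:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/no-date</loc></url>
</urlset>
"""

kind, entries = parse_sitemap(SAMPLE)
flagged = stale_entries(entries, today=date(2024, 6, 1))
print(kind, flagged)
```

Checking that listed URLs return 200 and are not noindexed still requires fetching each one; this sketch only covers validity, structure, and dates.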
Site Architecture
- Pages should be reachable within 3 clicks from the homepage
- Check for orphan pages (no internal links pointing to them)
- Check for redirect chains (page A → B → C — should be A → C)
- Check for redirect loops
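Click depth and orphan pages can be computed from an internal-link graph. The sketch below assumes a crawl has already produced `links` (page to outgoing internal links) and a set of known pages, for example from the sitemap; both inputs and the function names are hypothetical.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Breadth-first search from the homepage; depth is the minimum number of clicks."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def audit_architecture(links, home, known_pages, max_depth=3):
    """Flag pages deeper than max_depth and pages with no inbound internal links."""
    depths = click_depths(links, home)
    linked_to = {target for targets in links.values() for target in targets}
    return {
        "too_deep": sorted(p for p, d in depths.items() if d > max_depth),
        "orphans": sorted(p for p in known_pages if p != home and p not in linked_to),
    }


# Hypothetical crawl data.
links = {"/": ["/a"], "/a": ["/b"], "/b": ["/c"], "/c": ["/d"]}
known_pages = {"/", "/a", "/b", "/c", "/d", "/lonely"}
result = audit_architecture(links, "/", known_pages)
print(result)
```

Breadth-first search is used because it visits pages in order of distance from the homepage, so the first depth recorded for a page is its shortest click path.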
Server Response
- Do all important pages return HTTP 200?
- Check for unexpected 301/302 redirects
- Check for soft 404s (page returns 200 but shows "not found" content)
- Verify HTTPS is enforced (HTTP should 301 to HTTPS)
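The response checks, together with the redirect chain and loop checks from Site Architecture, can be sketched over pre-collected crawl data. Everything here is an assumption for illustration: the `redirects` map (source URL to `Location` target), the function names, and especially the soft-404 phrases, which are a crude heuristic that will need tuning per site.

```python
def trace_redirects(redirects: dict[str, str], start: str, max_hops: int = 10):
    """Follow a URL through a redirect map; return (chain, is_loop)."""
    chain = [start]
    seen = {start}
    while chain[-1] in redirects and len(chain) <= max_hops:
        nxt = redirects[chain[-1]]
        chain.append(nxt)
        if nxt in seen:
            return chain, True
        seen.add(nxt)
    return chain, False


# Assumed heuristic phrases; real "not found" templates vary by site.
SOFT_404_PHRASES = ("page not found", "404 error", "no longer exists")


def classify_response(url: str, status: int, body: str, redirects: dict[str, str]) -> list[str]:
    issues = []
    chain, is_loop = trace_redirects(redirects, url)
    if is_loop:
        issues.append("redirect loop")
    elif len(chain) > 2:
        hops = len(chain) - 1
        issues.append(f"redirect chain of {hops} hops; point {chain[0]} straight to {chain[-1]}")
    if url.startswith("http://") and not redirects.get(url, "").startswith("https://"):
        issues.append("HTTP not redirected to HTTPS")
    if status == 200 and any(phrase in body.lower() for phrase in SOFT_404_PHRASES):
        issues.append("possible soft 404")
    return issues


# Hypothetical redirect map from a crawl.
redirects = {
    "http://example.com/a": "https://example.com/a",
    "https://example.com/a": "https://example.com/b",
    "https://example.com/b": "https://example.com/c",
}
print(classify_response("http://example.com/a", 301, "", redirects))
print(classify_response("https://example.com/gone", 200, "<h1>Page not found</h1>", {}))
```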
Layer 2: Indexability
Meta Robots / X-Robots-Tag
- Check for `<meta name="robots" content="noindex">` on pages that should be indexed
- Check HTTP headers for `X-Robots-Tag: noindex`
- Common cause: CMS accidentally applying noindex to pagination, tag pages, or new pages
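Both noindex sources can be checked from one fetched response. This is a minimal sketch using the standard `html.parser`; the `noindex_sources` helper and the plain-dict header format are assumptions, and it does not handle user-agent-scoped header values such as `googlebot: noindex`.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def noindex_sources(html: str, headers: dict[str, str]) -> list[str]:
    """Return where a noindex directive comes from: the meta tag, the header, or both."""
    sources = []
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    # "none" is shorthand for "noindex, nofollow".
    if "noindex" in parser.directives or "none" in parser.directives:
        sources.append("meta robots")
    header = next((v for k, v in headers.items() if k.lower() == "x-robots-tag"), "")
    header_directives = [d.strip().lower() for d in header.split(",")]
    if "noindex" in header_directives or "none" in header_directives:
        sources.append("X-Robots-Tag header")
    return sources


print(noindex_sources('<head><meta name="robots" content="NOINDEX, follow"></head>', {}))
```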
Canonical Tags
- Every page should have a `<link rel="canonical">` pointing to itself (self-referencing canonical)
- Check for canonical tags pointing to wrong pages (common in paginated content, filtered URLs)
- Check for conflicting signals: canonical says page A, but noindex is set, or the page redirects
Canonical diagnosis checklist:
- Does the canonical URL match the actual URL?
- Is the canonical URL accessible (returns 200)?
- Does the canonical URL have the same content?
- Is there only one canonical tag on the page?
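Two items on this checklist (URL match and tag count) can be verified from the HTML alone. A minimal sketch, assuming the page has already been fetched; `check_canonical` is a hypothetical helper, and confirming that the canonical target returns 200 with the same content still needs a second request.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def check_canonical(page_url: str, html: str) -> list[str]:
    parser = CanonicalParser()
    parser.feed(html)
    parser.close()
    if not parser.canonicals:
        return ["no canonical tag"]
    problems = []
    if len(parser.canonicals) > 1:
        problems.append(f"{len(parser.canonicals)} canonical tags found")
    # Resolve relative hrefs against the page URL before comparing.
    target = urljoin(page_url, parser.canonicals[0])
    if target != page_url:
        problems.append(f"canonical points elsewhere: {target}")
    return problems


print(check_canonical("https://example.com/shoes?sort=price", '<link rel="canonical" href="/shoes">'))
```

A canonical that points elsewhere is not always an error: on a filtered URL like the one above it is usually intentional, so treat the output as a list to review rather than a list of bugs.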
Duplicate Content
- Check for the same content accessible at multiple URLs:
  - With and without trailing slash (`/page` vs `/page/`)
  - With and without `www` (`example.com` vs `www.example.com`)
  - HTTP vs HTTPS
  - URL parameters creating duplicate pages (`?sort=price`, `?page=1`)
- Each duplicate set needs one canonical URL; all others should redirect or use canonical tags
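One way to find duplicate sets in a crawl export is to reduce every URL to a grouping key and collect URLs that share it. The sketch below is built on assumptions: the `IGNORED_PARAMS` list is hypothetical and must be adjusted per site, and the normalized form (HTTPS, no `www`, no trailing slash) is only a grouping key, not a recommendation for which variant should be canonical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed parameters that do not change page content; adjust per site.
IGNORED_PARAMS = {"sort", "utm_source", "utm_medium", "utm_campaign"}


def normalize(url: str) -> str:
    """Collapse scheme, www, trailing-slash, and ignorable-parameter variants."""
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")
    path = parts.path.rstrip("/") or "/"
    kept = [
        (k, v)
        for k, v in parse_qsl(parts.query)
        # page=1 duplicates the unparameterized URL; later pages are distinct.
        if k not in IGNORED_PARAMS and (k, v) != ("page", "1")
    ]
    return urlunsplit(("https", host, path, urlencode(kept), ""))


def duplicate_groups(urls: list[str]) -> dict[str, list[str]]:
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}


urls = [
    "http://example.com/page",
    "https://www.example.com/page/",
    "https://example.com/page?sort=price",
    "https://example.com/page?page=1",
    "https://example.com/page?page=2",
    "https://example.com/other",
]
print(duplicate_groups(urls))
```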
Layer 3: Renderability
JavaScript Rendering
- Does the page content appear in the raw HTML source? Or is it loaded via JavaScript?
- If JS-rendered: does Googlebot see the full content? (Use URL Inspection tool in Search Console)
- Check for content hidden behind click events, tabs, or accordions
- Check for lazy-loaded content that only appears on scroll
Core Content Visibility
- Is the main content in the initial HTML? Or loaded async after page load?
- Are important elements (titles, headings, product details) in the DOM on first render?
- Check for content that requires login or cookies to view
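A quick first test for JavaScript dependence is to check whether key phrases appear in the raw HTML before any script runs. This sketch assumes the raw response body is already available; `missing_from_raw_html` and the sample phrases are hypothetical, and a phrase found here says nothing about what Googlebot sees after rendering, so the URL Inspection tool remains the authoritative check.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping <script>, <style>, and <noscript> contents."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def missing_from_raw_html(raw_html: str, expected_phrases: list[str]) -> list[str]:
    """Return the phrases that do not appear in the text of the unrendered HTML."""
    parser = TextExtractor()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [p for p in expected_phrases if p.lower() not in text]


raw = '<html><body><div id="root"></div><script>render("Blue Widget")</script><h1>Shop</h1></body></html>'
print(missing_from_raw_html(raw, ["Shop", "Blue Widget"]))
```

In the sample, "Blue Widget" exists only inside a script, so it is reported as missing from the initial HTML.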
Layer 4: Signals
Title Tags
- Every page has a unique `<title>`
- Title includes the primary keyword
- Under 60 characters (to avoid truncation in SERPs)
- Descriptive and click-worthy
Meta Descriptions
- Every important page has a meta description
- 150-160 characters
- Includes target keyword and a value proposition
- Unique per page
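The title and meta description checks lend themselves to a small per-page linter. A minimal sketch, assuming the HTML is already fetched and the target keyword is known; `check_head` is a hypothetical helper, the length limits are the ones listed above, and uniqueness across pages would need a second pass over all collected titles and descriptions.

```python
from html.parser import HTMLParser


class HeadParser(HTMLParser):
    """Capture the <title> text and the meta description content."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attr.get("name") or "").lower() == "description":
            self.description = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def check_head(html: str, keyword: str) -> list[str]:
    parser = HeadParser()
    parser.feed(html)
    parser.close()
    title = " ".join(parser.title.split())
    issues = []
    if not title:
        issues.append("missing title")
    else:
        if len(title) >= 60:
            issues.append(f"title is {len(title)} characters (aim for under 60)")
        if keyword.lower() not in title.lower():
            issues.append("title lacks primary keyword")
    if parser.description is None:
        issues.append("missing meta description")
    elif not 150 <= len(parser.description) <= 160:
        issues.append(f"meta description is {len(parser.description)} characters (aim for 150-160)")
    return issues


page = '<head><title>Trail Running Shoes for Beginners</title><meta name="description" content="Short."></head>'
print(check_head(page, "running shoes"))
```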
Heading Structure
- One H1 per page containing the primary keyword
- Logical heading hierarchy (H1 → H2 → H3, no skips)
- Headings describe section content (not decorative)
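The H1 count and hierarchy rules can be checked mechanically. The sketch below uses a regular expression for brevity, which is an assumption that the markup is reasonably well formed; `check_headings` is a hypothetical helper, and judging whether a heading is descriptive still takes a human.

```python
import re

# Matches <h1>...</h1> through <h6>...</h6>, including attributes on the opening tag.
HEADING_RE = re.compile(r"<h([1-6])\b[^>]*>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)


def check_headings(html: str) -> list[str]:
    """Flag a missing or repeated H1 and any jump of more than one heading level."""
    levels = [int(match.group(1)) for match in HEADING_RE.finditer(html)]
    issues = []
    h1_count = levels.count(1)
    if h1_count != 1:
        issues.append(f"expected one H1, found {h1_count}")
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            issues.append(f"heading skips from H{prev} to H{cur}")
    return issues


print(check_headings("<h1>Guide</h1><h2>Setup</h2><h4>Details</h4>"))
```

Moving back up the hierarchy (H3 followed by H2) is not flagged, since that is how a new section normally starts.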
Structured Data
- Check for JSON-LD structured data appropriate to the page type
- Validate with Google's Rich Results Test
- Common types: Article, Product, FAQPage, HowTo, BreadcrumbList, Organization
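Before running a page through the Rich Results Test, it can help to confirm that JSON-LD is present and parses at all. A minimal sketch, assuming fetched HTML; `jsonld_types` is a hypothetical helper that only checks syntax and lists `@type` values, and it does not validate required properties.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        if tag == "script" and (dict(attrs).get("type") or "").lower() == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


def jsonld_types(html: str):
    """Return (list of @type values, number of blocks that failed to parse)."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    types, errors = [], 0
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            errors += 1
            continue
        if isinstance(data, dict):
            items = data.get("@graph", [data])
        elif isinstance(data, list):
            items = data
        else:
            items = []
        for item in items:
            if isinstance(item, dict) and "@type" in item:
                value = item["@type"]
                types.extend(value if isinstance(value, list) else [value])
    return types, errors


sample = (
    '<script type="application/ld+json">{"@context": "https://schema.org", '
    '"@type": "Product", "name": "Blue Widget"}</script>'
    '<script type="application/ld+json">{not json}</script>'
)
print(jsonld_types(sample))
```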
Hreflang (multilingual sites)
- Check for correct `hreflang` tags linking language variants
- Verify reciprocal tags (page A points to B, B points back to A)
- Check for an `x-default` tag
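Reciprocity is easy to get wrong by eye and simple to check once the annotations are collected. This sketch assumes a hypothetical `annotations` mapping from each crawled page URL to its `{hreflang value: alternate URL}` entries; an alternate that was never crawled is reported as not linking back, so crawl all variants first.

```python
def hreflang_issues(annotations: dict[str, dict[str, str]]) -> list[str]:
    """Flag pages without x-default and alternates that do not point back."""
    issues = []
    for page, alternates in annotations.items():
        if "x-default" not in alternates:
            issues.append(f"{page}: no x-default")
        for lang, target in alternates.items():
            if target == page:
                continue  # a page pointing at itself needs no return link
            if page not in annotations.get(target, {}).values():
                issues.append(f"{page}: {target} ({lang}) does not link back")
    return issues


# Hypothetical annotations: the German page omits both x-default and the link to English.
annotations = {
    "https://example.com/en/": {
        "en": "https://example.com/en/",
        "de": "https://example.com/de/",
        "x-default": "https://example.com/en/",
    },
    "https://example.com/de/": {"de": "https://example.com/de/"},
}
for issue in hreflang_issues(annotations):
    print(issue)
```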
Output Format
Technical SEO Diagnosis: [domain]
Summary
- Critical issues: [count]
- Warnings: [count]
- Passed checks: [count]
Findings by Layer
For each issue found:
| Layer | Issue | Severity | Affected Pages | Fix |
|---|---|---|---|---|
| Crawlability | robots.txt blocks /blog/ | Critical | All blog pages | Remove Disallow: /blog/ from robots.txt |
| Indexability | Missing canonical tags | Warning | 15 pages | Add self-referencing canonicals |
| ... | ... | ... | ... | ... |
Priority Fix List
Ordered by impact:
- [Critical fix] — affects [n] pages, blocks [crawling/indexing/ranking]
- [Warning fix] — affects [n] pages, reduces [signal quality]
- ...
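If the checks are scripted, the report template above can be filled in programmatically. A minimal sketch: the `Finding` record and `render_report` function are hypothetical names, the table keeps findings in the order given (by layer), and the priority list puts Critical items ahead of Warnings.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"Critical": 0, "Warning": 1}


@dataclass
class Finding:
    layer: str
    issue: str
    severity: str  # "Critical" or "Warning"
    affected: str
    fix: str


def render_report(domain: str, findings: list[Finding], passed: int) -> str:
    # sorted() is stable, so findings keep their input order within a severity.
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER.get(f.severity, 99))
    critical = sum(f.severity == "Critical" for f in findings)
    warnings = sum(f.severity == "Warning" for f in findings)
    lines = [
        f"Technical SEO Diagnosis: {domain}",
        "Summary",
        f"- Critical issues: {critical}",
        f"- Warnings: {warnings}",
        f"- Passed checks: {passed}",
        "Findings by Layer",
        "| Layer | Issue | Severity | Affected Pages | Fix |",
        "|---|---|---|---|---|",
    ]
    lines += [f"| {f.layer} | {f.issue} | {f.severity} | {f.affected} | {f.fix} |" for f in findings]
    lines.append("Priority Fix List")
    lines += [f"- {f.fix} ({f.severity}, affects {f.affected})" for f in ordered]
    return "\n".join(lines)


findings = [
    Finding("Indexability", "Missing canonical tags", "Warning", "15 pages", "Add self-referencing canonicals"),
    Finding("Crawlability", "robots.txt blocks /blog/", "Critical", "All blog pages", "Remove Disallow: /blog/ from robots.txt"),
]
print(render_report("example.com", findings, passed=12))
```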
Pro Tip: Run the free SEO Audit for a quick technical check, the Broken Link Checker to find dead links, and the Robots.txt Generator to fix crawl directives. SEOJuice MCP users can run `/seojuice:site-health` for a full technical report and `/seojuice:page-audit [domain] [url]` to drill into specific pages.