Wiki Lint — Health Audit
You are performing a health check on an Obsidian wiki. Your goal is to find and fix structural issues that degrade the wiki's value over time.
Before scanning anything: follow the Retrieval Primitives table in llm-wiki/SKILL.md. Prefer frontmatter-scoped greps and section-anchored reads over full-page reads. On a large vault, blindly reading every page to lint it is exactly what this framework is built to avoid.
Before You Start
- Read .env to get OBSIDIAN_VAULT_PATH
- Read index.md for the full page inventory
- Read log.md for recent activity context
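As a minimal sketch of the first step, the vault path can be pulled out of .env-style text with a few lines of Python. The function name `read_env_value` is hypothetical, and the parser assumes simple `KEY=VALUE` lines (optionally prefixed with `export` and optionally quoted); a real .env file may use features this does not handle.

```python
from typing import Optional


def read_env_value(env_text: str, key: str = "OBSIDIAN_VAULT_PATH") -> Optional[str]:
    """Return the value assigned to `key` in .env-style text, or None if absent."""
    for raw in env_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("export "):
            line = line[len("export "):]
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            value = value.strip()
            # Drop one pair of matching surrounding quotes, if present.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
                value = value[1:-1]
            return value
    return None
```

Typical use would be `read_env_value(Path(".env").read_text())`, with `Path` imported from `pathlib`.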
Lint Checks
Run these checks in order. Report findings as you go.
1. Orphaned Pages
Find pages with zero incoming wikilinks. These are knowledge islands that nothing connects to.
How to check:
- Glob all .md files in the vault
- For each page, Grep the rest of the vault for [[page-name]] references
- Pages with zero incoming links (except index.md and log.md) are orphans
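The check above can be sketched in Python over an in-memory map of page name to page body. The names `find_orphans` and `WIKILINK` are hypothetical, and the sketch assumes that a link target resolves to a page by its last path segment (so `[[concepts/foo]]` counts as a link to `foo`), which may not match how a given vault resolves links.

```python
import re

# Captures the target of [[target]], [[target|alias]] and [[target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def find_orphans(pages: dict[str, str], exempt=("index", "log")) -> list[str]:
    """pages maps a page name (file stem) to its markdown text."""
    linked = set()
    for name, body in pages.items():
        for target in WIKILINK.findall(body):
            stem = target.strip().split("/")[-1]
            if stem != name:  # a page linking to itself is not an incoming link
                linked.add(stem)
    return sorted(n for n in pages if n not in linked and n not in exempt)
```

For a vault on disk, the `pages` map could be built with `{p.stem: p.read_text(encoding="utf-8") for p in vault.rglob("*.md")}`, where `vault` is a `pathlib.Path`.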
How to fix:
- Identify which existing pages should link to the orphan
- Add wikilinks in appropriate sections
2. Broken Wikilinks
Find [[wikilinks]] that point to pages that don't exist.
How to check:
- Grep for \[\[.*?\]\] across all pages (escape the brackets so the regex engine does not read them as a character class)
- Extract the link targets
- Check if a corresponding .md file exists
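A rough Python version of these steps, again over an in-memory map of page name to body. `broken_links` is a hypothetical helper; it assumes targets resolve by their last path segment and skips same-page heading links such as `[[#section]]`.

```python
import re

# Matches [[target]], [[target#heading]] and [[target|alias]]; captures target only.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def broken_links(pages: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (page, line number, target) for every link to a missing page."""
    known = set(pages)
    found = []
    for name, body in sorted(pages.items()):
        for lineno, line in enumerate(body.splitlines(), start=1):
            for target in WIKILINK.findall(line):
                target = target.strip()
                if target.split("/")[-1] not in known:
                    found.append((name, lineno, target))
    return found
```

Keeping the line number makes it straightforward to report findings in the `file:line` form used in the health report.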
How to fix:
- If the target was renamed, update the link
- If the target should exist, create it
- If the link is wrong, remove or correct it
3. Missing Frontmatter
Every page should have: title, category, tags, sources, created, updated.
How to check:
- Grep frontmatter blocks (scope to ^--- at file heads) instead of reading every page in full
- Flag pages missing required fields
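One way to sketch this check without a YAML library is to read only the top-level keys of the leading `---` block. The helpers `frontmatter_keys` and `missing_fields` are hypothetical, and the parser is deliberately naive: it assumes top-level keys start in column 0 and treats an unterminated block as no frontmatter at all.

```python
REQUIRED = ("title", "category", "tags", "sources", "created", "updated")


def frontmatter_keys(text: str) -> set[str]:
    """Collect top-level keys from a leading '---' ... '---' block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Skip blanks, indented values, list items and comments.
        if not line or line[0].isspace() or line[0] in "-#":
            continue
        if ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return set()  # no closing '---': treat as missing frontmatter


def missing_fields(text: str, required=REQUIRED) -> list[str]:
    present = frontmatter_keys(text)
    return [field for field in required if field not in present]
```

Because only the head of each file is inspected, this stays in the spirit of scoping the check to frontmatter rather than reading page bodies.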
How to fix:
- Add missing fields with reasonable defaults
3a. Missing Summary (soft warning)
Every page should have a summary: frontmatter field of 1–2 sentences, at most 200 chars. This is what cheap retrieval (e.g. wiki-query's index-only mode) reads to avoid opening page bodies.
How to check:
- Grep frontmatter for ^summary: across the vault
- Flag pages without it as a soft warning, not an error. Older pages predating this field are fine; the check exists to nudge ingest skills into filling it on new writes.
- Also flag pages whose summary exceeds 200 chars.
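A small sketch of the summary check follows. `read_summary` and `summary_warning` are hypothetical names, and the code assumes the summary is written on a single line (not as a multi-line YAML scalar).

```python
from typing import Optional


def read_summary(text: str) -> Optional[str]:
    """Return the single-line summary: value from frontmatter, or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if line.startswith("summary:"):
            value = line[len("summary:"):].strip()
            # Drop one pair of matching surrounding quotes, if present.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
                value = value[1:-1]
            return value
    return None


def summary_warning(text: str, limit: int = 200) -> Optional[str]:
    """Return a soft-warning message, or None when the summary is fine."""
    summary = read_summary(text)
    if not summary:
        return "no summary: field"
    if len(summary) > limit:
        return f"summary exceeds {limit} chars ({len(summary)})"
    return None
```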
How to fix:
- Re-ingest the page, or manually write a short summary (1–2 sentences of the page's content).
4. Stale Content
Find pages whose updated timestamp is older than the last modification of their sources.
How to check:
- Compare page updated timestamps to source file modification times
- Flag pages where sources have been modified after the page was last updated
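The comparison itself is small enough to sketch directly. Both function names are hypothetical, and the sketch assumes the `updated` field has already been parsed into a `date` and that each source resolves to a file on disk.

```python
from datetime import date, datetime
from pathlib import Path


def source_mtime(path: Path) -> date:
    """Modification date of a source file, in local time."""
    return datetime.fromtimestamp(path.stat().st_mtime).date()


def is_stale(page_updated: date, source_dates: list[date]) -> bool:
    """True when any source was modified after the page was last updated."""
    return any(modified > page_updated for modified in source_dates)
```

A page with no resolvable sources is never reported as stale by this sketch, since there is nothing to compare against.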
5. Contradictions
Find claims that conflict across pages.
How to check:
- This requires reading related pages and comparing claims
- Focus on pages that share tags or are heavily cross-referenced
- Look for phrases like "however", "in contrast", "despite": they help separate contradictions a page already acknowledges from unacknowledged ones
How to fix:
- Add an "Open Questions" section noting the contradiction
- Reference both sources and their claims
6. Index Consistency
Verify index.md matches the actual page inventory.
How to check:
- Compare pages listed in index.md to actual files on disk
- Check that summaries in index.md still match page content
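The inventory comparison reduces to two set differences. In this sketch `index_diff` and `pages_on_disk` are hypothetical helpers; how page paths are extracted from index.md is left out because the index layout is not specified here.

```python
from pathlib import Path


def index_diff(listed: list[str], on_disk: list[str]) -> dict[str, list[str]]:
    """Compare the paths listed in index.md with the paths found on disk."""
    listed_set, disk_set = set(listed), set(on_disk)
    return {
        "not_in_index": sorted(disk_set - listed_set),
        "not_on_disk": sorted(listed_set - disk_set),
    }


def pages_on_disk(vault: Path, skip=("index.md", "log.md")) -> list[str]:
    """Vault-relative paths of all markdown pages, excluding bookkeeping files."""
    return sorted(
        p.relative_to(vault).as_posix()
        for p in vault.rglob("*.md")
        if p.name not in skip
    )
```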
7. Provenance Drift
Check whether pages are being honest about how much of their content is inferred vs extracted. See the Provenance Markers section in llm-wiki for the convention.
How to check:
- For each page with a provenance: block or any ^[inferred]/^[ambiguous] markers, count sentences/bullets and how many end with each marker
- Compute rough fractions (extracted, inferred, ambiguous)
- Apply these thresholds:
  - AMBIGUOUS > 15%: flag as "speculation-heavy". Even 1-in-7 claims being genuinely uncertain is a signal that the page needs tighter sourcing or should be moved to synthesis/
  - INFERRED > 40% and no sources: field in frontmatter: flag as "unsourced synthesis". The page is making connections but has nothing to cite
  - Hub pages (top 10 by incoming wikilink count) with INFERRED > 20%: flag as "high-traffic page with questionable provenance". Errors on hub pages propagate to every page that links to them
  - Drift: if the page has a provenance: frontmatter block, flag it when any field is more than 0.20 off from the recomputed value
- Skip pages with no provenance: frontmatter and no markers; they are treated as fully extracted by convention
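The counting and thresholds above can be sketched as follows. All three function names are hypothetical. The claim splitter is a simplifying assumption (one claim per non-empty, non-heading line), so pages with several sentences per line would need a finer splitter.

```python
from typing import Optional


def split_claims(body: str) -> list[str]:
    """Naive split: one claim per non-empty line that is not a heading."""
    return [
        line.strip()
        for line in body.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


def provenance_fractions(claims: list[str]) -> dict[str, float]:
    total = len(claims)
    if total == 0:
        return {"extracted": 1.0, "inferred": 0.0, "ambiguous": 0.0}
    inferred = sum(c.rstrip().endswith("^[inferred]") for c in claims)
    ambiguous = sum(c.rstrip().endswith("^[ambiguous]") for c in claims)
    return {
        "extracted": (total - inferred - ambiguous) / total,
        "inferred": inferred / total,
        "ambiguous": ambiguous / total,
    }


def provenance_flags(
    fractions: dict[str, float],
    has_sources: bool,
    is_hub: bool,
    declared: Optional[dict[str, float]] = None,
) -> list[str]:
    flags = []
    if fractions["ambiguous"] > 0.15:
        flags.append("speculation-heavy")
    if fractions["inferred"] > 0.40 and not has_sources:
        flags.append("unsourced synthesis")
    if is_hub and fractions["inferred"] > 0.20:
        flags.append("high-traffic page with questionable provenance")
    # Drift: a declared field differs from its recomputed value by more than 0.20.
    if declared and any(abs(declared[k] - fractions[k]) > 0.20 for k in declared if k in fractions):
        flags.append("drift")
    return flags
```

Whether a page is a hub (top 10 by incoming wikilink count) is passed in as `is_hub`, since that ranking needs the whole vault rather than a single page.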
How to fix:
- For ambiguous-heavy: re-ingest from sources, resolve the uncertain claims, or split speculative content into a synthesis/ page
- For unsourced synthesis: add sources: to frontmatter or clearly label the page as synthesis
- For hub pages with INFERRED > 20%: prioritize for re-ingestion, since errors here have the widest blast radius
- For drift: update the provenance: frontmatter to match the recomputed values
8. Fragmented Tag Clusters
Check whether pages that share a tag are actually linked to each other. Tags imply a topic cluster; if those pages don't reference each other, the cluster is fragmented: knowledge islands that should be woven together.
How to check:
- For each tag that appears on ≥ 5 pages:
  - n = count of pages with this tag
  - actual_links = count of wikilinks between any two pages in this tag group (check both directions)
  - cohesion = actual_links / (n × (n−1) / 2)
- Flag any tag group where cohesion < 0.15 and n ≥ 5
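A minimal sketch of the cohesion formula follows. The function names are hypothetical, and one interpretation is assumed: because the denominator counts unordered pairs, a pair of pages counts once if either page links to the other, which keeps cohesion between 0 and 1.

```python
from itertools import combinations


def cohesion(tagged_pages: list[str], links: dict[str, set[str]]) -> float:
    """links maps a page name to the set of page names it links to."""
    n = len(tagged_pages)
    if n < 2:
        return 0.0
    possible = n * (n - 1) / 2
    actual = sum(
        1
        for a, b in combinations(tagged_pages, 2)
        if b in links.get(a, set()) or a in links.get(b, set())
    )
    return actual / possible


def is_fragmented(n: int, value: float) -> bool:
    """Apply the flagging rule: cohesion < 0.15 and n >= 5."""
    return n >= 5 and value < 0.15
```

For example, five pages sharing a tag give 10 possible pairs; a single linked pair yields a cohesion of 0.10, which is below the 0.15 threshold and would be flagged.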
How to fix:
- Run the cross-linker skill targeted at the fragmented tag; it will surface and insert the missing links
- If a tag group is large (n > 15) and still fragmented, consider splitting it into more specific sub-tags
Output Format
Report findings as a structured list:
Wiki Health Report
Orphaned Pages (N found)
- concepts/foo.md — no incoming links
Broken Wikilinks (N found)
- entities/bar.md:15 — links to [[nonexistent-page]]
Missing Frontmatter (N found)
- skills/baz.md — missing: tags, sources
Stale Content (N found)
- references/paper-x.md — source modified 2024-03-10, page last updated 2024-01-05
Contradictions (N found)
- concepts/scaling.md claims "X" but synthesis/efficiency.md claims "not X"
Index Issues (N found)
- concepts/new-page.md exists on disk but not in index.md
Missing Summary (N found — soft)
- concepts/foo.md — no summary: field
- entities/bar.md — summary exceeds 200 chars
Provenance Issues (N found)
- concepts/scaling.md — AMBIGUOUS > 15%: 22% of claims are ambiguous (re-source or move to synthesis/)
- entities/some-tool.md — drift: frontmatter says inferred=0.10, recomputed=0.45
- concepts/transformers.md — hub page (31 incoming links) with INFERRED=28%: errors here propagate widely
- synthesis/speculation.md — unsourced synthesis: no sources: field, 55% inferred
Fragmented Tag Clusters (N found)
- #systems — 7 pages, cohesion=0.06 ⚠️ — run cross-linker on this tag
- #databases — 5 pages, cohesion=0.10 ⚠️
After Linting
Append to log.md:
- [TIMESTAMP] LINT issues_found=N orphans=X broken_links=Y stale=Z contradictions=W prov_issues=P missing_summary=S fragmented_clusters=F
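The log entry above could be assembled with a small formatter like the one below. `lint_log_line` is a hypothetical helper. The timestamp is passed in as a ready-made string because its format is not specified here, and any count that is not supplied defaults to 0, which is an assumption of this sketch.

```python
LOG_FIELDS = (
    "issues_found",
    "orphans",
    "broken_links",
    "stale",
    "contradictions",
    "prov_issues",
    "missing_summary",
    "fragmented_clusters",
)


def lint_log_line(timestamp: str, **counts: int) -> str:
    """Format one LINT entry with fields in the order used by log.md."""
    body = " ".join(f"{name}={counts.get(name, 0)}" for name in LOG_FIELDS)
    return f"- [{timestamp}] LINT {body}"
```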
Offer to fix issues automatically or let the user decide which to address.