Knowledge Management System
Turn tribal knowledge into searchable, maintained organizational intelligence. Stop losing expertise when people leave.
Phase 1: Knowledge Audit
Current State Assessment
Score each dimension 1-5 (1=nonexistent, 5=excellent):
| Dimension | What to measure | Score | Evidence |
|---|---|---|---|
| Documentation coverage | % of processes documented | | |
| Findability | Can a new hire find answers in <5 min? | | |
| Freshness | % of docs updated in the last 6 months | | |
| Contribution culture | % of team actively contributing | | |
| Onboarding effectiveness | Time to productivity for new hires | | |
| Knowledge retention | Impact when someone leaves | | |
| Cross-team sharing | Teams accessing other teams' knowledge | | |
Total Score: ___/35
Interpretation:
- 28-35: Mature — optimize and maintain
- 21-27: Developing — fill gaps systematically
- 14-20: Basic — needs foundational work
- 7-13: Critical — knowledge is at risk
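The total and its interpretation band can be computed with a small helper; a sketch (the function name is illustrative):

```python
def interpret_audit(scores):
    """Sum seven 1-5 dimension scores and map the total to a maturity band."""
    total = sum(scores)
    if total >= 28:
        band = "Mature"
    elif total >= 21:
        band = "Developing"
    elif total >= 14:
        band = "Basic"
    else:
        band = "Critical"
    return total, band
```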
Knowledge Risk Register
```yaml
knowledge_risk:
  single_points_of_failure:
    - person: "[Name]"
      unique_knowledge: "[What only they know]"
      risk_if_leaves: "high|medium|low"
      extraction_priority: 1
      extraction_method: "interview|shadowing|recording|pair-work"
  undocumented_processes:
    - process: "[Name]"
      frequency: "daily|weekly|monthly|quarterly"
      complexity: "high|medium|low"
      current_owner: "[Name]"
      documentation_priority: 1
  tribal_knowledge:
    - topic: "[What people 'just know']"
      holders: ["[Name1]", "[Name2]"]
      impact_area: "[What breaks without it]"
      capture_method: "interview|workshop|write-up"
```
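Once the register is filled in, extraction sessions can be scheduled in risk order. A sketch (field names follow the register above; the function is illustrative):

```python
RISK_ORDER = {"high": 0, "medium": 1, "low": 2}

def extraction_order(spofs):
    """Order single-point-of-failure entries: highest departure risk first,
    then by the manually assigned extraction_priority (1 = first)."""
    return sorted(
        spofs,
        key=lambda s: (RISK_ORDER[s["risk_if_leaves"]], s["extraction_priority"]),
    )
```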
Knowledge Extraction Interview Guide
For each single-point-of-failure person:
- Context: "I'm documenting [X] so the team isn't dependent on any one person. This protects you too — fewer interruptions."
- Process walk: "Walk me through [X] from start to finish. I'll record/note."
- Decision points: "Where do you make judgment calls? What factors do you consider?"
- Edge cases: "What are the weird situations that come up? How do you handle them?"
- Tools & access: "What tools, credentials, or access do you need?"
- History: "Why is it done this way? What was tried before?"
- Gotchas: "What are the things that trip people up?"
Output format: Write up as a runbook (see Phase 3 templates).
Phase 2: Knowledge Architecture
Taxonomy Design
```yaml
knowledge_taxonomy:
  # Level 1: Knowledge Types
  types:
    how_to:
      description: "Step-by-step procedures and guides"
      examples: ["Deploy to production", "Process a refund", "Set up dev environment"]
      template: "runbook"
    reference:
      description: "Facts, specs, configurations to look up"
      examples: ["API endpoints", "Config values", "Vendor contacts", "Pricing tables"]
      template: "reference_doc"
    explanation:
      description: "Why things work the way they do"
      examples: ["Architecture decisions", "Policy rationale", "Historical context"]
      template: "explainer"
    decision:
      description: "How to make specific judgment calls"
      examples: ["Escalation criteria", "Approval thresholds", "Priority frameworks"]
      template: "decision_tree"
    troubleshooting:
      description: "Diagnosis and fix for known problems"
      examples: ["Error codes", "Common failures", "Debug procedures"]
      template: "troubleshooting_guide"

  # Level 2: Domains (customize per org)
  domains:
    - engineering
    - product
    - sales
    - operations
    - finance
    - hr_people
    - customer_success
    - security
    - legal_compliance

  # Level 3: Topics (within each domain)
  # Example for engineering:
  engineering_topics:
    - architecture
    - deployment
    - monitoring
    - incident_response
    - development_workflow
    - testing
    - security
    - infrastructure
```
Information Architecture Rules
- Maximum 3 levels deep — if deeper, reorganize
- One canonical location per topic — link, don't duplicate
- Every page has an owner — no orphan docs
- Every page has a freshness date — reviewed within 6 months or flagged
- Cross-references over duplication — "See [X]" beats copy-paste
- Search-first design — assume people search, not browse
Naming Conventions
```
[DOMAIN]-[TYPE]-[TOPIC]-[SPECIFICS]
```
Examples:
```
eng-howto-deploy-production
eng-ref-api-endpoints-v3
sales-decision-pricing-enterprise
ops-troubleshoot-billing-failed-charges
product-explain-auth-architecture
```
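The convention is easy to enforce in a publish hook or CI check. A sketch (the allowed-value sets are illustrative and should mirror your taxonomy):

```python
import re

# Illustrative allowed values — replace with your org's domains and type codes.
DOMAINS = {"eng", "product", "sales", "ops", "finance", "hr", "cs", "security", "legal"}
TYPES = {"howto", "ref", "explain", "decision", "troubleshoot"}

NAME_RE = re.compile(
    r"^(?P<domain>[a-z]+)-(?P<type>[a-z]+)-(?P<rest>[a-z0-9]+(-[a-z0-9]+)*)$"
)

def valid_doc_name(name):
    """True if the name matches [DOMAIN]-[TYPE]-[TOPIC]-[SPECIFICS]."""
    m = NAME_RE.match(name)
    return bool(m) and m["domain"] in DOMAINS and m["type"] in TYPES
```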
Navigation Structure
```yaml
knowledge_base:
  homepage:
    - quick_links:       # Top 10 most-accessed pages
    - recently_updated:  # Last 10 changes
    - needs_review:      # Stale docs flagged
  by_audience:
    new_hire: "[Onboarding path → essential reading list]"
    engineer: "[Dev setup → architecture → deployment → debugging]"
    manager: "[Policies → processes → templates → reports]"
    customer_facing: "[Product knowledge → troubleshooting → escalation]"
  by_domain: "[Taxonomy Level 2 domains]"
  by_type: "[How-to | Reference | Explanations | Decisions | Troubleshooting]"
```
Phase 3: Document Templates
Runbook Template (How-To)
# [Title]: [Action verb] + [Object]
**Owner:** [Name]
**Last verified:** [YYYY-MM-DD]
**Estimated time:** [X minutes]
**Difficulty:** Easy | Medium | Advanced
## Prerequisites
- [ ] [Access/tool/permission needed]
- [ ] [Knowledge assumed]
## Steps
### 1. [First action]
[Specific instruction with exact commands, clicks, or actions]
> ⚠️ [Warning about common mistake at this step]
### 2. [Second action]
[Instructions]
**Expected result:** [What you should see/get]
### 3. [Continue...]
## Verification
- [ ] [How to confirm it worked]
- [ ] [What to check]
## Troubleshooting
| Problem | Likely Cause | Fix |
|---------|-------------|-----|
| [Symptom] | [Why] | [Steps] |
## Related
- [Link to related runbook]
- [Link to reference doc]
Reference Document Template
# [Subject] Reference
**Owner:** [Name]
**Last verified:** [YYYY-MM-DD]
**Scope:** [What this covers and doesn't cover]
## Overview
[1-2 sentence summary of what this reference contains]
## [Main content organized as tables, lists, or structured data]
| Item | Value | Notes |
|------|-------|-------|
| | | |
## Quick Lookup
[Most frequently needed items at the top]
## Change Log
| Date | Change | By |
|------|--------|-----|
| | | |
Architecture Decision Record (ADR)
# ADR-[NNN]: [Title]
**Status:** Proposed | Accepted | Deprecated | Superseded by ADR-[NNN]
**Date:** [YYYY-MM-DD]
**Deciders:** [Names]
## Context
[What situation or problem prompted this decision?]
## Decision
[What was decided and why?]
## Alternatives Considered
| Option | Pros | Cons | Why rejected |
|--------|------|------|-------------|
| [A] | | | |
| [B] | | | |
## Consequences
- **Positive:** [Benefits]
- **Negative:** [Tradeoffs accepted]
- **Risks:** [What could go wrong]
## Review Date
[When should this be revisited?]
Troubleshooting Guide Template
# Troubleshooting: [System/Process Name]
**Owner:** [Name]
**Last verified:** [YYYY-MM-DD]
## Quick Diagnostic
[Flowchart as text:]
Is [X] happening?
- YES → Go to Problem A
- NO → Is [Y] happening?
  - YES → Go to Problem B
  - NO → Go to Problem C
## Problem A: [Symptom Description]
**Likely causes (in order of probability):**
1. [Most common cause]
2. [Second most common]
3. [Rare but possible]
**Fix for Cause 1:**
[Step-by-step resolution]
**Fix for Cause 2:**
[Step-by-step resolution]
**Escalation:** If none of the above work → [who to contact, what info to provide]
## Problem B: [Next symptom]
[Same structure]
Decision Tree Template
# Decision Guide: [Topic]
**Owner:** [Name]
**Last verified:** [YYYY-MM-DD]
## When to use this guide
[Situation that triggers this decision]
## Decision Flow
### Step 1: [First question]
- **If [condition A]** → [Action/next step]
- **If [condition B]** → [Action/next step]
- **If unsure** → [Default action or escalation]
### Step 2: [Second question based on Step 1 answer]
[Continue branching]
## Override conditions
[When to ignore this guide and escalate instead]
## Examples
| Scenario | Decision | Reasoning |
|----------|----------|-----------|
| [Real example] | [What was decided] | [Why] |
Phase 4: Contribution System
Writing Standards
The 4C Test (every document must pass all four):
- Clear — Would a new hire understand this? No jargon without definitions.
- Correct — Has this been verified by doing/testing? Not from memory.
- Current — Does this reflect how things work TODAY? Not 6 months ago.
- Concise — Can anything be cut without losing meaning? Cut it.
Formatting rules:
- Headers: action-oriented ("Deploy to Production" not "Production Deployment")
- Steps: numbered, one action per step, imperative mood
- Warnings: callout boxes, before the step (not after)
- Code/commands: exact, copy-pasteable, tested
- Screenshots: only if truly needed (they go stale fast)
- Links: to canonical sources, never paste full URLs inline
Contribution Workflow
```yaml
contribution_workflow:
  create:
    trigger: "New knowledge identified (incident learnings, process change, new tool)"
    steps:
      - choose_template: "Match content type to template"
      - draft: "Write using template structure"
      - self_review: "Run 4C Test checklist"
      - peer_review: "SME validates accuracy"
      - publish: "Add to knowledge base in correct location"
      - announce: "Notify relevant teams/channels"
  update:
    trigger: "Existing doc is wrong, incomplete, or stale"
    steps:
      - flag: "Mark as needs-update with reason"
      - update: "Make changes, update 'Last verified' date"
      - review: "If significant change, get peer review"
      - publish: "Update in place"
      - notify: "If behavioral change, announce"
  retire:
    trigger: "Doc no longer relevant (deprecated system, changed process)"
    steps:
      - mark: "Status: Deprecated, add redirect to replacement"
      - archive: "Move to archive after 30 days"
      - redirect: "Ensure all links point to replacement"
```
Incentivizing Contributions
Making it easy (remove friction):
- Templates pre-filled with structure
- "Quick capture" channel — dump raw notes, someone structures later
- Post-incident: "What would have helped?" → becomes a doc
- Post-onboarding: new hire documents what was confusing
- Meeting notes → action items include "document [X]"
Making it visible (social proof):
- Monthly "top contributors" shoutout
- "Docs champion" rotating role — each sprint, one person owns doc health
- Include documentation in performance criteria
- Knowledge sharing in team meetings (5-min "TIL" segment)
Making it expected (cultural norms):
- "If you answered a question twice, write it down"
- PR template includes "Documentation updated? Y/N"
- Incident postmortem includes "Docs to create/update"
- Onboarding feedback includes "What couldn't you find?"
Phase 5: Search & Discovery
Search Optimization
Every document should be findable by:
- Title — descriptive, includes key terms
- Tags — domain, type, audience, technology
- Synonyms — include alternate terms people might search
- Problem description — "When [X] happens" phrasing
Tag schema:
```yaml
document_tags:
  domain: "[engineering|product|sales|ops|finance|hr|cs|security|legal]"
  type: "[howto|reference|explanation|decision|troubleshooting]"
  audience: "[all|engineering|management|customer-facing|new-hire]"
  technology: "[list relevant tools/systems]"
  status: "[current|needs-review|deprecated]"
  difficulty: "[beginner|intermediate|advanced]"
```
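The enumerated tag fields can be validated on publish. A sketch (the value sets mirror the schema above; `technology` is free-form, so it is excluded):

```python
# Illustrative allowed values per enumerated field — keep in sync with the tag schema.
TAG_SCHEMA = {
    "domain": {"engineering", "product", "sales", "ops", "finance",
               "hr", "cs", "security", "legal"},
    "type": {"howto", "reference", "explanation", "decision", "troubleshooting"},
    "audience": {"all", "engineering", "management", "customer-facing", "new-hire"},
    "status": {"current", "needs-review", "deprecated"},
    "difficulty": {"beginner", "intermediate", "advanced"},
}

def tag_errors(tags):
    """Return a list of (field, problem) pairs for a document's tags."""
    errors = []
    for field, allowed in TAG_SCHEMA.items():
        value = tags.get(field)
        if value is None:
            errors.append((field, "missing"))
        elif value not in allowed:
            errors.append((field, f"unknown value: {value}"))
    return errors
```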
Discovery Mechanisms
- Contextual links — Related docs linked at bottom of every page
- FAQ collections — Per-domain "frequently asked" with links to full docs
- Onboarding paths — Curated reading lists by role
- Slack/chat bot — "Ask the KB" — searches and returns relevant docs
- Weekly digest — "New & updated docs this week" email/message
- Error-page links — Application errors link to troubleshooting docs
Quality Signals
Prioritize search results by:
- Freshness — Recently updated > stale
- Verification — Peer-reviewed > unreviewed
- Usage — Frequently accessed > rarely accessed
- Completeness — Fully structured > quick notes
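The four signals can be blended into a single sortable score for result ranking. A sketch with illustrative weights and thresholds (nothing here is prescribed by the playbook):

```python
from datetime import date

def rank_score(doc, today=date(2024, 1, 1)):
    """Blend freshness, verification, usage, and completeness into one 0-1 score.
    Weights and saturation points are illustrative; tune against real searches."""
    days_old = (today - doc["last_verified"]).days
    freshness = max(0.0, 1.0 - days_old / 365)       # decays to 0 over a year
    verification = 1.0 if doc["peer_reviewed"] else 0.0
    usage = min(1.0, doc["views_per_week"] / 50)     # saturates at 50 views/week
    completeness = 1.0 if doc["uses_template"] else 0.5
    return (0.35 * freshness + 0.25 * verification
            + 0.25 * usage + 0.15 * completeness)
```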
Phase 6: Knowledge Capture Workflows
Post-Incident Knowledge Capture
After every incident:
- Immediate (within 24h): Raw timeline and resolution steps
- Postmortem (within 5 days): Root cause, contributing factors, action items
- Knowledge extraction (within 10 days):
  - New troubleshooting guide? → Create from postmortem
  - New runbook needed? → Create from resolution steps
  - Existing doc wrong? → Update with correct information
  - Architecture decision needed? → Write ADR
  - Monitoring gap? → Document what to monitor
Post-Meeting Knowledge Capture
Meeting types that MUST produce knowledge artifacts:
- Architecture review → ADR
- Process change → Updated runbook
- Strategy decision → Decision record
- Customer feedback pattern → Product knowledge update
- Retrospective → Process improvement doc
New Employee Knowledge Capture
First 30 days — new hire documents:
- What was confusing during onboarding
- Questions that weren't answered by existing docs
- Things that were wrong in existing docs
- Suggestions for improvement
Template for new hire feedback:
```yaml
onboarding_feedback:
  week: "[1|2|3|4]"
  couldnt_find:
    - topic: "[What they looked for]"
      where_looked: "[Where they searched]"
      how_resolved: "[Asked someone? Found eventually? Still unclear?]"
  wrong_or_outdated:
    - doc: "[Which document]"
      issue: "[What's wrong]"
  suggestions:
    - "[Free text improvements]"
```
Exit Knowledge Transfer
When someone is leaving:
- Identify unique knowledge — What do they know that no one else does?
- Schedule extraction sessions — 1-2 hours per major topic area
- Record if possible — Video walkthroughs of complex processes
- Pair them — Have successor shadow for final 2 weeks
- Review their authored docs — Are they complete? Assign new owners
- Document tribal knowledge — "Why" questions only they can answer
Phase 7: Maintenance & Freshness
Freshness Policy
```yaml
freshness_policy:
  review_frequency:
    critical_operations: "quarterly"     # Deployment, incident response, security
    standard_processes: "semi-annually"  # Regular workflows
    reference_docs: "annually"           # Specs, contacts, architecture
    explanations: "annually"             # Background, history, rationale
  review_process:
    - owner_notified: "2 weeks before due date"
    - review_actions:
        - verify: "Is this still accurate? Test/confirm."
        - update: "Fix any outdated information"
        - stamp: "Update 'Last verified' date"
        - skip: "If can't review, reassign or flag"
    - escalation: "Unreviewed after 30 days → manager notified"
    - stale_threshold: "2x review period without update → flagged as stale"
```
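The policy above reduces to a date calculation per document class. A sketch (day counts approximate the stated frequencies; the 2x-period stale threshold matches the policy):

```python
from datetime import date, timedelta

REVIEW_DAYS = {  # approximate day counts for the review frequencies above
    "critical_operations": 91,   # quarterly
    "standard_processes": 182,   # semi-annually
    "reference_docs": 365,       # annually
    "explanations": 365,         # annually
}

def review_status(doc_class, last_verified, today):
    """Return 'current' (within policy), 'needs_review' (past due),
    or 'stale' (more than 2x the review period since last verification)."""
    period = REVIEW_DAYS[doc_class]
    if today <= last_verified + timedelta(days=period):
        return "current"
    if today <= last_verified + timedelta(days=2 * period):
        return "needs_review"
    return "stale"
```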
Content Health Dashboard
```yaml
kb_health:
  date: "[YYYY-MM-DD]"
  coverage:
    total_documents: 0
    by_type:
      howto: 0
      reference: 0
      explanation: 0
      decision: 0
      troubleshooting: 0
    by_domain: {}
    gaps_identified: []
  freshness:
    current: 0       # Reviewed within policy
    needs_review: 0  # Due for review
    stale: 0         # Past review deadline
    deprecated: 0
    freshness_rate: "0%"  # current / (current + needs_review + stale)
  quality:
    peer_reviewed: "0%"
    using_templates: "0%"
    has_owner: "0%"
    has_tags: "0%"
  usage:
    searches_per_week: 0
    failed_searches: 0  # Searches with no results
    top_10_pages: []
    pages_never_accessed: 0
  contribution:
    docs_created_this_month: 0
    docs_updated_this_month: 0
    unique_contributors: 0
    contribution_rate: "0%"  # contributors / total team size
```
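The two derived dashboard metrics follow directly from the formulas in the comments above. A sketch (the function name is illustrative):

```python
def dashboard_rates(freshness, contributors, team_size):
    """Compute freshness_rate and contribution_rate as percentages.
    freshness_rate = current / (current + needs_review + stale);
    contribution_rate = contributors / team size."""
    reviewable = freshness["current"] + freshness["needs_review"] + freshness["stale"]
    freshness_rate = 100 * freshness["current"] / reviewable if reviewable else 0.0
    contribution_rate = 100 * contributors / team_size if team_size else 0.0
    return round(freshness_rate, 1), round(contribution_rate, 1)
```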
Quarterly Knowledge Review
Agenda (60 min):
- Dashboard review (10 min) — health metrics trend
- Gap analysis (15 min) — what's missing? What questions keep being asked?
- Stale doc triage (15 min) — update, deprecate, or reassign owners
- Failed searches review (10 min) — what are people searching for and not finding?
- Process improvements (10 min) — what's working, what isn't?
Phase 8: Knowledge-Driven Automation
Automated Knowledge Triggers
```yaml
automation_triggers:
  incident_resolved:
    action: "Create task: 'Write troubleshooting guide for [incident title]'"
    assignee: "Incident commander"
    due: "+10 days"
  new_hire_started:
    action: "Generate personalized onboarding reading list from KB by role"
  doc_stale:
    action: "Notify owner, CC manager if unreviewed after 14 days"
  repeated_question:
    threshold: "Same question asked 3+ times in support/Slack"
    action: "Create task: 'Document answer to [question]'"
  process_changed:
    trigger: "PR merged that changes workflow/process"
    action: "Check if related docs need updating, create task if yes"
  failed_search:
    threshold: "Same search term fails 5+ times/week"
    action: "Flag as gap, create task to write missing doc"
```
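The repeated-question trigger is a simple frequency count over normalized question text. A sketch (real systems would need fuzzier matching; exact normalization is an assumption here):

```python
from collections import Counter

def gap_candidates(questions, threshold=3):
    """Return normalized questions asked `threshold`+ times — each is a
    candidate for a 'document the answer' task. Normalization here is just
    trim + lowercase; production use would want semantic deduplication."""
    counts = Counter(q.strip().lower() for q in questions)
    return sorted(q for q, n in counts.items() if n >= threshold)
```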
Knowledge-Powered Chatbot Design
```yaml
kb_chatbot:
  flow:
    1_receive_question: "User asks in designated channel"
    2_search: "Semantic search across KB"
    3_respond:
      found_match: "Return relevant doc link + summary"
      partial_match: "Return closest docs + 'Did you mean...?'"
      no_match: "Log as gap, route to human expert, create doc task"
    4_feedback: "Was this helpful? 👍/👎"
    5_improve: "Use feedback to tune search, identify doc improvements"
  sources:
    - knowledge_base_docs
    - slack_saved_answers  # Curated from Slack threads
    - incident_postmortems
    - meeting_notes_tagged_as_knowledge
```
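The respond step is a three-way dispatch on search confidence. A sketch (the 0.8 relevance threshold and tuple shape are assumptions, not part of the design above):

```python
def respond(matches):
    """Route a KB search result per the chatbot flow.
    `matches` is a list of (doc_title, relevance 0-1) pairs, best first."""
    if not matches:
        return ("no_match", "Logged as gap; routing to a human expert.")
    title, score = matches[0]
    if score >= 0.8:  # illustrative confidence cutoff
        return ("found_match", f"See: {title}")
    return ("partial_match", f"Did you mean: {title}?")
```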
Phase 9: Cross-Team Knowledge Sharing
Knowledge Sharing Mechanisms
| Mechanism | Frequency | Format | Audience |
|---|---|---|---|
| "TIL" channel | Daily | Short post (1-3 sentences + link) | All |
| Brown bag lunch | Bi-weekly | 20-min presentation + Q&A | Cross-team |
| Architecture review | Monthly | 45-min deep dive + ADR | Engineering |
| Customer insight share | Monthly | Top 5 patterns + implications | Product + CS + Sales |
| Postmortem review | Per incident | Written + optional walkthrough | Engineering + ops |
| New tool/technique demo | As needed | 15-min demo + doc link | Relevant teams |
| Quarterly knowledge review | Quarterly | Dashboard + gap analysis | Leadership |
Cross-Team Knowledge Map
```yaml
knowledge_map:
  engineering:
    produces: ["Architecture docs", "Runbooks", "API specs", "ADRs"]
    consumes_from:
      product: ["PRDs", "User research", "Roadmap"]
      customer_success: ["Bug patterns", "Feature requests", "Usage data"]
      sales: ["Technical requirements", "Integration needs"]
  product:
    produces: ["PRDs", "User research", "Roadmap", "Release notes"]
    consumes_from:
      engineering: ["Technical feasibility", "Architecture constraints"]
      customer_success: ["Feature requests", "Churn reasons"]
      sales: ["Deal requirements", "Competitive intel"]
  customer_success:
    produces: ["FAQ", "Troubleshooting guides", "Best practices"]
    consumes_from:
      engineering: ["Release notes", "Known issues"]
      product: ["Feature docs", "Roadmap"]
  sales:
    produces: ["Battlecards", "Competitive intel", "Use case docs"]
    consumes_from:
      product: ["Feature docs", "Roadmap", "Pricing"]
      customer_success: ["Case studies", "Success metrics"]
      engineering: ["Technical capabilities", "Integration docs"]
```
Phase 10: Metrics & ROI
Knowledge Management KPIs
| Metric | Target | Measurement |
|---|---|---|
| Time to answer | <5 min for documented topics | Sample timing tests |
| New hire time to productivity | Reduce by 30% | First solo task date |
| Repeated questions | Decrease 50% in 6 months | Support ticket analysis |
| Doc coverage | >80% of critical processes | Audit against process list |
| Freshness rate | >85% within review policy | Dashboard metric |
| Contribution rate | >40% of team contributing monthly | Contributor count |
| Search success rate | >80% find what they need | Search analytics |
| Failed search rate | <10% of searches | Search analytics |
| Knowledge reuse | >60% of team using KB weekly | Usage analytics |
ROI Calculation
Knowledge Management ROI:
- Time Saved:
  - Reduced question-answering = [hours/week] × [avg hourly cost] × 52
  - Faster onboarding = [weeks saved] × [new hires/year] × [weekly cost]
  - Faster incident resolution = [hours saved/incident] × [incidents/year] × [hourly cost]
- Risk Reduced:
  - Key-person dependency = [probability of departure] × [knowledge reconstruction cost]
  - Compliance documentation = [audit prep hours saved] × [hourly cost]
- Quality Improved:
  - Fewer repeated mistakes = [error rate reduction] × [cost per error]
  - Consistent processes = [variance reduction] × [rework cost]

Total Annual Value = Time Saved + Risk Reduced + Quality Improved
Investment = Tool cost + Time spent maintaining KB + Training
ROI = (Total Annual Value − Investment) / Investment × 100
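The final formula is straightforward to encode; a sketch (the function name is illustrative, all inputs are annual dollar figures):

```python
def km_roi(time_saved, risk_reduced, quality_improved, investment):
    """ROI as a percentage: (total annual value - investment) / investment x 100."""
    total_value = time_saved + risk_reduced + quality_improved
    return 100 * (total_value - investment) / investment
```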
Phase 11: Scoring & Quality
Document Quality Rubric (0-100)
| Dimension | Weight | 0-2 (Poor) | 3-5 (Adequate) | 6-8 (Good) | 9-10 (Excellent) |
|---|---|---|---|---|---|
| Accuracy | 20% | Unverified, possibly wrong | Mostly correct | Verified, accurate | Tested, peer-reviewed |
| Completeness | 15% | Major gaps | Covers basics | Comprehensive | Edge cases included |
| Clarity | 15% | Confusing, jargon-heavy | Understandable | Clear, well-structured | A new hire gets it |
| Findability | 10% | No tags, bad title | Some tags | Good tags, clear title | Synonyms, cross-refs |
| Freshness | 15% | >12 months stale | Within annual review | Within semi-annual | Within quarterly |
| Template compliance | 10% | No structure | Partial template | Full template | Template + extras |
| Actionability | 10% | Theory only | Some steps | Clear steps | Copy-paste ready |
| Ownership | 5% | No owner | Owner assigned | Owner active | Owner + backup |
Score interpretation:
- 90-100: Exemplary — reference model for other docs
- 75-89: Good — meets standards
- 60-74: Acceptable — needs minor improvements
- 40-59: Below standard — needs significant work
- 0-39: Critical — rewrite from scratch
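Since the rubric weights sum to 100% and each dimension is scored 0-10, the overall score is a weighted sum scaled to 0-100. A sketch (dimension keys are illustrative shorthand for the table rows):

```python
WEIGHTS = {  # mirrors the rubric table; each dimension is scored 0-10
    "accuracy": 0.20, "completeness": 0.15, "clarity": 0.15,
    "findability": 0.10, "freshness": 0.15, "template": 0.10,
    "actionability": 0.10, "ownership": 0.05,
}

def quality_score(dims):
    """Weighted 0-100 document quality score from per-dimension 0-10 scores."""
    return round(sum(WEIGHTS[d] * score * 10 for d, score in dims.items()), 1)
```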
Knowledge Base Health Score (0-100)
| Dimension | Weight | Metric |
|---|---|---|
| Coverage | 20% | % of critical processes documented |
| Freshness | 20% | % of docs within review policy |
| Quality | 15% | Average document quality score |
| Usage | 15% | % of team using KB weekly |
| Contribution | 15% | % of team contributing monthly |
| Search effectiveness | 15% | % of searches finding results |
Edge Cases
Small Team (<10 people)
- Start with a single shared doc/wiki, not a full KB platform
- Focus on: runbooks for critical processes, onboarding guide, decision log
- One person owns KB health (part-time, not full-time)
- Review quarterly, not monthly
Remote/Distributed Teams
- Default to written over verbal knowledge sharing
- Record important meetings/decisions (not all meetings)
- Async-first: every decision documented, not just discussed
- Time zone coverage: ensure docs cover "what to do when the expert is asleep"
Rapid Growth (Doubling in 6 months)
- Prioritize onboarding docs above all else
- Implement "new hire documents what they learn" from day 1
- Assign knowledge buddies — each new person paired with a doc mentor
- Weekly new-hire cohort Q&A → captured and documented
Regulated Industry
- Map compliance requirements to documentation requirements
- Version control with audit trail (who changed what, when)
- Approval workflows for regulated content
- Retention policies aligned with regulations
Post-Merger/Acquisition
- Map both organizations' knowledge structures
- Identify overlaps and gaps
- Prioritize: "how do we work NOW" docs over historical
- Freeze archives of legacy systems/processes
Migrating from Scattered Docs
- Don't try to migrate everything — start fresh with new structure
- Import only: still-accurate, frequently-used docs
- Redirect old locations to new ones
- Set a sunset date for old system
- "If it's not in the new KB, it doesn't exist" (after migration period)
Natural Language Commands
| Command | Action |
|---|---|
| "Audit our knowledge management" | Run Phase 1 assessment, generate risk register |
| "Design our KB structure" | Create taxonomy and navigation architecture |
| "Write a runbook for [X]" | Generate using runbook template |
| "Write an ADR for [X]" | Generate architecture decision record |
| "Create a troubleshooting guide for [X]" | Generate using troubleshooting template |
| "Review KB health" | Generate health dashboard and identify gaps |
| "Plan knowledge extraction for [person]" | Generate interview guide and schedule |
| "Set up freshness tracking" | Create review schedule and notification rules |
| "Design onboarding knowledge path for [role]" | Curate reading list from KB |
| "Analyze failed searches" | Review search gaps and create tasks |
| "Generate quarterly KB report" | Full metrics dashboard with recommendations |
| "Plan KB migration from [source]" | Create migration plan with prioritization |