root-cause-analysis

Systematic problem solving using Fishbone (Ishikawa) diagrams and 5 Whys technique. Identifies true root causes and recommends effective corrective actions.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "root-cause-analysis" with this command: npx skills add melodic-software/claude-code-plugins/melodic-software-claude-code-plugins-root-cause-analysis

Root Cause Analysis

Systematic problem solving using Fishbone (Ishikawa) diagrams and 5 Whys technique. Identifies true root causes and recommends effective corrective actions.

What is Root Cause Analysis?

Root Cause Analysis (RCA) is a structured approach to identify the fundamental cause(s) of problems, rather than addressing symptoms. Effective RCA prevents problem recurrence.

Approach Focus Best For

Fishbone (Ishikawa) Brainstorming all potential causes by category Complex problems with multiple potential causes

5 Whys Drilling down through cause chains Problems with linear cause-effect relationships

Combined Fishbone to identify, 5 Whys to drill down Comprehensive root cause identification

Framework 1: Fishbone (Ishikawa) Diagram

What is a Fishbone Diagram?

A Fishbone diagram (also called Ishikawa or cause-and-effect diagram) organizes potential causes into categories, resembling a fish skeleton with the problem as the "head."

    ┌─────────────────────────────────────────────────────────┐
    │                                                         │

Man ──┼──┐ ┌──┼── Method │ │ │ │ │ └─────────────────────┐ ┌───────────────┘ │ │ │ │ │ │ ▼ ▼ │ │ ┌─────────┐ │ │ │ PROBLEM │ │ │ └─────────┘ │ │ ▲ ▲ │ │ ┌─────────────────────┘ └───────────────┐ │ │ │ │ │ Machine ┼──┘ └──┼── Material │ │ │ │ Measurement ────────────────────────────────────────────── Mother Nature │ │ └───────────────────────────────────────────────┘

The 6M Categories

Category Also Called Example Causes

Man People, Personnel Training, skills, fatigue, communication

Machine Equipment, Technology Hardware, software, tools, maintenance

Method Process, Procedure Workflow, documentation, standards

Material Inputs, Data Raw materials, components, information quality

Measurement Metrics, Inspection Calibration, accuracy, data collection

Mother Nature Environment Temperature, humidity, regulations, market

Alternative Category Sets

Domain Categories

Manufacturing (6M) Man, Machine, Method, Material, Measurement, Mother Nature

Service (8P) Product, Price, Place, Promotion, People, Process, Physical Evidence, Productivity

Software (5S) Systems, Skills, Suppliers, Surroundings, Safety

Custom Define categories relevant to your domain

Fishbone Workflow

Step 1: Define the Problem

Problem Statement

Problem: [Clear, specific, measurable description] Impact: [Quantified impact - cost, time, quality, safety] When Discovered: [Date/time] Where: [Location, system, process] Frequency: [One-time / Recurring / Intermittent]

Good problem statements:

  • ✅ "Customer orders delayed by 3+ days increased 40% in Q4"

  • ✅ "Production line 3 defect rate rose from 2% to 8% in November"

  • ❌ "Things are slow" (too vague)

  • ❌ "The system is broken" (not measurable)

Step 2: Assemble the Team

Include people with direct knowledge:

Role Contribution

Process Owner Overall accountability

Front-line Workers Direct observation and experience

Subject Matter Experts Technical knowledge

Customer Representative Impact perspective

Facilitator Guide the process

Step 3: Brainstorm Causes by Category

For each category, ask: "What in this category could cause the problem?"

Cause Brainstorming

Man (People)

Potential CauseEvidenceLikelihood
Insufficient trainingNew hires 60% of errorsHigh
Fatigue (overtime hours)Errors peak after 8 hoursMedium
Communication gapsHandoff errors documentedHigh

Machine (Equipment)

Potential CauseEvidenceLikelihood
Aging equipmentMachine 7 has 40% of failuresHigh
Software bugsVersion 2.1 introduced defectsMedium

Method (Process)

Potential CauseEvidenceLikelihood
Unclear procedures3 different approaches observedHigh
Missing quality checkNo verification step documentedHigh

Material (Inputs)

Potential CauseEvidenceLikelihood
Supplier quality issuesIncoming defect rate up 15%Medium
Specification ambiguityMultiple interpretations possibleLow

Measurement (Metrics)

Potential CauseEvidenceLikelihood
Inaccurate sensorsCalibration overdueMedium
Wrong metrics trackedLeading indicators missingLow

Mother Nature (Environment)

Potential CauseEvidenceLikelihood
Seasonal demand spikeQ4 always highest volumeHigh
Regulatory changesNew requirements in OctoberLow

Step 4: Identify Most Likely Root Causes

Prioritize based on:

Criterion Question

Evidence Is there data supporting this cause?

Frequency Does this cause relate to problem frequency?

Correlation Does the cause timing match problem timing?

Control Can we test or eliminate this cause?

Step 5: Validate Root Causes

For each high-likelihood cause:

Root Cause Validation

CauseValidation MethodResultConfirmed?
Insufficient trainingCompare error rates by tenureNew hires 3x error rate✅ Yes
Aging equipmentAnalyze failures by machineMachine 7: 40% of all failures✅ Yes
Unclear proceduresReview documentation3 conflicting SOPs found✅ Yes

Fishbone Output Format

Fishbone Analysis: [Problem]

Date: [ISO date] Facilitator: rca-analyst Team: [Participants]

Problem Statement

[Clear, specific, measurable problem]

Impact

[Quantified business impact]

Cause Analysis

Man (People)

  • [Cause 1] - Evidence: [X] - Status: Confirmed/Suspected
  • [Cause 2] - Evidence: [X] - Status: Confirmed/Suspected

Machine (Equipment)

  • [Cause 1] - Evidence: [X] - Status: Confirmed/Suspected

Method (Process)

  • [Cause 1] - Evidence: [X] - Status: Confirmed/Suspected

Material (Inputs)

  • [Cause 1] - Evidence: [X] - Status: Confirmed/Suspected

Measurement (Metrics)

  • [Cause 1] - Evidence: [X] - Status: Confirmed/Suspected

Mother Nature (Environment)

  • [Cause 1] - Evidence: [X] - Status: Confirmed/Suspected

Confirmed Root Causes

#Root CauseCategoryEvidenceContribution
1[Description][6M][Data][% of problem]

Recommended Actions

#ActionAddressesOwnerDue Date
1[Corrective action]Root Cause #1[Name][Date]

Fishbone Mermaid Diagram

flowchart LR subgraph Man["👤 Man"] M1[Training gaps] M2[Communication issues] end

subgraph Machine["⚙️ Machine"]
    MA1[Aging equipment]
    MA2[Software bugs]
end

subgraph Method["📋 Method"]
    ME1[Unclear procedures]
    ME2[Missing QC step]
end

subgraph Material["📦 Material"]
    MT1[Supplier quality]
    MT2[Specification issues]
end

subgraph Measurement["📊 Measurement"]
    MS1[Sensor accuracy]
    MS2[Wrong metrics]
end

subgraph Environment["🌍 Environment"]
    E1[Seasonal demand]
    E2[Regulatory changes]
end

M1 & M2 --> Problem
MA1 & MA2 --> Problem
ME1 & ME2 --> Problem
MT1 & MT2 --> Problem
MS1 & MS2 --> Problem
E1 & E2 --> Problem

Problem[❌ PROBLEM:<br/>Customer delays<br/>increased 40%]

style Problem fill:#f99,stroke:#900

Framework 2: 5 Whys

What is 5 Whys?

5 Whys is an iterative interrogative technique that drills down from symptoms to root causes by repeatedly asking "Why?" (typically 5 times, but may be fewer or more).

Key Principle: Each answer becomes the subject of the next "Why?"

5 Whys Workflow

Step 1: State the Problem

5 Whys Analysis

Problem: [Specific problem statement] Date: [ISO date] Analyst: 5whys-analyst

Step 2: Ask Why Iteratively

Why Chain

Problem: The website crashed during the product launch.

Why 1: Why did the website crash? → The server ran out of memory.

Why 2: Why did the server run out of memory? → The application had a memory leak.

Why 3: Why did the application have a memory leak? → The new feature wasn't properly tested under load.

Why 4: Why wasn't the new feature tested under load? → We don't have load testing in our CI/CD pipeline.

Why 5: Why don't we have load testing in CI/CD? → It was never prioritized when the pipeline was set up.

Root Cause: Load testing was not included in CI/CD pipeline during initial setup.

Step 3: Validate the Root Cause

Validation Check Question Pass?

Specific Is the root cause specific and actionable? ☐

Systemic Does addressing this prevent recurrence? ☐

Controllable Can we actually fix this? ☐

Chain Valid Does each why-answer logically lead to the next? ☐

5 Whys Best Practices

Do Don't

✅ Focus on process/system causes ❌ Blame individuals

✅ Base answers on facts and evidence ❌ Speculate without data

✅ Keep asking until actionable root found ❌ Stop at symptoms

✅ Document the chain for future reference ❌ Accept "human error" as root cause

✅ Consider multiple parallel chains ❌ Assume single root cause

Multiple 5 Why Chains

Complex problems often have multiple contributing causes. Run parallel 5 Whys chains:

Multiple Why Chains

Problem: Customer complaints increased 50%

Chain A: Response Time

Why 1: Response time increased → Support queue grew Why 2: Queue grew → Fewer agents available Why 3: Fewer agents → High turnover Why 4: High turnover → Poor working conditions Why 5: Poor conditions → No ergonomic equipment Root Cause A: Lack of investment in workplace ergonomics

Chain B: First-Contact Resolution

Why 1: FCR dropped → Agents lack knowledge Why 2: Lack knowledge → Training inadequate Why 3: Training inadequate → Product changes not communicated Why 4: Not communicated → No process for training updates Root Cause B: Missing process for training on product changes

Chain C: (Additional chains as needed)

5 Whys Output Format

5 Whys Analysis: [Problem]

Date: [ISO date] Analyst: 5whys-analyst Problem: [Statement]

Why Chain

LevelQuestionAnswerEvidence
Why 1Why [problem]?[Answer][Data]
Why 2Why [answer 1]?[Answer][Data]
Why 3Why [answer 2]?[Answer][Data]
Why 4Why [answer 3]?[Answer][Data]
Why 5Why [answer 4]?[Answer][Data]

Root Cause

[Final root cause statement]

Validation

  • Specific and actionable
  • Systemic (prevents recurrence)
  • Within our control
  • Each step logically connected

Corrective Action

ActionOwnerDueStatus
[Action to address root cause][Name][Date]Open

5 Whys Mermaid Diagram

flowchart TD P[❌ Problem:<br/>Website crashed<br/>during launch] W1[Why 1: Server ran<br/>out of memory] W2[Why 2: Application<br/>had memory leak] W3[Why 3: New feature<br/>not load tested] W4[Why 4: No load testing<br/>in CI/CD pipeline] W5[Why 5: Load testing<br/>not prioritized initially] RC[🎯 Root Cause:<br/>Missing load testing<br/>in CI/CD setup]

P --> W1
W1 --> W2
W2 --> W3
W3 --> W4
W4 --> W5
W5 --> RC

style P fill:#f99,stroke:#900
style RC fill:#9f9,stroke:#090

Combined Approach

For comprehensive analysis, combine both techniques:

  • Fishbone to brainstorm all potential causes across categories

  • 5 Whys to drill down on each high-likelihood cause

  • Validate multiple root causes found

  • Prioritize corrective actions

Combined RCA: [Problem]

Phase 1: Fishbone Brainstorming

[Identify all potential causes by category]

Phase 2: 5 Whys Drill-Down

For each high-likelihood cause from fishbone:

Cause A: [From fishbone]

[5 Whys chain] Root Cause A: [Result]

Cause B: [From fishbone]

[5 Whys chain] Root Cause B: [Result]

Phase 3: Root Cause Summary

#Root CauseSourceValidated?Priority
1[RC-A]Fishbone + 5 WhysYesHigh
2[RC-B]Fishbone + 5 WhysYesMedium

Phase 4: Corrective Action Plan

[CAPA based on validated root causes]

Corrective Action Planning (CAPA)

Action Types

Type Purpose Example

Immediate/Containment Stop the bleeding Disable faulty feature

Corrective Fix the specific occurrence Patch the bug

Preventive Prevent recurrence Add automated testing

Systemic Address organizational factors Update training program

CAPA Template

Corrective Action Plan

Problem: [Reference to RCA] Root Causes: [List confirmed root causes]

Actions

#ActionTypeRoot CauseOwnerDueStatus
1[Description]CorrectiveRC-1[Name][Date]Open
2[Description]PreventiveRC-1[Name][Date]Open
3[Description]SystemicRC-2[Name][Date]Open

Verification

ActionVerification MethodCriteriaResult
1[How to verify][Success criteria]Pending

Follow-up

  • Review Date: [Date]
  • Responsible: [Name]
  • Metrics to Monitor: [KPIs]

Structured Data (YAML)

root_cause_analysis: version: "1.0" date: "2025-01-15" analyst: "rca-analyst"

problem: statement: "Customer order delays increased 40% in Q4" impact: "$500K in refunds, 15% churn increase" discovered: "2025-01-10" frequency: recurring

fishbone: categories: man: - cause: "Insufficient training" evidence: "New hires 3x error rate" likelihood: high confirmed: true machine: - cause: "Aging equipment on Line 3" evidence: "40% of failures from Machine 7" likelihood: high confirmed: true method: - cause: "Unclear procedures" evidence: "3 conflicting SOPs" likelihood: high confirmed: true material: [] measurement: [] environment: - cause: "Q4 demand spike" evidence: "Volume up 60%" likelihood: high confirmed: true

five_whys: - chain_name: "Training Gap" problem: "High error rate in new hires" whys: - question: "Why high error rate?" answer: "Insufficient training" - question: "Why insufficient training?" answer: "Training program not updated" - question: "Why not updated?" answer: "No owner assigned" - question: "Why no owner?" answer: "Process not established" root_cause: "No process for maintaining training program"

root_causes: - id: "RC-1" description: "No process for training program maintenance" source: "Fishbone + 5 Whys" validated: true contribution: 40

capa: - action: "Assign training program owner" type: corrective root_cause: "RC-1" owner: "HR Director" due_date: "2025-02-01" status: open

When to Use RCA

Scenario Use RCA? Approach

Major incident Yes Full Fishbone + 5 Whys

Recurring defect Yes Focused 5 Whys

Customer complaint pattern Yes Combined approach

Minor one-time issue Maybe Quick 5 Whys

Proactive improvement Partial Fishbone for ideation

Integration

Upstream

  • Incident reports - Trigger for RCA

  • Quality metrics - Identify problems

  • stakeholder-analysis - Identify team members

Downstream

  • Process improvement - Implement corrective actions

  • Training programs - Address people-related causes

  • value-stream-mapping - Eliminate systemic waste

Related Skills

  • decision-analysis

  • Evaluate corrective action options

  • risk-analysis

  • Assess risk of root causes

  • value-stream-mapping

  • Process improvement

  • stakeholder-analysis

  • Identify RCA participants

User-Facing Interface

When invoked directly by the user, this skill operates as follows.

Arguments

  • <problem-statement> : Description of the problem to analyze

  • --mode : Analysis technique (default: full )

  • fishbone : Fishbone diagram with 6M categories (~4K tokens)

  • 5whys : 5 Whys drill-down technique (~3K tokens)

  • full : Both techniques combined (~8K tokens)

  • --output : Output format (default: both )

  • yaml : Structured YAML for downstream processing

  • mermaid : Mermaid diagram visualization

  • both : Both formats

  • --dir : Output directory (default: docs/analysis/ )

Execution Workflow

  • Parse Arguments - Extract problem statement, mode, and output format. If no problem provided, ask the user what problem to analyze.

  • Define the Problem - Clarify problem statement, impact, frequency, first observed date, current vs desired state, and available evidence.

  • Execute Based on Mode:

  • Fishbone: Explore causes across all 6M categories (Man, Machine, Method, Material, Measurement, Mother Nature), rate likelihood, and identify relationships.

  • 5 Whys: Progressive questioning to drill down to root cause. May require fewer or more than 5 whys.

  • Full: Spawn the problem-solver agent for combined Fishbone + 5 Whys analysis with CAPA plan.

  • Consolidate Root Causes - Summarize findings with evidence and confidence ratings.

  • Develop CAPA Plan - Create immediate corrections, corrective actions (address root cause), preventive actions (prevent recurrence), and verification plan.

  • Generate Output - Produce YAML structure, Mermaid fishbone/flowchart diagrams, and summary report.

  • Save Results - Save to docs/analysis/root-cause-analysis.yaml and/or docs/analysis/root-cause-analysis.md (or custom --dir ).

Version History

  • v1.0.0 (2025-12-26): Initial release

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

design-thinking

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

plantuml-syntax

No summary provided by upstream source.

Repository SourceNeeds Review
Coding

system-prompt-engineering

No summary provided by upstream source.

Repository SourceNeeds Review