Shard Document Skill
Purpose
Break large, monolithic documents into smaller, manageable shards that are easier to navigate, maintain, and understand. Particularly valuable for lengthy PRDs, technical specifications, architecture docs, and API documentation.
Core Principles:
-
Logical boundaries (shard by topic/section, not arbitrary size)
-
Relationship preservation (maintain cross-references)
-
Navigation clarity (create clear index and links)
-
Metadata richness (each shard self-describes purpose)
-
Validation rigor (verify all references work)
Prerequisites
-
Source document exists and is readable
-
Document has clear structure (headings, sections)
-
Output directory is writable
-
workspace/ directory available for shard storage
Workflow
Step 1: Analyze Document Structure
Action: Parse document to understand structure, identify logical boundaries, and assess sharding approach.
Key Activities:
Parse Document Hierarchy
Example document structure:
Product Requirements Document
1. Executive Summary (200 lines)
2. Vision & Objectives (150 lines)
3. User Personas (400 lines)
3.1 Persona: Sarah (SaaS Admin)
3.2 Persona: Mike (End User)
3.3 Persona: Lisa (Manager)
4. Market Analysis (600 lines)
4.1 Competitive Landscape
4.2 Market Segmentation
4.3 Differentiation Strategy
5. Feature Specifications (1200 lines) ⚠️ Large
5.1 Feature: Authentication
5.2 Feature: Dashboard
5.3 Feature: Reporting
... (20 more features)
6. Technical Requirements (800 lines)
7. Success Metrics (300 lines)
Total: ~3,650 lines (too large for single document)
Identify Shard Boundaries
Logical Strategy (Recommended):
-
Shard by major sections (Executive Summary, Personas, Market Analysis, etc.)
-
Break large sections into sub-shards (e.g., Features → one shard per feature)
-
Keep related content together (don't split mid-topic)
Size-Based Strategy:
-
Split when section exceeds max_shard_size (default 500 lines)
-
Find nearest heading boundary
-
Create shard at logical break point
Semantic Strategy:
-
Analyze content similarity
-
Group related topics even if in different sections
-
Advanced: requires NLP analysis
Assess Cross-References
Scan for internal links:
- See Market Analysis
- Feature dependencies in Technical Requirements
- Success metrics for Dashboard feature
Catalog all cross-references for later validation
Determine Sharding Plan
Proposed Shards:
- index.md - Navigation hub (50 lines)
- executive-summary.md - Overview (200 lines)
- vision-objectives.md - Goals and vision (150 lines)
- personas.md - All user personas (400 lines)
- market-analysis.md - Competitive & market insights (600 lines)
- features/ - Directory with one file per feature (20 files)
- authentication.md
- dashboard.md
- reporting.md
- ... (17 more)
- technical-requirements.md - Tech specs (800 lines)
- success-metrics.md - KPIs and metrics (300 lines)
Total: 8 top-level files + 20 feature shards = 28 shards Reduction: 3,650 lines → avg 130 lines per shard ✅
Output: Sharding plan with boundaries, file names, and relationships
See: references/sharding-strategies.md for strategy selection guidance
Step 2: Extract Shards with Metadata
Action: Create individual shard files with proper metadata, navigation links, and content.
Key Activities:
Create Shard Template
shard_id: feature-authentication shard_type: feature-specification parent: prd-main created_from: product-requirements-document.md section: 5.1 Feature: Authentication related:
- technical-requirements.md#authentication
- success-metrics.md#auth-metrics dependencies:
- feature-user-management tags:
- authentication
- security
- core-feature
Feature: Authentication
◄ Back to PRD Index | Next: Dashboard ►
Feature Overview
[Content from original section 5.1...]
Related Documents
- Technical Requirements: Authentication
- Success Metrics: Auth Metrics
- User Persona: Sarah (SaaS Admin)
Shard Navigation:
-
Previous: Vision & Objectives
-
Up: PRD Index
-
Next: Feature: Dashboard
Extract Content
-
Copy section content from original document
-
Preserve heading hierarchy (adjust if needed)
-
Maintain formatting (code blocks, tables, lists)
-
Keep internal structure intact
Update Internal Links
Original document: See Market Analysis for details.
Sharded version (from feature-authentication.md): See Market Analysis for details.
Update format: #section-name → filename.md #parent-section-name → filename.md#section-name
Add Contextual Navigation
-
Breadcrumbs (Parent document → Current section)
-
Previous/Next links (Sequential reading)
-
Related documents (Cross-references)
-
Back to index link (Always available)
Enrich with Metadata
Metadata includes:
- shard_id: Unique identifier
- shard_type: Category (feature, persona, analysis, technical, etc.)
- parent: Original source document
- section: Original section path
- related: Cross-referenced shards
- dependencies: Other shards this depends on
- tags: Keywords for discovery
- created_from: Source file path
- created_at: Timestamp
Output: Individual shard files with complete metadata and navigation
See: references/shard-metadata-guide.md for metadata standards
Step 3: Create Navigation Index
Action: Build master index document that provides overview and navigation to all shards.
Key Activities:
Create Index Structure
document_type: shard-index original_document: product-requirements-document.md shard_count: 28 sharding_date: 2025-01-04 sharding_strategy: logical
PRD: Product Requirements Document (Sharded)
This document has been split into 28 focused shards for easier navigation and maintenance.
Original Document: product-requirements-document.md (3,650 lines)
Sharded Version: 28 documents (avg 130 lines each)
Quick Navigation
Core Documents
- Executive Summary - Product overview and key findings
- Vision & Objectives - Product vision and goals
- User Personas - Target user profiles
- Market Analysis - Competitive landscape
- Technical Requirements - Technical specifications
- Success Metrics - KPIs and measurement
Features (20 total)
Core Features:
- Authentication - User login and security
- Dashboard - Main user interface
- Reporting - Analytics and reports
Secondary Features:
- Notifications - Email and in-app alerts
- User Profile - Profile management
- ... View all features
Reading Paths
Path 1: Executive Overview (15 min read)
Path 2: Feature Deep Dive (45 min read)
- Feature Overview
- Core features (Authentication, Dashboard, Reporting)
- Technical Requirements
- Success Metrics
Path 3: Complete Read (2 hour read)
Follow sequential navigation links in each document.
Shard Organization
prd-shards/ ├── index.md (this file) ├── executive-summary.md ├── vision-objectives.md ├── personas.md ├── market-analysis.md ├── technical-requirements.md ├── success-metrics.md └── features/ ├── README.md ├── authentication.md ├── dashboard.md ├── reporting.md └── ... (17 more)
Search by Tag
#core-feature: Authentication, Dashboard, Reporting #security: Authentication, Data Encryption, Access Control #analytics: Dashboard, Reporting, Success Metrics #user-facing: All features under features/
Original Document
The original, unshard document is preserved at:
product-requirements-document.md
Create Directory README Files
For subdirectories (like features/), create README.md:
Features Directory
This directory contains detailed specifications for all 20 product features.
Core Features (Must Have)
- Authentication - User login and security
- Dashboard - Main user interface
- Reporting - Analytics and reports
Secondary Features (Should Have)
- Notifications - Email and in-app alerts
- User Profile - Profile management ...
Legacy Features (Won't Have v1)
- Old Admin Panel - Deprecated interface
Generate Metadata Summary
Shard Metadata
| Shard | Type | Lines | Dependencies | Tags |
|---|---|---|---|---|
| executive-summary.md | overview | 200 | - | #summary #overview |
| authentication.md | feature | 85 | user-management | #core #security |
| dashboard.md | feature | 120 | authentication, reporting | #core #ui |
| ... |
Total Shards: 28 Total Lines: 3,650 (across all shards) Average Shard Size: 130 lines
Output: Comprehensive index with multiple navigation paths
See: references/navigation-patterns.md for index design patterns
Step 4: Validate Relationships
Action: Verify all cross-references, dependencies, and links work correctly.
Key Activities:
Validate Internal Links
Scan all shards for markdown links
for file in .md; do grep -o '[.](..md[^)])' "$file" done
Check each link target exists
Report broken links
Verify Dependencies
For each shard with dependencies:
- Check dependency shards exist
- Verify dependency sections referenced exist
- Ensure no circular dependencies
Example: authentication.md depends on user-management.md → Check: user-management.md exists ✅ → Check: No circular dependency (user-management depends on authentication) ✅
Check Cross-References
From authentication.md: "See Success Metrics: Auth Metrics"
Validation: → success-metrics.md exists? ✅ → Section #auth-metrics exists in success-metrics.md? ✅ → Link format correct? ✅
Verify Metadata Completeness
For each shard, check:
-
shard_id present and unique ✅
-
shard_type defined ✅
-
parent specified ✅
-
tags include at least one tag ✅
-
related docs list is valid ✅
Test Navigation Paths
Test reading paths from index:
Path 1: Executive Overview
- Index → Executive Summary ✅
- Executive Summary → Vision & Objectives ✅
- Vision & Objectives → User Personas ✅
- User Personas → Market Analysis ✅
- All "Back to Index" links work ✅
Generate Validation Report
Shard Validation Report
Total Shards: 28 Validation Status: ✅ PASSED
Validation Results
✅ Internal Links: 142/142 valid (0 broken) ✅ Dependencies: 35/35 resolved ✅ Cross-References: 89/89 valid ✅ Metadata: 28/28 complete ✅ Navigation Paths: 3/3 working
Issues Found
None ✅
Recommendations
- Consider adding more tags to personas.md (currently only 1 tag)
- Feature shards could benefit from examples section
- Add glossary shard for technical terms
Output: Validation report with broken link list and recommendations
See: references/validation-checklist.md for comprehensive validation
Common Scenarios
Scenario 1: Large PRD (3,000+ lines)
Context: Single PRD file too large to navigate
Approach:
-
Strategy: Logical sharding
-
Shard by major sections (Executive, Personas, Features, etc.)
-
Create features/ subdirectory (one shard per feature)
-
Result: 20-30 shards, avg 100-150 lines each
Example Structure:
prd-shards/ ├── index.md ├── executive-summary.md ├── personas.md ├── market-analysis.md ├── features/ │ ├── README.md │ ├── feature-1.md │ └── ... (19 more) ├── technical-requirements.md └── success-metrics.md
Scenario 2: API Documentation (5,000+ lines)
Context: Comprehensive API docs with 100+ endpoints
Approach:
-
Strategy: Logical + hierarchical
-
Shard by API resource (Users, Products, Orders, etc.)
-
Group endpoints within each resource shard
-
Create separate authentication.md and errors.md shards
Example Structure:
api-docs-shards/ ├── index.md ├── getting-started.md ├── authentication.md ├── errors.md ├── resources/ │ ├── users.md (GET/POST/PUT/DELETE /users) │ ├── products.md (GET/POST/PUT/DELETE /products) │ └── ... (20 more) └── webhooks.md
Scenario 3: Architecture Documentation (2,000+ lines)
Context: System architecture doc with multiple components
Approach:
-
Strategy: Logical by component
-
Shard by system component (Frontend, Backend, Database, etc.)
-
Create separate decision-records/ directory for ADRs
-
Maintain architecture-overview.md as entry point
Example Structure:
architecture-shards/ ├── index.md ├── architecture-overview.md ├── components/ │ ├── frontend.md │ ├── backend.md │ ├── database.md │ └── ... (5 more) ├── decision-records/ │ ├── adr-001-framework-choice.md │ └── ... (10 more) └── deployment.md
Scenario 4: Size-Based Sharding (Any large document)
Context: Document with no clear logical structure
Approach:
-
Strategy: Size-based (max_shard_size: 500 lines)
-
Split at nearest heading when size exceeded
-
Number shards sequentially (part-01.md, part-02.md, etc.)
-
Maintain reading order with prev/next links
Example Structure:
document-shards/ ├── index.md ├── part-01-introduction.md ├── part-02-methodology.md ├── part-03-findings-1.md ├── part-04-findings-2.md └── part-05-conclusion.md
Best Practices
-
Prefer Logical over Size-Based - Shard by topic/section, not arbitrary size
-
Keep Related Content Together - Don't split mid-topic for size constraints
-
Use Descriptive File Names - authentication.md not feature-1.md
-
Add Rich Metadata - Tags, dependencies, relationships for discovery
-
Provide Multiple Navigation Paths - Index, sequential, topic-based
-
Validate Thoroughly - Check all links, dependencies, cross-references
-
Preserve Original - Keep unshard document as backup/reference
-
Document Shard Structure - Clear README explaining organization
-
Use Consistent Naming - Follow conventions (kebab-case, descriptive)
-
Version Control Friendly - Smaller files = better diffs, easier reviews
When to Escalate
Escalate to stakeholders when:
-
Document structure unclear (no clear shard boundaries)
-
Multiple valid sharding approaches (need decision)
-
Critical cross-references can't be preserved
-
Original document has no clear hierarchy
Escalate to authors when:
-
Content organization illogical
-
Sections heavily interdependent (can't cleanly separate)
-
Terminology inconsistent across sections
Use alternative skill when:
-
Creating new document → Use create-prd or create-brownfield-prd
-
Document already well-structured → No sharding needed
-
Interactive navigation required → Use interactive-checklist skill
Reference Files
-
references/sharding-strategies.md
-
Strategy selection and comparison
-
references/shard-metadata-guide.md
-
Metadata standards and examples
-
references/navigation-patterns.md
-
Index and navigation design
-
references/validation-checklist.md
-
Comprehensive validation steps
-
references/naming-conventions.md
-
File and directory naming standards
Part of BMAD Enhanced Planning Suite