Reverse Engineer (Route-Aware)
Step 2 of 6 in the Reverse Engineering to Spec-Driven Development process. The 6-step process: 1. Analyze, 2. Reverse Engineer (this skill), 3. Create Specs, 4. Gap Analysis, 5. Implementation Planning, 6. Implementation.
Estimated Time: 30-45 minutes Prerequisites: Step 1 completed (analysis-report.md and route selection in .stackshift-state.json ) Output: 11 documentation files in docs/reverse-engineering/
Route-Dependent Behavior:
-
Greenfield: Extract business logic only (framework-agnostic)
-
Brownfield: Extract business logic + technical implementation details
Output is the same regardless of implementation framework (Spec Kit, BMAD, or BMAD Auto-Pilot). The framework choice only affects what happens after Step 2.
Configuration Check
Guard: Verify state file exists before proceeding.
if [ ! -f .stackshift-state.json ]; then echo "ERROR: .stackshift-state.json not found." echo "Step 1 (Initial Analysis) must be completed first. Run /stackshift.analyze to begin." exit 1 fi
DETECTION_TYPE=$(cat .stackshift-state.json | jq -r '.detection_type') ROUTE=$(cat .stackshift-state.json | jq -r '.route')
if [ "$DETECTION_TYPE" = "null" ] || [ -z "$DETECTION_TYPE" ]; then echo "ERROR: detection_type missing from state file. Re-run /stackshift.analyze." exit 1 fi if [ "$ROUTE" = "null" ] || [ -z "$ROUTE" ]; then echo "ERROR: route missing from state file. Re-run /stackshift.analyze." exit 1 fi
echo "Detection: $DETECTION_TYPE" echo "Route: $ROUTE"
SPEC_OUTPUT=$(cat .stackshift-state.json | jq -r '.config.spec_output_location // "."') echo "Writing specs to: $SPEC_OUTPUT"
if [ "$SPEC_OUTPUT" != "." ]; then mkdir -p "$SPEC_OUTPUT/docs/reverse-engineering" mkdir -p "$SPEC_OUTPUT/.specify/memory/specifications" fi
State file structure:
{
"detection_type": "monorepo-service",
"route": "greenfield",
"implementation_framework": "speckit",
"config": {
"spec_output_location": "/git/my-new-app",
"build_location": "/git/my-new-app",
"target_stack": "Next.js 15..."
}
}
Capture commit hash for incremental updates:
COMMIT_HASH=$(git rev-parse HEAD 2>/dev/null || echo "unknown") COMMIT_DATE=$(git log -1 --format=%ci 2>/dev/null || date -u +"%Y-%m-%d %H:%M:%S") echo "Pinning docs to commit: $COMMIT_HASH"
Extraction approach based on detection + route:
Detection Type
- Greenfield
- Brownfield
Monorepo Service Business logic only (tech-agnostic) Full implementation + shared packages (tech-prescriptive)
Nx App Business logic only (framework-agnostic) Full Nx/Angular implementation details
Generic App Business logic only Full implementation
-
detection_type determines WHAT patterns to look for (shared packages, Nx project config, monorepo structure, etc.)
-
route determines HOW to document them (tech-agnostic vs tech-prescriptive)
Phase 1: Deep Codebase Analysis
Use the Task tool with subagent_type=stackshift:stackshift-code-analyzer:AGENT to perform analysis. If the agent is unavailable, fall back to the Explore agent.
Error recovery: If a subagent fails or returns empty results for a sub-phase, retry once with the Explore agent. If the retry also fails, record the gap with an [ANALYSIS INCOMPLETE] marker and continue with remaining sub-phases.
Missing components: If a sub-phase finds no relevant code (e.g., no frontend in a backend-only service), document the absence in the corresponding output file rather than skipping the sub-phase.
Launch sub-phases 1.1 through 1.6 in parallel using separate subagent invocations. Collect all results before proceeding to Phase 2.
1.1 Backend Analysis
-
Find all API endpoints and record their method, route, auth requirements, parameters, and purpose.
-
Catalog every data model including schemas, types, interfaces, and field definitions.
-
Inventory all configuration sources: env vars, config files, and settings.
-
Map every external integration: APIs, services, and databases.
-
Extract business logic from services, utilities, and algorithms.
1.2 Frontend Analysis
-
List all pages and routes with their purpose and auth requirements.
-
Catalog all components by category: layout, form, and UI components.
-
Document state management: store structure and global state patterns.
-
Map the API client layer: how the frontend calls the backend.
-
Extract styling patterns: design system, themes, and component styles.
1.3 Infrastructure Analysis
-
Document deployment configuration: IaC tools, cloud provider, and services.
-
Map CI/CD pipelines and workflows.
-
Catalog database setup: type, schema, and migrations.
-
Identify storage systems: object storage, file systems, and caching.
1.4 Testing Analysis
-
Locate all test files and identify the testing frameworks in use.
-
Classify tests by type: unit, integration, and E2E.
-
Estimate coverage percentages by module.
-
Catalog test data: mocks, fixtures, and seed data.
1.5 Business Context Analysis
-
Read README, CONTRIBUTING, and any marketing or landing pages.
-
Extract package descriptions and repository metadata.
-
Identify comment patterns indicating user-facing features.
-
Collect error messages and user-facing strings for persona inference.
-
Analyze naming conventions to reveal domain concepts.
-
Examine git history for decision archaeology.
1.6 Decision Archaeology
-
Inspect dependency manifests (package.json, go.mod, requirements.txt) for technology choices.
-
Analyze config files (tsconfig, eslint, prettier) for design philosophy.
-
Review CI/CD configuration for deployment decisions.
-
Run git blame on key architectural files to identify decision points.
-
Collect comments with "why" explanations (TODO, HACK, FIXME, NOTE).
-
Look for rejected alternatives visible in git history or comments.
Progress signal: After all sub-phases complete, log: "Phase 1 complete: Analysis gathered for [list which sub-phases produced results]."
Phase 2: Generate Documentation
Create docs/reverse-engineering/ directory and generate all 11 documentation files. For each file, apply the greenfield or brownfield variant as described in operations/output-file-specs.md . Read that file now for the detailed per-file specifications.
If .stackshift-docs-meta.json already exists, overwrite it completely with fresh metadata.
Step 2.1: Write metadata file FIRST
COMMIT_HASH=$(git rev-parse HEAD 2>/dev/null || echo "unknown") COMMIT_DATE=$(git log -1 --format=%ci 2>/dev/null || date -u +"%Y-%m-%d %H:%M:%S") GENERATED_AT=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
Write docs/reverse-engineering/.stackshift-docs-meta.json :
{ "commit_hash": "<COMMIT_HASH>", "commit_date": "<COMMIT_DATE>", "generated_at": "<GENERATED_AT>", "doc_count": 11, "route": "<greenfield|brownfield>", "docs": { "functional-specification.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }, "integration-points.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }, "configuration-reference.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }, "data-architecture.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }, "operations-guide.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }, "technical-debt-analysis.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }, "observability-requirements.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }, "visual-design-system.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }, "test-documentation.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }, "business-context.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" }, "decision-rationale.md": { "generated_at": "<GENERATED_AT>", "commit_hash": "<COMMIT_HASH>" } } }
Step 2.2: Add metadata header to each doc
Every generated doc starts with this header after the title:
[Document Title]
Generated by StackShift | Commit:
<short-hash>| Date:<GENERATED_AT>Run/stackshift.refresh-docsto update with latest changes.
Step 2.3: Generate files with checkpoints
Generate files in this order, logging progress after each:
Batch 1 (core architecture):
-
functional-specification.md
-
data-architecture.md
-
integration-points.md
-
configuration-reference.md
After writing files 1-4, log: "Generated 4/11 files (core architecture complete)." Verify the output directory contains 4 files before continuing.
Batch 2 (operations and quality): 5. operations-guide.md 6. technical-debt-analysis.md 7. observability-requirements.md 8. visual-design-system.md 9. test-documentation.md
After writing files 5-9, log: "Generated 9/11 files (operations and quality complete)." Verify the output directory contains 9 files before continuing.
Batch 3 (context and decisions): 10. business-context.md 11. decision-rationale.md
After writing files 10-11, log: "Generated 11/11 files. Phase 2 complete."
Output structure:
docs/reverse-engineering/ ├── .stackshift-docs-meta.json ├── functional-specification.md ├── integration-points.md ├── configuration-reference.md ├── data-architecture.md ├── operations-guide.md ├── technical-debt-analysis.md ├── observability-requirements.md ├── visual-design-system.md ├── test-documentation.md ├── business-context.md └── decision-rationale.md
Success Criteria
-
All 11 documentation files generated in docs/reverse-engineering/
-
Comprehensive coverage of all application aspects
-
Framework-agnostic functional specification (for greenfield)
-
Complete data model documentation
-
Business context captured with clear [INFERRED] / [NEEDS USER INPUT] markers
-
Decision rationale documented with ADR format
-
Integration points fully mapped with data flow diagrams
-
.stackshift-docs-meta.json created with commit hash for incremental updates
-
Each doc has metadata header with commit hash and generation date
Next Step
Once all documentation is generated:
For GitHub Spec Kit (implementation_framework: speckit ): Proceed to Step 3 -- use /stackshift.create-specs to transform docs into .specify/ specs.
For BMAD Method (implementation_framework: bmad ): Proceed to Step 6 -- hand off to BMAD's *workflow-init . BMAD's PM and Architect agents use the reverse-engineering docs as context.
For BMAD Auto-Pilot (implementation_framework: bmad-autopilot ): Proceed to /stackshift.bmad-synthesize to auto-generate BMAD artifacts. The 11 reverse-engineering docs provide ~90% of what BMAD needs.
DO / DON'T
DO:
-
Describe WHAT the system does, not HOW (especially for greenfield)
-
Use all available signals for inference: README, comments, naming, config, git history
-
Mark confidence levels: no marker = confident, [INFERRED] = reasonable inference, [NEEDS USER INPUT] = genuinely unknown
-
Cross-reference between docs (e.g., tech debt informs trade-offs)
-
Cite specific evidence for each inference
DON'T:
-
Hard-code framework names in functional specs (greenfield)
-
Mix business logic with technical implementation (greenfield)
-
Fabricate business goals with no supporting evidence
-
State inferences as facts without marking them
-
Skip a section because it requires inference -- attempt it and mark confidence
Completeness Checklist
Verify analysis captured:
-
ALL API endpoints (not just the obvious ones)
-
ALL data models (including DTOs, types, interfaces)
-
ALL configuration options (check multiple files)
-
ALL external integrations
-
ALL user-facing strings and error messages (for persona/context inference)
-
ALL config files (for decision rationale inference)
Each document must be comprehensive, accurate, organized, actionable, and honest about inferred vs verified information.