Analyzing Codebase
Overview
Systematically analyze existing codebases to extract structural information. Supports three modes: Context (project characteristics), Brownfield (entities and collision risks), and Setup-Brownfield (comprehensive analysis for /humaninloop:setup).
When to Use
- Setting up constitution on existing codebase (brownfield projects)
- Planning new features against existing code
- Understanding tech stack before making changes
- Detecting collision risks for new entities or endpoints
- Running /humaninloop:setup on projects with existing code
- Gathering project context for governance decisions
When NOT to Use
- Greenfield projects: No existing code to analyze; start with humaninloop:authoring-constitution directly
- Single-file scripts: No architectural patterns to extract
- Documentation-only review: Use standard file reading instead
- Before project directory exists: Nothing to analyze yet
- When user provides complete context: Skip analysis if user already documented tech stack and patterns
Common Mistakes
| Mistake | Problem | Fix |
| --- | --- | --- |
| Assuming framework | Guessing without evidence | Verify with code patterns |
| Missing directories | Only checking standard paths | Projects vary, explore |
| Over-extracting | Analyzing every file | Focus on config and patterns |
| Ignoring governance | Missing existing decisions | Check README, CLAUDE.md, ADRs |
| Inventing findings | Documenting assumptions | Only report what is found |
Mode Selection
| Mode | When to Use | Output |
| --- | --- | --- |
| Context | Setting up constitution, understanding project DNA | Markdown report for humans |
| Brownfield | Planning new features against existing code | JSON inventory with collision risks |
| Setup-Brownfield | /humaninloop:setup on existing codebase | codebase-analysis.md with inventory + assessment |
Project Type Detection
Identify project type from package manager files:
| File | Project Type |
| --- | --- |
| package.json | Node.js/JavaScript/TypeScript |
| pyproject.toml / requirements.txt | Python |
| go.mod | Go |
| Cargo.toml | Rust |
| pom.xml / build.gradle | Java |
| Gemfile | Ruby |
| pubspec.yaml | Flutter/Dart |
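The table above can be applied mechanically. The following is a minimal sketch, assuming the manifests sit at the project root; `detect_project_types` is a hypothetical helper name and is not the bundled detection script. It prints every matching type so that polyglot repositories are not reduced to a single guess.

```shell
# Sketch only: map manifest files at the project root to project types.
# detect_project_types is a hypothetical name used for illustration.
detect_project_types() {
  local dir="${1:-.}" pair file type seen=""
  for pair in \
    "package.json:nodejs" "pyproject.toml:python" "requirements.txt:python" \
    "go.mod:go" "Cargo.toml:rust" "pom.xml:java" "build.gradle:java" \
    "Gemfile:ruby" "pubspec.yaml:flutter"; do
    file="${pair%%:*}"
    type="${pair##*:}"
    if [ -f "$dir/$file" ]; then
      case " $seen " in
        *" $type "*) ;;                        # type already reported once
        *) echo "$type"; seen="$seen $type" ;;
      esac
    fi
  done
  [ -n "$seen" ]   # non-zero exit status when no known manifest exists
}
```

Calling `detect_project_types /path/to/project` prints one type per line. A non-zero exit status signals that none of the listed manifests were found, which is a cue to explore further rather than to guess.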
Framework Detection
Web Frameworks
| Framework | Indicators |
| --- | --- |
| Express | express(), router.get(), app.use() |
| FastAPI | @app.get(), FastAPI(), APIRouter |
| Django | urls.py, views.py, models.py pattern |
| Flask | @app.route(), @bp.route() |
| Rails | routes.rb, app/models/, app/controllers/ |
| Spring | @RestController, @GetMapping, @Entity |
| Gin/Echo | r.GET(), e.GET() |
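Code-level indicators such as these can be checked with a recursive grep. The sketch below is illustrative only: `has_indicator` and `detect_web_frameworks` are hypothetical names, the patterns cover just four of the frameworks listed, and a match is evidence to verify rather than proof. Django and Rails are identified by file layout and are left out here.

```shell
# Sketch only: grep source files for framework indicators.
# has_indicator PATTERN DIR succeeds when any file under DIR matches the ERE.
has_indicator() {
  grep -rEq "$1" "${2:-.}" 2>/dev/null
}

# Prints one framework name per line; prints nothing when no indicator matches.
detect_web_frameworks() {
  local dir="${1:-.}"
  has_indicator 'express\(\)|router\.get\(' "$dir" && echo "express"
  has_indicator '@app\.get\(|FastAPI\(|APIRouter' "$dir" && echo "fastapi"
  has_indicator '@(app|bp)\.route\(' "$dir" && echo "flask"
  has_indicator '@RestController|@GetMapping' "$dir" && echo "spring"
  return 0
}
```

On a real project, point the function at the source directory (for example `detect_web_frameworks src`) so that vendored dependencies do not produce false positives.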
ORM/Database Frameworks
| Framework | Indicators |
| --- | --- |
| Prisma | schema.prisma, @prisma/client |
| TypeORM | @Entity(), @Column(), DataSource |
| SQLAlchemy | Base, db.Model, Column() |
| Django ORM | models.Model, models.CharField |
| GORM | gorm.Model, db.AutoMigrate |
| Mongoose | mongoose.Schema, new Schema({ |
| ActiveRecord | ApplicationRecord, has_many |
Architecture Pattern Recognition
| Pattern | Indicators |
| --- | --- |
| Layered | src/models/, src/services/, src/controllers/ |
| Feature-based | src/auth/, src/users/, src/tasks/ |
| Microservices | Multiple package files, docker compose |
| Serverless | serverless.yml, lambda/, functions/ |
| MVC | models/, views/, controllers/ |
| Clean/Hexagonal | domain/, application/, infrastructure/ |
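Several of these patterns can be recognized from directory names alone. The sketch below assumes the directories live either under src/ or at the project root; `detect_architecture` is a hypothetical helper. Feature-based and microservice layouts depend on project-specific names, so they are not covered and still need manual inspection.

```shell
# Sketch only: infer an architecture pattern from well-known directory names.
# detect_architecture is a hypothetical name used for illustration.
detect_architecture() {
  local root="${1:-.}" base
  for base in "$root/src" "$root"; do
    [ -d "$base" ] || continue
    if [ -d "$base/domain" ] && [ -d "$base/application" ] && [ -d "$base/infrastructure" ]; then
      echo "clean-hexagonal"; return 0
    fi
    if [ -d "$base/models" ] && [ -d "$base/views" ] && [ -d "$base/controllers" ]; then
      echo "mvc"; return 0
    fi
    if [ -d "$base/models" ] && [ -d "$base/services" ] && [ -d "$base/controllers" ]; then
      echo "layered"; return 0
    fi
  done
  [ -f "$root/serverless.yml" ] && { echo "serverless"; return 0; }
  echo "unknown"   # no fixed-name pattern matched; inspect the tree manually
  return 0
}
```

Whatever the function prints, cite the directories that produced the match so the pattern is documented with evidence.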
Mode: Context Gathering
For constitution authoring - gather broad project characteristics.
What to Extract:
- Tech stack with versions
- Linting/formatting conventions
- CI/CD quality gates
- Team signals (test coverage, required approvals, CODEOWNERS)
- Existing governance docs (CODEOWNERS, ADRs, CONTRIBUTING.md)
Output: Project Context Report (markdown)
See references/CONTEXT-GATHERING.md for detailed guidance.
Mode: Brownfield Analysis
For planning - extract structural details for collision detection.
What to Extract:
- Entities with fields and relationships
- Endpoints with handlers
- Collision risks against proposed spec
Output: Codebase Inventory (JSON)
See references/BROWNFIELD-ANALYSIS.md for detailed guidance.
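At its simplest, collision detection compares names in the proposed spec against names already in the inventory. The following is a minimal sketch, assuming both sides have been reduced to plain text files with one entity or endpoint name per line. The function name and file layout are hypothetical, and only exact, case-insensitive name matches are reported; severity classification is left to the referenced guidance.

```shell
# Sketch only: list proposed names that already exist in the codebase.
# find_collisions EXISTING_FILE PROPOSED_FILE (both: one name per line).
find_collisions() {
  local existing="$1" proposed="$2"
  # -F fixed strings, -x whole-line match, -i ignore case, -f patterns from file
  grep -Fxi -f "$existing" "$proposed"
}
```

The exit status follows grep: 0 when at least one collision is printed, 1 when the proposed names are all new.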
Mode: Setup Brownfield
For /humaninloop:setup - comprehensive analysis combining Context + Brownfield with Essential Floor assessment.
What to Extract:
- Everything from Context mode (tech stack, conventions, architecture)
- Everything from Brownfield mode (entities, relationships)
- Essential Floor assessment (Security, Testing, Error Handling, Observability)
- Inconsistencies and strengths assessment
Output: .humaninloop/memory/codebase-analysis.md following codebase-analysis-template.md
Essential Floor Analysis
Assess each of the four essential floor categories:
Security Assessment
| Check | How to Detect | Status Values |
| --- | --- | --- |
| Auth at boundaries | Middleware patterns (authenticate, authorize, requireAuth) | present/partial/absent |
| Secrets from env | .env.example exists, no hardcoded credentials in code | present/partial/absent |
| Input validation | Schema validation libraries, input checking patterns | present/partial/absent |
Indicators to search:
# Auth middleware (-E enables | alternation)
grep -rE "authenticate|authorize|requireAuth|isAuthenticated" src/ 2>/dev/null
# Environment variables
ls .env.example .env.sample 2>/dev/null
grep -rE "process\.env|os\.environ|os\.Getenv" src/ 2>/dev/null
# Validation
grep -E "zod|yup|joi|pydantic|validator" package.json pyproject.toml 2>/dev/null
Testing Assessment
| Check | How to Detect | Status Values |
| --- | --- | --- |
| Test framework configured | Config files (jest.config.*, pytest.ini, vitest.config.*) | present/partial/absent |
| Test files present | Files matching *.test.*, *_test.*, test_*.* | present/partial/absent |
| CI runs tests | Test commands in workflow files | present/partial/absent |
Indicators to search:
# Test config
ls jest.config.* vitest.config.* pytest.ini pyproject.toml 2>/dev/null
# Test files
find . \( -name "*.test.*" -o -name "*_test.*" -o -name "test_*.*" \) 2>/dev/null | head -5
# CI test commands
grep -rE "npm test|yarn test|pytest|go test" .github/workflows/ 2>/dev/null
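To judge whether test files are present in meaningful numbers, a count can help more than a five-line sample. The following is a minimal sketch, assuming the common naming conventions `*.test.*`, `*_test.*`, and `test_*.*`; `count_test_files` is a hypothetical helper, and the node_modules exclusion is an added assumption to keep dependency tests out of the count.

```shell
# Sketch only: count files that follow common test-naming conventions.
# count_test_files is a hypothetical name used for illustration.
count_test_files() {
  local dir="${1:-.}"
  # The parentheses group the -name alternatives so -type and -path apply to all.
  find "$dir" -type f ! -path '*/node_modules/*' \
    \( -name '*.test.*' -o -name '*_test.*' -o -name 'test_*.*' \) 2>/dev/null \
    | wc -l | tr -d ' '
}
```

The count is only a signal: how it maps to present, partial, or absent is a judgment call that should be recorded alongside the paths that were found.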
Error Handling Assessment
| Check | How to Detect | Status Values |
| --- | --- | --- |
| Explicit error types | Custom error classes/types defined | present/partial/absent |
| Context preservation | Error messages include context, stack traces logged | present/partial/absent |
| Appropriate status codes | API responses use correct HTTP status codes | present/partial/absent |
Indicators to search:
# Custom errors
grep -rE "class.*Error|extends Error|Exception" src/ 2>/dev/null | head -5
# Error logging
grep -rE "error.*context|error.*stack|logger\.error" src/ 2>/dev/null | head -3
# Status codes (parentheses escaped so they match literally)
grep -rE "status\(4|status\(5|HttpStatus|status_code" src/ 2>/dev/null | head -3
Observability Assessment
| Check | How to Detect | Status Values |
| --- | --- | --- |
| Structured logging | Logger config (winston, pino, structlog, logrus) | present/partial/absent |
| Correlation IDs | Request ID middleware, trace ID patterns | present/partial/absent |
| No PII in logs | Log sanitization, no email/password in log statements | present/partial/absent |
Indicators to search:
# Logger config
grep -E "winston|pino|structlog|logrus|zap" package.json pyproject.toml go.mod 2>/dev/null
# Correlation IDs
grep -rE "requestId|correlationId|traceId|x-request-id" src/ 2>/dev/null | head -3
# PII check (negative: these should NOT be found in logs)
grep -rE "logger.*email|logger.*password|log.*password" src/ 2>/dev/null
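Because the PII check is a negative check, it is easy to misread an empty result. The sketch below wraps the same patterns so that success means nothing was found; `check_no_pii_in_logs` is a hypothetical helper. The patterns are heuristic, so a clean run does not prove that logs are free of personal data.

```shell
# Sketch only: succeed when no log statement appears to mention email or password.
# check_no_pii_in_logs is a hypothetical name used for illustration.
check_no_pii_in_logs() {
  local dir="${1:-src}" hits
  hits=$(grep -rnE 'logger.*email|logger.*password|log.*password' "$dir" 2>/dev/null || true)
  if [ -n "$hits" ]; then
    echo "$hits"   # file:line:text for each suspicious log statement
    return 1
  fi
  return 0
}
```

Each printed line carries a file path and line number, which can be cited directly as evidence in the assessment.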
Setup-Brownfield Quality Checklist
Before finalizing setup-brownfield analysis:
- Project identity complete (name, language, framework, entry points)
- Directory structure documented with purposes
- Architecture pattern identified with evidence
- Naming conventions documented (files, variables, functions, classes)
- All four Essential Floor categories assessed
- Domain entities extracted with relationships
- External dependencies documented
- Strengths to preserve identified (minimum 2-3)
- Inconsistencies documented with severity
- Recommendations provided for constitution focus
Detection Script
Run the automated detection script for fast, deterministic stack identification:
bash scripts/detect-stack.sh /path/to/project
Output:
{
  "project_type": "nodejs",
  "package_manager": "npm",
  "frameworks": ["express"],
  "orms": ["prisma"],
  "architecture": ["feature-based"],
  "ci_cd": ["github-actions"],
  "files_found": {...}
}
The script detects:
- Project type: nodejs, python, go, rust, java, ruby, flutter, elixir
- Package manager: npm, yarn, pnpm, pip, poetry, cargo, etc.
- Frameworks: express, fastapi, django, nextjs, gin, rails, spring-boot, etc.
- ORMs: prisma, typeorm, sqlalchemy, mongoose, gorm, activerecord, etc.
- Architecture: clean-architecture, mvc, layered, feature-based, serverless, microservices
- CI/CD: github-actions, gitlab-ci, jenkins, circleci, etc.
Usage pattern:
- Run script first for deterministic baseline
- Use script output to guide deeper LLM analysis
- Script findings are ground truth; LLM adds nuance
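The script's JSON output can be condensed into a few lines before deeper analysis. The following is a minimal sketch, assuming jq is installed and that the output carries the fields shown in the sample; `summarize_stack` is a hypothetical helper, and missing fields fall back to "unknown" or an empty list.

```shell
# Sketch only: condense detect-stack.sh JSON (read from stdin) into short lines.
# summarize_stack is a hypothetical name; field names follow the sample output.
summarize_stack() {
  jq -r '
    "type: \(.project_type // "unknown")",
    "frameworks: \((.frameworks // []) | join(", "))",
    "orms: \((.orms // []) | join(", "))",
    "architecture: \((.architecture // []) | join(", "))"
  '
}
```

Usage, with the function defined in the current shell: `bash scripts/detect-stack.sh /path/to/project | summarize_stack`.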
Manual Detection Commands
For cases where script detection is insufficient:
# Tech stack detection
jq '{name, engines, dependencies}' package.json
cat pyproject.toml
cat .tool-versions .nvmrc .python-version 2>/dev/null
# Architecture detection
ls -d src/domain src/application src/features 2>/dev/null
# CI/CD detection
ls .github/workflows/*.yml .gitlab-ci.yml 2>/dev/null
# Governance detection
ls CODEOWNERS .github/CODEOWNERS docs/CODEOWNERS 2>/dev/null
cat CODEOWNERS 2>/dev/null | head -20
# Test structure
ls -d test/ tests/ spec/ __tests__/ 2>/dev/null
Quality Checklist
Before finalizing analysis:
All Modes:
- Project type and framework correctly identified
- Architecture pattern documented
- File paths cited for all findings
Context Mode:
- Existing linting/formatting config extracted
- CI quality gates analyzed
- Existing governance docs checked (CODEOWNERS, ADRs, CONTRIBUTING.md)
- Approvers identified (from CODEOWNERS or team structure)
- Recommendations provided
Brownfield Mode:
- All entity directories scanned
- All route directories scanned
- Collision risks classified by severity
Setup-Brownfield Mode:
- All Context Mode checks completed
- All four Essential Floor categories assessed
- Strengths and inconsistencies documented
- Output written to .humaninloop/memory/codebase-analysis.md
Related Skills
- For brownfield constitutions: REQUIRED: Use humaninloop:brownfield-constitution after analysis
- For greenfield projects: OPTIONAL: Use humaninloop:authoring-constitution directly
- For validation: OPTIONAL: Use humaninloop:validation-constitution after constitution creation