Ralph Loop Best Practices Guide
This skill provides guidelines for effective autonomous iteration using the Ralph Wiggum pattern, based on Anthropic's official ralph-loop plugin.
How Ralph Loop Works
The Ralph Loop is a methodology for iterative AI development through self-referential feedback loops:
- The prompt never changes between iterations
- Claude's previous work persists in files
- Each cycle, the AI sees modified files and git history
- Enables autonomous refinement without manual re-prompting
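The loop described above can be sketched in a few lines. This is a minimal illustration, not the plugin's implementation: the `Agent` type is a hypothetical stand-in for whatever runs one iteration, and it is synchronous here only to keep the sketch short (a real agent call would be asynchronous).

```typescript
// Hypothetical stand-in for "run one iteration and return its output".
type Agent = (prompt: string, iteration: number) => string;

interface LoopResult {
  completed: boolean;
  iterations: number;
}

// Re-send the SAME prompt until the completion signal shows up in the
// output or the iteration cap is reached. Progress lives outside the
// prompt (files, git history), so the prompt itself never changes.
function ralphLoop(
  prompt: string,
  agent: Agent,
  completionSignal: string = "RALPH_COMPLETE",
  maxIterations: number = 100,
): LoopResult {
  for (let i = 1; i <= maxIterations; i++) {
    const output = agent(prompt, i);
    if (output.indexOf(completionSignal) !== -1) {
      return { completed: true, iterations: i };
    }
  }
  return { completed: false, iterations: maxIterations };
}
```

The cap matters as much as the signal: without it, a prompt whose criteria can never be met would loop indefinitely.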
Effective Ralph Prompt Design
Required Elements
Clear Completion Signal
Completion Criteria
- All unit tests pass (npm test)
- Build succeeds (npm run build)
- Coverage ≥80% (npm run coverage)
- No TypeScript errors (npx tsc --noEmit)
When ALL criteria are met, output: "RALPH_COMPLETE: All tasks done"
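One way to make these criteria machine-checkable is to pair each with its command and treat exit code 0 as success. The sketch below assumes that convention (including that `npm run coverage` is configured to fail below the threshold); `Criterion`, `Runner`, `checkCompletion`, and `completionMessage` are illustrative names, and the runner is injected so the logic can be exercised without spawning processes.

```typescript
interface Criterion {
  name: string;
  command: string;
}

// Runs a command and returns its exit code (0 = success). Injected by the caller.
type Runner = (command: string) => number;

// The four criteria from the list above.
const criteria: Criterion[] = [
  { name: "Unit tests", command: "npm test" },
  { name: "Build", command: "npm run build" },
  { name: "Coverage", command: "npm run coverage" },
  { name: "TypeScript", command: "npx tsc --noEmit" },
];

// Returns the names of all criteria whose command exited non-zero.
function checkCompletion(items: Criterion[], run: Runner): string[] {
  return items.filter((c) => run(c.command) !== 0).map((c) => c.name);
}

// Emits the completion signal only when nothing failed.
function completionMessage(failed: string[]): string {
  return failed.length === 0
    ? "RALPH_COMPLETE: All tasks done"
    : `Not complete: ${failed.join(", ")}`;
}
```

In a Node.js project, the runner could wrap `spawnSync` from `node:child_process` and return its `status` field.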
Incremental Phases
Implementation Phases
Phase 1: Foundation
- Create database schema
- Set up repository layer
- Write unit tests for repository
Phase 2: Business Logic
- Implement service layer
- Add validation
- Write service tests
Phase 3: API Layer
- Create endpoints
- Add input validation
- Write API tests
- Run full test suite
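The phases above work best when each one has its own verification gate. As a rough sketch, phases can be modeled as data with a per-phase check, and the active phase is the first one whose check does not yet pass. The `Phase` shape and the `isDone` callback are assumptions for illustration only.

```typescript
interface Phase {
  name: string;
  tasks: string[];
  // Hypothetical gate, e.g. "this phase's tests pass".
  isDone: () => boolean;
}

// Returns the first phase whose gate has not passed, or null when all are done.
// Later phases are never started while an earlier one is still failing.
function currentPhase(phases: Phase[]): Phase | null {
  for (const phase of phases) {
    if (!phase.isDone()) {
      return phase;
    }
  }
  return null;
}
```

Because the gate is re-evaluated every iteration, a regression in an earlier phase automatically pulls the loop back to it.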
Self-Correction Cycles
On Each Iteration
- Run tests: npm test
- If tests fail:
  - Read error messages
  - Fix the failing code
  - Run tests again
- If tests pass:
  - Check coverage: npm run coverage
  - If coverage < 80%, add more tests
  - If coverage ≥ 80%, proceed to next phase
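The cycle above is a small decision procedure, which can be written down as a pure function. The names `IterationState`, `Action`, and `nextAction` are illustrative; the 80% default comes from the criteria in this guide.

```typescript
type Action = "fix-failing-code" | "add-tests" | "next-phase";

interface IterationState {
  testsPassed: boolean;
  coveragePct: number; // 0 to 100
}

// Failing tests take priority over coverage; coverage is only
// consulted once the suite is green.
function nextAction(state: IterationState, threshold: number = 80): Action {
  if (!state.testsPassed) {
    return "fix-failing-code";
  }
  if (state.coveragePct < threshold) {
    return "add-tests";
  }
  return "next-phase";
}
```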
Safety Limits
Safety Rules
- Maximum 100 iterations per phase
- If stuck on the same error for 5 iterations, ask for help
- Never delete test files
- Always commit working state before major changes
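The "stuck on the same error" rule can also be enforced by the harness instead of relying on the prompt alone. A minimal sketch, assuming the harness can extract one error string per iteration (`createStuckDetector` is a hypothetical helper, not part of any plugin):

```typescript
// Returns a function that records each iteration's error (or null on success)
// and reports true once the same error has occurred `limit` times in a row.
function createStuckDetector(limit: number = 5): (error: string | null) => boolean {
  let lastError: string | null = null;
  let repeats = 0;

  return (error: string | null): boolean => {
    if (error === null) {
      // A clean iteration resets the streak.
      lastError = null;
      repeats = 0;
      return false;
    }
    if (error === lastError) {
      repeats++;
    } else {
      lastError = error;
      repeats = 1;
    }
    return repeats >= limit;
  };
}
```

Comparing raw error strings is deliberately naive; messages that embed timestamps or memory addresses would need normalizing first.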
Good Ralph Prompts
Example 1: API Development
Task: Build User Authentication API
Context
- Node.js + Express + TypeScript
- PostgreSQL with Prisma
- JWT authentication
Phases
Phase 1: Database (iterations 1-10)
Create Prisma schema for User model with:
- id, email (unique), passwordHash, createdAt, updatedAt
Run: npx prisma migrate dev
Test: Schema validates with npx prisma validate
Phase 2: Repository (iterations 11-25)
Create UserRepository with:
- create(email, password) → User
- findByEmail(email) → User | null
- findById(id) → User | null
Tests: All repository tests pass
Phase 3: Service (iterations 26-45)
Create AuthService with:
- register(email, password) → { user, token }
- login(email, password) → { user, token }
- validateToken(token) → User
Tests: All service tests pass
Phase 4: Routes (iterations 46-70)
Create routes:
- POST /auth/register
- POST /auth/login
- GET /auth/me (protected)
Tests: All API tests pass with supertest
Phase 5: Integration (iterations 71-100)
- Full E2E flow works
- Error handling for all edge cases
- Rate limiting on auth endpoints
Completion
When npm test passes with 0 failures AND coverage ≥80%,
output: "RALPH_COMPLETE"
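The coverage half of this completion rule needs a number to compare against. As one possible approach, the sketch below assumes the coverage tool writes an Istanbul-style `json-summary` report with a `total.lines.pct` field, and that line coverage is the metric intended; neither is stated in the prompt itself, and `meetsCoverage` is a hypothetical helper.

```typescript
// Assumed report shape: { "total": { "lines": { "pct": 85.2, ... }, ... }, ... }
interface SummaryLike {
  total?: { lines?: { pct?: unknown } };
}

// Returns true only when a numeric line-coverage percentage is present
// and meets the threshold. Missing or non-numeric values count as failing.
function meetsCoverage(summaryJson: string, threshold: number = 80): boolean {
  const summary = JSON.parse(summaryJson) as SummaryLike;
  const pct = summary.total?.lines?.pct;
  return typeof pct === "number" && pct >= threshold;
}
```

Treating a missing value as a failure keeps the loop from declaring completion when the report was never generated.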
Example 2: Refactoring Task
Task: Refactor Legacy Auth Module
Current State
- Monolithic auth.js with 500 lines
- No tests
- Mixed concerns
Target State
- Separate files: auth-service.ts, user-repository.ts, token-utils.ts
- 100% backward compatible
- 80%+ test coverage
Iteration Loop
- Read current auth.js
- Identify one function to extract
- Create new file with extracted function
- Update imports in auth.js
- Run existing integration tests
- If tests fail, fix and retry
- If tests pass, proceed to next function
Completion Criteria
- auth.js < 100 lines
- All functions have dedicated files
- All tests pass
- No breaking changes to API
Output "RALPH_COMPLETE" when done.
When Ralph Works
| Task Type | Suitability | Reason |
| --- | --- | --- |
| API development | ✅ Excellent | Clear test-driven feedback |
| Refactoring | ✅ Excellent | Tests verify each step |
| Bug fixing | ✅ Good | Reproduce → fix → verify cycle |
| Test writing | ✅ Good | Coverage metrics as feedback |
| Documentation | ⚠️ Limited | No automated verification |
| UI development | ⚠️ Limited | Visual verification hard |
| Design decisions | ❌ Poor | Requires human judgment |
| One-time scripts | ❌ Poor | No iteration benefit |
When NOT to Use Ralph
- Subjective decisions - No objective completion signal
- One-time operations - No benefit from iteration
- Ambiguous requirements - Will spin without progress
- Security-critical code - Needs human review
- Production deployments - Too risky for autonomous action
Anti-Patterns to Avoid
- Vague Completion Criteria
BAD
Complete when the code looks good.
GOOD
Complete when:
- npm test exits with code 0
- npm run build succeeds
- No TypeScript errors (npx tsc --noEmit)
- No Phase Boundaries
BAD
Build the entire application.
GOOD
Phase 1: Database schema (test: migrations apply)
Phase 2: Repository layer (test: unit tests pass)
Phase 3: Service layer (test: integration tests pass)
- Missing Error Recovery
BAD
If something fails, figure it out.
GOOD
If tests fail:
- Read the error message
- Identify the failing file:line
- Fix the specific issue
- Run tests again
- If same error after 3 attempts, try alternative approach
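The "identify the failing file:line" step can be partly mechanized. The sketch below is a heuristic only: it assumes error output contains locations shaped like `path.ext:line` or `path.ext:line:column` (as in Node.js stack traces and pretty-printed `tsc` output), and `findLocation` is a hypothetical helper.

```typescript
interface SourceLocation {
  file: string;
  line: number;
  column: number | null;
}

// Finds the first "file.ext:line[:column]" occurrence in the error text.
// Heuristic: anything shaped like host.tld:port would also match.
function findLocation(errorText: string): SourceLocation | null {
  const pattern = /([^\s()]+\.[A-Za-z]+):(\d+)(?::(\d+))?/;
  const match = pattern.exec(errorText);
  if (match === null) {
    return null;
  }
  return {
    file: match[1],
    line: parseInt(match[2], 10),
    column: match[3] !== undefined ? parseInt(match[3], 10) : null,
  };
}
```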
- No Safety Limits
BAD
Keep going until done.
GOOD
- Max 100 iterations total
- Max 10 retries per failing test
- Checkpoint every 25 iterations
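Limits like these are easier to trust when the harness enforces them. A minimal sketch, using the three numbers above as defaults; `createBudget` and its `tick` method are hypothetical, and "retries" is interpreted here as the number of iterations in which a given test failed.

```typescript
interface Limits {
  maxIterations: number;
  maxRetriesPerTest: number;
  checkpointEvery: number;
}

const defaultLimits: Limits = {
  maxIterations: 100,
  maxRetriesPerTest: 10,
  checkpointEvery: 25,
};

type Verdict = "continue" | "checkpoint" | "stop";

function createBudget(limits: Limits = defaultLimits) {
  let iteration = 0;
  // Failure count per test id; prototype-free so any id is a safe key.
  const failures: { [testId: string]: number } = Object.create(null);

  return {
    // Call once per iteration with the failing test's id, or null if none failed.
    tick(failingTest: string | null): Verdict {
      iteration++;
      if (failingTest !== null) {
        failures[failingTest] = (failures[failingTest] || 0) + 1;
        if (failures[failingTest] > limits.maxRetriesPerTest) {
          return "stop";
        }
      }
      if (iteration >= limits.maxIterations) {
        return "stop"; // takes precedence over a checkpoint on the same tick
      }
      if (iteration % limits.checkpointEvery === 0) {
        return "checkpoint";
      }
      return "continue";
    },
  };
}
```

A "checkpoint" verdict is the natural place to commit the working state, so a later "stop" never loses more than one interval of work.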
Integration with SpecWeave
SpecWeave's /sw:auto command implements the Ralph Loop with:
Tasks.md as Completion Checklist
- Each [ ] pending task is a completion criterion
- Auto mode continues until all [x] completed
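To illustrate the idea of a checklist as a completion signal, the sketch below counts Markdown checkboxes in a task file. It assumes tasks are written as `- [ ] ...` and `- [x] ...` list items; this is not SpecWeave's actual parser, and `parseChecklist` and `isComplete` are hypothetical names.

```typescript
interface Checklist {
  pending: string[];
  completed: string[];
}

// Collects "- [ ] task" and "- [x] task" list items; other lines are ignored.
function parseChecklist(markdown: string): Checklist {
  const result: Checklist = { pending: [], completed: [] };
  const checkbox = /^\s*[-*]\s+\[( |x|X)\]\s+(.+)$/;
  for (const line of markdown.split(/\r?\n/)) {
    const match = checkbox.exec(line);
    if (match === null) {
      continue;
    }
    const bucket = match[1] === " " ? result.pending : result.completed;
    bucket.push(match[2].trim());
  }
  return result;
}

// Complete means: at least one task exists and none are pending.
function isComplete(list: Checklist): boolean {
  return list.pending.length === 0 && list.completed.length > 0;
}
```

Requiring at least one completed task avoids reporting success on an empty or unreadable file.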
Built-in Quality Gates
- --build: Build must pass
- --tests: Tests must pass
- --e2e: E2E tests must pass
- --cov N: Coverage threshold
Automatic Phase Management
- Tasks grouped by User Story
- Progress tracked in metadata.json
- External sync keeps stakeholders informed
Real-World Success Stories
From Anthropic's documentation:
- "6 repositories overnight"
- "$50k contract for $297 in API costs"
- "259 PRs, 497 commits, 40,000 lines in one month without opening IDE"
The key to success: Well-defined tasks with automated verification.