This skill emphasizes writing tests that provide confidence without becoming maintenance burdens. Tests should be fast, reliable, and focused on behavior rather than implementation details.
<quick_start>
TDD Red-Green-Refactor cycle:

RED: Write a failing test first

```js
test('adds numbers', () => {
  expect(add(1, 2)).toBe(3); // Fails - add() doesn't exist
});
```

GREEN: Write minimum code to pass

```js
const add = (a, b) => a + b; // Test passes
```

REFACTOR: Clean up while tests stay green

Test pyramid: 70% unit, 25% integration, 5% E2E
</quick_start>
<success_criteria>
Testing is successful when:

- TDD cycle followed: test written before implementation code
- Test pyramid balanced: ~70% unit, ~25% integration, ~5% E2E
- Tests are independent and can run in any order
- No flaky tests (run 3x to verify reliability)
- Coverage meets targets: 70-80% lines, 100% critical paths
- Test names describe behavior (what + when + expected result)
- Mocks only used for external dependencies, not own code
</success_criteria>
<core_principles>
The Testing Mindset

- Tests are documentation - A failing test is a specification that hasn't been implemented
- Test behavior, not implementation - Tests should survive refactoring
- Fast feedback loops - Unit tests run in milliseconds, not seconds
- Isolation by default - Each test should be independent
- Arrange-Act-Assert - Clear structure in every test
</core_principles>
<tdd_workflow>
TDD: Red-Green-Refactor
┌─────────────────────────────────────────────────────────┐
│                        TDD CYCLE                        │
│                                                         │
│   ┌─────────┐                                           │
│   │   RED   │ ◄─── Write a failing test                 │
│   └────┬────┘                                           │
│        │                                                │
│        ▼                                                │
│   ┌─────────┐                                           │
│   │  GREEN  │ ◄─── Write minimum code to pass           │
│   └────┬────┘                                           │
│        │                                                │
│        ▼                                                │
│   ┌─────────┐                                           │
│   │REFACTOR │ ◄─── Clean up while tests stay green      │
│   └────┬────┘                                           │
│        │                                                │
│        └──────────────► Back to RED                     │
└─────────────────────────────────────────────────────────┘
The Rules

- Write a failing test first - Never write production code without a failing test
- Write only enough test to fail - Compilation failures count as failures
- Write only enough code to pass - No more, no less
- Refactor only when green - Never refactor with failing tests
Common TDD Mistakes

| Mistake | Why It's Wrong | Instead |
|---|---|---|
| Writing tests after code | Tests become confirmation bias | Red-Green-Refactor |
| Testing private methods | Tests implementation, not behavior | Test public interface |
| Big leaps in test complexity | Hard to debug failures | Baby steps |
| Skipping refactor step | Technical debt accumulates | Always clean up |
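The cycle can be sketched in Python with a minimal example (the `slugify` function is illustrative, not from the source):

```python
# RED: the test is written first, while slugify() does not exist yet,
# so the run fails with a NameError - that failure is the specification.

# GREEN: write only enough code to make the test pass.
def slugify(text):
    # Lowercase, trim, and join words with hyphens.
    return "-".join(text.strip().lower().split())

# REFACTOR happens only after this test is green,
# re-running it after every change.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Hello   World  ") == "hello-world"
```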
</tdd_workflow>
<test_pyramid>
The Test Pyramid
┌───────────┐
│ E2E │ Few, slow, expensive
│ Tests │ (minutes)
└─────┬─────┘
│
┌──────────┴──────────┐
│ Integration Tests │ Some, medium speed
│ (API, Database) │ (seconds)
└──────────┬───────────┘
│
┌─────────────────┴─────────────────┐
│ Unit Tests │ Many, fast, cheap
│ (Functions, Components) │ (milliseconds)
└────────────────────────────────────┘
Distribution Guidelines

| Type | Percentage | Speed | Scope |
|---|---|---|---|
| Unit | 70-80% | <10ms each | Single function/component |
| Integration | 15-25% | <1s each | Multiple components, DB |
| E2E | 5-10% | <30s each | Full user flows |
What to Test Where
Unit Tests:

- Pure functions
- Business logic
- Data transformations
- Validation rules
- Component rendering

Integration Tests:

- API endpoints
- Database operations
- Service interactions
- Component integration

E2E Tests:

- Critical user flows (login, checkout)
- Happy paths only
- Smoke tests
</test_pyramid>
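A base-of-the-pyramid example: a pure validation rule tested in milliseconds with no setup (the `is_valid_quantity` rule is hypothetical):

```python
def is_valid_quantity(qty):
    """Business rule: order quantities are whole numbers from 1 to 99."""
    return type(qty) is int and 1 <= qty <= 99

def test_accepts_quantities_inside_the_boundary():
    assert is_valid_quantity(1) is True
    assert is_valid_quantity(99) is True

def test_rejects_out_of_range_or_non_integer_quantities():
    assert is_valid_quantity(0) is False
    assert is_valid_quantity(100) is False
    assert is_valid_quantity("5") is False
```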
<when_to_mock>
Mocking Strategy
The London vs Detroit Schools
London School (Mockist):

- Mock all dependencies
- Test in complete isolation
- Tests are very focused

Detroit School (Classicist):

- Only mock external services
- Test natural units together
- Tests are more realistic

Recommended: Pragmatic approach

- Mock external services (APIs, DBs in unit tests)
- Don't mock your own code unless necessary
- Use real implementations in integration tests
What to Mock

| Mock | Don't Mock |
|---|---|
| External APIs | Your own pure functions |
| File system (in unit tests) | Data transformations |
| Network requests | Business logic |
| Time/randomness | In-memory data structures |
| Expensive computations | Simple utilities |
Mocking Patterns

```ts
// GOOD: Mock external dependency
const mockFetch = vi.fn().mockResolvedValue({ data: [] });

// BAD: Mocking your own utilities
const mockFormatDate = vi.fn(); // Don't do this

// GOOD: Dependency injection for testability
function createService(httpClient = fetch) {
  return { getData: () => httpClient('/api/data') };
}

// In test:
const mockClient = vi.fn();
const service = createService(mockClient);
```
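The same dependency-injection pattern works in Python with the standard library's `unittest.mock`; this sketch uses a hypothetical service and endpoint:

```python
from unittest.mock import Mock

def create_service(http_client):
    # The HTTP client is injected, so tests can swap in a mock
    # without patching module internals.
    return {"get_data": lambda: http_client("/api/data")}

def test_get_data_calls_the_injected_client():
    mock_client = Mock(return_value={"data": []})
    service = create_service(mock_client)

    result = service["get_data"]()

    assert result == {"data": []}
    mock_client.assert_called_once_with("/api/data")
```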
</when_to_mock>
<test_structure>
Test Organization
File Naming
```
src/
├── components/
│   ├── Button.tsx
│   └── Button.test.tsx    # Colocated test
├── utils/
│   ├── format.ts
│   └── format.test.ts
└── tests/                 # Or separate folder
    └── integration/
        └── api.test.ts
```
Test Naming

```ts
// Pattern: describe what + when + expected result
describe('UserService', () => {
  describe('createUser', () => {
    it('returns user object when given valid email', () => {});
    it('throws ValidationError when email is invalid', () => {});
    it('sends welcome email after successful creation', () => {});
  });
});

// Alternative: BDD style
describe('UserService', () => {
  describe('when creating a user with valid data', () => {
    it('should return the created user', () => {});
    it('should send a welcome email', () => {});
  });

  describe('when email is invalid', () => {
    it('should throw ValidationError', () => {});
  });
});
```
Arrange-Act-Assert

```ts
it('calculates total with discount', () => {
  // Arrange - set up test data
  const cart = createCart([
    { price: 100, quantity: 2 },
    { price: 50, quantity: 1 }
  ]);
  const discount = 0.1;

  // Act - perform the action
  const total = calculateTotal(cart, discount);

  // Assert - verify result
  expect(total).toBe(225); // (200 + 50) * 0.9
});
```
</test_structure>
<what_not_to_test>
What NOT to Test
Skip These

- Framework code - React's useState, Express routing
- Third-party libraries - They have their own tests
- Trivial getters/setters - No logic = no test needed
- Implementation details - Private methods, internal state
- One-line functions - Unless they have complex logic
Focus On

- Business logic - Where bugs hide
- Edge cases - Nulls, empty arrays, boundaries
- Error paths - What happens when things fail
- User-facing behavior - What users actually do
- Regressions - Bugs that came back once
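The edge-case bullet can be made concrete with one parametrized test sweeping null, empty, and boundary inputs (the `total_price` function is illustrative):

```python
import pytest

def total_price(items):
    # Defensive business rule: treat a missing cart as empty.
    if items is None:
        return 0
    return sum(items)

@pytest.mark.parametrize("items,expected", [
    (None, 0),        # null input
    ([], 0),          # empty list
    ([0], 0),         # boundary: zero-priced item
    ([10, 20], 30),   # happy path
])
def test_total_price_edge_cases(items, expected):
    assert total_price(items) == expected
```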
Coverage Targets

| Metric | Target | Notes |
|---|---|---|
| Line coverage | 70-80% | Higher isn't always better |
| Branch coverage | 70-80% | More important than lines |
| Critical paths | 100% | Auth, payments, data mutations |
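Targets like these can be enforced in CI rather than checked by hand; for example, with `pytest-cov` a run can be made to fail below the line threshold (paths and numbers here are illustrative, adjust to your project):

```toml
# pyproject.toml - fail the suite under 70% coverage,
# measuring branches as well as lines.
[tool.pytest.ini_options]
addopts = "--cov=src --cov-branch --cov-fail-under=70"
```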
Warning: 100% coverage doesn't mean good tests. Bad tests can hit every line without testing anything meaningful. </what_not_to_test>
| Topic | Reference File | When to Load |
|---|---|---|
| Unit testing patterns | reference/unit-testing.md | Writing unit tests, mocking |
| Integration testing | reference/integration-testing.md | API tests, database tests |
| Test organization | reference/test-organization.md | Structuring test suites |
| Coverage strategies | reference/coverage-strategies.md | Setting coverage goals |

To load: Ask for the specific topic or check if context suggests it.
<framework_patterns>
Quick Reference by Framework
pytest (Python)
Fixtures

```python
@pytest.fixture
def user():
    return User(name="test")

def test_user_greet(user):
    assert user.greet() == "Hello, test"
```

Parametrize

```python
@pytest.mark.parametrize("input,expected", [
    ("hello", "HELLO"),
    ("world", "WORLD"),
])
def test_uppercase(input, expected):
    assert uppercase(input) == expected
```
vitest/jest (TypeScript)
```ts
// Basic test
test('adds numbers', () => {
  expect(add(1, 2)).toBe(3);
});

// Mock
vi.mock('./api', () => ({
  fetchUser: vi.fn().mockResolvedValue({ name: 'test' })
}));

// Component test
import { render, screen } from '@testing-library/react';

test('renders button', () => {
  render(<Button>Click</Button>);
  expect(screen.getByRole('button')).toHaveTextContent('Click');
});
```
Testing Library Principles
- Query by role, not test ID
- Test what users see, not implementation
- Prefer userEvent over fireEvent
- Avoid testing internal state
</framework_patterns>
Before marking code complete:
- Unit tests cover happy path
- Unit tests cover error cases
- Edge cases tested (null, empty, boundary)
- Integration tests for API endpoints
- No flaky tests (run 3x to verify)
- Tests are independent (run in any order)
- Test names describe behavior
- No hardcoded timeouts (use waitFor)
- Mocks are reset between tests
- Coverage meets project standards
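The mock-reset and independence items above can be sketched with a pytest autouse fixture that wipes a shared test double after every test (the `send_email` mock is hypothetical):

```python
import pytest
from unittest.mock import Mock

send_email = Mock()  # shared test double for an external service

@pytest.fixture(autouse=True)
def reset_mocks():
    # Runs around every test: let the test execute, then clear
    # any recorded calls so state never leaks between tests.
    yield
    send_email.reset_mock()

def test_sends_one_email():
    send_email("a@example.com")
    assert send_email.call_count == 1

def test_is_independent_of_previous_test():
    send_email("b@example.com")
    # Passes in any order only because reset_mocks cleared
    # the previous test's recorded call.
    assert send_email.call_count == 1
```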