# Fix Failing E2E Tests
Fix failing E2E tests by modifying test implementations. This command NEVER modifies `E2E_TESTS.md` spec files — the spec is the source of truth.
Additional instructions from the user: "$ARGUMENTS". Ignore if empty.
## Phase 1: Identify Failures
- Run the E2E test suite to capture current failures.
  - If the user specified a particular test or suite in $ARGUMENTS, run only that.
  - Otherwise, run the full E2E suite.
- For each failure, capture:
  - Test file and test name
  - Error message and stack trace
  - Expected vs actual values
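If the suite runs under Jest, one way to gather this information mechanically is to parse the runner's JSON report (e.g. `jest --json --outputFile=report.json`). The `testResults`/`assertionResults` shape below is Jest's JSON output format; adapt the field names for other runners. `extractFailures` is a hypothetical helper, not part of any framework:

```js
// Collect failing tests from a Jest JSON report object
function extractFailures(report) {
  const failures = [];
  for (const file of report.testResults) {
    for (const test of file.assertionResults || []) {
      if (test.status === "failed") {
        failures.push({
          file: file.name,                   // test file path
          test: test.fullName || test.title, // test name
          messages: test.failureMessages,    // error message + stack trace
        });
      }
    }
  }
  return failures;
}
```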
## Phase 2: Diagnose
For each failing test:

1. Read the failing test file to understand the test implementation.
2. Read the corresponding `E2E_TESTS.md` spec to understand the intended behavior.
3. Classify the failure:
   - **Test bug**: The test implementation is wrong (bad assertion, incorrect setup, missing cleanup).
     - Fix: Correct the test to match the spec.
   - **Stale test**: The code under test changed but the test wasn't updated.
     - Fix: Update the test to match current code behavior AND verify the spec still matches.
     - If the spec is outdated, inform the user but do NOT modify the spec.
   - **Environment issue**: Missing dependency, port conflict, timing issue.
     - Fix: Make the test more resilient (add retries, increase timeouts, fix resource cleanup).
   - **Flaky test**: Passes sometimes, fails sometimes.
     - Fix: Identify the source of non-determinism (timing, resource contention, ordering) and eliminate it.
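For ordering-related flakiness, when the spec does not guarantee element order, one fix is to make the assertion order-independent rather than depending on the runner's timing. A minimal sketch; `sameMembers` is a hypothetical helper, not a framework API:

```js
// True when both arrays contain the same elements, ignoring order
function sameMembers(actual, expected) {
  if (actual.length !== expected.length) return false;
  const a = [...actual].sort();
  const b = [...expected].sort();
  return a.every((value, i) => value === b[i]);
}
```

Only apply this when the spec is genuinely order-agnostic; if the spec mandates an order, fix the source of the non-deterministic ordering instead.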
## Phase 3: Fix
Apply fixes following these rules:

- **Never modify `E2E_TESTS.md`**: If the spec seems wrong, report it to the user but don't change it.
- **Follow existing patterns**: Match the code style and patterns of adjacent passing tests.
- **Maintain isolation**: Fixes must not introduce shared state or ordering dependencies.
- **Preserve assertions**: Fix HOW a test runs, not WHAT it asserts (unless the assertion contradicts the spec).
- **One fix at a time**: Fix one test, verify it passes, then move to the next.
- **Red/green TDD**: When fixing a test, first confirm the failure is reproducible (red). Apply the minimal fix. Re-run to confirm it passes (green). Then run the full suite to check for regressions. When manual testing or debugging reveals an issue not covered by existing tests, add a spec entry and generate a test that fails (red), then fix the underlying code or test (green).
Common fixes:

**Missing cleanup** (shown here as Jest-style JavaScript; `originalCwd`, `originalPath`, `safeCleanup`, and `testDir` are assumed to be set up by the test):

```js
// Add a proper afterEach if missing
afterEach(() => {
  process.chdir(originalCwd);      // restore the working directory
  process.env.PATH = originalPath; // restore the environment
  safeCleanup(testDir);            // remove per-test resources
});
```
**Timing issues**:

```js
// Poll with a timeout instead of fixed waits
async function waitFor(condition, timeout = 30000, interval = 500) {
  const start = Date.now();
  while (Date.now() - start < timeout) {
    if (await condition()) return;
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
  throw new Error(`condition not met within ${timeout}ms`);
}
```
**Resource conflicts** (requires Node's `fs`, `os`, and `path` modules):

```js
// Use unique resources per test, not a fixed path
const testDir = fs.mkdtempSync(path.join(os.tmpdir(), "unique-prefix-"));
```
## Phase 4: Verify
- Run the previously failing tests to confirm they now pass.
- Run the full E2E suite to confirm no regressions.
- Run the project's formatter on all modified files.
- Present a summary:
  - Tests fixed
  - Root cause for each fix
  - Any spec issues found (for user to address separately)