flaky-test-diagnoser-temporal
Use when a test is flaky, intermittent, non-deterministic, or randomly failing and the diagnosis should run as a durable Temporal-backed workflow that survives session crashes. Trigger phrases include "flaky-test-diagnoser temporal", "sagaflow flaky test", "durable flakiness diagnosis", "temporal-backed flaky-test investigation". Runs multi-run experiments, isolation tests, ordering permutations, and timing analysis. Fire-and-forget: the workflow keeps grinding while you do other work.
Repository Source, Needs Review
deep-debug
Use when a bug, test failure, or unexpected behavior needs diagnosing — including production incidents, regressions, stack traces, mysterious failures, flaky tests, or any symptom needing root-cause analysis. Trigger phrases include "debug this", "why is this failing", "find the bug", "fix the bug", "root cause", "what's wrong with", "this is broken", "diagnose", "troubleshoot", "investigate this failure", "the test is failing", "this used to work", "why doesn't this work", "where's the bug". Adversarial hypothesis-driven debugging with parallel competing hypotheses across orthogonal dimensions, blind independent judging, discriminating probes that falsify leaders, TDD-gated fix loops, and mandatory architectural escalation after 3 failed attempts.
Repository Source, Needs Review
deep-research
Use when researching, investigating, or exploring a topic systematically with orthogonal multi-dimensional coverage and source-quality tiers. Trigger phrases include "research this deeply", "deep research on", "investigate this topic thoroughly", "explore this topic", "systematic research", "multi-dimensional research", "comprehensive research", "cover all angles of", "thorough research on", "deep dive into (research)", "exhaustive research". Spawns parallel agents across WHO/WHAT/HOW/WHERE/WHEN/WHY/LIMITS with risk-stratified spot-checking. Bounded by a user-controlled round budget with honest coverage reporting on what was and wasn't covered.
Repository Source, Needs Review
flaky-test-diagnoser
Systematically diagnoses why a test is flaky by running multi-run experiments, isolation tests, ordering permutations, and timing analysis. Use when the user says a test is flaky, intermittent, non-deterministic, randomly failing, passes sometimes, or asks to debug test flakiness.
Repository Source, Needs Review