Agentic Quality Engineering
<default_to_action> When implementing agentic QE or coordinating agents:
- SPAWN the appropriate agent(s) for the task using the Task tool with an agent type
- CONFIGURE agent coordination (hierarchical/mesh/sequential)
- EXECUTE with PACT principles: Proactive analysis, Autonomous operation, Collaborative feedback, Targeted risk focus
- VALIDATE results through quality gates before deployment
- LEARN from outcomes: store patterns in the aqe/learning/* namespace
Quick Agent Selection:
- Test generation needed → qe-test-generator
- Coverage gaps → qe-coverage-analyzer
- Quality decision → qe-quality-gate
- Security scan → qe-security-scanner
- Performance test → qe-performance-tester
- Full pipeline → qe-fleet-commander
Critical Success Factors:
- Agents amplify human expertise; they do not replace it
- Keep a human in the loop for critical decisions
- Measure: bugs caught, time saved, coverage improved </default_to_action>
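The Quick Agent Selection list above is effectively a lookup table. A minimal sketch follows, assuming hypothetical need labels such as `coverage-gaps` (only the agent names come from this document); it is one way to keep the mapping in code, not the fleet's actual routing logic.

```javascript
// Hypothetical need labels mapped to the agent names from the selection list.
const AGENT_FOR_NEED = new Map([
  ["test-generation", "qe-test-generator"],
  ["coverage-gaps", "qe-coverage-analyzer"],
  ["quality-decision", "qe-quality-gate"],
  ["security-scan", "qe-security-scanner"],
  ["performance-test", "qe-performance-tester"],
  ["full-pipeline", "qe-fleet-commander"],
]);

// Returns the agent name for a need. Unknown needs throw so that a human
// decides what to do instead of an agent being picked silently.
function selectAgent(need) {
  const agent = AGENT_FOR_NEED.get(need);
  if (agent === undefined) {
    throw new Error(`No agent mapped for need: ${need}`);
  }
  return agent;
}
```

Failing loudly on an unmapped need keeps the human-in-the-loop principle intact for cases the table does not cover.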
Quick Reference Card
When to Use
- Designing autonomous testing systems
- Scaling QE with intelligent agents
- Implementing multi-agent coordination
- Building CI/CD quality pipelines
PACT Principles
| Principle | Agent Behavior | Human Role |
|-----------|----------------|------------|
| Proactive | Analyze pre-merge, predict risk | Set guardrails |
| Autonomous | Execute tests, fix flaky tests | Review critical |
| Collaborative | Multi-agent coordination | Provide context |
| Targeted | Risk-based prioritization | Define risk areas |
19-Agent Fleet
| Category | Agents | Primary Use |
|----------|--------|-------------|
| Core Testing (5) | test-generator, test-executor, coverage-analyzer, quality-gate, quality-analyzer | Daily testing |
| Performance/Security (2) | performance-tester, security-scanner | Non-functional |
| Strategic (3) | requirements-validator, production-intelligence, fleet-commander | Planning |
| Advanced (4) | regression-risk-analyzer, test-data-architect, api-contract-validator, flaky-test-hunter | Specialized |
| Visual/Chaos (2) | visual-tester, chaos-engineer | Edge cases |
| Deployment (1) | deployment-readiness | Release |
| Analysis (1) | code-complexity | Maintainability |
Coordination Patterns
Hierarchical: fleet-commander → [generators] → [executors] → quality-gate
Mesh: test-gen ↔ coverage ↔ quality (peer decisions)
Sequential: risk-analyzer → test-gen → executor → coverage → gate
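To make the Sequential pattern concrete, here is a small sketch in which each stage receives the previous stage's output. The stage names are taken from the line above; `runSequential` and the `handlers` map are hypothetical stand-ins for real agent invocations, so treat this as an illustration of the ordering only.

```javascript
// Stage order copied from the "Sequential" pattern.
const SEQUENTIAL_STAGES = ["risk-analyzer", "test-gen", "executor", "coverage", "gate"];

// Runs each stage in order, feeding the previous output into the next stage.
// `handlers` maps a stage name to a function (a hypothetical stand-in for an agent).
function runSequential(stages, handlers, input) {
  const trace = [];
  let payload = input;
  for (const stage of stages) {
    const handler = handlers[stage];
    if (typeof handler !== "function") {
      throw new Error(`Missing handler for stage: ${stage}`);
    }
    payload = handler(payload);
    trace.push(stage);
  }
  return { result: payload, trace };
}
```

A hierarchical or mesh topology would need a different runner; this sketch covers only the strictly ordered case.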
Success Criteria
✅ 10x deployment frequency with same/better quality
✅ Coverage gaps detected in real-time
✅ Bugs caught pre-production
❌ Agents acting without human oversight on critical decisions
❌ Deploying all 19 agents at once (start with 1-2)
Core Concepts
QE Evolution
| Stage | Approach | Limitation |
|-------|----------|------------|
| Traditional | Manual everything | Human bottleneck |
| Automation | Scripts + fixed scenarios | Needs orchestration |
| Agentic | AI agents + human judgment | Requires trust-building |
Core Premise: Agents amplify human expertise for 10x scale.
Key Capabilities
- Intelligent Test Generation
  // Agent analyzes code change, generates targeted tests
  const tests = await qeTestGenerator.generate(prDiff);
  // → Happy path, edge cases, error handling tests
- Pattern Detection - Scan logs, find anomalies, correlate errors
- Adaptive Strategy - Adjust test focus based on risk signals
- Root Cause Analysis - Link failures to code changes, suggest fixes
Agent Coordination
Memory Namespaces
aqe/test-plan/* - Test planning decisions
aqe/coverage/* - Coverage analysis results
aqe/quality/* - Quality metrics and gates
aqe/learning/* - Patterns and Q-values
aqe/coordination/* - Cross-agent state
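Because every key in this document is prefixed with one of the namespaces above, a small helper can keep keys consistent. The sketch below assumes keys are plain slash-separated strings; `buildKey` and `namespaceOf` are hypothetical helper names, not part of the agentic-qe tooling.

```javascript
// The five namespaces listed above; anything else is rejected.
const AQE_NAMESPACES = [
  "aqe/test-plan",
  "aqe/coverage",
  "aqe/quality",
  "aqe/learning",
  "aqe/coordination",
];

// Builds a key such as "aqe/coverage/auth-module" from a namespace and path segments.
function buildKey(namespace, ...segments) {
  if (!AQE_NAMESPACES.includes(namespace)) {
    throw new Error(`Unknown namespace: ${namespace}`);
  }
  const invalid = segments.some((s) => typeof s !== "string" || s === "" || s.includes("/"));
  if (segments.length === 0 || invalid) {
    throw new Error("Key needs at least one non-empty segment without slashes");
  }
  return [namespace, ...segments].join("/");
}

// Returns the namespace a key belongs to, or null when none matches.
function namespaceOf(key) {
  return AQE_NAMESPACES.find((ns) => key.startsWith(ns + "/")) ?? null;
}
```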
Memory Operations (MCP Tools)
CRITICAL: Always use mcp__agentic-qe__memory_store with persist: true for learnings.
- Store data to persistent memory:
// Store test plan decisions (persisted to .agentic-qe/memory.db)
mcp__agentic-qe__memory_store({
  key: "aqe/test-plan/pr-123",
  namespace: "aqe/test-plan",
  value: {
    prNumber: 123,
    riskLevel: "medium",
    requiredCoverage: 85,
    testTypes: ["unit", "integration"],
    estimatedTime: 1800
  },
  persist: true,  // ⚠️ REQUIRED for cross-session persistence
  ttl: 604800     // 7 days (0 = permanent)
})
- Retrieve prior learnings before task:
// Query patterns before starting test generation
const priorData = await mcp__agentic-qe__memory_retrieve({
  key: "aqe/learning/patterns/test-generation/*",
  namespace: "aqe/learning",
  includeMetadata: true
})
// Use patterns to guide current task
if (priorData.success) {
  console.log(`Loaded ${priorData.patterns.length} prior patterns`);
}
- Store coverage analysis results:
mcp__agentic-qe__memory_store({
  key: "aqe/coverage/auth-module",
  namespace: "aqe/coverage",
  value: {
    moduleId: "auth-module",
    currentCoverage: 78,
    gaps: ["error-handling", "edge-cases"],
    suggestedTests: 12,
    priority: "high"
  },
  persist: true,
  ttl: 1209600  // 14 days
})
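The `ttl` values in the store calls above are expressed in seconds (604800 for 7 days, 1209600 for 14 days, 0 for permanent). A short sketch of that arithmetic follows; `daysToTtl` and `buildStoreArgs` are hypothetical helpers that only assemble the argument object and do not call the MCP tool.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60; // 86400

// Converts whole days to a ttl in seconds; 0 days means "permanent" (ttl 0).
function daysToTtl(days) {
  if (!Number.isInteger(days) || days < 0) {
    throw new RangeError(`days must be a non-negative integer, got ${days}`);
  }
  return days * SECONDS_PER_DAY;
}

// Assembles the argument shape used by the store examples above,
// always with persist: true as the CRITICAL note requires.
function buildStoreArgs(namespace, key, value, ttlDays) {
  return { key, namespace, value, persist: true, ttl: daysToTtl(ttlDays) };
}
```

Deriving the number from days avoids hand-typed constants drifting away from their comments.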
Three-Phase Memory Protocol
For coordinated multi-agent tasks, use the STATUS → PROGRESS → COMPLETE pattern:
// PHASE 1: STATUS - Task starting
mcp__agentic-qe__memory_store({
  key: "aqe/coordination/task-123/status",
  namespace: "aqe/coordination",
  value: { status: "running", agent: "qe-test-generator", startTime: Date.now() },
  persist: true
})

// PHASE 2: PROGRESS - Intermediate updates
mcp__agentic-qe__memory_store({
  key: "aqe/coordination/task-123/progress",
  namespace: "aqe/coordination",
  value: { progress: 50, action: "generating-unit-tests", testsGenerated: 25 },
  persist: true
})

// PHASE 3: COMPLETE - Task finished
mcp__agentic-qe__memory_store({
  key: "aqe/coordination/task-123/complete",
  namespace: "aqe/coordination",
  value: { status: "complete", result: "success", testsGenerated: 47, coverageAchieved: 92.3, duration: 15000 },
  persist: true
})
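The three phases share the same key layout and differ only in the final path segment, so they can be wrapped in one helper. The sketch below injects the store function so it can run against an in-memory `Map`; `createTaskReporter` is a hypothetical name, and the `Map` is a stand-in for the real memory tool, not its implementation.

```javascript
// Returns helpers that write STATUS, PROGRESS and COMPLETE records for one task.
// `store` is any function accepting the argument shape used in the calls above.
function createTaskReporter(taskId, agent, store) {
  const namespace = "aqe/coordination";
  const write = (phase, value) =>
    store({ key: `${namespace}/${taskId}/${phase}`, namespace, value, persist: true });
  return {
    start: (startTime) => write("status", { status: "running", agent, startTime }),
    progress: (progress, action) => write("progress", { progress, action }),
    complete: (result, details = {}) =>
      write("complete", { status: "complete", result, ...details }),
  };
}

// In-memory stand-in for the memory store (an assumption for this sketch).
const memory = new Map();
const reporter = createTaskReporter("task-123", "qe-test-generator", (entry) =>
  memory.set(entry.key, entry.value)
);
reporter.start(Date.now());
reporter.progress(50, "generating-unit-tests");
reporter.complete("success", { testsGenerated: 47 });
```

Other agents can then read `aqe/coordination/task-123/status` or `/complete` without knowing which agent wrote them.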
Blackboard Events
| Event | Trigger | Subscribers |
|-------|---------|-------------|
| test:generated | New tests created | executor, coverage |
| coverage:gap | Gap detected | test-generator |
| quality:decision | Gate evaluated | fleet-commander |
| security:finding | Vulnerability found | quality-gate |
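The event table above describes a publish/subscribe arrangement. A minimal in-memory sketch is shown here, assuming each subscriber simply accumulates events in an inbox; the `Blackboard` class is hypothetical and says nothing about how the real blackboard delivers events.

```javascript
// Subscriber lists taken from the Blackboard Events table.
const SUBSCRIPTIONS = new Map([
  ["test:generated", ["executor", "coverage"]],
  ["coverage:gap", ["test-generator"]],
  ["quality:decision", ["fleet-commander"]],
  ["security:finding", ["quality-gate"]],
]);

class Blackboard {
  constructor(subscriptions) {
    this.subscriptions = subscriptions;
    this.inboxes = new Map(); // agent name -> list of received events
  }

  // Delivers the event to every subscriber and returns how many received it.
  publish(event, payload) {
    const subscribers = this.subscriptions.get(event);
    if (!subscribers) {
      throw new Error(`Unknown event: ${event}`);
    }
    for (const agent of subscribers) {
      if (!this.inboxes.has(agent)) this.inboxes.set(agent, []);
      this.inboxes.get(agent).push({ event, payload });
    }
    return subscribers.length;
  }

  // Returns the events an agent has received so far (empty if none).
  inbox(agent) {
    return this.inboxes.get(agent) ?? [];
  }
}
```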
Example: PR Quality Pipeline
// 1. Risk analysis
const risks = await Task("Analyze PR", prDiff, "qe-regression-risk-analyzer");

// 2. Generate tests for risks
const tests = await Task("Generate tests", risks, "qe-test-generator");

// 3. Execute + analyze
const results = await Task("Run tests", tests, "qe-test-executor");
const coverage = await Task("Check coverage", results, "qe-coverage-analyzer");

// 4. Quality decision
const decision = await Task("Evaluate", {results, coverage}, "qe-quality-gate");
// → GO/NO-GO with rationale
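The final step returns a GO/NO-GO decision with a rationale. As an illustration of what such a rule could look like, the sketch below assumes `results` carries a `failed` count and `coverage` carries a `percent` value; both shapes and the default threshold of 85 (borrowed from the `requiredCoverage` example earlier) are assumptions, not the actual qe-quality-gate logic.

```javascript
// Hypothetical gate rule: any failed test or coverage below the threshold blocks the release.
function evaluateGate({ results, coverage }, requiredCoverage = 85) {
  const reasons = [];
  if (results.failed > 0) {
    reasons.push(`${results.failed} test(s) failed`);
  }
  if (coverage.percent < requiredCoverage) {
    reasons.push(`coverage ${coverage.percent}% is below ${requiredCoverage}%`);
  }
  return reasons.length === 0
    ? { decision: "GO", rationale: "all tests passed and coverage meets the threshold" }
    : { decision: "NO-GO", rationale: reasons.join("; ") };
}
```

Returning the reasons alongside the verdict gives the human reviewer something concrete to confirm or override.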
Implementation Phases
| Phase | Duration | Goal | Agent(s) |
|-------|----------|------|----------|
| Experiment | Weeks 1-4 | Validate one use case | 1 agent |
| Integrate | Months 2-3 | CI/CD pipeline | 3-4 agents |
| Scale | Months 4-6 | Multiple use cases | 8+ agents |
| Evolve | Ongoing | Continuous learning | Full fleet |
Phase 1 Example
Week 1: Deploy single agent
aqe agent spawn qe-test-generator
Weeks 2-3: Generate tests for 10 PRs
Track: bugs found, test quality, review time
Week 4: Measure impact
aqe agent metrics qe-test-generator
→ Tests: 150, Bugs: 12, Time saved: 8h
Limitations & Strengths
Agents Excel At
- Volume: Scan thousands of logs in seconds
- Patterns: Find correlations humans miss
- Tireless: 24/7 testing and monitoring
- Speed: Instant code change analysis
Agents Need Humans For
- Business context and priorities
- Ethical judgment and trade-offs
- Creative exploration ("what if" scenarios)
- Domain expertise (healthcare, finance, legal)
Best Practices
| Do | Don't |
|----|-------|
| Start with one agent, one use case | Deploy the whole fleet at once |
| Build feedback loops early | Deploy and forget |
| Have humans review agent output | Auto-merge without review |
| Measure bugs caught, time saved | Track vanity metrics (test count) |
| Build trust gradually | Give full autonomy immediately |
Trust Progression
Month 1: Agent suggests → Human decides
Month 2: Agent acts → Human reviews after
Month 3: Agent autonomous on low-risk
Month 4: Agent handles critical with oversight
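The progression above can be read as a policy that depends on how long the agent has been in use and how risky the task is. The sketch below is one possible reading: the risk labels (`low`, `medium`, `critical`), the mode names, and the choice to keep critical work in suggest-only mode before month 4 are all assumptions made for illustration.

```javascript
// Maps (month of adoption, task risk) to a hypothetical autonomy mode.
function autonomyFor(month, risk) {
  if (!Number.isInteger(month) || month < 1) {
    throw new RangeError(`month must be a positive integer, got ${month}`);
  }
  if (risk === "critical") {
    // Assumption: critical work stays suggest-only until month 4.
    return month >= 4 ? "act-with-oversight" : "suggest";
  }
  if (month === 1) return "suggest";           // Agent suggests, human decides
  if (month === 2) return "act-then-review";   // Agent acts, human reviews after
  // Month 3 onward: full autonomy only for low-risk work.
  return risk === "low" ? "autonomous" : "act-then-review";
}
```

Encoding the policy as a pure function makes it easy to review and to tighten if trust has not yet been earned.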
Agent Coordination Hints
coordination:
  topology: hierarchical
  commander: qe-fleet-commander
  memory_namespace: aqe/coordination
  blackboard_topic: qe-fleet
preload_skills:
- agentic-quality-engineering # Always (this skill)
- risk-based-testing # For prioritization
- quality-metrics # For measurement
agent_assignments:
  qe-test-generator: [api-testing-patterns, tdd-london-chicago]
  qe-coverage-analyzer: [quality-metrics, risk-based-testing]
  qe-security-scanner: [security-testing, risk-based-testing]
  qe-performance-tester: [performance-testing]
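One way to read the hints above is that each agent receives the preloaded skills plus its own assignments. The sketch below assumes exactly that merge rule, which the source does not spell out; `skillsFor` is a hypothetical helper and the `config` object mirrors the hints as a plain JavaScript value.

```javascript
// The hints above, mirrored as a plain object.
const config = {
  preload_skills: ["agentic-quality-engineering", "risk-based-testing", "quality-metrics"],
  agent_assignments: {
    "qe-test-generator": ["api-testing-patterns", "tdd-london-chicago"],
    "qe-coverage-analyzer": ["quality-metrics", "risk-based-testing"],
    "qe-security-scanner": ["security-testing", "risk-based-testing"],
    "qe-performance-tester": ["performance-testing"],
  },
};

// Assumed merge rule: preloaded skills first, then agent-specific ones, no duplicates.
function skillsFor(agent, cfg) {
  const hasOwn = Object.prototype.hasOwnProperty.call(cfg.agent_assignments, agent);
  const assigned = hasOwn ? cfg.agent_assignments[agent] : [];
  return [...new Set([...cfg.preload_skills, ...assigned])];
}
```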
Related Skills
- holistic-testing-pact - PACT principles deep dive
- risk-based-testing - Prioritize agent focus
- quality-metrics - Measure agent effectiveness
- api-testing-patterns, security-testing, performance-testing - Specialized testing
Resources
- Agent definitions: .claude/agents/
- CLI: aqe agent --help
- Fleet status: aqe fleet status
Success Metric: Deploy 10x more frequently with same or better quality through intelligent agent collaboration.