Variant Analysis

Security Notice

AUTHORIZED USE ONLY: These skills are for DEFENSIVE security analysis and authorized research:

Authorized security assessments with written permission
Proactive vulnerability discovery in owned codebases
Post-incident variant hunting after a CVE is reported
Security research with proper disclosure
Educational purposes in controlled environments

NEVER use for:

Scanning systems without authorization
Developing exploits for unauthorized use
Circumventing security controls
Any illegal activities

Step 1: Seed Vulnerability Analysis

Start from a known vulnerability (CVE, bug report, or code pattern):

Extract the Vulnerability Pattern

Identify the bug class: What type of vulnerability is it? (SQL injection, XSS, buffer overflow, TOCTOU, etc.)
Identify the source: Where does untrusted data enter? (user input, network, file, environment)
Identify the sink: Where does the data cause harm? (SQL query, HTML output, memory write, system call)
Identify missing sanitization: What check/transform is absent between source and sink?
Abstract the pattern: Generalize beyond the specific instance

Example Seed Analysis

CVE-2024-XXXX: SQL Injection in user search

Bug class: CWE-089 (SQL Injection)
Source: HTTP request parameter q
Sink: String concatenation into SQL query
Missing: Parameterized query or input sanitization
Pattern: request.param → string concat → db.query()

Step 2: Pattern Generalization

Transform the seed into a query pattern:

Abstraction Levels

Level Description Example

Exact Same function, same file searchUsers(req.query.q)

Local Same pattern, different function Any db.query("..."+userInput)

Structural Same dataflow shape Any source-to-sink without sanitization

Semantic Same bug class, any syntax Any SQL injection variant

CodeQL Pattern Template

/**

@name Variant of CVE-XXXX: [description]
@description Finds code structurally similar to [seed vulnerability]
@kind path-problem
@problem.severity error
@security-severity 8.0
@precision high
@id js/variant-cve-xxxx
@tags security
```
  external/cwe/cwe-089
```

import javascript import DataFlow::PathGraph

class UntrustedSource extends DataFlow::Node { UntrustedSource() { // Define sources: HTTP parameters, request body, etc. this = any(Express::RequestInputAccess ria).flow() } }

class VulnerableSink extends DataFlow::Node { VulnerableSink() { // Define sinks: string concatenation in SQL context exists(DataFlow::CallNode call | call.getCalleeName() = "query" and this = call.getArgument(0) ) } }

class VariantConfig extends DataFlow::Configuration { VariantConfig() { this = "VariantConfig" }

override predicate isSource(DataFlow::Node source) { source instanceof UntrustedSource }

override predicate isSink(DataFlow::Node sink) { sink instanceof VulnerableSink }

override predicate isBarrier(DataFlow::Node node) { // Known sanitizers that prevent the vulnerability node = any(DataFlow::CallNode c | c.getCalleeName() = ["escape", "sanitize", "parameterize"] ).getAResult() } }

from VariantConfig config, DataFlow::PathNode source, DataFlow::PathNode sink where config.hasFlowPath(source, sink) select sink.getNode(), source, sink, "Potential variant of CVE-XXXX: untrusted data flows to SQL query without sanitization."

Semgrep Pattern Template

rules:

id: variant-cve-xxxx-sql-injection message: > Potential variant of CVE-XXXX: User input flows into SQL query via string concatenation without parameterization. severity: ERROR languages: [javascript, typescript] metadata: cwe: - CWE-089 confidence: HIGH impact: HIGH category: security technology: - express - node.js references: - https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-XXXX patterns:
- pattern-either:
  - pattern: | $DB.query("..." + $USERINPUT + "...")
  - pattern: | $DB.query(...${$USERINPUT}...)
  - pattern: | $QUERY = "..." + $USERINPUT + "..." ... $DB.query($QUERY)
- pattern-not:
  - pattern: | $DB.query($QUERY, [...]) fix: | $DB.query($QUERY, [$USERINPUT])

Step 3: Variant Discovery

Run the Analysis

CodeQL variant scan

codeql database analyze codeql-db
--format=sarifv2.1.0
--output=variant-results.sarif
./variant-queries/

Semgrep variant scan

semgrep scan
--config=./variant-rules/
--sarif --output=variant-semgrep.sarif

Cross-repo CodeQL scan (GitHub)

codeql database analyze codeql-db-repo-1 codeql-db-repo-2
--format=sarifv2.1.0
--output=cross-repo-variants.sarif
./variant-queries/

Manual Pattern Search

When automated tools miss variants, use manual search:

Search for the syntactic pattern

grep -rn "db.query.+" --include=".js" --include="*.ts" .

Search for the function call pattern

grep -rn ".query\s*(" --include=".js" --include=".ts" . | grep -v "parameterized|escape|sanitize"

AST-based search with ast-grep

sg -p 'db.query("..." + $X)' --lang js

Step 4: Variant Classification

Triage Each Variant

For each discovered instance, classify:

Factor Question Impact on Priority

Reachability Can an attacker reach this code path? Critical if reachable

Exploitability Can the vulnerability be exploited? Critical if exploitable

Impact What damage can exploitation cause? Based on CIA triad

Confidence How certain is this a true positive? HIGH/MEDIUM/LOW

Similarity How structurally close to seed? Higher = higher confidence

Variant Family Tracking

Variant Family: CWE-089 SQL Injection

Seed: CVE-XXXX (src/api/users.js:42)

Pattern: request.param -> string concat -> db.query()

Variants Found:

V-001 src/api/products.js:78 (HIGH confidence)
- Same pattern, different endpoint
- Exploitable: YES
- Fix: Use parameterized query
V-002 src/api/orders.js:123 (MEDIUM confidence)
- Similar pattern, additional transform
- Exploitable: NEEDS INVESTIGATION
- Fix: Use parameterized query
V-003 src/legacy/search.js:45 (LOW confidence)
- Partial match, may be sanitized upstream
- Exploitable: UNLIKELY
- Fix: Verify sanitization chain

Step 5: Remediation and Report

Variant Analysis Report

Seed: [CVE/bug ID and description] Date: YYYY-MM-DD Scope: [repositories/directories analyzed] Tools: CodeQL, Semgrep, manual review

Executive Summary

Variants found: X
Critical: X | High: X | Medium: X | Low: X
False positives: X
Estimated remediation effort: X hours

Variant Details

[For each variant: location, classification, remediation]

Pattern Evolution

[How the pattern varies across the codebase]

Recommendations

Fix all CRITICAL/HIGH variants immediately
Add regression tests for each variant
Add CI/CD checks to prevent pattern recurrence
Consider architectural changes to eliminate the bug class

Common Vulnerability Seed Patterns

Injection Variants

Seed Pattern Variant Discovery Query

SQL injection via concatenation source -> string.concat -> db.query

Command injection via interpolation source -> template.literal -> exec

XSS via innerHTML source -> assignment -> innerHTML

Path traversal via user path source -> path.join -> fs.read

Authentication Variants

Seed Pattern Variant Discovery Query

Missing auth check route.handler without auth.middleware

Weak comparison password == input (not timing-safe)

Token reuse token.generate without uniqueness

Related Skills

static-analysis
CodeQL and Semgrep with SARIF output
semgrep-rule-creator
Custom vulnerability detection rules
differential-review
Security-focused diff analysis
insecure-defaults
Hardcoded credentials and fail-open detection
security-architect
STRIDE threat modeling

Agent Integration

security-architect (primary): Threat modeling and vulnerability assessment
code-reviewer (secondary): Pattern-aware code review
penetration-tester (secondary): Exploit verification for variants

Iron Laws

ALWAYS start from a confirmed seed vulnerability before writing any pattern queries
NEVER broaden a query without first verifying it matches the known seed vulnerability
ALWAYS test pattern queries against at least one known-vulnerable instance before scanning broadly
NEVER report a variant finding without manual triage confirming reachability and exploitability
ALWAYS check all related repositories when a variant is confirmed in one codebase

Anti-Patterns

Anti-Pattern Why It Fails Correct Approach

Exact-match queries only Misses refactored and syntactically different variants Abstract the pattern and test all four abstraction levels

No seed verification step Query may not match the known vulnerability Test query against seed instance first

Overly broad patterns High false positive rate wastes triage time Narrow with pattern-not for known-safe patterns

Single-repo scan Variant may exist in sibling repositories Scan all related repos with the same framework

Stopping after first variant found Leaves the bug class partially patched Perform exhaustive search across the full codebase

Memory Protocol (MANDATORY)

Before starting: Read .claude/context/memory/learnings.md

After completing:

New pattern -> .claude/context/memory/learnings.md
Issue found -> .claude/context/memory/issues.md
Decision made -> .claude/context/memory/decisions.md

ASSUME INTERRUPTION: If it's not in memory, it didn't happen.

variant-analysis

Safety Notice

Copy this and send it to your AI assistant to learn