fuzz

Fuzz Input Generation

Generate intelligent, context-aware fuzz test inputs by analyzing input parsing code. Produces boundary values, type confusion inputs, encoding edge cases, format-specific attacks, and injection payloads tailored to the specific parser and data types in scope. Output is structured JSON test case sets ready for integration with test harnesses.

Supported Flags

Read ../../shared/schemas/flags.md for the full flag specification.

Flag Fuzz Behavior

--scope

Identifies which input handlers to generate fuzz inputs for. Default changed .

--depth quick

Standard boundary values and common injection strings only.

--depth standard

Context-aware inputs based on code analysis of the parser.

--depth deep

Standard + format-specific attacks, encoding mutations, and chained payloads.

--depth expert

Deep + adversarial inputs designed to bypass specific validation logic found in code.

--severity

Generate inputs targeting vulnerabilities at or above this severity.

--format

Default json . Use text for human-readable listing.

Workflow

Step 1: Identify Input Handlers

Locate input parsing and processing code in scope:

API endpoint handlers: Functions that read request body, query params, headers.
File parsers: Functions that parse uploaded files, config files, data imports.
CLI argument parsers: Argument parsing with argparse , commander , cobra , clap .
Message consumers: Functions processing messages from queues, WebSockets, SSE.
Deserialization points: JSON.parse, XML parsing, YAML loading, protobuf decoding.
Database query builders: Functions constructing queries from user input.

For each handler, identify:

Expected input type (string, number, array, object, file).
Validation rules (regex, schema, type checks, length limits).
How the input is used downstream (SQL, shell, HTML, file path, URL, regex).

Step 2: Analyze Input Constraints

Read the code to understand what the parser expects and what it guards against:

Type expectations: What types does the code assume? Where are type coercions?
Length limits: Are there explicit length checks? What happens at max length?
Character restrictions: Are certain characters filtered or escaped? Which ones?
Format requirements: Does the input need to match a pattern (email, URL, date)?
Range constraints: Numeric bounds, enum values, allowed file extensions.
Nested structure: How deep can objects/arrays nest? Are there recursion limits?

Step 3: Generate Boundary Value Inputs

For each input field, generate boundary value test cases:

Input Type Boundary Values

String Empty "" , single char "a" , max length, max length + 1, unicode BOM, null bytes "\x00"

Number 0, -1, MAX_INT, MIN_INT, MAX_INT+1, NaN, Infinity, -Infinity, float precision edge cases

Array Empty [] , single element, very large array (10000+), nested arrays, mixed types

Object Empty {} , deeply nested (100+ levels), circular reference attempt, prototype keys

Boolean true , false , 0 , 1 , "" , "false" , null , undefined

Date Epoch 0, negative timestamp, far future, invalid dates (Feb 30), timezone edge cases

File Empty file, 0-byte, huge file, wrong extension, polyglot file, symlink

Step 4: Generate Type Confusion Inputs

Inputs designed to exploit type coercion and type assumption bugs:

Generate inputs that send the wrong type: string where number expected, array where string expected, object with toString override, deeply nested arrays, null where required, boolean where string expected, numeric string where number expected, and prototype/constructor pollution objects (proto , constructor.prototype ).

Step 5: Generate Encoding Edge Cases

Inputs exploiting encoding and character set handling:

Unicode: Normalization forms (NFC, NFD, NFKC, NFKD), homoglyphs, right-to-left override, zero-width characters.
URL encoding: Double encoding (%2527 ), mixed encoding, overlong UTF-8.
HTML entities: Named (& ), numeric (& ), hex (& ), surrogate pairs.
Null bytes: Mid-string null bytes for truncation attacks.
Line endings: \r\n , \r , \n , \x0b , \x0c , \x85 , \u2028 , \u2029 .
Case mapping: Turkish locale I /i dotless variants, German ß /SS .

Step 6: Generate Context-Aware Injection Payloads

Based on how the input is used downstream (identified in Step 1), generate targeted payloads:

Sink Context Payload Category

SQL query SQL injection: UNION, boolean blind, time blind, stacked queries, comment-based

Shell command Command injection: semicolons, pipes, backticks, $() , newlines

HTML output XSS: script tags, event handlers, SVG/MathML, template injection

File path Path traversal: ../ , null bytes, long paths, reserved names (CON, NUL)

URL construction SSRF: localhost variants, IPv6, DNS rebinding, scheme confusion

Regex input ReDoS: catastrophic backtracking patterns, exponential quantifiers

XML parser XXE: external entity, parameter entity, SSRF via DTD

LDAP query LDAP injection: wildcards, boolean operators, null bytes

Header value Header injection: CRLF, response splitting

JSON parser JSON interoperability: duplicate keys, large numbers, deep nesting

Step 7: Generate Format-Specific Attacks

At --depth deep and above, generate inputs targeting specific file/data formats:

JSON: Duplicate keys (parser-dependent behavior), comments, trailing commas, BOM prefix.
XML: Billion laughs, quadratic blowup, external entities, CDATA abuse.
YAML: Anchor bombs, merge keys, tag deserialization (!!python/object ).
CSV: Formula injection (=CMD() ), field separator in values, newlines in quoted fields.
JWT: Algorithm none, key confusion (RS256/HS256), expired but valid signature.
GraphQL: Deep nesting, alias flooding, batch query abuse, introspection.
Multipart: Boundary manipulation, filename traversal, content-type mismatch.

Step 8: Output Test Case Sets

Organize all generated inputs into structured JSON test case sets:

{ "target": { "file": "src/api/users.ts", "function": "createUser", "input_field": "email", "expected_type": "string", "downstream_use": ["sql_query", "html_email"] }, "generated_at": "2026-02-14T10:30:00Z", "total_cases": 85, "test_cases": [ { "id": "FUZZ-001", "category": "boundary", "label": "empty_string", "input": "", "expected_behavior": "validation_error", "targets_cwe": "CWE-20" }, { "id": "FUZZ-002", "category": "injection_sql", "label": "union_select", "input": "test@test.com' UNION SELECT * FROM users--", "expected_behavior": "parameterized_query_prevents_injection", "targets_cwe": "CWE-89" } ] }

Write test case files to .appsec/fuzz/ organized by target.

Output Format

Fuzz inputs are not findings themselves but may reference CWEs they target.

Finding ID prefix: FUZZ (e.g., FUZZ-001 ) for test case identification.

metadata.tool : "fuzz"

If fuzz testing reveals an actual vulnerability (input causes unexpected behavior), emit a finding using ../../shared/schemas/findings.md .

Pragmatism Notes

Generate inputs relevant to the actual technology. Do not generate SQL injection payloads for code that never touches a database.
Respect the --depth flag. Quick depth should produce 10-20 inputs. Expert depth can produce hundreds.
Label each input clearly so testers understand what it targets and what behavior to expect.
Mark intentionally dangerous inputs (e.g., billion laughs XML) with a warning about resource consumption.
These are test inputs, not exploit code. Frame output as defensive testing material.
If the code already has strong validation visible in the source, generate inputs that specifically test the validation boundaries.
Include both inputs that should be rejected (malicious) and inputs that should be accepted (edge case valid) to test for false positives in validation.

Safety Notice

Copy this and send it to your AI assistant to learn

Source Transparency

Related Skills

spec-writer

attack-surface

api

file-upload