Binary Analysis Patterns

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "binary-analysis-patterns" with this command: npx skills add oimiragieo/agent-studio/oimiragieo-agent-studio-binary-analysis-patterns

Security Notice

AUTHORIZED USE ONLY: These skills are for DEFENSIVE security analysis and authorized research:

  • Authorized pentesting engagements with written authorization

  • CTF competitions and security research

  • Defensive security and malware analysis

  • Security research with proper disclosure

  • Educational purposes in controlled environments

NEVER use for:

  • Creating or enhancing malicious code

  • Unauthorized access to systems

  • Bypassing software licensing illegitimately

  • Intellectual property theft

  • Any illegal activities

Comprehensive patterns and techniques for analyzing compiled binaries, understanding assembly code, and reconstructing program logic.

Disassembly Fundamentals

x86-64 Instruction Patterns

Function Prologue/Epilogue

    ; Standard prologue
    push rbp              ; Save base pointer
    mov rbp, rsp          ; Set up stack frame
    sub rsp, 0x20         ; Allocate local variables

    ; Leaf function (no calls)
    ; May skip frame pointer setup
    sub rsp, 0x18         ; Just allocate locals

    ; Standard epilogue
    mov rsp, rbp          ; Restore stack pointer
    pop rbp               ; Restore base pointer
    ret

    ; Leave instruction (equivalent)
    leave                 ; mov rsp, rbp; pop rbp
    ret
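When symbols are stripped, the frame-pointer prologue above is a common anchor for locating function starts. The following is a minimal sketch that scans raw bytes for the two usual encodings of `push rbp; mov rbp, rsp`. The helper name `find_prologues` and the sample bytes are hypothetical, and the attribution of each encoding to a compiler family is a rule of thumb, not a guarantee.

```python
# Byte encodings of the two-instruction frame setup:
#   55          push rbp
#   48 89 E5    mov rbp, rsp   (encoding often seen in GCC/Clang output)
#   48 8B EC    mov rbp, rsp   (alternate encoding, often seen in MSVC output)
PROLOGUES = (b"\x55\x48\x89\xe5", b"\x55\x48\x8b\xec")


def find_prologues(code, base=0):
    """Return sorted addresses where a frame-pointer prologue starts."""
    hits = []
    for pattern in PROLOGUES:
        pos = code.find(pattern)
        while pos != -1:
            hits.append(base + pos)
            pos = code.find(pattern, pos + 1)
    return sorted(hits)


# Hypothetical code bytes: nop; prologue; leave; ret; prologue (alternate encoding)
sample = bytes.fromhex("90" "554889e5" "c9c3" "55488bec")
print([hex(a) for a in find_prologues(sample, base=0x401000)])
```

A byte scan like this produces false positives inside data and misses leaf functions that skip the frame pointer, so treat hits as candidates to confirm in the disassembler.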

Calling Conventions

System V AMD64 (Linux, macOS)

    ; Arguments: RDI, RSI, RDX, RCX, R8, R9, then stack
    ; Return: RAX (and RDX for 128-bit)
    ; Caller-saved: RAX, RCX, RDX, RSI, RDI, R8-R11
    ; Callee-saved: RBX, RBP, R12-R15

    ; Example: func(a, b, c, d, e, f, g)
    mov rdi, [a]          ; 1st arg
    mov rsi, [b]          ; 2nd arg
    mov rdx, [c]          ; 3rd arg
    mov rcx, [d]          ; 4th arg
    mov r8, [e]           ; 5th arg
    mov r9, [f]           ; 6th arg
    push qword [g]        ; 7th arg on stack
    call func

Microsoft x64 (Windows)

    ; Arguments: RCX, RDX, R8, R9, then stack
    ; Shadow space: 32 bytes reserved on stack
    ; Return: RAX

    ; Example: func(a, b, c, d, e)
    sub rsp, 0x28         ; Shadow space (0x20) + 5th-arg slot (8); keeps rsp 16-byte aligned
    mov rcx, [a]          ; 1st arg
    mov rdx, [b]          ; 2nd arg
    mov r8, [c]           ; 3rd arg
    mov r9, [d]           ; 4th arg
    mov rax, [e]          ; x86 has no memory-to-memory mov, so go through a register
    mov [rsp+0x20], rax   ; 5th arg on stack, just above the shadow space
    call func
    add rsp, 0x28
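The two x86-64 conventions above differ only in which registers carry the first arguments and in where stack arguments begin. A small lookup sketch makes that comparison mechanical. The function name `arg_location` and the ABI keys `"sysv"` and `"ms64"` are made up for illustration, and the sketch covers integer and pointer arguments only (floating-point arguments travel in XMM registers).

```python
INT_ARG_REGS = {
    "sysv": ["rdi", "rsi", "rdx", "rcx", "r8", "r9"],
    "ms64": ["rcx", "rdx", "r8", "r9"],
}


def arg_location(abi, index):
    """Location of integer/pointer argument `index` (0-based) just before `call`."""
    regs = INT_ARG_REGS[abi]
    if index < len(regs):
        return regs[index]
    slot = index - len(regs)
    # Microsoft x64 stack arguments sit above the 32-byte shadow space.
    offset = slot * 8 + (0x20 if abi == "ms64" else 0)
    return f"[rsp+{offset:#x}]"


print(arg_location("sysv", 3))   # 4th argument under System V
print(arg_location("ms64", 3))   # 4th argument under Microsoft x64
print(arg_location("ms64", 4))   # 5th argument under Microsoft x64
```

Seeing a value written to `[rsp+0x20]` right before a call is therefore a hint that the binary uses the Microsoft convention and that the callee takes at least five arguments.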

ARM Assembly Patterns

ARM64 (AArch64) Calling Convention

    ; Arguments: X0-X7
    ; Return: X0 (and X1 for 128-bit)
    ; Frame pointer: X29
    ; Link register: X30

    ; Function prologue
    stp x29, x30, [sp, #-16]!    ; Save FP and LR
    mov x29, sp                  ; Set frame pointer

    ; Function epilogue
    ldp x29, x30, [sp], #16      ; Restore FP and LR
    ret

ARM32 Calling Convention

    ; Arguments: R0-R3, then stack
    ; Return: R0 (and R1 for 64-bit)
    ; Link register: LR (R14)

    ; Function prologue
    push {fp, lr}
    add fp, sp, #4

    ; Function epilogue
    pop {fp, pc}          ; Return by popping PC

Control Flow Patterns

Conditional Branches

    ; if (a == b)
    cmp eax, ebx
    jne skip_block
    ; ... if body ...
    skip_block:

    ; if (a < b) - signed
    cmp eax, ebx
    jge skip_block        ; Jump if greater or equal
    ; ... if body ...
    skip_block:

    ; if (a < b) - unsigned
    cmp eax, ebx
    jae skip_block        ; Jump if above or equal
    ; ... if body ...
    skip_block:
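In each pattern above the branch skips the body, so the source-level condition is the inverse of the jump condition, and the mnemonic family (`jl`/`jg` versus `jb`/`ja`) reveals signedness. The table below is a sketch of that inversion for `cmp lhs, rhs` followed by a conditional jump. The names `JCC_INVERSE` and `source_condition` are hypothetical.

```python
# Maps the jump that skips the body to the condition that runs the body.
JCC_INVERSE = {
    "je": ("!=", ""), "jz": ("!=", ""),
    "jne": ("==", ""), "jnz": ("==", ""),
    "jl": (">=", "signed"), "jge": ("<", "signed"),
    "jle": (">", "signed"), "jg": ("<=", "signed"),
    "jb": (">=", "unsigned"), "jae": ("<", "unsigned"),
    "jbe": (">", "unsigned"), "ja": ("<=", "unsigned"),
}


def source_condition(jcc, lhs="a", rhs="b"):
    """Recover the C-style condition guarding the skipped block."""
    op, signedness = JCC_INVERSE[jcc.lower()]
    text = f"if ({lhs} {op} {rhs})"
    return f"{text}  // {signedness}" if signedness else text


print(source_condition("jne"))
print(source_condition("jge"))
print(source_condition("jae", "i", "n"))
```

This only holds for the skip-the-body layout shown here. Compilers also emit layouts where the jump leads into the body, in which case the condition is not inverted.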

Loop Patterns

    ; for (int i = 0; i < n; i++)
    xor ecx, ecx          ; i = 0
    loop_start:
    cmp ecx, [n]          ; i < n
    jge loop_end
    ; ... loop body ...
    inc ecx               ; i++
    jmp loop_start
    loop_end:

    ; while (condition)
    jmp loop_check
    loop_body:
    ; ... body ...
    loop_check:
    cmp eax, ebx
    jl loop_body

    ; do-while
    loop_body:
    ; ... body ...
    cmp eax, ebx
    jl loop_body

Switch Statement Patterns

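    ; Jump table pattern
    mov eax, [switch_var]        ; 32-bit load zero-extends into rax
    cmp eax, max_case
    ja default_case              ; Unsigned compare also rejects negative values
    jmp [jump_table + rax*8]     ; 8-byte entries, indexed with the full 64-bit register

    ; Sequential comparison (small switch)
    cmp eax, 1
    je case_1
    cmp eax, 2
    je case_2
    cmp eax, 3
    je case_3
    jmp default_case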
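Once the bounds check (`cmp eax, max_case` / `ja default_case`) gives the number of entries, the jump table itself can be read straight out of the binary image. The sketch below assumes a table of 8-byte little-endian absolute addresses, matching the `rax*8` scale in the pattern. Position-independent binaries often store 32-bit relative offsets instead, which this sketch does not handle. The function name and sample data are hypothetical.

```python
import struct


def read_jump_table(image, table_offset, max_case):
    """Map each case value 0..max_case to its 8-byte absolute target."""
    count = max_case + 1  # `ja` keeps values 0..max_case inclusive
    targets = struct.unpack_from(f"<{count}Q", image, table_offset)
    return dict(enumerate(targets))


# Hypothetical image: 4 bytes of padding followed by a 3-entry table.
image = b"\x00" * 4 + struct.pack("<3Q", 0x401100, 0x401120, 0x401100)
for case, target in read_jump_table(image, 4, 2).items():
    print(case, hex(target))
```

Cases that share a target (0 and 2 in the sample) usually correspond to fall-through or grouped `case` labels in the source.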

Data Structure Patterns

Array Access

    ; array[i] - 4-byte elements
    mov eax, [rbx + rcx*4]    ; rbx=base, rcx=index

    ; array[i] - 8-byte elements
    mov rax, [rbx + rcx*8]

    ; Multi-dimensional array[i][j]
    ; arr[i][j] = base + (i * cols + j) * element_size
    mov eax, [i]              ; eax = i
    imul eax, [cols]          ; eax = i * cols
    add eax, [j]              ; eax = i * cols + j
    mov edx, [rbx + rax*4]    ; 4-byte elements, rbx=base

Structure Access

    struct Example {
        int a;      // offset 0
        char b;     // offset 4
                    // padding at offsets 5-7
        long c;     // offset 8
        short d;    // offset 16
    };

    ; Accessing struct fields
    mov rdi, [struct_ptr]
    mov eax, [rdi]              ; s->a (offset 0)
    movzx eax, byte [rdi+4]     ; s->b (offset 4)
    mov rax, [rdi+8]            ; s->c (offset 8)
    movzx eax, word [rdi+16]    ; s->d (offset 16)
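The offsets in the struct above follow from natural alignment: each field is placed at the next multiple of its own alignment, and the struct is padded at the end to a multiple of its largest alignment. The sketch below recomputes the layout under that assumption (sizes are those of a typical LP64 target, where `long` is 8 bytes). The `layout` helper is hypothetical, and packed or compiler-specific layouts will differ.

```python
def layout(fields):
    """fields: list of (name, size, alignment). Returns ({name: offset}, total_size)."""
    offsets = {}
    offset = 0
    max_align = 1
    for name, size, align in fields:
        offset = (offset + align - 1) // align * align  # pad up to field alignment
        offsets[name] = offset
        offset += size
        max_align = max(max_align, align)
    total = (offset + max_align - 1) // max_align * max_align  # tail padding
    return offsets, total


# struct Example { int a; char b; long c; short d; } on an LP64 target
example = [("a", 4, 4), ("b", 1, 1), ("c", 8, 8), ("d", 2, 2)]
offsets, size = layout(example)
print(offsets, size)
```

Running the calculation in reverse is the practical use: when disassembly shows accesses at offsets 0, 4, 8, and 16 with widths 4, 1, 8, and 2, a candidate struct can be checked against this rule before it is committed to the type database.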

Linked List Traversal

    ; while (node != NULL)
    list_loop:
    test rdi, rdi         ; node == NULL?
    jz list_done
    ; ... process node ...
    mov rdi, [rdi+8]      ; node = node->next (assuming next at offset 8)
    jmp list_loop
    list_done:

Common Code Patterns

String Operations

    ; strlen pattern
    xor ecx, ecx
    strlen_loop:
    cmp byte [rdi + rcx], 0
    je strlen_done
    inc ecx
    jmp strlen_loop
    strlen_done:
    ; ecx contains length

    ; strcpy pattern
    strcpy_loop:
    mov al, [rsi]
    mov [rdi], al
    test al, al
    jz strcpy_done
    inc rsi
    inc rdi
    jmp strcpy_loop
    strcpy_done:

    ; memcpy using rep movsb
    mov rdi, dest
    mov rsi, src
    mov rcx, count
    rep movsb

Arithmetic Patterns

    ; Multiplication by constant
    ; x * 3
    lea eax, [rax + rax*2]

    ; x * 5
    lea eax, [rax + rax*4]

    ; x * 10
    lea eax, [rax + rax*4]    ; x * 5
    add eax, eax              ; * 2

    ; Division by power of 2 (signed)
    mov eax, [x]
    cdq                       ; Sign extend to EDX:EAX
    and edx, 7                ; For divide by 8
    add eax, edx              ; Adjust for negative
    sar eax, 3                ; Arithmetic shift right

    ; Modulo power of 2 (unsigned or known non-negative x)
    and eax, 7                ; x % 8
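The signed-division sequence adds a bias of `2**k - 1` to negative inputs so that the arithmetic shift rounds toward zero, as C division does, instead of toward negative infinity. The emulation below is a sketch of that reasoning for 32-bit values. The helper names are hypothetical. It also shows why a bare `and eax, 7` matches C's `%` only for non-negative inputs.

```python
def to_signed32(x):
    """Wrap an integer to the signed 32-bit range."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def sdiv_pow2(x, shift):
    """Emulate cdq / and / add / sar for signed 32-bit x divided by 2**shift."""
    edx = 0xFFFFFFFF if x < 0 else 0   # cdq: EDX becomes the sign mask
    edx &= (1 << shift) - 1            # and edx, 2**shift - 1
    eax = to_signed32(x + edx)         # add eax, edx
    return eax >> shift                # sar: Python's >> is arithmetic on ints


def c_div(x, d):
    """C-style division, truncating toward zero."""
    q = abs(x) // abs(d)
    return q if (x < 0) == (d < 0) else -q


print(sdiv_pow2(-9, 3), c_div(-9, 8))  # biased shift matches truncating division
print(-9 >> 3)                         # unbiased shift rounds toward -infinity
print(-3 & 7)                          # mask result for a negative input
```

The last line prints 5, whereas C evaluates `-3 % 8` as -3, so an `and` mask in the disassembly suggests the operand was unsigned or proven non-negative by the compiler.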

Bit Manipulation

    ; Test specific bit
    test eax, 0x80        ; Test bit 7
    jnz bit_set

    ; Set bit
    or eax, 0x10          ; Set bit 4

    ; Clear bit
    and eax, ~0x10        ; Clear bit 4

    ; Toggle bit
    xor eax, 0x10         ; Toggle bit 4

    ; Count leading zeros (ecx must be nonzero; bsr leaves eax undefined for zero input)
    bsr eax, ecx          ; Bit scan reverse: index of highest set bit
    xor eax, 31           ; Convert to leading zeros

    ; Population count (popcnt)
    popcnt eax, ecx       ; Count set bits
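The `bsr` / `xor 31` pair works because `bsr` returns the index of the highest set bit, and for an index in the range 0..31 the expression `31 - index` equals `index ^ 31`. The following sketch emulates both idioms on 32-bit values. The function names are hypothetical, and the zero-input case raises an error to mirror the fact that `bsr` leaves its destination undefined there.

```python
def bsr32(x):
    """Index of the highest set bit of a nonzero 32-bit value."""
    x &= 0xFFFFFFFF
    if x == 0:
        raise ValueError("bsr result is undefined for zero input")
    return x.bit_length() - 1


def clz32(x):
    """Leading-zero count via the bsr / xor 31 idiom."""
    return bsr32(x) ^ 31


def popcount32(x):
    """Number of set bits, as computed by popcnt."""
    return bin(x & 0xFFFFFFFF).count("1")


print(clz32(1), clz32(0x00010000), clz32(0x80000000))
print(popcount32(0xF0))
```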

Decompilation Patterns

Variable Recovery

    ; Local variable at rbp-8
    mov qword [rbp-8], rax    ; Store to local
    mov rax, [rbp-8]          ; Load from local

    ; Stack-allocated array
    lea rax, [rbp-0x40]       ; Array starts at rbp-0x40
    mov [rax], edx            ; array[0] = value
    mov [rax+4], ecx          ; array[1] = value

Function Signature Recovery

    ; Identify parameters by register usage
    func:
    ; rdi used as first param (System V)
    mov [rbp-8], rdi      ; Save param to local
    ; rsi used as second param
    mov [rbp-16], rsi
    ; Identify return by RAX at end
    mov rax, [result]
    ret

Type Recovery

    ; 1-byte operations suggest char/bool
    movzx eax, byte [rdi]     ; Zero-extend byte
    movsx eax, byte [rdi]     ; Sign-extend byte

    ; 2-byte operations suggest short
    movzx eax, word [rdi]
    movsx eax, word [rdi]

    ; 4-byte operations suggest int/float
    mov eax, [rdi]
    movss xmm0, [rdi]         ; Float

    ; 8-byte operations suggest long/double/pointer
    mov rax, [rdi]
    movsd xmm0, [rdi]         ; Double
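The choice between `movzx` and `movsx`, or between `mov` and `movss`, matters for type recovery because the same bytes produce different values under each interpretation. The sketch below shows the difference numerically. The helper names are hypothetical.

```python
import struct


def zero_extend(value, bits):
    """movzx: keep the low `bits` bits; upper bits become zero (unsigned view)."""
    return value & ((1 << bits) - 1)


def sign_extend(value, bits):
    """movsx: replicate the top bit of the `bits`-wide value (signed view)."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def as_float32(raw):
    """Reinterpret a 32-bit pattern as an IEEE 754 single, as movss would load it."""
    return struct.unpack("<f", struct.pack("<I", raw))[0]


print(zero_extend(0xFF, 8), sign_extend(0xFF, 8))      # same byte, two readings
print(zero_extend(0x8000, 16), sign_extend(0x8000, 16))
print(0x3F800000, as_float32(0x3F800000))              # same dword as int and float
```

A field that is consistently loaded with `movsx` is a good candidate for a signed type, and one that is only ever loaded into XMM registers is a good candidate for `float` or `double`.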

Ghidra Analysis Tips

Improving Decompilation

    // In Ghidra scripting
    // Fix function signature
    Function func = getFunctionAt(toAddr(0x401000));
    func.setReturnType(IntegerDataType.dataType, SourceType.USER_DEFINED);

    // Create structure type
    StructureDataType struct = new StructureDataType("MyStruct", 0);
    struct.add(IntegerDataType.dataType, "field_a", null);
    struct.add(PointerDataType.dataType, "next", null);

    // Apply to memory
    createData(toAddr(0x601000), struct);

Pattern Matching Scripts

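    # Find all calls to dangerous functions
    # str.format is used instead of an f-string so the script also runs under Ghidra's Jython (Python 2.7)
    DANGEROUS = ["strcpy", "sprintf", "gets"]
    for func in currentProgram.getFunctionManager().getFunctions(True):
        if func.getName() not in DANGEROUS:
            continue
        for ref in getReferencesTo(func.getEntryPoint()):
            if ref.getReferenceType().isCall():  # skip data references to the function
                print("Dangerous call to {} at {}".format(func.getName(), ref.getFromAddress()))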

IDA Pro Patterns

IDAPython Analysis

    import idaapi
    import idautils
    import idc

    # Find all function calls
    def find_calls(func_name):
        for func_ea in idautils.Functions():
            for head in idautils.Heads(func_ea, idc.find_func_end(func_ea)):
                if idc.print_insn_mnem(head) == "call":
                    target = idc.get_operand_value(head, 0)
                    if idc.get_func_name(target) == func_name:
                        print(f"Call to {func_name} at {hex(head)}")

    # Rename functions based on strings
    def auto_rename():
        for s in idautils.Strings():
            for xref in idautils.XrefsTo(s.ea):
                func = idaapi.get_func(xref.frm)
                if func and "sub_" in idc.get_func_name(func.start_ea):
                    # Use string as hint for naming
                    pass

Best Practices

Analysis Workflow

  • Initial triage: File type, architecture, imports/exports

  • String analysis: Identify interesting strings, error messages

  • Function identification: Entry points, exports, cross-references

  • Control flow mapping: Understand program structure

  • Data structure recovery: Identify structs, arrays, globals

  • Algorithm identification: Crypto, hashing, compression

  • Documentation: Comments, renamed symbols, type definitions
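The first two workflow steps, initial triage and string analysis, need no disassembler at all. The sketch below reads the architecture from an ELF header and pulls printable ASCII runs out of a byte blob, similar in spirit to the `file` and `strings` utilities. It assumes ELF input only, the `E_MACHINE` table covers just a handful of values, and the function names and sample blob are hypothetical.

```python
import re
import struct

# Subset of ELF e_machine values; anything else is reported as a hex number.
E_MACHINE = {0x03: "x86", 0x08: "MIPS", 0x28: "ARM", 0x3E: "x86-64", 0xB7: "AArch64"}


def triage_elf(data):
    """Report word size, byte order, and machine type from an ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}[data[4]]        # EI_CLASS
    endian = {1: "<", 2: ">"}[data[5]]    # EI_DATA
    _e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    return {
        "bits": bits,
        "endianness": "little" if endian == "<" else "big",
        "machine": E_MACHINE.get(e_machine, hex(e_machine)),
    }


def ascii_strings(data, min_len=4):
    """Return printable ASCII runs of at least min_len characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, data)]


# Hypothetical blob: a minimal 64-bit little-endian x86-64 header plus two strings.
header = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 2, 0x3E)
blob = header + b"\x00\x01Usage: %s <file>\x00ab\x00error: bad magic\x00"
print(triage_elf(blob))
print(ascii_strings(blob))
```

Recording the triage result first also serves the later steps, since the machine type determines which calling convention applies to every function that follows.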

Common Pitfalls

  • Optimizer artifacts: Code may not match source structure

  • Inline functions: Functions may be expanded inline

  • Tail call optimization: jmp instead of call + ret

  • Dead code: Unreachable code from optimization

  • Position-independent code: RIP-relative addressing
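Two of these pitfalls come down to the same arithmetic: RIP-relative operands and `jmp rel32` tail calls both encode a signed 32-bit displacement measured from the address of the next instruction, not the current one. The sketch below resolves such targets from raw instruction bytes. The function name, addresses, and byte offsets are illustrative assumptions; a real tool would take the displacement position from a disassembler.

```python
import struct


def relative_target(insn_addr, insn_bytes, disp_offset):
    """Resolve a disp32 that is relative to the end of the instruction."""
    (disp,) = struct.unpack_from("<i", insn_bytes, disp_offset)
    return insn_addr + len(insn_bytes) + disp


# lea rax, [rip+0x2004]  ->  48 8D 05 <disp32>, displacement starts at byte 3
lea = bytes.fromhex("488d05") + struct.pack("<i", 0x2004)
print(hex(relative_target(0x401000, lea, 3)))

# jmp rel32 (possible tail call)  ->  E9 <disp32>, displacement starts at byte 1
jmp = b"\xe9" + struct.pack("<i", -0x25)
print(hex(relative_target(0x401020, jmp, 1)))
```

If the resolved `jmp` target is the entry point of another function rather than a label inside the current one, the instruction is likely a tail call and should be treated as a call edge when recovering the call graph.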

Iron Laws

  • ALWAYS perform static analysis before any dynamic execution — disassemble and map control flow first; executing untrusted binaries without prior static analysis is a security risk and destroys reproducible evidence.

  • NEVER trust decompiler output as ground truth — decompilers produce approximations; always cross-reference decompiled output with the raw disassembly for security-critical paths.

  • NEVER assume sequential execution — indirect jumps, virtual dispatch tables, and JIT-compiled code all break linear flow; always check cross-references and jump tables before tracing a code path.

  • ALWAYS document architecture and calling convention at the start — x86, x64, ARM, and MIPS have different calling conventions; misidentifying them causes every parameter and return-value analysis to be wrong.

  • ALWAYS mark assumptions as unverified until confirmed by dynamic analysis — static analysis is incomplete; flag every inferred behavior with [STATIC-ONLY] and validate with dynamic trace evidence when possible.
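The last rule can be enforced mechanically by making the [STATIC-ONLY] tag the default state of every recorded inference. The sketch below is one possible shape for such a record. The `Finding` class, its fields, and the [CONFIRMED] label are hypothetical and not part of any tool named in this document.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    address: int
    claim: str
    evidence: list = field(default_factory=list)  # references to dynamic traces

    @property
    def status(self):
        # A finding stays STATIC-ONLY until at least one trace backs it up.
        return "CONFIRMED" if self.evidence else "STATIC-ONLY"

    def render(self):
        return f"[{self.status}] {self.address:#x}: {self.claim}"


finding = Finding(0x401000, "rdi points to a MyStruct")
print(finding.render())
finding.evidence.append("trace-001: rdi dereferenced at offsets 0 and 8")
print(finding.render())
```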

Anti-Patterns

| Anti-Pattern | Why It Fails | Correct Approach |
| --- | --- | --- |
| Executing binary before static analysis | Destroys forensic state; safety risk | Map entry points and control flow statically first |
| Trusting decompiler output verbatim | Decompilers introduce artifacts and errors | Cross-reference with raw disassembly for critical paths |
| Assuming linear code flow | Misses indirect jumps, vtables, JIT paths | Check all xrefs and jump tables before tracing |
| Skipping architecture documentation | Wrong calling convention invalidates all analysis | Document arch + ABI before starting any analysis |
| Treating static inferences as confirmed | Inferences can be wrong without dynamic validation | Mark as [STATIC-ONLY] until runtime trace confirms |

Memory Protocol (MANDATORY)

Before starting: Read C:\dev\projects\agent-studio.claude\context\memory\learnings.md

After completing:

  • New pattern -> C:\dev\projects\agent-studio.claude\context\memory\learnings.md

  • Issue found -> C:\dev\projects\agent-studio.claude\context\memory\issues.md

  • Decision made -> C:\dev\projects\agent-studio.claude\context\memory\decisions.md

ASSUME INTERRUPTION: If it's not in memory, it didn't happen.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
