Binary Triage (Phase 1)
Purpose
Quick fingerprinting to establish baseline facts before deeper analysis. Runs in seconds, not minutes.
When to Use
- First contact with an unknown binary
- Need architecture/ABI info for tool selection
- Quick capability assessment
- Before committing to expensive analysis
Key Principle
Gather facts fast, defer analysis.
This phase identifies WHAT the binary is, not HOW it works.
Triage Sequence
Step 1: File Identification
Basic identification
file binary
Expected output patterns:
ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3
ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1
Extract:
- Architecture (ARM, ARM64, x86_64, MIPS)
- Bit width (32/64)
- Endianness (LSB/MSB)
- Link type (static/dynamic)
- Interpreter path (libc indicator)
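The extraction above can be sketched with plain shell pattern matching. This is a minimal sketch that assumes the description follows the patterns shown earlier; the function name `parse_file_desc` and the `key=value` output layout are illustrative choices, not part of `file` itself.

```shell
# parse_file_desc: hypothetical helper. Takes one `file` description string
# and prints key=value facts. More specific patterns are checked first.
parse_file_desc() {
    desc=$1

    case $desc in
        *aarch64*)   arch=arm64 ;;
        *x86-64*)    arch=x86_64 ;;
        *MIPS*)      arch=mips ;;
        *", ARM,"*)  arch=arm ;;
        *)           arch=unknown ;;
    esac

    case $desc in
        *"ELF 64-bit"*) bits=64 ;;
        *"ELF 32-bit"*) bits=32 ;;
        *)              bits=unknown ;;
    esac

    case $desc in
        *" LSB "*) endian=little ;;
        *" MSB "*) endian=big ;;
        *)         endian=unknown ;;
    esac

    case $desc in
        *"dynamically linked"*)                        link=dynamic ;;
        *"statically linked"*|*"static-pie linked"*)   link=static ;;
        *)                                             link=unknown ;;
    esac

    # The interpreter path runs from "interpreter " up to the next comma
    # (or to the end of the description if no comma follows).
    case $desc in
        *"interpreter "*) interp=${desc#*interpreter }; interp=${interp%%,*} ;;
        *)                interp= ;;
    esac

    printf 'arch=%s\nbits=%s\nendian=%s\nlink=%s\ninterp=%s\n' \
        "$arch" "$bits" "$endian" "$link" "$interp"
}
```

Typical use would be `parse_file_desc "$(file -b binary)"`. Treat the result as a first guess and cross-check it against the structured metadata in the next step.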
Step 2: Structured Metadata (rabin2)
All metadata as JSON
rabin2 -q -j -I binary | jq .
Key fields:
.arch - "arm", "x86", "mips"
.bits - 32 or 64
.endian - "little" or "big"
.os - "linux", "none"
.machine - "ARM", "AARCH64"
.stripped - true/false
.static - true/false
Step 3: ABI Detection
Interpreter detection
readelf -p .interp binary 2>/dev/null
Or via rabin2
rabin2 -I binary | grep intrp
ARM-specific: float ABI
readelf -A binary | grep "Tag_ABI_VFP_args"
hard-float: "VFP registers"
soft-float: missing or "compatible"
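The float-ABI rule above can be wrapped in a small filter. This is a sketch under the assumption that the attribute line reads `Tag_ABI_VFP_args: VFP registers` for hard-float binaries; `float_abi_from_attrs` is a hypothetical name, and the check is only meaningful for 32-bit ARM.

```shell
# float_abi_from_attrs: hypothetical helper. Reads `readelf -A` output on
# stdin and prints "hard" or "soft" following the rule stated above:
# "VFP registers" means hard-float; a missing tag or any other value
# (such as "compatible") is treated as soft-float.
float_abi_from_attrs() {
    if grep -q 'Tag_ABI_VFP_args: VFP registers'; then
        echo hard
    else
        echo soft
    fi
}
```

Usage would look like `readelf -A binary | float_abi_from_attrs`.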
Interpreter → Libc mapping:
| Interpreter | Libc | Notes |
| --- | --- | --- |
| /lib/ld-linux-armhf.so.3 | glibc | ARM hard-float |
| /lib/ld-linux.so.3 | glibc | ARM soft-float |
| /lib/ld-musl-arm.so.1 | musl | ARM 32-bit |
| /lib/ld-musl-aarch64.so.1 | musl | ARM 64-bit |
| /lib/ld-uClibc.so.0 | uClibc | Embedded |
| /lib64/ld-linux-x86-64.so.2 | glibc | x86_64 |
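The mapping above can be expressed as a filename pattern match. This is a minimal sketch; `libc_from_interp` is a hypothetical helper, and the patterns generalize from the listed paths, so an interpreter outside this table may be reported as `unknown` or misclassified.

```shell
# libc_from_interp: hypothetical helper. Maps an interpreter path to a libc
# family by filename pattern. An empty argument means no interpreter was
# found (for example, a statically linked binary).
libc_from_interp() {
    case $1 in
        '')                  echo none ;;
        */ld-musl-*.so.1)    echo musl ;;
        */ld-uClibc*.so.*)   echo uclibc ;;
        */ld-linux*.so.*)    echo glibc ;;
        *)                   echo unknown ;;
    esac
}
```

For example, `libc_from_interp "$(readelf -p .interp binary 2>/dev/null | grep -o '/[^ ]*')"` would feed it the path printed in the interpreter section, assuming that path contains no spaces.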
Step 4: Dependencies
Library dependencies
rabin2 -q -j -l binary | jq '.libs[]'
Common patterns:
libcurl.so.* → HTTP client
libssl.so.* → TLS/crypto
libpthread.so.* → Threading
libz.so.* → Compression
libsqlite3.so.* → Local database
Step 5: Entry Points & Exports
Entry points
rabin2 -q -j -e binary | jq .
Exports (for shared libraries)
rabin2 -q -j -E binary | jq '.exports[] | {name, vaddr}'
Step 6: Quick String Scan
All strings with metadata
rabin2 -q -j -zz binary | jq '.strings | length' # Count first
Filter interesting strings (URLs, paths, errors)
rabin2 -q -j -zz binary | jq '
  .strings[]
  | select(.length > 8)
  | select(.string | test("http|ftp|/etc|/var|error|fail|pass|key|token"; "i"))
'
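Where jq is not available, the same filter can be approximated on plain text, one string per line. This is a sketch only: `filter_interesting` is a hypothetical name, and feeding it from the binutils `strings` tool is an assumption rather than part of the sequence above.

```shell
# filter_interesting: hypothetical helper. Reads one string per line on
# stdin and keeps lines that match the keyword list (case-insensitive)
# and are longer than 8 characters, mirroring the jq filter above.
filter_interesting() {
    grep -Ei 'http|ftp|/etc|/var|error|fail|pass|key|token' \
        | awk 'length($0) > 8'
}
```

Usage would look like `strings binary | filter_interesting`. The keyword list is deliberately broad, so expect false positives that need a manual pass.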
Step 7: Import Analysis
All imports
rabin2 -q -j -i binary | jq '.imports[] | {name, lib}'
Group by capability
rabin2 -q -j -i binary | jq '
  .imports
  | group_by(.lib)
  | map({lib: .[0].lib, functions: [.[].name]})
'
Capability Mapping
| Import Pattern | Capability |
| --- | --- |
| socket, connect, send | Network client |
| bind, listen, accept | Network server |
| open, read, write | File I/O |
| fork, exec*, system | Process spawning |
| pthread_* | Multi-threading |
| SSL_*, EVP_* | Cryptography |
| dlopen, dlsym | Dynamic loading |
| mmap, mprotect | Memory manipulation |
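The table above can be applied mechanically to a list of import names. This is a minimal sketch: `capabilities_from_imports` and the tag spellings are illustrative, and it simplifies the table by tagging a capability as soon as any single listed import appears, so treat the output as a lead rather than a conclusion.

```shell
# capabilities_from_imports: hypothetical helper. Reads one import name per
# line on stdin and prints each inferred capability tag once, sorted.
# Names that match no pattern are ignored.
capabilities_from_imports() {
    while IFS= read -r name; do
        case $name in
            socket|connect|send)  echo network_client ;;
            bind|listen|accept)   echo network_server ;;
            open|read|write)      echo file_io ;;
            fork|exec*|system)    echo process_spawning ;;
            pthread_*)            echo multi_threading ;;
            SSL_*|EVP_*)          echo cryptography ;;
            dlopen|dlsym)         echo dynamic_loading ;;
            mmap|mprotect)        echo memory_manipulation ;;
        esac
    done | sort -u
}
```

Usage would look like `rabin2 -q -j -i binary | jq -r '.imports[].name' | capabilities_from_imports`.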
Output Format
After triage, record structured facts:
{
  "artifact": {
    "path": "/path/to/binary",
    "sha256": "abc123...",
    "size_bytes": 245760
  },
  "identification": {
    "arch": "arm",
    "bits": 32,
    "endian": "little",
    "os": "linux",
    "stripped": true,
    "static": false
  },
  "abi": {
    "interpreter": "/lib/ld-musl-arm.so.1",
    "libc": "musl",
    "float_abi": "soft"
  },
  "dependencies": [
    "libcurl.so.4",
    "libssl.so.1.1",
    "libz.so.1"
  ],
  "capabilities_inferred": [
    "network_client",
    "tls_encryption",
    "compression"
  ],
  "strings_of_interest": [
    {"value": "https://api.vendor.com/telemetry", "type": "url"},
    {"value": "/etc/config.json", "type": "path"}
  ],
  "complexity_estimate": {
    "functions": "unknown (stripped)",
    "strings": 847,
    "imports": 156
  }
}
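The values for the artifact block can be gathered with standard tools. This is a sketch under the assumption that either `sha256sum` (GNU coreutils) or `shasum` is installed; `artifact_facts` is a hypothetical name, and it prints `key=value` lines rather than JSON so that odd characters in the path need no escaping.

```shell
# artifact_facts: hypothetical helper. Prints the path, SHA-256 digest,
# and size in bytes of the given file as key=value lines.
artifact_facts() {
    path=$1
    if command -v sha256sum >/dev/null 2>&1; then
        hash=$(sha256sum "$path" | cut -d' ' -f1)
    else
        # Fallback for systems that ship shasum instead (e.g. macOS).
        hash=$(shasum -a 256 "$path" | cut -d' ' -f1)
    fi
    # Some wc implementations pad the count with spaces; strip them.
    size=$(wc -c < "$path" | tr -d ' ')
    printf 'path=%s\nsha256=%s\nsize_bytes=%s\n' "$path" "$hash" "$size"
}
```

Recording the hash before any other step ties every later finding to one exact file.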
Knowledge Journaling
After triage completes, record findings for episodic memory:
[BINARY-RE:triage] {filename} (sha256: {hash})
Identification:
- Architecture: {arch} {bits}-bit {endian}
- Libc: {glibc|musl|uclibc} ({interpreter_path})
- Stripped: {yes|no}
- Size: {bytes}
FACT: Links against {library} (source: rabin2 -l)
FACT: Contains {N} strings of interest (source: rabin2 -zz)
FACT: Imports {function} from {library} (source: rabin2 -i)
Capabilities inferred:
- {capability_1} (evidence: {import/string})
- {capability_2} (evidence: {import/string})
HYPOTHESIS: {what binary likely does} (confidence: {0.0-1.0})
QUESTION: {open unknown that needs investigation}
Next phase: {static-analysis|dynamic-analysis}
Sysroot needed: {path or "extract from device"}
Example Journal Entry
[BINARY-RE:triage] thermostat_daemon (sha256: a1b2c3d4...)
Identification:
- Architecture: ARM 32-bit LE
- Libc: musl (/lib/ld-musl-arm.so.1)
- Stripped: yes
- Size: 153,600 bytes
FACT: Links against libcurl.so.4 (source: rabin2 -l)
FACT: Links against libssl.so.1.1 (source: rabin2 -l)
FACT: Contains string "api.thermco.com" (source: rabin2 -zz)
FACT: Imports curl_easy_perform (source: rabin2 -i)
Capabilities inferred:
- HTTP client (evidence: libcurl import)
- TLS encryption (evidence: libssl import)
- Network communication (evidence: URL string)
HYPOTHESIS: Telemetry client that reports to api.thermco.com (confidence: 0.6)
QUESTION: What data does it collect and transmit?
Next phase: static-analysis
Sysroot needed: musl ARM (extract from device or Alpine)
Decision Points
After triage, determine:
- Sysroot selection - Based on arch + libc
- Analysis tool chain - r2 vs Ghidra vs both
- Dynamic analysis feasibility - QEMU viability based on arch
- Initial hypotheses - What does this binary likely do?
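The QEMU feasibility check can start from a lookup of the user-mode emulator for the triaged architecture. This is a minimal sketch: `qemu_user_for` is a hypothetical helper, it expects the rabin2-style values recorded earlier (`arm`/`x86`/`mips`, bit width, `little`/`big`), and it covers only a handful of common targets.

```shell
# qemu_user_for: hypothetical helper. Maps arch, bits, and endianness to the
# name of the matching QEMU user-mode emulator binary. Prints "unsupported"
# and returns non-zero for combinations outside this short list.
qemu_user_for() {
    arch=$1 bits=$2 endian=$3
    case "$arch/$bits/$endian" in
        arm/32/little)   echo qemu-arm ;;
        arm/32/big)      echo qemu-armeb ;;
        arm/64/little)   echo qemu-aarch64 ;;
        mips/32/big)     echo qemu-mips ;;
        mips/32/little)  echo qemu-mipsel ;;
        x86/32/little)   echo qemu-i386 ;;
        x86/64/little)   echo qemu-x86_64 ;;
        *)               echo unsupported; return 1 ;;
    esac
}
```

A dynamically linked target would then typically be run as `"$(qemu_user_for arm 32 little)" -L /path/to/sysroot ./binary`, where the sysroot matches the libc identified during triage.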
Next Steps
→ Proceed to binary-re-static-analysis for function enumeration
→ Or binary-re-dynamic-analysis if behavior observation is priority