| name | ctf-reverse |
| description | Reverse engineering: ELF/PE/Mach-O, WASM, .NET, APK (Flutter/Dart), Python bytecode, Go/Rust/Swift/Kotlin, custom VMs, anti-debug/anti-VM, VMProtect/Themida, eBPF, Ghidra/IDA/radare2/Frida/angr/Qiling. Dispatch on file magic + loader signature. |
CTF Reverse Engineering
Quick reference for RE challenges. For detailed techniques, see supporting files.
Additional Resources
- tools.md — GDB, Ghidra, radare2, IDA, Binary Ninja, Unicorn, WASM, pyc, packed
- tools-dynamic.md — Frida, angr, lldb, x64dbg, Qiling, Triton, Pin instruction-counting
- tools-advanced.md — VMProtect/Themida, BinDiff, D-810/GOOMBA, TTF GSUB, AVX2 Z3 lift
- tools-advanced-2.md — 2025-26: GB-scale PE Unicorn+angr hybrid (VirtualProtect-gated unpackers)
- anti-analysis.md — Linux/Windows anti-debug, anti-VM, anti-DBI, MBA, self-hashing
- patterns.md — custom VMs, nanomites, LLVM obfuscation, S-box, SECCOMP/BPF, multi-thread
- patterns-ctf.md — comp patterns part 1: hidden opcodes, LD_PRELOAD, GBA MITM, maze kmod
- patterns-ctf-2.md — part 2: multi-layer brute, CVP integer, decision-tree, perf oracle, VM misident
- patterns-ctf-3.md — 2025-26: genetic algorithm / hill-climb over opaque additive scoring
- languages.md — Python bytecode, pyarmor, UEFI, esolangs, HarmonyOS, Godot, Electron
- languages-compiled.md — Go (GoReSym), Rust, Swift, Kotlin/JVM, C++ vtables, .pyc forgery
- platforms.md — Mach-O, iOS jailbreak, embedded firmware, kernel drivers, game engines, CAN
Pattern Recognition Index
Dispatch on observable binary features, not challenge titles.
| Signal (from file, readelf, strings, nm) | Technique → file |
|---|---|
| ELF with __libc_start_main, small main, direct syscalls | Basic RE patterns → patterns.md |
| ELF with large unrecognised opcode-dispatch loop (switch on byte → handler) | Custom VM reversing → patterns.md |
| readelf -l shows RWX segment + self-writes to .text | Self-modifying / multi-layer decryption → patterns-ctf-2.md |
| Binary that modifies its round constants and re-encrypts output | Binary-as-keystream-oracle (patch I/O boundary) → patterns-ctf-2.md |
| ptrace(PTRACE_TRACEME) / /proc/self/status TracerPid / rdtsc timing | Anti-debug detection → anti-analysis.md |
| __Py_* or PyMarshal strings | Python bytecode / pyc reversing → languages.md |
| runtime. prefix in strings, go.buildinfo | Go reversing (GoReSym) → languages-compiled.md |
| Rust demangling (_ZN/_RN), core::panicking | Rust reversing → languages-compiled.md |
| Mach-O header FEEDFACE/FEEDFACF | macOS/iOS RE → platforms.md |
| .wasm magic (00 61 73 6D) | WASM → languages.md, ctf-misc/games-and-vms.md |
| .apk/classes.dex, libflutter.so, kernel.dill | APK / Flutter reversing → languages.md |
| Unicorn/QEMU used as a sandbox with host-side memory read helpers | Host/guest hook divergence → patterns-ctf-2.md (and ctf-pwn/advanced-exploits-2.md) |
| .rodata blob + XOR loop with known constants / stored expected bytes | Stack-string deobfuscation → patterns-ctf-2.md |
| SHA-NI instructions, per-layer key read from stdin | Multi-layer brute-force JIT → patterns-ctf-2.md |
| Per-char early-exit compare loop + local execution allowed | perf_event_open instruction-count oracle → patterns-ctf-2.md |
| Custom VM whose handlers are pop/push but docs claim "register-based" + banned bytes | Arch misidentification + banned-byte synthesis → patterns-ctf-2.md |
| .pyc with loader that checks only first 16 bytes | PEP-552 magic-header forgery → languages-compiled.md |
| Go binary with runtime.itab symbols intact but stripped strings | GoReSym/typelinks restore → languages-compiled.md |
| bpftool prog list shows non-standard eBPF prog | eBPF FSM syscall-sequence decomp → languages-compiled.md (+ ctf-pwn/sandbox-escape.md) |
| TTF/OTF with abnormally dense GSUB; glyphs named hex_*/one/zero | GSUB ligature stego DAG reverse → tools-advanced.md |
| AVX2 vpaddb/vpshufb in tight loop over input | Lane-wise Z3 lifting → tools-advanced.md |
| PE ≥ 500 MB, multiple VirtualProtect(...,RWX) + inline decrypt + call/jmp rax after each | Unicorn layer-graph + per-layer angr solve → tools-advanced-2.md |
| Flat chain of hundreds of if (input[i] op const) score += kN; win if score >= THR | Separability probe → hill-climb → GA → patterns-ctf-3.md |
Recognize the artefact or opcode pattern. The title is noise.
For inline code/cheatsheet quick references (grep patterns, one-liners, common payloads), see quickref.md. The Pattern Recognition Index above is the dispatch table — always consult it first; load quickref.md only if you need a concrete snippet after dispatch.
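As a rough illustration of dispatching on file magic rather than on challenge titles, the sketch below classifies a file from its first bytes. The magic values are the standard ones (including the Mach-O and WASM values listed in the index); the function name `identify` and the returned labels are assumptions made for this example, not part of any tool.

```python
import struct

# Standard magic prefixes; the labels on the right are this sketch's own names.
MAGICS = [
    (b"\x7fELF", "elf"),
    (b"MZ", "pe"),
    (b"\x00asm", "wasm"),
    (b"dex\n", "dex"),
    (b"PK\x03\x04", "zip-or-apk"),
]
MACHO_MAGICS = {0xFEEDFACE: "macho32", 0xFEEDFACF: "macho64"}


def identify(header: bytes) -> str:
    """Classify a file from its first bytes; returns 'unknown' if nothing matches."""
    for magic, label in MAGICS:
        if header.startswith(magic):
            return label
    if len(header) >= 4:
        # Mach-O magic is stored in the target's byte order, so try both.
        for fmt in ("<I", ">I"):
            (value,) = struct.unpack(fmt, header[:4])
            if value in MACHO_MAGICS:
                return MACHO_MAGICS[value]
    return "unknown"
```

Typical use would be `identify(open(path, "rb").read(16))`, followed by the string and symbol checks from the index to narrow down the language or packer.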
CTF Reverse - Anti-Analysis Techniques & Bypasses
Comprehensive reference for anti-debugging, anti-VM, anti-DBI, and integrity-check techniques encountered in CTF challenges, with practical bypasses.
Linux Anti-Debug (Advanced)
ptrace-Based
Self-ptrace (most common):
if (ptrace(PTRACE_TRACEME, 0, 0, 0) == -1) exit(1);
Bypasses:
LD_PRELOAD=./hook.so ./binary
python3 -c "
from pwn import *
elf = ELF('./binary', checksec=False)
elf.asm(elf.symbols.ptrace, 'xor eax, eax; ret')
elf.save('patched')
"
gdb ./binary
(gdb) catch syscall ptrace
(gdb) run
(gdb) set $rax = 0
(gdb) continue
echo 0 > /proc/sys/kernel/yama/ptrace_scope
Double-ptrace pattern:
pid_t child = fork();
if (child == 0) {
ptrace(PTRACE_ATTACH, getppid(), 0, 0);
} else {
}
Bypass: Kill the watchdog child process, then attach debugger.
/proc Filesystem Checks
FILE *f = fopen("/proc/self/status", "r");
readlink("/proc/self/exe", buf, sizeof(buf));
grep("frida", "/proc/self/maps");
Bypasses:
unshare -m bash -c 'mount --bind /dev/null /proc/self/status && ./binary'
(gdb) b fopen
(gdb) run
(gdb) set {char[20]} $rdi = "/dev/null"
(gdb) continue
Timing-Based Detection
uint64_t start = __rdtsc();
uint64_t delta = __rdtsc() - start;
if (delta > THRESHOLD) exit(1);
struct timespec ts1, ts2;
clock_gettime(CLOCK_MONOTONIC, &ts1);
clock_gettime(CLOCK_MONOTONIC, &ts2);
struct timeval tv1, tv2;
gettimeofday(&tv1, NULL);
Bypasses:
(gdb) set {unsigned char[2]} 0x401234 = {0x90, 0x90}
LD_PRELOAD=/usr/lib/faketime/libfaketime.so.1 FAKETIME="2024-01-01" ./binary
Signal-Based Anti-Debug
signal(SIGTRAP, handler);
__asm__("int3");
signal(SIGALRM, kill_handler);
alarm(5);
signal(SIGSEGV, real_logic_handler);
*(int*)0 = 0;
Bypasses:
(gdb) handle SIGTRAP nostop pass
(gdb) handle SIGALRM ignore
(gdb) handle SIGSEGV nostop pass
Syscall-Level Evasion
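long ret;
register long r10 __asm__("r10") = 0; /* the 4th syscall argument must be in r10 */
asm volatile("syscall" : "=a"(ret) : "a"(101), "D"(0), "S"(0), "d"(0), "r"(r10) : "rcx", "r11", "memory");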
Bypass: Must patch the binary itself or use ptrace to intercept at syscall level.
(gdb) catch syscall 101
(gdb) commands
> set $rax = 0
> continue
> end
Windows Anti-Debug (Advanced)
PEB-Based Checks
bool debugged = NtCurrentPeb()->BeingDebugged;
DWORD flags = *(DWORD*)((BYTE*)NtCurrentPeb() + 0xBC);
if (flags & 0x70) exit(1);
Bypass (x64dbg):
# ScyllaHide plugin auto-patches PEB fields
# Manual: dump PEB, zero BeingDebugged and NtGlobalFlag
NtQueryInformationProcess
DWORD_PTR debugPort = 0;
NtQueryInformationProcess(GetCurrentProcess(), 7, &debugPort, sizeof(debugPort), NULL);
if (debugPort != 0) exit(1);
HANDLE debugObj = NULL;
NTSTATUS status = NtQueryInformationProcess(GetCurrentProcess(), 0x1E, &debugObj, sizeof(debugObj), NULL);
if (status == 0) exit(1);
DWORD noDebug = 0;
NtQueryInformationProcess(GetCurrentProcess(), 0x1F, &noDebug, sizeof(noDebug), NULL);
if (noDebug == 0) exit(1);
Bypass: Hook NtQueryInformationProcess to return fake values, or use ScyllaHide.
Heap Flags
PHEAP heap = (PHEAP)GetProcessHeap();
if (heap->Flags != 0x2 || heap->ForceFlags != 0) exit(1);
TLS Callbacks
Key technique: TLS (Thread Local Storage) callbacks execute BEFORE main() / entry point.
void NTAPI TlsCallback(PVOID DllHandle, DWORD Reason, PVOID Reserved) {
if (Reason == DLL_PROCESS_ATTACH) {
if (IsDebuggerPresent()) {
ExitProcess(1);
}
}
}
#pragma comment(linker, "/INCLUDE:_tls_used")
#pragma data_seg(".CRT$XLB")
PIMAGE_TLS_CALLBACK callbacks[] = { TlsCallback, NULL };
Detection in IDA/Ghidra: Check PE TLS Directory → AddressOfCallBacks. Functions listed there run before EP.
Bypass: Set breakpoint on TLS callback in x64dbg (Options → Events → TLS Callbacks), or patch the TLS directory entry.
Hardware Breakpoint Detection
CONTEXT ctx;
ctx.ContextFlags = CONTEXT_DEBUG_REGISTERS;
GetThreadContext(GetCurrentThread(), &ctx);
if (ctx.Dr0 || ctx.Dr1 || ctx.Dr2 || ctx.Dr3) exit(1);
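Bypass: Use software breakpoints instead of hardware ones, or hook GetThreadContext so that it returns zeroed Dr0-Dr3.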
Software Breakpoint Detection (INT3 Scanning)
unsigned char *code = (unsigned char*)function_addr;
uint32_t checksum = 0;
for (int i = 0; i < code_size; i++) {
checksum += code[i];
if (code[i] == 0xCC) exit(1);
}
if (checksum != EXPECTED_CHECKSUM) exit(1);
Bypass: Use hardware breakpoints (DR0-DR3) instead of software breakpoints. Or hook the scanning function.
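To see why a software breakpoint trips both checks while a hardware breakpoint does not, the following sketch models the scan loop above in Python. The byte string is a hypothetical function body chosen for illustration, and `scan` is an assumed name, not code from any challenge.

```python
INT3 = 0xCC


def additive_checksum(code: bytes) -> int:
    """32-bit additive checksum, mirroring the C loop above."""
    return sum(code) & 0xFFFFFFFF


def scan(code: bytes, expected: int) -> str:
    """Report what the integrity check would conclude about this code region."""
    if INT3 in code:
        return "int3-found"
    if additive_checksum(code) != expected:
        return "checksum-mismatch"
    return "clean"


# Hypothetical function body: push rbp; mov rbp, rsp; xor eax, eax; pop rbp; ret
original = bytes.fromhex("554889e531c05dc3")
expected = additive_checksum(original)

# A debugger's software breakpoint overwrites the first byte with 0xCC.
with_breakpoint = bytes([INT3]) + original[1:]

# A NOP patch avoids 0xCC but still changes the checksum.
nop_patched = original[:4] + b"\x90\x90" + original[6:]
```

A hardware breakpoint leaves `original` untouched, so `scan` keeps returning "clean". Note that a naive 0xCC scan can also fire on legitimate operand bytes, which is one way to recognise it in a challenge.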
Exception-Based Anti-Debug
SetUnhandledExceptionFilter(handler);
RaiseException(EXCEPTION_ACCESS_VIOLATION, 0, 0, NULL);
__asm { int 2dh }
NtSetInformationThread (Thread Hiding)
typedef NTSTATUS(NTAPI *pNtSIT)(HANDLE, ULONG, PVOID, ULONG);
pNtSIT NtSIT = (pNtSIT)GetProcAddress(GetModuleHandle("ntdll"), "NtSetInformationThread");
NtSIT(GetCurrentThread(), 0x11 /* ThreadHideFromDebugger */, NULL, 0);
Bypass: Hook NtSetInformationThread to ignore class 0x11, or patch the call.
Anti-VM / Anti-Sandbox
CPUID Hypervisor Bit
int regs[4];
__cpuid(regs, 1);
if (regs[2] & (1 << 31)) {
exit(1);
}
__cpuid(regs, 0x40000000);
char brand[13] = {0};
memcpy(brand, &regs[1], 12);
Bypass: Patch cpuid results or use LD_PRELOAD to hook wrapper functions.
MAC Address / Hardware Fingerprinting
Known VM MAC prefixes:
VMware: 00:0C:29, 00:50:56
VirtualBox: 08:00:27
Hyper-V: 00:15:5D
Parallels: 00:1C:42
QEMU: 52:54:00
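Before running a sample, it can help to predict whether the analysis VM's own MAC address would match one of these prefixes. The lookup below is a minimal sketch using only the prefixes listed above; `vm_vendor` is a hypothetical helper name.

```python
# OUI prefixes copied from the list above.
VM_OUIS = {
    "00:0C:29": "VMware",
    "00:50:56": "VMware",
    "08:00:27": "VirtualBox",
    "00:15:5D": "Hyper-V",
    "00:1C:42": "Parallels",
    "52:54:00": "QEMU",
}


def vm_vendor(mac: str):
    """Return the hypervisor vendor for a MAC address, or None if no prefix matches."""
    normalized = mac.strip().upper().replace("-", ":")
    return VM_OUIS.get(normalized[:8])
```

If the function returns a vendor for the VM's adapter, changing the adapter's MAC in the hypervisor settings is usually simpler than patching the check.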
Timing-Based VM Detection
uint64_t start = __rdtsc();
__cpuid(regs, 0);
uint64_t delta = __rdtsc() - start;
if (delta > 500) { }
File / Registry Artifacts
Files: C:\Windows\System32\drivers\vm*.sys, vbox*.dll, VBoxService.exe
Registry: HKLM\SOFTWARE\VMware, Inc.\VMware Tools
Services: VMTools, VBoxService
Processes: vmtoolsd.exe, VBoxTray.exe, qemu-ga.exe
Linux: /sys/class/dmi/id/product_name contains "VirtualBox"|"VMware"
dmesg | grep -i "hypervisor detected"
Resource Checks (CPU Count, RAM, Disk)
SYSTEM_INFO si;
GetSystemInfo(&si);
if (si.dwNumberOfProcessors < 2) exit(1);
MEMORYSTATUSEX ms;
ms.dwLength = sizeof(ms);
GlobalMemoryStatusEx(&ms);
if (ms.ullTotalPhys < 2ULL * 1024 * 1024 * 1024) exit(1);
GetDiskFreeSpaceEx("C:\\", NULL, &total, NULL);
Bypass: Use a VM configured with adequate resources (4+ CPUs, 8GB+ RAM, 100GB+ disk).
Anti-DBI (Dynamic Binary Instrumentation)
Frida Detection
FILE *f = fopen("/proc/self/maps", "r");
while (fgets(line, sizeof(line), f)) {
if (strstr(line, "frida") || strstr(line, "gadget")) exit(1);
}
int sock = socket(AF_INET, SOCK_STREAM, 0);
struct sockaddr_in addr = {.sin_family=AF_INET, .sin_port=htons(27042), .sin_addr.s_addr=inet_addr("127.0.0.1")};
if (connect(sock, (struct sockaddr*)&addr, sizeof(addr)) == 0) exit(1);
unsigned char *strcmp_bytes = (unsigned char *)strcmp;
if (strcmp_bytes[0] == 0xE9 || strcmp_bytes[0] == 0xFF) exit(1);
DIR *dir = opendir("/proc/self/task");
while ((entry = readdir(dir))) {
char comm_path[256];
snprintf(comm_path, sizeof(comm_path), "/proc/self/task/%s/comm", entry->d_name);
}
Frida bypass of Frida detection:
Interceptor.attach(Module.findExportByName(null, "strstr"), {
onEnter(args) {
this.haystack = Memory.readUtf8String(args[0]);
this.needle = Memory.readUtf8String(args[1]);
},
onLeave(retval) {
if (this.needle && (this.needle.includes("frida") || this.needle.includes("gadget"))) {
retval.replace(ptr(0));
}
}
});
Pin/DynamoRIO Detection
Code Integrity / Self-Hashing
uint32_t crc = compute_crc32(text_start, text_size);
if (crc != EXPECTED_CRC) exit(1);
unsigned char hash[32];
SHA256(function_addr, function_size, hash);
if (memcmp(hash, expected_hash, 32) != 0) exit(1);
Bypasses:
- Hardware breakpoints (don't modify code, DR0-DR3)
- Patch the comparison to always succeed
- Hook the hash function to return expected value
- Emulate instead of debug (Unicorn/Qiling — no code modification)
- Snapshot + restore: dump memory before and after, diff to find checks
Self-checksumming in loops:
void *watchdog(void *arg) {
while (1) {
if (compute_crc32(text_start, text_end - text_start) != saved_crc) {
memset(flag_buffer, 0, flag_len);
exit(1);
}
usleep(100000);
}
}
Bypass: Kill the watchdog thread or patch its sleep to infinite.
Anti-Disassembly Techniques
Opaque Predicates
; Condition that always evaluates the same way but looks data-dependent
mov eax, [some_memory]
lea ecx, [eax + 1] ; x + 1
imul eax, ecx ; x * (x + 1)
and eax, 1 ; product of consecutive integers is always even, so this is 0
jnz fake_branch ; Never taken, but disassembler doesn't know
; real code here
Identification: Z3/SMT can prove branch is always/never taken.
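When an SMT solver is not at hand, an exhaustive check over a reduced bit width is a cheap first test of whether a predicate is opaque. The sketch below is only that: a plausibility check at 16 bits, with `is_opaque` and the two sample predicates chosen for illustration. It is not a proof at the full register width.

```python
def is_opaque(predicate, bits: int = 16) -> bool:
    """True if predicate(x) returns the same value for every x of the given width."""
    first = predicate(0)
    return all(predicate(x) == first for x in range(1, 1 << bits))


def consecutive_product(x: int) -> int:
    # x * (x + 1) is a product of consecutive integers, so bit 0 is always clear.
    return (x * (x + 1)) & 1


def square(x: int) -> int:
    # x * x has the same parity as x, so this predicate really depends on the data.
    return (x * x) & 1
```

Here `is_opaque(consecutive_product)` holds and `is_opaque(square)` does not, which is the kind of distinction to confirm with Z3 before deleting a branch.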
Junk Bytes / Overlapping Instructions
jmp real_code
db 0xE8 ; Looks like start of CALL to linear disassembler
real_code:
mov eax, 1 ; Real code — disassembler may misalign here
Fix: Use recursive-descent disassembly, which follows control flow instead of decoding bytes linearly (Ghidra/IDA handle this well). Manual: undefine and re-analyze from correct offset.
Jump-in-the-Middle
; Jumps into the middle of a multi-byte instruction
eb 01 ; jmp +1 (skip next byte)
e8 ; fake CALL opcode — disassembler tries to decode as call
90 ; real: NOP (landed here from jmp)
Function Chunking / Scattered Code
Functions split into non-contiguous chunks connected by unconditional jumps. Defeats linear function boundary detection.
Tool: IDA's "Append function tail" or Ghidra's "Create function" at each chunk.
Control Flow Flattening (Advanced)
Beyond basic switch-case (see patterns.md): modern OLLVM variants use:
- Bogus control flow: Fake branches with opaque predicates
- Instruction substitution:
a + b → a - (-b), a ^ b → (a | b) & ~(a & b)
- String encryption: Strings decrypted at runtime, cleared after use
Deobfuscation tools:
- D-810 (IDA plugin): Pattern-based deobfuscation, MBA simplification
- gooMBA (IDA plugin by Hex-Rays): MBA simplification inside the decompiler
- Miasm: Symbolic execution for deobfuscation
- Arybo / SiMBA: MBA expression simplification
Mixed Boolean-Arithmetic (MBA) Identification & Simplification
from simba import simplify_mba
expr = "(a | b) + (a & b) - (~a & b)"
print(simplify_mba(expr))
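Whatever a simplifier reports, a candidate simplification can be sanity-checked by brute force on 8-bit words. The sketch below uses only the standard library; `equivalent` is an assumed helper name, and agreement at 8 bits is a quick plausibility check rather than a proof for wider words.

```python
from itertools import product

MASK = 0xFF  # 8-bit words keep the exhaustive check at 65,536 cases


def equivalent(f, g) -> bool:
    """Compare two two-variable expressions on every pair of 8-bit inputs."""
    return all((f(a, b) & MASK) == (g(a, b) & MASK)
               for a, b in product(range(256), repeat=2))


# Instruction substitutions quoted in the OLLVM list above.
add_sub = equivalent(lambda a, b: a - (-b), lambda a, b: a + b)
xor_sub = equivalent(lambda a, b: (a | b) & ~(a & b), lambda a, b: a ^ b)

# The sample expression above reduces to a + (a & b).
sample = equivalent(lambda a, b: (a | b) + (a & b) - (~a & b),
                    lambda a, b: a + (a & b))
```

The reduction of the sample follows from two identities: (a | b) + (a & b) = a + b, and (~a & b) = b - (a & b).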
Comprehensive Bypass Strategies
Universal Bypass Checklist
- Identify all anti-analysis checks: search for ptrace, IsDebuggerPresent, rdtsc, cpuid, NtQuery, GetTickCount, CheckRemoteDebuggerPresent, /proc/self, SIGTRAP, alarm
- Static patching — NOP/patch checks with pwntools or Ghidra before running
- LD_PRELOAD (Linux) — hook libc functions returning fake values
- ScyllaHide (Windows x64dbg) — patches PEB, hooks NT functions automatically
- Emulation (Unicorn/Qiling) — no debugger artifacts to detect
- Kernel-level bypass: modify /proc/sys/kernel/yama/ptrace_scope, use prctl
Layered Anti-Debug (Real-World Pattern)
Many CTF challenges stack multiple checks:
1. TLS callback → IsDebuggerPresent (before main)
2. main() → ptrace(TRACEME)
3. Watchdog thread → timing check + /proc scan
4. Code section → self-CRC32 integrity
5. Signal handler → real logic in SIGSEGV handler
Approach: Identify ALL checks before patching. Patch or hook each one systematically. Run under emulator if too many to patch individually.
Quick Reference: Check to Bypass
| Anti-Debug Check | Platform | Bypass |
|---|---|---|
| ptrace(TRACEME) | Linux | LD_PRELOAD, patch to ret 0, catch syscall |
| IsDebuggerPresent | Windows | ScyllaHide, Frida hook, PEB patch |
| NtQueryInformationProcess | Windows | ScyllaHide, hook ntdll |
| rdtsc timing | Both | NOP rdtsc, Frida time hook, Pin |
| /proc/self/status | Linux | Mount namespace, hook fopen |
| alarm(N) | Linux | handle SIGALRM ignore in GDB |
| SIGTRAP handler | Linux | handle SIGTRAP nostop pass |
| TLS callback | Windows | Break on TLS in x64dbg, patch |
| DR register scan | Windows | Use software BPs, hook GetThreadContext |
| INT3 scan / CRC | Both | Hardware BPs, patch CRC comparison |
| Frida detection | Both | Early-load gadget, hook strstr |
| CPUID hypervisor | Both | Patch CPUID result, bare metal |
| Thread hiding | Windows | Hook NtSetInformationThread |
CTF Reverse - Compiled Language Reversing (Go, Rust)
Go Binary Reversing
Go binaries are increasingly common in CTF challenges due to Go's popularity for CLI tools, network services, and malware.
Recognition
file binary | grep -i "go"
strings binary | grep "go.buildid"
strings binary | grep "runtime.gopanic"
strings binary | grep "^go1\."
Key indicators:
- Very large static binary (even "hello world" is ~2MB)
- Embedded go.buildid string
- runtime.* symbols (even in stripped binaries, some remain)
- main.main as entry point (not main)
- Strings like GOROOT, GOPATH, /usr/local/go/src/
Symbol Recovery
Go embeds rich type and function information even in stripped binaries:
./GoReSym -d binary > symbols.json
python3 -c "
import json
with open('symbols.json') as f:
    data = json.load(f)
for fn in data.get('UserFunctions', []):
    print(f\"{fn['Start']:#x} {fn['FullName']}\")
"
Ghidra with golang-loader:
redress (Go binary analysis):
redress -src binary
redress -pkg binary
redress -type binary
redress -interface binary
Go Memory Layout
Understanding Go's data structures in decompilation:
# String: {pointer, length} (16 bytes on 64-bit)
# NOT null-terminated! Length field is critical.
struct GoString {
char *ptr;
int64 len;
};
# Slice: {pointer, length, capacity} (24 bytes on 64-bit)
struct GoSlice {
void *ptr;
int64 len;
int64 cap;
};
# Interface: {type_descriptor, data_pointer} (16 bytes)
struct GoInterface {
void *type;
void *data;
};
# Map: pointer to runtime.hmap struct
# Channel: pointer to runtime.hchan struct
In Ghidra/IDA: When you see a function taking (ptr, int64) — it's likely a Go string. Three-field (ptr, int64, int64) is a slice.
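Because the length lives in the header rather than in a terminator, extracting a Go string from a memory dump means following the {ptr, len} pair. The sketch below assumes a 64-bit little-endian target; `read_go_string`, `BASE` and the synthetic `dump` are hypothetical names and data for illustration.

```python
import struct


def read_go_string(memory: bytes, base: int, header_addr: int) -> bytes:
    """Follow a 64-bit little-endian Go string header {ptr, len} inside a memory dump.

    `memory` is a dump that starts at virtual address `base`.
    """
    offset = header_addr - base
    ptr, length = struct.unpack_from("<Qq", memory, offset)
    start = ptr - base
    if length < 0 or start < 0 or start + length > len(memory):
        raise ValueError("header does not point inside the dump")
    return memory[start:start + length]


BASE = 0x400000  # hypothetical load address of the dump
# Header at offset 0, string bytes at offset 0x20, packed with no terminator.
dump = struct.pack("<Qq", BASE + 0x20, 8).ljust(0x20, b"\x00") + b"flag{go}runtime.main"
```

In the synthetic dump the flag is immediately followed by another string, which is why a NUL-terminated read would return too much while the header-driven read returns exactly 8 bytes.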
Goroutine and Concurrency Analysis
strings binary | grep "runtime.newproc"
gdb ./binary
(gdb) source /usr/local/go/src/runtime/runtime-gdb.py
(gdb) info goroutines
(gdb) goroutine 1 bt
Channel operations in disassembly:
runtime.chansend1 → ch <- value
runtime.chanrecv1 → value = <-ch
runtime.selectgo → select { case ... }
runtime.closechan → close(ch)
Common Go Patterns in Decompilation
Defer mechanism:
- runtime.deferproc → registers deferred function
- runtime.deferreturn → executes deferred functions at function exit
- Deferred calls execute in LIFO order, which is relevant for cleanup/crypto key wiping
Error handling (the if err != nil pattern):
# In disassembly, this appears as:
# call some_function → returns (result, error) as two values
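# test rbx, rbx → check if error (second return value) is nil; rbx under the Go 1.17+ register ABI when the first result fits in rax, a stack slot on older toolchains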
# jne error_handler
String concatenation:
- runtime.concatstrings → s1 + s2 + s3
- fmt.Sprintf → formatted string building
- Look for format strings in .rodata: "%s%d", "%x"
Common stdlib patterns in CTF:
Go Binary Reversing Workflow
1. file binary
2. GoReSym -d binary > syms.json
3. strings binary | grep -i flag
4. Load in Ghidra with golang-loader
5. Find main.main
6. Identify string comparisons
7. Trace crypto operations
8. Check for embedded resources
Go embed.FS (Go 1.16+): Binaries can embed files at compile time:
strings binary | grep "embed"
Key insight: Go's runtime embeds extensive metadata even in stripped binaries. Use GoReSym before any manual analysis — it often recovers 90%+ of function names, making decompilation dramatically easier. Go strings are {ptr, len} tuples, not null-terminated — Ghidra's default string analysis will miss them without the golang-loader plugin.
Detection: Large static binary (2MB+ for simple programs), go.buildid, runtime.gopanic, source paths like /home/user/go/src/.
Rust Binary Reversing
Rust binaries are common in modern CTFs, especially for crypto, systems, and security tooling challenges.
Recognition
strings binary | grep -c "rust"
strings binary | grep "rustc"
strings binary | grep "/rustc/"
strings binary | grep "core::panicking"
Key indicators:
- core::panicking::panic in strings
- Mangled symbols starting with _ZN (Itanium ABI), e.g. _ZN4main4main17h...
- .rustc section in ELF
- References to /rustc/<commit_hash>/library/
- Large binary size (Rust statically links by default)
Symbol Demangling
cargo install rustfilt
nm binary | rustfilt | grep "main"
nm binary | c++filt | grep "main"
Common Rust Patterns in Decompilation
Option/Result enum:
# Option<T> in memory: {discriminant (0=None, 1=Some), value}
# Result<T, E>: {discriminant (0=Ok, 1=Err), union{ok_val, err_val}}
# In disassembly:
# cmp byte [rbp-0x10], 0 → check if None/Err
# je handle_none_case
Vec (same three fields as a Go slice, but the field order is chosen by the compiler and varies between Rust versions):
struct RustVec {
void *ptr;
uint64 cap;
uint64 len;
};
String / &str:
# String (owned): {ptr, capacity, length} — 24 bytes, heap-allocated
# &str (borrowed): {ptr, length} — 16 bytes, can point anywhere
# In decompilation, look for:
# alloc::string::String::from → String creation
# core::str::from_utf8 → byte slice to str
Iterator chains:
# .iter().map().filter().collect() compiles to loop fusion
# In disassembly: tight loop with inlined closures
# Look for: core::iter::adapters::map, filter, etc.
Panic unwinding:
strings binary | grep "panicked at"
strings binary | grep -E "called .(Option|Result)::unwrap\(\). on"
Rust-Specific Analysis Tools
cargo install cargo-bloat
cargo bloat --release -n 50
Key insight: Rust panic messages are goldmines — they contain source file paths, line numbers, and descriptive error strings even in release builds. Always strings binary | grep "panicked" first. Rust's monomorphization means generic functions get duplicated per type — expect many similar-looking functions.
Detection: core::panicking, .rustc section, /rustc/ paths, _ZN mangled symbols with Rust-style module paths.
Swift Binary Reversing
See platforms.md for full Swift reversing guide including demangling, runtime structures, and Ghidra integration. Key quick reference:
strings binary | grep "swift"
otool -l binary | grep "swift"
swift demangle '$s14MyApp0A8ClassC10checkInput6resultSbSS_tF'
Detection: __swift5_* sections in Mach-O, swift_ runtime symbols, $s prefix in mangled names.
Kotlin / JVM Binary Reversing
Kotlin compiles to JVM bytecode or native (via Kotlin/Native). Common in Android and server-side CTF.
JVM Bytecode (Android/Server)
strings classes.dex | grep "kotlin"
jadx classes.dex
cfr classes.jar --kotlin
fernflower classes.jar output/
Kotlin coroutines in disassembly:
# Coroutines compile to state machines:
# invokeSuspend(result) {
# switch (this.label) {
# case 0: this.label = 1; return suspendFunction();
# case 1: processResult(result); return Unit;
# }
# }
# Each suspend point becomes a state in the switch.
# Follow the state machine to understand async flow.
Kotlin/Native
strings binary | grep "konan"
Detection: kotlin.Metadata annotations (JVM), konan strings (Native), kotlin/ package paths.
C++ Binary Reversing (Quick Reference)
While C++ RE is well-covered by general tools, these patterns are CTF-specific:
vtable Reconstruction
# Virtual function tables (vtables):
# First 8 bytes of object → pointer to vtable
# vtable entries: [typeinfo_ptr, destructor, method1, method2, ...]
# In Ghidra: Data → Create Pointer at vtable address
# Identify polymorphic dispatch:
# mov rax, [rdi] # Load vtable from this pointer
# call [rax + 0x18] # Call 4th virtual method (0x18/8 = 3rd after typeinfo+dtor)
RTTI (Run-Time Type Information)
strings binary | grep -E "^[0-9]+[A-Z]"
c++filt _ZTI7MyClass
Standard Library Patterns
std::string (libstdc++):
SSO (Small String Optimization): inline buffer for ≤15 chars
Layout: {char* ptr, size_t size, union{size_t cap, char buf[16]}}
std::vector<T>:
{T* begin, T* end, T* capacity_end}
std::map<K,V>:
Red-black tree: each node has {left, right, parent, color, key, value}
std::unordered_map<K,V>:
Hash table: {bucket_array, size, load_factor_max, ...}
.pyc PEP-552 Magic-Header Forgery (source: pwn.college AoP 2025 day 9)
Trigger: loader validates the first 16 bytes of a .pyc (magic + timestamp + source-hash tag) but doesn't verify the bytecode body.
Signals: custom importlib.abc.Loader that checks data[:16]; no hashlib call on data[16:].
Mechanic: Python 3.7+ PEP-552 pyc layout: 4B magic | 4B flags | 8B (timestamp+size OR hash). Prepend the expected header verbatim, append attacker bytecode, loader accepts. Trivial template:
import importlib.util, marshal
my_code = compile("print('forged')", "<forged>", "exec")  # placeholder payload
magic = importlib.util.MAGIC_NUMBER
with open('out.pyc','wb') as f:
    f.write(magic + b'\x00\x00\x00\x00' + b'\x00'*8 + marshal.dumps(my_code))
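To check a forged file before submitting it, it helps to build and re-parse the 16-byte header explicitly. The sketch below follows the PEP 552 layout quoted above; `build_pyc`, `split_pyc` and the sample payload are assumptions for illustration. In a real challenge the header bytes would be copied verbatim from whatever the loader expects.

```python
import importlib.util
import marshal
import struct


def build_pyc(code, flags: int = 0, mtime: int = 0, source_size: int = 0) -> bytes:
    """Assemble a PEP 552 .pyc: 4B magic | 4B flags | 8B (mtime + size) | marshalled code."""
    header = importlib.util.MAGIC_NUMBER + struct.pack("<III", flags, mtime, source_size)
    return header + marshal.dumps(code)


def split_pyc(data: bytes):
    """Return (magic, flags, code object) from raw .pyc bytes."""
    magic = data[:4]
    (flags,) = struct.unpack("<I", data[4:8])
    return magic, flags, marshal.loads(data[16:])


# Hypothetical payload standing in for attacker bytecode.
payload = compile("result = 6 * 7", "<forged>", "exec")
forged = build_pyc(payload)
```

Round-tripping through `split_pyc` confirms that the body still unmarshals with the interpreter version the target uses, which is the usual failure point when the magic number and the bytecode come from different Python releases.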
Go Interface/itab Restore via GoReSym (source: HTB University 2025 Starshard Reassembly)
Trigger: Go binary with interface-typed method calls (itab tables); standard strings/objdump yield little; Go runtime typelinks intact.
Signals: runtime.itab symbol present; go.funcinfo.* section; dispatch through [itab+OFF].
Mechanic: run GoReSym or go-symbol-restore — recovers the typelinks table and resolves each itab to its concrete type + method set. Feed into Ghidra with type propagation to see virtual dispatch as plain calls. Pattern applies to any stripped Go 1.18+ binary.
eBPF kprobe FSM Gated by Syscall Sequence (source: pwn.college AoP 2025 day 4)
Trigger: eBPF attached to a kprobe that mutates a BPF_MAP_TYPE_HASH based on syscall arg hashes; flag only releases when map reaches a specific state.
Signals: bpftool prog list non-standard entry; bpftool prog dump xlated id N shows state-machine transitions.
Mechanic: see ctf-pwn/sandbox-escape.md cross-ref for the same technique from the pwn angle. For RE: lift bytecode via angr's bpf-ir loader, symbolic-execute to find a sequence of (syscall, arg) tuples reaching accept state.
CTF Reverse - Language & Platform-Specific Techniques
For Go and Rust binary reversing, see languages-compiled.md.
Python Bytecode Reversing (dis.dis output)
Common Pattern: XOR Validation with Split Indices
Challenge gives raw CPython bytecode (dis.dis disassembly). Common pattern:
- Check flag length
- XOR chars at even indices with key1, compare to list p1
- XOR chars at odd indices with key2, compare to list p2
Reversing:
flag = [''] * flag_length
for i in range(len(p1)):
    flag[2*i] = chr(p1[i] ^ key1)
    if i < len(p2):  # p2 is one element shorter when the flag length is odd
        flag[2*i+1] = chr(p2[i] ^ key2)
print(''.join(flag))
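A self-contained round trip makes the index arithmetic easy to verify. In the sketch below the keys, the sample flag and the name `recover_flag` are all hypothetical; in a challenge, p1, p2 and the keys come from the LOAD_CONST operands in the disassembly.

```python
def recover_flag(p1, p2, key1: int, key2: int) -> str:
    """Undo the even/odd XOR split: p1 holds even-index bytes, p2 odd-index bytes."""
    flag = [""] * (len(p1) + len(p2))
    for i, value in enumerate(p1):
        flag[2 * i] = chr(value ^ key1)
    for i, value in enumerate(p2):
        flag[2 * i + 1] = chr(value ^ key2)
    return "".join(flag)


# Hypothetical constants used to simulate the challenge's check.
key1, key2 = 0x13, 0x37
secret = "flag{split_xor}"
p1 = [ord(c) ^ key1 for c in secret[0::2]]
p2 = [ord(c) ^ key2 for c in secret[1::2]]
```

Iterating over p1 and p2 separately handles odd-length flags, where the even-index list is one element longer.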
Bytecode Analysis Tips
- LOAD_CONST followed by COMPARE_OP reveals expected values
- BINARY_XOR identifies the transformation
- BUILD_TUPLE/BUILD_LIST with constants = expected output array
- Loop structure: FOR_ITER + BINARY_SUBSCR = iterating over flag chars
- CALL_FUNCTION on ord = character-to-int conversion
Python Opcode Remapping
Identification
Decompiler fails with opcode errors.
Recovery
- Find modified opcode.pyc in PyInstaller bundle
- Compare with original Python opcodes
- Build mapping: {new_opcode: original_opcode}
- Patch target .pyc
- Decompile normally
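The patch step can be sketched as follows, assuming CPython 3.6+ wordcode (two bytes per instruction) and Python 3.8+ for `code.replace`. The function names are hypothetical, and on 3.11+ the inline cache entries also sit at even offsets, so a mapping that includes opcode 0 would need extra care.

```python
def remap_opcodes(co_code: bytes, mapping: dict) -> bytes:
    """Rewrite opcode bytes using {new_opcode: original_opcode}.

    Assumes wordcode: the opcode is at each even offset, its argument at the odd offset.
    """
    patched = bytearray(co_code)
    for offset in range(0, len(patched), 2):
        patched[offset] = mapping.get(patched[offset], patched[offset])
    return bytes(patched)


def remap_code_object(code, mapping: dict):
    """Apply the mapping to a code object and to nested code objects in co_consts."""
    consts = tuple(
        remap_code_object(c, mapping) if hasattr(c, "co_code") else c
        for c in code.co_consts
    )
    return code.replace(co_code=remap_opcodes(code.co_code, mapping), co_consts=consts)
```

Each byte is looked up once, so a mapping that swaps two opcodes works without the second substitution undoing the first. The remapped code object can then be marshalled back behind the original 16-byte header and handed to a decompiler.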
Shortcut (Hack.lu CTF 2013): If the challenge bundles its own modified Python interpreter (e.g., a custom ./py binary), install uncompyle2/uncompyle6 into that interpreter's environment and decompile using the challenge's own runtime. The modified interpreter understands its own opcode mapping, so standard decompilation tools work without manual opcode recovery.
Pyarmor 8/9 Static Unpack (1shot)
- Tool: Lil-House/Pyarmor-Static-Unpack-1shot
- Use for Pyarmor 8.x/9.x armored scripts without executing sample code
- Quick signature check: payload typically starts with PY + six digits (Pyarmor 7 and earlier PYARMOR format is not supported)
Workflow:
- Ensure target directory contains armored scripts and matching pyarmor_runtime library.
- Run one-shot unpack to emit .1shot. outputs (disassembly + experimental decompile).
- Treat disassembly as ground truth; verify decompiled source with bytecode when inconsistent.
python /path/to/oneshot/shot.py /path/to/scripts
Optional flags:
python /path/to/oneshot/shot.py /path/to/scripts -r /path/to/pyarmor_runtime.so
python /path/to/oneshot/shot.py /path/to/scripts -o /path/to/output
Notes:
- oneshot/pyarmor-1shot executable must exist before running shot.py.
- PyInstaller bundles or archives should be unpacked first, then processed with 1shot.
DOS Stub Analysis
PE files can hide code in DOS stub:
- Check for large DOS stub in Ghidra/IDA
- Run in DOSBox
- Load in IDA as 16-bit DOS
- Look for int 16h (keyboard input)
Unity IL2CPP Games
- Use Il2CppDumper to dump symbols
- If Il2CppDumper fails, consider that global-metadata.dat may be encrypted; search strings/xrefs in the main binary and inspect the metadata loading path for custom decryption before dump.
- Look for Start() functions
- Key derivation: key = SHA256(companyName + "\n" + productName)
- Decrypt server responses with derived key
Note that the main binary is usually GameAssembly.dll (or *Assembly.dll) on PC and libil2cpp.so on Android.
HarmonyOS HAP/ABC Reverse (abc-decompiler)
- Target files: .hap package and embedded .abc bytecode
- Tool: https://github.com/ohos-decompiler/abc-decompiler
- Download jadx-dev-all.jar from releases
Critical startup note:
- java -jar may enter GUI mode
- For CLI mode, always use:
java -cp "./jadx-dev-all.jar" jadx.cli.JadxCLI [options] <input>
Most common commands:
java -cp "./jadx-dev-all.jar" jadx.cli.JadxCLI -d "out" ".abc"
java -cp "./jadx-dev-all.jar" jadx.cli.JadxCLI -m simple -d "out_hap" "modules.abc"
Recommended parameters for this challenge:
- -m simple: reduce high-level reconstruction to avoid SSA/PHI-heavy failures
- --log-level ERROR: keep only critical errors
- Full recommended command:
java -cp "./jadx-dev-all.jar" jadx.cli.JadxCLI -m simple --log-level ERROR -d "out_abc_simple" "modules.abc"
Parameter quick reference:
- -d: output directory
- --help: help
Notes:
- .hap is a package: extract it first (zip), then locate and analyze .abc
- Quote paths containing spaces or non-ASCII characters
- Use a new output directory name per run to avoid stale results
- Errors do not always mean full failure; prioritize out_xxx/sources/
- If auto fails, switch to -m simple first
Standard workflow:
- Run with -m simple --log-level ERROR
- Inspect key business files in output (for example pages/Index.java)
- If cleaner output is needed, retry with -m auto or -m restructure
- If some methods still fail, keep the simple output and continue logic analysis via alternate paths
Brainfuck/Esolangs
- Check if compiled with known tools (BF-it)
- Understand tape/memory model
- Static analysis of cell operations
UEFI Binary Analysis
7z x firmware.bin -oextracted/
file extracted/* | grep "PE32+"
- Bootkit replaces boot loader
- Custom VM protects decryption
- Lift VM bytecode to C
Transpilation to C
For heavily obfuscated code:
for opcode, args in instructions:
    if opcode == 'XOR':
        print(f"r{args[0]} ^= r{args[1]};")
    elif opcode == 'ADD':
        print(f"r{args[0]} += r{args[1]};")
Compile with -O3 for constant folding.
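A slightly fuller sketch of the same idea emits a complete C function, so the output can be compiled directly. The instruction list, the opcode table and the choice of a `uint32_t` register array are assumptions for illustration; a real VM needs its own opcode set and register width.

```python
# Hypothetical lifted program: (opcode, (dst, src)) tuples from a VM disassembler.
instructions = [("XOR", (0, 1)), ("ADD", (0, 2)), ("XOR", (2, 0))]
C_OPERATORS = {"XOR": "^=", "ADD": "+="}


def transpile(program, num_regs: int = 4) -> str:
    """Emit a C function that applies the VM program to a register file."""
    lines = ["#include <stdint.h>", "", f"void run(uint32_t r[{num_regs}]) {{"]
    for opcode, (dst, src) in program:
        # An unknown opcode raises KeyError, which flags gaps in the lifter early.
        lines.append(f"    r[{dst}] {C_OPERATORS[opcode]} r[{src}];")
    lines.append("}")
    return "\n".join(lines)


c_source = transpile(instructions)
```

Writing `c_source` to a file and compiling it with optimizations lets the compiler fold the constant parts of the VM program, which often exposes the underlying check.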
Code Coverage Side-Channel Attack
Pattern (Coverup, Nullcon 2026): PHP challenge provides XDebug code coverage data alongside encrypted output.
How it works:
- PHP code uses xdebug_start_code_coverage(XDEBUG_CC_UNUSED | XDEBUG_CC_DEAD_CODE | XDEBUG_CC_BRANCH_CHECK)
- Encryption uses data-dependent branches: if ($xored == chr(0)) ... if ($xored == chr(1)) ...
- Coverage JSON reveals which branches were executed during encryption
- This leaks the set of XOR intermediate values that occurred
Exploitation:
import json
with open('coverage.json') as f:
    cov = json.load(f)
# extract_value_from_line is a challenge-specific helper: map a covered line to the chr(N) it tests
executed_xored = set()
for line_no, hit_count in cov['encrypt.php']['lines'].items():
    if hit_count > 0:
        executed_xored.add(extract_value_from_line(line_no))
# plaintext_byte is the known or guessed plaintext byte at this position
for pos in range(len(ciphertext)):
    candidates = []
    for key_byte in range(256):
        xored = plaintext_byte ^ key_byte
        if xored in executed_xored:
            candidates.append(key_byte)
Key insight: Code coverage is a powerful oracle — it tells you which conditional paths were taken. Any encryption with data-dependent branching leaks information through coverage.
Mitigation detection: Look for branchless/constant-time crypto implementations that defeat this attack.
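The filtering step can be tried on synthetic data to see how much the coverage set narrows the key space. The sketch below assumes known plaintext and a plain XOR intermediate; `key_candidates`, the sample key and the plaintext are hypothetical, and the observed set is simulated rather than parsed from a coverage report.

```python
def key_candidates(known_plaintext: bytes, observed_xors: set) -> list:
    """For each position, list key bytes whose XOR with the plaintext byte was observed."""
    return [
        [k for k in range(256) if (p ^ k) in observed_xors]
        for p in known_plaintext
    ]


# Hypothetical data: in the challenge, observed_xors comes from the coverage report.
key = bytes([0x41, 0x7F, 0x03])
plaintext = b"ENO"
observed_xors = {p ^ k for p, k in zip(plaintext, key)}
candidates = key_candidates(plaintext, observed_xors)
```

Because coverage records a set rather than an order, each position keeps exactly as many candidates as there are observed values. Further constraints, such as a printable key or a repeating key length, are needed to pick one.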
Functional Language Reversing (OPAL)
Pattern (Opalist, Nullcon 2026): Binary compiled from OPAL (Optimized Applicative Language), a purely functional language.
Recognition markers:
- .impl (implementation) and .sign (signature) source files
- IMPLEMENTATION / SIGNATURE keywords
- Nested IF..THEN..ELSE..FI structures
- Functions named f1, f2, ... fN (numeric naming)
- Heavy use of seq[nat], string, denotation types
Reversing approach:
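- Pure functions are deterministic and free of side effects, so each pipeline step can be analyzed in isolation; invert the bijective steps directly and brute-force or constrain the lossy ones
- Identify the transformation chain: f_final(f_n(...f_2(f_1(input))...))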
- For each function, build the inverse
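When every step is invertible, undoing the chain means applying the inverses in the opposite order. The sketch below uses three made-up steps (`rot_add`, `xor_index`, `reverse_bytes`) purely for illustration; they are not taken from the challenge.

```python
def rot_add(data, k):
    """Hypothetical forward step: add k to every byte (mod 256)."""
    return bytes((b + k) % 256 for b in data)

def rot_add_inv(data, k):
    """Inverse of rot_add: subtract k from every byte (mod 256)."""
    return bytes((b - k) % 256 for b in data)

def xor_index(data):
    """Hypothetical step: XOR each byte with its index; self-inverse."""
    return bytes(b ^ (i & 0xFF) for i, b in enumerate(data))

def reverse_bytes(data):
    """Hypothetical step: reverse the buffer; self-inverse."""
    return data[::-1]

# Forward chain f3(f2(f1(x))) and the matching inverse for each step.
FORWARD = [lambda d: rot_add(d, 13), xor_index, reverse_bytes]
INVERSE = [lambda d: rot_add_inv(d, 13), xor_index, reverse_bytes]

def apply_steps(steps, data):
    for step in steps:
        data = step(data)
    return data

def invert(data):
    # Undo the last forward step first.
    return apply_steps(reversed(INVERSE), data)
```

A round-trip check such as `invert(apply_steps(FORWARD, sample)) == sample` catches ordering mistakes before the inverse is pointed at the real target.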
Aggregate brute-force for scramble functions:
When a transformation accumulates state that depends on original (unknown) values:
decoded = base64_decode(target)
for total_offset_S in range(256):
    candidate = [(b - total_offset_S) % 256 for b in decoded]
    recomputed_S = sum(contribution(i, candidate[i]) for i in range(len(candidate))) % 256
    if recomputed_S == total_offset_S:
        result = apply_inverse_substitution(candidate)
        if all(32 <= c < 127 for c in result):
            print(bytes(result))
Key lesson: When a scramble function has a chicken-and-egg dependency (result depends on original, which is unknown), brute-force the aggregate effect (often mod 256 = 256 possibilities) rather than all possible states (exponential).
Python Version-Specific Bytecode (VuwCTF 2025)
Pattern (A New Machine): Challenge targets specific Python version (e.g., 3.14.0 alpha).
Key requirement: Compile that exact Python version to disassemble bytecode — alpha/beta versions have different opcodes than stable releases.
wget https://www.python.org/ftp/python/3.14.0/Python-3.14.0a4.tar.xz
tar xf Python-3.14.0a4.tar.xz
cd Python-3.14.0a4 && ./configure && make -j$(nproc)
./python -c "import dis, marshal; dis.dis(marshal.loads(open('challenge.pyc','rb').read()[16:]))"
Common validation: Flag compared against tuple of squared ASCII values:
import math
flag = ''.join(chr(int(math.isqrt(v))) for v in expected_values)
Non-Bijective Substitution Cipher Reversing
Pattern (Coverup, Nullcon 2026): S-box/substitution table has collisions (multiple inputs map to same output).
Detection:
sbox = [...]
if len(set(sbox)) < len(sbox):
    print("Non-bijective! Collisions exist.")
Building reverse lookup:
from collections import defaultdict
rev_sub = defaultdict(list)
for i, v in enumerate(sbox):
    rev_sub[v].append(i)
Disambiguation strategies:
- Known plaintext format (e.g., ENO{, flag{) fixes key bytes at known positions
- Side-channel data (code coverage, timing) eliminates impossible candidates
- Printable ASCII constraint (32-126) reduces candidate space
- Re-encrypt candidates and verify against known ciphertext
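The strategies above can be combined into a small enumerator. This is a sketch under a simplifying assumption: the cipher is a bare per-byte substitution with no key mixing, so `candidates` (a hypothetical helper) would need the real round function plugged into its re-encryption check.

```python
from collections import defaultdict
from itertools import product

def build_preimages(sbox):
    """Map each S-box output to every input that produces it."""
    rev = defaultdict(list)
    for i, v in enumerate(sbox):
        rev[v].append(i)
    return rev

def candidates(ciphertext, sbox, known_prefix=b"", limit=10000):
    """Enumerate printable plaintexts consistent with the ciphertext."""
    rev = build_preimages(sbox)
    per_pos = []
    for pos, c in enumerate(ciphertext):
        # Printable ASCII constraint.
        opts = [p for p in rev.get(c, []) if 32 <= p <= 126]
        # Known plaintext format pins the leading positions.
        if pos < len(known_prefix):
            opts = [p for p in opts if p == known_prefix[pos]]
        per_pos.append(opts)
    found = []
    for combo in product(*per_pos):
        cand = bytes(combo)
        # Re-encrypt and verify against the known ciphertext.
        if bytes(sbox[b] for b in cand) == bytes(ciphertext):
            found.append(cand)
            if len(found) >= limit:
                break
    return found
```

The candidate count is the product of the per-position preimage counts, so apply side-channel or format constraints before enumerating long ciphertexts.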
Roblox Place File Analysis
Pattern (MazeRunna, 0xFun 2026): Roblox game with flag hidden in older version; latest version contains decoy.
Version history via Asset Delivery API:
curl -H "Cookie: .ROBLOSECURITY=..." \
"https://assetdelivery.roblox.com/v2/assetId/{placeId}/version/1"
Binary format parsing: .rbxlbin files contain chunks:
- INST: class buckets and referent IDs
- PROP: per-instance fields (including Script.Source)
- PRNT: parent-child relationships (object tree)
Decode chunk payloads, walk PROP entries for Source field, dump Script.Source / LocalScript.Source per version, then diff.
Key lesson: Always check version history. Latest version may contain decoy flag while real flag is in an older version. Diff script sources across versions.
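Once the script sources for each version have been dumped, the diffing step can be sketched with the standard library. The input shape (`{version: {script_path: source_text}}`) and the name `diff_versions` are assumptions for illustration; the chunk parsing itself is not shown.

```python
import difflib

def diff_versions(sources_by_version):
    """Return unified diffs between consecutive versions of each script.

    sources_by_version: {version_number: {script_path: source_text}}
    """
    versions = sorted(sources_by_version)
    report = []
    for old, new in zip(versions, versions[1:]):
        before, after = sources_by_version[old], sources_by_version[new]
        for path in sorted(set(before) | set(after)):
            delta = list(difflib.unified_diff(
                before.get(path, "").splitlines(),
                after.get(path, "").splitlines(),
                fromfile=f"v{old}/{path}",
                tofile=f"v{new}/{path}",
                lineterm="",
            ))
            if delta:
                report.append("\n".join(delta))
    return report
```

Lines removed between an early version and the latest one are the first place to look for the real flag.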
Godot Game Asset Extraction
Pattern (Steal the Xmas): Encrypted Godot .pck packages.
Tools:
- gdsdecomp - Extract Godot packages
- KeyDot - Extract encryption key from Godot executables
Workflow:
- Run KeyDot against game executable → extract encryption key
- Input key into gdsdecomp
- Extract and open project in Godot editor
- Search scripts/resources for flag data
Rust serde_json Schema Recovery
Pattern (Curly Crab, PascalCTF 2026): Rust binary reads JSON from stdin, deserializes via serde_json, prints success/failure emoji.
Approach:
- Disassemble serde-generated Visitor implementations
- Each visitor's visit_map / visit_seq reveals expected keys and types
- Look for string literals in deserializer code (field names like "pascal", "CTF")
- Reconstruct nested JSON schema from visitor call hierarchy
- Identify value types from visitor method names: visit_str = string, visit_u64 = number, visit_bool = boolean, visit_seq = array
{"pascal":"CTF","CTF":2026,"crab":{"I_":true,"cr4bs":1337,"crabby":{"l0v3_":["rust"],"r3vv1ng_":42}}}
Key insight: Flag is the concatenation of JSON keys in schema order. Reading field names in order reveals the flag.
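Reading the keys in order can be automated once a candidate payload exists. The sketch below walks the example payload depth-first; it assumes document order matches the schema order recovered from the visitor hierarchy, which should be confirmed against the disassembly. `keys_in_order` is a hypothetical helper.

```python
import json

def keys_in_order(node):
    """Depth-first list of object keys in document order."""
    out = []
    if isinstance(node, dict):
        for key, value in node.items():  # dicts preserve insertion order
            out.append(key)
            out.extend(keys_in_order(value))
    elif isinstance(node, list):
        for item in node:
            out.extend(keys_in_order(item))
    return out

payload = ('{"pascal":"CTF","CTF":2026,"crab":{"I_":true,"cr4bs":1337,'
           '"crabby":{"l0v3_":["rust"],"r3vv1ng_":42}}}')
doc = json.loads(payload)
ordered_keys = keys_in_order(doc)
```

Joining `ordered_keys` gives the raw material for the flag; any wrapper or separator the challenge expects still has to come from the binary.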
Verilog/Hardware Reverse Engineering (srdnlenCTF 2026)
Pattern (Rev Juice): Verilog HDL source for a vending machine with hidden product unlocked by specific coin insertion and selection sequence.
Approach:
- Analyze Verilog modules to understand state machine and history tracking
- Identify hidden conditions (e.g., product 8 enabled only when COINS_HISTORY array has specific values at specific taps)
- Build timing model for each action type (how many clock cycles each operation takes)
- Work backward from required history values to construct the correct input sequence
Timing model construction:
TIMING = {
"insert_coin": 3,
"select_success": 7,
"select_fail": 5,
"cancel_with_coins": 4,
"cancel_at_zero": 2,
}
Key insight: Hardware challenges require understanding the exact timing model — each operation takes a specific number of clock cycles, and shift registers record history at fixed tap positions. Work backward from the required tap values to determine what action must have occurred at each cycle. The solution is often a specific sequence notation (e.g., I9C_SP6_CNL_I2C_SP2_I6C_SP6_SP6_SP5_CNL_I4C_SP1).
Detection: Look for .v or .sv (Verilog/SystemVerilog) files, always @(posedge clk) blocks, shift register patterns, and state machine case statements with hidden conditions gated on history values.
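A timing model like the one above can be turned into a cycle schedule to see which action is in flight at a given clock cycle. This is a rough sketch: it assumes actions run back to back with no idle cycles, and `schedule` / `action_at` are hypothetical helpers, not part of the challenge.

```python
TIMING = {
    "insert_coin": 3,
    "select_success": 7,
    "select_fail": 5,
    "cancel_with_coins": 4,
    "cancel_at_zero": 2,
}

def schedule(actions):
    """Return (first_cycle, last_cycle, action) for each action in order."""
    cycle, out = 0, []
    for action in actions:
        length = TIMING[action]
        out.append((cycle, cycle + length - 1, action))
        cycle += length
    return out

def action_at(actions, cycle):
    """Return the action occupying the given cycle, or None if past the end."""
    for first, last, action in schedule(actions):
        if first <= cycle <= last:
            return action
    return None
```

Working backward then amounts to choosing actions so that `action_at` returns the required value at each tap's cycle offset.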
Prefix-by-Prefix Hash Reversal (Nullcon 2026)
See patterns-ctf-2.md for the full technique. This section covers language-specific considerations.
Language-specific notes:
- Hash algorithm may be uncommon (MD2, custom); you don't need to identify it, just match outputs by running the binary
- Use subprocess.run() with timeout=2 to handle binaries that hang on bad input
- For stripped binaries, check if ltrace reveals the hash function name (e.g., MD2_Update)
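A wrapper for the timeout advice might look like the sketch below. The name `run_binary` and the choice to feed the candidate on stdin are assumptions; some targets take the input as an argument instead.

```python
import subprocess
import sys

def run_binary(argv, data, timeout=2.0):
    """Run the target with data on stdin; return stdout lines, or None on a hang."""
    try:
        proc = subprocess.run(argv, input=data, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return proc.stdout.decode(errors="replace").splitlines()

if __name__ == "__main__":
    # Demo against the Python interpreter itself rather than a real challenge binary.
    print(run_binary([sys.executable, "-c", "print('ok')"], b""))
```

Returning `None` on timeout lets the brute-force loop treat a hang as "wrong candidate" and keep going.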
Android JNI RegisterNatives Obfuscation (HTB WonderSMS)
Pattern: Android app loads native library with System.loadLibrary(), but uses RegisterNatives in JNI_OnLoad instead of standard JNI naming convention (Java_com_pkg_Class_method). This hides which C++ function handles each Java native method.
Identification:
static { System.loadLibrary("audio"); }
private final native ProcessedMessage processMessage(SmsMessage msg);
Standard JNI would have a symbol Java_com_rloura_wondersms_SmsReceiver_processMessage. If that symbol is missing from the .so, RegisterNatives is being used.
Finding the real handler in Ghidra:
- Locate JNI_OnLoad (exported symbol, always present)
- Trace to RegisterNatives(env, clazz, methods, count) call
- The methods array contains {name, signature, fnPtr} structs
- Follow fnPtr to find the actual native function
static JNINativeMethod methods[] = {
{"processMessage", "(Landroid/telephony/SmsMessage;)LProcessedMessage;", (void*)real_handler}
};
(*env)->RegisterNatives(env, clazz, methods, 1);
Architecture selection for analysis:
unzip WonderSMS.apk -d extracted/
ls extracted/lib/x86_64/
Key insight: RegisterNatives is a deliberate obfuscation technique: it decouples Java method names from native symbol names, so handlers cannot be found by searching the export table for Java_* symbols. Always check JNI_OnLoad first when reversing Android native libraries with stripped symbols.
Detection: Native method declared in Java + no matching JNI symbol in .so + JNI_OnLoad present. The library is typically stripped (no debug symbols).
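If the methods table has been located, its entries can be decoded outside the disassembler. The sketch below assumes a flat, little-endian memory image in which pointers are already relocated and `address - base` is a valid offset; `parse_native_methods` and `read_cstr` are hypothetical helpers.

```python
import struct

def read_cstr(image, addr, base=0):
    """Read a NUL-terminated ASCII string at a virtual address."""
    off = addr - base
    end = image.index(b"\x00", off)
    return image[off:end].decode("ascii", errors="replace")

def parse_native_methods(image, table_addr, count, base=0, ptr_size=8):
    """Decode `count` JNINativeMethod entries: (name, signature, fnPtr)."""
    fmt = "<3Q" if ptr_size == 8 else "<3I"  # three pointers per entry
    entry_size = struct.calcsize(fmt)
    methods = []
    for i in range(count):
        offset = table_addr - base + i * entry_size
        name_ptr, sig_ptr, fn_ptr = struct.unpack_from(fmt, image, offset)
        methods.append((read_cstr(image, name_ptr, base),
                        read_cstr(image, sig_ptr, base),
                        fn_ptr))
    return methods
```

On a real PIE library the pointers live in a relocated section, so export the memory image from the disassembler or apply relocations first.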
Ruby/Perl Polyglot Constraint Satisfaction (BearCatCTF 2026)
Pattern (Polly's Key): A single file valid in both Ruby and Perl. Each language imposes different validation constraints on a 50-character key. Satisfy both simultaneously to decrypt the flag.
Polyglot structure exploits:
- Ruby: =begin...=end is a block comment
- Perl: =begin...=cut is POD (Plain Old Documentation), =end is ignored
- Different code runs in each language based on comment block boundaries
Typical constraints:
- Ruby: Character set must form a mathematical property (e.g., all 50 printable ASCII chars except ^ used exactly once, each satisfying XOR(val, (val-16) % 257) is a primitive root mod 257)
- Perl: Ordering constraint via insertion sort inversion count (hardcoded inversion table determines exact permutation)
Solution approach:
- Find the valid character set (mathematical constraint from one language)
- Use the ordering constraint (from other language) to determine exact arrangement
- Compute key hash (e.g., MD5) and decrypt
def reconstruct_from_inversions(chars, inv_counts):
    result = []
    remaining = sorted(chars)
    for i in range(len(chars) - 1, -1, -1):
        idx = inv_counts[i]
        result.insert(idx, remaining.pop(i))
    return result
Key insight: Polyglot files exploit language-specific comment/block syntax to run different code in each interpreter. The constraints from both languages intersect to uniquely determine the key. Identify which code runs in which language by testing the file with both interpreters and comparing behavior.
Detection: File that runs under multiple interpreters (ruby file && perl file). Challenge mentions "polyglot" or provides a file ending in .rb that also looks like Perl.
Electron App + Native Binary Reversing (RootAccess2026)
Pattern (Rootium Browser): Electron desktop app bundles a native ELF/DLL binary for sensitive operations (vault, crypto, auth). The Electron layer is a wrapper; the real flag logic is in the native binary.
Extraction workflow:
- Unpack Electron ASAR archive:
npm install -g @electron/asar
asar extract resources/app.asar app_extracted/
ls app_extracted/
- Locate native binary: Search for ELF/DLL files called from JavaScript:
find app_extracted/ -name "*.node" -o -name "*.so" -o -name "*vault*" -o -name "*auth*"
grep -r "spawn\|execFile\|ffi\|require.*native" app_extracted/
- Reverse the native binary (XOR + rotation cipher example):
def ror8(value, count):
    """Rotate an 8-bit value right by count bits."""
    count &= 7
    return ((value >> count) | (value << (8 - count))) & 0xFF

def decrypt_password(encrypted_bytes, key):
    """Common pattern: XOR with constant + bit rotation + key XOR."""
    result = []
    for i, byte in enumerate(encrypted_bytes):
        # Rotate rather than shift: a plain >> discards the low bits.
        # Direction depends on the binary (rotate-left on encrypt means rotate-right here).
        decrypted = ror8(byte ^ 0x42, 3) ^ key[i % len(key)]
        result.append(chr(decrypted))
    return ''.join(result)

def decrypt_flag(encrypted_flag, password):
    """Flag uses password as key with position-dependent rotation."""
    result = []
    for i, byte in enumerate(encrypted_flag):
        key_byte = ord(password[i % len(password)])
        decrypted = ror8(byte ^ 0x7E, i % 8) ^ key_byte
        result.append(chr(decrypted))
    return ''.join(result)
Key insight: Electron apps are JavaScript wrapping native code. Extract with asar, then focus on the native binary. The JS layer often contains the password verification flow in plaintext, revealing what the native binary expects. Look for encrypted data in the .data or .rodata sections of the ELF.
Detection: .asar files in resources/ directory, Electron framework files, package.json with electron dependency.
Node.js npm Package Runtime Introspection (RootAccess2026)
Pattern (RootAccess CLI): Obfuscated npm package with RC4 encoding, control flow flattening, and flag split across multiple fragments. Static analysis is impractical — use runtime introspection instead.
Dynamic analysis approach:
#!/usr/bin/env node
const cryptoMod = require('target-package/dist/lib/crypto.js');
const vaultMod = require('target-package/dist/lib/vault.js');
for (const mod of [cryptoMod, vaultMod]) {
for (const key of Object.keys(mod)) {
const obj = mod[key];
console.log(`Export: ${key}`);
const props = Object.getOwnPropertyNames(obj);
const proto = Object.getOwnPropertyNames(obj.prototype || {});
console.log(' Own:', props);
console.log(' Proto:', proto);
}
}
const Engine = cryptoMod.CryptoEngine;
const total = Engine.getTotalFragments();
let flag = '';
for (let i = 1; i <= total; i++) {
flag += Engine.getFragment(i);
}
console.log('Flag:', flag);
const hidden = Object.getOwnPropertyNames(Engine)
.filter(p => p.startsWith('__') || p.startsWith('_'));
console.log('Hidden methods:', hidden);
Key insight: Heavily obfuscated JavaScript (control flow flattening, RC4 string encoding, dead code) makes static analysis prohibitively slow. Runtime introspection via Object.getOwnPropertyNames() reveals all methods including hidden ones. The module's own decryption runs automatically when loaded — just call the decoded functions directly.
Detection: npm package with minified/obfuscated dist/ directory, challenge says "reverse engineer the CLI tool", package.json with custom commands.
CTF Reverse - Competition-Specific Patterns (Part 2)
Multi-Layer Self-Decrypting Binary (DiceCTF 2026)
Pattern (another-onion): Binary with N layers (e.g., 256), each reading 2 key bytes, deriving keystream via SHA-256 NI instructions, XOR-decrypting the next layer, then jumping to it. Must solve within a time limit (e.g., 30 minutes).
Oracle for correct key: Wrong key bytes produce garbage code. Correct key bytes produce code with exactly 2 call read@plt instructions (next layer's reads). Brute-force all 65536 candidates per layer using this oracle.
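The oracle check itself can be approximated with a byte scan for near calls. This is a sketch, not a disassembler: it assumes the calls are encoded as `E8 rel32`, and the names `count_calls_to`, `code_addr`, and `target` (the address of `read@plt`) are illustrative.

```python
import struct

def count_calls_to(code, code_addr, target):
    """Count E8 rel32 calls in `code` (loaded at code_addr) that land on `target`."""
    hits = 0
    for off in range(len(code) - 4):
        if code[off] != 0xE8:
            continue
        (rel,) = struct.unpack_from("<i", code, off + 1)
        # Destination is relative to the address of the next instruction.
        dest = (code_addr + off + 5 + rel) & 0xFFFFFFFFFFFFFFFF
        if dest == target:
            hits += 1
    return hits
```

Because a random 4-byte displacement rarely resolves to the exact PLT address, garbage from a wrong key almost never scores 2, which is what makes the count usable as an oracle.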
JIT execution approach (fastest):
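/* Map the target's .text and .bss at their original addresses. */
void *text = mmap((void*)0x400000, text_size, PROT_READ|PROT_WRITE|PROT_EXEC,
                  MAP_FIXED|MAP_PRIVATE, fd, 0);
void *bss = mmap((void*)bss_addr, bss_size, PROT_READ|PROT_WRITE,
                 MAP_FIXED|MAP_SHARED, shm_fd, 0);
for (int candidate = 0; candidate < 65536; candidate++) {
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: remap BSS privately so writes are copy-on-write. */
        mmap((void*)bss_addr, bss_size, PROT_READ|PROT_WRITE,
             MAP_FIXED|MAP_PRIVATE, shm_fd, 0);
        inject_key(candidate >> 8, candidate & 0xff);
        ((void(*)())layer_addr)();
        if (count_read_calls(next_layer_addr) == 2) signal_found(candidate);
        _exit(0);
    }
    waitpid(pid, NULL, 0);  /* reap the child; batch this when running many workers */
}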
Performance tiers:
| Approach | Speed | 256-layer estimate |
|---|---|---|
| Python subprocess | ~2/s | days |
| Ptrace fork injection | ~119/s | 6+ hours |
| JIT + fork-per-candidate | ~1000/s | 140 min |
| JIT + shared BSS + 32 workers | ~3500/s | ~17 min |
Shared BSS optimization: BSS (16MB+) stored in /dev/shm as MAP_SHARED in parent. Children remap as MAP_PRIVATE for COW. Reduces fork overhead from 16MB page-table setup to ~4KB.
Key insight: Multi-layer decryption challenges are fundamentally about building fast brute-force engines. JIT execution (mapping binary memory into solver, running code directly as function calls) is orders of magnitude faster than ptrace. Fork-based COW provides free memory isolation per candidate.
Gotchas:
- Real binary may use call (0xe8) instead of jmp (0xe9) for layer transitions; adjust tail patching
- BSS may extend beyond ELF MemSiz via kernel brk mapping; map extra space
- SHA-NI instructions may work even when not advertised in /proc/cpuinfo; try executing one before falling back to a software SHA-256
Embedded ZIP + XOR License Decryption (MetaCTF 2026)
Pattern (License To Rev): Binary requires a license file as argument. Contains an embedded ZIP archive with the expected license, and an XOR-encrypted flag.
Recognition:
- strings reveals EMBEDDED_ZIP and ENCRYPTED_MESSAGE symbols
- Binary is not stripped: nm or readelf -s shows data symbols in .rodata
- file shows PIE executable, source file named licensed.c
Analysis workflow:
- Find data symbols:
readelf -s binary | grep -E "EMBEDDED|ENCRYPTED|LICENSE"
- Extract embedded ZIP:
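with open('binary', 'rb') as f:
    data = f.read()
zip_start = data.find(b'PK\x03\x04')  # ZIP local file header magic
# 384 is this challenge's archive size; in general take the symbol size from readelf -s
with open('embedded.zip', 'wb') as out:
    out.write(data[zip_start:zip_start+384])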
- Extract license from ZIP:
unzip embedded.zip
- XOR decrypt the flag:
license = open('license.txt', 'rb').read()
enc_msg = open('encrypted_msg.bin', 'rb').read()
flag = bytes(a ^ b for a, b in zip(enc_msg, license))
print(flag.decode())
Key insight: No need to run the binary or bypass the expiry date check. The embedded ZIP and encrypted message are both in .rodata — extract and XOR offline.
Disassembly confirms:
- memcmp(user_license, decompressed_embedded_zip, size): license validation
- Date parsing with sscanf("%d-%d-%d") on EXPIRY_DATE= field
- XOR loop: ENCRYPTED_MESSAGE[i] ^ license[i] → putc() per byte
Lesson: When a binary has named symbols (EMBEDDED_*, ENCRYPTED_*), extract data directly from the binary without execution. XOR with known plaintext (the license) is trivially reversible.
Stack String Deobfuscation from .rodata XOR Blob (Nullcon 2026)
Pattern (stack_strings_1/2): Binary mmaps a blob from .rodata, XOR-deobfuscates it, then uses the blob to validate input. Flag is recovered by reimplementing the verification loop.
Recognition:
- mmap() call followed by XOR loop over .rodata data
- Verification loop with running state (eax, ebx, r9) updated with constants like 0x9E3779B9, 0x85EBCA6B, 0xA97288ED
- rol32() operations with position-dependent shifts
- Expected bytes stored in deobfuscated buffer
Approach:
- Extract .rodata blob with pyelftools:
from elftools.elf.elffile import ELFFile
with open(binary, "rb") as f:
    elf = ELFFile(f)
    ro = elf.get_section_by_name(".rodata")
    blob = ro.data()[offset:offset+size]
- Recover embedded constants (length, magic values) by XOR with known keys from disassembly
- Reimplement the byte-by-byte verification loop:
- Each iteration: compute two hash-like values from running state
- XOR them together and with expected byte to recover input byte
- Update running state with constant additions
Variant (stack_strings_2): Adds position permutation + state dependency on previous character:
- Position permutation: byte i may go to position pos[i] in the output
- State dependency: need = (expected - rol8(prev_char, 1)) & 0xFF
- Must track state variable that updates to current character each iteration
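One reading of the variant's two rules can be checked with a round-trip model. Everything below is an assumption for illustration: the initial state of 0, the direction of the permutation, and the forward `encode` function, which exists only to test `recover`.

```python
def rol8(value, count):
    """Rotate an 8-bit value left by count bits."""
    count &= 7
    return ((value << count) | (value >> (8 - count))) & 0xFF

def recover(expected, pos, init_state=0):
    """Invert the check: need = (expected - rol8(prev_char, 1)) & 0xFF."""
    out = bytearray(len(expected))
    prev = init_state
    for i, exp in enumerate(expected):
        ch = (exp - rol8(prev, 1)) & 0xFF
        out[pos[i]] = ch   # undo the position permutation
        prev = ch          # state becomes the current character
    return bytes(out)

def encode(flag, pos, init_state=0):
    """Assumed forward model, used only to sanity-check recover()."""
    expected, prev = [], init_state
    for i in range(len(flag)):
        ch = flag[pos[i]]
        expected.append((ch + rol8(prev, 1)) & 0xFF)
        prev = ch
    return expected
```

If the real binary seeds the state differently or permutes in the other direction, only `init_state` and the `out[pos[i]]` indexing need to change.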
Key constants to look for:
- 0x9E3779B9 (golden ratio fractional, common in hash functions)
- 0x85EBCA6B (MurmurHash3 finalizer constant)
- 0xA97288ED (related hash constant)
- rol32() with shift i & 7
Prefix Hash Brute-Force (Nullcon 2026)
Pattern (Hashinator): Binary hashes every prefix of the input independently and outputs one digest per prefix. Given N output digests, the flag has N-1 characters.
Attack: Recover input one character at a time:
for pos in range(1, len(target_hashes)):
    for ch in charset:
        candidate = known_prefix + ch + padding
        hashes = run_binary(candidate)
        if hashes[pos] == target_hashes[pos]:
            known_prefix += ch
            break
Key insight: If each prefix hash is independent (no chaining/HMAC), the problem decomposes into N x |charset| binary executions. This is the hash equivalent of byte-at-a-time block cipher attacks.
Detection: Binary outputs multiple hash lines. Changing last character only changes last hash. Different input lengths produce different numbers of output lines.
CVP/LLL Lattice for Constrained Integer Validation (HTB ShadowLabyrinth)
Pattern: Binary validates flag via matrix multiplication where grouped input characters are multiplied by coefficient matrices and checked against expected 64-bit results. Standard algebra fails because solutions must be printable ASCII (32-126). Lattice-based CVP (Closest Vector Problem) with LLL reduction solves this efficiently.
Identification:
- Binary groups input characters (e.g., 4 at a time)
- Each group is multiplied by a coefficient matrix
- Results compared against hardcoded 64-bit values
- Need integer solutions in a constrained range (printable ASCII)
SageMath CVP solver:
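from sage.all import matrix, vector, ZZ, QQ

def solve_constrained_matrix(coefficients, targets, char_range=(32, 126)):
    """
    coefficients: list of coefficient rows, one row per equation
    targets: expected output values, one per equation
    char_range: valid character range (printable ASCII)
    """
    n = len(coefficients[0])   # unknown characters
    m = len(targets)           # equations
    mid = (char_range[0] + char_range[1]) // 2
    scale = 1 << 20            # weight that forces the equations to hold exactly

    # Basis row j = (scale * column j of the coefficient matrix | unit vector e_j)
    B = matrix(ZZ, n, m + n)
    for i, row in enumerate(coefficients):
        for j, c in enumerate(row):
            B[j, i] = scale * c
    for j in range(n):
        B[j, m + j] = 1

    # Target: exact equation values, with every character near mid-range
    target = vector(ZZ, [scale * t for t in targets] + [mid] * n)

    # Babai nearest-plane on the LLL-reduced basis
    L = B.LLL()
    G, _ = L.change_ring(QQ).gram_schmidt()
    diff = target
    for k in reversed(range(L.nrows())):
        coeff = (diff * G[k]) / (G[k] * G[k])
        diff = diff - coeff.round() * L[k]
    closest = target - diff

    # The last n coordinates of the closest lattice vector are the characters
    return bytes(int(v) for v in closest[m:])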
Two-phase validation pattern:
- Phase 1 (matrix math): Solve via CVP/LLL → recovers first N characters
- First N characters become AES key → decrypt file.bin (XOR last 16 bytes + AES-256-CBC + zlib decompress)
- Phase 2 (custom VM): Decrypted bytecode runs in custom VM, validates remaining characters via another linear system (mod 2^32)
Modular linear system solving (Phase 2 — VM validation):
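from sympy import Matrix

MOD = 2**32
M_mod = Matrix(coefficients)
v_mod = Matrix(targets)
# Matrix.solve works over the rationals; for arithmetic mod 2^32 use the modular
# inverse, which exists only when the matrix is square with an odd determinant.
solution = (M_mod.inv_mod(MOD) * v_mod).applyfunc(lambda x: x % MOD)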
Key insight: When a binary validates input through linear combinations with large coefficients and the solution must be in a small range (printable ASCII), this is a lattice problem in disguise. LLL reduction + CVP finds the nearest lattice point, recovering the constrained solution. Cross-reference: invoke /ctf-crypto for LLL/CVP fundamentals (advanced-math.md in ctf-crypto).
Detection: Binary performs matrix-like operations on grouped input, compares against 64-bit constants, and a brute-force search space is too large (e.g., 256^4 per group × 12 groups).
Decision Tree Function Obfuscation (HTB WonderSMS)
Pattern: Binary routes input through ~200+ auto-generated functions, each computing a polynomial expression from input positions, comparing against a constant, and branching left/right. The tree makes static analysis impractical without scripted extraction.
Identification:
- Large number of similar functions with random-looking names (e.g., f315732804)
- Each function computes arithmetic on specific input positions
- Functions call other tree functions or a final validation function
- Decompiled code shows if (expr cmp constant) call_left() else call_right()
Ghidra headless scripting for mass extraction:
from ghidra.program.model.listing import *
from ghidra.program.model.symbol import *
fm = currentProgram.getFunctionManager()
results = []
for func in fm.getFunctions(True):
    name = func.getName()
    if name.startswith('f') and name[1:].isdigit():
        inst_iter = currentProgram.getListing().getInstructions(func.getBody(), True)
        for inst in inst_iter:
            if inst.getMnemonicString() == 'CMP':
                operand = inst.getOpObjects(1)
                if operand:
                    results.append((name, int(operand[0].getValue())))
Constraint propagation from known output format:
- Start from known output bytes (e.g., http://HTB{...}) → fix several input positions
- Fixed positions cascade through arithmetic constraints → determine dependent positions
- Tree root equation pins down remaining free variables
- Recognize English words in partial flag to disambiguate multiple solutions
Key insight: Auto-generated decision trees look overwhelming but are repetitive by construction. Script the extraction (Ghidra, Binary Ninja, radare2) rather than reversing each function manually. The tree is just a dispatcher — the real logic is in the leaf function and its constraints.
Detection: Binary with hundreds of similarly-structured functions, 3-5 input position references per function, branching to two other functions or a common leaf.
GLSL Shader VM with Self-Modifying Code (ApoorvCTF 2026)
Pattern (Draw Me): A WebGL2 fragment shader implements a Turing-complete VM on a 256x256 RGBA texture. The texture is both program memory and display output.
Texture layout:
- Row 0: Registers (pixel 0 = instruction pointer, pixels 1-32 = general purpose)
- Rows 1-127: Program memory (RGBA = opcode, arg1, arg2, arg3)
- Rows 128-255: VRAM (display output)
Opcodes: NOP(0), SET(1), ADD(2), SUB(3), XOR(4), JMP(5), JNZ(6), VRAM-write(7), STORE(8), LOAD(9). 16 steps per frame.
Self-modifying code: Phase 1 (decryption) uses STORE opcode to XOR-patch program memory that Phase 2 (drawing) then executes. The decryption overwrites SET instructions with correct pixel color values before the drawing code runs.
Why GPU rendering fails: The GPU runs all pixels in parallel per frame, but the shader tracks only ONE write target per pixel per frame. With multiple VRAM writes per frame, only the last survives — losing 75%+ of pixels. Similarly, STORE patches conflict during parallel decryption.
Solve via sequential emulation:
from PIL import Image
import numpy as np

img = Image.open('program.png').convert('RGBA')
state = np.array(img, dtype=np.int32).copy()
regs = [0] * 33
x, y = start_x, start_y  # instruction pointer start, taken from the shader source
while True:
    r, g, b, a = state[y][x]
    opcode = int(r)
    if opcode == 1:      # SET
        regs[g] = b & 255
    elif opcode == 4:    # XOR
        regs[g] = regs[b] ^ regs[a]
    elif opcode == 8:    # STORE: patch program memory
        tx, ty = regs[g], regs[b]
        state[ty][tx] = [regs[a], regs[a+1], regs[a+2], regs[a+3]]
    elif opcode == 5:    # JMP: this sketch stops here
        break
    # Remaining opcodes (ADD, SUB, JNZ, VRAM-write, LOAD) are omitted;
    # the VRAM-write handler is what fills vram below.
    x += 1
    if x > 255:
        x, y = 0, y + 1
vram = np.zeros((128, 256), dtype=np.uint8)
Image.fromarray(vram, mode='L').save('output.png')
Key insight: GLSL shaders are Turing-complete but GPU parallelism causes write conflicts. Self-modifying code (STORE patches) compounds the problem — patches from parallel executions overwrite each other. Sequential emulation in Python recovers the full output. The program.png file IS the bytecode.
Detection: WebGL/shader challenge with a PNG "program" file, challenge says "nothing renders" or output is garbled. Look for custom opcode tables in GLSL source.
GF(2^8) Gaussian Elimination for Flag Recovery (ApoorvCTF 2026)
Pattern (Forge): Stripped binary performs Gaussian elimination over GF(2^8) (Galois Field with 256 elements, using the AES polynomial). A matrix and augmentation vector are embedded in .rodata. The solution vector is the flag.
GF(2^8) arithmetic with AES polynomial (x^8+x^4+x^3+x+1 = 0x11b):
def gf_mul(a, b):
    """Multiply in GF(2^8) with AES reduction polynomial."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        hi = a & 0x80
        a = (a << 1) & 0xff
        if hi:
            a ^= 0x1b
        b >>= 1
    return p

def gf_inv(a):
    """Brute-force multiplicative inverse (fine for 256 elements)."""
    if a == 0:
        return 0
    for x in range(1, 256):
        if gf_mul(a, x) == 1:
            return x
    return 0
Solving the linear system:
N = 56
for col in range(N):
    pivot = next((r for r in range(col, N) if aug[r][col] != 0), -1)
    if pivot != col:
        aug[col], aug[pivot] = aug[pivot], aug[col]
    inv = gf_inv(aug[col][col])
    aug[col] = [gf_mul(v, inv) for v in aug[col]]
    for row in range(N):
        if row == col:
            continue
        factor = aug[row][col]
        if factor == 0:
            continue
        aug[row] = [v ^ gf_mul(factor, aug[col][j]) for j, v in enumerate(aug[row])]
flag = bytes(aug[i][N] for i in range(N))
Key insight: GF(2^8) is NOT regular integer arithmetic — addition is XOR, multiplication uses polynomial reduction. The AES polynomial (0x11b) is the most common; look for the constant 0x1b in disassembly. The binary may encrypt the result with AES-GCM afterward, but the raw solution vector (pre-encryption) is the flag.
Detection: Binary with a large matrix in .rodata (N² bytes), XOR-based row operations, constants 0x1b or 0x11b, and flag length matching sqrt of matrix size.
Z3 for Single-Line Python Boolean Circuit (BearCatCTF 2026)
Pattern (Captain Morgan): Single-line Python (2000+ semicolons) validates flag via walrus operator chains decomposing input as a big-endian integer, with bitwise operations producing a boolean circuit.
Identification:
- Single-line Python with semicolons separating statements
- Walrus operator := chains: (x := expr)
- Obfuscated XOR: (x | i) & ~(x & i) instead of x ^ i
- Input treated as a single large integer, decomposed via bit-shifting
Z3 solution:
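from z3 import *
n_bytes = 29
ari = BitVec('ari', n_bytes * 8)
s = Solver()
# bfu is the final expression obtained by translating each walrus
# assignment from the challenge into Z3 terms over ari (omitted here).
s.add(bfu == 0)
if s.check() == sat:
    m = s.model()
    val = m[ari].as_long()
    flag = val.to_bytes(n_bytes, 'big').decode('ascii')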
Key insight: Single-line Python obfuscation creates a boolean circuit over input bits. The walrus operator chains are just variable assignments — split on semicolons and translate each to Z3 symbolically. Obfuscated XOR (a | b) & ~(a & b) is just a ^ b. Z3 solves these circuits in under a second. Look for __builtins__ access or ord()/chr() calls to identify the input→integer conversion.
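Before translating to Z3 it can help to fold the obfuscated XOR back into `^`. The sketch below first checks the identity exhaustively on 8-bit values, then rewrites the pattern with a regular expression. It assumes both operands are plain identifiers, and `simplify_xor` is a hypothetical helper.

```python
import re

# Exhaustive check of the identity over all 8-bit operand pairs.
IDENTITY_HOLDS = all(((a | b) & ~(a & b)) == (a ^ b)
                     for a in range(256) for b in range(256))

# Matches "(x | i) & ~(x & i)" where both operands are simple names.
_XOR_PATTERN = re.compile(r"\((\w+)\s*\|\s*(\w+)\)\s*&\s*~\s*\(\1\s*&\s*\2\)")

def simplify_xor(expr):
    """Rewrite obfuscated XOR sub-expressions as (a ^ b)."""
    return _XOR_PATTERN.sub(r"(\1 ^ \2)", expr)
```

Running `simplify_xor` over each semicolon-separated statement shortens the expressions considerably; nested operands would need a real parser such as `ast`.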
Detection: Single-line Python with 1000+ semicolons, walrus operators, bitwise operations, and a final comparison to 0 or True.
Sliding Window Popcount Differential Propagation (BearCatCTF 2026)
Pattern (Treasure Hunt 4): Binary validates input via expected popcount (number of set bits) for each position of a 16-bit sliding window over the input bits.
Differential propagation:
When the window slides by 1 bit:
popcount(window[i+1]) - popcount(window[i]) = bit[i+16] - bit[i]
So: bit[i+16] = bit[i] + (data[i+1] - data[i])
expected = [...]
total_bits = 337 + 15
for start_val in range(0x10000):
    if bin(start_val).count('1') != expected[0]:
        continue
    bits = [0] * total_bits
    for j in range(16):
        bits[j] = (start_val >> (15 - j)) & 1
    valid = True
    for i in range(len(expected) - 1):
        new_bit = bits[i] + (expected[i + 1] - expected[i])
        if new_bit not in (0, 1):
            valid = False
            break
        bits[i + 16] = new_bit
    if valid:
        flag_bytes = bytes(int(''.join(map(str, bits[i:i+8])), 2)
                           for i in range(0, total_bits, 8))
        if b'BCCTF' in flag_bytes or flag_bytes[:5].isascii():
            print(flag_bytes.decode(errors='replace'))
            break
Key insight: Sliding window popcount differences create a recurrence relation: each new bit is determined by the bit 16 positions back plus the popcount delta. Only the first 16 bits are free, and they must match the initial popcount, which leaves at most C(16, 8) = 12,870 candidate windows. For each candidate the entire bit sequence is deterministic, so the search runs in under a second.
Detection: Binary computing popcount/hamming weight on fixed-size windows. Expected value array with length ≈ input_bits - window_size + 1. Values in array are small integers (0 to window_size).
Morse Code from Keyboard LEDs via ioctl (PlaidCTF 2013)
Pattern: Binary uses ioctl(fd, KDSETLED, value) to blink keyboard LEDs (Num/Caps/Scroll Lock). Timing patterns encode Morse code.
python3 -c "
data = open('binary','rb').read()
data = data[:0x72b] + b'\x90'*5 + data[0x730:]  # NOP the 5-byte ptrace call at 0x72b
open('patched','wb').write(data)
"
strace -e ioctl ./patched 2>&1 | grep KDSETLED > leds.txt
morse_map = {'.-':'A', '-...':'B', '-.-.':'C', '-..':'D', '.':'E',
             '..-.':'F', '--.':'G', '....':'H', '..':'I', '.---':'J',
             '-.-':'K', '.-..':'L', '--':'M', '-.':'N', '---':'O',
             '.--.':'P', '--.-':'Q', '.-.':'R', '...':'S', '-':'T',
             '..-':'U', '...-':'V', '.--':'W', '-..-':'X', '-.--':'Y',
             '--..':'Z', '-----':'0', '.----':'1', '..---':'2',
             '...--':'3', '....-':'4', '.....':'5', '-....':'6',
             '--...':'7', '---..':'8', '----.':'9'}
Key insight: KDSETLED controls physical keyboard LEDs on Linux (/dev/console). The binary must run with console access. Use strace -e ioctl to capture all LED state changes without needing physical observation. Timing between calls determines dot vs dash.
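Turning the captured LED events into text needs timestamps, for example from rerunning strace with `-ttt` or `-r`. The decoder below is a sketch: it assumes events are already parsed into `(time_ms, led_value)` pairs, that any nonzero value means "LED on", and that the 250 ms thresholds separate dot from dash and symbol gap from letter gap. All of these must be tuned per binary, and `decode_led_events` is a hypothetical helper.

```python
MORSE = {'.-': 'A', '-...': 'B', '-.-.': 'C', '-..': 'D', '.': 'E',
         '..-.': 'F', '--.': 'G', '....': 'H', '..': 'I', '.---': 'J',
         '-.-': 'K', '.-..': 'L', '--': 'M', '-.': 'N', '---': 'O',
         '.--.': 'P', '--.-': 'Q', '.-.': 'R', '...': 'S', '-': 'T',
         '..-': 'U', '...-': 'V', '.--': 'W', '-..-': 'X', '-.--': 'Y',
         '--..': 'Z', '-----': '0', '.----': '1', '..---': '2',
         '...--': '3', '....-': '4', '.....': '5', '-....': '6',
         '--...': '7', '---..': '8', '----.': '9'}

def decode_led_events(events, dash_min=250, letter_gap_min=250):
    """events: chronological (time_ms, led_value) pairs from the ioctl trace."""
    symbols, letters = "", []
    for (t0, value), (t1, _next_value) in zip(events, events[1:]):
        duration = t1 - t0
        if value:  # LED was on during this interval
            symbols += "-" if duration >= dash_min else "."
        elif duration >= letter_gap_min and symbols:
            # A long off period ends the current letter.
            letters.append(MORSE.get(symbols, "?"))
            symbols = ""
    if symbols:  # flush the final letter
        letters.append(MORSE.get(symbols, "?"))
    return "".join(letters)
```

Plotting a histogram of the on and off durations first makes the two thresholds easy to pick.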
See also: patterns-ctf.md for Part 1 (hidden emulator opcodes, SPN static extraction, image XOR smoothness, byte-at-a-time cipher, mathematical convergence bitmap, Windows PE XOR bitmap OCR, two-stage RC4+VM loaders, GBA ROM meet-in-the-middle, Sprague-Grundy game theory, kernel module maze solving, multi-threaded VM channels).
Binary Repurposed as Keystream Oracle (DEF CON 33 Quals 2025)
Pattern ("Dialects"): Challenge ships a modified crypto primitive (e.g. tweaked SM3 / SM4 round constants) and asks you to break a ciphertext. The canonical move in 2025+ is not to reverse the primitive's math — it's to patch the binary itself into a keystream-dumping oracle:
- Identify the core loop that produces round output / keystream bytes.
- Patch in a small trampoline: write those bytes to stdout (or a file) before the next round consumes them.
- Run the patched binary with attacker-chosen inputs.
- Use the dumped keystream / intermediate state to solve the challenge algebraically (often just XOR).
Why it beats reversing:
- Saves hours of math on a non-standard primitive.
- The binary author already implemented the primitive correctly — reuse their work.
- Works even when the primitive is obfuscated / VM-wrapped, as long as you can locate the I/O boundary.
Patch-in-place recipes:
- lief / radare2 / patchelf to inject a small write(1, state_ptr, 64) before ret.
- Or replace a call encrypt with call hook; call encrypt, where hook dumps state to a memfd.
Also relevant: protocol downgrade pattern used in Dialects — upgrade an in-stream socket to TLS mid-session to evade "flag-share" detectors in remote judges. If the remote binary trusts the peer after the upgrade, you can exfil state over the upgraded channel.
Source: blog.cloudlabs.ufscar.br/sec/defcon-quals-2025.
Unicorn Host/Guest Memory Divergence — see ctf-pwn
When a challenge uses the Unicorn engine as a sandbox, uc_mem_read() from the host controller does NOT trip guest memory hooks. Any primitive that turns a guest operation into a host read bypasses the sandbox. Full writeup: ctf-pwn/advanced-exploits-2.md — Unicorn Emulator Host/Guest Hook Divergence.
perf_event_open Instruction-Count Side-Channel Byte Oracle (source: idekCTF 2025 constructor)
Trigger: local flag-check binary where correct prefix runs longer (more retired instructions) than incorrect; perf stat -e instructions available, or perf_event_open permission.
Signals: binary compares chars in a loop with an early-exit; distinct branches on match vs mismatch; no explicit timing protection.
Mechanic: run perf stat -e instructions ./bin <candidate> per character hypothesis; correct prefix yields a deterministic extra N-instruction cost per matched byte. Byte-by-byte oracle at the hardware-counter level — more stable than wall time.
Automation:
for c in {a..z} {A..Z} {0..9} '{' '_' '}'; do
n=$(perf stat -e instructions -x, -- ./bin "${PREFIX}${c}" 2>&1 | tail -1 | cut -d',' -f1)
echo "$c $n"
done | sort -k2 -n | tail
VM Architecture Misidentification + Banned-Byte Synthesis (source: idekCTF 2025 Lazy VM)
Trigger: custom VM where the decompiler banner claims "register-based" but opcode dispatch actually goes through a single pop/push pair; additionally some opcode bytes are banned in the input bytecode file.
Signals: opcode handler functions each start with a stack pop and end with a push; dispatch table indexed by *pc++; challenge says "you cannot use bytes 0x61 0x66 0x67 0x6c".
Mechanic: re-decompile with stack semantics (forget the "register" framing). For banned opcodes, synthesise them from allowed ones: ADD K; SUB (K-target) to produce the forbidden opcode byte at runtime; or use XOR pairs to construct any byte from a small allowed alphabet. Generic rule: if ISA claims are inconsistent with handler shape, trust the handler; any byte-ban constraint is solvable via arithmetic synthesis.
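The arithmetic synthesis can be searched mechanically. A minimal sketch, assuming the VM exposes XOR and ADD/SUB on byte values and that the ban list is the one quoted above; the helper names are illustrative and not from the challenge:

```python
def xor_pair(target, allowed):
    """Return (a, b) from `allowed` with a ^ b == target, or None."""
    for a in sorted(allowed):
        b = a ^ target
        if b in allowed:
            return a, b
    return None

def add_sub_pair(target, allowed):
    """Return (k, d) from `allowed` with (k - d) & 0xFF == target, or None."""
    for k in sorted(allowed):
        d = (k - target) & 0xFF
        if d in allowed:
            return k, d
    return None

BANNED = {0x61, 0x66, 0x67, 0x6C}          # bytes the challenge forbids
ALLOWED = set(range(256)) - BANNED

for t in sorted(BANNED):
    a, b = xor_pair(t, ALLOWED)
    k, d = add_sub_pair(t, ALLOWED)
    print(f"{t:#04x}: XOR {a:#04x},{b:#04x}  |  ADD {k:#04x}; SUB {d:#04x}")
```

With a smaller allowed alphabet a single pair may not exist; chaining two or three operations widens the reachable set.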
CTF Reverse — Patterns (2025-2026 era, continued)
Third-era patterns. Base patterns in patterns-ctf.md; 2024-era additions in patterns-ctf-2.md.
Genetic Algorithm over Opaque Scoring Function (source: PlaidCTF 2025 Prospectin')
Trigger:
- ELF/PE where main() reads a small input (≤ 256 bytes), then feeds it through a long chain of if (input[i] op const) score += kN; else score += kM; style predicates, often dozens to hundreds of such checks.
- The binary prints success when score >= THRESHOLD (e.g. 0x119, 0x5f3); no obvious single-byte flag check, no hashing, just additive scoring.
- Ghidra decompilation shows a flat chain of arithmetic / bitwise / boolean checks with branch-free score += w updates that all depend on input[i].
Signals to grep:
objdump -d bin | awk '/add\s+\$0x[0-9a-f]+,\s*%e[a-z]x/' | wc -l # many score-bumping adds
grep -oE 'score\s*[+\-*]=\s*\w+' decompile.c | sort -u # diverse weights
# fitness-style control flow: single exit with if (score >= X) win();
Why not symbolic execution: angr chokes on hundreds of branches; z3 solve time blows up because there's no feasible conjunction of constraints — the scoring is additive, not equational. You don't need all checks to pass, just enough to hit the threshold.
Recipe:
- Lift the scoring loop to a callable. Two options:
  - (a) Ghidra → decompile → paste C → gcc -shared -fPIC → ctypes.CDLL.
  - (b) patchelf --add-needed a tiny shim, or use Qiling/Unicorn to wrap only the score() function as a Python callable.
  - Cross-arch? Decompile aarch64 → paste into amd64 .so; the scoring body is pure arithmetic and ports 1-1.
- GA driver:
import random, ctypes
scorer = ctypes.CDLL("./score.so").score
# INPUT_LEN is a placeholder: set it to the input size the binary expects
POP, GENS, LEN, THR = 500, 2000, INPUT_LEN, 0x5f3
def fitness(b):
    buf = ctypes.create_string_buffer(b, LEN)
    return scorer(buf, LEN)
def mutate(b):
    i = random.randrange(LEN)
    return b[:i] + bytes([random.randrange(256)]) + b[i+1:]
def crossover(a, b):
    k = random.randrange(LEN)
    return a[:k] + b[k:]
pop = [bytes(random.randrange(256) for _ in range(LEN)) for _ in range(POP)]
for gen in range(GENS):
    pop.sort(key=fitness, reverse=True)
    if fitness(pop[0]) >= THR:
        print(pop[0].hex()); break
    elite = pop[:POP//10]
    pop = elite + [mutate(crossover(random.choice(elite), random.choice(elite)))
                   for _ in range(POP - len(elite))]
- Tuning knobs that matter:
  - Tournament selection > elitism if the fitness landscape is rugged.
  - Byte-wise mutation rate 1-3% optimal; too high destroys converged structure, too low stalls.
  - When charset is restricted (hex digits, printable ASCII), constrain random.randrange(...) accordingly, a 5-10× speedup.
  - If the scorer has independent per-byte contributions, replace GA with hill-climbing: vary one byte at a time, keep if score doesn't decrease. Often 100× faster than GA and provably optimal for separable scoring.
- Detecting separability (skip GA):
  - Run the scorer with input b"A"*N, then flip byte i through 0..255. If the per-byte optimum is independent of other bytes, the scoring is separable, so solve each byte independently.
  - Build an N × 256 table best[i][v] = score contribution of byte i = v; pick the argmax over v for each position i.
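The separability probe can be sketched as follows. This assumes a Python-callable scorer (for example the ctypes wrapper above); `toy_score` is a hypothetical stand-in so the snippet runs on its own, and agreeing across a few baselines is a heuristic check, not a proof of separability:

```python
def toy_score(buf: bytes) -> int:
    # Hypothetical separable scorer: each position is scored independently.
    secret = b"k3y!"
    return sum(7 if b == s else (1 if b > s else 0) for b, s in zip(buf, secret))

def best_bytes(score, length, base=0x41):
    """Per-position argmax while every other byte stays at `base`."""
    baseline = bytes([base]) * length
    out = bytearray(baseline)
    for i in range(length):
        best_v, best_s = base, score(baseline)
        for v in range(256):
            s = score(baseline[:i] + bytes([v]) + baseline[i + 1:])
            if s > best_s:
                best_v, best_s = v, s
        out[i] = best_v
    return bytes(out)

def looks_separable(score, length, bases=(0x00, 0x41, 0xFF)):
    """If the per-byte optimum is the same under different baselines,
    the scorer is probably separable and GA is unnecessary."""
    return len({best_bytes(score, length, b) for b in bases}) == 1

print(looks_separable(toy_score, 4), best_bytes(toy_score, 4))
```

Each call to `best_bytes` costs 256 × N scorer invocations, which is usually far cheaper than a single GA generation.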
Generalizes to: license validators, "submit 32-byte key to unlock"-style crackmes with weighted scoring, ML-style classifier wrappers, puzzle games with score-based win conditions. First separability-probe, then hill-climb, then GA as fallback.
CTF Reverse - Competition-Specific Patterns (Part 1)
Hidden Emulator Opcodes + LD_PRELOAD Key Extraction (0xFun 2026)
Pattern (CHIP-8): Non-standard opcode FxFF triggers hidden superChipRendrer() → AES-256-CBC decryption. Key derived from binary constants.
Technique:
- Check all instruction dispatch branches for non-standard opcodes
- Hidden opcode may trigger crypto functions (OpenSSL)
- Use LD_PRELOAD hook on EVP_DecryptInit_ex to capture AES key at runtime:
#define _GNU_SOURCE            /* needed for RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>
#include <openssl/evp.h>
int EVP_DecryptInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *type,
                       ENGINE *impl, const unsigned char *key,
                       const unsigned char *iv) {
    if (key) {                 /* key may be NULL when only parameters are set */
        for (int i = 0; i < 32; i++) printf("%02x", key[i]);
        printf("\n");
    }
    return ((typeof(EVP_DecryptInit_ex)*)dlsym(RTLD_NEXT, "EVP_DecryptInit_ex"))
        (ctx, type, impl, key, iv);
}
gcc -shared -fPIC hook.c -o hook.so -ldl
LD_PRELOAD=./hook.so ./emulator rom.ch8
Spectre-RSB SPN Cipher — Static Parameter Extraction (0xFun 2026)
Pattern: Binary uses cache side channels to implement S-boxes, but ALL cipher parameters (round keys, S-box tables, permutation) are in the binary's data section.
Key insight: Don't try to run on special hardware. Extract parameters statically:
- 8 S-boxes × 8 output bits, 256 entries each
- Values 0x340 = bit 1, 0x100 = bit 0
- 64-byte permutation table, 8 round keys
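import struct
# data: raw bytes of the binary; sbox_offset: file offset of the tables (both from your own analysis)
sbox = [[0]*256 for _ in range(8)]
for i in range(8):
    for j in range(256):
        val = struct.unpack_from('<I', data, sbox_offset + (i*256+j)*4)[0]
        sbox[i][j] = 1 if val == 0x340 else 0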
Lesson: Side-channel implementations embed lookup tables in memory. Extract statically.
Image XOR Mask Recovery via Smoothness (VuwCTF 2025)
Pattern (Trianglification): Image divided into triangle regions, each XOR-encrypted with key = (mask * x - y) & 0xFF where mask is unknown (0-255).
Recovery: Natural images have smooth gradients. Brute-force mask (256 values per region), score by neighbor pixel differences:
import numpy as np
from PIL import Image
img = np.array(Image.open('encrypted.png'))
def score_smoothness(region_pixels, mask, positions):
    decrypted = []
    for (x, y), pixel in zip(positions, region_pixels):
        key = (mask * x - y) & 0xFF
        decrypted.append(int(pixel) ^ key)  # int() avoids uint8 wraparound in the differences below
    return -sum(abs(decrypted[i] - decrypted[i+1]) for i in range(len(decrypted)-1))
# regions / positions: per-triangle pixel values and their (x, y) coordinates, built from the challenge layout
for region in regions:
    best_mask = max(range(256), key=lambda m: score_smoothness(region, m, positions))
Search space: 256 candidates × N regions = trivial. Smoothness is a reliable scoring metric for natural images.
Shellcode in Data Section via mmap RWX (VuwCTF 2025)
Pattern (Missing Function): Binary relocates data to RWX memory (mmap with PROT_READ|PROT_WRITE|PROT_EXEC) and jumps to it.
Detection: Look for mmap with PROT_EXEC flag. Embedded shellcode often uses XOR with rotating key.
Analysis: Extract data section, apply XOR key (try 3-byte rotating), disassemble result.
Recursive execve Subtraction (VuwCTF 2025)
Pattern (String Inspector): Binary recursively calls itself via execve, subtracting constants each time.
Solution: Find base case and work backward. Often a mathematical relationship like N * M + remainder.
Byte-at-a-Time Block Cipher Attack (UTCTF 2024)
Pattern (PES-128): First output byte depends only on first input byte (no diffusion).
Attack: For each position, try all 256 byte values, compare output byte with target ciphertext. One match per byte = full plaintext recovery without knowing the key.
Detection: Change one input byte → only corresponding output byte changes. This means zero cross-byte diffusion = trivially breakable.
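A minimal sketch of the attack, assuming you can call the cipher as an oracle on chosen input. `toy_encrypt` is a hypothetical no-diffusion cipher used only so the example is self-contained; in practice the oracle would wrap the challenge binary or service:

```python
def toy_encrypt(pt: bytes) -> bytes:
    # Hypothetical cipher: output byte i depends only on input byte i and its index.
    return bytes((((b * 5 + 17 + i) & 0xFF) ^ 0x5A) for i, b in enumerate(pt))

def recover_plaintext(oracle, target: bytes) -> bytes:
    """Try all 256 values per position and keep the one whose output byte matches."""
    guess = bytearray(len(target))
    for i in range(len(target)):
        for v in range(256):
            guess[i] = v
            if oracle(bytes(guess))[i] == target[i]:
                break
        else:
            raise ValueError(f"no byte value matches at position {i}")
    return bytes(guess)

target = toy_encrypt(b"utflag{demo}")
print(recover_plaintext(toy_encrypt, target))
```

The cost is at most 256 oracle calls per byte, so a 32-byte flag needs at most 8192 encryptions.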
Mathematical Convergence Bitmap (EHAX 2026)
Pattern (Compute It): Binary classifies complex-plane coordinates by Newton's method convergence. The classification results, arranged as a grid, spell out the flag in ASCII art.
Recognition:
- Input file with coordinate pairs (x, y)
- Binary iterates a mathematical function (e.g., z^3 - 1 = 0) and outputs pass/fail
- Grid dimensions hinted by point count (e.g., 2600 = 130×20)
- 5-pixel-high ASCII art font common in CTFs
Newton's method for z^3 - 1:
def newton_converges_to_one(px, py, max_iter=50, target_count=12):
    """Returns True if Newton's method converges to z=1 in exactly target_count steps."""
    x, y = px, py
    count = 0
    for _ in range(max_iter):
        f_real = x**3 - 3*x*y**2 - 1.0
        f_imag = 3*x**2*y - y**3
        J_rr = 3.0 * (x**2 - y**2)
        J_ri = 6.0 * x * y
        det = J_rr**2 + J_ri**2
        if det < 1e-9:
            break
        x -= (f_real * J_rr + f_imag * J_ri) / det
        y -= (f_imag * J_rr - f_real * J_ri) / det
        count += 1
        if abs(x - 1.0) < 1e-6 and abs(y) < 1e-6:
            break
    return count == target_count
points = [(float(x), float(y)) for x, y in ...]
bits = [1 if newton_converges_to_one(px, py) else 0 for px, py in points]
WIDTH = 130
for r in range(len(bits) // WIDTH):
    print(''.join('#' if bits[r*WIDTH+c] else '.' for c in range(WIDTH)))
Key insight: The binary is a mathematical classifier, not a flag checker. The flag is in the visual pattern of classifications, not in the binary's output. Reverse-engineer the math, apply to all coordinates, and visualize as bitmap.
Windows PE XOR Bitmap Extraction + OCR (srdnlenCTF 2026)
Pattern (Artistic Warmup): Binary renders input text, compares rendered bitmap against expected pixel data stored XOR'd with constant in .rdata. No need to compute — extract expected pixels directly.
Attack:
- Reverse the core check function to identify rendering and comparison logic
- Find the expected pixel blob in .rdata (look for large data block referenced near comparison)
- XOR with constant (e.g., 0xAA) to recover expected rendered DIB
- Save as image and OCR to recover flag text
import numpy as np
from PIL import Image
with open("binary.exe", "rb") as f:
data = f.read()
blob_offset = 0xC3620
blob_size = 0x15F90
blob = np.frombuffer(data[blob_offset:blob_offset + blob_size], dtype=np.uint8)
expected = blob ^ 0xAA
img = expected.reshape(50, 450, 4)
channel = img[:, :, 0]
Image.fromarray(channel, "L").save("target.png")
import subprocess
result = subprocess.run(
["tesseract", "target.png", "stdout", "-c",
"tessedit_char_whitelist=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789{}_"],
capture_output=True, text=True)
print(result.stdout)
Key insight: When a binary renders text and compares pixels, the expected pixel data is the flag rendered as an image. Extract it directly from the binary data section without needing to understand the rendering logic. OCR with charset whitelist improves accuracy for CTF flag characters.
Two-Stage Loader: RC4 Gate + VM Constraints (srdnlenCTF 2026)
Pattern (Cornflake v3.5): Two-stage malware loader — stage 1 uses RC4 username gate, stage 2 downloaded from C2 contains VM-based password validation.
Stage 1 — RC4 username recovery:
def rc4(key, data):
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for b in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(b ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)
username = rc4(b"s3cr3t_k3y_v1", bytes.fromhex("46f5289437bc009c17817e997ae82bfbd065545d"))
Stage 2 — VM constraint extraction:
- Download stage 2 from C2 endpoint (e.g., /updates/check.php)
- Reverse VM bytecode interpreter (typically 15-20 opcodes)
- Extract linear equality constraints over flag characters
- Solve constraint system (Z3 or manual)
Key insight: Multi-stage loaders often use simple crypto (RC4) for the first gate and more complex validation (custom VM) for the second. The VM memory may be uninitialized (all zeros), drastically simplifying constraint extraction since memory-dependent operations become constants.
GBA ROM VM Hash Inversion via Meet-in-the-Middle (srdnlenCTF 2026)
Pattern (Dante's Trial): Game Boy Advance ROM implements a custom VM. Hash function uses FNV-1a variant with uninitialized memory (stays all zeros). Meet-in-the-middle attack splits the search space.
Hash function structure:
P = 0x100000001b3
CUP = 0x9e3779b185ebca87
MASK64 = (1 << 64) - 1
def fmix64(h):
    """Finalization mixer."""
    h ^= h >> 33; h = (h * 0xff51afd7ed558ccd) & MASK64
    h ^= h >> 33; h = (h * 0xc4ceb9fe1a85ec53) & MASK64
    h ^= h >> 33
    return h
def hash_input(chars, seed_lo=0x84222325, seed_hi=0xcbf29ce4):
    hlo, hhi, ptr = seed_lo, seed_hi, 0
    for c in chars:
        delta = ((ord(c) * CUP) ^ (0 * P)) & MASK64
        hlo = ((hlo ^ (delta & 0xFFFFFFFF)) * (P & 0xFFFFFFFF)) & 0xFFFFFFFF
        hhi = ((hhi ^ (delta >> 32)) * (P >> 32)) & 0xFFFFFFFF
        ptr = (ptr + 1) & 0xFF
    combined = ((hhi << 32) | (hlo ^ ptr)) & MASK64
    return fmix64((combined * P) & MASK64)
Meet-in-the-middle attack:
import string
TARGET = 0x73f3ebcbd9b4cd93
LENGTH = 6
SPLIT = 3
charset = [c for c in string.printable if 32 <= ord(c) < 127]
# seed, hash_forward, hash_backward, invert_fmix64: helpers derived from hash_input above (not shown here)
forward = {}
for c1 in charset:
    for c2 in charset:
        for c3 in charset:
            state = hash_forward(seed, [c1, c2, c3])
            forward[state] = c1 + c2 + c3
inv_target = invert_fmix64(TARGET)
for c4 in charset:
    for c5 in charset:
        for c6 in charset:
            state = hash_backward(inv_target, [c4, c5, c6])
            if state in forward:
                print(f"Found: {forward[state]}{c4}{c5}{c6}")
Key insight: Meet-in-the-middle reduces search from 95^6 ≈ 7.4×10^11 to 2×95^3 ≈ 1.7×10^6 — a factor of ~430,000x speedup. Critical when the hash function is invertible from the output side (i.e., fmix64 and the final multiply can be undone). Also: uninitialized VM memory that stays zero simplifies the hash function by removing a variable.
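The attack calls invert_fmix64 without showing it. One way to write it, as a sketch: each `h ^= h >> 33` step is its own inverse on 64-bit values (the shifted-in bits never overlap after a second application), and each multiply is undone with the modular inverse of the odd constant. This assumes Python 3.8+ for `pow(x, -1, m)`:

```python
MASK64 = (1 << 64) - 1
C1 = 0xff51afd7ed558ccd
C2 = 0xc4ceb9fe1a85ec53

def fmix64(h):
    h ^= h >> 33; h = (h * C1) & MASK64
    h ^= h >> 33; h = (h * C2) & MASK64
    h ^= h >> 33
    return h

def unshift33(h):
    # For 64-bit h, (h ^ h>>33) ^ ((h ^ h>>33) >> 33) == h because h >> 66 == 0.
    return h ^ (h >> 33)

C1_INV = pow(C1, -1, 1 << 64)   # odd constants are invertible mod 2^64
C2_INV = pow(C2, -1, 1 << 64)

def invert_fmix64(h):
    h = unshift33(h)
    h = (h * C2_INV) & MASK64
    h = unshift33(h)
    h = (h * C1_INV) & MASK64
    return unshift33(h)

x = 0x0123456789abcdef
print(hex(invert_fmix64(fmix64(x))))
```

The final `combined * P` multiply in hash_input can be undone the same way, since the FNV prime is also odd.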
Sprague-Grundy Game Theory Binary (DiceCTF 2026)
Pattern (Bedtime): Stripped Rust binary plays N rounds of bounded Nim. Each round has piles and max-move parameter k. Binary uses a PRNG for moves when in a losing position; user must respond optimally so the PRNG eventually generates an invalid move (returns 1). Sum of return values must equal a target.
Game theory identification:
- Bounded Nim: remove 1 to k items from any pile per turn
- Grundy value per pile: pile_value % (k+1)
- XOR of all Grundy values: non-zero = winning (N-position), zero = losing (P-position)
- N-positions: computer wins automatically (returns 0)
- P-positions: computer uses PRNG, may make invalid move (returns 1)
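The classification and the optimal reply can be sketched directly from the rules above. This is a generic bounded-Nim helper, not code lifted from the binary; the pile values in the example are made up:

```python
from functools import reduce
from operator import xor

def grundy(piles, k):
    """XOR of per-pile Grundy values for bounded Nim (remove 1..k per turn)."""
    return reduce(xor, (p % (k + 1) for p in piles), 0)

def winning_move(piles, k):
    """Return (pile_index, amount) that leaves a P-position, or None if already losing."""
    g = grundy(piles, k)
    if g == 0:
        return None
    for i, p in enumerate(piles):
        r = p % (k + 1)
        want = r ^ g
        if want < r:              # lowering this pile's residue to `want` zeroes the XOR
            return i, r - want    # 1 <= r - want <= k by construction
    return None

piles, k = [3, 4, 5], 3
print(grundy(piles, k), winning_move(piles, k))
```

Applying the returned move and recomputing `grundy` should always give 0, which is a cheap self-check while simulating the feedback loop.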
PRNG state tracking through user feedback:
MASK64 = (1 << 64) - 1
def prng_step(state, pile_count, k):
"""Computer's PRNG move. Returns (pile_idx, amount, new_state)."""
r12 = state[2] ^ 0x28027f28b04ccfa7
rax = (state[1] + r12) & MASK64
s0_new = ROL64((state[0] ** 2 + rax) & MASK64, 32)
r12_upd = (r12 + rax) & MASK64
s0_final = ROL64((s0_new ** 2 + r12_upd) & MASK64, 32)
pile_idx = rax % pile_count
amount = (r12_upd % k) + 1
return pile_idx, amount, [s0_final, r12_upd, state[2]]
Solving approach:
- Dump game data from GDB (all entries with pile values and parameters)
- Classify: count P-positions (return 1) vs N-positions (return 0)
- Simulate each P-position: PRNG moves → user responds optimally → track state[2]
- Encode user moves as input format (4-digit decimal pairs, reversed order)
Key insight: When a game binary's PRNG state depends on user input, you must simulate the full feedback loop — not just solve the game theory. Use GDB hardware watchpoints to discover which state variables are affected by user vs computer moves.
Kernel Module Maze Solving (DiceCTF 2026)
Pattern (Explorer): Rust kernel module implements a 3D maze via /dev/challenge ioctls. Navigate the maze, avoid decoy exits (status=2), find the real exit (status=1), read the flag.
Ioctl enumeration:
| Command | Description |
|---|---|
| 0x80046481-83 | Get maze dimensions (3 axes, 8-16 each) |
| 0x80046485 | Get status: 0=playing, 1=WIN, 2=decoy |
| 0x80046486 | Get wall bitfield (6 directions) |
| 0x80406487 | Get flag (64 bytes, only when status=1) |
| 0x40046488 | Move in direction (0-5) |
| 0x6489 | Reset position |
DFS solver with decoy avoidance:
int visited[16][16][16];
int bad[16][16][16];
void dfs(int fd, int x, int y, int z) {
if (visited[x][y][z] || bad[x][y][z]) return;
visited[x][y][z] = 1;
int status = ioctl_get_status(fd);
if (status == 1) { read_flag(fd); exit(0); }
if (status == 2) { bad[x][y][z] = 1; return; }
int walls = ioctl_get_walls(fd);
int dx[] = {1,-1,0,0,0,0}, dy[] = {0,0,1,-1,0,0}, dz[] = {0,0,0,0,1,-1};
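int opp[] = {1,0,3,2,5,4}; /* opposite of each dir, consistent with dx/dy/dz above; adjust if the module numbers directions differently */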
for (int dir = 0; dir < 6; dir++) {
if (!(walls & (1 << dir))) continue;
ioctl_move(fd, dir);
dfs(fd, x+dx[dir], y+dy[dir], z+dz[dir]);
ioctl_move(fd, opp[dir]);
}
}
Remote deployment: Upload binary via base64 chunks over netcat shell, decode, execute.
Key insight: For kernel module challenges, injecting test binaries into initramfs and probing ioctls dynamically is faster than static RE of stripped kernel modules. Keep solver binary minimal (raw syscalls, no libc) for fast upload.
Multi-Threaded VM with Channel Synchronization (DiceCTF 2026)
Pattern (locked-in): Custom stack-based VM runs 16 concurrent threads verifying a 30-char flag. Threads communicate via futex-based channels. Pipeline: input → XOR scramble → transformation → base-4 state machine → final check.
Analysis approach:
- Identify thread roles by tracing channel read/write patterns in GDB
- Extract constants (XOR scramble values, lookup tables) via breakpoints on specific opcodes
- Watch for inverted logic: validity check returns 0 for valid, non-zero for blocked (opposite of intuition)
- Detect futex quirks: unlock_pi on unowned mutex returns EPERM=1, which can change all computations
BFS state space search for constrained state machines:
from collections import deque
def solve_flag(scramble_vals, lookup_table, initial_state, target_state):
    """BFS through state machine to find valid flag bytes."""
    flag = [None] * 30
    flag[0:5] = list(b'dice{')
    flag[29] = ord('}')
    states = {initial_state}
    for pos in range(28, 4, -1):
        next_states = {}
        for state in states:
            for ch in range(32, 127):
                transformed = transform(ch, scramble_vals[pos])
                digits = to_base4(transformed)
                new_state = apply_digits(state, digits, lookup_table)
                if new_state is not None:
                    next_states.setdefault(new_state, []).append((state, ch))
        states = set(next_states.keys())
Key insight: Multi-threaded VMs require tracing data flow across thread boundaries. Channel-based communication creates a pipeline — identify each thread's role (input, transform, validate, output) by watching which channels it reads/writes. Constants that affect computation may come from unexpected sources (futex return values, thread IDs).
Backdoored Shared Library Detection via String Diffing (Hack.lu CTF 2012)
Pattern (Zombie Lockbox): A setuid binary uses strcmp for password validation. The expected password is visible via strings and works under GDB (which drops suid), but fails when run normally. The binary links against a non-standard libc that patches function behavior based on suid status.
Detection steps:
- Check for non-standard library paths with ldd:
ldd ./binary
- Diff strings between the suspicious and system libc:
strings /lib/libc/libc.so.6 > suspicious_strings
strings /lib32/libc-2.15.so > normal_strings
diff suspicious_strings normal_strings
- Disassemble the patched function (e.g., puts) to find injected code:
gdb /lib/libc/libc.so.6
(gdb) disas puts
Key insight: When a binary behaves differently under GDB vs. normal execution, check ldd for non-standard library paths. Suid binaries drop privileges under debuggers, so a backdoored libc can detect this via getuid/geteuid syscalls and change program behavior accordingly. The strings | diff approach quickly reveals injected data without full disassembly.
See also: patterns-ctf-2.md for Part 2 (multi-layer self-decrypting binary, embedded ZIP+XOR license, stack string deobfuscation, prefix hash brute-force, CVP/LLL lattice, decision tree obfuscation, GLSL shader VM, GF(2^8) Gaussian elimination, Z3 boolean circuit, sliding window popcount).
CTF Reverse - Patterns & Techniques
Custom VM Reversing
Analysis Steps
- Identify VM structure: registers, memory, instruction pointer
- Reverse executeIns/runvm function for opcode meanings
- Write a disassembler to parse bytecode
- Decompile disassembly to understand algorithm
Common VM Patterns
switch (opcode) {
case 1: *R[op1] *= op2; break;
case 2: *R[op1] -= op2; break;
case 3: *R[op1] = ~*R[op1]; break;
case 4: *R[op1] ^= mem[op2]; break;
case 5: *R[op1] = *R[op2]; break;
case 7: if (R0) IP += op1; break;
case 8: putc(R0); break;
case 10: R0 = getc(); break;
}
RVA-Based Opcode Dispatching
- Opcodes are RVAs pointing to handler functions
- Handler performs operation, reads next RVA, jumps
- Map all handlers by following RVA chain
State Machine VMs (90K+ states)
var agenda = new ArrayDeque<State>();
agenda.add(new State(0, ""));
while (!agenda.isEmpty()) {
var current = agenda.remove();
if (current.path.length() == TARGET_LENGTH) {
println(current.path);
continue;
}
for (var transition : machine.get(current.state).entrySet()) {
agenda.add(new State(transition.getValue(),
current.path + (char)transition.getKey()));
}
}
Anti-Debugging Techniques
Common Checks
- IsDebuggerPresent() (Windows)
- ptrace(PTRACE_TRACEME) (Linux)
- /proc/self/status TracerPid
- Timing checks (rdtsc, time())
- Registry checks (Windows)
Bypass Technique
- Identify test instructions after debug checks
- Set breakpoint at the test
- Modify register to bypass conditional
db 0x401234
dc
dr eax=0
dc
LD_PRELOAD Hook
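#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdarg.h>
#include <sys/ptrace.h>
#include <sys/types.h>
long int ptrace(enum __ptrace_request req, ...) {
    if (req == PTRACE_TRACEME) return 0;  /* pretend no debugger is attached */
    va_list ap;
    va_start(ap, req);                     /* other requests: forward unchanged */
    pid_t pid = va_arg(ap, pid_t);
    void *addr = va_arg(ap, void *);
    void *data = va_arg(ap, void *);
    va_end(ap);
    long int (*orig)(enum __ptrace_request, pid_t, void *, void *) = dlsym(RTLD_NEXT, "ptrace");
    return orig(req, pid, addr, data);
}
Compile: gcc -shared -fPIC hook.c -o hook.so -ldl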
Run: LD_PRELOAD=./hook.so ./binary
pwntools Binary Patching (Crypto-Cat)
Patch out anti-debug calls directly using pwntools — replaces function with ret instruction:
from pwn import *
elf = ELF('./challenge', checksec=False)
elf.asm(elf.symbols.ptrace, 'ret')
elf.save('patched')
Other common patches:
elf.asm(addr, 'nop')
elf.asm(addr, 'xor eax, eax; ret')
elf.asm(addr, 'mov eax, 1; ret')
Nanomites
Linux (Signal-Based)
SIGTRAP (int 3) → Custom operation
SIGILL (ud2) → Custom operation
SIGFPE (idiv 0) → Custom operation
SIGSEGV (null deref) → Custom operation
Windows (Debug Events)
- EXCEPTION_DEBUG_EVENT → Main handler
- Parent modifies child via PTRACE_POKETEXT
- Magic markers: 0x1337BABE, 0xDEADC0DE
Analysis
- Check for fork() + ptrace(PTRACE_TRACEME)
- Find WaitForDebugEvent loop
- Map EAX values to operations
- Log operations to reconstruct algorithm
Self-Modifying Code
Pattern: XOR Decryption
lea rax, next_block
mov dl, [rcx] ; Input char
xor_loop:
xor [rax+rbx], dl
inc rbx
cmp rbx, BLOCK_SIZE
jnz xor_loop
jmp rax ; Execute decrypted
Solution: Known opcode at block start reveals XOR key (flag char).
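A minimal sketch of that recovery, assuming a single-byte XOR key per block and that the decrypted block starts with a common function opener. The opener list and the sample block are illustrative assumptions, not taken from a specific challenge:

```python
# Typical first bytes of x86-64 code: push rbp, endbr64 prefix, REX.W prefix.
OPENERS = (0x55, 0xF3, 0x48)

def candidate_keys(encrypted_block: bytes, openers=OPENERS):
    """Return printable key characters that decrypt byte 0 to a plausible opcode."""
    keys = []
    for op in openers:
        k = encrypted_block[0] ^ op
        if 32 <= k < 127:
            keys.append(chr(k))
    return keys

# Hypothetical block: 'push rbp; mov rbp, rsp' XORed with the flag char 'f'.
block = bytes(b ^ ord("f") for b in bytes.fromhex("554889e5"))
print(candidate_keys(block))
```

When several candidates survive, decrypt the whole block with each one and keep the key whose output disassembles cleanly.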
Known-Plaintext XOR (Flag Prefix)
Pattern: Encrypted bytes given; flag format known (e.g., 0xL4ugh{).
Approach:
- Assume repeating XOR key.
- Use known prefix (and any hint phrase) to recover key bytes.
- Try small key lengths and validate printable output.
enc = bytes.fromhex("...")
known = b"0xL4ugh{say_yes_to_me"
for klen in range(2, 33):
    key = bytearray(klen)
    ok = True
    for i, b in enumerate(known):
        if i >= len(enc):
            break
        ki = i % klen
        v = enc[i] ^ b
        if key[ki] != 0 and key[ki] != v:
            ok = False
            break
        key[ki] = v
    if not ok:
        continue
    pt = bytes(enc[i] ^ key[i % klen] for i in range(len(enc)))
    if all(32 <= c < 127 for c in pt):
        print(klen, key, pt)
Note: Challenge hints often appear verbatim in the flag body (e.g., "say_yes_to_me").
Variant: XOR with Position Index
Pattern: cipher[i] = plain[i] ^ key[i % k] ^ i (or ^ (i & 0xff)).
Symptoms:
- Repeating-key XOR almost fits known prefix but breaks at later positions
- XOR with known prefix yields a "key" that changes by +1 per index
Fix: Remove index first, then recover key with known prefix.
enc = bytes.fromhex("...")
known = b"0xL4ugh{say_yes_to_me"
for klen in range(2, 33):
    key = bytearray(klen)
    ok = True
    for i, b in enumerate(known):
        if i >= len(enc):
            break
        ki = i % klen
        v = (enc[i] ^ i) ^ b
        if key[ki] != 0 and key[ki] != v:
            ok = False
            break
        key[ki] = v
    if not ok:
        continue
    pt = bytes((enc[i] ^ i) ^ key[i % klen] for i in range(len(enc)))
    if all(32 <= c < 127 for c in pt):
        print(klen, key, pt)
Mixed-Mode (x86-64 / x86) Stagers
Pattern: 64-bit ELF jumps into a 32-bit blob via far return (retf/retfq), often after anti-debug.
Identification:
- Bytes 0xCB (retf) or 0xCA (retf imm16), sometimes preceded by 0x48 (retfq)
- 32-bit disasm shows SSE ops (psubb, pxor, paddb) in a tight loop
- Computed jumps into the 32-bit region
Gotchas:
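- retf uses only 6 bytes of the frame (4-byte EIP + 2-byte CS), but CS sits in its own 4-byte stack slot, so the stack pointer advances by 8; retfq pops two 8-byte slots (RIP, then CS)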
- 32-bit blob may rely on inherited XMM state and EFLAGS
- Missing XMM/flags transfer when switching emulators yields wrong output
Bypass/Emulation Tips:
- Create a UC_MODE_32 emulator, copy memory + GPRs, EFLAGS, and XMM regs
- Run 32-bit block, then copy memory + regs back to 64-bit
- If anti-debug uses fork/ptrace + patching, emulate parent to log POKEs and apply them in child
LLVM Obfuscation (Control Flow Flattening)
Pattern
while (1) {
if (i == 0xA57D3848) { }
if (i != 0xA5AA2438) break;
i = 0x39ABA8E6;
}
De-obfuscation
- GDB script to break at je instructions
- Log state variable values
- Map state transitions
- Reconstruct true control flow
S-Box / Keystream Generation
Fisher-Yates Shuffle (Xorshift32)
def gen_sbox():
    sbox = list(range(256))
    state = SEED
    for i in range(255, -1, -1):
        state = ((state << 13) ^ state) & 0xffffffff
        state = ((state >> 17) ^ state) & 0xffffffff
        state = ((state << 5) ^ state) & 0xffffffff
        j = state % (i + 1) if i > 0 else 0
        sbox[i], sbox[j] = sbox[j], sbox[i]
    return sbox
Xorshift64* Keystream
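def gen_keystream():
    ks = []
    state = SEED_64
    mul = 0x2545f4914f6cdd1d
    for _ in range(256):
        state ^= (state >> 12)
        # Mask after the left shift: Python ints are unbounded, and stray high bits
        # would leak back into the low 64 bits through the >> 27 step.
        state ^= (state << 25) & 0xffffffffffffffff
        state ^= (state >> 27)
        state = (state * mul) & 0xffffffffffffffff
        ks.append((state >> 56) & 0xff)
    return ks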
Identifying Patterns
- Xorshift32: shifts 13, 17, 5 (no multiplication constant)
- Xorshift64*: shifts 12, 25, 27, then multiply by 0x2545f4914f6cdd1d
- Other common constant: 0x9e3779b97f4a7c15 (golden ratio)
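The multiplier constants can be located automatically. A rough sketch, assuming the constants appear as little-endian 64-bit immediates (as with x86-64 movabs) or as data; the dictionary holds only the two constants named above and the blob is synthetic:

```python
import struct

KNOWN = {
    "xorshift64* multiplier": 0x2545f4914f6cdd1d,
    "golden ratio (64-bit)": 0x9e3779b97f4a7c15,
}

def find_constants(blob: bytes):
    """Return (name, offset) for every little-endian occurrence of a known constant."""
    hits = []
    for name, value in KNOWN.items():
        needle = struct.pack("<Q", value)
        pos = blob.find(needle)
        while pos != -1:
            hits.append((name, pos))
            pos = blob.find(needle, pos + 1)
    return hits

# Synthetic blob: 4 NOPs, the xorshift64* multiplier, then ret.
blob = b"\x90" * 4 + struct.pack("<Q", 0x2545f4914f6cdd1d) + b"\xc3"
print(find_constants(blob))
```

Shift amounts are harder to grep for because they are single-byte immediates, so a hit here is best followed by checking the neighbouring shr/shl operands in the disassembly.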
SECCOMP/BPF Filter Analysis
seccomp-tools dump ./binary
BPF Analysis
- A = sys_number followed by comparisons
- mem[N] = A, A = mem[N] for memory ops
- Map to constraint equations, solve with z3
from z3 import *
flag = [BitVec(f'c{i}', 32) for i in range(14)]
s = Solver()
s.add(flag[0] >= 0x20, flag[0] < 0x7f)
if s.check() == sat:
m = s.model()
print(''.join(chr(m[c].as_long()) for c in flag))
Exception Handler Obfuscation
RtlInstallFunctionTableCallback
- Dynamic exception handler registration
- Handler installs new handler, modifies code
- Use x64dbg with exception handler breaks
Vectored Exception Handlers (VEH)
- AddVectoredExceptionHandler installs handler
- Handler decrypts code at exception address
- Step through, dump decrypted code
Memory Dump Analysis
When Binary Dumps Memory
- Check for /proc/self/maps reads
- Check for /proc/self/mem reads
- Heap data often appended to dump
Known Plaintext Attack
prologue = bytes([0xf3, 0x0f, 0x1e, 0xfa, 0x55, 0x48, 0x89, 0xe5])
encrypted = data[func_offset:func_offset+8]
partial_key = bytes(a ^ b for a, b in zip(encrypted, prologue))
Byte-Wise Uniform Transforms
Pattern: Output buffer depends on each input byte independently (no cross-byte coupling).
Detection:
- Change one input position → only one output position changes
- Fill input with a single byte → output buffer becomes constant
Solve:
- For each byte value 0..255, run the program with that byte repeated
- Record output byte → build mapping and inverse mapping
- Apply inverse mapping to static target bytes to recover the flag
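The solve steps can be sketched as below. `toy_transform` is a hypothetical stand-in for running the program on a chosen input; in a real solve it would be replaced by a subprocess or emulator call that returns the output buffer:

```python
def toy_transform(data: bytes) -> bytes:
    # Hypothetical per-byte transform with no cross-byte coupling.
    return bytes((((b ^ 0x37) * 3 + 1) & 0xFF) for b in data)

def build_inverse(run, width=1):
    """Feed each byte value repeated `width` times and record output -> input."""
    inverse = {}
    for v in range(256):
        out = run(bytes([v]) * width)
        inverse[out[0]] = v
    if len(inverse) != 256:
        raise ValueError("transform is not a bijection on bytes; inverse is ambiguous")
    return inverse

inv = build_inverse(toy_transform)
target = toy_transform(b"flag{demo}")      # stands in for the static target bytes
print(bytes(inv[b] for b in target))
```

If the transform also depends on the position, build one table per index instead of a single shared one.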
x86-64 Gotchas
Sign Extension
esi = 0xffffffc7
esi_xor = esi & 0xff
r12 = (r13 + esi) & 0xffffffff
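The three lines above model a byte that was sign-extended into esi (for example by movsx). A small sketch of how to reproduce that in Python, where integers are unbounded and the masks have to be written out; the value of r13 is made up:

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

byte_in = 0xC7
esi = sign_extend(byte_in, 8) & 0xFFFFFFFF   # movsx esi, byte -> 0xffffffc7
esi_xor = esi & 0xFF                         # byte-sized uses see only 0xc7
r13 = 0x100                                  # hypothetical prior state
r12 = (r13 + esi) & 0xFFFFFFFF               # 32-bit add wraps, i.e. r13 - 57

print(hex(esi), hex(esi_xor), hex(r12))
```

Forgetting the sign extension gives r13 + 0xc7 instead of r13 - 0x39, which is a common reason a re-implementation diverges only for input bytes ≥ 0x80.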
Loop Boundary State Updates
Assembly often splits state updates across loop boundaries:
jmp loop_middle ; First iteration in middle!
loop_top: ; State for iterations 2+
mov r13, sbox[a & 0xf]
; Uses OLD 'a', not new!
loop_middle:
; Main computation
inc a
jne loop_top
Custom Mangle Function Reversing
Pattern (Flag Appraisal): Binary mangles input 2 bytes at a time with intermediate state, compares to static target.
Approach:
- Extract static target bytes from .rodata section
- Understand mangle: processes pairs with running state value
- Write inverse function (process in reverse, undo each operation)
- Feed target bytes through inverse → recovers flag
Position-Based Transformation Reversing
Pattern (PascalCTF 2026): Binary transforms input by adding/subtracting position index.
Reversing:
expected = [...]
flag = ''
for i, b in enumerate(expected):
if i % 2 == 0:
flag += chr(b - i)
else:
flag += chr(b + i)
Hex-Encoded String Comparison
Pattern (Spider's Curse): Input converted to hex, compared against hex constant.
Quick solve: Extract hex constant from strings/Ghidra, decode:
echo "4d65746143..." | xxd -r -p
Signal-Based Binary Exploration
Pattern (Signal Signal Little Star): Binary uses UNIX signals as a binary tree navigation mechanism.
Identification:
- Multiple sigaction() calls with SA_SIGINFO
- sigaltstack() setup (alternate signal stack)
- Handler decodes embedded payload, installs next pair of signals
- Two types: Node (installs children) vs Leaf (prints message + exits)
Solving approach:
- Hook sigaction via LD_PRELOAD to log signal installations
- DFS through the binary tree by sending signals
- At each stage, observe which 2 signals are installed
- Send one, check if program exits (leaf) or installs 2 more (node)
- If wrong leaf, backtrack and try sibling
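#define _GNU_SOURCE
#include <dlfcn.h>
#include <signal.h>
#include <stdio.h>
int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact) {
    int (*real_sigaction)(int, const struct sigaction *, struct sigaction *) = dlsym(RTLD_NEXT, "sigaction");
    if (act && (act->sa_flags & SA_SIGINFO))
        fprintf(stderr, "SET %d SA_SIGINFO=1\n", signum);  /* log to stderr, keep stdout clean */
    return real_sigaction(signum, act, oldact);
}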
Malware Anti-Analysis Bypass via Patching
Pattern (Carrot): Malware with multiple environment checks before executing payload.
Common checks to patch:
| Check | Technique | Patch |
|---|---|---|
| ptrace(PTRACE_TRACEME) | Anti-debug | Change cmp -1 to cmp 0 |
| sleep(150) | Anti-sandbox timing | Change sleep value to 1 |
| /proc/cpuinfo "hypervisor" | Anti-VM | Flip JNZ to JZ |
| "VMware"/"VirtualBox" strings | Anti-VM | Flip JNZ to JZ |
| getpwuid username check | Environment | Flip comparison |
| LD_PRELOAD check | Anti-hook | Skip check |
| Fan count / hardware check | Anti-VM | Flip JLE to JGE |
| Hostname check | Environment | Flip JNZ to JZ |
Ghidra patching workflow:
- Find check function, identify the conditional jump
- Click on instruction → Ctrl+Shift+G → modify opcode
- For JNZ (0x75) → JZ (0x74), or vice versa
- For immediate values: change operand bytes directly
- Export: press O → choose "Original File" format
- chmod +x the patched binary
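When the patch is a plain branch flip, the same edit can be scripted instead of clicked through. The sketch below relies on the x86 encoding fact that condition codes come in complementary pairs differing only in the low opcode bit (0x74/0x75, 0x7E/0x7F, and so on). `flip_jcc` is a hypothetical helper name, and `offset` is a file offset, not a virtual address.

```python
def flip_jcc(image: bytearray, offset: int) -> None:
    """Invert the conditional jump at file offset `offset`, in place."""
    op = image[offset]
    if 0x70 <= op <= 0x7F:                                   # short Jcc rel8
        image[offset] = op ^ 1
    elif op == 0x0F and 0x80 <= image[offset + 1] <= 0x8F:   # near Jcc rel32
        image[offset + 1] ^= 1
    else:
        raise ValueError(f"no Jcc at {offset:#x} (found {op:#04x})")


# Demo on an in-memory buffer: JNZ rel8 followed by JZ rel32.
buf = bytearray(b"\x75\x10\x0f\x84\x00\x00\x00\x00")
flip_jcc(buf, 0)
flip_jcc(buf, 2)
print(buf.hex())
```

For a real binary, read the file into a `bytearray`, call the helper at each check's file offset, and write the result back out.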
Server-side validation bypass:
- If patched binary sends system info to remote server, patch the data too
- Modify string addresses in data-gathering functions
- Change format strings to embed correct values directly
Multi-Stage Shellcode Loaders
Pattern (I Heard You Liked Loaders): Nested shellcode with XOR decode loops and anti-debug.
Debugging workflow:
- Break at call rax in launcher, step into shellcode
- Bypass ptrace anti-debug: step over the syscall instruction, then set $rax=0
- Step through XOR decode loop (or break on int3 if hidden)
- Repeat for each stage until final payload
Flag extraction from mov instructions:
values = [0x6174654d, 0x7b465443, ...]
flag = b''.join(v.to_bytes(4, 'little') for v in values)
Timing Side-Channel Attack
Pattern (Clock Out): Validation time varies per correct character (longer sleep on match).
Exploitation:
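import string
import time
from pwn import *

# host, port, flag_length and total_len are challenge-specific placeholders
flag = ""
for pos in range(flag_length):
    best_char, best_time = '', 0
    # strip() drops the trailing whitespace characters (including '\n') from string.printable
    for c in string.printable.strip():
        io = remote(host, port)
        start = time.time()
        io.sendline((flag + c).ljust(total_len, 'X').encode())
        io.recvall()
        elapsed = time.time() - start
        if elapsed > best_time:
            best_time = elapsed
            best_char = c
        io.close()
    flag += best_char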
Multi-Thread Anti-Debug with Decoy + Signal Handler MBA (ApoorvCTF 2026)
Pattern (A Golden Experience Requiem): Multi-threaded binary with layered anti-analysis: Thread 1 performs decoy operations (fake AES + deliberate crash via ud2), Thread 2 does the real flag computation in a SIGSEGV signal handler using Mixed Boolean Arithmetic (MBA), Thread 3 erases memory to prevent post-mortem analysis.
Thread layout:
| Thread | Purpose | Trap |
|---|---|---|
| Thread 1 | Decoy: AES-looking operations → ud2 crash | Analysts waste time reversing fake crypto |
| Thread 2 | Real flag: SIGSEGV handler with MBA transforms | Hidden in signal handler, not main code path |
| Thread 3 | Memory eraser: zeros out flag data after computation | Prevents memory dumping |
| Main | rdtsc-based anti-debug timing check | Penalizes debugger-attached execution |
Solving approach — pure Python emulation of MBA logic:
def mba_add(a, b): return (a + b) & 0xff
def mba_xor(a, b): return (a ^ b) & 0xff
def mba_transform(i):
    """Position-dependent transform from signal handler."""
    val = (i * 7 + 0x3f) & 0xff
    rotated = ((i << 3) | (i >> 5)) & 0xff
    return mba_xor(val, rotated)
SBOX = [0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
        0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19]
def sbox_lookup(i):
    idx = i & 7
    shift = ((i >> 3) & 3) * 8
    return (SBOX[idx] >> shift) & 0xff
rodata1 = bytes.fromhex("39407691b717c97879013adf3a2adea11c2b04e0")
rodata2 = bytes.fromhex("bb19b025e37eaa786c4116e7aeea00c9c623940d")
flag = []
for i in range(40):
    t = mba_transform(i)
    s = sbox_lookup(i)
    mem = rodata1[i // 2] if i % 2 == 0 else rodata2[i // 2]
    flag.append(chr(t ^ s ^ mem))
print(''.join(flag))
Key insight: The real flag logic is in the signal handler (SIGSEGV/SIGILL), not the main thread. Thread 1's AES-like code and ud2 crash are intentional misdirection. The rdtsc timing check detects debuggers and corrupts output. Bypass by extracting the MBA logic from assembly and reimplementing in Python — never run the binary under a debugger.
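Before reimplementing MBA code, it helps to confirm which plain operation each obfuscated expression stands for. The sketch below checks two common textbook MBA rewrites exhaustively over 8-bit inputs. These particular identities are assumptions for illustration, not necessarily the ones this binary uses; the function names are hypothetical.

```python
def mba_add_obfuscated(x: int, y: int) -> int:
    """x + y written as (x ^ y) + 2*(x & y), truncated to 8 bits."""
    return ((x ^ y) + 2 * (x & y)) & 0xFF


def mba_xor_obfuscated(x: int, y: int) -> int:
    """x ^ y written as (x | y) - (x & y), truncated to 8 bits."""
    return ((x | y) - (x & y)) & 0xFF


def equivalent(f, g) -> bool:
    """Exhaustively compare two byte-wise binary functions (65536 cases)."""
    return all(f(x, y) == g(x, y) for x in range(256) for y in range(256))


print(equivalent(mba_add_obfuscated, lambda x, y: (x + y) & 0xFF))
print(equivalent(mba_xor_obfuscated, lambda x, y: x ^ y))
```

The same `equivalent` check works for any candidate simplification lifted from the handler: transcribe the assembly literally, guess the plain form, and compare.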
Detection indicators:
- Multiple pthread_create calls with different handler functions
- signal(SIGSEGV, handler) or sigaction setup
- ud2 instruction (deliberate illegal instruction)
- rdtsc instructions for timing checks
- SHA-256 constants (0x6a09e667...) used as lookup tables, not for hashing
CTF Reverse - Platform-Specific Reversing
macOS/iOS, embedded/IoT firmware, kernel driver, automotive, and game engine reverse engineering.
macOS / iOS Reversing
Mach-O Binary Format
file binary
otool -l binary
otool -L binary
lipo -info universal_binary
lipo universal_binary -thin arm64 -output binary_arm64
otool -l binary | grep -A5 "segment\|section"
Key Mach-O concepts:
- Load commands drive the dynamic linker (dyld)
- LC_MAIN → entry point (replaces LC_UNIXTHREAD)
- LC_LOAD_DYLIB → shared library dependencies
- LC_CODE_SIGNATURE → code signing blob
- __DATA_CONST.__got → Global Offset Table
- __DATA.__la_symbol_ptr → Lazy symbol pointers (the slots that __TEXT.__stubs jumps through, comparable to .got.plt)
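When otool is not available (for example on a Linux analysis box), the load commands can be walked by hand. This is a minimal sketch that assumes a thin, little-endian, 64-bit Mach-O (magic 0xFEEDFACF); fat binaries and 32-bit images are not handled. The function name and the small name table are illustrative.

```python
import struct

# Only the commands named in the list above; anything else is shown as hex.
LC_NAMES = {0x0C: "LC_LOAD_DYLIB", 0x1D: "LC_CODE_SIGNATURE", 0x80000028: "LC_MAIN"}


def list_load_commands(image: bytes):
    """Return [(name_or_hex, cmdsize)] for a thin little-endian 64-bit Mach-O."""
    # mach_header_64 is eight 32-bit fields (32 bytes).
    magic, _cpu, _sub, _ftype, ncmds, _sizeofcmds, _flags, _rsv = struct.unpack_from("<8I", image, 0)
    if magic != 0xFEEDFACF:
        raise ValueError("not a thin 64-bit little-endian Mach-O")
    off, out = 32, []
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", image, off)  # every command starts with cmd, cmdsize
        out.append((LC_NAMES.get(cmd, hex(cmd)), cmdsize))
        off += cmdsize
    return out
```

Usage: `list_load_commands(open("binary", "rb").read())`, then look for LC_MAIN to find the entry offset.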
Code Signing & Entitlements
codesign -dvvv binary
codesign --verify binary
codesign -d --entitlements - binary
codesign --remove-signature binary
codesign -f -s - binary
CTF relevance: Patched binaries need re-signing to run on macOS. Ad-hoc signing (-s -) works for local testing.
Objective-C Runtime RE
class-dump binary > classes.h
(lldb) expression -l objc -O -- [NSClassFromString(@"ClassName") new]
(lldb) expression -l objc -O -- [[ClassName alloc] init]
Objective-C in disassembly:
# objc_msgSend(receiver, selector, ...) is THE dispatch mechanism
# x86-64: RDI = self (receiver), RSI = selector (char* method name); arm64: x0 = self, x1 = selector
# In Ghidra/IDA, look for:
objc_msgSend(obj, "checkPassword:", input)
# Selector strings are in __objc_methname section
# Cross-reference selectors to find implementations
class-dump alternatives:
- dsdump: faster, supports Swift + Objective-C
- otool -oV binary: dump Objective-C segments
- Ghidra: Enable "Objective-C" analyzer in Analysis Options
Swift Binary Reversing
strings binary | grep "swift"
otool -l binary | grep "swift"
swift demangle '$s14MyApp0A8ClassC10checkInput6resultSbSS_tF'
Swift in disassembly:
# Swift uses value witness tables (VWT) for type operations
# Protocol witness tables (PWT) for dynamic dispatch (like vtables)
# Key runtime functions to watch:
swift_allocObject → heap allocation
swift_release → reference count decrement
swift_bridgeObjectRetain → bridged (ObjC ↔ Swift) retain
swift_once → lazy initialization (like dispatch_once)
# String layout:
# Small strings (≤15 bytes): inline in 16-byte buffer, tagged pointer
# Large strings: heap-allocated, pointer + length + flags
# Array<T>: pointer to ContiguousArrayStorage (header + elements)
# Dictionary<K,V>: hash table with open addressing
Ghidra for Swift: Enable "Swift" language module. Swift metadata sections (__swift5_types, __swift5_proto) contain type descriptors that Ghidra can parse.
iOS App Analysis
unzip app.ipa -d extracted/
ls extracted/Payload/*.app/
otool -l extracted/Payload/*.app/binary | grep -A4 "LC_ENCRYPTION_INFO"
frida-ios-dump -H jailbroken_ip -p 22 "App Name"
class-dump decrypted_binary > headers.h
Jailbreak detection and bypass:
var paths = ["/Applications/Cydia.app", "/bin/sh", "/etc/apt",
             "/private/var/lib/apt", "/usr/bin/ssh"];
Interceptor.attach(Module.findExportByName(null, "access"), {
    onEnter(args) {
        this.path = Memory.readUtf8String(args[0]);
    },
    onLeave(retval) {
        if (paths.some(p => this.path && this.path.includes(p))) {
            retval.replace(-1);
        }
    }
});
dyld / Dynamic Linking
DYLD_PRINT_LIBRARIES=1 ./binary
DYLD_INSERT_LIBRARIES=hook.dylib ./binary
dyld_shared_cache_util -list /System/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e
Embedded / IoT Firmware RE
Firmware Extraction
binwalk firmware.bin
binwalk -e firmware.bin
binwalk -Me firmware.bin
binwalk --dd='.*' firmware.bin
strings firmware.bin | head -50
hexdump -C firmware.bin | grep "hsqs"
hexdump -C firmware.bin | grep "UBI#"
Hardware extraction methods (physical access):
UART: Serial console — often gives root shell or bootloader access
Tools: USB-UART adapter, baudrate detection (usually 115200)
Identify: 4 pins (GND, TX, RX, VCC), use multimeter
JTAG: Direct CPU debug — read/write flash, halt CPU, set breakpoints
Tools: OpenOCD, J-Link, Bus Pirate
Identify: 10/14/20-pin header, use JTAGulator for auto-detection
SPI Flash: Direct chip read — dump entire firmware
Tools: flashrom, CH341A programmer
Identify: 8-pin SOIC chip (Winbond, Macronix, etc.)
eMMC: Embedded MMC — common in routers, phones
Tools: eMMC reader, direct solder to test pads
Firmware Unpacking
unsquashfs -d output/ squashfs-root.sqfs
jefferson -d output/ jffs2.img
ubireader_extract_images firmware.ubi
ubireader_extract_files ubifs.img
cpio -idv < initramfs.cpio
dtc -I dtb -O dts -o output.dts device_tree.dtb
binwalk -e firmware.bin
Architecture-Specific Notes
ARM (most common in IoT):
apt install gcc-arm-linux-gnueabihf gdb-multiarch
qemu-arm -L /usr/arm-linux-gnueabihf/ ./arm_binary
qemu-arm -g 1234 ./arm_binary
gdb-multiarch -ex 'target remote :1234' ./arm_binary
MIPS (routers, embedded):
file binary
qemu-mips -L /usr/mips-linux-gnu/ ./mips_binary
qemu-mipsel -L /usr/mipsel-linux-gnu/ ./mipsel_binary
RISC-V: See main tools.md for Capstone disassembly and RISC-V Advanced below.
RTOS Analysis
FreeRTOS:
- Tasks (like threads): xTaskCreate → function pointer + stack
- Strings: "IDLE", "Tmr Svc", task names
- xQueueSend/xQueueReceive → inter-task communication
- Look for vTaskDelay() for timing, xSemaphoreTake() for sync
Zephyr:
- k_thread_create → kernel thread creation
- k_msgq_put/k_msgq_get → message queues
- CONFIG_* symbols reveal kernel configuration
Bare metal (no OS):
- Interrupt vector table at address 0x0 or 0x08000000 (STM32)
- main loop pattern: while(1) { read_input(); process(); output(); }
- Peripheral registers at memory-mapped addresses (check datasheet)
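For bare-metal images, the vector table gives the first code address to analyze. The sketch below assumes the ARM Cortex-M layout (word 0 = initial stack pointer, word 1 = reset handler, with bit 0 set to mark Thumb code) and an image that starts at the table; other cores lay the table out differently. The function name and returned keys are illustrative.

```python
import struct


def parse_cortex_m_vectors(image: bytes, count: int = 16):
    """Decode the first `count` little-endian words of a Cortex-M vector table."""
    sp, reset = struct.unpack_from("<II", image, 0)
    handlers = struct.unpack_from(f"<{count - 2}I", image, 8)
    return {
        "initial_sp": sp,                          # should point into SRAM
        "reset_handler": reset & ~1,               # clear the Thumb bit to get the address
        "thumb": bool(reset & 1),
        "exceptions": [h & ~1 for h in handlers],  # NMI, HardFault, ... in table order
    }
```

A plausible `initial_sp` (inside RAM) and a `reset_handler` inside the flash range are a quick sanity check that the load address guess is right; set the disassembler's base address accordingly before analysis.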
Kernel Driver Reversing
Linux Kernel Modules
file module.ko
modinfo module.ko
nm module.ko | grep -v " U "
strings module.ko | grep -i "flag\|secret\|ioctl\|device"
Common kernel module CTF patterns:
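alloc_chrdev_region(&dev, 0, 1, "challenge");   /* register a char device */
cdev_init(&cdev, &fops);                        /* fops points at the ioctl handler */
long my_ioctl(struct file *f, unsigned int cmd, unsigned long arg) {
    switch (cmd) {
    case CUSTOM_CMD_1: break;
    case CUSTOM_CMD_2: break;
    default: return -ENOTTY;                    /* unknown command */
    }
    return 0;
}
copy_from_user(kernel_buf, (void __user *)arg, size);   /* user → kernel */
copy_to_user((void __user *)arg, kernel_buf, size);     /* kernel → user */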
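The `cmd` constants in the switch are usually built with the `_IO`/`_IOR`/`_IOW` macros, so a raw value from the disassembly can be split back into its fields. This sketch assumes the asm-generic bit layout used on x86 and ARM (8-bit nr, 8-bit type, 14-bit size, 2-bit direction); some architectures such as MIPS and PowerPC allocate the upper bits differently. The function name is illustrative.

```python
def decode_ioctl(cmd: int) -> dict:
    """Split a Linux ioctl number into direction, size, type and nr (asm-generic layout)."""
    # Direction is from user space's point of view: "read" means the kernel copies data out.
    direction = {0: "none", 1: "write", 2: "read", 3: "read|write"}[(cmd >> 30) & 0x3]
    return {
        "dir": direction,
        "size": (cmd >> 16) & 0x3FFF,      # sizeof the argument struct
        "type": chr((cmd >> 8) & 0xFF),    # the driver's magic character
        "nr": cmd & 0xFF,                  # command index within the driver
    }


print(decode_ioctl(0x80086B01))
```

The decoded `size` tells you how many bytes the handler will pass to copy_from_user/copy_to_user, which is often enough to guess the argument struct.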
Debugging kernel modules:
qemu-system-x86_64 -kernel bzImage -initrd initrd.cpio -s -S \
-append "console=ttyS0 nokaslr" -nographic
gdb vmlinux
(gdb) target remote :1234
(gdb) lx-symbols
(gdb) add-symbol-file module.ko 0x<loaded_address>
eBPF Programs
bpftool prog list
bpftool prog dump xlated id <N>
bpftool prog dump jited id <N>
llvm-objdump -d ebpf_prog.o
Windows Kernel Drivers
Game Engine Reversing
Unreal Engine
unrealpak.exe extract GameName.pak -output extracted/
Blueprint reversing:
Blueprints compile to bytecode in .uasset files.
- UAssetGUI / FModel to browse Blueprint assets
- Kismet bytecode → visual scripting logic
- Look for: K2_SetTimer, DoOnce, Branch, Custom Events
- Flag logic often in Blueprint event graphs, not C++
UE4/UE5 C++ reversing:
Unity (Beyond IL2CPP)
See languages.md for IL2CPP basics.
Mono-based Unity (not IL2CPP):
dnspy Assembly-CSharp.dll
ilspy Assembly-CSharp.dll
Unity asset extraction:
Anti-Cheat Analysis
For CTF challenges involving game anti-cheat:
EasyAntiCheat (EAC):
- Kernel driver (EasyAntiCheat_EOS.sys)
- User-mode module injected into game
- Integrity checks on game memory
- Bypass: kernel-level memory R/W (for research only)
BattlEye:
- BEService.exe → BEClient.dll injected
- Communication via encrypted channel
- Screenshot capture, process scanning
- Module: BEClient2.dll
Valve Anti-Cheat (VAC):
- User-mode only (no kernel driver)
- Module hashing, memory scanning
- Network-based detection (server-side)
- Delayed bans (not immediate)
CTF approach:
1. Identify which anti-cheat (strings, loaded modules)
2. For CTF: usually need to bypass specific check, not full anti-cheat
3. Memory patching: find game state in memory, modify values
4. Save file manipulation: often easier than runtime cheating
Lua-Scripted Games
luadec bytecode.luac > decompiled.lua
unluac bytecode.luac > decompiled.lua
luajit -bl bytecode.lua
Automotive / CAN Bus RE
sudo ip link set can0 type can bitrate 500000
sudo ip link set up can0
candump can0
candump -l can0
cansniffer can0
canplayer -I logfile.log can0
cansend can0 7DF#0201000000000000
CTF automotive patterns:
- Seed-key bypass: Reverse the key derivation algorithm from ECU firmware
- CAN message replay: Capture legitimate command, replay to unlock feature
- Firmware extraction from ECU via UDS/KWP2000
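Captured or replayed frames are easier to reason about once split into fields. The sketch below parses the cansend-style `ID#DATA` notation used above and reads the payload as a single-frame OBD-II request (length, service, PID). Both helper names are illustrative, and only that simple request shape is handled.

```python
def parse_can_frame(text: str):
    """Split a cansend-style 'ID#DATA' string into (arbitration_id, payload)."""
    ident, _, data = text.partition("#")
    return int(ident, 16), bytes.fromhex(data)


def describe_obd_request(payload: bytes) -> str:
    """Interpret a single-frame OBD-II request: [length, service, pid, padding...]."""
    length, service, pid = payload[0], payload[1], payload[2]
    return f"len={length} service=0x{service:02X} pid=0x{pid:02X}"


can_id, payload = parse_can_frame("7DF#0201000000000000")
print(hex(can_id), describe_obd_request(payload))
```

Here 0x7DF is the OBD-II functional broadcast ID and service 0x01 with PID 0x00 asks which PIDs are supported; for replay attacks, diff the parsed payloads of two captures to isolate the bytes that change.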
RISC-V (Advanced)
Beyond basic disassembly (see tools.md):
Custom Extensions
Bitmanip extensions (Zbb, Zbc, Zbs):
clz, ctz, cpop → count leading/trailing zeros, popcount
orc.b, rev8 → byte-level bit manipulation
andn, orn, xnor → negated logic operations
clmul, clmulh, clmulr → carry-less multiplication (crypto)
bset, bclr, binv, bext → single-bit operations
Crypto extensions (Zk*):
aes32esi, aes32dsmi → AES round operations
sha256sig0, sha512sum0 → SHA hash acceleration
sm3p0, sm4ed → Chinese crypto standards
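When a decompiler does not know the Zbc instructions, it helps to have a reference model to substitute into a reimplementation. This is a minimal sketch of `clmul` and `clmulh` semantics (low and high halves of the carry-less product), assuming RV64 so XLEN defaults to 64; the helper names are illustrative.

```python
def _clmul_full(a: int, b: int, xlen: int) -> int:
    """Full 2*XLEN-bit carry-less product: XOR of shifted copies of a, no carries."""
    mask = (1 << xlen) - 1
    a &= mask
    b &= mask
    acc = 0
    for i in range(xlen):
        if (b >> i) & 1:
            acc ^= a << i
    return acc


def clmul(a: int, b: int, xlen: int = 64) -> int:
    """Low XLEN bits of the carry-less product."""
    return _clmul_full(a, b, xlen) & ((1 << xlen) - 1)


def clmulh(a: int, b: int, xlen: int = 64) -> int:
    """High XLEN bits of the carry-less product."""
    return _clmul_full(a, b, xlen) >> xlen
```

Seeing clmul/clmulh pairs in a loop is a strong hint for CRC or GHASH-style polynomial arithmetic rather than ordinary integer math.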
Privileged Modes
Machine mode (M): Highest privilege, firmware/bootloader
Supervisor mode (S): OS kernel
User mode (U): Applications
CSR registers to watch:
mstatus/sstatus → privilege level, interrupt enable
mtvec/stvec → trap handler address
mepc/sepc → exception return address
mcause/scause → trap cause
satp → page table root (virtual memory)
RISC-V Debugging
openocd -f interface/jlink.cfg -f target/riscv.cfg
riscv64-unknown-elf-gdb binary
(gdb) target remote :3333
qemu-riscv64 -g 1234 -L /usr/riscv64-linux-gnu/ ./binary
riscv64-linux-gnu-gdb -ex 'target remote :1234' ./binary
ctf-reverse — Quick Reference
Inline code snippets and quick-reference tables. Loaded on demand from SKILL.md. All detailed techniques live in the category-specific support files listed in SKILL.md#additional-resources.
Problem-Solving Workflow
- Start with strings extraction - many easy challenges have plaintext flags
- Try ltrace/strace - dynamic analysis often reveals flags without reversing
- Try Frida hooking - hook strcmp/memcmp to capture expected values without reversing
- Try angr - symbolic execution solves many flag-checkers automatically
- Try Qiling - emulate foreign-arch binaries or bypass heavy anti-debug without artifacts
- Map control flow before modifying execution
- Automate manual processes via scripting (r2pipe, Frida, angr, Python)
- Validate assumptions by comparing decompiler outputs (dogbolt.org for side-by-side)
Quick Wins (Try First!)
strings binary | grep -E "flag\{|CTF\{|pico"
strings binary | grep -iE "flag|secret|password"
rabin2 -z binary | grep -i "flag"
ltrace ./binary
strace -f -s 500 ./binary
xxd binary | grep -i flag
./binary AAAA
echo "test" | ./binary
Initial Analysis
file binary
checksec --file=binary
chmod +x binary
Memory Dumping Strategy
Key insight: Let the program compute the answer, then dump it. Break at final comparison (b *main+OFFSET), enter any input of correct length, then x/s $rsi to dump computed flag.
Decoy Flag Detection
Pattern: Multiple fake targets before real check.
Identification:
- Look for multiple comparison targets in sequence
- Check for different success messages
- Trace which comparison is checked LAST
Solution: Set breakpoint at FINAL comparison, not earlier ones.
GDB PIE Debugging
PIE binaries randomize base address. Use relative breakpoints:
gdb ./binary
start
b *main+0xca
continue
Comparison Direction (Critical!)
Two patterns:
transform(flag) == stored_target - Reverse the transform
transform(stored_target) == flag - Flag IS the transformed data!
Pattern 2 solution: Don't reverse - just apply transform to stored target.
Common Encryption Patterns
- XOR with single byte - try all 256 values
- XOR with known plaintext (flag{, CTF{)
- RC4 with hardcoded key
- Custom permutation + XOR
- XOR with position index (^ i or ^ (i & 0xff)) layered with a repeating key
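The first two patterns can be automated in a few lines. The sketch below brute-forces a single-byte XOR key by checking for a known flag prefix, and recovers the leading bytes of a repeating key from known plaintext. The function names and the default prefix list are assumptions; adjust the prefixes to the competition's flag format.

```python
def single_byte_xor_candidates(blob: bytes, prefixes=(b"flag{", b"CTF{")):
    """Try all 256 keys; keep those whose plaintext starts with a known prefix."""
    hits = []
    for key in range(256):
        plain = bytes(c ^ key for c in blob)
        if plain.startswith(prefixes):
            hits.append((key, plain))
    return hits


def recover_repeating_key(blob: bytes, known_prefix: bytes) -> bytes:
    """Known-plaintext attack: ciphertext XOR plaintext yields the key bytes directly."""
    return bytes(c ^ p for c, p in zip(blob, known_prefix))
```

With a repeating key, the recovered bytes reveal the key period as soon as they start repeating; the rest of the ciphertext then decrypts with the cycled key.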
Quick Tool Reference
r2 -d ./binary
aaa
afl
pdf @ main
analyzeHeadless project/ tmp -import binary -postScript script.py
ida64 binary
Binary Types
Python .pyc
Disassemble with marshal.load() + dis.dis(). Header: 8 bytes (2.x and 3.0-3.2), 12 (3.3-3.6), 16 (3.7+). See languages.md.
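A minimal loader sketch for the marshal + dis approach. It skips the header by version (8 bytes up to 3.2, 12 for 3.3-3.6, 16 from 3.7) and unmarshals the code object. The marshal format is version-specific, so this only works when the interpreter running it matches the version that produced the .pyc; the function names are illustrative.

```python
import dis
import marshal
import sys


def pyc_header_size(version=sys.version_info[:2]) -> int:
    """Header length in bytes for a .pyc produced by the given (major, minor)."""
    if version >= (3, 7):
        return 16      # magic + flags + mtime/hash + source size
    if version >= (3, 3):
        return 12      # magic + mtime + source size
    return 8           # magic + mtime


def load_pyc_code(data: bytes, version=sys.version_info[:2]):
    """Return the top-level code object stored after the header."""
    return marshal.loads(data[pyc_header_size(version):])


def disassemble_pyc(path: str) -> None:
    """Print the bytecode of a .pyc built by the running interpreter's version."""
    with open(path, "rb") as fh:
        dis.dis(load_pyc_code(fh.read()))
```

If `marshal.loads` raises "bad marshal data", the header length or the interpreter version is wrong; check the magic number in the first four bytes first.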
WASM
wasm2c checker.wasm -o checker.c
gcc -O3 checker.c wasm-rt-impl.c -o checker
wasm2wat main.wasm -o main.wat
wat2wasm main.wat -o patched.wasm
WASM game patching (Tac Tic Toe, Pragyan 2026): If proof generation is independent of move quality, patch minimax (flip i64.lt_s → i64.gt_s, change bestScore sign) to make AI play badly while proofs remain valid. Invoke /ctf-misc for full game patching patterns (games-and-vms).
Android APK
apktool d app.apk -o decoded/ for resources; jadx app.apk for Java decompilation. Check decoded/res/values/strings.xml for flags. See tools.md.
Flutter APK (Dart AOT)
If lib/arm64-v8a/libapp.so + libflutter.so present, use Blutter: python3 blutter.py path/to/app/lib/arm64-v8a out_dir. Outputs reconstructed Dart symbols + Frida script. See tools.md.
.NET
- dnSpy - debugging + decompilation
- ILSpy - decompiler
Packed (UPX)
upx -d packed -o unpacked
If unpacking fails, inspect UPX metadata first: verify UPX section names, header fields, and version markers are intact. If metadata looks tampered or uncertain, review UPX source on GitHub to identify likely modification points.
Tauri Packed Desktop Apps
Tauri embeds Brotli-compressed frontend assets in the executable. Find index.html xrefs to locate asset index table, dump blobs, Brotli decompress. Reference: tauri-codegen/src/embedded_assets.rs.
Anti-Debugging Bypass
Common checks:
- IsDebuggerPresent() / PEB.BeingDebugged / NtQueryInformationProcess (Windows)
- ptrace(PTRACE_TRACEME) / /proc/self/status TracerPid (Linux)
- TLS callbacks (run before main — check PE TLS Directory)
- Timing checks (rdtsc, clock_gettime, GetTickCount)
- Hardware breakpoint detection (DR0-DR3 via GetThreadContext)
- INT3 scanning / code self-hashing (CRC over .text section)
- Signal-based: SIGTRAP handler, SIGALRM timeout, SIGSEGV for real logic
- Frida/DBI detection: /proc/self/maps scan, port 27042, inline hook checks
Bypass: Set breakpoint at check, modify register to bypass conditional.
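pwntools patch: elf.asm(elf.symbols.ptrace, 'xor eax, eax; ret') replaces the function with an immediate return of 0 (a bare 'ret' would return whatever is already in rax); call elf.save() to write the patched file. See patterns.md.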
For comprehensive anti-analysis techniques and bypasses (30+ methods with code), see anti-analysis.md.
S-Box / Keystream Patterns
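Xorshift32: shifts 13, 17, 5
Xorshift64*: shifts 12, 25, 27, output multiplied by 0x2545f4914f6cdd1d (plain Marsaglia xorshift64 uses 13, 7, 17)
Magic constants: 0x2545f4914f6cdd1d (xorshift64* multiplier), 0x9e3779b97f4a7c15 (64-bit golden-ratio constant, e.g. splitmix64)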
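Once the 13/17/5 shift triple is spotted, the generator can be reimplemented directly. The sketch below is a standard xorshift32 step plus a keystream helper; taking the low byte of each state as the keystream byte is an assumption, since binaries differ in which bits they use.

```python
def xorshift32(state: int) -> int:
    """One xorshift32 step with the 13/17/5 shift triple, kept to 32 bits."""
    state ^= (state << 13) & 0xFFFFFFFF
    state ^= state >> 17
    state ^= (state << 5) & 0xFFFFFFFF
    return state


def keystream(seed: int, n: int) -> bytes:
    """n keystream bytes; uses the low byte of each successive state (assumption)."""
    out = bytearray()
    state = seed
    for _ in range(n):
        state = xorshift32(state)
        out.append(state & 0xFF)
    return bytes(out)


print(xorshift32(1))  # 270369
```

XOR the extracted ciphertext with `keystream(seed, len(ciphertext))`; if the seed is unknown but 32-bit, a known flag prefix makes it brute-forceable.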
Custom VM Analysis
- Identify structure: registers, memory, IP
- Reverse executeIns for opcode meanings
- Write disassembler mapping opcodes to mnemonics
- Often easier to bruteforce than fully reverse
- Look for the bytecode file loaded via command-line arg
See patterns.md for VM workflow, opcode tables, and state machine BFS.
Python Bytecode Reversing
XOR flag checkers with interleaved even/odd tables are common. See languages.md for bytecode analysis tips and reversing patterns.
Signal-Based Binary Exploration
Binary uses UNIX signals as binary tree navigation; hook sigaction via LD_PRELOAD, DFS by sending signals. See patterns.md.
Malware Anti-Analysis Bypass via Patching
Flip JNZ/JZ (0x75/0x74), change sleep values, patch environment checks in Ghidra (Ctrl+Shift+G). See patterns.md.
Expected Values Tables
Locating:
objdump -s -j .rodata binary | less
x86-64 Gotchas
Sign extension and 32-bit truncation pitfalls. See patterns.md for details and code examples.
Iterative Solver Pattern
Try each byte (0-255) per position, match against expected output. Uniform transform shortcut: if one input byte only changes one output byte, build 0..255 mapping then invert. See patterns.md for full implementation.
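Both variants fit in a short sketch. `invert_bytewise` is the uniform shortcut (tabulate the transform, then look up each target byte) and assumes the transform is a bijection on bytes. `solve_per_position` is the general loop and takes a hypothetical oracle `check_prefix(candidate) -> bool` that reports whether the candidate's output matches the expected output so far; in practice that oracle wraps the binary, an emulator, or a reimplementation.

```python
def invert_bytewise(transform, expected: bytes) -> bytes:
    """Uniform shortcut: build the 0..255 mapping once, then invert by lookup."""
    table = {transform(b): b for b in range(256)}
    return bytes(table[e] for e in expected)


def solve_per_position(check_prefix, length: int) -> bytes:
    """General case: extend the known prefix one byte at a time."""
    known = bytearray()
    for pos in range(length):
        for b in range(256):
            if check_prefix(bytes(known) + bytes([b])):
                known.append(b)
                break
        else:
            raise ValueError(f"no byte fits at position {pos}")
    return bytes(known)
```

If the lookup raises KeyError, the transform is not byte-local after all (for example it mixes in position or previous bytes) and the per-position loop is the safer choice.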
Unicorn Emulation (Complex State)
from unicorn import * -- map segments, set up stack, hook to trace. Mixed-mode pitfall: 64-bit stub jumping to 32-bit via retf requires switching to UC_MODE_32 and copying GPRs + EFLAGS + XMM regs. See tools.md.
Multi-Stage Shellcode Loaders
Nested shellcode with XOR decode loops; break at call rax, bypass ptrace with set $rax=0, extract flag from mov instructions. See patterns.md.
Timing Side-Channel Attack
Validation time varies per correct character; measure elapsed time per candidate to recover flag byte-by-byte. See patterns.md.
Godot Game Asset Extraction
Use KeyDot to extract encryption key from executable, then gdsdecomp to extract .pck package. See languages.md.
Roblox Place File Analysis
Query Asset Delivery API for version history; parse .rbxlbin chunks (INST/PROP/PRNT) to diff script sources across versions. See languages.md.
Unstripped Binary Information Leaks
Pattern (Bad Opsec): Debug info and file paths leak author identity.
Quick checks:
strings binary | grep "/home/"
strings binary | grep "/Users/"
file binary
readelf -S binary | grep debug
Custom Mangle Function Reversing
Binary mangles input 2 bytes at a time with running state; extract target from .rodata, write inverse function. See patterns.md.
Rust serde_json Schema Recovery
Disassemble serde Visitor implementations to recover expected JSON schema; field names in order reveal flag. See languages.md.
Position-Based Transformation Reversing
Binary adds/subtracts position index; reverse by undoing per-index offset. See patterns.md.
Hex-Encoded String Comparison
Input converted to hex, compared against constant. Decode with xxd -r -p. See patterns.md.
Embedded ZIP + XOR License Decryption
Binary with named symbols (EMBEDDED_ZIP, ENCRYPTED_MESSAGE) in .rodata → extract ZIP containing license, XOR encrypted message with license bytes to recover flag. No execution needed. See patterns-ctf-2.md.
Stack String Deobfuscation (.rodata XOR Blob)
Binary mmaps .rodata blob, XOR-deobfuscates, uses it to validate input. Reimplement verification loop with pyelftools to extract blob. Look for 0x9E3779B9, 0x85EBCA6B constants and rol32(). See patterns-ctf-2.md.
Prefix Hash Brute-Force
Binary hashes every prefix independently. Recover one character at a time by matching prefix hashes. See patterns-ctf-2.md.
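A sketch of the prefix-by-prefix recovery. MD5 and the printable-ASCII alphabet are assumptions; swap `hash_fn` for whatever hash the binary applies. `targets` is the list of expected digests, one per prefix length, as extracted from the binary.

```python
import hashlib


def recover_from_prefix_hashes(targets, alphabet=None,
                               hash_fn=lambda b: hashlib.md5(b).digest()):
    """Recover the input one character at a time from per-prefix digests."""
    alphabet = alphabet or bytes(range(0x20, 0x7F))   # printable ASCII
    known = b""
    for want in targets:
        for c in alphabet:
            cand = known + bytes([c])
            if hash_fn(cand) == want:
                known = cand
                break
        else:
            raise ValueError(f"no match after {known!r}")
    return known
```

The cost is at most len(alphabet) hashes per character, so even slow hashes are tractable here.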
Mathematical Convergence Bitmap
Pattern: Binary classifies coordinate pairs by Newton's method convergence (e.g., z^3-1=0). Grid of pass/fail results renders ASCII art flag. Key: the binary is a classifier, not a checker — reverse the math and visualize. See patterns-ctf.md.
RISC-V Binary Analysis
Statically linked, stripped RISC-V ELF. Use Capstone with CS_MODE_RISCVC | CS_MODE_RISCV64 for mixed compressed instructions. Emulate with qemu-riscv64. Watch for fake flags and XOR decryption with incremental keys. See tools.md.
Sprague-Grundy Game Theory Binary
Game binary plays bounded Nim with PRNG for losing-position moves. Identify game framework (Grundy values = pile % (k+1), XOR determines position), track PRNG state evolution through user input feedback. See patterns-ctf.md.
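The winning-move computation for bounded Nim (take 1..k stones from one pile) follows from the Grundy values above. This is a minimal sketch with illustrative function names; it does not model the PRNG the binary uses in losing positions.

```python
from functools import reduce
from operator import xor


def nim_sum(piles, k: int) -> int:
    """XOR of the Grundy values pile % (k + 1)."""
    return reduce(xor, (p % (k + 1) for p in piles), 0)


def winning_move(piles, k: int):
    """Return (pile_index, stones_to_take) that leaves nim-sum 0, or None if losing."""
    total = nim_sum(piles, k)
    if total == 0:
        return None                      # every move hands the opponent a win
    for i, p in enumerate(piles):
        g = p % (k + 1)
        target = g ^ total               # Grundy value this pile must drop to
        if target < g:
            return i, g - target         # g - target is between 1 and k
    return None
```

When `winning_move` returns None the binary is in the position where it falls back to its PRNG, which is where the observed moves leak PRNG state.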
Kernel Module Maze Solving
Rust kernel module implements maze via device ioctls. Enumerate commands dynamically, build DFS solver with decoy avoidance, deploy as minimal static binary (raw syscalls, no libc). See patterns-ctf.md.
Multi-Threaded VM with Channels
Custom VM with 16+ threads communicating via futex channels. Trace data flow across thread boundaries, extract constants from GDB, watch for inverted validity logic, solve via BFS state space search. See patterns-ctf.md.
CVP/LLL Lattice for Constrained Integer Validation (HTB ShadowLabyrinth)
Binary validates flag via matrix multiplication with 64-bit coefficients; solutions must be printable ASCII. Use LLL reduction + CVP in SageMath to find nearest lattice point in the constrained range. Two-phase pattern: Phase 1 recovers AES key, Phase 2 decrypts custom VM bytecode with another linear system (mod 2^32). See patterns-ctf-2.md.
Decision Tree Function Obfuscation (HTB WonderSMS)
~200+ auto-generated functions routing input through polynomial comparisons. Script extraction via Ghidra headless rather than reversing each function manually. Constraint propagation from known output format cascades through arithmetic constraints. See patterns-ctf-2.md.
Android JNI RegisterNatives Obfuscation (HTB WonderSMS)
RegisterNatives in JNI_OnLoad hides which C++ function handles each Java native method (no standard Java_com_pkg_Class_method symbol). Find the real handler by tracing JNI_OnLoad → RegisterNatives → fnPtr. Use x86_64 .so from APK for best Ghidra decompilation. See languages.md.
Multi-Layer Self-Decrypting Binary
N-layer binary where each layer decrypts the next using user-provided key bytes + SHA-NI. Use oracle (correct key → valid code with expected pattern). JIT execution with fork-per-candidate COW isolation for speed. See patterns-ctf-2.md.
GLSL Shader VM with Self-Modifying Code
Pattern: WebGL2 fragment shader implements Turing-complete VM on a 256x256 RGBA texture (program memory + VRAM). Self-modifying code (STORE opcode) patches drawing instructions. GPU parallelism causes write conflicts — emulate sequentially in Python to recover full output. See patterns-ctf-2.md.
GF(2^8) Gaussian Elimination for Flag Recovery
Pattern: Binary performs Gaussian elimination over GF(2^8) with the AES polynomial (0x11b). Matrix + augmentation vector in .rodata; solution vector is the flag. Look for constant 0x1b in disassembly. Addition is XOR, multiplication uses polynomial reduction. See patterns-ctf-2.md.
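A compact sketch of the field arithmetic and the elimination itself, assuming a square, invertible system extracted from .rodata. The function names are illustrative; inversion uses a^254, which is simple rather than fast.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial 0x11b."""
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B                     # reduce when the product overflows 8 bits
        b >>= 1
    return acc


def gf_inv(a: int) -> int:
    """Multiplicative inverse as a^254 (a must be non-zero)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def gf_solve(matrix, vector) -> bytes:
    """Gauss-Jordan elimination over GF(2^8); raises StopIteration if singular."""
    n = len(vector)
    rows = [list(r) + [v] for r, v in zip(matrix, vector)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(x, inv) for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [x ^ gf_mul(f, y) for x, y in zip(rows[r], rows[col])]
    return bytes(row[n] for row in rows)
```

Subtraction is the same XOR as addition in this field, which is why the elimination step uses `^` where the real-number version would subtract.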
Z3 for Single-Line Python Boolean Circuit
Pattern: Single-line Python (2000+ semicolons) with walrus operator chains validates flag as big-endian integer via boolean circuit. Obfuscated XOR (a | b) & ~(a & b). Split on semicolons, translate to Z3 symbolically, solve in under a second. See patterns-ctf-2.md.
Sliding Window Popcount Differential Propagation
Pattern: Binary validates input via expected popcount for each position of a 16-bit sliding window. Popcount differences create a recurrence: bit[i+16] = bit[i] + (data[i+1] - data[i]). Brute-force ~4000-8000 valid initial 16-bit windows; each determines the entire bit sequence. See patterns-ctf-2.md.
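The recurrence can be applied directly once an initial window is guessed. In this sketch `data` is the list of expected popcounts (one per window position) and `initial` is a candidate list of the first 16 bits; a candidate is rejected as soon as the recurrence yields something other than 0 or 1. Function names are illustrative.

```python
def popcounts(bits, width: int = 16):
    """Popcount of every `width`-bit window, as the binary would compute it."""
    return [sum(bits[i:i + width]) for i in range(len(bits) - width + 1)]


def extend_window(initial, data, width: int = 16):
    """Rebuild the full bit sequence from one guessed initial window, else None."""
    bits = list(initial)
    if len(bits) != width or sum(bits) != data[0]:
        return None
    for i in range(len(data) - 1):
        nxt = bits[i] + data[i + 1] - data[i]   # bit[i+16] = bit[i] + popcount delta
        if nxt not in (0, 1):
            return None                          # inconsistent guess
        bits.append(nxt)
    return bits
```

Enumerate only the 16-bit windows whose popcount equals `data[0]`, run `extend_window` on each, and keep the survivors that decode to printable text.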
Ruby/Perl Polyglot Constraint Satisfaction
Pattern: Single file valid in both Ruby and Perl, each imposing different constraints on a key. Exploits =begin/=end (Ruby block comment) vs =begin/=cut (Perl POD) to run different code per interpreter. Intersect constraints from both languages to recover the unique key. See languages.md.
Verilog/Hardware RE
Pattern: Verilog HDL source for state machines with hidden conditions gated on shift register history. Analyze always @(posedge clk) blocks and case statements to find correct input sequences. See languages.md.
Backdoored Shared Library Detection
Binary works in GDB but fails when run normally (suid)? Check ldd for non-standard libc paths, then strings | diff the suspicious vs. system library to find injected code/passwords. See patterns-ctf.md.
Go Binary Reversing
Large static binary with go.buildid? Use GoReSym to recover function names (works even on stripped binaries). Go strings are {ptr, len} pairs — not null-terminated. Look for main.main, runtime.gopanic, channel ops (runtime.chansend1/chanrecv1). Use Ghidra golang-loader plugin for best results. See languages-compiled.md.
Rust Binary Reversing
Binary with core::panicking strings and _ZN mangled symbols? Use rustfilt for demangling. Panic messages contain source paths and line numbers, so strings binary | grep "panicked" is the fastest approach. Option/Result enums use a discriminant (Option: 0=None, 1=Some; Result: 0=Ok, 1=Err), though niche optimization can remove the tag entirely (e.g. Option<&T> represents None as a null pointer). See languages-compiled.md.
Frida Dynamic Instrumentation
Hook runtime functions without modifying binary. frida -f ./binary -l hook.js to spawn with instrumentation. Hook strcmp/memcmp to capture expected values, bypass anti-debug by replacing ptrace return value, scan memory for flag patterns, replace validation functions. See tools-dynamic.md.
angr Symbolic Execution
Automatic path exploration to find inputs satisfying constraints. Load binary with angr.Project, set find/avoid addresses, call simgr.explore(). Constrain input to printable ASCII and known prefix for faster solving. Hook expensive functions (crypto, I/O) to prevent path explosion. See tools-dynamic.md.
Qiling Emulation
Cross-platform binary emulation with OS-level support (syscalls, filesystem). Emulate Linux/Windows/ARM/MIPS binaries on any host. No debugger is attached, so ptrace/TracerPid-style anti-debug checks find nothing; timing or environment checks may still need a hook. Hook syscalls and addresses with Python API. See tools-dynamic.md.
VMProtect / Themida Analysis
VMProtect virtualizes code into custom bytecode. Identify VM entry (pushad-like), find handler table (large indirect jump), trace handlers dynamically. For CTF, focus on tracing operations on input rather than full devirtualization. Themida: dump at OEP with ScyllaHide + Scylla. See tools-advanced.md.
Binary Diffing
BinDiff and Diaphora compare two binaries to highlight changes. Essential when challenge provides patched/original versions. Export from IDA/Ghidra, diff to find vulnerability or hidden functionality. See tools-advanced.md.
Advanced GDB (pwndbg, rr)
pwndbg: context, vmmap, search -s "flag{", telescope $rsp. GEF alternative. Reverse debugging with rr record/rr replay — step backward through execution. Python scripting for brute-force and automated tracing. See tools-advanced.md.
macOS / iOS Reversing
Mach-O binaries: otool -l for load commands, class-dump for Objective-C headers. Swift: swift demangle for symbols. iOS apps: decrypt FairPlay DRM with frida-ios-dump, bypass jailbreak detection with Frida hooks. Re-sign patched binaries with codesign -f -s -. See platforms.md.
Embedded / IoT Firmware RE
binwalk -Me firmware.bin for recursive extraction. Hardware: UART/JTAG/SPI flash for firmware dumps. Filesystems: SquashFS (unsquashfs), JFFS2, UBI. Emulate with QEMU: qemu-arm -L /usr/arm-linux-gnueabihf/ ./binary. See platforms.md.
Kernel Driver Reversing
Linux .ko: find ioctl handler via file_operations struct, trace copy_from_user/copy_to_user. Debug with QEMU+GDB (-s -S). eBPF: bpftool prog dump xlated. Windows .sys: find DriverEntry → IoCreateDevice → IRP handlers. See platforms.md.
Game Engine Reversing
Unreal: extract .pak with UnrealPakTool, reverse Blueprint bytecode with FModel. Unity Mono: decompile Assembly-CSharp.dll with dnSpy. Anti-cheat (EAC, BattlEye, VAC): identify system, bypass specific check. Lua games: luadec/unluac for bytecode. See platforms.md.
Swift / Kotlin Binary Reversing
Swift: swift demangle symbols, protocol witness tables for dispatch, __swift5_* sections. Kotlin/JVM: coroutines compile to state machines in invokeSuspend, jadx with Kotlin mode for best decompilation. Kotlin/Native: LLVM backend, looks like C++ in disassembly. See languages-compiled.md.
CTF Reverse — Advanced Tooling (2025-2026 era)
Heavy-weight tooling patterns from 2025-2026 elite CTFs. Base tooling (angr, Ghidra, IDA scripting, Unicorn standalone) lives in tools-advanced.md; dynamic-only tools (Frida, Qiling) in tools-dynamic.md.
Massive PE (GB-scale) with Per-Layer VirtualProtect-Gated Self-Decryption — Unicorn + angr Hybrid (source: DEFCON 2025 Quals nfuncs1)
Trigger:
- A Windows PE binary ≥ 500 MB (often 3 GB+) where loading in IDA or Ghidra exhausts RAM.
- Each "layer" calls VirtualProtect(addr, size, PAGE_EXECUTE_READWRITE, &old) followed by an inline decryption loop (xor/add/rol with a per-layer key derived from earlier inputs), then call rax/jmp rax into the freshly-decrypted chunk.
- Many read() calls that validate each input byte against a lookup table or equality compare, then conditionally fall through to the next layer.
Signals to grep:
strings -a bin.exe | grep -iE 'VirtualProtect|NtProtectVirtualMemory'
objdump -d --start-address=... bin.exe | grep -cE '\s(xor|ror|rol)\s' # many small crypto loops
pefile: count of sections with SizeOfRawData > 10 MB
Why you cannot just angr it: a symbolic engine running through VirtualProtect and self-modified code explodes on path count; a pure Unicorn emulator has no SMT to solve the per-byte input constraints.
Hybrid approach (works in ≤ 1 h on 16 cores):
- Unicorn-only warm-up: emulate from entry with a fake ReadFile/read hook that feeds placeholder bytes. Record every VirtualProtect call (addr, size, prot) and every call/jmp rax that follows. This builds a layer graph without interpreting crypto semantically; each layer is a (decrypt-fn, entry-addr, key-source) tuple.
- Per-layer angr solve: for each layer from Unicorn's graph:
  - Load only the decrypted bytes Unicorn observed (dump the code page once the decryption loop that follows the RWX VirtualProtect has finished, i.e. at the call rax/jmp rax).
  - Build a CFGEmulated rooted at entry-addr, stopping at the next VirtualProtect call (use a state.inspect breakpoint on VirtualProtect address).
  - Add constraints from input-byte lookup table checks (state.solver.add(buf[i] == const) after matching the comparator pattern).
  - state.solver.eval(input, cast_to=bytes) → canonical input for that layer.
- Hook orchestration in Unicorn: once layers 1..k are solved:
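def hook_read(uc, buf_addr, count, user_data):
    # Feed the solved input for the current layer, then advance to the next one.
    data = LAYER_INPUTS[user_data["layer_idx"]][:count]
    uc.mem_write(buf_addr, data)
    user_data["layer_idx"] += 1
    return len(data)
def hook_puts(uc, *_):
    # Skip the 5-byte call rel32 to puts instead of emulating it.
    uc.reg_write(UC_X86_REG_RIP, uc.reg_read(UC_X86_REG_RIP) + 5)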
Let the binary cascade through all decrypted layers end-to-end; the final layer prints the flag.
- Performance fences:
  - Do NOT disassemble inside hooks (Capstone/iced_x86 inside a per-instruction hook costs ~40×; measured on DEFCON 2025 nfuncs1 at 3 h without this fix).
  - Use UC_HOOK_BLOCK, not UC_HOOK_CODE, unless you need single-step.
  - Dump memory pages to disk between layers; don't hold 3 GB in Python.
Template:
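from unicorn import Uc, UC_ARCH_X86, UC_MODE_64, UC_HOOK_CODE
from unicorn.x86_const import UC_X86_REG_RCX, UC_X86_REG_RDX
import angr
# BASE, SIZE, BIN, ENTRY, STOP, VP_ADDR, INPUT_BUF, INPUT_SZ, SUCCESS_ADDR and FAIL_ADDR are per-challenge placeholders.
uc = Uc(UC_ARCH_X86, UC_MODE_64)
uc.mem_map(BASE, SIZE)
uc.mem_write(BASE, open(BIN, "rb").read())
LAYERS = []
def vp_hook(uc, address, size, user_data):
    # Win64 ABI: lpAddress is in RCX and dwSize in RDX on entry to VirtualProtect.
    addr, sz = uc.reg_read(UC_X86_REG_RCX), uc.reg_read(UC_X86_REG_RDX)
    # The page is still encrypted here; re-dump at the following call rax/jmp rax for decrypted code.
    LAYERS.append((addr, sz, bytes(uc.mem_read(addr, sz))))
uc.hook_add(UC_HOOK_CODE, vp_hook, begin=VP_ADDR, end=VP_ADDR+1)
uc.emu_start(ENTRY, STOP, timeout=3*60*10**6)  # timeout is in microseconds (3 min)
LAYER_INPUTS = []
for addr, sz, code in LAYERS:
    proj = angr.load_shellcode(code, "amd64", load_address=addr)
    st = proj.factory.blank_state(addr=addr)
    buf = st.solver.BVS("buf", 8*INPUT_SZ)
    st.memory.store(INPUT_BUF, buf)
    sm = proj.factory.simulation_manager(st)
    sm.explore(find=SUCCESS_ADDR, avoid=FAIL_ADDR)
    LAYER_INPUTS.append(sm.found[0].solver.eval(buf, cast_to=bytes))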
Generalizes to: any multi-stage unpacker (VMProtect 3+, Themida, custom game protections) where each stage exposes a small read() oracle that gates progression. The key insight is Unicorn to navigate, angr to solve per-stage constraints, never both at once.
CTF Reverse - Advanced Tools & Deobfuscation
Advanced tooling for commercial packers/protectors, binary diffing, deobfuscation frameworks, emulation, and symbolic execution beyond angr.
VMProtect Analysis
VMProtect virtualizes x86/x64 code into custom bytecode interpreted by a generated VM. One of the most challenging protectors in CTF.
Recognition
strings binary | grep -i "vmp\|vmprotect"
readelf -S binary | grep ".vmp"
Key indicators:
- push / pop heavy prologues (VM entry pushes all registers to stack)
- Large switch-case dispatcher (the VM handler loop)
- Anti-debug checks embedded in VM handlers
- Mutation engine: same opcode has different handlers per build
Approach
1. Identify VM entry points: look for pushad/pushfd (x86) or an equivalent long run of register pushes plus pushfq (x64, which has no pushad)
2. Find the handler table — large indirect jump (jmp [reg + offset])
3. Trace handler execution — each handler ends with jump to next
4. Identify handlers:
- vAdd, vSub, vMul, vXor, vNot (arithmetic)
- vPush, vPop (stack operations)
- vLoad, vStore (memory access)
- vJmp, vJcc (control flow)
- vRet (VM exit — restores real registers)
5. Build disassembler for VM bytecode
6. Simplify / deobfuscate the lifted IL
Tools
- VMPAttack (IDA plugin): Automatically identifies VM handlers
- NoVmp: Devirtualization via VTIL (open-source)
- VMProtect devirtualizer scripts: Community IDA/Binary Ninja scripts
- Approach for CTF: Often easier to trace specific operations (crypto, comparisons) than fully devirtualize
CTF Strategy
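import frida, sys
script = """
var vm_dispatch = ptr('0x...'); // Address of handler table jump
Interceptor.attach(vm_dispatch, {
  onEnter(args) {
    // Log handler index and stack state
    var handler_idx = this.context.rax; // or whichever register
    console.log('Handler:', handler_idx, 'RSP:', this.context.rsp);
  }
});
"""
# "target.exe" is a placeholder process name; attach, load the script, keep the session alive.
session = frida.attach("target.exe")
loaded = session.create_script(script)
loaded.load()
sys.stdin.read()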
Key insight: Full devirtualization is rarely needed for CTF. Focus on tracing what operations are performed on your input. Hook comparison/crypto functions called from within the VM.
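Once a handler trace has been logged, a frequency and bigram summary is often enough to tell the dispatcher's hot handlers apart. The sketch below assumes the log lines have the `Handler: <idx> RSP: <rsp>` shape printed by the script above, with both values rendered as hex; `summarize_trace` is an illustrative helper, not part of Frida.

```python
import re
from collections import Counter

LINE = re.compile(r"Handler:\s*(0x[0-9a-fA-F]+)\s+RSP:\s*(0x[0-9a-fA-F]+)")

def summarize_trace(lines):
    """Count handler hits and consecutive handler pairs in a trace log."""
    seq = [int(m.group(1), 16) for m in map(LINE.search, lines) if m]
    hits = Counter(seq)                 # how often each handler index ran
    pairs = Counter(zip(seq, seq[1:]))  # which handler tends to follow which
    return hits, pairs
```

Handlers that fire once per input byte, or pairs that repeat in step with the input length, are usually the ones worth reading first.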
Themida / WinLicense Analysis
Similar to VMProtect but with additional anti-debug layers.
Recognition
- Sections: .themida, .winlice
- Extremely heavy anti-debug (kernel-level checks, driver installation)
- Code mutation + virtualization + packing combined
Approach for CTF
- Dump unpacked code: Let it run, dump process memory after unpacking
- Bypass anti-debug: ScyllaHide in x64dbg with Themida-specific preset
- Fix imports: Use Scylla plugin for IAT reconstruction
- Focus on dumped code: Once unpacked, analyze as normal binary
1. Load binary
2. Enable ScyllaHide → Profile: Themida
3. Run to OEP (Original Entry Point) — may need several attempts
4. Dump with Scylla: OEP → IAT Autosearch → Get Imports → Dump
5. Fix dump: Scylla → Fix Dump
6. Analyze fixed dump in Ghidra/IDA
Binary Diffing
Critical for patch analysis, 1-day exploit development, and CTF challenges that provide two versions of a binary.
BinDiff
bindiff primary.BinExport secondary.BinExport
Key metrics:
- Similarity score (0.0-1.0) per function pair
- Changed instructions highlighted
- Unmatched functions = new/removed code
Diaphora
Free, open-source alternative to BinDiff, runs as IDA plugin.
Useful for CTF: When challenge provides "patched" and "original" binaries, diff reveals the vulnerability or hidden functionality.
Deobfuscation Frameworks
D-810 (IDA)
Pattern-based deobfuscation plugin for IDA Pro. Excellent for OLLVM-obfuscated binaries.
Capabilities:
- MBA simplification: (a ^ b) + 2*(a & b) → a + b
- Dead code elimination
- Opaque predicate removal
- Constant folding
- Control flow unflattening (partial)
Installation: Copy to IDA plugins directory
Usage: Edit → Plugins → D-810 → Select rules → Apply
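The MBA rule listed under Capabilities can be checked by brute force, which is a quick way to convince yourself that a candidate rewrite is sound before trusting it. This is a small sketch over 8-bit values; `mba_add` and `holds_for_all_bytes` are illustrative names, and the same check extends to wider words by sampling instead of enumerating.

```python
from itertools import product

MASK = 0xFF  # model 8-bit registers

def mba_add(a: int, b: int) -> int:
    """Obfuscated form: (a ^ b) + 2*(a & b), truncated to 8 bits."""
    return ((a ^ b) + 2 * (a & b)) & MASK

def holds_for_all_bytes() -> bool:
    """Exhaustively compare the MBA form against plain addition."""
    return all(mba_add(a, b) == (a + b) & MASK
               for a, b in product(range(256), repeat=2))
```

The identity holds because `a ^ b` is the carry-less sum and `a & b` marks the carry bits, which are shifted left by one when doubled.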
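gooMBA (IDA / Hex-Rays)
gooMBA is Hex-Rays' MBA-simplification plugin for the IDA decompiler; it is not a Ghidra extension:
- Works on Hex-Rays microcode
- Simplifies MBA expressions
- Checks candidate simplifications for equivalence with an SMT solver (z3)
Installation: Bundled with recent IDA Pro releases; source at https://github.com/HexRaysSA/goomba
Usage: Right-click in the pseudocode view → De-obfuscate arithmetic expressions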
Miasm
Powerful reverse engineering framework with symbolic execution and IR lifting.
from miasm.analysis.binary import Container
from miasm.analysis.machine import Machine
from miasm.expression.expression import *
cont = Container.from_stream(open("binary", "rb"))
machine = Machine(cont.arch)
mdis = machine.dis_engine(cont.bin_stream, loc_db=cont.loc_db)
asmcfg = mdis.dis_multiblock(entry_addr)
lifter = machine.lifter_model_call(loc_db=cont.loc_db)
ircfg = lifter.new_ircfg_from_asmcfg(asmcfg)
from miasm.ir.symbexec import SymbolicExecutionEngine
sb = SymbolicExecutionEngine(lifter)
Use case: Deobfuscate expression trees, simplify complex arithmetic, trace data flow through obfuscated code.
Qiling Framework (Emulation)
Cross-platform emulation framework built on Unicorn, with OS-level support (syscalls, filesystem, registry).
from qiling import Qiling
from qiling.const import QL_VERBOSE
ql = Qiling(["./binary"], "rootfs/x8664_linux",
            verbose=QL_VERBOSE.DEBUG)
# Force the result of a check at a known address.
def hook_check(ql):
    ql.arch.regs.rax = 0
    ql.log.info("Anti-debug bypassed")
ql.hook_address(hook_check, 0x401234)
# Replace the ptrace syscall so it always reports success.
def hook_ptrace(ql, request, pid, addr, data):
    return 0
ql.os.set_syscall("ptrace", hook_ptrace)
# Windows targets only (needs a Windows rootfs): stub out an API by name.
# def hook_isdebug(ql, address, params):
#     return 0
# ql.os.set_api("IsDebuggerPresent", hook_isdebug)
ql.run()
Advantages over Unicorn:
- OS emulation (file I/O, network, registry)
- Multi-platform (Linux, Windows, macOS, Android, UEFI)
- Built-in debugger interface
- Rootfs for library loading
CTF use cases:
- Emulate binaries for foreign architectures (ARM, MIPS, RISC-V)
- Bypass all anti-debug at once (no debugger artifacts)
- Fuzz embedded/IoT firmware without hardware
- Trace execution without code modification
Triton (Dynamic Symbolic Execution)
Dynamic binary analysis library with symbolic execution, taint analysis, and AST simplification. It is driven from your own tracer or emulation loop (early releases shipped a Pin-based tracer).
from triton import *
ctx = TritonContext(ARCH.X86_64)
with open("binary", "rb") as f:
    binary = f.read()
ctx.setConcreteMemoryAreaValue(0x400000, binary)
for i in range(32):
    ctx.symbolizeMemory(MemoryAccess(INPUT_ADDR + i, CPUSIZE.BYTE), f"input_{i}")
pc = ENTRY_POINT
while pc:
    inst = Instruction(pc, ctx.getConcreteMemoryAreaValue(pc, 16))
    ctx.processing(inst)
    if pc == CMP_ADDR:
        # getPathPredicate() is the current name; older builds called it getPathConstraintsAst().
        model = ctx.getModel(ctx.getPathPredicate())
        # Model keys are symbolic-variable ids, so sorting restores input order.
        print("".join(chr(v.getValue()) for _, v in sorted(model.items())))
        break
    pc = ctx.getConcreteRegisterValue(ctx.registers.rip)
Triton vs angr:
| Feature | Triton | angr |
|---|
| Execution | Concrete + symbolic (DSE) | Fully symbolic |
| Speed | Faster (concrete-driven) | Slower (explores all paths) |
| Path explosion | Less prone (follows one path) | Major issue |
| API | C++ / Python | Python |
| Best for | Single-path deobfuscation, taint tracking | Multi-path exploration |
Key use: Triton excels at deobfuscation — run the program concretely, but track symbolic state, then simplify the collected constraints.
Manticore (Symbolic Execution)
Trail of Bits' symbolic execution tool. Similar to angr but with native EVM (Ethereum) support.
from manticore.native import Manticore
m = Manticore("./binary")
@m.hook(0x401234)
def success(state):
    # Concretize every symbolic input buffer (stdin, argv) recorded on the state.
    for sym in state.input_symbols:
        print("Flag:", state.solve_one(sym))
    m.kill()
@m.hook(0x401256)
def fail(state):
    state.abandon()
m.run()
Best for: EVM/smart contract analysis, simpler Linux binaries. angr is generally more mature for complex RE tasks.
Rizin / Cutter
Rizin is a fork of radare2 (both projects are actively maintained). Cutter is Rizin's Qt-based GUI.
rizin -d ./binary
> aaa
> afl
> pdf @ main
> VV
cutter binary
Cutter advantages:
- Built-in Ghidra decompiler (via the rz-ghidra plugin)
- Graph view, hex editor, debug panel in one GUI
- Integrated Python/JavaScript scripting console
- Free and open source
RetDec (Retargetable Decompiler)
LLVM-based decompiler supporting many architectures. Free and open-source.
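# Install from the prebuilt archives at https://github.com/avast/retdec/releases (the decompiler itself is not distributed via pip)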
retdec-decompiler binary
retdec-decompiler --select-ranges 0x401000-0x401100 binary
Strengths: Multi-arch support (x86, ARM, MIPS, PowerPC, PIC32), free, produces compilable C. Good for architectures not well-supported by Ghidra.
Advanced GDB Techniques
Python Scripting
import gdb
class TraceCompare(gdb.Breakpoint):
"""Log all comparison operations."""
def __init__(self, addr):
super().__init__(f"*{addr}", gdb.BP_BREAKPOINT)
def stop(self):
frame = gdb.selected_frame()
rdi = int(frame.read_register("rdi"))
rsi = int(frame.read_register("rsi"))
rdx = int(frame.read_register("rdx"))
inferior = gdb.selected_inferior()
buf1 = inferior.read_memory(rdi, rdx).tobytes()
buf2 = inferior.read_memory(rsi, rdx).tobytes()
print(f"memcmp({buf1!r}, {buf2!r}, {rdx})")
return False
Brute-Force with GDB Script
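import gdb, string
def bruteforce_flag(success_addr, fail_addr, flag_len, path='/tmp/candidate.txt'):
    # Assumes a per-character checker fed from stdin: success_addr is reached once
    # for every matching character, fail_addr on the first mismatch.
    gdb.execute('set confirm off')
    ok = gdb.Breakpoint(f'*{success_addr:#x}')
    gdb.Breakpoint(f'*{fail_addr:#x}')
    flag = []
    for pos in range(flag_len):
        for ch in string.printable.strip():
            candidate = ''.join(flag) + ch + 'A' * (flag_len - pos - 1)
            with open(path, 'w') as f:
                f.write(candidate + '\n')
            ok.ignore_count = pos  # skip hits for already-known characters
            gdb.execute(f'run < {path}', to_string=True)
            try:
                rip = int(gdb.parse_and_eval('$rip'))
            except gdb.error:  # process exited without hitting a breakpoint
                continue
            if rip == success_addr:
                flag.append(ch)
                break
    gdb.execute('delete breakpoints', to_string=True)
    return ''.join(flag)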
Conditional Breakpoints
(gdb) b *0x401234 if $rax == 0x41
(gdb) b *0x401234 if *(char*)$rdi == 'f'
(gdb) b *0x401234
(gdb) ignore 1 99
(gdb) b *0x401234
(gdb) commands
> silent
> printf "rax=%lx rdi=%lx\n", $rax, $rdi
> continue
> end
Watchpoints
(gdb) watch *(int*)0x601050
(gdb) rwatch *(int*)0x601050
(gdb) awatch *(int*)0x601050
(gdb) watch flag_buffer[0]
(gdb) watch *(int*)0x601050 if *(int*)0x601050 == 0x42
Reverse Debugging (rr)
rr record ./binary
rr replay
(gdb) reverse-continue
(gdb) reverse-stepi
(gdb) reverse-next
(gdb) when
(gdb) checkpoint
(gdb) restart 1
Key use: When you step past the critical moment, reverse back instead of restarting. Invaluable for anti-debug that corrupts state.
GDB Dashboard / GEF / pwndbg
git clone https://github.com/pwndbg/pwndbg && cd pwndbg && ./setup.sh
pwndbg> context
pwndbg> vmmap
pwndbg> search -s "flag{"
pwndbg> telescope $rsp 20
pwndbg> cyclic 200
pwndbg> hexdump $rdi 64
pwndbg> got
pwndbg> plt
bash -c "$(curl -fsSL https://gef.blah.cat/sh)"
gef> xinfo $rdi
gef> checksec
gef> heap chunks
gef> pattern create 100
Advanced Ghidra Scripting
from ghidra.program.model.symbol import SourceType
fm = currentProgram.getFunctionManager()
for func in fm.getFunctions(True):
if func.getName().startswith("FUN_"):
body = func.getBody()
inst_iter = currentProgram.getListing().getInstructions(body, True)
for inst in inst_iter:
if inst.getMnemonicString() == "CPUID":
func.setName("anti_vm_check_" + hex(func.getEntryPoint().getOffset()),
SourceType.USER_DEFINED)
break
def extract_xor_constants(func):
"""Find all XOR operations and their immediate operands."""
constants = []
body = func.getBody()
inst_iter = currentProgram.getListing().getInstructions(body, True)
for inst in inst_iter:
if inst.getMnemonicString() == "XOR":
for i in range(inst.getNumOperands()):
op = inst.getOpObjects(i)
if op and hasattr(op[0], 'getValue'):
constants.append(int(op[0].getValue()))
return constants
from ghidra.app.decompiler import DecompInterface
decomp = DecompInterface()
decomp.openProgram(currentProgram)
for func in fm.getFunctions(True):
    result = decomp.decompileFunction(func, 30, monitor)  # 30 s timeout per function
    if result.decompileCompleted():
        code = result.getDecompiledFunction().getC()
        if "strcmp" in code or "memcmp" in code:
            # .format() keeps this runnable under Ghidra's Jython 2.7 as well as PyGhidra
            print("Comparison in {} at {}".format(func.getName(), func.getEntryPoint()))
Patching Strategies
Binary Ninja Patching (Python API)
import binaryninja as bn
bv = bn.open_view("binary")
bv.write(0x401234, b"\x90" * 5)
bv.write(0x401234, b"\x74")
bv.write(0x401234, b"\xb8\x01\x00\x00\x00\xc3")
bv.save("patched")
LIEF (Library for Instrumenting Executable Formats)
import lief
binary = lief.parse("binary")
section = lief.ELF.Section(".patch")
section.content = list(b"\xcc" * 0x100)
section.type = lief.ELF.SECTION_TYPES.PROGBITS
section.flags = lief.ELF.SECTION_FLAGS.EXECINSTR | lief.ELF.SECTION_FLAGS.ALLOC
binary.add(section)
binary.header.entrypoint = 0x401000
binary.patch_pltgot("strcmp", 0x401000)
binary.write("patched")
LIEF advantages: Cross-format (ELF, PE, Mach-O), Python API, can add sections/segments, modify headers, patch imports.
TTF GSUB Ligature Steganography (source: TFC CTF 2025 font-leagues)
Trigger: challenge ships a TTF/OTF; typing specific pairs in an editor renders a different glyph; correct flag collapses to O (or another marker glyph).
Signals: abnormally dense GSUB table in ttx output; glyph names like hex_a, hex_b, hex_c; ligature chains > 10 steps.
Mechanic: dump GSUB via ttx -t GSUB file.ttf. Ligature subtables define (glyph-A, glyph-B) → glyph-X replacements. Reverse the DAG: leaves you need to reach the marker glyph → input-byte pairs. Map one/zero glyph names back to bits or ascii_XX → characters. Automation:
from fontTools.ttLib import TTFont
f = TTFont("x.ttf")
gsub = f["GSUB"].table
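Reversing the ligature DAG can be sketched independently of fontTools once the rules are in hand. The example below assumes the ligature subtables have already been flattened into a hypothetical `rules` dict mapping component tuples to the resulting glyph, that every intermediate glyph has exactly one producing rule, and that leaf glyphs are named `ascii_XX` with `XX` in hex; all three are assumptions about the challenge, not properties of GSUB.

```python
def invert_rules(rules):
    """Map each result glyph back to the component tuple that produces it."""
    producers = {}
    for components, result in rules.items():
        if result in producers:
            raise ValueError(f"{result} has several producers; needs a search")
        producers[result] = components
    return producers

def expand(glyph, producers):
    """Recursively expand a glyph into the leaf sequence that collapses to it."""
    if glyph not in producers:
        return [glyph]                      # leaf: a glyph the user types
    leaves = []
    for part in producers[glyph]:
        leaves.extend(expand(part, producers))
    return leaves

def glyphs_to_text(leaves):
    """Decode hypothetical ascii_XX glyph names (XX in hex) to characters."""
    return "".join(chr(int(name.split("_")[1], 16)) for name in leaves)
```

Calling `expand` on the marker glyph and decoding the leaves yields the input that collapses to it; fonts with several producers per glyph need a backtracking search instead.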
AVX2 Lane-Wise Z3 Lifting (source: pwn.college AoP 2025 day 12)
Trigger: RE target uses vpaddb / vpsubb / vpblendvb / vpshufb in a tight loop operating on the input.
Signals: ymm* registers, vp* opcodes with imm8 shuffle masks, loop bound = input length.
Mechanic: lift each vector op into 32 (or 16) parallel BitVec8 operations over Z3. Treat each vp* as lane-parallel; vpshufb ymm0, ymm1, ymm2 → per-lane BitVec(ymm1[ymm2[i] & 0xf]). On ymm registers the shuffle never crosses the two 128-bit halves, and an index byte with its high bit set yields 0 for that output byte. Solver inverts the transformation in seconds. Don't try to re-express as scalar; stay in lanes.
Template:
from z3 import *
inp = [BitVec(f"b{i}", 8) for i in range(32)]
s = Solver(); s.add(final == expected); s.check()
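The lane-wise view can be prototyped with plain Python integers before bringing in a solver. The sketch below models vpshufb, vpaddb and vpsubb on 32-byte lists and inverts one hypothetical round of the form `vpaddb(vpshufb(inp, idx), key)`; it uses concrete arithmetic in place of Z3 BitVecs, and the round structure, `idx` and `key` are assumptions for illustration.

```python
def vpshufb_ymm(src, idx):
    """Byte shuffle on 32 lanes: indices stay inside each 128-bit half."""
    out = []
    for i in range(32):
        half = i & 0x10                     # 0 for the low half, 16 for the high half
        out.append(0 if idx[i] & 0x80 else src[half + (idx[i] & 0x0F)])
    return out

def vpaddb(a, b):
    return [(x + y) & 0xFF for x, y in zip(a, b)]

def vpsubb(a, b):
    return [(x - y) & 0xFF for x, y in zip(a, b)]

def invert_round(expected, idx, key):
    """Undo vpaddb(vpshufb(inp, idx), key); positions never selected stay None."""
    shuffled = vpsubb(expected, key)
    inp = [None] * 32
    for i in range(32):
        if idx[i] & 0x80:
            continue                        # this output byte was forced to zero
        inp[(i & 0x10) + (idx[i] & 0x0F)] = shuffled[i]
    return inp
```

When the shuffle mask is a permutation within each half, `invert_round` recovers every input byte directly; for non-invertible rounds the same per-lane functions describe the constraints a solver would need.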
CTF Reverse - Dynamic Analysis Tools
Frida (Dynamic Instrumentation)
Frida injects JavaScript into running processes for real-time hooking, tracing, and modification. Essential for anti-debug bypass, runtime inspection, and mobile RE.
Installation
pip install frida-tools frida
frida --version
Basic Function Hooking
Interceptor.attach(Module.findExportByName(null, "strcmp"), {
onEnter: function(args) {
this.arg0 = Memory.readUtf8String(args[0]);
this.arg1 = Memory.readUtf8String(args[1]);
console.log(`strcmp("${this.arg0}", "${this.arg1}")`);
},
onLeave: function(retval) {
console.log(` → ${retval}`);
}
});
frida -p $(pidof binary) -l hook.js
frida -f ./binary -l hook.js    # older frida-tools also need --no-pause; recent releases resume spawned processes by default
frida -f ./binary -e '
Interceptor.attach(Module.findExportByName(null, "strcmp"), {
onEnter(args) {
console.log("strcmp:", Memory.readUtf8String(args[0]), Memory.readUtf8String(args[1]));
}
});
'
Anti-Debug Bypass
Interceptor.attach(Module.findExportByName(null, "ptrace"), {
onEnter: function(args) {
this.request = args[0].toInt32();
},
onLeave: function(retval) {
if (this.request === 0) {
retval.replace(ptr(0));
console.log("[*] ptrace(TRACEME) bypassed");
}
}
});
var isDbg = Module.findExportByName("kernel32.dll", "IsDebuggerPresent");
Interceptor.attach(isDbg, {
onLeave: function(retval) {
retval.replace(ptr(0));
}
});
Interceptor.attach(Module.findExportByName(null, "clock_gettime"), {
  onEnter: function(args) {
    this.ts = args[1];  // struct timespec *; argument registers may be clobbered by the time onLeave runs
  },
  onLeave: function(retval) {
    Memory.writeU64(this.ts, 0);
    Memory.writeU64(this.ts.add(8), 0);
  }
});
Memory Scanning and Patching
Process.enumerateRanges('r--').forEach(function(range) {
Memory.scan(range.base, range.size, "66 6c 61 67 7b", {
onMatch: function(address, size) {
console.log("[FLAG] Found at:", address, Memory.readUtf8String(address, 64));
},
onComplete: function() {}
});
});
var addr = Module.findBaseAddress("binary").add(0x1234);
Memory.patchCode(addr, 2, function(code) {
var writer = new X86Writer(code, { pc: addr });
writer.putNop();
writer.putNop();
writer.flush();
});
Function Replacement
var checkFlag = Module.findExportByName(null, "check_flag");
Interceptor.replace(checkFlag, new NativeCallback(function(input) {
console.log("[*] check_flag called with:", Memory.readUtf8String(input));
return 1;
}, 'int', ['pointer']));
Tracing and Stalker
var targetAddr = Module.findExportByName(null, "main");
Stalker.follow(Process.getCurrentThreadId(), {
transform: function(iterator) {
var instruction;
while ((instruction = iterator.next()) !== null) {
if (instruction.mnemonic === "call") {
iterator.putCallout(function(context) {
console.log("CALL at", context.pc, "→", ptr(context.pc).readPointer());
});
}
iterator.keep();
}
}
});
r2frida (Radare2 + Frida Integration)
r2 frida://spawn/./binary
\ii
\il
\dt strcmp
\dc
\dm
Frida for Android/iOS
adb push frida-server /data/local/tmp/
adb shell "chmod 755 /data/local/tmp/frida-server && /data/local/tmp/frida-server &"
frida -U -f com.example.app -l hook_android.js    # older frida-tools also need --no-pause
Java.perform(function() {
var MainActivity = Java.use("com.example.app.MainActivity");
MainActivity.checkPassword.implementation = function(input) {
console.log("[*] checkPassword called with:", input);
var result = this.checkPassword(input);
console.log("[*] Result:", result);
return result;
};
});
Key insight: Frida excels where static analysis fails — obfuscated code, packed binaries, and runtime-generated data. Hook comparison functions (strcmp, memcmp, custom validators) to extract expected values without reversing the algorithm. Use Interceptor.attach for observation, Interceptor.replace for modification.
When to use: Anti-debugging bypass, extracting runtime-computed keys, hooking crypto functions to dump plaintext, mobile app analysis, packed binary inspection.
angr (Symbolic Execution)
angr automatically explores program paths to find inputs satisfying constraints. Solves many flag-checking binaries in minutes that take hours manually.
Installation
pip install angr
Basic Path Exploration
import angr
import claripy
proj = angr.Project('./binary', auto_load_libs=False)
FIND_ADDR = 0x401234
AVOID_ADDR = 0x401256
simgr = proj.factory.simgr()
simgr.explore(find=FIND_ADDR, avoid=AVOID_ADDR)
if simgr.found:
found = simgr.found[0]
print("Flag:", found.posix.dumps(0))
Symbolic Input with Constraints
import angr
import claripy
proj = angr.Project('./binary', auto_load_libs=False)
flag_len = 32
flag_chars = [claripy.BVS(f'flag_{i}', 8) for i in range(flag_len)]
flag = claripy.Concat(*flag_chars + [claripy.BVV(b'\n')])
state = proj.factory.entry_state(stdin=flag)
for c in flag_chars:
state.solver.add(c >= 0x20)
state.solver.add(c <= 0x7e)
state.solver.add(flag_chars[0] == ord('f'))
state.solver.add(flag_chars[1] == ord('l'))
state.solver.add(flag_chars[2] == ord('a'))
state.solver.add(flag_chars[3] == ord('g'))
state.solver.add(flag_chars[4] == ord('{'))
state.solver.add(flag_chars[flag_len-1] == ord('}'))
simgr = proj.factory.simgr(state)
simgr.explore(find=0x401234, avoid=0x401256)
if simgr.found:
found = simgr.found[0]
result = found.solver.eval(flag, cast_to=bytes)
print("Flag:", result.decode())
Hook Functions to Simplify Analysis
import angr
proj = angr.Project('./binary', auto_load_libs=False)
@proj.hook(0x401100, length=5)
def skip_printf(state):
pass
@proj.hook(0x401050, length=5)
def skip_sleep(state):
pass
class AlwaysSucceed(angr.SimProcedure):
def run(self):
return 1
proj.hook_symbol('check_license', AlwaysSucceed())
Exploring from Specific Address
state = proj.factory.blank_state(addr=0x401200)
state.regs.rdi = 0x600000
state.memory.store(0x600000, b"AAAA" + b"\x00" * 28)
simgr = proj.factory.simgr(state)
simgr.explore(find=0x401300, avoid=0x401350)
Common Patterns and Tips
state = proj.factory.entry_state(args=['./binary', flag_sym])
simgr.explore(
find=[0x401234, 0x401300],
avoid=[0x401256, 0x401400]
)
def is_successful(state):
stdout = state.posix.dumps(1)
return b"Correct" in stdout
def should_avoid(state):
stdout = state.posix.dumps(1)
return b"Wrong" in stdout
simgr.explore(find=is_successful, avoid=should_avoid)
simgr.explore(find=0x401234, avoid=0x401256, num_find=1)
simgr.use_technique(angr.exploration_techniques.DFS())
simgr.use_technique(angr.exploration_techniques.LengthLimiter(max_length=500))
Dealing with Path Explosion
simgr.use_technique(angr.exploration_techniques.DFS())
state.options.add(angr.options.ZERO_FILL_UNCONSTRAINED_MEMORY)
state.options.add(angr.options.ZERO_FILL_UNCONSTRAINED_REGISTERS)
import hashlib
class SHA256Hook(angr.SimProcedure):
def run(self, data, length, output):
concrete_data = self.state.solver.eval(
self.state.memory.load(data, self.state.solver.eval(length)),
cast_to=bytes
)
h = hashlib.sha256(concrete_data).digest()
self.state.memory.store(output, h)
proj.hook_symbol('SHA256', SHA256Hook())
angr CFG Recovery
cfg = proj.analyses.CFGFast()
print(f"Functions found: {len(cfg.functions)}")
for addr, func in cfg.functions.items():
if func.name == 'main':
print(f"main at {addr:#x}")
break
node = cfg.model.get_any_node(0x401234)
print("Predecessors:", [hex(p.addr) for p in cfg.model.get_predecessors(node)])
Key insight: angr works best on flag-checker binaries with clear success/failure paths. For complex binaries, hook expensive functions (crypto, I/O) and use DFS exploration. Start with the simplest approach (just find/avoid addresses) before adding constraints. If angr is slow, constrain input to printable ASCII and add known prefix.
When to use: Flag validators with branching logic, maze/path-finding binaries, constraint-heavy checks, automated binary analysis. Less effective for: heavy crypto, floating-point math, complex heap operations.
lldb (LLVM Debugger)
Primary debugger for macOS/iOS. Also works on Linux. Preferred for Swift/Objective-C and Apple platform binaries.
Basic Commands
lldb ./binary
(lldb) run
(lldb) b main
(lldb) b 0x401234
(lldb) breakpoint set -r "check.*"
(lldb) c
(lldb) si
(lldb) ni
(lldb) register read
(lldb) register write rax 0
(lldb) memory read 0x401000 -c 32
(lldb) x/s $rsi
(lldb) dis -n main
(lldb) image list
Scripting (Python)
import lldb
def hook_strcmp(debugger, command, result, internal_dict):
target = debugger.GetSelectedTarget()
process = target.GetProcess()
thread = process.GetSelectedThread()
frame = thread.GetSelectedFrame()
arg0 = frame.FindRegister("rdi").GetValueAsUnsigned()
arg1 = frame.FindRegister("rsi").GetValueAsUnsigned()
s0 = process.ReadCStringFromMemory(arg0, 256, lldb.SBError())
s1 = process.ReadCStringFromMemory(arg1, 256, lldb.SBError())
print(f'strcmp("{s0}", "{s1}")')
Key insight: Use lldb for macOS binaries (Mach-O), iOS apps, and when GDB isn't available. image list gives ASLR slide for PIE binaries. Scripting API is more structured than GDB's.
x64dbg (Windows Debugger)
Open-source Windows debugger with modern UI. Alternative to OllyDbg/WinDbg for Windows RE challenges.
Key Features
x64dbg.exe binary.exe
x32dbg.exe binary.exe
F2 → Toggle breakpoint
F7 → Step into
F8 → Step over
F9 → Run
Ctrl+G → Go to address
Ctrl+F → Find pattern in memory
Scripting
bp 0x401234
bplog 0x401234, "{s:[esp+4]}"
run
StepOver
Common CTF Workflow
- Set breakpoint on GetWindowTextA/MessageBoxA for GUI crackers
- Trace back from success/failure message
- Use Scylla plugin for IAT reconstruction on packed binaries
- Snowman decompiler plugin for quick pseudo-C
Key insight: x64dbg has built-in pattern scanning, hardware breakpoints, and conditional logging. For Windows CTF binaries, it's often faster than IDA/Ghidra for dynamic analysis. Use the xAnalyzer plugin for automatic function argument annotation.
Qiling Framework (Cross-Platform Emulation)
Qiling emulates binaries with OS-level support (syscalls, filesystem, registry). Built on Unicorn but adds the OS layer that Unicorn lacks.
Installation
pip install qiling
git clone https://github.com/qilingframework/rootfs
Basic Usage
from qiling import Qiling
from qiling.const import QL_VERBOSE
ql = Qiling(["./binary", "arg1"], "rootfs/x8664_linux",
verbose=QL_VERBOSE.DEFAULT)
ql.run()
ql = Qiling(["rootfs/x86_windows/bin/binary.exe"], "rootfs/x86_windows")
ql.run()
ql = Qiling(["rootfs/arm_linux/bin/binary"], "rootfs/arm_linux")
ql.run()
Anti-Debug Bypass via Emulation
from qiling import Qiling
ql = Qiling(["./binary"], "rootfs/x8664_linux")
def hook_ptrace(ql, ptrace_request, pid, addr, data):
ql.log.info("ptrace bypassed")
return 0
ql.os.set_syscall("ptrace", hook_ptrace)
def skip_check(ql):
ql.arch.regs.rax = 0
ql.log.info(f"Skipped check at {ql.arch.regs.rip:#x}")
ql.hook_address(skip_check, 0x401234)
ql.run()
Input Fuzzing with Qiling
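import string
from qiling import Qiling
from qiling.const import QL_VERBOSE
from qiling.extensions import pipe
def test_input(candidate):
    ql = Qiling(["./binary"], "rootfs/x8664_linux", verbose=QL_VERBOSE.DISABLED)
    # Current Qiling releases take no stdin= keyword; swap in in-memory streams instead.
    ql.os.stdin = pipe.SimpleInStream(0)
    ql.os.stdout = pipe.SimpleOutStream(1)
    ql.os.stdin.write(candidate.encode() + b"\n")
    ql.run()
    return ql.os.stdout.read()
for ch in string.printable:
    output = test_input("flag{" + ch)
    if b"Correct" in output:
        print(f"Found: {ch}")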
Advantages over GDB/Frida:
- No debugger artifacts (bypasses all anti-debug by default)
- Cross-platform without hardware (ARM, MIPS, RISC-V on x86 host)
- Scriptable with Python (faster iteration than GDB)
- Snapshot/restore for brute-forcing
When to use: Foreign architecture binaries, IoT firmware, heavy anti-debug, automated testing of many inputs.
Triton (Dynamic Symbolic Execution)
See tools-advanced.md for full Triton reference. Quick usage:
from triton import *
ctx = TritonContext(ARCH.X86_64)
for i in range(32):
ctx.symbolizeMemory(MemoryAccess(0x600000 + i, CPUSIZE.BYTE), f"flag_{i}")
model = ctx.getModel(ctx.getPathPredicate())  # older builds: getPathConstraintsAst()
flag = ''.join(chr(v.getValue()) for _, v in sorted(model.items()))
Best for: Single-path symbolic execution, deobfuscation, taint analysis. Faster than angr for linear code paths.
Intel Pin Instruction-Counting Side Channel (Hackover CTF 2015)
Pattern: Brute-force input character-by-character against a binary using Intel Pin's inscount0 tool. Each correct character causes deeper execution (more instructions) in the comparison logic.
import string
from subprocess import Popen, PIPE
pin = './pin'
tool = './source/tools/ManualExamples/obj-ia32/inscount0.so'
binary = './target'
key = ''
while True:
best_count, best_char = 0, ''
for c in string.printable:
cmd = [pin, '-injection', 'child', '-t', tool, '--', binary]
p = Popen(cmd, stdout=PIPE, stdin=PIPE, stderr=PIPE)
p.communicate((key + c + '\n').encode())
with open('inscount.out') as f:
count = int(f.read().split()[-1])
if count > best_count:
best_count, best_char = count, c
key += best_char
print(f"Found: {key}")
Key insight: Movfuscated binaries (compiled with movfuscator) expand every instruction into sequences of mov operations, making static analysis impractical. However, character-by-character comparison still creates measurable instruction count differences. Pin's inscount0.so counts total executed instructions — the correct character at each position causes ~1000+ more instructions (proceeding further in the comparison). Also works for obfuscated binaries with sequential input checks.
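The greedy loop is easy to get subtly wrong (no stop condition, ties picked at random), so it helps to test it against a simulated oracle before pointing it at Pin. In this sketch `make_oracle` is a stand-in for a real inscount run and its base and per-character costs are invented numbers; `recover` is the part that would carry over.

```python
import string

def make_oracle(secret, per_char=1000, base=50_000):
    """Simulated instruction counter: a longer matching prefix costs more."""
    def count(candidate):
        matched = 0
        for a, b in zip(candidate, secret):
            if a != b:
                break
            matched += 1
        return base + per_char * matched
    return count

def recover(count, max_len=64,
            alphabet=string.ascii_letters + string.digits + "_{}"):
    """Extend the key one character at a time using the count side channel."""
    key = ""
    while len(key) < max_len:
        scores = {c: count(key + c) for c in alphabet}
        best = max(scores, key=scores.get)
        if scores[best] == min(scores.values()):
            break                           # no character stands out: key is complete
        key += best
    return key
```

Real counts are noisier than this model, so in practice it may be worth repeating each measurement and comparing medians before accepting a character.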
CTF Reverse - Tools Reference
For dynamic instrumentation tools (Frida, angr, lldb, x64dbg), see tools-dynamic.md.
GDB
Basic Commands
gdb ./binary
run
start
b *0x401234
b *main+0x100
c
si
ni
x/s $rsi
x/20x $rsp
info registers
set $eax=0
PIE Binary Debugging
gdb ./binary
start
b *main+0xca
b *main+0x198
c
One-liner Automation
gdb -ex 'start' -ex 'b *main+0x198' -ex 'continue' ./binary
Memory Examination
x/s $rsi
x/38c $rsi
x/20x $rsp
x/10i $rip
Radare2
Basic Session
r2 -d ./binary
aaa
afl
pdf @ main
db 0x401234
dc
ood
dr
dr eax=0
r2pipe Automation
import r2pipe
r2 = r2pipe.open('./binary', flags=['-d'])
r2.cmd('aaa')
r2.cmd('db 0x401234')
for char in range(256):
r2.cmd('ood')
r2.cmd(f'dr eax={char}')
output = r2.cmd('dc')
if 'correct' in output:
print(f"Found: {chr(char)}")
Ghidra
Headless Analysis
analyzeHeadless /path/to/project tmp -import binary -postScript script.py
Emulator for Decryption
EmulatorHelper emu = new EmulatorHelper(currentProgram);
emu.writeRegister("RSP", 0x2fff0000);
emu.writeRegister("RBP", 0x2fff0000);
emu.writeMemory(dataAddress, encryptedBytes);
emu.writeRegister("RDI", arg1);
emu.setBreakpoint(returnAddress);
emu.run(functionEntryAddress);
byte[] decrypted = emu.readMemory(outputAddress, length);
MCP Commands
- Recon: list_functions, list_imports, list_strings
- Analysis: decompile_function, get_xrefs_to
- Annotation: rename_function, rename_variable
Unicorn Emulation
Basic Setup
from unicorn import *
from unicorn.x86_const import *
mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(0x400000, 0x10000)
mu.mem_write(0x400000, code_bytes)
mu.mem_map(0x7fff0000, 0x10000)
mu.reg_write(UC_X86_REG_RSP, 0x7fff0000 + 0xff00)
mu.emu_start(start_addr, end_addr)
Mixed-Mode (64 to 32) Switch
uc32 = Uc(UC_ARCH_X86, UC_MODE_32)
reg_map = {
UC_X86_REG_EAX: UC_X86_REG_RAX,
UC_X86_REG_EBX: UC_X86_REG_RBX,
UC_X86_REG_ECX: UC_X86_REG_RCX,
UC_X86_REG_EDX: UC_X86_REG_RDX,
UC_X86_REG_ESI: UC_X86_REG_RSI,
UC_X86_REG_EDI: UC_X86_REG_RDI,
UC_X86_REG_EBP: UC_X86_REG_RBP,
}
for e, r in reg_map.items():
uc32.reg_write(e, mu.reg_read(r) & 0xffffffff)
uc32.reg_write(UC_X86_REG_EFLAGS, mu.reg_read(UC_X86_REG_RFLAGS) & 0xffffffff)
for xr in [UC_X86_REG_XMM0, UC_X86_REG_XMM1, UC_X86_REG_XMM2, UC_X86_REG_XMM3,
UC_X86_REG_XMM4, UC_X86_REG_XMM5, UC_X86_REG_XMM6, UC_X86_REG_XMM7]:
uc32.reg_write(xr, mu.reg_read(xr))
Tip: set UC_IGNORE_REG_BREAK=1 to silence warnings on unimplemented regs.
Register Tracing Hook
def hook_code(uc, address, size, user_data):
if address == TARGET_ADDR:
rsi = uc.reg_read(UC_X86_REG_RSI)
print(f"0x{address:x}: rsi=0x{rsi:016x}")
mu.hook_add(UC_HOOK_CODE, hook_code)
Track Register Changes
prev_rsi = [None]
def hook_rsi_changes(uc, address, size, user_data):
rsi = uc.reg_read(UC_X86_REG_RSI)
if rsi != prev_rsi[0]:
print(f"0x{address:x}: RSI changed to 0x{rsi:016x}")
prev_rsi[0] = rsi
mu.hook_add(UC_HOOK_CODE, hook_rsi_changes)
Python Bytecode
Disassembly
import marshal, dis
with open('file.pyc', 'rb') as f:
f.read(16)
code = marshal.load(f)
dis.dis(code)
Extract Constants
for ins in dis.get_instructions(code):
    if ins.opname == 'LOAD_CONST':
        print(ins.argval)
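The loop above only walks the top-level code object; constants inside functions, lambdas and comprehensions live in nested code objects stored in `co_consts`. The following is a minimal sketch of a recursive walk, assuming the `.pyc` was produced by the same Python version that runs the script. The source string, the file name `checker.py` and the helper `iter_constants` are hypothetical and exist only for the demonstration.

```python
import importlib.util
import marshal
import types

def iter_constants(code):
    """Yield every non-code constant in `code` and in all nested code objects."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_constants(const)   # functions, lambdas, comprehensions
        else:
            yield const

# Build an in-memory .pyc-like blob (hypothetical checker source).
source = "def check(s):\n    return s == 'flag{demo}'\n"
code_obj = compile(source, "checker.py", "exec")
header = importlib.util.MAGIC_NUMBER + b"\x00" * 12   # magic + flags + mtime + size
pyc_bytes = header + marshal.dumps(code_obj)

# Parse it back the way a file read from disk would be handled.
magic = pyc_bytes[:4]
loaded = marshal.loads(pyc_bytes[16:])     # 16-byte header on Python 3.7+
constants = list(iter_constants(loaded))
print(constants)
```

Comparing `magic` with `importlib.util.MAGIC_NUMBER` is a quick way to tell whether the file was built by the interpreter currently running.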
Pyarmor Static Unpack (1shot)
Repository: https://github.com/Lil-House/Pyarmor-Static-Unpack-1shot
python /path/to/oneshot/shot.py /path/to/scripts
python /path/to/oneshot/shot.py /path/to/scripts -r /path/to/pyarmor_runtime.so
python /path/to/oneshot/shot.py /path/to/scripts -o /path/to/output
Notes:
- oneshot/pyarmor-1shot must exist before running shot.py.
- Supported focus: Pyarmor 8.x-9.x (PY + six digits header style).
- Pyarmor 7 and earlier (PYARMOR header) are out of scope.
- Disassembly output is generally reliable; decompiled source is experimental.
WASM Analysis
Decompile to C
wasm2c checker.wasm -o checker.c
gcc -O3 checker.c wasm-rt-impl.c -o checker
Common Patterns
- w2c_memory - Linear memory array
- wasm_rt_trap(N) - Runtime errors
- Function exports: flagChecker, validate
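Export names such as `validate` can be read straight from the module before decompiling, which helps locate the matching function in the wasm2c output (the exact C symbol naming depends on the wabt version). Below is a minimal sketch of an export-section parser using only the standard library. The helper names and the hand-built demo module are hypothetical, and the parser skips all validation that a real tool would perform.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7

EXPORT_KINDS = {0: "func", 1: "table", 2: "memory", 3: "global"}

def list_exports(wasm):
    """Return [(name, kind, index)] from the export section (id 7) of a module."""
    if wasm[:4] != b"\x00asm":
        raise ValueError("not a WebAssembly binary")
    pos, exports = 8, []                       # skip magic + 4-byte version
    while pos < len(wasm):
        section_id = wasm[pos]
        size, pos = read_uleb128(wasm, pos + 1)
        if section_id == 7:
            count, p = read_uleb128(wasm, pos)
            for _ in range(count):
                n, p = read_uleb128(wasm, p)
                name = wasm[p:p + n].decode("utf-8")
                kind = wasm[p + n]
                index, p = read_uleb128(wasm, p + n + 1)
                exports.append((name, EXPORT_KINDS.get(kind, kind), index))
        pos += size                            # jump to the next section
    return exports

# Hand-built module containing only an export section with one function export.
demo = (b"\x00asm\x01\x00\x00\x00"
        b"\x07\x0c" b"\x01" b"\x08validate" b"\x00\x00")
print(list_exports(demo))
```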
Android APK
Extraction
apktool d app.apk -o decoded/
jadx app.apk
unzip app.apk -d extracted/
Key Locations
res/values/strings.xml - String resources
AndroidManifest.xml - App metadata
classes.dex - Dalvik bytecode
assets/, res/raw/ - Resources
Search
grep -r "flag\|CTF" decoded/
strings decoded/classes*.dex | grep -i flag
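Because an APK is a ZIP archive, the same search can be run without extracting anything. Here is a minimal sketch using `zipfile`; the regular expression, the helper name `scan_apk` and the synthetic archive are assumptions for illustration. It matches raw ASCII/UTF-8 bytes only, so strings stored as UTF-16 in binary XML or `resources.arsc` may be missed.

```python
import io
import re
import zipfile

# Assumed flag shape: "flag{...}" or "CTF{...}", case-insensitive.
FLAG_RE = re.compile(rb"(?:flag|CTF)\{[^}]{1,80}\}", re.IGNORECASE)

def scan_apk(apk_file):
    """Return {entry_name: [matches]} for every archive member with a flag-like string."""
    hits = {}
    with zipfile.ZipFile(apk_file) as apk:
        for name in apk.namelist():
            found = FLAG_RE.findall(apk.read(name))   # read() decompresses the entry
            if found:
                hits[name] = found
    return hits

# Demonstration on a synthetic archive; a real APK path works the same way.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("classes.dex", b"dex\n035\x00...flag{in_dex}...")
    z.writestr("assets/config.bin", b"\x00\x01nothing here")
buf.seek(0)
demo_hits = scan_apk(buf)
print(demo_hits)
```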
Flutter APK (Blutter)
python3 blutter.py path/to/app/lib/arm64-v8a out_dir
HarmonyOS HAP/ABC (abc-decompiler)
Repository: https://github.com/ohos-decompiler/abc-decompiler
unzip app.hap -d hap_extracted/
Critical startup mode:
java -cp "./jadx-dev-all.jar" jadx.cli.JadxCLI [options] <input>
java -cp "./jadx-dev-all.jar" jadx.cli.JadxCLI -d "out" ".abc"
java -cp "./jadx-dev-all.jar" jadx.cli.JadxCLI -m simple --log-level ERROR -d "out_abc_simple" ".abc"
Notes:
- Start with -m simple --log-level ERROR.
- If auto fails, retry with -m simple first.
- Errors do not always mean total failure; check out_xxx/sources/.
- Use a fresh output directory per run.
.NET Analysis
Tools
- dnSpy - Debugging + decompilation (best)
- ILSpy - Decompiler
- dotPeek - JetBrains decompiler
NativeAOT
- Look for System.Private.CoreLib strings
- Type metadata present but restructured
- Search for length-prefixed UTF-16 patterns
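The length-prefixed search can be sketched as a scan over a dumped section. This assumes the usual .NET string object layout (a pointer-sized type reference, a 32-bit character count, then UTF-16LE characters) and 4-byte alignment; both are assumptions to verify against the target. The function name and the demo blob are hypothetical.

```python
import struct

def find_utf16_strings(blob, min_len=4, max_len=512):
    """Scan for [int32 length][UTF-16LE chars] records holding printable ASCII text."""
    results = []
    for off in range(0, len(blob) - 3, 4):          # assumed 4-byte alignment
        (length,) = struct.unpack_from("<i", blob, off)
        if not min_len <= length <= max_len:
            continue
        raw = blob[off + 4: off + 4 + 2 * length]
        if len(raw) != 2 * length:
            continue                                 # record runs past the end
        # Printable ASCII as UTF-16LE: low byte printable, high byte zero.
        if all(0x20 <= raw[i] < 0x7F and raw[i + 1] == 0
               for i in range(0, len(raw), 2)):
            results.append((off, raw.decode("utf-16-le")))
    return results

# Synthetic record: fake 8-byte type pointer, length 10, text, terminator, padding.
text = "flag{aot!}"
blob = (bytes.fromhex("f0debc9a78563412")
        + struct.pack("<i", len(text))
        + text.encode("utf-16-le")
        + b"\x00\x00" + b"\x00\x00")
found = find_utf16_strings(blob)
print(found)
```

Raising `min_len` cuts false positives on large dumps; the returned offsets point at the length field, so the type reference sits immediately before it.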
Two-Stage XOR + AES-CBC Decode Pattern (Codegate 2013)
Pattern: .NET binary stores an encrypted byte array that undergoes XOR decoding followed by AES-256-CBC decryption. The same key value serves as both the AES key and IV.
Steps:
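- Extract hardcoded byte array and key string from binary (dnSpy/ILSpy)
- XOR each byte (may be multi-pass, e.g., 0x25 then 0x58, equivalent to single 0x7D)
- Base64-decode the XOR result
- AES-256-CBC decrypt with RijndaelManaged using the extracted key as both Key and IV. Check BlockSize in the decompilation: the IV must be BlockSize/8 bytes long, so a 32-byte IV implies BlockSize = 256, which is Rijndael-256 rather than standard AES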
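from Crypto.Cipher import AES
from base64 import b64decode
data = bytearray(encrypted_bytes)   # byte array extracted from the binary
for i in range(len(data)):
    data[i] ^= 0x7D                 # 0x25 and 0x58 folded into one pass
ct = b64decode(bytes(data))
key = b"9e2ea73295c7201c5ccd044477228527"   # 32 ASCII bytes
# PyCryptodome's AES accepts only a 16-byte IV, so iv=key (32 bytes) raises ValueError.
# Assumption: the IV is the first 16 bytes of the key, valid only for BlockSize = 128.
# For BlockSize = 256 use a Rijndael implementation with 256-bit blocks, or decrypt in .NET.
cipher = AES.new(key, AES.MODE_CBC, iv=key[:16])
plaintext = cipher.decrypt(ct)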
Key insight: When RijndaelManaged appears in .NET decompilation, check if Key and IV are set to the same value — this is a common CTF pattern. The XOR stage often serves as a simple obfuscation layer before the real crypto.
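The XOR stage is easy to check in isolation. Since XOR is associative, two single-byte passes collapse into one pass with the XOR of the two keys. This is a minimal sketch using only the standard library; the sample plaintext and the helper `xor_bytes` are hypothetical stand-ins for the challenge data.

```python
from base64 import b64decode, b64encode

def xor_bytes(data, key_byte):
    """XOR every byte of `data` with a single-byte key."""
    return bytes(b ^ key_byte for b in data)

# Hypothetical stand-in for the stored array: Base64 text hidden behind two XOR passes.
inner = b64encode(b"sixteen byte blk")
stored = xor_bytes(xor_bytes(inner, 0x58), 0x25)

two_pass = xor_bytes(xor_bytes(stored, 0x25), 0x58)   # as written in the binary
one_pass = xor_bytes(stored, 0x25 ^ 0x58)             # folded key, 0x7D
recovered = b64decode(one_pass)
print(hex(0x25 ^ 0x58), two_pass == one_pass, recovered)
```

If the result of the XOR stage is not valid Base64, the key bytes or the pass order were probably read wrong from the decompilation.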
Packed Binaries
UPX
upx -d packed -o unpacked
strings binary | grep UPX
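The `strings | grep` check can also be scripted when triaging many files. This is a minimal sketch that looks for the byte markers UPX normally leaves behind; the marker list is an assumption based on unmodified UPX output, and the helper name is hypothetical. Challenge authors sometimes overwrite the `UPX!` magic so that `upx -d` refuses the file, in which case only some of the markers will be present.

```python
def upx_markers(data):
    """Return the UPX-related byte markers present in a binary image."""
    markers = (
        b"UPX!",                                     # pack header magic
        b"UPX0", b"UPX1",                            # section names in packed PE files
        b"$Info: This file is packed with the UPX",  # banner string
    )
    return [m for m in markers if m in data]

# Synthetic images for demonstration; read a real file with open(path, "rb").read().
packed_like = b"\x7fELF" + b"\x00" * 32 + b"UPX!" + b"\x00" * 16
plain = b"\x7fELF" + b"\x00" * 64
print(upx_markers(packed_like), upx_markers(plain))
```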
Custom Packers
- Set breakpoint after unpacking stub
- Dump memory
- Fix PE/ELF headers
PyInstaller
python pyinstxtractor.py binary.exe
LLVM IR
Convert to Assembly
llc task.ll --x86-asm-syntax=intel
gcc -c task.s -o file.o
RISC-V Binary Analysis (EHAX 2026)
Pattern (iguessbro): Statically linked, stripped RISC-V ELF binary. Can't run natively on x86.
Disassembly with Capstone:
from capstone import *
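with open('binary', 'rb') as f:
    code = f.read()
md = Cs(CS_ARCH_RISCV, CS_MODE_RISCVC | CS_MODE_RISCV64)
md.detail = True
# Assumes .text sits at file offset 0x10000 and loads at the same address;
# confirm both values with readelf -S, since they often differ.
TEXT_OFFSET = 0x10000
for insn in md.disasm(code[TEXT_OFFSET:], TEXT_OFFSET):
    print(f"0x{insn.address:x}:\t{insn.mnemonic}\t{insn.op_str}")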
Common RISC-V patterns:
li a0, N → load immediate (argument setup)
mv a0, s0 → register move
call offset → function call (auipc + jalr pair)
beq/bne a0, zero, label → conditional branch
sd/ld → 64-bit store/load
addiw → 32-bit add (W-suffix = word operations)
Key differences from x86:
- No flags register — comparisons are inline with branch instructions
- Arguments in a0-a7 (not rdi/rsi/rdx)
- Return value in a0
- Saved registers s0-s11 (callee-saved)
- Compressed instructions (2 bytes) mixed with standard (4 bytes); use CS_MODE_RISCVC
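The mix of 2-byte and 4-byte instructions is decided by the two lowest bits of each instruction: `11` marks a standard 32-bit encoding, anything else a compressed one. Below is a minimal sketch of that rule, useful for sanity-checking where instruction boundaries fall when a disassembler desynchronises. The function name is hypothetical, and encodings longer than 32 bits are ignored.

```python
def riscv_insn_lengths(code):
    """Split a byte string into (offset, length) pairs using the low-bit length rule."""
    out, pos = [], 0
    while pos + 2 <= len(code):
        low = code[pos]                          # little-endian: first byte holds bits [7:0]
        length = 4 if (low & 0b11) == 0b11 else 2
        if pos + length > len(code):
            break                                # truncated trailing instruction
        out.append((pos, length))
        pos += length
    return out

# c.nop (0x0001), addi a0, zero, 1 (0x00100513), c.jr ra (0x8082)
demo = bytes.fromhex("0100" "13051000" "8280")
layout = riscv_insn_lengths(demo)
print(layout)
```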
Anti-RE tricks in RISC-V:
- Fake flags as string constants (check for "n0t_th3_r34l" patterns)
- Timing anti-brute-force (rdtime instruction)
- XOR decryption with incremental key:
decrypted[i] = enc[i] ^ (key & 0xFF) ^ 0xA5; key += 7
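The incremental-key XOR from the last item can be sketched as follows. Only the low byte of the key affects the output, because the addition wraps modulo 256, so an unknown start key can be recovered by trying 256 candidates against a known prefix. The start value `0x1337`, the sample plaintext and both function names are hypothetical.

```python
def decrypt_incremental_xor(enc, key, step=7, mask=0xA5):
    """decrypted[i] = enc[i] ^ (key & 0xFF) ^ mask, with key += step per byte."""
    out = bytearray()
    for byte in enc:
        out.append(byte ^ (key & 0xFF) ^ mask)
        key += step
    return bytes(out)

def recover_key_low_byte(enc, known_prefix):
    """Return every low-byte start key whose output begins with `known_prefix`."""
    return [k for k in range(256)
            if decrypt_incremental_xor(enc[:len(known_prefix)], k) == known_prefix]

# XOR is its own inverse, so the same routine produces the test ciphertext.
plain = b"flag{riscv}"
enc = decrypt_incremental_xor(plain, 0x1337)
candidates = recover_key_low_byte(enc, b"flag{")
print(candidates, decrypt_incremental_xor(enc, candidates[0]))
```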
Emulation: qemu-riscv64 ./binary runs a statically linked binary directly; add -L /usr/riscv64-linux-gnu/ (cross-toolchain sysroot) only for dynamically linked targets.
Binary Ninja
Interactive disassembler/decompiler with rapid community growth.
Decompilation outputs: High-Level Intermediate Language (HLIL), pseudo-C, pseudo-Rust, pseudo-Python.
binaryninja binary
import binaryninja
bv = binaryninja.open_view("binary")
for func in bv.functions:
    print(func.name, hex(func.start))
    print(func.hlil)
Community plugins: Available via Plugin Manager (Ctrl+Shift+P → "Plugin Manager").
Free version: https://binary.ninja/free/ (cloud-based, limited features).
Advantages over Ghidra: Faster startup, cleaner IL representations, better Python API for scripting.
Decompiler Comparison with dogbolt.org
dogbolt.org runs multiple decompilers simultaneously on the same binary and shows results side-by-side.
Supported decompilers: Hex-Rays (IDA), Ghidra, Binary Ninja, angr, RetDec, Snowman, dewolf, Reko, Relyze.
When to use:
- Decompiler output is confusing — compare with alternatives for clarity
- One decompiler mishandles a construct — another may get it right
- Quick triage without installing every tool locally
- Validate decompiler correctness by cross-referencing outputs
curl -F "file=@binary" https://dogbolt.org/api/binaries/
Key insight: Different decompilers excel at different constructs. When one produces unreadable output, another often generates clearer pseudocode. Cross-referencing catches decompiler bugs.
Useful Commands
file binary
checksec --file=binary
rabin2 -I binary
strings binary | grep -iE "flag|secret"
rabin2 -z binary
readelf -S binary
objdump -h binary
nm binary
readelf -s binary
objdump -d binary
objdump -M intel -d binary