| name | malware-analysis-orchestrator |
| description | Structured malware triage and reverse-engineering orchestration for PE, ELF, and Mach-O binaries with strict artifact dumping to a status folder. Use when requests involve malware sample analysis, strings triage, API/import analysis, behavioral hypothesis generation, component mapping, deep-analysis planning, Binary Ninja MCP usage, Ghidra MCP usage, or role-based orchestration (orchestrator/planner/reporter) with complete intermediate outputs. |
Malware Analysis Orchestrator
Overview
Execute a static-first malware workflow for PE, ELF, and Mach-O samples and always write detailed intermediate artifacts to a per-sample case directory under status/.
Case Directory Convention
Each analysis gets a case directory at status/<NNN>-<filename>, e.g. status/000-malware.bin. When the same sample is analyzed again, the existing case is reused. To force a fresh case (e.g. after re-packing or for a second opinion), pass --new to init_status_tree.sh, which increments the number: status/001-malware.bin.
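The reuse-or-increment rule above can be sketched in Python. This is only an illustration, not the logic of init_status_tree.sh itself: the function name resolve_case_dir is hypothetical, and it assumes three-digit zero-padded numbers that increment across every case in the status folder.

```python
import re
from pathlib import Path


def resolve_case_dir(status_root: Path, sample_name: str, new: bool = False) -> Path:
    """Return the case directory for sample_name, creating one if needed.

    Assumption: numbers are zero-padded to three digits and shared by all
    cases under status_root (the real script may number differently).
    """
    pattern = re.compile(r"^(\d{3})-(.+)$")
    cases = []  # (number, sample name, path) for every existing case
    if status_root.is_dir():
        for entry in status_root.iterdir():
            match = pattern.match(entry.name)
            if entry.is_dir() and match:
                cases.append((int(match.group(1)), match.group(2), entry))

    # Reuse the latest case for this sample unless a fresh one is forced.
    existing = sorted(
        (case for case in cases if case[1] == sample_name),
        key=lambda case: case[0],
    )
    if existing and not new:
        return existing[-1][2]

    next_number = max((case[0] for case in cases), default=-1) + 1
    case_dir = status_root / f"{next_number:03d}-{sample_name}"
    case_dir.mkdir(parents=True)
    return case_dir
```

Calling resolve_case_dir(Path("status"), "malware.bin") twice returns the same directory; adding new=True mirrors the --new flag and yields the next number.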
Supported binary formats:
- PE (Windows) -- .exe, .dll, .sys
- ELF (Linux) -- executables, shared objects, kernel modules
- Mach-O (macOS) -- executables, dylibs, bundles
Use role-separated execution:
- Orchestrator role: drive phases, enforce artifact completeness, schedule parallel collection.
- Planning role: refine investigation priorities from intermediate evidence.
- Reporting role: produce high-level and technical summaries with traceable evidence.
Load these references before execution:
references/workflow.md
references/artifact-spec.md
references/agent-roles.md
references/interesting-signals.md
references/deep-analysis-checklist.md
Claude-Specific Guidance
Claude reasons about evidence directly instead of delegating all thinking to scripts.
- Scripts are data-collection helpers. They extract raw strings, imports, and metadata. They produce baseline rankings and template hypotheses.
- Claude does hypothesis generation, planning, and reporting natively. After scripts run, read their output and apply your own reasoning. Correlate evidence across artifacts, generate richer hypotheses, and write deeper analysis than the script templates provide.
- build_hypothesis.py is optional. It produces baseline hypotheses as a starting point. Claude should refine, extend, or replace them with contextual reasoning grounded in the collected evidence.
- Format detection is automatic. Scripts use the file command to detect PE/ELF/Mach-O and branch accordingly. The --format flag can override auto-detection.
- Case directories are automatic. init_status_tree.sh creates and prints the case directory path. Subsequent scripts resolve the latest case for a sample automatically, or accept an explicit case_dir argument.
Required Rules
- Run init_status_tree.sh <sample> to create the case directory before analysis.
- Use the printed case directory path for all subsequent --status-dir arguments.
- Dump all intermediate outputs, not just summaries.
- Keep raw artifacts and interpreted artifacts separate.
- Include evidence references in every conclusion.
- If a tool is missing, use the documented fallback and record the gap in <case_dir>/INDEX.md.
- Do not skip required artifact files listed in references/artifact-spec.md.
Quick Start
Run these scripts in order:
CASE_DIR=$(scripts/init_status_tree.sh <sample_path> | tail -1 | awk '{print $NF}')
scripts/collect_strings.sh <sample_path>
scripts/collect_imports.sh <sample_path>
scripts/scan_yara.sh <sample_path>
scripts/scan_capa.sh <sample_path>
python3 scripts/rank_signals.py --status-dir "$CASE_DIR"
python3 scripts/build_hypothesis.py --status-dir "$CASE_DIR"
python3 scripts/update_state.py --status-dir "$CASE_DIR" --phase triage_complete
To force a new case for the same sample: scripts/init_status_tree.sh <sample_path> --new
After quick start, perform deep-analysis planning using references/deep-analysis-checklist.md and write:
<case_dir>/06_component_inventory.md
<case_dir>/07_interaction_model.md
<case_dir>/08_deep_analysis_plan.md
<case_dir>/09_priority_queue.md
<case_dir>/10_reporting_draft.md
Then update state:
python3 scripts/update_state.py --status-dir "$CASE_DIR" --phase planning_complete
Workflow Decision Tree
- Detect binary format using file command output.
- If format is PE: use PE-specific tooling and signal rules.
- If format is ELF: use ELF-specific tooling (readelf, nm) and signal rules.
- If format is Mach-O: use rabin2 for metadata and imports, apply Mach-O signal rules.
- If format is unrecognized: record unsupported format in profile and attempt generic strings/entropy analysis.
- If imports are packed/obfuscated: continue with strings and static structure, then prioritize unpacking in deep plan.
- If clear C2, persistence, or kernel indicators are present: raise priority and move them to top of <case_dir>/09_priority_queue.md.
- If capa is available and format is PE or ELF: run capa scan for ATT&CK-mapped capability identification.
- If a disassembler is available: include disassembly/decompilation artifacts and xref tables, preferring Binary Ninja MCP, then Ghidra MCP, then r2.
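The first branch of the decision tree can be sketched as follows. This is a minimal illustration under assumptions: the function names are hypothetical, and the substring checks reflect typical file descriptions ("PE32", "ELF", "Mach-O"), not the matching rules the bundled scripts actually apply.

```python
import subprocess


def classify_file_output(description: str) -> str:
    """Map a `file` description string to PE, ELF, Mach-O, or unknown."""
    text = description.lower()
    if "mach-o" in text:
        return "Mach-O"
    if text.startswith("elf"):
        return "ELF"
    if "pe32" in text:  # matches both PE32 and PE32+
        return "PE"
    return "unknown"


def detect_format(sample_path: str) -> str:
    """Run `file -b` (description only, no file name) and classify it."""
    result = subprocess.run(
        ["file", "-b", sample_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return classify_file_output(result.stdout.strip())
```

An "unknown" result corresponds to the unsupported-format branch: record it in the profile and continue with generic strings and entropy analysis.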
Tooling and Fallbacks
All tools below are expected to be installed in the analysis environment (Kali Linux). If one is missing, follow the fallback order and record the gap.
Fingerprinting and Profiling
file -- format detection
sha256sum, md5sum -- cryptographic hashes
ssdeep -- fuzzy hashing
die / diec (detect-it-easy) -- packer, compiler, linker detection
rabin2 -I -- binary metadata (universal, all formats)
r2 /ca -- crypto constant / expanded key detection (AES, SM4)
yara + bundled rules (scan_yara.sh) -- automated detection of:
- Crypto algorithms and constants (124 rules)
- Anti-debug and anti-VM techniques (64 rules)
- Malicious capabilities (55 rules)
- Packer/compiler signatures (36 rules)
Rules sourced from Yara-Rules/rules (GPL-2.0), stored in assets/yara_rules/.
capa (Mandiant) -- automated capability identification mapped to MITRE ATT&CK and MBC.
Supports PE and ELF. Raw JSON saved to tool-logs/capa.json, summary appended to profile.
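For the cryptographic hashes in the fingerprinting list, the values printed by sha256sum and md5sum can be cross-checked with the Python standard library. The sketch below is illustrative only: hash_sample is a hypothetical helper, and ssdeep fuzzy hashing is left out because it needs a third-party binding.

```python
import hashlib
from pathlib import Path


def hash_sample(sample_path: Path, chunk_size: int = 1 << 20) -> dict:
    """Compute SHA-256 and MD5 in one pass without loading the whole file."""
    sha256 = hashlib.sha256()
    md5 = hashlib.md5()
    with sample_path.open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha256.update(chunk)
            md5.update(chunk)
    return {"sha256": sha256.hexdigest(), "md5": md5.hexdigest()}
```

The returned hex digests should match the first column of sha256sum and md5sum output for the same file.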
String Extraction
strings -a -- raw ASCII strings from the whole file (add -e l for UTF-16LE strings)
rabin2 -zz -- section-aware strings
floss (FLARE) -- stack and decoded strings
Import / Symbol Collection
rabin2 -i -- imports (universal, all formats)
rabin2 -l -- linked libraries (universal)
- PE: objdump -p (DLL name + symbol parsing)
- ELF: readelf --dyn-syms, readelf -d, nm -D
- Mach-O: rabin2 -i and rabin2 -l (primary; no otool in this environment)
Disassembly and Decompilation
- Binary Ninja MCP server -- primary tool for disassembly, decompilation, xrefs, patching, scripting, and more. Interact via MCP tool calls (tool prefix mcp__binary_ninja_headless_mcp__*, derived from the binary_ninja_headless_mcp key in .mcp.json), NOT by importing the Python API directly (import binaryninja).
- Ghidra MCP server -- secondary tool for disassembly, decompilation, xrefs, patching, scripting, and more. Interact via MCP tool calls (tool prefix mcp__ghidra_headless_mcp__*, derived from the ghidra_headless_mcp key in .mcp.json), NOT by importing pyghidra directly. Use when Binary Ninja MCP is unavailable.
- r2 (radare2 CLI) -- fallback disassembly and scripted analysis when neither Binary Ninja MCP nor Ghidra MCP is available
Deep Analysis Tools (available for manual/scripted use)
binwalk -- entropy analysis, embedded file detection
xxd -- hex dumps
gdb / gdb-multiarch -- dynamic debugging
strace, ltrace -- syscall/library call tracing
capstone (Python) -- programmatic disassembly
ropper -- gadget finding
unblob -- recursive extraction
upx -- UPX unpacking
qemu-user -- cross-architecture emulation
yq, jq -- structured data processing
Fallback Order
- Strings: strings -> rabin2 -zz -> floss -> note missing source.
- Imports: rabin2 -i -> format-specific tools -> note missing source.
- Metadata: rabin2 -I -> readelf -h (ELF) / objdump -f (PE) -> note gap.
- Disassembly: Binary Ninja MCP -> Ghidra MCP -> r2.
- Capability detection: capa -> YARA capabilities rules -> manual string/API analysis -> note gap.
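One way to apply a fallback chain is to walk the tool list in order and remember what was skipped, so the gap can be noted in <case_dir>/INDEX.md. The sketch below covers only two of the chains, and both pick_tool and gap_note are hypothetical helpers, as is the wording of the note; it assumes availability means the executable is on PATH.

```python
import shutil
from typing import Callable, Dict, List, Optional, Tuple

# Tool order per category, copied from the fallback list above.
FALLBACK_ORDER: Dict[str, List[str]] = {
    "strings": ["strings", "rabin2", "floss"],
    "capabilities": ["capa", "yara"],
}


def pick_tool(
    category: str,
    is_available: Callable[[str], object] = shutil.which,
) -> Tuple[Optional[str], List[str]]:
    """Return the first available tool and the tools skipped before it."""
    skipped: List[str] = []
    for tool in FALLBACK_ORDER[category]:
        if is_available(tool):
            return tool, skipped
        skipped.append(tool)
    return None, skipped


def gap_note(category: str, chosen: Optional[str], skipped: List[str]) -> str:
    """Format one line suitable for appending to <case_dir>/INDEX.md."""
    used = chosen if chosen else "none (manual analysis required)"
    missing = ", ".join(skipped) if skipped else "none"
    return f"- {category}: used {used}; missing tools: {missing}"
```

Passing is_available as a parameter keeps the chain testable without the tools installed; by default it checks PATH through shutil.which.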
Role-Oriented Execution Model
Orchestrator
- Initialize case directory and sample profile.
- Run independent collection in parallel where possible.
- Gate phase progression on artifact completeness.
- Hand off to planner when triage artifacts are present.
Planner
- Consume strings/API signal outputs.
- Build behavior hypotheses with confidence and evidence.
- Define component-focused deep-analysis sequence.
- Update priority queue after each new intermediate result.
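Re-ranking the priority queue after each new result could look like the sketch below. The QueueItem fields, the category labels, and the 0-to-1 score are assumptions made for illustration. The only rule taken from this document is that C2, persistence, and kernel indicators move to the top.

```python
from dataclasses import dataclass
from typing import List

# Categories that the decision tree says must be raised to the top.
TOP_PRIORITY = {"c2", "persistence", "kernel"}


@dataclass
class QueueItem:
    title: str
    category: str  # hypothetical label such as "c2" or "packing"
    score: float   # hypothetical ranking between 0 and 1


def order_queue(items: List[QueueItem]) -> List[QueueItem]:
    """Put top-priority categories first, then sort by descending score."""
    return sorted(
        items,
        key=lambda item: (item.category not in TOP_PRIORITY, -item.score),
    )
```

Because the sort key is recomputed on every call, the planner can append a new item and call order_queue again before rewriting 09_priority_queue.md.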
Reporter
- Build high-level overview for analysts.
- Build technical drill-down with evidence pointers.
- Capture open questions, unknowns, and next probes.
Artifact Discipline
Follow references/artifact-spec.md exactly.
Minimum required artifacts (all relative to the case directory):
00_sample_profile.md
01_strings_raw.txt
02_strings_interesting.md
03_imports_raw.txt
04_imports_interesting.md
05_behavior_hypotheses.md
06_component_inventory.md
07_interaction_model.md
08_deep_analysis_plan.md
09_priority_queue.md
10_reporting_draft.md
INDEX.md
CURRENT_STATE.json
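A completeness gate over this list can be sketched as a simple existence check. The file names are the ones listed above; missing_artifacts is a hypothetical helper, and it only tests that each file exists, not that its content satisfies references/artifact-spec.md.

```python
from pathlib import Path
from typing import List

# Minimum required artifacts, relative to the case directory.
REQUIRED_ARTIFACTS = [
    "00_sample_profile.md",
    "01_strings_raw.txt",
    "02_strings_interesting.md",
    "03_imports_raw.txt",
    "04_imports_interesting.md",
    "05_behavior_hypotheses.md",
    "06_component_inventory.md",
    "07_interaction_model.md",
    "08_deep_analysis_plan.md",
    "09_priority_queue.md",
    "10_reporting_draft.md",
    "INDEX.md",
    "CURRENT_STATE.json",
]


def missing_artifacts(case_dir: Path) -> List[str]:
    """Return the required artifact names that are not regular files yet."""
    return [name for name in REQUIRED_ARTIFACTS if not (case_dir / name).is_file()]
```

An empty return value means the existence part of the gate passes; the orchestrator would still need to confirm that the files are up to date.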
Binary Ninja MCP Guidance
Use Binary Ninja exclusively through its MCP server. Do NOT use import binaryninja directly. The MCP server exposes capabilities as tool calls that Claude can invoke directly, including:
- Disassembly and decompilation (HLIL/MLIL/LLIL)
- Cross-references (API xrefs, string xrefs, call graphs)
- Patching (binary modification, NOP-ing, instruction replacement)
- Scripting (run Binary Ninja Python snippets server-side for complex or custom analysis)
- Type and structure recovery
- Symbol and annotation management
If the Binary Ninja MCP server is available, include these artifacts under subdirectories of the case directory:
disassembly/ for function disassembly.
decompilation/ for HLIL/MLIL outputs.
xrefs/ for API and string cross-references.
patches/ for any binary patches applied during analysis.
Capture high-value functions first:
- Entry point and initialization chain.
- Persistence and service management functions.
- Network connect/send/recv and protocol handlers.
- Process injection and memory manipulation functions.
- Kernel/device communication and IOCTL handlers.
Ghidra MCP Guidance
Use Ghidra exclusively through its MCP server. Do NOT use import pyghidra directly. The MCP tool prefix is mcp__ghidra_headless_mcp__*. The MCP server exposes capabilities as tool calls that Claude can invoke directly, including:
- Disassembly and decompilation (C-like output, P-code)
- Cross-references (API xrefs, string xrefs, call graphs)
- Patching (binary modification, NOP-ing, instruction replacement, branch inversion)
- Scripting (run Ghidra scripts server-side via ghidra_eval, ghidra_call, and ghidra_script for complex or custom analysis)
- Type and structure recovery (struct creation, enum definition, C type parsing)
- Symbol and annotation management (labels, comments, bookmarks, tags)
- Graph extraction (basic blocks, CFG edges, call paths)
If the Ghidra MCP server is available (and Binary Ninja MCP is not), include these artifacts under subdirectories of the case directory:
disassembly/ for function disassembly.
decompilation/ for decompiler outputs.
xrefs/ for API and string cross-references.
patches/ for any binary patches applied during analysis.
Capture high-value functions first:
- Entry point and initialization chain.
- Persistence and service management functions.
- Network connect/send/recv and protocol handlers.
- Process injection and memory manipulation functions.
- Kernel/device communication and IOCTL handlers.
Completion Criteria
Declare analysis complete only when:
- All required artifacts exist and are updated.
- Hypotheses include confidence scores and evidence.
- Deep-analysis plan has ordered steps and expected outputs.
- Reporting draft contains both executive overview and technical map.
Expected Deliverable Style
- Keep claims falsifiable and evidence-linked.
- Distinguish observed facts from inferred hypotheses.
- Track uncertainties explicitly.
- Keep artifacts machine-parsable where practical.
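The style rules above suggest a small machine-parsable record for each claim. The sketch below is one possible shape, not the schema defined in references/artifact-spec.md: the field names, the 0-to-1 confidence range, and the example evidence reference are all assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Hypothesis:
    claim: str
    confidence: float                 # assumed range: 0.0 to 1.0
    evidence: List[str] = field(default_factory=list)  # artifact references
    status: str = "inferred"          # "observed" fact or "inferred" hypothesis

    def problems(self) -> List[str]:
        """List rule violations; an empty list means the record is usable."""
        issues = []
        if not 0.0 <= self.confidence <= 1.0:
            issues.append("confidence must be between 0 and 1")
        if not self.evidence:
            issues.append("at least one evidence reference is required")
        if self.status not in ("observed", "inferred"):
            issues.append("status must be 'observed' or 'inferred'")
        return issues

    def to_json(self) -> str:
        """Serialize with stable key order so diffs stay readable."""
        return json.dumps(asdict(self), sort_keys=True)


example = Hypothesis(
    claim="Sample persists via a Run key",
    confidence=0.7,
    evidence=["02_strings_interesting.md#run-key"],  # hypothetical reference
)
```

A record with a non-empty problems() list should not be promoted into 10_reporting_draft.md until the missing evidence or score is supplied.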