name	malware-analysis
description	Analyze suspicious files through triage, static, dynamic, code-analysis, and classification phases to produce IOCs, YARA/Sigma/Snort rules, and MITRE ATT&CK mappings
license	MIT
metadata	{"category":"incident-response","locale":"en","phase":"v1"}

What this skill does

Runs a five-phase malware analysis pipeline on a suspicious file: triage (file type, hashes, VirusTotal lookup, YARA), static analysis (PE/ELF headers, imports, strings, packer detection), dynamic analysis (sandbox detonation, process/network/filesystem monitoring), code analysis (disassembly, decompilation, algorithm/C2 identification), and classification (malware family, MITRE ATT&CK TTP mapping). Produces a structured report with IOCs, detection rules, and remediation guidance.

When to use

Analyzing a suspicious binary, script, or document during incident response
Performing malware reverse engineering for threat intelligence
Producing YARA/Sigma/Snort detection rules from a malware sample
Mapping an unknown sample to a known malware family or APT toolset

Prerequisites

file, sha256sum, md5sum, sha1sum, strings (standard Linux utilities)
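yara command-line tool (used in Step 1.4): sudo apt install yara; the Python bindings are optional: pip install yara-python
FLOSS (FLARE Obfuscated String Solver, formerly FireEye Labs Obfuscated String Solver): https://github.com/mandiant/flare-floss
DIE (Detect It Easy; the console binary is diec): sudo apt install detect-it-easy or https://github.com/horsicq/Detect-It-Easy
ssdeep: sudo apt install ssdeep
python3 with pefile for PE header analysis: pip install pefile
oletools for document analysis: pip install oletools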
Ghidra or IDA Pro for code analysis (optional, for Phase 4)
x64dbg (Windows) for dynamic API tracing (optional)
ANY.RUN or Cuckoo sandbox access for dynamic analysis (optional)
Environment variable VT_API_KEY for VirusTotal lookups (optional; without it the workflow prints a manual lookup link)

Inputs

Variable	Description	Example
`SAMPLE_PATH`	Path to the suspicious file	`/tmp/suspicious.exe`
`VT_API_KEY`	VirusTotal API key (optional)	`abc123...`
`YARA_RULES_DIR`	Directory containing YARA rule files (optional)	`/opt/yara-rules/`
`SANDBOX_URL`	Cuckoo sandbox API URL (optional)	`http://cuckoo.local:8090`

Workflow

Phase 1: Triage

Step 1.1 – 1.2: Setup, file identification, and hash calculation

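# Honor SAMPLE_PATH from the Inputs table first, then the SECSKILL_SAMPLE_PATH fallback
SAMPLE_PATH="${SAMPLE_PATH:-${SECSKILL_SAMPLE_PATH:-}}"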
if [ -z "$SAMPLE_PATH" ]; then
  read -rp "Enter path to sample file: " SAMPLE_PATH
fi
[ ! -f "$SAMPLE_PATH" ] && echo "[!] File not found: $SAMPLE_PATH" && exit 1

SAMPLE_NAME=$(basename "$SAMPLE_PATH")
WORK_DIR="/tmp/malware-analysis-$(date +%s)"
mkdir -p "$WORK_DIR"
cp "$SAMPLE_PATH" "$WORK_DIR/$SAMPLE_NAME"

echo "=== Malware Analysis: $SAMPLE_NAME | $(date -u '+%Y-%m-%d %H:%M UTC') ==="
echo "[File identification]"; file "$SAMPLE_PATH"
# diec is the console build of Detect It Easy; die starts the GUI
command -v diec &>/dev/null && diec "$SAMPLE_PATH"

echo "[Hashes]"
MD5=$(md5sum "$SAMPLE_PATH" | awk '{print $1}')
SHA1=$(sha1sum "$SAMPLE_PATH" | awk '{print $1}')
SHA256=$(sha256sum "$SAMPLE_PATH" | awk '{print $1}')
echo "  MD5:    $MD5"
echo "  SHA1:   $SHA1"
echo "  SHA256: $SHA256  Size: $(wc -c < "$SAMPLE_PATH") bytes"
# ssdeep prints hash,"filename", so split on the comma rather than on whitespace
command -v ssdeep &>/dev/null && echo "  ssdeep: $(ssdeep "$SAMPLE_PATH" | tail -1 | awk -F, '{print $1}')"
printf '%s\n' "$MD5" "$SHA1" "$SHA256" > "$WORK_DIR/hashes.txt"
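
The hashing step can also be done in a single read of the sample. The following is a minimal sketch using only the Python standard library; `hash_file` is a hypothetical helper name, not part of this skill or of REFERENCE.md.

```python
import hashlib

def hash_file(path, chunk_size=1 << 20):
    """Return MD5, SHA1 and SHA256 hex digests of a file, reading it once."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as fh:
        # Stream in chunks so large samples are never loaded fully into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

Calling `hash_file(sample_path)` returns a dict keyed by `md5`, `sha1`, and `sha256`, which should agree with the values written to hashes.txt.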

Step 1.3: VirusTotal lookup

VT_API_KEY="${VT_API_KEY:-}"
if [ -z "$VT_API_KEY" ] && [ -f ~/.config/security-skill/secrets.env ]; then
    source ~/.config/security-skill/secrets.env
fi

echo ""
echo "[VirusTotal lookup]"
if [ -n "$VT_API_KEY" ]; then
    curl -s --request GET \
        --url "https://www.virustotal.com/api/v3/files/$SHA256" \
        --header "x-apikey: $VT_API_KEY" | \
    python3 -c "
import sys, json
r = json.load(sys.stdin)
if 'error' in r:
    # Unknown hashes return an error object (e.g. NotFoundError), not zero detections
    print('  [*] VirusTotal:', r['error'].get('code', 'unknown error'))
    sys.exit(0)
d = r.get('data',{}).get('attributes',{})
s = d.get('last_analysis_stats',{})
m = s.get('malicious',0); t = sum(s.values())
print(f'  Detection: {m}/{t}  Name: {d.get(\"meaningful_name\",\"N/A\")}')
print('  [!] DETECTED' if m > 0 else '  [OK] Not detected')
"
else
    echo "  [*] VT_API_KEY not set — manual: https://www.virustotal.com/gui/file/$SHA256"
fi
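
A VirusTotal response for a hash it has never seen carries an `error` object instead of `data`, and that case is worth separating from "scanned, zero detections". The sketch below shows one way to classify a response body; `summarize_vt` and its status labels are hypothetical, and only the fields already used by the lookup (`data.attributes.last_analysis_stats`, `error.code`) are assumed.

```python
import json

def summarize_vt(response_text):
    """Classify a VirusTotal v3 file-lookup body as (status, detail)."""
    body = json.loads(response_text)
    if "error" in body:
        # The hash is unknown or the request failed: no verdict either way.
        return ("unavailable", body["error"].get("code", "unknown"))
    attrs = body.get("data", {}).get("attributes", {})
    stats = attrs.get("last_analysis_stats", {})
    malicious = stats.get("malicious", 0)
    total = sum(stats.values())
    if total == 0:
        return ("unavailable", "no analysis results")
    status = "detected" if malicious > 0 else "not detected"
    return (status, f"{malicious}/{total}")
```

Recording `unavailable` separately keeps the final report from presenting an unknown sample as clean.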

Step 1.4: YARA matching

YARA_RULES_DIR="${YARA_RULES_DIR:-/opt/yara-rules}"

echo ""
echo "[YARA matching]"
if [ -d "$YARA_RULES_DIR" ] && command -v yara &>/dev/null; then
    YARA_HITS=$(yara -r "$YARA_RULES_DIR" "$SAMPLE_PATH" 2>/dev/null)
    if [ -n "$YARA_HITS" ]; then
        echo "  [!] YARA matches found:"
        echo "$YARA_HITS" | while read -r line; do echo "    $line"; done
    else
        echo "  [OK] No YARA matches"
    fi
else
    echo "  [*] YARA CLI or rules directory not available - install the CLI: sudo apt install yara"
    echo "      Then run: yara -r $YARA_RULES_DIR $SAMPLE_PATH"
fi

Step 1.5: String extraction with FLOSS

echo ""
echo "[String extraction (FLOSS)]"
if command -v floss &>/dev/null; then
    floss -q "$SAMPLE_PATH" 2>/dev/null | tee "$WORK_DIR/floss_output.txt" | head -100
    echo "  [+] Full FLOSS output saved to $WORK_DIR/floss_output.txt"
else
    echo "  [*] FLOSS not found — falling back to strings"
    strings -n 6 "$SAMPLE_PATH" | tee "$WORK_DIR/strings_output.txt" | \
        grep -iE 'http|cmd|powershell|reg|\\temp\\|\.exe|\.dll|socket|connect|download|upload|token|pass|key' | \
        head -50
    echo "  [+] Full strings output: $WORK_DIR/strings_output.txt"
fi
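
When neither FLOSS nor `strings` is available, a rough equivalent of the fallback can be written in Python. This is a sketch only: it extracts printable ASCII runs (no UTF-16 strings, no deobfuscation), and `extract_strings` and `interesting` are hypothetical names. The keyword list is a subset of the grep pattern above.

```python
import re

KEYWORDS = re.compile(
    r"http|cmd|powershell|\.exe|\.dll|socket|connect|download|upload|token|pass|key",
    re.IGNORECASE,
)

def extract_strings(data, min_len=6):
    """Return printable-ASCII runs of at least min_len bytes, similar to `strings -n`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def interesting(strings):
    """Keep only strings that contain one of the triage keywords."""
    return [s for s in strings if KEYWORDS.search(s)]
```

For example, `interesting(extract_strings(open(path, "rb").read()))` gives a short list to review before reading the full output.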

Step 1.6: Entropy analysis (packer/obfuscation detection)

Reference: See REFERENCE.md for the entropy analysis script (whole-file and PE section-level entropy with thresholds >7.2 = packed, >6.5 = moderate).
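
For orientation, whole-file entropy can be computed as follows. This is a minimal sketch, not the REFERENCE.md script: it applies the thresholds quoted above (>7.2 packed, >6.5 moderate), the label `normal` for everything else is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(data):
    """Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

def classify_entropy(value):
    """Apply the thresholds from this step; 'normal' is an assumed label."""
    if value > 7.2:
        return "packed"
    if value > 6.5:
        return "moderate"
    return "normal"
```

Compressed or encrypted data approaches 8 bits per byte, which is why a high value suggests packing. Running the same function over each PE section's raw bytes gives the section-level view.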

Phase 2: Static Analysis

Step 2.1: PE header analysis

Reference: See REFERENCE.md for the PE header analysis script (machine type, timestamp, subsystem, ASLR/DEP/CFG flags, sections, imports, exports).

echo ""
echo "=== Phase 2: Static Analysis ==="
echo ""
echo "[PE Headers]"
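# The path is passed as argv[1] so special characters in the file name cannot alter the Python source
python3 -c "
import sys, pefile
pe = pefile.PE(sys.argv[1])
flags = pe.OPTIONAL_HEADER.DllCharacteristics
print('  Machine:', hex(pe.FILE_HEADER.Machine))
print('  DLL:', pe.is_dll())
print('  ASLR:', bool(flags & 0x0040), 'DEP:', bool(flags & 0x0100), 'CFG:', bool(flags & 0x4000))
print('  Sections:', [s.Name.rstrip(b\"\\x00\").decode(errors=\"replace\") for s in pe.sections])
" "$SAMPLE_PATH" 2>/dev/null || echo "  [*] Not a PE file, or pefile not installed (run: pip install pefile)"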

Step 2.2: ELF analysis (Linux binaries)

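# -b omits the file name, so a path such as /tmp/script.exe cannot trigger the type checks below
FILE_TYPE=$(file -b "$SAMPLE_PATH")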
if echo "$FILE_TYPE" | grep -qi 'ELF'; then
    echo "[ELF Analysis]"
    readelf -h "$SAMPLE_PATH" 2>/dev/null | grep -E 'Type|Machine|Entry|Flags'
    objdump -d "$SAMPLE_PATH" 2>/dev/null | grep -oE '<[^>]+@plt>' | sed 's/<//;s/@plt>//' | sort -u | head -20
    nm -D "$SAMPLE_PATH" 2>/dev/null | grep -v ' U ' | awk '{print $3}' | grep -v '^$' | head -20
fi
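
If `readelf` is not installed on the analysis host, the few header fields printed above can be read directly from the first bytes of the file. The sketch below decodes only class, byte order, type, and machine; `parse_elf_header` is a hypothetical helper and the lookup tables cover just a handful of common values.

```python
import struct

ELF_TYPES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}

def parse_elf_header(data):
    """Decode basic identification fields from the first 20 bytes of an ELF file."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(data[4])        # EI_CLASS
    endian = {1: "<", 2: ">"}.get(data[5])    # EI_DATA
    if bits is None or endian is None:
        raise ValueError("invalid e_ident")
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    return {
        "bits": bits,
        "byte_order": "little" if endian == "<" else "big",
        "type": ELF_TYPES.get(e_type, hex(e_type)),
        "machine": ELF_MACHINES.get(e_machine, hex(e_machine)),
    }
```

Values missing from the tables are returned as hex so that unusual architectures remain visible instead of being dropped.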

Step 2.3: Document / script analysis

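# \bole\b keeps "console" in PE descriptions from matching; file reports OLE and RTF as "Composite Document File" and "Rich Text Format"
if echo "$FILE_TYPE" | grep -qiE 'office|composite document|\bole\b|word|excel|powerpoint|pdf|rtf|rich text'; then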
    echo "[Office/Document Analysis]"
    if command -v olevba &>/dev/null; then
        olevba "$SAMPLE_PATH" 2>/dev/null | tee "$WORK_DIR/olevba_output.txt" | head -80
        echo "  [+] Full olevba output: $WORK_DIR/olevba_output.txt"
    fi
    if command -v olemeta &>/dev/null; then
        olemeta "$SAMPLE_PATH" 2>/dev/null
    fi
fi

if echo "$FILE_TYPE" | grep -qiE 'python|javascript|powershell|script|text'; then
    echo "[Script Analysis]"
    python3 -c "
import re, sys
# The path arrives as argv[1]; interpolating it into the source would break on quotes
c = open(sys.argv[1], errors='replace').read()
for lbl, pat in [('Base64', r'[A-Za-z0-9+/]{40,}={0,2}'), ('Hex', r'0x[0-9a-fA-F]{4,}'),
                 ('Suspicious', r'eval|exec|iex|frombase64|gunzip'), ('Network', r'https?://\S+')]:
    m = re.findall(pat, c, re.I)
    if m: print(f'  [{lbl}] {len(m)} found:', *m[:3], sep=chr(10)+'    ')
" "$SAMPLE_PATH"
fi
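
Counting Base64 candidates is a starting point; decoding them usually shows what the script does. The following sketch reuses the same Base64 pattern as the script analysis above. `decode_base64_blobs` is a hypothetical helper, and the UTF-16LE guess (NUL bytes present, as in PowerShell encoded commands) is a heuristic assumption.

```python
import base64
import binascii
import re

B64_RE = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")

def decode_base64_blobs(text):
    """Decode every Base64-looking run in text; skip runs that do not decode."""
    decoded = []
    for blob in B64_RE.findall(text):
        try:
            raw = base64.b64decode(blob, validate=True)
        except (binascii.Error, ValueError):
            continue  # wrong length or padding, so not real Base64
        # NUL bytes hint at UTF-16LE, the encoding of PowerShell encoded commands.
        encoding = "utf-16-le" if b"\x00" in raw else "utf-8"
        decoded.append(raw.decode(encoding, errors="replace"))
    return decoded
```

Decoded payloads can be fed back into the same keyword and URL checks, since droppers often nest one encoded stage inside another.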

Phase 3: Dynamic Analysis

Step 3.1: Sandbox submission

echo ""
echo "=== Phase 3: Dynamic Analysis ==="
echo ""

SANDBOX_URL="${SANDBOX_URL:-}"
echo "[Sandbox submission]"
if [ -n "$SANDBOX_URL" ]; then
    TASK_RESP=$(curl -s -F "file=@$SAMPLE_PATH" "$SANDBOX_URL/tasks/create/file")
    TASK_ID=$(echo "$TASK_RESP" | python3 -c "import sys,json; print(json.load(sys.stdin).get('task_id',''))" 2>/dev/null)
    if [ -n "$TASK_ID" ]; then
        echo "  [+] Cuckoo task submitted: ID=$TASK_ID"
        echo "  [*] Monitor at: $SANDBOX_URL/analysis/$TASK_ID/"
        echo "  [*] Wait for completion, then run:"
        echo "      curl -s $SANDBOX_URL/tasks/report/$TASK_ID > $WORK_DIR/cuckoo_report.json"
    else
        echo "  [!] Cuckoo submission failed: $TASK_RESP"
    fi
else
    echo "  [*] No sandbox configured — manual options:"
    echo "      ANY.RUN:     https://app.any.run/ (upload $SAMPLE_NAME)"
    echo "      Cuckoo:      Set SANDBOX_URL env var and re-run"
    echo "      Joe Sandbox: https://www.joesandbox.com/"
fi

Step 3.2: Local dynamic monitoring setup

Reference: See REFERENCE.md for local monitoring commands (process watch, inotify, tcpdump, strace, anti-debug bypass notes).

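# Run BEFORE detonating, and only inside the isolated analysis VM. Key commands:
# A loop replaces the backgrounded watch, which needs a TTY and cannot see the unexported $SAMPLE_NAME
while sleep 1; do pgrep -af "$SAMPLE_NAME"; done > /tmp/proc_watch.txt &
# Exclude the monitoring logs themselves so inotifywait does not record its own writes
inotifywait -m -r /tmp /var/tmp --exclude '/tmp/(proc_watch\.txt|fs_events\.txt|sample_capture\.pcap|strace\.log)$' \
    --format '%T %e %w%f' --timefmt '%H:%M:%S' 2>/dev/null | tee /tmp/fs_events.txt &
tcpdump -i any -w /tmp/sample_capture.pcap &
# Then detonate: strace -f -e trace=network,file,process -o /tmp/strace.log "./$SAMPLE_NAME"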
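
After detonation, the strace log can be mined for outbound connection attempts. This is a minimal sketch that assumes the usual strace rendering of IPv4 socket addresses (`sin_port=htons(...)`, `sin_addr=inet_addr("...")`). `connect_targets` is a hypothetical helper, and IPv6 and calls split into unfinished/resumed parts are not handled.

```python
import re

CONNECT_RE = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, sin_port=htons\((\d+)\), '
    r'sin_addr=inet_addr\("([^"]+)"\)'
)

def connect_targets(strace_lines):
    """Return sorted unique (ip, port) pairs from IPv4 connect() calls."""
    targets = set()
    for line in strace_lines:
        match = CONNECT_RE.search(line)
        if match:
            targets.add((match.group(2), int(match.group(1))))
    return sorted(targets)
```

A typical call is `connect_targets(open("/tmp/strace.log"))`. The resulting pairs can be checked against the packet capture and recorded as network IOCs.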

Step 3.3: Parse Cuckoo report (if available)

Reference: See REFERENCE.md for the Cuckoo report parser script (score, process tree, network IOCs, dropped files).

CUCKOO_REPORT="$WORK_DIR/cuckoo_report.json"
if [ -f "$CUCKOO_REPORT" ]; then
    echo ""
    echo "[Cuckoo report summary]"
    python3 -c "
import json
r = json.load(open('$CUCKOO_REPORT'))
print('  Score:', r.get('info',{}).get('score','N/A'), '/10')
net = r.get('network', {})
for c in net.get('tcp',[])[:5]: print('  TCP', c.get('dst'), c.get('dport'))
for h in net.get('http',[])[:5]: print('  HTTP', h.get('method'), h.get('host'), h.get('uri',''))
for d in net.get('dns',[])[:5]: print('  DNS', d.get('request'))
"
fi
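
The same report fields can be collected into a deduplicated IOC list for the final report. The sketch below assumes only the keys already read by the summary above (`network.tcp[].dst/dport`, `network.http[].host/uri`, `network.dns[].request`); `network_iocs` and `defang` are hypothetical helpers, not the REFERENCE.md parser.

```python
def network_iocs(report):
    """Collect unique network indicators from a parsed Cuckoo report dict."""
    net = report.get("network", {})
    iocs = set()
    for conn in net.get("tcp", []):
        if conn.get("dst"):
            iocs.add(f"{conn['dst']}:{conn.get('dport')}")
    for req in net.get("http", []):
        if req.get("host"):
            iocs.add(req["host"])
        if req.get("uri"):
            iocs.add(req["uri"])
    for query in net.get("dns", []):
        if query.get("request"):
            iocs.add(query["request"])
    return sorted(iocs)

def defang(indicator):
    """Make an indicator non-clickable for written reports."""
    return indicator.replace("http", "hxxp").replace(".", "[.]")
```

Defanging applies only to text written for people; detection rules need the original values.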

Phase 4: Code Analysis

Step 4.1: Disassembly guidance

Reference: See REFERENCE.md for Ghidra/IDA quick-start instructions, algorithmic pattern identification, and C2 protocol RE checklist.

Phase 5: Classification and Reporting

Step 5.1: Malware family classification and MITRE ATT&CK mapping

Reference: See REFERENCE.md for the full MITRE ATT&CK TTP mapping table, malware family keyword indicators, and the classification script.

echo ""
echo "=== Phase 5: Classification & Report ==="
echo "  Sample: $SAMPLE_NAME"
echo "  [*] Match strings from $WORK_DIR/floss_output.txt against family keywords"
echo "      in REFERENCE.md (Malware Family Keyword Indicators) to determine family."
echo "  [*] Cross-reference matched family with MITRE ATT&CK TTP table in REFERENCE.md."
echo "  [*] See Step 5.2 for generated YARA, Sigma, and Snort rules."
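
The keyword matching described in these messages can be sketched as below. The table is an illustrative subset chosen for this example, not the REFERENCE.md mapping; the keywords are assumptions, and `map_ttps` is a hypothetical helper. A string hit suggests a technique only, and it should be confirmed against behavior observed in Phase 3 or 4.

```python
# Illustrative subset only; the full mapping table lives in REFERENCE.md.
TTP_KEYWORDS = {
    "T1059.001 PowerShell": ["powershell", "-encodedcommand"],
    "T1547.001 Registry Run Keys / Startup Folder": ["currentversion\\run"],
    "T1055 Process Injection": ["virtualallocex", "writeprocessmemory", "createremotethread"],
    "T1105 Ingress Tool Transfer": ["urldownloadtofile", "downloadstring"],
}

def map_ttps(strings):
    """Map extracted strings to candidate ATT&CK techniques by keyword match."""
    hits = {}
    for s in strings:
        lowered = s.lower()
        for technique, keywords in TTP_KEYWORDS.items():
            if any(keyword in lowered for keyword in keywords):
                hits.setdefault(technique, []).append(s)
    return hits
```

Keeping the matched strings next to each technique gives the report the evidence behind every mapping.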

Step 5.2: Generate detection rules

Reference: See REFERENCE.md for YARA, Sigma, and Snort rule scaffolds and placeholder guidance.

Use the scaffolds in REFERENCE.md to build rules from observed IOCs:

Fill the YARA scaffold with unique strings extracted in Phase 1 (FLOSS output)
Fill the Sigma scaffold with the sample SHA256 and observed command-line patterns
Fill the Snort scaffold with the C2 beacon pattern identified in Phase 3/4

SHA256=$(sha256sum "$SAMPLE_PATH" | awk '{print $1}')
echo "  SHA256: $SHA256"
echo "  [+] Use YARA scaffold from REFERENCE.md — substitute strings from $WORK_DIR/floss_output.txt"
echo "  [+] Use Sigma scaffold from REFERENCE.md — substitute \$SHA256 and observed CommandLine"
echo "  [+] Use Snort scaffold from REFERENCE.md — substitute C2 network pattern"
echo "  [!] Replace all placeholders with actual observed indicators before deploying."
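
Filling a YARA rule from extracted strings mostly comes down to escaping them correctly. The sketch below renders a basic rule as text. It is not the REFERENCE.md scaffold: `render_yara_rule`, the `$s0..$sN` identifiers, and the "N of them" condition are illustrative choices, and only printable strings are expected as input.

```python
import re

def render_yara_rule(name, sha256, strings, min_matches=2):
    """Render a simple YARA rule that fires when min_matches of the strings occur."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError("invalid YARA rule name")
    if not strings:
        raise ValueError("at least one string is required")
    lines = [f"rule {name}", "{", "    meta:", f'        sha256 = "{sha256}"', "    strings:"]
    for index, value in enumerate(strings):
        # Backslashes and double quotes must be escaped inside YARA text strings.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{index} = "{escaped}" ascii wide')
    needed = min(min_matches, len(strings))
    lines += ["    condition:", f"        {needed} of them", "}"]
    return "\n".join(lines) + "\n"
```

Writing the result to `$WORK_DIR/detection.yar` and scanning the sample with it confirms that the rule compiles and matches. Strings common to benign software should be removed before deployment.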

Step 5.3: Final remediation guidance

Reference: See REFERENCE.md for the full 15-step incident response remediation checklist.

echo ""
echo "[+] Analysis artifacts saved to: $WORK_DIR"
echo "    hashes.txt         - MD5/SHA1/SHA256"
echo "    floss_output.txt   - extracted strings (strings_output.txt if FLOSS was unavailable)"
echo "    olevba_output.txt  - document macro analysis (if applicable)"
echo "    cuckoo_report.json - sandbox report (if submitted and downloaded)"
echo "  [*] Save the rules built in Step 5.2 there as detection.yar (YARA) and detection.yml (Sigma)."

Done when

All five phases are complete (or skipped with documented reason)
Sample hashes are recorded and VirusTotal result is obtained or noted as unavailable
At least one detection rule (YARA, Sigma, or Snort) is generated
MITRE ATT&CK TTPs are mapped to observed behaviors
Remediation steps are documented
All artifacts are saved to the working directory

Failure modes

Problem	Cause	Solution
`pefile` import error	Package not installed	`pip install pefile`
FLOSS not found	Not in PATH	Install from https://github.com/mandiant/flare-floss/releases
VirusTotal rate limit	Free API tier (4 req/min)	Wait 60 s, then retry; use premium key for bulk
Cuckoo connection refused	Sandbox not running	Start Cuckoo service or use ANY.RUN manually
olevba not found	oletools not installed	`pip install oletools`
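YARA compile error	Invalid rule syntax	Compile the rule to check it: `yarac rule.yar /dev/null`
High entropy but no packer detected by DIE	Custom/unknown packer	Manually trace to the OEP in x64dbg, then dump the process and rebuild imports (for example with the bundled Scylla plugin)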

Notes

Never execute untrusted samples on a production host. Use a dedicated VM with snapshots or an isolated sandbox.
For Windows samples analyzed on Linux, use Wine + x64dbg inside a VM, or submit to ANY.RUN.
The YARA and Sigma rules generated here are scaffolds — review and tune before production deployment.
MITRE ATT&CK navigator layer export: https://mitre-attack.github.io/attack-navigator/
Cross-reference extracted hashes with malware-hash skill for additional threat intelligence.
BinDiff can compare two similar samples: export each from Ghidra with the BinExport extension, then run bindiff sample_a.BinExport sample_b.BinExport.