| name | modal |
| description | Use when the user needs to run isolated code remotely — a disposable container, optional GPU access (T4 → H100), or a safer place for untrusted / heavy code. Prefer local execution for normal repo work; use Modal sandboxes for isolation, hardware access, or one-shot heavy compute. |
| triggers | ["modal sandbox","remote python","sandbox execution","isolated code run","gpu sandbox","h100","a100","remote container","ephemeral container","run untrusted code"] |
Modal Sandboxes
Disposable remote containers (with optional GPU) via Modal, paid per call in USDC. No Modal account, no GPU procurement — pay only for what runs.
How to Call from MCP
blockrun_modal({ path: "sandbox/create", body: {
image: "python:3.11",
gpu: "A100",
timeout: 600,
setup_commands: ["pip install torch transformers"]
}})
blockrun_modal({ path: "sandbox/exec", body: {
sandbox_id: "sb_abc...",
command: ["python", "-c", "import torch; print(torch.cuda.get_device_name(0))"]
}})
blockrun_modal({ path: "sandbox/terminate", body: { sandbox_id: "sb_abc..." } })
Endpoint Catalog
| Path | Method | Body | Price |
|---|---|---|---|
| sandbox/create | POST | { image?, timeout?, cpu?, memory?, gpu?, setup_commands? } | $0.01 |
| sandbox/exec | POST | { sandbox_id, command: ["python","-c","..."], timeout? } | $0.001 |
| sandbox/status | POST | { sandbox_id } | $0.001 |
| sandbox/terminate | POST | { sandbox_id } | $0.001 |
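The per-call prices in the catalog can be added up before a run to get a rough budget. The sketch below is only an illustration: `estimateCostUsd` and `PRICE_MILLS` are hypothetical names, not part of `blockrun_modal`, and the figures are copied from the table, so they should be re-checked if pricing changes.

```javascript
// Per-call prices from the Endpoint Catalog, in thousandths of a dollar.
// Integers avoid floating-point drift when many calls are summed.
const PRICE_MILLS = {
  "sandbox/create": 10,
  "sandbox/exec": 1,
  "sandbox/status": 1,
  "sandbox/terminate": 1,
};

// Hypothetical helper: `calls` maps an endpoint path to a planned call count.
function estimateCostUsd(calls) {
  let mills = 0;
  for (const [path, count] of Object.entries(calls)) {
    if (!Object.prototype.hasOwnProperty.call(PRICE_MILLS, path)) {
      throw new Error(`unknown endpoint: ${path}`);
    }
    mills += PRICE_MILLS[path] * count;
  }
  return mills / 1000;
}

// One create, one exec, one terminate.
const quickEval = estimateCostUsd({
  "sandbox/create": 1,
  "sandbox/exec": 1,
  "sandbox/terminate": 1,
});
console.log(quickEval.toFixed(3)); // prints 0.012
```

The estimate covers per-call fees only, as listed in the table.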
Field Reference
| Field | Default | Notes |
|---|---|---|
| image | python:3.11 | Any public Docker image. nvidia/cuda:12-runtime if you bring GPU code. |
| timeout | 300 | Sandbox lifetime in seconds (idle eviction) |
| cpu | 1 | CPU cores |
| memory | 1024 | Memory in MB |
| gpu | none | T4 / L4 / A10G / A100 / A100-80GB / H100 |
| setup_commands | [] | Shell commands run once during sandbox provisioning |
| command (exec) | required | Array form: ["python","-c","print(2+2)"] |
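To make the defaults concrete, here is a minimal sketch of a client-side helper that fills them in and rejects GPU names not listed above. `buildCreateBody`, `CREATE_DEFAULTS` and `GPU_TYPES` are hypothetical names; the service applies its own defaults and validation, so this only catches obvious mistakes before a paid call.

```javascript
// Defaults and GPU names copied from the Field Reference table.
const CREATE_DEFAULTS = {
  image: "python:3.11",
  timeout: 300, // seconds
  cpu: 1,
  memory: 1024, // MB
};
const GPU_TYPES = ["T4", "L4", "A10G", "A100", "A100-80GB", "H100"];

// Hypothetical helper: merges caller overrides onto the documented defaults.
function buildCreateBody(overrides = {}) {
  const body = {
    ...CREATE_DEFAULTS,
    ...overrides,
    // Copy the array so callers never share state with each other.
    setup_commands: [...(overrides.setup_commands ?? [])],
  };
  // gpu is optional; when present it must be one of the listed names.
  if (body.gpu !== undefined && !GPU_TYPES.includes(body.gpu)) {
    throw new Error(`unsupported gpu: ${body.gpu}`);
  }
  return body;
}

console.log(buildCreateBody({ gpu: "A100", timeout: 600 }));
```

Passing the result as the `body` of a `sandbox/create` call keeps the request explicit about what it asks for, even where the defaults would have applied anyway.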
Worked Examples
1. Quick Python eval
const { structuredContent: sb } = await blockrun_modal({ path: "sandbox/create", body: {} })
await blockrun_modal({ path: "sandbox/exec", body: {
sandbox_id: sb.sandbox_id,
command: ["python", "-c", "import numpy; print(numpy.__version__)"]
}})
await blockrun_modal({ path: "sandbox/terminate", body: { sandbox_id: sb.sandbox_id } })
Cost: $0.012 ($0.01 + $0.001 + $0.001).
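If the exec step throws, the example above never reaches terminate and the sandbox lingers until its timeout. One way to guard against that is a try/finally wrapper, sketched below. `withSandbox` is a hypothetical helper, and `callTool` stands in for however your client invokes `blockrun_modal`; it is assumed to resolve to an object with a `structuredContent` field, as in the example above.

```javascript
// Hypothetical wrapper: create a sandbox, run `fn` with its id, and always
// terminate afterwards, even when `fn` throws.
async function withSandbox(callTool, createBody, fn) {
  const { structuredContent: sb } = await callTool({
    path: "sandbox/create",
    body: createBody,
  });
  try {
    return await fn(sb.sandbox_id);
  } finally {
    // Runs on success and on failure, so the sandbox is freed either way.
    await callTool({
      path: "sandbox/terminate",
      body: { sandbox_id: sb.sandbox_id },
    });
  }
}

// Usage sketch (blockrun_modal passed in as callTool):
// await withSandbox(blockrun_modal, {}, (sandbox_id) =>
//   blockrun_modal({ path: "sandbox/exec", body: {
//     sandbox_id,
//     command: ["python", "-c", "print(2+2)"],
//   }}));
```

The wrapper adds no calls beyond create and terminate, so the cost of the quick eval is unchanged.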
2. GPU inference, A100, with deps pre-installed
blockrun_modal({ path: "sandbox/create", body: {
image: "pytorch/pytorch:2.4.0-cuda12.1-cudnn9-runtime",
gpu: "A100",
timeout: 1200,
memory: 16384,
setup_commands: ["pip install --quiet transformers accelerate"]
}})
Then call sandbox/exec with your inference command. The sandbox is evicted automatically after 1200 s idle.
3. Test untrusted code Claude generated
blockrun_modal({ path: "sandbox/exec", body: {
sandbox_id,
command: ["bash", "-c", "<the generated script>"],
timeout: 60
}})
Output is captured, and the script runs in the remote container rather than on your local machine.
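Because command takes an array, a generated script can travel as a single element without hand-escaping its quotes or newlines. The sketch below builds such an exec body. `buildScriptExec` is a hypothetical helper, and it assumes the service hands the array to the process as separate arguments without re-joining them into a shell string.

```javascript
// Hypothetical helper: returns the body for a sandbox/exec call that runs
// `script` under bash with a short timeout.
function buildScriptExec(sandboxId, script, timeoutSeconds = 60) {
  if (typeof script !== "string" || script.length === 0) {
    throw new Error("script must be a non-empty string");
  }
  return {
    sandbox_id: sandboxId,
    // The whole script is one array element, so quotes and newlines inside
    // it need no extra escaping on the client side.
    command: ["bash", "-c", script],
    timeout: timeoutSeconds,
  };
}

const body = buildScriptExec("sb_abc...", 'echo "it\'s fine"\nls /tmp');
console.log(body.command.length); // prints 3
```

A short timeout, as in the example above, bounds how long a misbehaving script can keep the call open.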
When NOT to Use Modal
- Normal repo edits / dev work — use local tools, Modal adds latency and cost
- Long-running services — sandboxes are ephemeral, not server hosts
- Anything you'd run hundreds of times per minute — payment overhead dominates at high QPS
Notes
- sandbox_id is returned by create and required by every other endpoint
- exec is synchronous: it blocks until the command finishes or hits its timeout
- terminate is cheap; call it to free the sandbox even if the timeout would expire shortly
- The free-tier nvidia/* LLM models in blockrun_chat run on different infrastructure; Modal is for your own arbitrary code
Reference
- Endpoints: POST /v1/modal/sandbox/{create,exec,status,terminate}
- Upstream: Modal