| name | registry |
| description | Manage recipe registries and create inference recipes |
<Use_When>
<Do_Not_Use_When>
# List configured registries (enabled only by default)
sparkrun registry list
sparkrun registry list --show-disabled
sparkrun registry list --only-show-visible
# Add registries from a git repo's .sparkrun/registry.yaml manifest
sparkrun registry add <git_url>
# Remove a registry
sparkrun registry remove <name>
# Enable a disabled registry
sparkrun registry enable <name>
# Disable a registry (recipes will not appear in searches)
sparkrun registry disable <name>
# Update all enabled registries from git (fetches latest recipes)
sparkrun registry update
# Update a specific registry
sparkrun registry update <name>
# Update sparkrun + registries in one command
sparkrun update
# List all recipes across all registries (no filter)
sparkrun list
# List with filters
sparkrun list --all # include hidden registry recipes
sparkrun list --registry <name> # filter by registry
sparkrun list --runtime vllm # filter by runtime (vllm, sglang, llama-cpp)
sparkrun list <query> # filter by name
# Search for recipes by name, model, runtime, or description (contains-match)
sparkrun recipe search <query>
sparkrun recipe search <query> --registry <name> --runtime sglang
# Inspect a specific known recipe (by exact name or file path)
sparkrun recipe show <recipe> [--tp N]
# Export a normalized recipe
sparkrun recipe export <recipe>
sparkrun recipe export <recipe> --json
sparkrun recipe export <recipe> --save out.yaml
Use sparkrun recipe search as the first attempt when looking for a particular recipe. Use sparkrun recipe show when given a specific recipe name or file -- it may not appear in search results.
Recipe names support @registry/name syntax for explicit registry selection (e.g. @spark-arena/qwen3-1.7b-vllm).
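The @registry/name form can be illustrated with a small Python sketch. This is not sparkrun's actual parser; the function name `split_recipe_ref` and the error handling are assumptions made for illustration only. It shows how a reference with an explicit registry differs from a bare recipe name.

```python
from typing import Optional, Tuple


def split_recipe_ref(ref: str) -> Tuple[Optional[str], str]:
    """Split a recipe reference into (registry, name).

    Hypothetical helper: a leading '@' marks an explicit registry,
    anything else is treated as a bare recipe name or file path.
    """
    if ref.startswith("@"):
        registry, sep, name = ref[1:].partition("/")
        if not sep or not registry or not name:
            raise ValueError(f"expected @registry/name, got {ref!r}")
        return registry, name
    return None, ref


explicit = split_recipe_ref("@spark-arena/qwen3-1.7b-vllm")
bare = split_recipe_ref("qwen3-1.7b-vllm")
print(explicit)
print(bare)
```

A bare name leaves the registry unset, which would correspond to letting sparkrun resolve the recipe across the enabled registries.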
# List available benchmark profiles across registries
sparkrun registry list-benchmark-profiles
sparkrun registry list-benchmark-profiles --registry <name>
sparkrun registry list-benchmark-profiles --all # include hidden registries
# Show detailed benchmark profile information
sparkrun registry show-benchmark-profile <name>
# Check a recipe for issues
sparkrun recipe validate <recipe>
# Estimate VRAM usage with overrides
sparkrun recipe vram <recipe> [--tp N] [--max-model-len 32768] [--gpu-mem 0.9]
Recipes are YAML files defining an inference workload:
model: org/model-name # HuggingFace model ID (required)
runtime: vllm | sglang | llama-cpp # Inference runtime (required)
container: registry/image:tag # Docker image (required)
min_nodes: 1 # Minimum hosts needed
max_nodes: 4 # Maximum hosts (optional)
model_revision: abc123 # Pin to specific HF revision (optional)
metadata:
  description: Human-readable description
  maintainer: name <email>
  model_params: 7B
  model_dtype: fp16
  category: general # Recipe category
defaults:
  port: 8000
  host: 0.0.0.0
  tensor_parallel: 2
  pipeline_parallel: 1
  gpu_memory_utilization: 0.9
  max_model_len: 32768
  served_model_name: my-model
  tokenizer_path: org/base-model # Required for GGUF models on SGLang
# Optional: explicit command template (overrides auto-generation)
command: |
  python3 -m vllm.entrypoints.openai.api_server \
    --model {model} \
    --tensor-parallel-size {tensor_parallel} \
    --port {port}
# Optional: environment variables passed to the container
env:
  NCCL_DEBUG: INFO
# Optional: post-launch hooks
post_exec: # Commands to run inside the head container
  - "echo 'Model loaded'"
post_commands: # Commands to run on the control machine
  - "curl http://{head_ip}:{port}/v1/models"
stop_after_post: false # Stop workload after post hooks (default: false)
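As a rough illustration of how the {placeholder} fields in a command: template and ${VAR} references in env: might be resolved, here is a minimal Python sketch. It is not sparkrun's implementation: the helper names `render_command` and `expand_env` are hypothetical, the merge order (CLI overrides win over defaults) is inferred from the recipe description, and passing `model` alongside the defaults is an assumption.

```python
from string import Template


def render_command(template: str, defaults: dict, overrides: dict) -> str:
    """Fill {placeholder} fields; override values replace recipe defaults."""
    values = {**defaults, **overrides}
    return template.format_map(values)


def expand_env(env: dict, environ: dict) -> dict:
    """Expand ${VAR} references using the control machine's environment.

    Unknown variables are left untouched (safe_substitute).
    """
    return {
        key: Template(str(value)).safe_substitute(environ)
        for key, value in env.items()
    }


template = (
    "python3 -m vllm.entrypoints.openai.api_server "
    "--model {model} "
    "--tensor-parallel-size {tensor_parallel} "
    "--port {port}"
)
# Assumption: the top-level model field is available as a placeholder value.
defaults = {"model": "org/model-name", "tensor_parallel": 2, "port": 8000}
overrides = {"tensor_parallel": 4}  # e.g. what a --tp 4 override might supply

command = render_command(template, defaults, overrides)
expanded = expand_env(
    {"NCCL_DEBUG": "INFO", "HF_TOKEN": "${HF_TOKEN}"},
    {"HF_TOKEN": "hf_example"},  # stand-in for os.environ
)
print(command)
print(expanded)
```

In real use the second argument to `expand_env` would be `os.environ`; a plain dict is used here so the sketch runs the same way everywhere.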
Key fields:
- {placeholder} fields in command: templates are substituted from defaults + CLI overrides
- model_revision pins downloads to a specific HuggingFace commit/tag
- tokenizer_path is required for GGUF models on SGLang (points to the base non-GGUF model)
- min_nodes / max_nodes control cluster size validation
- References such as ${HF_TOKEN} in env: are expanded from the control machine's environment
- post_exec and post_commands run after the server is healthy (port listening + /v1/models check)
- pipeline_parallel enables pipeline parallelism (total nodes = TP * PP)
<Tool_Usage>
Use the sparkrun_exec tool for all sparkrun commands.
</Tool_Usage>
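The node-count rule from the key fields (total nodes = TP * PP, checked against min_nodes / max_nodes) can be sketched in Python as follows. The function names and the exact validation behaviour are assumptions for illustration; sparkrun's own checks may differ.

```python
from typing import Optional


def required_nodes(tensor_parallel: int, pipeline_parallel: int = 1) -> int:
    """Total nodes implied by the parallelism settings (TP * PP)."""
    return tensor_parallel * pipeline_parallel


def check_cluster_size(
    tensor_parallel: int,
    pipeline_parallel: int = 1,
    min_nodes: int = 1,
    max_nodes: Optional[int] = None,
) -> int:
    """Hypothetical validation against a recipe's min_nodes / max_nodes."""
    nodes = required_nodes(tensor_parallel, pipeline_parallel)
    if nodes < min_nodes:
        raise ValueError(f"{nodes} node(s) is below min_nodes={min_nodes}")
    if max_nodes is not None and nodes > max_nodes:
        raise ValueError(f"{nodes} node(s) exceeds max_nodes={max_nodes}")
    return nodes


# Values taken from the example recipe: TP=2, PP=1, min_nodes=1, max_nodes=4.
nodes = check_cluster_size(2, 1, min_nodes=1, max_nodes=4)
print(nodes)
```

With TP=4 and PP=2 the same recipe would need 8 nodes, which this sketch rejects because it exceeds max_nodes=4.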
<Important_Notes>
- Run sparkrun registry update or sparkrun update periodically to get the latest community recipes
- Run sparkrun recipe validate before publishing custom recipes
- Use sparkrun recipe vram to check if a model fits on DGX Spark before trying to run it
- GGUF models on SGLang require tokenizer_path in defaults
- Use {placeholder} references in command templates to pick up defaults and CLI overrides
- Registries are cached under ~/.cache/sparkrun/registries/ and updated with sparkrun registry update
- Registries hold .yaml recipe files
- Use sparkrun registry list-benchmark-profiles to discover available benchmark profiles from registries
- Use @registry/name syntax for explicit registry selection
- Use sparkrun recipe export to get a normalized view of a recipe (useful for debugging)
</Important_Notes>
Task: {{ARGUMENTS}}