launch-experiment
Generate and execute a training launch command for FastVideo models
| name | launch-experiment |
| description | Generate and execute a training launch command for FastVideo models |
Construct a fully specified `torchrun` training command for a FastVideo model
from a target pipeline, dataset, and hyperparameter overrides. This skill
automates the boilerplate: setting environment variables, picking the right
entrypoint, and applying defaults from the closest example script.
- `fastvideo` is installed (`uv pip install -e ".[dev]"`).
- Training data is preprocessed (see `docs/training/data_preprocess.md`).
- `WANDB_API_KEY` is set in the environment (or `WANDB_MODE=offline` for local runs).

| Parameter | Required | Description |
|---|---|---|
| `pipeline` | Yes | Training pipeline type: `finetune`, `distill-dmd`, `self-forcing`, `lora`, `consistency` |
| `model` | Yes | Model family: `wan-t2v-1.3B`, `wan-i2v-14B`, `ltx2`, `matrixgame` |
| `data_path` | Yes | Path to preprocessed dataset (parquet) |
| `num_gpus` | Yes | Number of GPUs |
| `overrides` | No | Dict of hyperparameter overrides (any CLI arg) |
| `output_dir` | No | Output directory (default: `outputs/<model>_<pipeline>`) |
| `run_name` | No | W&B run name (default: auto-generated) |
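
The parameter table above can be captured as a small validated record. The following is a minimal sketch, not part of FastVideo: the `LaunchRequest` class name is hypothetical, the allowed values are copied from the table, and `run_name` is left as `None` because the auto-generation format is not specified here.

```python
from dataclasses import dataclass, field
from typing import Optional

# Allowed values copied from the parameter table above.
PIPELINES = {"finetune", "distill-dmd", "self-forcing", "lora", "consistency"}
MODELS = {"wan-t2v-1.3B", "wan-i2v-14B", "ltx2", "matrixgame"}


@dataclass
class LaunchRequest:  # hypothetical name, for illustration only
    pipeline: str
    model: str
    data_path: str
    num_gpus: int
    overrides: dict = field(default_factory=dict)
    output_dir: Optional[str] = None
    run_name: Optional[str] = None  # auto-generation format is not documented

    def __post_init__(self) -> None:
        if self.pipeline not in PIPELINES:
            raise ValueError(f"unknown pipeline: {self.pipeline!r}")
        if self.model not in MODELS:
            raise ValueError(f"unknown model: {self.model!r}")
        if self.num_gpus < 1:
            raise ValueError("num_gpus must be at least 1")
        # Documented default: outputs/<model>_<pipeline>
        if self.output_dir is None:
            self.output_dir = f"outputs/{self.model}_{self.pipeline}"


request = LaunchRequest("finetune", "wan-t2v-1.3B", "data/crush_smol_preprocessed/", 4)
print(request.output_dir)  # outputs/wan-t2v-1.3B_finetune
```

Rejecting unknown values up front keeps a typo in `pipeline` or `model` from surfacing only after the job has been scheduled.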
| Pipeline | Entrypoint |
|---|---|
| `finetune` (Wan T2V) | `fastvideo/training/wan_training_pipeline.py` |
| `finetune` (Wan I2V) | `fastvideo/training/wan_i2v_training_pipeline.py` |
| `finetune` (LTX-2) | `fastvideo/training/ltx2_training_pipeline.py` |
| `finetune` (Matrix-Game 2.0) | `fastvideo/training/matrixgame2_training_pipeline.py` |
| `distill-dmd` | `fastvideo/training/wan_distillation_pipeline.py` |
| `self-forcing` | `fastvideo/training/wan_self_forcing_distillation_pipeline.py` |
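
One way to turn the entrypoint table above into a lookup is sketched below. The function name `resolve_entrypoint` is hypothetical, and pairing each `model` value with a finetune row (for example `wan-t2v-1.3B` with "Wan T2V") is an assumption. The table lists no entrypoint for `lora` or `consistency`, so the sketch raises an error for them instead of guessing.

```python
# Rows copied from the entrypoint table above. Matching model IDs to the
# finetune rows is an assumption made for this sketch.
FINETUNE_ENTRYPOINTS = {
    "wan-t2v-1.3B": "fastvideo/training/wan_training_pipeline.py",
    "wan-i2v-14B": "fastvideo/training/wan_i2v_training_pipeline.py",
    "ltx2": "fastvideo/training/ltx2_training_pipeline.py",
    "matrixgame": "fastvideo/training/matrixgame2_training_pipeline.py",
}
PIPELINE_ENTRYPOINTS = {
    "distill-dmd": "fastvideo/training/wan_distillation_pipeline.py",
    "self-forcing": "fastvideo/training/wan_self_forcing_distillation_pipeline.py",
}


def resolve_entrypoint(pipeline: str, model: str) -> str:
    """Return the training script for a pipeline/model pair, or raise LookupError."""
    if pipeline == "finetune":
        table, key = FINETUNE_ENTRYPOINTS, model
    else:
        table, key = PIPELINE_ENTRYPOINTS, pipeline
    if key not in table:
        raise LookupError(f"no entrypoint listed for pipeline={pipeline!r}, model={model!r}")
    return table[key]


print(resolve_entrypoint("finetune", "wan-t2v-1.3B"))
# fastvideo/training/wan_training_pipeline.py
```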
Find the closest example script in examples/training/ for the model:
| Model | Example Script Directory |
|---|---|
| `wan-t2v-1.3B` | `examples/training/finetune/wan_t2v_1.3B/crush_smol/` |
| `wan-i2v-14B` | `examples/training/finetune/wan_i2v_14B_480p/crush_smol/` |
| `ltx2` | `examples/training/finetune/ltx2/` |
| `matrixgame` | `examples/training/finetune/MatrixGame2.0/` |
| `distill-dmd` | `scripts/distill/v1_distill_dmd_wan.sh` |
Read the script to extract default values for:
- `--learning_rate`, `--train_batch_size`, `--sp_size`, `--tp_size`
- `--num_latent_t`, `--num_height`, `--num_width`, `--num_frames`
- `--gradient_accumulation_steps`, `--max_train_steps`
- `--mixed_precision`, `--weight_decay`, `--max_grad_norm`
- `--validation_steps`, `--validation_sampling_steps`

export WANDB_API_KEY="${WANDB_API_KEY}"
export WANDB_BASE_URL="https://api.wandb.ai"
export FASTVIDEO_ATTENTION_BACKEND=FLASH_ATTN
export TOKENIZERS_PARALLELISM=false
export TRITON_CACHE_DIR=/tmp/triton_cache
torchrun --nnodes 1 --nproc_per_node <num_gpus> \
<entrypoint> \
--pretrained_model_name_or_path <model_hf_id> \
--data_path "<data_path>" \
--output_dir "<output_dir>" \
--wandb_run_name "<run_name>" \
--tracker_project_name "<project_name>" \
--log_validation \
<...all hyperparameters...>
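
The default-extraction and command-assembly steps above could be sketched as follows. This is an illustration under stated assumptions, not FastVideo code. `parse_flag_defaults` and `build_command` are hypothetical helpers, and the inline script text and its values are made up for the demo rather than copied from a real example script. The tokenizer is deliberately simple: a value-less flag followed directly by an unrelated word would be misread.

```python
import shlex

# Flags the skill reads defaults for (from the list above).
TUNABLE_FLAGS = {
    "learning_rate", "train_batch_size", "sp_size", "tp_size",
    "num_latent_t", "num_height", "num_width", "num_frames",
    "gradient_accumulation_steps", "max_train_steps",
    "mixed_precision", "weight_decay", "max_grad_norm",
    "validation_steps", "validation_sampling_steps",
}


def parse_flag_defaults(script_text: str) -> dict:
    """Collect `--flag value` pairs for the tunable flags in a launch script."""
    joined = script_text.replace("\\\n", " ")  # undo shell line continuations
    tokens = shlex.split(joined, comments=True)
    defaults = {}
    for i, token in enumerate(tokens):
        if not token.startswith("--"):
            continue
        name, sep, value = token[2:].partition("=")
        if not sep:  # no "=", so the value (if any) is the next token
            nxt = tokens[i + 1] if i + 1 < len(tokens) else "--"
            value = True if nxt.startswith("--") else nxt
        if name in TUNABLE_FLAGS:
            defaults[name] = value
    return defaults


def build_command(num_gpus: int, entrypoint: str, args: dict) -> list:
    """Assemble the torchrun argv in the shape of the template above."""
    cmd = ["torchrun", "--nnodes", "1", "--nproc_per_node", str(num_gpus), entrypoint]
    for name, value in args.items():
        if value is True:  # boolean switch such as --log_validation
            cmd.append(f"--{name}")
        elif value is not None and value is not False:
            cmd += [f"--{name}", str(value)]
    return cmd


# Illustrative script text; the values are NOT taken from a real example script.
EXAMPLE_SCRIPT = """
torchrun --nnodes 1 --nproc_per_node 1 train.py \\
    --learning_rate 1e-5 \\
    --max_train_steps 500 \\
    --sp_size 1
"""

defaults = parse_flag_defaults(EXAMPLE_SCRIPT)
fixed = {
    "data_path": "data/crush_smol_preprocessed/",
    "output_dir": "outputs/wan-t2v-1.3B_finetune",
    "log_validation": True,
}
overrides = {"learning_rate": 5e-5, "max_train_steps": 1000}
# Later dicts win, so user overrides replace the script defaults.
cmd = build_command(4, "fastvideo/training/wan_training_pipeline.py",
                    {**fixed, **defaults, **overrides})
print(shlex.join(cmd))
```

Building the command as a list and quoting it only for display with `shlex.join` avoids quoting bugs when a path contains spaces.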
After launching, append an entry to .agents/memory/experiment-journal/README.md:
## [YYYY-MM-DD] Experiment: <run_name>
- **Hypothesis**: <user-provided or auto-generated>
- **Config**: model=<model>, lr=<lr>, sp_size=<sp>, gpus=<n>, script=<entrypoint>
- **W&B run**: <pending — will be updated by monitor skill>
- **Status**: running
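
Appending the journal entry can be sketched as below. The helper names `format_entry` and `append_entry` are hypothetical, and the demo writes to a temporary directory so the real journal is not touched. The field values in the demo call are placeholders.

```python
import tempfile
from datetime import date
from pathlib import Path


def format_entry(run_name, hypothesis, model, lr, sp_size, gpus, entrypoint, today=None):
    """Render one journal entry in the template shown above."""
    day = (today or date.today()).isoformat()
    return (
        f"## [{day}] Experiment: {run_name}\n"
        f"- **Hypothesis**: {hypothesis}\n"
        f"- **Config**: model={model}, lr={lr}, sp_size={sp_size}, "
        f"gpus={gpus}, script={entrypoint}\n"
        # \u2014 is the dash character used in the template's pending marker.
        "- **W&B run**: <pending \u2014 will be updated by monitor skill>\n"
        "- **Status**: running\n"
    )


def append_entry(journal: Path, entry: str) -> None:
    """Append an entry, creating the journal file and its folders if missing."""
    journal.parent.mkdir(parents=True, exist_ok=True)
    with journal.open("a", encoding="utf-8") as fh:
        fh.write("\n" + entry)


# Demo against a temporary directory instead of the real repository.
with tempfile.TemporaryDirectory() as tmp:
    journal = Path(tmp) / ".agents/memory/experiment-journal/README.md"
    entry = format_entry(
        run_name="demo-run",            # placeholder values for illustration
        hypothesis="lower lr stabilizes training",
        model="wan-t2v-1.3B",
        lr="5e-5",
        sp_size=1,
        gpus=4,
        entrypoint="fastvideo/training/wan_training_pipeline.py",
        today=date(2026, 3, 2),
    )
    append_entry(journal, entry)
    written = journal.read_text(encoding="utf-8")

print(written)
```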
Launch a Wan T2V 1.3B finetune on 4 GPUs with lr=5e-5 and max_train_steps=1000:
pipeline: finetune
model: wan-t2v-1.3B
data_path: data/crush_smol_preprocessed/
num_gpus: 4
overrides:
  learning_rate: 5e-5
  max_train_steps: 1000
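
If a request like the one above is loaded with a YAML 1.1 parser such as PyYAML, `5e-5` arrives as the string `"5e-5"` rather than a float, because that resolver only recognizes floats with a decimal point. The sketch below shows one hedged way to normalize override values before turning them into flags. `coerce_override` is a hypothetical helper, and the `loaded` dict stands in for the parser's output.

```python
def coerce_override(value):
    """Turn numeric-looking strings into numbers; leave everything else alone."""
    if isinstance(value, str):
        for cast in (int, float):
            try:
                return cast(value)
            except ValueError:
                continue
    return value


# Stand-in for what a YAML 1.1 loader may hand back for the example above.
loaded = {"learning_rate": "5e-5", "max_train_steps": 1000}

overrides = {name: coerce_override(value) for name, value in loaded.items()}
flags = [part for name, value in overrides.items()
         for part in (f"--{name}", str(value))]
print(flags)  # ['--learning_rate', '5e-05', '--max_train_steps', '1000']
```

Non-numeric strings such as a precision name pass through unchanged, so the same helper can be applied to every override.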
- `examples/training/finetune/wan_t2v_1.3B/crush_smol/finetune_t2v.sh`
- `scripts/distill/v1_distill_dmd_wan.sh`
- `docs/training/finetune.md` (training arguments table)
- `fastvideo/training/trackers.py` (tracker initialization)

| Date | Change |
|---|---|
| 2026-03-02 | Initial version |