Run any Skill in Manus with one click

$pwd:

add-model

Name: Add Model
Author: hao-ai-lab

// Manual /add-model workflow for implementing a FastVideo model or first-class component port after add-model-01-prep has staged reference code and weights. Organizes the port into numbered phases with conversion rules, component policies, parity gates, and handoff checks.

Run Skill in Manus

$ git log --oneline --stat

stars:3,655

forks:350

updated:May 26, 2026 at 20:45

File Explorer

16 files

SKILL.md

readonly

related-skills.json

same repository

add-model-02-parity.md

from "hao-ai-lab/FastVideo"

Use during /add-model after reference/architecture study to scaffold and later activate local FastVideo component parity tests. Emphasizes early test creation, official-reference loading, standardized FastVideo loading, and non-skip handoff gates.

2026-05-263.7k

add-model-03-port-dit.md

from "hao-ai-lab/FastVideo"

Use during /add-model Phase 4 or Phase 6 to prototype or parity-debug one FastVideo-native DiT/transformer component.

2026-05-263.7k

add-model-08-trace.md

from "hao-ai-lab/FastVideo"

Use during /add-model Phase 6 when component parity has failed and root cause requires layer-by-layer divergence analysis. Uses FastVideo activation trace first, falling back to custom hooks only for boundaries or stats the utility cannot observe.

2026-05-263.7k

add-model-09-pipeline.md

from "hao-ai-lab/FastVideo"

Use during /add-model Phase 7 after all required component parity tests pass to define FastVideo pipeline wiring, configs, presets, registry entries, examples, smoke tests, and pipeline parity tests.

2026-05-263.7k

dreamverse-deploy.md

from "hao-ai-lab/FastVideo"

Use when redeploying the migrated Dreamverse app backend and frontend on a chosen local GPU; tears down existing ports, launches services, and waits for readiness checks.

2026-05-233.7k

reseed-performance-baseline.md

from "hao-ai-lab/FastVideo"

Re-seed the HF performance-tracking baseline for an intentional runtime, dependency, or environment-caused benchmark shift using one or more reviewed normalized performance JSONs. Use when performance CI fails because metrics such as latency, throughput, component time, or peak memory changed for an accepted reason and the rolling median baseline in FastVideo/performance-tracking must be advanced from a consistent batch of reviewed source results. The workflow backs up existing history under /tmp, validates all source JSONs for the same (model_id, gpu_type), rejects internally inconsistent source batches, uploads one success=true reseed record per accepted source JSON, and offers to clean local temp state after a successful upload.

2026-05-203.7k

package.json

"author": "hao-ai-lab"

"repository": "hao-ai-lab/FastVideo"

View GitHub Repository View Creator Repositories

$ install --global

$ download --local

Run Skill in Manus

$ useful --forSOC

Software DevelopersComputer and Mathematical Occupations15-1252L4

name	add-model
description	Manual /add-model workflow for implementing a FastVideo model or first-class component port after add-model-01-prep has staged reference code and weights. Organizes the port into numbered phases with conversion rules, component policies, parity gates, and handoff checks.

Add Model

Manual Invocation

This skill is for explicit /add-model use only. Do not auto-start it from a casual model-port mention. The setup-only workflow is ../add-model-01-prep/SKILL.md.

Goal

Port a new FastVideo model family, model variant, or first-class reusable component so it can be loaded through FastVideo's native model, config, stage, registry, preset, and test infrastructure.

FastVideo has one pipeline architecture: stage-based composition via ComposedPipelineBase. Vary the stages and modules, not the architecture.

Scope Shapes

Use this skill for either shape:

Shape	Required output
Full model family or variant	Native components, conversion if needed, pipeline config/class, presets, registry, smoke test, local parity tests, example, quality regression.
First-class component contribution	Native component class/config, bucket export, component parity test, and a documented downstream pipeline that will consume it. Skip pipeline/preset/registry rows only when the contribution is intentionally component-only.

If upstream ships many variants, lock scope before coding. "Base model" means checkpoint variant, not a modality subset. If the base checkpoint produces audio, pose, depth, masks, or other output heads, either support those outputs or get explicit user agreement to drop them.

Required Input

Start from an add-model-01-prep handoff, or equivalent fields matching contracts/prep_handoff.md.

Before Phase 0, read the shared rules and all relevant schemas:

shared/common_rules.md
contracts/prep_handoff.md
contracts/port_state.md
contracts/escape_hatch.md
contracts/component_context.md
contracts/parity_status.md
contracts/conversion_request.md
contracts/conversion_handoff.md
contracts/component_skill_handoff.md
contracts/pipeline_context.md
contracts/pipeline_handoff.md
contracts/final_handoff.md

Hard Rules

Follow shared/common_rules.md for token/auth safety, state files, escape hatches, production import boundaries, and skip/pass semantics.
If the prep handoff is missing or ambiguous, stop and run ../add-model-01-prep/SKILL.md.
If a needed component is not ported, do not ship the pipeline that needs it.
Wan is grandfathered for missing local parity; do not copy its missing-test precedent for new work.

Escape Hatches

Follow shared/common_rules.md and contracts/escape_hatch.md. The main orchestrator should ask only when no phase skill can safely continue under the shared rules.

Files Map

Area	Paths
DiT	`fastvideo/models/dits/<family>.py`, `fastvideo/configs/models/dits/<family>.py`, bucket `__init__.py`.
VAE	`fastvideo/models/vaes/<arch_or_family>.py`, `fastvideo/configs/models/vaes/<arch_or_family>.py`, bucket `__init__.py`. Name by shared arch when reusable (`oobleck.py`, `autoencoder_kl.py`), otherwise by family (`wanvae.py`).
Encoder / conditioner / scheduler / upsampler	Native class/config in the matching `fastvideo/models/<bucket>/` and `fastvideo/configs/models/<bucket>/` bucket.
Lazy loader wrapper	Optional `fastvideo/models/<bucket>/<family>_loader.py` or similar thin `nn.Module` wrapper when a component is fetched from an external HF repo and should be hidden from host-pipeline state-dict matching.
Conversion	`scripts/checkpoint_conversion/<family>_to_diffusers.py` only when `needs_conversion=yes`.
Pipeline	`fastvideo/pipelines/basic/<family>/<family>_pipeline.py` plus sibling files for variants whose components or required modules differ.
Pipeline config	`fastvideo/configs/pipelines/<family>.py` or `fastvideo/pipelines/basic/<family>/pipeline_configs.py`.
Stages	`fastvideo/pipelines/basic/<family>/stages/` only for model-specific stage subclasses.
Presets / registry	`fastvideo/pipelines/basic/<family>/presets.py`, `fastvideo/registry.py`.
Tests	Component parity under `tests/local_tests/<bucket>/`; pipeline smoke/parity under `tests/local_tests/pipelines/`; CI-backed quality tests under `fastvideo/tests/`.
Example	`examples/inference/basic/basic_<family>*.py`, one per public mode/variant.

Phase 0: Scope And Handoff Gate

Validate every required handoff field.
Resolve needs_conversion=unknown before component work:

python ".agents/skills/add-model-01-prep/scripts/inspect_hf_layout.py" \
    "<hf-or-local-path>" \
    --json

List first-PR scope across both axes:
- Variant axis: base, distill, SR/refine, causal, DMD, I2V, V2V, etc.
- Modality axis: video, image, audio, pose, depth, masks, text, etc.
For component-only work, explicitly name the downstream full-pipeline PR or planned consumer.
Confirm official_env_status is imports_ok or private_deps_need_stubs. If it is blocked, return to ../add-model-01-prep/SKILL.md before parity scaffolding.
Confirm local_tests_readme exists and records official setup, HF weights, dependency changes, and planned parity commands for reviewers.
Confirm port_state_file exists, follows contracts/port_state.md, and has rows for open questions/issues found during prep.
If there are multiple official implementations, choose the one whose architecture matches the published weights. A blessed library port can be a better parity reference than a highly configurable research repo; document the choice in tests.

Phase 1: Reference And Architecture Study

Read the official pipeline call path before writing code.

Record:

Required modules from model_index.json or equivalent: transformer, VAE, text encoders, tokenizers, scheduler, image encoders, audio VAE, vocoder, conditioners, upsamplers.
Input/output modalities and every dedicated DiT output head.
Text/image/audio encoding flow, latent shape, dtype, scaling, packing, scheduler/timestep math, guidance math, VAE normalization, and decode flow.
Whether the official code relies on private deps, custom ops, or special kernels that parity tests must stub.

Arch config rule:

ArchConfig fields must match the emitted per-component config, especially transformer/config.json, one-to-one.
Pipeline knobs do not belong on the DiT arch config: inference steps, CFG scales, flow shift, FPS, VAE stride, text target length, data-proxy knobs, eval defaults, and sampling defaults go on PipelineConfig, presets, or stages.
If the HF repo is raw or has empty configs, synthesize transformer/config.json from the official Python model-config class, not from data/eval config classes.

Phase 2: Early Parity Scaffolding

Create component parity tests before or alongside implementation. Use ../add-model-02-parity/SKILL.md and its templates/component_parity_test.py. The official reference must import in the current FastVideo environment, or the prep handoff must identify private deps that will be stubbed locally for tests. Use local_tests_readme as the reviewer-facing source for setup commands and update its planned test table as parity scaffolds are added.

This phase is early by design:

Official loading can be implemented from the reference study.
FastVideo loading can target planned standardized class/config/loader paths.
Tests may initially skip because the FastVideo class or converted weights do not exist yet.
The scaffold must still contain real official loading, deterministic inputs, output extraction, and concrete tensor comparisons. No unconditional skips, no shape-only tests.

Use subagents here: dispatch one parity-test subagent per required component, including components that may be reused. Their output becomes the red/skip target that porting or reuse-verification subagents make pass later.

Phase 3: Reuse Gate And Component Dispatch

Build a component inventory before implementation:

Field	Meaning
Component	transformer, VAE, text encoder, image encoder, scheduler, conditioner, upsampler, vocoder, etc.
Official definition	Repo-relative source file, class/function name, and relevant line/range if known.
Official instantiation	Repo-relative pipeline/config/factory call site plus constructor args and runtime flags.
FastVideo target	Existing class to reuse or new bucket/file/config to add.
Parity test	Required local test path, including reused components.
Status	`reuse_pending`, `reuse_proven`, `port_pending`, `non_skip_pass`, or `blocked`.

Reuse is allowed only from the checked-out FastVideo tree. Do not wait for or depend on an open PR adding a native class; add the native port directly in this PR if the current tree cannot be reused.

Reuse decision:

Record exact official definition and instantiation evidence for every component.
If an existing FastVideo class and config match both definition and instantiation, pass that reused target to the bucket-specific skill in mode=prototype and require reuse evidence plus key/shape dumps.
If either definition or instantiation differs, port the component directly as FastVideo-native code through the bucket-specific skill.
Reused components still require non-skip component parity against the exact official instantiation used by the target pipeline.

Porting subagent dispatch:

Dispatch one subagent per component after Phase 2 parity scaffolds exist.
Use ../add-model-03-port-dit/SKILL.md for DiTs/transformers.
Use ../add-model-04-port-vae/SKILL.md for VAEs.
Use ../add-model-05-port-encoder/SKILL.md for text, image, audio, or compound encoders/conditioners that fit the encoder config bucket.
Use ../add-model-06-port-generic/SKILL.md for schedulers, upsamplers, vocoders, adapters, preprocessors, or unknown components.
Each subagent owns one component only and must loop on that component's local parity test until it produces a non-skip PASS or returns a precise blocker.

Every component subagent must receive a complete packet matching contracts/component_context.md. If any required path is unknown, pass unknown plus the exact search already performed. Do not silently omit ambiguous official files or prototype concerns.

Bucket, layer, and attention rules live in the bucket-specific skills and fastvideo/layers/AGENTS.md.

Phase 4: Native Component Prototype

Conversion needs a FastVideo state-dict surface. Use the Phase 3 bucket-specific skill in mode=prototype for every required component, including reused components.

Prototype success criteria:

the FastVideo-native or reused class/config can import and instantiate with the exact official architecture args;
official and FastVideo key/shape dumps exist for every stateful component;
local_tests_readme and port_state_file record prototype status and concerns;
the returned handoff matches contracts/component_skill_handoff.md.

Do not chase numerical parity in Phase 4. Prototype mode ends when conversion has the key/shape surface it needs, or when the component skill returns a precise blocker or escape hatch.

Phase 5: Param Mapping And Weight Conversion

Use ../add-model-07-conversion/SKILL.md after Phase 4 prototypes exist. Send a request matching contracts/conversion_request.md; consume the returned contracts/conversion_handoff.md update before Phase 6.

Use the prep handoff's needs_conversion value:

no: verify the source already has the component layout FastVideo loaders can consume, then record any passthrough components.
yes: write scripts/checkpoint_conversion/<family>_to_diffusers.py and output converted_weights/<family>/.
unknown: return to Phase 0.

The conversion skill owns source-layout handling, mapping derivation, config and model_index.json emission, passthrough assets, strict-load verification, and Phase 6 retry requests. Component skills must not patch conversion scripts or converted weights ad hoc.

Phase 6: Component Parity Debug

This is the expected expensive loop. Dispatch one subagent per required component, including reused components, using the bucket-specific skill in mode=parity-debug.

Each subagent gets:

the complete component context packet from Phase 3/4;
updated conversion mapping notes and strict-load result from Phase 5;
any prototype concerns or unknowns that were not resolved before conversion.

The bucket-specific skills own parity-debug tactics. If a failure belongs to conversion, route it through ../add-model-07-conversion/SKILL.md with a retry request matching contracts/conversion_request.md, then resume the component skill with the updated conversion handoff.

When a component failure narrows to layer-by-layer numerical drift, load ../add-model-08-trace/SKILL.md before writing custom hooks. It uses fastvideo/hooks/activation_trace.py; canonical env vars and JSONL format are documented in docs/contributing/activation_trace.md.

Phase 6 ends only when every required component handoff reports parity_status=non_skip_pass, or when a precise blocker or escape hatch is recorded in port_state_file.

Phase 7: Pipeline, Stages, And Variants

Do not start Phase 7 until every required component, reused or ported, has a non-skip local parity PASS from Phase 6. If any component parity test is still scaffold_skip, debug_red, blocked, or missing, resume Phase 6 first.

Use ../add-model-09-pipeline/SKILL.md for pipeline definition and parity-debug. Send a complete packet matching contracts/pipeline_context.md; consume the returned contracts/pipeline_handoff.md before moving to quality regression or final handoff.

The pipeline skill owns:

pipeline class, stage chain, and optional model-specific stages;
pipeline config, presets, registry updates, and examples;
official args/defaults/presets comparison before setting FastVideo defaults;
pipeline smoke and parity tests;
continuous pipeline parity-debug until non-skip PASS or precise blocker;
updates to local_tests_readme and port_state_file.

The pipeline handoff must explicitly cover stage order, variants, modality and output-head handling, config/preset/registry/example status, smoke/parity tests, and any return-to-Phase-6 evidence.

Phase 8: PipelineConfig, Presets, Registry, Examples

This phase is implemented through ../add-model-09-pipeline/SKILL.md after the Phase 7 component-parity gate passes. Accept the pipeline handoff only if it covers configs, presets, registry detection/exact class resolution, examples, new SamplingParam fields for public kwargs/defaults, and local smoke/parity status. Detailed rules live in ../add-model-09-pipeline/SKILL.md.

Phase 9: Parity Activation And Local Verification

Local parity is author-run, not CI-enforced. CI may only run package-level quality tests later. Before handoff, Phase 2 scaffolds must be activated into non-skip PASS results.

Order is mandatory:

Run conversion if needed.
Run component parity for every required component, including reused ones.
Run pipeline smoke.
Run pipeline parity.
Run the basic example.

If pipeline smoke or parity points back to component implementation, strict-load, or conversion mapping, return to Phase 6 or Phase 5 rather than patching around the issue in the pipeline.

Skip policy:

Follow shared/common_rules.md: a committed local test may skip for absent clones/weights, but a local skip is not a verified pass.

Use the commands and tolerance guidance from ../add-model-02-parity/SKILL.md for component checks and from ../add-model-09-pipeline/SKILL.md for pipeline smoke, pipeline parity, and examples. Record exact commands, status, and blockers in local_tests_readme and port_state_file.

Phase 10: Quality Regression

Video outputs:

Add fastvideo/tests/ssim/test_<family>_similarity.py when output video quality must be preserved.
Seed references through seed-ssim-references after the test exists.

Audio outputs:

SSIM does not apply. Use an audio-specific regression metric such as mel-spectrogram L1, multi-resolution STFT, CLAP cosine, or a project-approved learned metric.
Document the metric and hardware/runtime assumptions in the test.

Joint AV outputs:

Keep video and audio regression checks separate unless there is a validated joint metric.

Phase 11: Post-Parity Review And Handoff

After parity is green, run a hot-path review before handoff:

Hoist constant tensor allocations out of sampler/denoising loops.
Replace per-step randn_like churn with preallocated buffers plus .normal_() when safe.
Move torch.backends.* flag changes to one-shot setup/load paths.
Delete batch.extra writes that nothing reads.
Derive magic constants from configs when possible.

Pre-handoff checklist:

[ ] Prep handoff is complete and committed nowhere with token values.
[ ] Conversion was run if needed and output loads with real weights.
[ ] Every required component, reused or newly ported, has a non-skip local parity PASS.
[ ] `local_tests_readme` lists every component parity test, command, status, and blocker if any.
[ ] `port_state_file` has every open question/issue either resolved or listed as an explicit blocker.
[ ] Any `next_step=ask_user` has a matching `escape_hatch` block and `E###` row.
[ ] Pipeline smoke has a non-skip local PASS.
[ ] Pipeline parity has a non-skip local PASS against the official reference.
[ ] Basic example runs and writes a non-corrupt output.
[ ] Video SSIM or audio-specific quality regression is added or explicitly deferred.
[ ] Runtime production code has no diffusers/transformers model-class imports.
[ ] Production comments are WHY-focused; examples have user-story docstrings.
[ ] Post-parity hot-path pass is complete.

Ask before deleting any reference clone or staged weights created by add-model-01-prep. Leave .gitignore entries so future parity assets stay untracked. Never commit the clone, weights, .env, credentials, or anything matching *secret*.

References

../add-model-01-prep/SKILL.md for user-input collection, HF inspection, weight staging, reference cloning, and setup handoff.
contracts/ for canonical handoff schemas used by prep, parity, conversion, component porting, escape hatches, and final handoff.
../add-model-02-parity/SKILL.md for early component parity scaffolds and activation templates.
../add-model-07-conversion/SKILL.md for Phase 5 mapping, conversion scripts, monolithic checkpoint splitting, and strict-load checks.
../add-model-03-port-dit/SKILL.md, ../add-model-04-port-vae/SKILL.md, ../add-model-05-port-encoder/SKILL.md, and ../add-model-06-port-generic/SKILL.md for component subagent implementation and parity-debug loops.
../add-model-09-pipeline/SKILL.md for pipeline definition, config/preset/ registry/example wiring, smoke tests, and pipeline parity-debug.
fastvideo/layers/AGENTS.md for native layer selection and state-dict surface guidance.
docs/contributing/coding_agents.md for narrative context.
docs/design/overview.md for pipeline/config/registry architecture.
fastvideo/pipelines/basic/wan/ for standard T2V/I2V/DMD/Causal variants.
fastvideo/pipelines/basic/ltx2/ for non-standard stages and audio/video patterns.
tests/local_tests/pipelines/test_gamecraft_pipeline_parity.py for pipeline parity shape.
tests/local_tests/transformers/test_ltx2.py, tests/local_tests/vaes/test_ltx2_vae.py, and tests/local_tests/encoders/test_ltx2_gemma_parity.py for component parity.
scripts/checkpoint_conversion/convert_ltx2_weights.py for modern conversion script shape.
scripts/checkpoint_conversion/wan_to_diffusers.py for legacy regex mapping reference only.
REVIEW.md is historical; its decisions are incorporated here as of 2026-04-30.

Changelog

Date	Change
2026-04-24	Initial FastVideo add-model workflow.
2026-04-30	Split external setup into `add-model-01-prep`.
2026-04-30	Rewrote as manual `/add-model` phase workflow and incorporated `REVIEW.md` decisions.
2026-04-30	Extracted early parity scaffolding into `add-model-02-parity` and moved it before conversion/component implementation.
2026-04-30	Added component reuse proof gate, bucket-specific porting skills, and parity PASS requirement for reused components.
2026-04-30	Split prototype, conversion, and parity-debug phases; added conversion skill for monolithic and separate checkpoint layouts.
2026-04-30	Extracted handoff schemas into `contracts/` for shared use across skills.
2026-04-30	Added pipeline skill contract and Phase 7 component-parity gate.
2026-04-30	Added escape-hatch contract for user decisions and `ask_user` handoffs.

add-model

More from this repository

More from this repository

Add Model

Manual Invocation

Goal

Scope Shapes

Required Input

Hard Rules

Escape Hatches

Files Map

Phase 0: Scope And Handoff Gate

Phase 1: Reference And Architecture Study

Phase 2: Early Parity Scaffolding

Phase 3: Reuse Gate And Component Dispatch

Phase 4: Native Component Prototype

Phase 5: Param Mapping And Weight Conversion

Phase 6: Component Parity Debug

Phase 7: Pipeline, Stages, And Variants

Phase 8: PipelineConfig, Presets, Registry, Examples

Phase 9: Parity Activation And Local Verification

Phase 10: Quality Regression

Phase 11: Post-Parity Review And Handoff

References

Changelog

Add Model

Manual Invocation

Goal

Scope Shapes

Required Input

Hard Rules

Escape Hatches

Files Map

Phase 0: Scope And Handoff Gate

Phase 1: Reference And Architecture Study

Phase 2: Early Parity Scaffolding

Phase 3: Reuse Gate And Component Dispatch

Phase 4: Native Component Prototype

Phase 5: Param Mapping And Weight Conversion

Phase 6: Component Parity Debug

Phase 7: Pipeline, Stages, And Variants

Phase 8: PipelineConfig, Presets, Registry, Examples

Phase 9: Parity Activation And Local Verification

Phase 10: Quality Regression

Phase 11: Post-Parity Review And Handoff

References

Changelog