

veomni-new-model

Name: Veomni New Model
Author: ByteDance-Seed

// Use this skill when adding support for a new model to VeOmni. Covers the full lifecycle: analyzing the HuggingFace model, creating model patches, defining parallel plans, writing configs, integrating with the trainer, and testing. Trigger: 'add model', 'support new model', 'integrate <model_name>', 'new model support'.


stars:1,957

forks:197

updated:2026-05-29 16:08

SKILL.md


name	veomni-new-model
description	Use this skill when adding support for a new model to VeOmni. Covers the full lifecycle: analyzing the HuggingFace model, creating model patches, defining parallel plans, writing configs, integrating with the trainer, and testing. Trigger: 'add model', 'support new model', 'integrate <model_name>', 'new model support'.

Before You Start: Create Todos

Use TodoWrite to track all phases:

Phase 1: Analyze HF model             -> in_progress
Phase 2: Create model patch            -> pending
Phase 3: Define parallel plan          -> pending
Phase 4: Write training config         -> pending
Phase 5: Integrate with trainer        -> pending
Phase 6: Test                          -> pending

Phase 1: Analyze HuggingFace Model

Identify the model on HuggingFace. Read its config.json, modeling_*.py, and any processor configs.
Determine model category:
- Text-only LLM -> veomni/models/transformers/<model_name>/
- Vision-Language -> veomni/models/transformers/<model_name>/ + veomni/data/multimodal/
- MoE model -> additional veomni/distributed/moe/ integration
- Diffusion model -> veomni/models/diffusers/<model_name>/
- Omni model -> veomni/models/seed_omni/
Check existing similar models: Find the closest existing model in veomni/models/transformers/ and use it as a reference. E.g., if adding a new Qwen variant, reference qwen3/ or qwen3_vl/.
Identify required patches: VeOmni uses a patchgen system (veomni/patchgen/) to auto-generate model patches from HuggingFace models. Check if a patch spec already exists or if one needs to be created.
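
As a rough illustration of the category check above, the sketch below maps a parsed config.json to the directories the list names. The config keys it inspects (vision_config, text_config, num_experts, num_local_experts, n_routed_experts) are assumptions based on common HuggingFace configs, not a rule defined by VeOmni, and diffusion and Omni models are deliberately left out because they cannot be detected from these keys alone.

```python
import json
from pathlib import Path

# Assumption: these key names appear in many HuggingFace MoE configs.
# They are a heuristic, not an official VeOmni classification rule.
MOE_KEYS = ("num_experts", "num_local_experts", "n_routed_experts")


def _has_moe_keys(config: dict) -> bool:
    """Look for MoE hints at the top level and inside a nested text_config."""
    sub = config.get("text_config")
    scopes = [config, sub] if isinstance(sub, dict) else [config]
    return any(key in scope for scope in scopes for key in MOE_KEYS)


def classify_model(config: dict, model_name: str) -> list:
    """Return the VeOmni directories a model with this config likely touches."""
    targets = [f"veomni/models/transformers/{model_name}/"]
    if "vision_config" in config:  # treated here as a Vision-Language model
        targets.append("veomni/data/multimodal/")
    if _has_moe_keys(config):  # treated here as an MoE model
        targets.append("veomni/distributed/moe/")
    return targets


def classify_from_file(config_path: str, model_name: str) -> list:
    """Convenience wrapper that reads a downloaded config.json."""
    return classify_model(json.loads(Path(config_path).read_text()), model_name)
```

The result is only a starting point; the closest existing model in veomni/models/transformers/ remains the better reference.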

Phase 2: Create Model Patch

Create the model directory: veomni/models/transformers/<model_name>/
Required files:
- __init__.py — model registration (MODELING_REGISTRY / MODEL_CONFIG_REGISTRY / MODEL_PROCESSOR_REGISTRY)
- <model_name>_gpu_patch_gen_config.py — declarative patchgen config (replace_class / override_method / replace_function / modify_init / add_post_import_block / drop_import_names) defining all VeOmni patches against the upstream HF modeling
- <model_name>_npu_patch_gen_config.py — NPU patchgen config (often just imports the GPU config and applies NPU-specific overrides via name_map)
- parallel_plan.py — FSDP / TP / EP sharding plan
- generated/patched_modeling_<model_name>_{gpu,npu}.py — patchgen output (do NOT edit manually)
Patch patterns — follow existing models:
- Sequence parallel: declare an OpSlot for attention/loss and override forward via patchgen
- MoE: stack per-expert weights (gate_up_proj [E, 2*I, H] / down_proj [E, H, I]) and add a veomni_moe_experts_forward OpSlot
- Cross-entropy: add a veomni_causal_lm_loss OpSlot and return CausalLMOutputWithLogProbs
- Register the model class in the model package __init__.py (no entry in veomni/models/auto.py is needed for transformers models — registration happens via the per-model MODELING_REGISTRY decorators)
Run patchgen: run make patchgen to regenerate every generated/patched_modeling_*.py from its matching *_patch_gen_config.py.
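
To make the stacked MoE layout from the patch patterns concrete, here is a dependency-free sketch using nested lists in place of tensors. The per-expert key names (gate_proj, up_proj, down_proj) and the choice to place gate rows before up rows are assumptions; check an existing MoE model in the repository for the actual ordering. With real tensors the same thing would be a concatenation along dim 0 followed by a stack over experts.

```python
def stack_expert_weights(experts):
    """Stack per-expert weights into fused layouts.

    experts: list of E dicts holding "gate_proj" [I, H], "up_proj" [I, H]
             and "down_proj" [H, I] as nested lists.
    Returns (gate_up_proj [E, 2*I, H], down_proj [E, H, I]).
    """
    # Concatenate gate and up along the first dimension: [I, H] + [I, H] -> [2*I, H].
    gate_up = [e["gate_proj"] + e["up_proj"] for e in experts]
    # down_proj only needs the new leading expert dimension.
    down = [e["down_proj"] for e in experts]
    return gate_up, down


def shape(nested):
    """Shape of a rectangular nested list, e.g. [[0, 0], [0, 0]] -> [2, 2]."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return dims


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


E, I, H = 3, 4, 2  # experts, intermediate size, hidden size
experts = [
    {"gate_proj": zeros(I, H), "up_proj": zeros(I, H), "down_proj": zeros(H, I)}
    for _ in range(E)
]
gate_up_proj, down_proj = stack_expert_weights(experts)
print(shape(gate_up_proj), shape(down_proj))  # [3, 8, 2] [3, 2, 4]
```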

Phase 3: Define Parallel Plan

Create parallel_plan.py in the model directory.
Define FSDP/FSDP2 sharding strategy:
- Which layers to wrap (typically transformer blocks)
- Activation checkpointing granularity
- Parameter dtype policies
If the model is MoE, define expert parallelism plan in addition to FSDP.
Reference existing parallel plans for guidance (e.g., veomni/models/transformers/qwen3/parallel_plan.py).
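
The decisions listed above (which layers to wrap, checkpointing granularity, dtype policy, expert parallelism) can be written down before touching parallel_plan.py. The PlanSketch class below is purely hypothetical and is not the structure VeOmni expects; it only shows how module-name patterns select transformer blocks and expert modules. The module names are assumptions modeled on typical HuggingFace layouts.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PlanSketch:
    """Hypothetical summary of a sharding plan; not VeOmni's real API."""

    fsdp_wrap: str = r"model\.layers\.\d+"   # one FSDP unit per transformer block
    checkpoint: str = r"model\.layers\.\d+"  # activation checkpointing granularity
    expert_parallel: Optional[str] = None    # only set for MoE models
    param_dtype: str = "bfloat16"            # parameter dtype policy

    @staticmethod
    def matches(pattern: Optional[str], module_names: List[str]) -> List[str]:
        """Return the module names that fully match the pattern."""
        if pattern is None:
            return []
        return [name for name in module_names if re.fullmatch(pattern, name)]


module_names = [
    "model.embed_tokens",
    "model.layers.0",
    "model.layers.0.mlp.experts",
    "model.layers.1",
    "model.layers.1.mlp.experts",
    "lm_head",
]
plan = PlanSketch(expert_parallel=r"model\.layers\.\d+\.mlp\.experts")
wrapped = plan.matches(plan.fsdp_wrap, module_names)
experts = plan.matches(plan.expert_parallel, module_names)
print(wrapped)
print(experts)
```

Full-match patterns are used so that a block pattern does not accidentally select the submodules inside each block.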

Phase 4: Write Training Config

Model config: Create configs/model_configs/<model_family>/<ModelName>.json matching HuggingFace format.
Training config: Create YAML in the appropriate directory:
- Text: configs/text/<model_name>.yaml
- Multimodal: configs/multimodal/<model_name>/<model_name>.yaml
- DiT: configs/dit/<model_name>.yaml
Config must include: model path, data config, optimizer settings, parallelism config, checkpoint settings.
Verify against existing configs — match the structure of similar model configs.
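
A small check can confirm that a draft training config covers the five required areas. The key layout in REQUIRED is an assumption made for illustration; the real YAML keys should be copied from an existing config of a similar model. The function works on an already-parsed dict, so any YAML loader can feed it.

```python
# Hypothetical key layout: real VeOmni configs may name or nest these
# sections differently, so adjust the paths to match an existing config.
REQUIRED = {
    "model path": ("model", "model_path"),
    "data config": ("data",),
    "optimizer settings": ("train", "optimizer"),
    "parallelism config": ("train", "parallelism"),
    "checkpoint settings": ("train", "checkpoint"),
}


def lookup(config, path):
    """Follow a tuple of keys into nested dicts; return None if any is absent."""
    node = config
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def missing_sections(config: dict) -> list:
    """List the required areas that are absent or empty in the parsed config."""
    return [
        label
        for label, path in REQUIRED.items()
        if lookup(config, path) in (None, {}, "")
    ]


draft = {
    "model": {"model_path": "path/to/model"},
    "data": {"train_path": "path/to/data"},
    "train": {"optimizer": {"lr": 1e-5}},
}
print(missing_sections(draft))  # ['parallelism config', 'checkpoint settings']
```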

Phase 5: Integrate with Trainer

Verify the model works with the appropriate trainer:
- Text -> TextTrainer (veomni/trainer/text_trainer.py)
- VLM -> VLMTrainer (veomni/trainer/vlm_trainer.py)
- DiT -> DitTrainer (veomni/trainer/dit_trainer.py)
If the model needs custom data preprocessing:
- Add transform in veomni/data/data_transform.py or veomni/data/multimodal/
- Register the transform for the model
If the model needs custom collator logic:
- Extend veomni/data/data_collator.py
VLM only, multimodal metadata precompute: to keep the ViT forward free of host-device CUDA syncs, derive the ViT cu_seqlens / max_seqlen in the collator rather than in the forward. Follow the checklist in .agents/knowledge/multimodal_metadata.md ("Adding the hook to a new model"), which covers:
- a collate_multimodal_metadata patchgen helper plus a get_metadata_collate_func override
- the per-modality vit_metadata sub-dict threaded through Model.forward → ViT.forward (with a runtime fallback)
- the model added to _MM_METADATA_WIRED_CASES in the sync gate test
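
The idea behind the metadata precompute can be shown without any framework. In the collator the per-image sequence lengths are still plain Python integers, so the cumulative offsets and the maximum can be computed on the host; reading a maximum out of a GPU tensor inside the forward would instead force a host-device sync. The function name and the returned dict layout below are assumptions for illustration and do not reproduce the hook described in multimodal_metadata.md.

```python
from itertools import accumulate


def build_vit_metadata(seq_lens):
    """Compute ViT attention metadata from host-side sequence lengths.

    seq_lens: number of patch tokens of each image (or video) packed into
              the batch, as plain Python ints.
    """
    # cu_seqlens[i] is the start offset of sequence i in the packed token
    # dimension, and the last entry is the total token count.
    cu_seqlens = [0, *accumulate(seq_lens)]
    # The longest single sequence, needed by variable-length attention kernels.
    max_seqlen = max(seq_lens, default=0)
    return {"cu_seqlens": cu_seqlens, "max_seqlen": max_seqlen}


print(build_vit_metadata([4, 9, 16]))
# {'cu_seqlens': [0, 4, 13, 29], 'max_seqlen': 16}
```

A real collator would convert cu_seqlens to an integer tensor before handing it to the model, while max_seqlen can stay a Python int.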

Phase 6: Test

Create toy config: Add tests/toy_config/<model_name>_toy/config.json with minimal parameters for fast testing.
Unit tests: Add tests in tests/models/ to verify:
- Model loads correctly via veomni.models.auto
- Forward pass produces correct output shape
- Model patch applies without errors
E2e tests (if feasible): Test a short training run using the toy config.
Run make quality and pytest tests/models/.
Update documentation:
- Add usage example to docs/ (training command, config reference).
- Update .agents/knowledge/architecture.md if the model adds a new module or trainer path.
- Update supported models table in project README.md if applicable.
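
One way to produce the toy config mentioned at the start of this phase is to shrink the full HuggingFace config. The sketch below is an assumption-laden starting point: the field names in TOY_OVERRIDES are common in HuggingFace configs but not universal, nested sub-configs such as text_config are not handled, and the chosen sizes are arbitrary small values that may need adjusting (for example when the model sets an explicit head_dim).

```python
import json
from pathlib import Path

# Assumption: common HuggingFace field names. Fields a model does not
# define are skipped rather than added.
TOY_OVERRIDES = {
    "hidden_size": 64,
    "intermediate_size": 128,
    "num_hidden_layers": 2,
    "num_attention_heads": 4,
    "num_key_value_heads": 2,
    "vocab_size": 1024,
}


def make_toy_config(full_config: dict) -> dict:
    """Return a copy of the config with size-related fields shrunk."""
    toy = dict(full_config)
    for key, value in TOY_OVERRIDES.items():
        if key in toy:
            toy[key] = value
    return toy


def write_toy_config(full_config: dict, repo_root: str, model_name: str) -> Path:
    """Write tests/toy_config/<model_name>_toy/config.json under repo_root."""
    out = Path(repo_root) / "tests" / "toy_config" / f"{model_name}_toy" / "config.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(make_toy_config(full_config), indent=2) + "\n")
    return out
```

The written file should still be loaded once through the model's own config class, since shrinking fields independently can break constraints between them.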

Common Pitfalls

Model registry: Registration must happen at import time in __init__.py. If the model's AutoConfig type is not registered, build_foundation_model() will fail.
Generated files: Never edit files in generated/ directories — they are overwritten by patchgen. Edit the matching <model>_{gpu,npu}_patch_gen_config.py and re-run make patchgen instead.
Tokenizer compatibility: Some models require specific tokenizer versions or custom chat templates — verify in veomni/data/chat_template.py.
Transformers version: All modeling code targets transformers==5.9.0 (pinned by the transformers-stable default dependency group). Models register through the patchgen-generated path under generated/; do not introduce legacy modeling_<m>.py files or apply_veomni_<m>_patch() helpers.
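
The model registry pitfall is easier to see with a toy example. The Registry class below is a stand-in written for this illustration and is not VeOmni's MODELING_REGISTRY; it only shows why a decorator-based registration has to run as a side effect of importing the model package, and what a lookup looks like when that import never happened.

```python
class Registry:
    """Toy name-to-class registry; a stand-in, not VeOmni's implementation."""

    def __init__(self, name):
        self.name = name
        self._entries = {}

    def register(self, key):
        """Decorator that records the decorated object under key."""
        def decorator(obj):
            if key in self._entries:
                raise KeyError(f"{key!r} is already registered in {self.name}")
            self._entries[key] = obj
            return obj
        return decorator

    def get(self, key):
        try:
            return self._entries[key]
        except KeyError:
            raise KeyError(
                f"{key!r} is not registered in {self.name}; "
                "was the model package imported?"
            ) from None


TOY_MODELING_REGISTRY = Registry("TOY_MODELING_REGISTRY")


# In a real package this sits at module top level (for example in
# __init__.py), so the registration runs when the package is imported.
@TOY_MODELING_REGISTRY.register("my_model")
class MyModelForCausalLM:
    pass


print(TOY_MODELING_REGISTRY.get("my_model").__name__)  # MyModelForCausalLM
```

If the registration were placed inside a function that nothing calls, the lookup would raise, which mirrors the failure mode described for build_foundation_model().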

related-skills.json

Same repository

veomni-migrate-transformers-v5.md

from "ByteDance-Seed/VeOmni"

Use this skill when adding or refreshing a patchgen-generated modeling file for a VeOmni model under veomni/models/transformers/<model>/generated/ — GPU-only or GPU+NPU, dense or MoE, text-only / VLM / Omni-thinker+talker. Covers: creating <model>_{gpu,npu}_patch_gen_config.py, using patchgen decorators (replace_class/override_method/replace_function/modify_init/add_post_import_block/drop_import_names), reusing sibling-model patches via name_map, handling MoE weight-loading (CheckpointTensorConverter + fused gate_up_proj layout), multimodal/VLM forward with Ulysses SP, excluding speech/vocoder subtrees in Omni models (talker/token2wav/DiT/BigVGAN), wiring __init__.py for the patchgen-generated classes, running codegen, and adding test cases. Trigger: 'port <model> to patchgen', 'add patchgen for <model>', 'transformers v5 migration', 'add NPU patchgen'. Do NOT edit files under generated/ manually — always regenerate via patchgen.

2026-05-30 | 2.0k

veomni-debug.md

from "ByteDance-Seed/VeOmni"

Use this skill for ANY bug, error, crash, wrong output, loss divergence, gradient explosion, test failure, CUDA error, distributed training hang, checkpoint load failure, or unexpected behavior. Covers both quick fixes (clear root cause) and complex debugging (unclear cause). Trigger: 'fix bug', 'fix error', 'broken', 'crash', 'doesn't work', 'fails with', 'loss NaN', 'training hangs', 'FSDP error', 'OOM'.

2026-05-29 | 2.0k

veomni-develop.md

from "ByteDance-Seed/VeOmni"

VeOmni-specific checklist for feature development and refactoring. Covers impact analysis across modalities, trainer hierarchy, data pipeline, and distributed code. Use before implementing any non-trivial change. For model-specific or ops-specific work, use veomni-new-model or veomni-new-op instead. Trigger: 'add feature', 'implement', 'refactor', 'reorganize', 'new capability'.

2026-05-19 | 2.0k

veomni-new-op.md

from "ByteDance-Seed/VeOmni"

Use this skill when adding a new optimized kernel or operator to veomni/ops/. Covers the full lifecycle: understanding VeOmni's ops architecture (KERNEL_REGISTRY + OpSlot dispatch, with a thin function-pointer shim for a few legacy global ops), implementing the kernel, registering it, adding tests, and documenting it. Trigger: 'add op', 'new kernel', 'add attention variant', 'new fused op', 'add triton kernel', 'optimize operator'.

2026-05-19 | 2.0k

veomni-uv-update.md

from "ByteDance-Seed/VeOmni"

Use this skill when updating dependencies managed by uv: bumping a package version, upgrading the uv tool itself, updating torch/CUDA stack, switching transformers version, or regenerating the lockfile. Trigger: 'update dependency', 'bump version', 'upgrade uv', 'update torch', 'update lockfile', 'uv sync fails'.

2026-05-19 | 2.0k

veomni-profile.md

from "ByteDance-Seed/VeOmni"

Use this skill for performance profiling and optimization. Two modes: (1) Analyze existing profile files (Chrome traces, memory snapshots) — write scripts to parse and summarize metrics per user requirements. (2) Generate profiles during development — configure ProfileConfig, run training, collect traces, analyze bottlenecks, and suggest optimizations. Trigger: 'profile', 'performance', 'slow', 'MFU', 'throughput', 'bottleneck', 'memory usage', 'trace', 'optimize training speed'.

2026-04-13 | 2.0k

package.json

"author": "ByteDance-Seed"

"repository": "ByteDance-Seed/VeOmni"


