com um clique
model-patch
// Merge a base HF model dir with an extra HF model dir by index diff — take tensors only present in extra, append them to a new output dir in HF-standard shard layout. Base wins on overlap.
// Merge a base HF model dir with an extra HF model dir by index diff — take tensors only present in extra, append them to a new output dir in HF-standard shard layout. Base wins on overlap.
Normalize an HF model checkpoint — quantize bf16/fp16 weights to per-block FP8, and/or repack engine-named safetensors into HF-standard ~4GB shards.
Use when `make html` or `sphinx-build` fails with `Extension error (sphinx.ext.autosummary)`, `ImportExceptionGroup`, or other Sphinx autosummary / autodoc import failures. Provides automated pdb diagnosis for `autodoc_mock_imports` issues.
Synchronize xtuner's supported model documentation (docs/en/pretrain_sft/advanced_tutorial/model.md and docs/zh_cn/pretrain_sft/advanced_tutorial/model.md) with the actual Config classes defined under xtuner/v1/model/. Use when (1) new TransformerConfig, MoEConfig, or BaseComposeConfig subclasses are added, removed, or renamed in xtuner/v1/model/, (2) existing model configs change their inheritance hierarchy, scale, or HuggingFace counterpart, or (3) a code review or user request points out that model.md is out of sync with the codebase.
| name | model_patch |
| description | Merge a base HF model dir with an extra HF model dir by index diff — take tensors only present in extra, append them to a new output dir in HF-standard shard layout. Base wins on overlap. |
Produce out = base ∪ (extra \ base) at the tensor level, by diffing the
two model.safetensors.index.json files. Useful when extra is a sibling
checkpoint (different training run, fine-tune, or fp8 conversion) that holds a
few tensors the base is missing — e.g. an updated head, a newly-added MTP
block, or auxiliary projections — and you want a self-contained merged dir
without re-running the whole conversion pipeline.
Overlap rule: base wins. Any tensor present in both indexes is taken from
base; the corresponding entry in extra is ignored.
.claude/skills/model_patch/
├── SKILL.md # this file
└── model_patch.py # CLI + library
python model_patch.py \
--base /path/to/base-model \
--extra /path/to/extra-model \
--out /path/to/merged-model
model-{i:05d}-of-{N:05d}.safetensors shards (total N = base shards +
newly-packed shards holding the extra-only tensors)model.safetensors.index.jsonbase — config.json, tokenizer files,
modeling/configuration .py, generation_config.json, README, etc. —
copied verbatim so out is loadable on its own. Subdirectories are
copied recursively. Existing entries in out are overwritten.extra's aux files are deliberately not copied. The skill's overlap
rule is "base wins" at the tensor level, and that extends to configs: extra
often carries engine-specific edits or modality keys that don't match the
tensor topology we keep. If you need fields from extra's config.json
(e.g. a new modality), merge them into out/config.json as a separate,
explicit step.
-of-N total. By default the
bytes are fully copied so out is independent of base. Pass
--hardlink to use os.link instead (fast, but the two trees share
inodes — editing one mutates the other; deleting one keeps the other
intact since hardlinks are symmetric).--shard-size-gb GiB (default 4), appended after the base shards.So a base with 12 shards plus 5 GB of extra-only tensors yields:
model-00001-of-00014.safetensors … model-00012-of-00014.safetensors
(copies of the base shards) and
model-00013-of-00014.safetensors / model-00014-of-00014.safetensors
(freshly written, containing only the extra tensors).
| Situation | Use this? |
|---|---|
extra contains a few tensors base is missing; everything else is identical. | Yes |
| You need to merge two checkpoints whose overlapping tensors differ in value. | No — this skill silently keeps base; you probably want a manual decision per key. |
extra is a full quantization of base and you want the quantized weights. | No — use model_normalize with --reference extra instead. |
You want a single model.safetensors instead of standard shards. | No — repack downstream. |
Before invoking model_patch.py:
--base, --extra, --out) with the user
via AskUserQuestion. Never infer them from earlier conversation.model.safetensors.index.json files and reporting:
After a successful run, do not ask the user where the output should go —
--out is already the final destination the user named. Just report what
landed there (shard count, index path, total size, and that base's aux files
were mirrored over). Unlike model_normalize, model_patch has no
staging-area concept: there is no temp dir, and the caller already made the
placement decision when they passed --out.
If the user wanted any of extra's aux files (e.g. a config.json that
documents a new modality), call that out — those are not copied by the
skill and must be merged in by hand.