commit-message
// Guide for writing clear, consistent git commit messages following this repository's conventions
| name | commit-message |
| description | Guide for writing clear, consistent git commit messages following this repository's conventions |
| compatibility | opencode |
| metadata | {"domain":"workflow","scope":"all-repos"} |
Use this skill when writing git commit messages. It enforces a formal Conventional Commits style suitable for open-source collaboration and future changelog generation, while still keeping messages aligned with this repository's tone.
<type>: <subject>
- Start every subject with a type prefix (`feat:`, `fix:`, `docs:`, `test:`, `chore:`, etc.)
- Do not use `type(scope):` style by default; add a scope only if the user explicitly asks

| Type | When to use | Example |
|---|---|---|
| `feat` | New user-visible feature or capability | `feat: add packed SFT dataset assembly workflow` |
| `fix` | Bug fix | `fix: prevent cross-sample attention leakage in packed runs` |
| `docs` | Documentation, README, guides, skills, comments-only docs | `docs: explain cu_lengths attention boundaries` |
| `test` | Tests, fixtures, consistency checks, validation scripts | `test: add HF/Megatron consistency checks for PP=2` |
| `refactor` | Code restructuring, no behavior change | `refactor: move patch position block layout into task encoder` |
| `chore` | Maintenance, internal cleanup, generated metadata, repo hygiene | `chore: remove internal-only skill references` |
| `perf` | Performance improvement without behavior change | `perf: reduce token length scan overhead` |
| `build` | Build system, dependencies, Docker, packaging | `build: update flash attention to v2.7.0` |
| `ci` | CI/CD configuration and automation | `ci: add packed dataset smoke test job` |
| `style` | Formatting-only changes, no code behavior change | `style: format offline packing scripts` |
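The format and type table above can be checked mechanically. The following is a minimal sketch, not part of this repository: `check_subject`, `ALLOWED_TYPES`, and `SUBJECT_RE` are hypothetical names, and the lowercase check assumes the `feat: add ...` (not `feat: Add ...`) convention used throughout this guide.

```python
import re

# Types copied from the table in this guide.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "test", "refactor",
    "chore", "perf", "build", "ci", "style",
)

# "<type>: <subject>"; an optional "(scope)" is captured only so it can be reported.
SUBJECT_RE = re.compile(r"^(?P<type>[a-z]+)(?P<scope>\([^)]*\))?: (?P<subject>.+)$")


def check_subject(line, allow_scope=False):
    """Return a list of problems found in a commit subject line (empty if none)."""
    match = SUBJECT_RE.match(line)
    if match is None:
        return ["subject does not match '<type>: <subject>'"]
    problems = []
    if match["type"] not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match['type']}'")
    if match["scope"] and not allow_scope:
        problems.append("scope used without an explicit request")
    if match["subject"][0].isupper():
        problems.append("subject should start in lowercase")
    return problems
```

For example, `check_subject("feat: add packed SFT dataset assembly workflow")` returns an empty list, while an unprefixed subject such as `add OV2 SP (#112)` is reported as not matching the format.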
Notes:
- `feat` and `fix` are strongly preferred for code changes
- Use `docs` for skill files under `.opencode/skills/`
- Avoid bare subjects such as `updated`, `missing _extra_state`, or `add OV2 SP` unless the user explicitly asks to preserve old repo style
- `Revert` and `Merge` are generated by git; don't manually write these types
- Start the subject in lowercase: `feat: add ...` not `feat: Add ...`
- Describe the intent, not the mechanical change: `fix: correct rotary_base override in 8B config` not `fix: changed value from 1000000 to 8000000`

Use a body when:
feat: add YaRN RoPE scaling to Megatron training path

Wire YarnRotaryEmbedding into QwenModel when --rope-type=yarn is set.
All default parameters match the official Qwen3-8B config.

Affected files:
- qwen_model.py: construct YarnRotaryEmbedding, return (emb, mscale) tuple
- attention.py: extract mscale before RoPE application
- arguments.py: add 7 CLI args for YaRN configuration
When a commit will be part of a PR (most cases in this repo):
- Squash-merged subjects end up looking like `fix: loading dataloader (#121)`
- The `(#N)` suffix is added by GitHub on squash-merge; don't add it manually in local commits

When cleaning up commit history before a PR:
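- When squashing commits from several contributors, credit each of them with `Co-authored-by` trailers.
- Keep the original `Author`; do not use `--reset-author` unless the user explicitly requests it.
- If a force-push is needed, use `--force-with-lease`, never plain `--force`.

Example multi-author squash commit: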
docs: explain distributed offline packing workflow

Group one day of packing notes into a single guide so the PR history stays
reviewable while preserving each contributor from the original commits.

Co-authored-by: Alice Zhang <alice@example.com>
Co-authored-by: Bob Chen <bob@example.com>
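A message shaped like the example above can be assembled programmatically when squashing commits from several contributors. This is a hedged sketch under stated assumptions: `build_squash_message` is a hypothetical helper, not a git or repository API, and it assumes the contributor list has already been collected as `(name, email)` pairs.

```python
def build_squash_message(subject, body, coauthors):
    """Assemble a squash commit message with Co-authored-by trailers.

    coauthors is a list of (name, email) pairs. Duplicate emails are dropped
    (case-insensitively) while keeping first-seen order.
    """
    seen = set()
    trailers = []
    for name, email in coauthors:
        key = email.lower()
        if key in seen:
            continue
        seen.add(key)
        trailers.append(f"Co-authored-by: {name} <{email}>")

    # Subject, body, and trailer block are separated by blank lines;
    # the trailers themselves stay contiguous at the end of the message.
    parts = [subject.strip()]
    if body.strip():
        parts.append(body.strip())
    if trailers:
        parts.append("\n".join(trailers))
    return "\n\n".join(parts) + "\n"
```

The result can be passed to `git commit -F <file>` after writing it to a file; the blank line before the trailer block matters, since git only recognizes trailers in the final paragraph of the message.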
For commits touching many files, the subject should describe the intent, not list files:
# GOOD
feat: add YaRN RoPE support to QwenModel
# BAD
update qwen_model.py, attention.py, arguments.py, provider.py, config.py, model.py, mid_training.sh
Is it a new capability?
YES → feat: <what it enables>
NO ↓
Is it a bug fix?
YES → fix: <what was broken>
NO ↓
Is it restructuring without behavior change?
YES → refactor: <what was reorganized>
NO ↓
Is it adding a new file/script/example?
YES → docs: if documentation/skill, test: if test fixture, chore: if repo maintenance, feat: if user-visible capability
NO ↓
Is it a config/dependency/version update?
YES → build: for dependencies/build/Docker, ci: for CI, chore: for general maintenance
NO ↓
Is it trivial (typo, comment, formatting)?
YES → docs: for documentation typo, style: for formatting-only code changes
NO → pick the closest type
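The decision tree above can also be read as a chain of early returns. The sketch below is illustrative only: `pick_type` and the lookup keys (`"skill"`, `"docker"`, `"formatting"`, and so on) are hypothetical labels chosen for this example, and `None` stands for the final "pick the closest type" branch, which still needs human judgment.

```python
# Hypothetical labels for the sub-cases named in the decision tree.
NEW_FILE_TYPES = {
    "documentation": "docs",
    "skill": "docs",
    "test fixture": "test",
    "repo maintenance": "chore",
    "user-visible capability": "feat",
}
UPDATE_TYPES = {
    "dependency": "build",
    "build": "build",
    "docker": "build",
    "ci": "ci",
    "maintenance": "chore",
}
TRIVIAL_TYPES = {
    "documentation typo": "docs",
    "formatting": "style",
}


def pick_type(new_capability=False, bug_fix=False, restructuring=False,
              new_file=None, update=None, trivial=None):
    """Walk the questions in order; the first YES decides the type."""
    if new_capability:
        return "feat"
    if bug_fix:
        return "fix"
    if restructuring:
        return "refactor"
    if new_file is not None:
        return NEW_FILE_TYPES.get(new_file)
    if update is not None:
        return UPDATE_TYPES.get(update)
    if trivial is not None:
        return TRIVIAL_TYPES.get(trivial)
    return None  # no question matched: pick the closest type manually
```

Because the questions are evaluated in order, a change that is both a new capability and a bug fix resolves to `feat`, matching the top-down reading of the tree.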
These are real commits — use them as reference for tone and length:
fix loading dataloader (#121)
missing _extra_state (#117)
Refactor model consistency checks and enhance encoder loading (#120)
add OV2 SP (#112)
update 30ba3b (#97)
feat: add MoE merge support for Qwen3-30B-A3B (#95)
refactor: move patch position block layout into task encoder (#89)
dev assert (#81)
Prefer the formal Conventional Commits shape for new local commits, even though older history contains mixed styles.
Better current-style examples:
feat: add packed SFT dataset assembly workflow
fix: prevent cross-sample attention leakage in packed runs
docs: explain cu_lengths attention boundaries
test: add HF/Megatron consistency checks for PP=2
chore: remove internal-only skill references
refactor: simplify Megatron checkpoint layout detection
build: update flash attention to v2.7.0
For most changes, just fill in:
<type>: <imperative verb> <what> [in/for/to <where>]
Examples:
- feat: add custom pipeline layer splitting for PP>2
- fix: correct rotary_base override for 8B config
- docs: document distributed offline packing workflow
- test: add OneVision2 consistency fixture for PP=2
- chore: remove internal-only skill references
- refactor: extract ViT encoder into standalone module
- build: update flash attention to v2.7.0
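As a rough illustration of the template, the pieces can be joined with a small helper. `make_subject` is a hypothetical function written for this example; it assumes the optional `where` argument already includes its preposition (`in`, `for`, or `to`).

```python
def make_subject(commit_type, verb, what, where=None):
    """Fill in '<type>: <imperative verb> <what> [in/for/to <where>]'."""
    # The verb is lowercased so the subject starts in lowercase after the prefix.
    subject = f"{commit_type}: {verb.lower()} {what}"
    if where:
        subject += f" {where}"
    return subject
```

For instance, `make_subject("feat", "add", "custom pipeline layer splitting", "for PP>2")` reproduces the first example in the list.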