Bilingual guide for the OFFLINE_PACKING_BMR and OFFLINE_PACKED_DATA environment variables that control LLaVA-OneVision2 training-side packing — what each gate does, why both must be enabled together, MBS=1 requirement, and the dead OFFLINE_PACKING_VQA branch

2026-05-06998

distributed-offline-packing.md

from "EvolvingLMMs-Lab/LLaVA-OneVision-2"

Bilingual guide for running offline_packing/auto_pipe.sh across multiple nodes to produce padding-free packed WebDataset shards for SFT, with Energon Metadataset assembly

2026-04-28998

cu-lengths-attention-flow.md

from "EvolvingLMMs-Lab/LLaVA-OneVision-2"

Bilingual guide for understanding how cu_lengths controls attention behavior across ViT and LLM stages, and how patch_positions scope differs between the two

2026-03-26998

length-pool-sort-dataset.md

from "EvolvingLMMs-Lab/LLaVA-OneVision-2"

Bilingual guide for understanding LengthPoolSortDataset cross-rank length synchronization mechanism in multi-GPU training

2026-03-26998

package.json

"author": "EvolvingLMMs-Lab"

"repository": "EvolvingLMMs-Lab/LLaVA-OneVision-2"


| Type | When to use | Example |
| --- | --- | --- |
| feat | New user-visible feature or capability | feat: add packed SFT dataset assembly workflow |
| fix | Bug fix | fix: prevent cross-sample attention leakage in packed runs |
| docs | Documentation, README, guides, skills, comments-only docs | docs: explain cu_lengths attention boundaries |
| test | Tests, fixtures, consistency checks, validation scripts | test: add HF/Megatron consistency checks for PP=2 |
| refactor | Code restructuring, no behavior change | refactor: move patch position block layout into task encoder |
| chore | Maintenance, internal cleanup, generated metadata, repo hygiene | chore: remove internal-only skill references |
| perf | Performance improvement without behavior change | perf: reduce token length scan overhead |
| build | Build system, dependencies, Docker, packaging | build: update flash attention to v2.7.0 |
| ci | CI/CD configuration and automation | ci: add packed dataset smoke test job |
| style | Formatting-only changes, no code behavior change | style: format offline packing scripts |
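As a rough illustration of how the type list could be enforced, the sketch below checks a subject line against the ten types listed in this guide. The `<type>: <summary>` regular expression and the `check_subject` helper are assumptions made for this example; the repository itself may not ship such a check.

```python
import re

# Types copied from this guide's type list.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "test", "refactor",
    "chore", "perf", "build", "ci", "style",
)

# Assumed shape: lowercase type, colon, one space, non-empty summary.
SUBJECT_RE = re.compile(r"^(?P<type>[a-z]+): (?P<summary>\S.*)$")


def check_subject(subject: str) -> list[str]:
    """Return a list of problems; an empty list means the subject passed."""
    match = SUBJECT_RE.match(subject)
    if match is None:
        return ["subject does not match '<type>: <summary>'"]
    problems = []
    if match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match.group('type')}'")
    return problems


if __name__ == "__main__":
    for line in (
        "feat: add packed SFT dataset assembly workflow",
        "dev assert (#81)",
    ):
        print(line, "->", check_subject(line) or "ok")
```

A helper like this could be called from a local `commit-msg` hook, though wiring it into a hook is left out here.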

feat: add YaRN RoPE scaling to Megatron training path

Wire YarnRotaryEmbedding into QwenModel when --rope-type=yarn is set.
All default parameters match the official Qwen3-8B config.

Affected files:
- qwen_model.py: construct YarnRotaryEmbedding, return (emb, mscale) tuple
- attention.py: extract mscale before RoPE application
- arguments.py: add 7 CLI args for YaRN configuration

docs: explain distributed offline packing workflow

Group one day of packing notes into a single guide so the PR history stays
reviewable while preserving each contributor from the original commits.

Co-authored-by: Alice Zhang <alice@example.com>
Co-authored-by: Bob Chen <bob@example.com>
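A squashed message with contributor trailers, like the example above, can be assembled mechanically. The following is a minimal sketch: `build_squash_message` is a hypothetical helper, and the layout (subject, blank line, body, blank line, one `Co-authored-by:` trailer per contributor) is inferred from the example rather than taken from any repository tooling.

```python
def build_squash_message(
    subject: str,
    body: str,
    co_authors: list[tuple[str, str]],
) -> str:
    """Join subject, optional body, and Co-authored-by trailers with blank lines."""
    trailers = "\n".join(
        f"Co-authored-by: {name} <{email}>" for name, email in co_authors
    )
    sections = [subject.strip()]
    if body.strip():
        sections.append(body.strip())
    if trailers:
        sections.append(trailers)
    # Blank lines separate subject, body, and the trailer block.
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(
        build_squash_message(
            "docs: explain distributed offline packing workflow",
            "Group one day of packing notes into a single guide.",
            [("Alice Zhang", "alice@example.com"), ("Bob Chen", "bob@example.com")],
        )
    )
```

Keeping the trailers as the final block, one per line, is the part that matters for attribution; the body wording is up to the author.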

Is it a new capability?
  YES → feat: <what it enables>
  NO ↓
Is it a bug fix?
  YES → fix: <what was broken>
  NO ↓
Is it restructuring without behavior change?
  YES → refactor: <what was reorganized>
  NO ↓
Is it adding a new file/script/example?
  YES → docs: if documentation/skill, test: if test fixture, chore: if repo maintenance, feat: if user-visible capability
  NO ↓
Is it a config/dependency/version update?
  YES → build: for dependencies/build/Docker, ci: for CI, chore: for general maintenance
  NO ↓
Is it trivial (typo, comment, formatting)?
  YES → docs: for documentation typo, style: for formatting-only code changes
  NO → pick the closest type
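The decision tree above is a first-match cascade, which the sketch below makes explicit. The `Change` dataclass, its field names, and `pick_type` are hypothetical names introduced only for illustration. The final "pick the closest type" branch needs human judgment, so the function returns `None` there.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Change:
    """Answers to the decision-tree questions (all names are illustrative)."""
    new_capability: bool = False
    bug_fix: bool = False
    restructure_only: bool = False
    new_file_kind: Optional[str] = None  # "docs", "test", "chore", or "feat"
    config_kind: Optional[str] = None    # "build", "ci", or "chore"
    trivial_kind: Optional[str] = None   # "docs" or "style"


def pick_type(change: Change) -> Optional[str]:
    """Walk the questions in order and return the first matching type."""
    if change.new_capability:
        return "feat"
    if change.bug_fix:
        return "fix"
    if change.restructure_only:
        return "refactor"
    if change.new_file_kind is not None:
        return change.new_file_kind
    if change.config_kind is not None:
        return change.config_kind
    if change.trivial_kind is not None:
        return change.trivial_kind
    return None  # no question matched: pick the closest type by hand


if __name__ == "__main__":
    print(pick_type(Change(bug_fix=True)))
    print(pick_type(Change(config_kind="ci")))
```

Because the questions are checked in order, a change that both adds a capability and fixes a bug resolves to `feat`, mirroring the top-down reading of the tree.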

fix loading dataloader (#121)
missing _extra_state (#117)
Refactor model consistency checks and enhance encoder loading (#120)
add OV2 SP (#112)
update 30ba3b (#97)
feat: add MoE merge support for Qwen3-30B-A3B (#95)
refactor: move patch position block layout into task encoder (#89)
dev assert (#81)

feat: add packed SFT dataset assembly workflow
fix: prevent cross-sample attention leakage in packed runs
docs: explain cu_lengths attention boundaries
test: add HF/Megatron consistency checks for PP=2
chore: remove internal-only skill references
refactor: simplify Megatron checkpoint layout detection
build: update flash attention to v2.7.0


name	commit-message
description	Guide for writing clear, consistent git commit messages following this repository's conventions
compatibility	opencode
metadata	{"domain":"workflow","scope":"all-repos"}


commit-message

Purpose

Format

Types

Rules

1. Subject Line

2. Body (when needed)

3. PR-linked Commits

4. Squash / Reword / Multi-author Rules

5. Multi-file Changes

6. What NOT to Write

Decision Tree

Examples from This Repository

Quick Template
