Bilingual guide for running and interpreting LLaVA-OneVision2 HF vs Megatron consistency checks across TP and PP settings
Bilingual guide for merging ViT + LLM into a LlavaOnevision2 HF checkpoint and validating weight/inference consistency
Guide for writing clear, consistent git commit messages following this repository's conventions
Bilingual guide for the OFFLINE_PACKING_BMR and OFFLINE_PACKED_DATA environment variables that control LLaVA-OneVision2 training-side packing: what each gate does, why both must be enabled together, the MBS=1 requirement, and the dead OFFLINE_PACKING_VQA branch
Bilingual guide for running offline_packing/auto_pipe.sh across multiple nodes to produce padding-free packed WebDataset shards for SFT, with Energon Metadataset assembly
Bilingual guide for understanding how cu_lengths controls attention behavior across ViT and LLM stages, and how patch_positions scope differs between the two
Bilingual guide for understanding the cross-rank length synchronization mechanism of LengthPoolSortDataset in multi-GPU training
Bilingual guide for Megatron checkpoint 1D, 2D, and 3D mp_rank layouts across tensor, pipeline, and expert parallel dimensions