mflux-debugging
Debug MLX ports by comparing against a PyTorch/diffusers reference via exported tensors/images (export-then-compare).
| name | mflux-debugging |
| description | Debug MLX ports by comparing against a PyTorch/diffusers reference via exported tensors/images (export-then-compare). |
Use this skill when you are porting a model to MLX and need to prove numerical parity (or isolate where it diverges) versus a PyTorch reference implementation (often from diffusers).
This skill defaults to export-then-compare:
- [...] (see `mflux-model-porting`).
- Use `uv` to run Python: `uv run python -m ...`
- Preserve test outputs with `MFLUX_PRESERVE_TEST_OUTPUT=1` (see `mflux-testing` and the Makefile test targets).
- [...] `mflux-model-porting`.
- `seed` is not enough for parity: export the exact initial noise/latents from the reference and load them in MLX.
- The reference repo (`diffusers/`) and `mflux/` are frequently next to each other on disk (e.g. both on your Desktop). Use absolute paths when in doubt.

For day-to-day debugging, prefer a minimal paired repro:
- One script in the reference repo (`diffusers/`), e.g. `diffusers/flux2_klein_edit_debug.py`
- One script in `mflux/`, e.g. `mflux/flux2_klein_edit_debug.py`

Keep them “boring”: hardcoded variables, no CLI, no framework, and just a few `np.savez(...)` / `mx.save(...)` lines at the right spot.
The key trick for RNG parity:
- Pass the exported noise in explicitly (`latents=...`) so the run definitely uses the dumped tensor.

Start coarse, then narrow:
Tip: work “backwards from pixels” like mflux-model-porting suggests: validate VAE decode first with exported latents, then the diffusion/transformer loop, then text encoder.
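The RNG-parity trick described above can be sketched with NumPy alone. This is a minimal illustration, not mflux's actual code: the file name `initial_latents.npz`, the key `latents`, the run id, and the latent shape are all hypothetical, and NumPy generators stand in for the two frameworks' RNGs. In a real pair of scripts, the reference side would convert its own noise tensor to NumPy before saving, and the MLX side would wrap the loaded array (for example with `mx.array(...)`) before passing it in.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical run directory, created in a temp location for this sketch.
run_dir = Path(tempfile.mkdtemp()) / "debug_artifacts" / "run_001" / "ref"
run_dir.mkdir(parents=True)

# Reference side: draw the initial noise once and dump the exact values.
# (Stand-in for the reference pipeline's own RNG; the shape is made up.)
ref_rng = np.random.default_rng(seed=42)
ref_latents = ref_rng.standard_normal((1, 16, 8, 8)).astype(np.float32)
np.savez(run_dir / "initial_latents.npz", latents=ref_latents)

# Port side: a different RNG implementation yields different noise
# for the same seed, so re-seeding does not reproduce the reference.
other_rng = np.random.RandomState(42)
reseeded = other_rng.standard_normal((1, 16, 8, 8)).astype(np.float32)

# Loading the dumped tensor reproduces the reference noise exactly.
loaded = np.load(run_dir / "initial_latents.npz")["latents"]

same_seed_matches = bool(np.array_equal(reseeded, ref_latents))
dump_matches = bool(np.array_equal(loaded, ref_latents))
print(f"same seed, different RNG matches: {same_seed_matches}")
print(f"loaded dump matches:              {dump_matches}")
```

The point of the comparison is that only the loaded dump is guaranteed to be identical to the reference; any difference seen later in the pipeline can then be attributed to the model code rather than to the starting noise.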
Create a run directory like:
- `./debug_artifacts/<run_id>/ref/`

Export with one of these patterns:
- `np.savez(path, **tensors_as_numpy)`
- `torch.save(dict_of_tensors, path)`

Create a matching run directory:
- `./debug_artifacts/<run_id>/mlx/`

Load and compare tensors. For each checkpoint, report:
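A per-checkpoint comparison might look like the sketch below. The reported fields (a pass/fail verdict from `np.allclose`, plus maximum and mean absolute difference) are assumptions about what is useful to report, not a list taken from this skill. The helper names `compare_checkpoints` and `first_failure` and the checkpoint names are hypothetical. Both sides are assumed to be available as dicts of NumPy arrays with matching keys, exported in execution order.

```python
import numpy as np


def compare_checkpoints(ref, mlx, atol=1e-5, rtol=1e-5):
    """Compare two dicts of arrays key by key; return one report row per key."""
    rows = []
    for name in ref:  # dicts keep insertion order, so export in execution order
        a = np.asarray(ref[name], dtype=np.float32)
        if name not in mlx:
            rows.append({"name": name, "ok": False, "reason": "missing in MLX dump"})
            continue
        b = np.asarray(mlx[name], dtype=np.float32)
        if a.shape != b.shape:
            rows.append({"name": name, "ok": False, "reason": f"shape {a.shape} vs {b.shape}"})
            continue
        diff = np.abs(a - b)
        rows.append({
            "name": name,
            # The reference is the second argument, so rtol scales with it.
            "ok": bool(np.allclose(b, a, atol=atol, rtol=rtol)),
            "max_abs": float(diff.max()),
            "mean_abs": float(diff.mean()),
        })
    return rows


def first_failure(rows):
    """Name of the earliest failing checkpoint, or None if all passed."""
    return next((row["name"] for row in rows if not row["ok"]), None)


# Toy data: the second checkpoint is perturbed to simulate a divergence.
rng = np.random.default_rng(0)
ref = {
    "latents_in": rng.standard_normal((2, 4)).astype(np.float32),
    "block_0_out": rng.standard_normal((2, 4)).astype(np.float32),
    "decoded": rng.standard_normal((2, 4)).astype(np.float32),
}
mlx = {k: v.copy() for k, v in ref.items()}
mlx["block_0_out"] = mlx["block_0_out"] + 0.1

rows = compare_checkpoints(ref, mlx)
for row in rows:
    print(row)
print("first divergence:", first_failure(rows))
```

With real dumps, `dict(np.load(path))` on each side's `.npz` file would supply the two dicts. The earliest failing name is usually the most informative one, since everything downstream of it inherits the error.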
Suggested tolerance starting points (adjust per component):
- `atol=1e-5, rtol=1e-5`
- `atol=1e-2, rtol=1e-2`
- Also inspect the `png` visually, since tiny numeric diffs can look identical.

If a checkpoint fails:
- `float16` can produce NaNs; prefer `bfloat16` for reference dumps if you see NaNs.
- Keep artifacts under `debug_artifacts/<run_id>/...` at repo root.
- [...] `debug_artifacts/` unless explicitly asked.
- [...] (see `mflux-testing`).

Related skills:

- `mflux-model-porting`: correctness-first workflow (validate components and lock behavior before refactor).
- `mflux-testing`: how to run tests safely and handle image outputs/goldens.
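When a checkpoint fails, a quick look at the raw dump for non-finite values can narrow the cause before any deeper digging. The helper below is a hypothetical sketch (the name `triage` and its output fields are not part of mflux). It counts NaN and Inf entries, and the toy data shows how a value that overflows in `float16` becomes `inf` and then turns into NaN in later arithmetic.

```python
import numpy as np


def triage(name, tensor):
    """Summarize dtype, shape, and non-finite values of a dumped tensor."""
    t = np.asarray(tensor)
    as_f32 = t.astype(np.float32)
    return {
        "name": name,
        "dtype": str(t.dtype),
        "shape": t.shape,
        "nan": int(np.isnan(as_f32).sum()),
        "inf": int(np.isinf(as_f32).sum()),
    }


# 70000 exceeds the float16 maximum (65504), so the cast overflows to inf,
# and inf - inf then yields NaN.
activations = np.array([1.0, 70000.0, -3.5], dtype=np.float32)
with np.errstate(over="ignore", invalid="ignore"):
    half = activations.astype(np.float16)
    poisoned = half - half

print(triage("f32", activations))
print(triage("f16", half))
print(triage("f16_minus_self", poisoned))
```

NumPy has no native `bfloat16`, so the sketch only demonstrates the `float16` failure mode. `bfloat16` keeps the same exponent range as `float32`, which is why it avoids this particular overflow.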