hf-layer-package-jobs
Use when changing mesh-llm automation or CLI flows that discover Hugging Face GGUF models, plan CPU Hugging Face Jobs for layer-package splitting, estimate max cost, or publish skippy layer packages/catalog entries.
| name | hf-layer-package-jobs |
| description | Use when changing mesh-llm automation or CLI flows that discover Hugging Face GGUF models, plan CPU Hugging Face Jobs for layer-package splitting, estimate max cost, or publish skippy layer packages/catalog entries. |
| metadata | {"short-description":"Maintain HF layer package job automation"} |
Use this skill for the models package CLI, the model-package crate, and the daily Unsloth queue workflow.
- Keep the quant inside the model ref, as in unsloth/Qwen3-8B-GGUF:Q4_K_M; do not split the quant into a separate --quant argument for generated job inputs.
- Require --confirm before submitting jobs.
- With --confirm, submit at most the requested number of jobs, wait for every submitted HF Job, and fail if any job finishes unsuccessfully.

Run Rust formatting and the focused package checks before committing:
cargo fmt --all -- --check
cargo test -p model-package
cargo check -p mesh-llm-host-runtime
For behavior smoke tests, use a tiny dry run first:
cargo run -p model-package --bin queue-unsloth-layer-packages -- --max-jobs 1 --recent-limit 3 --popular-limit 3 --dry-run
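
The rules above (a single repo:quant model ref, --confirm gating, a cap on submitted jobs, and failing on any unsuccessful job) can be sketched in a few lines of Rust. This is a minimal illustration, not the model-package crate's code: ModelRef, parse_model_ref, Plan, plan_jobs, and check_outcomes are hypothetical names, and treating a run without --confirm as a dry run is an assumption.

```rust
/// Hypothetical model reference that keeps repo and quant in one string.
#[derive(Debug, PartialEq)]
struct ModelRef {
    repo: String,
    quant: String,
}

/// Split on the last ':' so callers never pass the quant separately.
fn parse_model_ref(input: &str) -> Option<ModelRef> {
    let (repo, quant) = input.rsplit_once(':')?;
    if repo.is_empty() || quant.is_empty() {
        return None;
    }
    Some(ModelRef {
        repo: repo.to_string(),
        quant: quant.to_string(),
    })
}

/// What a queue run would do with its candidate model refs.
#[derive(Debug, PartialEq)]
enum Plan {
    DryRun(Vec<String>),
    Submit(Vec<String>),
}

/// Select at most `max_jobs` candidates; only submit when `confirm` is set.
/// Assumption: without confirmation the run only reports what it would do.
fn plan_jobs(candidates: &[&str], max_jobs: usize, confirm: bool) -> Plan {
    let selected: Vec<String> = candidates
        .iter()
        .take(max_jobs)
        .map(|c| c.to_string())
        .collect();
    if confirm {
        Plan::Submit(selected)
    } else {
        Plan::DryRun(selected)
    }
}

/// After waiting for every submitted job, fail if any finished unsuccessfully.
/// Each outcome is a hypothetical (job id, succeeded) pair.
fn check_outcomes(outcomes: &[(String, bool)]) -> Result<(), String> {
    let failed: Vec<&str> = outcomes
        .iter()
        .filter(|(_, ok)| !*ok)
        .map(|(id, _)| id.as_str())
        .collect();
    if failed.is_empty() {
        Ok(())
    } else {
        Err(format!("failed jobs: {}", failed.join(", ")))
    }
}

fn main() {
    let model = parse_model_ref("unsloth/Qwen3-8B-GGUF:Q4_K_M");
    println!("{:?}", model);

    let plan = plan_jobs(&["a:Q4_K_M", "b:Q4_K_M", "c:Q4_K_M"], 1, false);
    println!("{:?}", plan);

    let outcomes = vec![("job-1".to_string(), true), ("job-2".to_string(), false)];
    println!("{:?}", check_outcomes(&outcomes));
}
```

Splitting on the last colon keeps the whole ref as one generated job input, and collecting every failed job id before returning the error means a single failure report covers all jobs that were waited on.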