skippy-model-package
// Use this skill when inspecting GGUF models, planning layer ranges, generating or validating skippy package artifacts, fake packages for direct GGUFs, materialized stage cache behavior, or GGUF writer integration.
| name | skippy-model-package |
| description | Use this skill when inspecting GGUF models, planning layer ranges, generating or validating skippy package artifacts, fake packages for direct GGUFs, materialized stage cache behavior, or GGUF writer integration. |
| metadata | {"short-description":"Inspect and package GGUF stages"} |
Use this skill for model inspection, package planning, stage materialization, and cache behavior.
Rust owns package manifests, topology planning inputs, cache policy, and mesh model-storage integration. The patched llama/skippy ABI owns GGUF tensor inspection and GGUF artifact writing.
Direct GGUF loading in mesh should materialize as a fake package identity in the skippy runtime so the split-serving path can use the same package-backed stage machinery as Hugging Face packages.
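As a rough illustration of the fake package identity idea, the sketch below maps both model sources onto one identity type so downstream stage code only ever sees a package. Every name here (`ModelSource`, `PackageIdentity`, `package_identity`, the `direct-gguf/` prefix) is hypothetical and chosen for illustration; none of it is taken from the skippy runtime.

```rust
use std::path::PathBuf;

/// Hypothetical: where a model comes from.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ModelSource {
    /// A published layer package, identified by its package id.
    Package { package_id: String },
    /// A single GGUF file loaded directly from disk.
    DirectGguf { path: PathBuf },
}

/// Hypothetical: the identity that package-backed stage machinery keys on.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PackageIdentity {
    pub id: String,
    /// True when the identity was synthesized for a direct GGUF.
    pub synthetic: bool,
}

/// Give every source a package identity, synthesizing one for direct GGUFs.
pub fn package_identity(source: &ModelSource) -> PackageIdentity {
    match source {
        ModelSource::Package { package_id } => PackageIdentity {
            id: package_id.clone(),
            synthetic: false,
        },
        ModelSource::DirectGguf { path } => {
            // Assumption: the file stem is a good enough label for the sketch.
            let stem = path
                .file_stem()
                .and_then(|s| s.to_str())
                .unwrap_or("unknown");
            PackageIdentity {
                id: format!("direct-gguf/{stem}"),
                synthetic: true,
            }
        }
    }
}
```

A real identity would likely also need a content hash or file size so that two different files with the same name do not collide; the sketch leaves that out.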
Check current package names before running commands:
cargo metadata --no-deps --format-version 1 | jq -r '.packages[].name' | sort
Useful current checks in this repo:
cargo test -p skippy-runtime --lib
cargo test -p skippy-topology --lib
cargo test -p mesh-llm-host-runtime --lib inference::skippy
For a published layer package, prefer package-local diagnostics before running a live split smoke test:
cargo test -p skippy-model-package --bin skippy-model-package
skippy-model-package preflight <package-dir> --stages 2
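To make the idea of planning layer ranges for a given stage count concrete, here is a minimal sketch that splits a layer count into contiguous, near-equal ranges. It is an assumption for illustration only: `plan_layer_ranges` is a hypothetical helper, and the actual preflight planner may weigh memory, tensor sizes, or topology rather than splitting evenly.

```rust
use std::ops::Range;

/// Hypothetical helper: split `layer_count` layers into `stages` contiguous
/// ranges whose lengths differ by at most one. Earlier stages absorb the
/// remainder. Returns `None` when the split is impossible.
pub fn plan_layer_ranges(layer_count: usize, stages: usize) -> Option<Vec<Range<usize>>> {
    if stages == 0 || stages > layer_count {
        return None;
    }
    let base = layer_count / stages;
    let extra = layer_count % stages;
    let mut ranges = Vec::with_capacity(stages);
    let mut start = 0;
    for stage in 0..stages {
        // The first `extra` stages each take one additional layer.
        let len = base + usize::from(stage < extra);
        ranges.push(start..start + len);
        start += len;
    }
    Some(ranges)
}
```

Under this even-split assumption, 32 layers over 2 stages yields the ranges `0..16` and `16..32`.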
Materialized stages are derived cache. Model storage commands may evict materialized stage artifacts without deleting the source model/package. Preserve pinned materialized artifacts unless the command explicitly asks for a stronger cleanup.
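The eviction rule above can be sketched as a small selection function: by default only unpinned artifacts are candidates, and pinned ones are included only when a stronger cleanup is explicitly requested. The types `MaterializedStage` and `Cleanup` and the function `select_evictions` are hypothetical names for this sketch, not the repo's cache policy API.

```rust
/// Hypothetical record for one materialized stage artifact in the cache.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MaterializedStage {
    pub key: String,
    pub pinned: bool,
}

/// Hypothetical cleanup strength requested by a storage command.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Cleanup {
    /// Evict derived cache but preserve pinned artifacts.
    Default,
    /// Explicitly requested stronger cleanup: pinned artifacts go too.
    IncludePinned,
}

/// Pick which materialized artifacts may be evicted. The source model or
/// package is never part of this list; only derived artifacts are.
pub fn select_evictions(
    stages: &[MaterializedStage],
    cleanup: Cleanup,
) -> Vec<&MaterializedStage> {
    stages
        .iter()
        .filter(|stage| cleanup == Cleanup::IncludePinned || !stage.pinned)
        .collect()
}
```

Keeping the selection separate from the deletion step makes it easy to show the user what would be removed before anything is touched.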