skippy-bench
| name | skippy-bench |
| description | Use this skill when running benchmark orchestration, local single-stage or split benchmarks, benchmark report flow, or performance-oriented skippy runtime checks. |
| metadata | {"short-description":"Benchmark skippy stage runtime"} |
Use this skill for performance, orchestration, and report-oriented checks.
Use skippy-correctness when the question is pass/fail exactness.
Standalone skippy-bench may not be present in this mesh checkout yet. Confirm
available packages before using old source-repo commands:
cargo metadata --no-deps --format-version 1 | jq -r '.packages[].name' | sort
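The package check above can be wrapped in a small helper so a script can branch on the result. This is a minimal sketch, not something shipped in the repository: `has_package` is a hypothetical name, and the usage line assumes `cargo` and `jq` are on `PATH`.

```shell
# has_package NAME
# Hypothetical helper: reads one package name per line on stdin and
# succeeds (exit 0) only if NAME matches a whole line exactly.
has_package() {
  grep -Fxq -- "$1"
}

# Usage (assumes cargo and jq are installed):
#   cargo metadata --no-deps --format-version 1 \
#     | jq -r '.packages[].name' \
#     | has_package skippy-bench \
#     || echo "skippy-bench is not in this checkout"
```

Matching whole lines with `-x` keeps `skippy-bench` from being reported as present just because a longer package name contains it.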
Useful current checks:
cargo test -p skippy-server --lib
cargo test -p mesh-llm-host-runtime --lib inference::skippy
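To run both checks in one pass and still see every result when the first one fails, a small wrapper along these lines may help. It is a sketch under assumptions: `run_checks` and `run_checks_failed` are hypothetical names, and each argument is taken to be a complete command line.

```shell
# run_checks CMD...
# Hypothetical wrapper: runs each command line given as an argument,
# keeps going after a failure, prints one status line per command,
# and returns the number of failed commands.
run_checks() {
  run_checks_failed=0
  for cmd in "$@"; do
    if sh -c "$cmd"; then
      echo "ok   $cmd"
    else
      echo "FAIL $cmd"
      run_checks_failed=$((run_checks_failed + 1))
    fi
  done
  return "$run_checks_failed"
}

# Usage:
#   run_checks \
#     "cargo test -p skippy-server --lib" \
#     "cargo test -p mesh-llm-host-runtime --lib inference::skippy"
```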
When benchmark harnesses are imported, keep reporting separate from request-path serving. Stage runtimes emit telemetry; benchmark/report tooling owns reports.
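One way to picture that separation: the stage runtime only appends telemetry records, and a separate offline step turns them into a report. The sketch below is illustrative only. The record format (one `stage_id latency_ms` pair per line) and the name `summarize_latency` are assumptions made for this example, not the actual skippy telemetry schema.

```shell
# summarize_latency
# Hypothetical report step, run outside the request path.
# Assumed input on stdin: one "stage_id latency_ms" pair per line.
# Output: one "stage_id mean_latency_ms" line per stage, sorted by stage id.
summarize_latency() {
  awk '{ sum[$1] += $2; n[$1]++ }
       END { for (s in sum) printf "%s %.1f\n", s, sum[s] / n[s] }' | sort
}

# Usage (telemetry.log is a placeholder file name):
#   summarize_latency < telemetry.log
```

Because the summary reads records after the fact, a slow or failing report never blocks serving.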