Install vLLM Semantic Router in agent-safe mode, import supported OpenClaw model providers into canonical VSR config, and rewrite OpenClaw to target VSR.

2026-07-13

vllm-semantic-router-harness

Other computer occupations

Bridges native skill discovery into the vLLM Semantic Router repository harness, routing tasks through the canonical agent-report flow, repo-local skill registry, and validation commands. Use when starting any task inside the vLLM Semantic Router repository to resolve the correct primary skill, read canonical docs, and run harness validation.

2026-05-30

config-platform-change

Software developers

Synchronizes config representations across router config, Python CLI schema, and dashboard config UI. Use when adding or changing a config concept that spans those surfaces or addressing config representation debt before Kubernetes-facing translation.

2026-05-30

k8s-platform-change

Software developers

Modifies Kubernetes-facing operator, CRD, deployment-profile, or DSL translation behavior for semantic-router platform integration. Use when changing operator APIs or controllers, deployment stack manifests, profile-owned platform wiring, or router-to-Kubernetes translation layers.

2026-05-30

maintainer-issue-pr-management

Other computer occupations

Manages GitHub issue and pull-request lifecycle including creation, updates, triage labelling, and closeout metadata using canonical templates and repository taxonomy. Use when a maintainer asks to create, update, close, or triage GitHub issues or PRs, or when issue creation requires codebase analysis for scope, labels, or acceptance criteria.

2026-05-30

maintainer-release-ops

Other computer occupations

Maintainer release and milestone operating workflow. Use when a maintainer wants to plan a release, create milestone issues, sync GitHub issue or PR state, generate a daily review brief, or manage stale PRs and backlog routing.

2026-05-30

routing-calibration-loop

Software developers

Calibrates routing changes against a live router endpoint with executable probes, local DSL validation, versioned deploys, and structured failure review. Use when tuning signals, projections, decisions, or maintained route examples against a real apiserver.

2026-05-30

plugin-end-to-end

Software developers

Implements end-to-end plugin changes spanning router config, post-decision processing, optional CLI/UI exposure, and E2E test coverage. Use when adding a new plugin type, changing plugin config schema or execution semantics, updating plugin chain behavior, or modifying plugin-exposed metadata across surfaces.

2026-05-30

Showing the top 8 collected skills out of 17 in this repository.

#002

vllm-omni

7 skills · 5.6k · 1.3k · updated 2026-07-16

16% of the creator's skills

skill

occupation

description

updated

add-diffusion-model

Software developers

Add a new diffusion model (text-to-image, text-to-video, image-to-video, text-to-audio, image editing) to vLLM-Omni, including Cache-DiT acceleration and parallelism support (TP, SP/USP, CFG-Parallel, HSDP). Use when integrating a new diffusion model, porting a diffusers pipeline or a custom model repo to vllm-omni, creating a new DiT transformer adapter, adding diffusion model support, or enabling multi-GPU parallelism and cache acceleration for an existing model.

2026-07-16

vllm-omni-test

Software quality assurance analysts and testers

Generate and run tests for vllm-project/vllm-omni with CI-aligned levels and markers; wire new tests into Buildkite (test-ready.yml for L1/L2, test-merge.yml for L3, test-nightly.yml for L4). On completion, always provide copy-paste local and CI-like pytest commands plus prerequisites. Use when creating regression tests, adding L1-L4 coverage, selecting pytest markers, or validating fixes from issues/PRs.

2026-07-16

precheck-pr

Software quality assurance analysts and testers

Self-check your branch before creating a PR — catch dead code, verify accuracy/perf claims, validate PR title format, and confirm merge readiness. Use when the user says "precheck", "self review", "pre-submit check", or "check my PR before I open it." Never posts to GitHub.

2026-06-26

add-tts-model

Software developers

Integrate a new text-to-speech model into vLLM-Omni from HuggingFace reference implementation through production-ready serving with streaming and CUDA graph acceleration. Use when adding a new TTS model, wiring stage separation for speech synthesis, enabling online voice generation serving, debugging TTS integration behavior, or building audio output pipelines.

2026-06-18

quantization

Software developers

Work on vLLM-Omni quantization for diffusion, autoregressive, omni, or multi-stage models. Use when choosing or adding methods such as fp8, int8, gguf, mxfp8, mxfp4, mxfp4_dualscale, ModelOpt, AutoRound, INC, msModelSlim, awq, or gptq; debugging quantized loading; or validating memory, speed, and output quality.

2026-06-10

diffusion-perf-opt

Software developers

Diagnose and optimize vLLM Omni diffusion workloads, especially Wan/Qwen/Flux-style image and video generation. Use when Codex is asked to analyze profiling traces, choose parallel strategies, inspect torch profiler trace.json or trace.json.gz timelines, estimate optimization ROI, investigate GPU idle/free bubbles, compare USP/CFG/HSDP/VAE parallelism, or design operator/host/quantization optimizations for vLLM Omni.

2026-05-26

vllm-omni-npu-model-runner-upgrade

Software developers

Upgrade vllm-omni NPU model runners (OmniNPUModelRunner, NPUARModelRunner, NPUGenerationModelRunner) to align with the latest vllm-ascend NPUModelRunner while preserving omni-specific logic.

2026-04-18

#003

vime

6 skills · 37060 · updated 2026-06-29

14% of the creator's skills

skill

occupation

description

updated

add-tests-and-ci

Software quality assurance analysts and testers

Guide for adding or updating vime tests and CI wiring. Use when tasks require new test cases, CI registration, test matrix updates, or workflow template changes.

2026-06-29

vime-code-review-preferences

Software quality assurance analysts and testers

Use when reviewing or editing vime code, especially refactors around helper APIs, branch selection, argument validation, or recurring reviewer preferences about avoiding unnecessary wrappers and making control flow self-explanatory.

2026-06-29

add-dynamic-filter

Software developers

Guide for adding dynamic/filter hooks in vime rollout pipeline. Use when user wants sample-group selection during rollout, buffer filtering before training, or per-sample masking/processing hooks.

2026-06-04

add-eval-dataset-config

Software developers

Guide for adding and validating evaluation dataset configuration in vime. Use when user wants to configure eval datasets via --eval-config or --eval-prompt-data, add per-dataset overrides, or customize evaluation rollout behavior.

2026-06-04

add-reward-function

Software developers

Guide for adding a custom reward function in vime and wiring it through --custom-rm-path (and optional reward post-processing). Use when user wants new reward logic, remote/service reward integration, or task-specific reward shaping.

2026-06-04

add-rollout-function

Software developers

Guide for adding a new rollout function in vime and wiring it through --rollout-function-path. Use when user wants to implement custom rollout data generation logic, custom train/eval rollout outputs, or migrate from the default vLLM rollout path.

2026-06-04

#004

vllm-skills

6 skills · 8723 · updated 2026-04-03

14% of the creator's skills

skill

occupation

description

updated

vllm-bench-random-synthetic

Data scientists

Run vLLM performance benchmark using synthetic random data to measure throughput, TTFT (Time to First Token), TPOT (Time per Output Token), and other key performance metrics. Use when the user wants to quickly test vLLM serving performance without downloading external datasets.

2026-04-03

vllm-bench-serve

Data scientists

Benchmark vLLM or OpenAI-compatible serving endpoints using vllm bench serve. Supports multiple datasets (random, sharegpt, sonnet, HF), backends (openai, openai-chat, vllm-pooling, embeddings), throughput/latency testing with request-rate control, and result saving. Use when benchmarking LLM serving performance, measuring TTFT/TPOT, or load testing inference APIs.

2026-04-03

vllm-deploy-docker

Network and computer systems administrators

Deploy vLLM using Docker (pre-built images or build-from-source) with NVIDIA GPU support and run the OpenAI-compatible server.

2026-04-03

vllm-deploy-k8s

Network and computer systems administrators

Deploy vLLM to Kubernetes (K8s) with GPU support, health probes, and OpenAI-compatible API endpoint. Use this skill whenever the user wants to deploy, run, or serve vLLM on a Kubernetes cluster, including creating deployments, services, checking existing deployments, or managing vLLM on K8s.

2026-04-03

vllm-deploy-simple

Software developers

Quick install and deploy vLLM, start serving with a simple LLM, and test OpenAI API.

2026-04-03

vllm-prefix-cache-bench

Software developers

Benchmarks the efficiency of automatic prefix caching in vLLM using fixed prompts, real-world datasets, or synthetic prefix/suffix patterns. Use when the user asks to benchmark prefix caching hit rate, caching efficiency, or repeated-prompt performance in vLLM.

2026-04-03

#005

llm-compressor

3 skills · 3.6k · 581 · updated 2026-07-08

6.8% of the creator's skills

skill

occupation

description

updated

fp8

Software developers

Generate a working FP8 quantization example script and save a compressed-tensors checkpoint. Triggers on: "fp8", "FP8_DYNAMIC", "FP8_BLOCK", "MXFP8", "fp8 example", "quantize to fp8".

2026-07-08

nvfp4

Software developers

Generate a working NVFP4 (W4A4) quantization example script and save a compressed-tensors checkpoint. Triggers on: "nvfp4", "NVFP4", "fp4", "nvfp4 example", "quantize to nvfp4", "w4a4".

2026-07-08

create-tiny-model

Software developers

Create and manage tiny models for testing and development. Includes utilities for saving tiny models, inspecting tensors, and finetuning workflows.

2026-06-30

#006

vllm-ascend

2 skills · 2.4k · 1.7k · updated 2026-07-15

4.5% of the creator's skills

skill

occupation

description

updated

vllm-ascend-release

Software developers

End-to-end release management skill for vLLM Ascend. Creates release checklist issues, identifies critical bugs, runs functional tests, invokes release note generation, and guides through the complete release process.

2026-07-15

vllm-ascend-model-adapter

Software developers

Adapt and debug existing or new models for vLLM on Ascend NPU. Implement in /vllm-workspace/vllm and /vllm-workspace/vllm-ascend, validate via direct vllm serve from /workspace, and deliver one signed commit in the current repo.

2026-02-26

#007

vllm

1 skill · 86.5k · 19.5k · updated 2026-06-19

2.3% of the creator's skills

skill

occupation

description

updated

ci-fails-buildkite

Software quality assurance analysts and testers

Fetch and diagnose vLLM Buildkite CI failure logs. Use when investigating failing CI jobs on a PR or build, when the user pastes a buildkite.com URL, or asks to fetch/diagnose CI logs.

2026-06-19

#008

recipes

1 skill · 919332 · updated 2026-07-13

2.3% of the creator's skills

skill

occupation

description

updated

add-recipe

Software developers

Use when the user asks to add, contribute, or create a new vLLM recipe in this repo (e.g. "add a recipe for Qwen/Qwen3-XYZ", "create a recipe for huggingface.co/org/model"). Walks through fetching HF metadata, authoring the YAML at models/<hf_org>/<hf_repo>.yaml, picking variants/strategies, validating, and committing.

2026-07-13

#009

speculators

1 skill · 624158 · updated 2026-07-16

2.3% of the creator's skills

skill

occupation

description

updated

pr-review

Software quality assurance analysts and testers

Review a GitHub PR with design-first analysis, posted as a GitHub review.

2026-07-16

Showing 9 of 9 repositories

All repositories are shown