Install vLLM Semantic Router in agent-safe mode, import supported OpenClaw model providers into canonical VSR config, and rewrite OpenClaw to target VSR.

2026-07-13

vllm-semantic-router-harness

Other computer occupations

Bridges native skill discovery into the vLLM Semantic Router repository harness, routing tasks through the canonical agent-report flow, repo-local skill registry, and validation commands. Use when starting any task inside the vLLM Semantic Router repository to resolve the correct primary skill, read canonical docs, and run harness validation.

2026-05-30

config-platform-change

Software developers

Synchronizes config representations across router config, Python CLI schema, and dashboard config UI. Use when adding or changing a config concept that spans those surfaces or addressing config representation debt before Kubernetes-facing translation.

2026-05-30

k8s-platform-change

Software developers

Modifies Kubernetes-facing operator, CRD, deployment-profile, or DSL translation behavior for semantic-router platform integration. Use when changing operator APIs or controllers, deployment stack manifests, profile-owned platform wiring, or router-to-Kubernetes translation layers.

2026-05-30

maintainer-issue-pr-management

Other computer occupations

Manages GitHub issue and pull-request lifecycle including creation, updates, triage labelling, and closeout metadata using canonical templates and repository taxonomy. Use when a maintainer asks to create, update, close, or triage GitHub issues or PRs, or when issue creation requires codebase analysis for scope, labels, or acceptance criteria.

2026-05-30

maintainer-release-ops

Other computer occupations

Maintainer release and milestone operating workflow. Use when a maintainer wants to plan a release, create milestone issues, sync GitHub issue or PR state, generate a daily review brief, or manage stale PRs and backlog routing.

2026-05-30

routing-calibration-loop

Software developers

Calibrates routing changes against a live router endpoint with executable probes, local DSL validation, versioned deploys, and structured failure review. Use when tuning signals, projections, decisions, or maintained route examples against a real apiserver.

2026-05-30

plugin-end-to-end

Software developers

Implements end-to-end plugin changes spanning router config, post-decision processing, optional CLI/UI exposure, and E2E test coverage. Use when adding a new plugin type, changing plugin config schema or execution semantics, updating plugin chain behavior, or modifying plugin-exposed metadata across surfaces.

2026-05-30

Currently showing the top 8 of 17 collected skills for this repository.

#002

vllm-omni

7 skills · 5.6k · 1.3k · Updated 2026-07-16

16% of this creator's skills

Skill

Occupation category

Description

Updated

add-diffusion-model

Software developers

Add a new diffusion model (text-to-image, text-to-video, image-to-video, text-to-audio, image editing) to vLLM-Omni, including Cache-DiT acceleration and parallelism support (TP, SP/USP, CFG-Parallel, HSDP). Use when integrating a new diffusion model, porting a diffusers pipeline or a custom model repo to vllm-omni, creating a new DiT transformer adapter, adding diffusion model support, or enabling multi-GPU parallelism and cache acceleration for an existing model.

2026-07-16

vllm-omni-test

Software quality assurance analysts and testers

Generate and run tests for vllm-project/vllm-omni with CI-aligned levels and markers; wire new tests into Buildkite (test-ready.yml for L1/L2, test-merge.yml for L3, test-nightly.yml for L4). On completion, always provide copy-paste local and CI-like pytest commands plus prerequisites. Use when creating regression tests, adding L1-L4 coverage, selecting pytest markers, or validating fixes from issues/PRs.

2026-07-16

precheck-pr

Software quality assurance analysts and testers

Self-check your branch before creating a PR — catch dead code, verify accuracy/perf claims, validate PR title format, and confirm merge readiness. Use when the user says "precheck", "self review", "pre-submit check", or "check my PR before I open it." Never posts to GitHub.

2026-06-26

add-tts-model

Software developers

Integrate a new text-to-speech model into vLLM-Omni from HuggingFace reference implementation through production-ready serving with streaming and CUDA graph acceleration. Use when adding a new TTS model, wiring stage separation for speech synthesis, enabling online voice generation serving, debugging TTS integration behavior, or building audio output pipelines.

2026-06-18

quantization

Software developers

Work on vLLM-Omni quantization for diffusion, autoregressive, omni, or multi-stage models. Use when choosing or adding methods such as fp8, int8, gguf, mxfp8, mxfp4, mxfp4_dualscale, ModelOpt, AutoRound, INC, msModelSlim, awq, or gptq; debugging quantized loading; or validating memory, speed, and output quality.

2026-06-10

diffusion-perf-opt

Software developers

Diagnose and optimize vLLM Omni diffusion workloads, especially Wan/Qwen/Flux-style image and video generation. Use when Codex is asked to analyze profiling traces, choose parallel strategies, inspect torch profiler trace.json or trace.json.gz timelines, estimate optimization ROI, investigate GPU idle/free bubbles, compare USP/CFG/HSDP/VAE parallelism, or design operator/host/quantization optimizations for vLLM Omni.

2026-05-26

vllm-omni-npu-model-runner-upgrade

Software developers

Upgrade vllm-omni NPU model runners (OmniNPUModelRunner, NPUARModelRunner, NPUGenerationModelRunner) to align with the latest vllm-ascend NPUModelRunner while preserving omni-specific logic.

2026-04-18

#003

vime

6 skills · 37060 · Updated 2026-06-29

14% of this creator's skills

Skill

Occupation category

Description

Updated

add-tests-and-ci

Software quality assurance analysts and testers

Guide for adding or updating vime tests and CI wiring. Use when tasks require new test cases, CI registration, test matrix updates, or workflow template changes.

2026-06-29

vime-code-review-preferences

Software quality assurance analysts and testers

Use when reviewing or editing vime code, especially refactors around helper APIs, branch selection, argument validation, or recurring reviewer preferences about avoiding unnecessary wrappers and making control flow self-explanatory.

2026-06-29

add-dynamic-filter

Software developers

Guide for adding dynamic/filter hooks in vime rollout pipeline. Use when user wants sample-group selection during rollout, buffer filtering before training, or per-sample masking/processing hooks.

2026-06-04

add-eval-dataset-config

Software developers

Guide for adding and validating evaluation dataset configuration in vime. Use when user wants to configure eval datasets via --eval-config or --eval-prompt-data, add per-dataset overrides, or customize evaluation rollout behavior.

2026-06-04

add-reward-function

Software developers

Guide for adding a custom reward function in vime and wiring it through --custom-rm-path (and optional reward post-processing). Use when user wants new reward logic, remote/service reward integration, or task-specific reward shaping.

2026-06-04

add-rollout-function

Software developers

Guide for adding a new rollout function in vime and wiring it through --rollout-function-path. Use when user wants to implement custom rollout data generation logic, custom train/eval rollout outputs, or migrate from the default vLLM rollout path.

2026-06-04

#004

vllm-skills

6 skills · 8723 · Updated 2026-04-03

14% of this creator's skills

Skill

Occupation category

Description

Updated

vllm-bench-random-synthetic

Data scientists

Run vLLM performance benchmark using synthetic random data to measure throughput, TTFT (Time to First Token), TPOT (Time per Output Token), and other key performance metrics. Use when the user wants to quickly test vLLM serving performance without downloading external datasets.

2026-04-03

vllm-bench-serve

Data scientists

Benchmark vLLM or OpenAI-compatible serving endpoints using vllm bench serve. Supports multiple datasets (random, sharegpt, sonnet, HF), backends (openai, openai-chat, vllm-pooling, embeddings), throughput/latency testing with request-rate control, and result saving. Use when benchmarking LLM serving performance, measuring TTFT/TPOT, or load testing inference APIs.

2026-04-03

vllm-deploy-docker

Network and computer systems administrators

Deploy vLLM using Docker (pre-built images or build-from-source) with NVIDIA GPU support and run the OpenAI-compatible server.

2026-04-03

vllm-deploy-k8s

Network and computer systems administrators

Deploy vLLM to Kubernetes (K8s) with GPU support, health probes, and OpenAI-compatible API endpoint. Use this skill whenever the user wants to deploy, run, or serve vLLM on a Kubernetes cluster, including creating deployments, services, checking existing deployments, or managing vLLM on K8s.

2026-04-03

vllm-deploy-simple

Software developers

Quick install and deploy vLLM, start serving with a simple LLM, and test OpenAI API.

2026-04-03

vllm-prefix-cache-bench

Software developers

This is a skill for benchmarking the efficiency of automatic prefix caching in vLLM using fixed prompts, real-world datasets, or synthetic prefix/suffix patterns. Use when the user asks to benchmark prefix caching hit rate, caching efficiency, or repeated-prompt performance in vLLM.

2026-04-03

#005

llm-compressor

3 skills · 3.6k · 581 · Updated 2026-07-08

6.8% of this creator's skills

Skill

Occupation category

Description

Updated

fp8

Software developers

Generate a working FP8 quantization example script and save a compressed-tensors checkpoint. Triggers on: "fp8", "FP8_DYNAMIC", "FP8_BLOCK", "MXFP8", "fp8 example", "quantize to fp8".

2026-07-08

nvfp4

Software developers

Generate a working NVFP4 (W4A4) quantization example script and save a compressed-tensors checkpoint. Triggers on: "nvfp4", "NVFP4", "fp4", "nvfp4 example", "quantize to nvfp4", "w4a4".

2026-07-08

create-tiny-model

Software developers

Create and manage tiny models for testing and development. Includes utilities for saving tiny models, inspecting tensors, and finetuning workflows.

2026-06-30

#006

vllm-ascend

2 skills · 2.4k · 1.7k · Updated 2026-07-15

4.5% of this creator's skills

Skill

Occupation category

Description

Updated

vllm-ascend-release

Software developers

End-to-end release management skill for vLLM Ascend. Creates release checklist issues, identifies critical bugs, runs functional tests, invokes release note generation, and guides through the complete release process.

2026-07-15

vllm-ascend-model-adapter

Software developers

Adapt and debug existing or new models for vLLM on Ascend NPU. Implement in /vllm-workspace/vllm and /vllm-workspace/vllm-ascend, validate via direct vllm serve from /workspace, and deliver one signed commit in the current repo.

2026-02-26

#007

vllm

1 skill · 86.5k · 19.5k · Updated 2026-06-19

2.3% of this creator's skills

Skill

Occupation category

Description

Updated

ci-fails-buildkite

Software quality assurance analysts and testers

Fetch and diagnose vLLM Buildkite CI failure logs. Use when investigating failing CI jobs on a PR or build, when the user pastes a buildkite.com URL, or asks to fetch/diagnose CI logs.

2026-06-19

#008

recipes

1 skill · 919332 · Updated 2026-07-13

2.3% of this creator's skills

Skill

Occupation category

Description

Updated

add-recipe

Software developers

Use when the user asks to add, contribute, or create a new vLLM recipe in this repo (e.g. "add a recipe for Qwen/Qwen3-XYZ", "create a recipe for huggingface.co/org/model"). Walks through fetching HF metadata, authoring the YAML at models/<hf_org>/<hf_repo>.yaml, picking variants/strategies, validating, and committing.

2026-07-13

#009

speculators

1 skill · 624158 · Updated 2026-07-16

2.3% of this creator's skills

Skill

Occupation category

Description

Updated

pr-review

Software quality assurance analysts and testers

Review a GitHub PR with design-first analysis, posted as a GitHub review.

2026-07-16

Showing 9 of 9 repositories
