GitHub creator profile

vllm-project

Browse 30 collected skills from 5 GitHub repositories, grouped by repository, with approximate occupational coverage.

Collected skills: 30
Repositories: 5
Occupational fields: 1
Updated: 2026-05-30
Occupational coverage
The major occupational groups this creator mainly covers.
Repository browser

Repositories and representative skills

#001
semantic-router
17 skills · 4.2k · 689 · updated 2026-05-30
57% of this creator's skills
vllm-semantic-router-harness
Software Developer

Bridges native skill discovery into the vLLM Semantic Router repository harness, routing tasks through the canonical agent-report flow, repo-local skill registry, and validation commands. Use when starting any task inside the vLLM Semantic Router repository to resolve the correct primary skill, read canonical docs, and run harness validation.

2026-05-30
config-platform-change
Software Developer

Synchronizes config representations across router config, Python CLI schema, and dashboard config UI. Use when adding or changing a config concept that spans those surfaces or addressing config representation debt before Kubernetes-facing translation.

2026-05-30
k8s-platform-change
Software Developer

Modifies Kubernetes-facing operator, CRD, deployment-profile, or DSL translation behavior for semantic-router platform integration. Use when changing operator APIs or controllers, deployment stack manifests, profile-owned platform wiring, or router-to-Kubernetes translation layers.

2026-05-30
maintainer-issue-pr-management
Software Developer

Manages GitHub issue and pull-request lifecycle including creation, updates, triage labelling, and closeout metadata using canonical templates and repository taxonomy. Use when a maintainer asks to create, update, close, or triage GitHub issues or PRs, or when issue creation requires codebase analysis for scope, labels, or acceptance criteria.

2026-05-30
maintainer-release-ops
Software Developer

Maintainer release and milestone operating workflow. Use when a maintainer wants to plan a release, create milestone issues, sync GitHub issue or PR state, generate a daily review brief, or manage stale PRs and backlog routing.

2026-05-30
routing-calibration-loop
Software Developer

Calibrates routing changes against a live router endpoint with executable probes, local DSL validation, versioned deploys, and structured failure review. Use when tuning signals, projections, decisions, or maintained route examples against a real apiserver.

2026-05-30
plugin-end-to-end
Software Developer

Implements end-to-end plugin changes spanning router config, post-decision processing, optional CLI/UI exposure, and E2E test coverage. Use when adding a new plugin type, changing plugin config schema or execution semantics, updating plugin chain behavior, or modifying plugin-exposed metadata across surfaces.

2026-05-30
router-service-platform-change
Software Developer

Modifies router-side API, authz, memory, provider, storage, or runtime service modules outside config, decision, selection, and extproc plugin chains. Use when changing apiserver endpoints, authz or rate-limit policy code, memory or response storage flows, provider adapters, or other router service-platform modules.

2026-05-30
Showing the top 8 of 17 collected skills for this repository.
#002
vllm-skills
5 skills · 7622 · updated 2026-04-03
17% of this creator's skills
vllm-bench-random-synthetic
Software Developer

Run vLLM performance benchmark using synthetic random data to measure throughput, TTFT (Time to First Token), TPOT (Time per Output Token), and other key performance metrics. Use when the user wants to quickly test vLLM serving performance without downloading external datasets.

2026-04-03
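
The TTFT and TPOT metrics named in the vllm-bench-random-synthetic description above can be illustrated with a small calculation. This is a minimal sketch, not the benchmark's own code: `RequestTrace` and the helper functions are hypothetical names, and it assumes you already have per-token arrival timestamps for each streamed request.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Timestamps in seconds for one streamed request (hypothetical structure)."""
    sent_at: float            # when the request was sent
    token_times: List[float]  # arrival time of each output token, in order


def ttft(trace: RequestTrace) -> float:
    """Time to First Token: delay between sending and the first output token."""
    return trace.token_times[0] - trace.sent_at


def tpot(trace: RequestTrace) -> float:
    """Time per Output Token: mean gap between tokens after the first one."""
    n = len(trace.token_times)
    if n < 2:
        raise ValueError("TPOT needs at least two output tokens")
    return (trace.token_times[-1] - trace.token_times[0]) / (n - 1)


def output_throughput(traces: List[RequestTrace], wall_clock_s: float) -> float:
    """Output tokens per second across all requests in the run."""
    return sum(len(t.token_times) for t in traces) / wall_clock_s


if __name__ == "__main__":
    trace = RequestTrace(sent_at=0.0, token_times=[0.5, 0.6, 0.7, 0.8, 0.9])
    print(f"TTFT: {ttft(trace):.3f} s")
    print(f"TPOT: {tpot(trace):.3f} s")
    print(f"Throughput: {output_throughput([trace], 1.0):.1f} tok/s")
```

TPOT excludes the first token on purpose, so prefill latency (captured by TTFT) does not skew the per-token decode figure.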
vllm-bench-serve
Software Developer

Benchmark vLLM or OpenAI-compatible serving endpoints using vllm bench serve. Supports multiple datasets (random, sharegpt, sonnet, HF), backends (openai, openai-chat, vllm-pooling, embeddings), throughput/latency testing with request-rate control, and result saving. Use when benchmarking LLM serving performance, measuring TTFT/TPOT, or load testing inference APIs.

2026-04-03
vllm-deploy-k8s
Network and Computer Systems Administrator

Deploy vLLM to Kubernetes (K8s) with GPU support, health probes, and OpenAI-compatible API endpoint. Use this skill whenever the user wants to deploy, run, or serve vLLM on a Kubernetes cluster, including creating deployments, services, checking existing deployments, or managing vLLM on K8s.

2026-04-03
vllm-deploy-simple
Network and Computer Systems Administrator

Quickly install and deploy vLLM, start serving a simple LLM, and test the OpenAI-compatible API.

2026-04-03
vllm-prefix-cache-bench
Software Developer

This is a skill for benchmarking the efficiency of automatic prefix caching in vLLM using fixed prompts, real-world datasets, or synthetic prefix/suffix patterns. Use when the user asks to benchmark prefix caching hit rate, caching efficiency, or repeated-prompt performance in vLLM.

2026-04-03
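
To make the "prefix caching hit rate" in the vllm-prefix-cache-bench description above concrete, here is a toy model of block-level prefix reuse. It is a sketch under stated assumptions, not vLLM's implementation: the function names are hypothetical, the cache is unbounded with no eviction, and each full block is keyed by the whole token prefix that ends at it.

```python
from typing import List, Set, Tuple


def block_keys(tokens: List[int], block_size: int) -> List[Tuple[int, ...]]:
    """Key each full block by the entire prefix up to its end.

    A block can then only match when every token before it matches too.
    Trailing tokens that do not fill a block are ignored.
    """
    n_full = len(tokens) // block_size
    return [tuple(tokens[: (i + 1) * block_size]) for i in range(n_full)]


def prefix_cache_hit_rate(prompts: List[List[int]], block_size: int = 4) -> float:
    """Fraction of full blocks already cached when each prompt arrives.

    Assumption: unbounded cache, prompts processed one at a time in order.
    """
    cache: Set[Tuple[int, ...]] = set()
    hits = 0
    total = 0
    for tokens in prompts:
        for key in block_keys(tokens, block_size):
            total += 1
            if key in cache:
                hits += 1
            else:
                cache.add(key)
    return hits / total if total else 0.0


if __name__ == "__main__":
    shared = list(range(8))  # 8-token shared prefix = 2 full blocks
    prompts = [shared + [100, 101, 102, 103], shared + [200, 201, 202, 203]]
    print(f"hit rate: {prefix_cache_hit_rate(prompts):.3f}")
```

In this example the second prompt reuses the two blocks of the shared prefix and misses on its own suffix block, so 2 of 6 blocks hit. A real benchmark would also account for eviction and concurrent requests.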
#003
vllm-omni
4 skills · 4.9k · 1.0k · updated 2026-05-26
13% of this creator's skills
diffusion-perf-opt
Software Developer

Diagnose and optimize vLLM Omni diffusion workloads, especially Wan/Qwen/Flux-style image and video generation. Use when Codex is asked to analyze profiling traces, choose parallel strategies, inspect torch profiler trace.json or trace.json.gz timelines, estimate optimization ROI, investigate GPU idle/free bubbles, compare USP/CFG/HSDP/VAE parallelism, or design operator/host/quantization optimizations for vLLM Omni.

2026-05-26
add-diffusion-model
Software Developer

Add a new diffusion model (text-to-image, text-to-video, image-to-video, text-to-audio, image editing) to vLLM-Omni, including Cache-DiT acceleration and parallelism support (TP, SP/USP, CFG-Parallel, HSDP). Use when integrating a new diffusion model, porting a diffusers pipeline or a custom model repo to vllm-omni, creating a new DiT transformer adapter, adding diffusion model support, or enabling multi-GPU parallelism and cache acceleration for an existing model.

2026-05-11
add-tts-model
Software Developer

Integrate a new text-to-speech model into vLLM-Omni from HuggingFace reference implementation through production-ready serving with streaming and CUDA graph acceleration. Use when adding a new TTS model, wiring stage separation for speech synthesis, enabling online voice generation serving, debugging TTS integration behavior, or building audio output pipelines.

2026-05-05
vllm-omni-npu-model-runner-upgrade
Software Developer

Upgrade vllm-omni NPU model runners (OmniNPUModelRunner, NPUARModelRunner, NPUGenerationModelRunner) to align with the latest vllm-ascend NPUModelRunner while preserving omni-specific logic.

2026-04-18
#004
vllm-ascend
3 skills · 2.2k · 1.3k · updated 2026-05-22
10% of this creator's skills
Showing 5 of 5 repositories.