vllm-project
GitHub creator profile

Repository-level view of 30 collected skills across 5 GitHub repositories, including approximate occupation coverage.

skills collected: 30
repositories: 5
occupation fields: 1
updated: 2026-05-30
occupation focus
Major fields detected across this creator.
repository explorer

Repositories and representative skills

#001
semantic-router
17 skills · 4.2k · 689 · updated 2026-05-30
57% of creator
vllm-semantic-router-harness
Software Developers

Bridges native skill discovery into the vLLM Semantic Router repository harness, routing tasks through the canonical agent-report flow, repo-local skill registry, and validation commands. Use when starting any task inside the vLLM Semantic Router repository to resolve the correct primary skill, read canonical docs, and run harness validation.

2026-05-30
config-platform-change
Software Developers

Synchronizes config representations across router config, Python CLI schema, and dashboard config UI. Use when adding or changing a config concept that spans those surfaces or addressing config representation debt before Kubernetes-facing translation.

2026-05-30
k8s-platform-change
Software Developers

Modifies Kubernetes-facing operator, CRD, deployment-profile, or DSL translation behavior for semantic-router platform integration. Use when changing operator APIs or controllers, deployment stack manifests, profile-owned platform wiring, or router-to-Kubernetes translation layers.

2026-05-30
maintainer-issue-pr-management
Software Developers

Manages GitHub issue and pull-request lifecycle including creation, updates, triage labelling, and closeout metadata using canonical templates and repository taxonomy. Use when a maintainer asks to create, update, close, or triage GitHub issues or PRs, or when issue creation requires codebase analysis for scope, labels, or acceptance criteria.

2026-05-30
maintainer-release-ops
Software Developers

Maintainer release and milestone operating workflow. Use when a maintainer wants to plan a release, create milestone issues, sync GitHub issue or PR state, generate a daily review brief, or manage stale PRs and backlog routing.

2026-05-30
routing-calibration-loop
Software Developers

Calibrates routing changes against a live router endpoint with executable probes, local DSL validation, versioned deploys, and structured failure review. Use when tuning signals, projections, decisions, or maintained route examples against a real apiserver.

2026-05-30
plugin-end-to-end
Software Developers

Implements end-to-end plugin changes spanning router config, post-decision processing, optional CLI/UI exposure, and E2E test coverage. Use when adding a new plugin type, changing plugin config schema or execution semantics, updating plugin chain behavior, or modifying plugin-exposed metadata across surfaces.

2026-05-30
router-service-platform-change
Software Developers

Modifies router-side API, authz, memory, provider, storage, or runtime service modules outside config, decision, selection, and extproc plugin chains. Use when changing apiserver endpoints, authz or rate-limit policy code, memory or response storage flows, provider adapters, or other router service-platform modules.

2026-05-30
Showing top 8 of 17 collected skills in this repository.
#002
vllm-skills
5 skills · 7622 · updated 2026-04-03
17% of creator
vllm-bench-random-synthetic
Software Developers

Run vLLM performance benchmark using synthetic random data to measure throughput, TTFT (Time to First Token), TPOT (Time per Output Token), and other key performance metrics. Use when the user wants to quickly test vLLM serving performance without downloading external datasets.

2026-04-03
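As a rough illustration of the metrics named in this entry, the sketch below computes TTFT, TPOT, and output throughput from per-token arrival timestamps. The `RequestTrace` type and the sample numbers are hypothetical and are not part of vLLM or of the skill itself; TPOT is taken here as the mean gap between output tokens after the first one.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestTrace:
    """Hypothetical record of one streamed request (not a vLLM type)."""
    sent_at: float            # time the request was sent, in seconds
    token_times: List[float]  # arrival time of each output token, in seconds


def ttft(trace: RequestTrace) -> float:
    """Time to First Token: delay between sending and the first output token."""
    return trace.token_times[0] - trace.sent_at


def tpot(trace: RequestTrace) -> float:
    """Time per Output Token: mean gap between tokens after the first one."""
    n = len(trace.token_times)
    if n < 2:
        return 0.0  # undefined for a single token; this sketch reports 0
    return (trace.token_times[-1] - trace.token_times[0]) / (n - 1)


def throughput(traces: List[RequestTrace]) -> float:
    """Output tokens per second, measured from first send to last token."""
    start = min(t.sent_at for t in traces)
    end = max(t.token_times[-1] for t in traces)
    total_tokens = sum(len(t.token_times) for t in traces)
    return total_tokens / (end - start)


# Made-up timestamps: first token after 250 ms, then one token every 50 ms.
trace = RequestTrace(sent_at=0.0, token_times=[0.25, 0.30, 0.35, 0.40, 0.45])
print(f"TTFT: {ttft(trace) * 1000:.0f} ms")
print(f"TPOT: {tpot(trace) * 1000:.0f} ms")
print(f"Throughput: {throughput([trace]):.1f} tokens/s")
```

A real benchmark run would aggregate these per-request values into means and percentiles; the sketch only shows how each metric is defined.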
vllm-bench-serve
Software Developers

Benchmark vLLM or OpenAI-compatible serving endpoints using vllm bench serve. Supports multiple datasets (random, sharegpt, sonnet, HF), backends (openai, openai-chat, vllm-pooling, embeddings), throughput/latency testing with request-rate control, and result saving. Use when benchmarking LLM serving performance, measuring TTFT/TPOT, or load testing inference APIs.

2026-04-03
vllm-deploy-k8s
Network and Computer Systems Administrators

Deploy vLLM to Kubernetes (K8s) with GPU support, health probes, and OpenAI-compatible API endpoint. Use this skill whenever the user wants to deploy, run, or serve vLLM on a Kubernetes cluster, including creating deployments, services, checking existing deployments, or managing vLLM on K8s.

2026-04-03
vllm-deploy-simple
Network and Computer Systems Administrators

Quick install and deploy vLLM, start serving with a simple LLM, and test OpenAI API.

2026-04-03
vllm-prefix-cache-bench
Software Developers

Benchmarks the efficiency of automatic prefix caching in vLLM using fixed prompts, real-world datasets, or synthetic prefix/suffix patterns. Use when the user asks to benchmark prefix caching hit rate, caching efficiency, or repeated-prompt performance in vLLM.

2026-04-03
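To make the "synthetic prefix/suffix pattern" and "hit rate" ideas in this entry concrete, here is a minimal, idealized sketch. It assumes that a cache stores only full blocks of tokens and that a block is reusable only when the entire prompt prefix up to that block matches; the block size of 16 and all function names are assumptions for illustration, and the code models the concept rather than vLLM's implementation or the skill's own scripts.

```python
from typing import List, Set, Tuple

BLOCK_SIZE = 16  # assumed cache block size in tokens


def count_cached_tokens(
    tokens: List[int],
    seen: Set[Tuple[int, ...]],
    block_size: int = BLOCK_SIZE,
) -> int:
    """Count tokens served from cache, then record this prompt's blocks."""
    hits = 0
    full_blocks = len(tokens) // block_size  # a partial last block is never cached
    for i in range(full_blocks):
        # A block's identity depends on every token before it, not just its own.
        key = tuple(tokens[: (i + 1) * block_size])
        if key in seen:
            hits += block_size
        else:
            seen.add(key)
    return hits


def make_prompts(n: int, prefix_len: int, suffix_len: int) -> List[List[int]]:
    """Build n token lists sharing one prefix, each with its own suffix."""
    prefix = list(range(prefix_len))
    return [prefix + [10_000 + i] * suffix_len for i in range(n)]


def hit_rate(prompts: List[List[int]], block_size: int = BLOCK_SIZE) -> float:
    """Fraction of all prompt tokens that were found in the simulated cache."""
    seen: Set[Tuple[int, ...]] = set()
    cached = sum(count_cached_tokens(p, seen, block_size) for p in prompts)
    total = sum(len(p) for p in prompts)
    return cached / total


prompts = make_prompts(n=4, prefix_len=64, suffix_len=16)
print(f"hit rate: {hit_rate(prompts):.2f}")
```

In this toy setup the first prompt misses everywhere and the remaining three reuse the 64-token shared prefix, so 192 of 320 tokens are cache hits. Measured hit rates on a real server will differ with eviction, concurrency, and tokenization.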
#003
vllm-omni
4 skills · 4.9k · 1.0k · updated 2026-05-26
13% of creator
diffusion-perf-opt
Software Developers

Diagnose and optimize vLLM Omni diffusion workloads, especially Wan/Qwen/Flux-style image and video generation. Use when Codex is asked to analyze profiling traces, choose parallel strategies, inspect torch profiler trace.json or trace.json.gz timelines, estimate optimization ROI, investigate GPU idle/free bubbles, compare USP/CFG/HSDP/VAE parallelism, or design operator/host/quantization optimizations for vLLM Omni.

2026-05-26
add-diffusion-model
Software Developers

Add a new diffusion model (text-to-image, text-to-video, image-to-video, text-to-audio, image editing) to vLLM-Omni, including Cache-DiT acceleration and parallelism support (TP, SP/USP, CFG-Parallel, HSDP). Use when integrating a new diffusion model, porting a diffusers pipeline or a custom model repo to vllm-omni, creating a new DiT transformer adapter, adding diffusion model support, or enabling multi-GPU parallelism and cache acceleration for an existing model.

2026-05-11
add-tts-model
Software Developers

Integrate a new text-to-speech model into vLLM-Omni from HuggingFace reference implementation through production-ready serving with streaming and CUDA graph acceleration. Use when adding a new TTS model, wiring stage separation for speech synthesis, enabling online voice generation serving, debugging TTS integration behavior, or building audio output pipelines.

2026-05-05
vllm-omni-npu-model-runner-upgrade
Software Developers

Upgrade vllm-omni NPU model runners (OmniNPUModelRunner, NPUARModelRunner, NPUGenerationModelRunner) to align with the latest vllm-ascend NPUModelRunner while preserving omni-specific logic.

2026-04-18
#004
vllm-ascend
3 skills · 2.2k · 1.3k · updated 2026-05-22
10% of creator
5 of 5 repositories shown
All repositories are shown