GitHub creator profile

BBuf

Repository-level view of 12 collected skills across 1 GitHub repository, including approximate occupation coverage.

skills collected: 12
repositories: 1
occupation fields: 1
updated: 2026-05-26
occupation focus
Major fields detected across this creator.
repository explorer

Repositories and representative skills

#001
AI-Infra-Auto-Driven-SKILLS
12 skills · 48341 · updated 2026-05-26
100% of creator
vllm-sota-humanize-loop
Software developers

Run an autonomous Humanize-governed vLLM SOTA performance loop for one LLM model. First perform the fixed fair vLLM/SGLang/TensorRT-LLM deployment search and benchmark. Then start one RLCR loop that repeatedly decides the gap, profiles the current bottleneck, runs layer/kernel pipeline analysis, patches vLLM code, optionally uses ncu-report-skill for kernel evidence, and revalidates until vLLM matches or beats the best observed framework under the same workload and SLA.

2026-05-26
model-pr-history-knowledge
Data scientists

Use when an SGLang, vLLM, or TensorRT-LLM serving/model optimization task needs prior model-family PR evidence. Query and read the PR-driven history docs under model-pr-optimization-history before choosing source paths, fast paths, kernel/fusion ideas, regression risks, or validation lanes.

2026-05-26
sglang-sota-humanize-loop
Software developers

Run an autonomous Humanize-governed SGLang SOTA performance loop for one LLM model. First perform the fixed fair SGLang/vLLM/TensorRT-LLM deployment search and benchmark. Then start one RLCR loop that repeatedly decides the gap, profiles the current bottleneck, runs layer/kernel pipeline analysis, patches SGLang code, optionally uses ncu-report-skill for kernel evidence, and revalidates until SGLang matches or beats the best observed framework under the same workload and SLA.

2026-05-26
llm-pipeline-analysis
Software developers

Inspect LLM torch profiler traces at forward-pass, layer, and kernel level. Use when you need layer timings, anchor-kernel boundaries, representative kernel flows, or Perfetto time ranges.

2026-05-26
sglang-humanize-review
Software quality assurance analysts and testers

Perform SGLang code review in the style of human maintainers by consulting the 2024-2025 non-agent PR review corpus, including inline code snippets, original multilingual comments, and discussion threads. Use when reviewing SGLang PRs, diffs, patches, or local changes for correctness, tests, performance, GPU/runtime risks, API compatibility, and maintainability.

2026-05-20
llm-serving-capacity-planner
Network and computer systems administrators

Parse SGLang/vLLM startup logs to explain GPU memory use and request capacity. Use for KV cache budget, mem-fraction-static comparisons, OOM triage, and max-concurrency estimates.

2026-05-20
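
The KV cache budget and max-concurrency estimates named in the llm-serving-capacity-planner entry above come down to simple arithmetic, which can be sketched as follows. This is a minimal illustration, not the skill's implementation: it does not parse any startup log, the function names are hypothetical, and the model dimensions and 40 GiB budget are made-up example values. It assumes a standard attention KV cache with one key and one value tensor per layer.

```python
def kv_bytes_per_token(num_layers, num_kv_heads, head_dim, dtype_bytes=2):
    """Bytes of KV cache one token occupies (hypothetical helper).

    The factor 2 accounts for storing both a key and a value tensor
    at every layer; dtype_bytes=2 assumes an fp16/bf16 cache.
    """
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes


def max_concurrency(kv_budget_bytes, bytes_per_token, tokens_per_request):
    """Upper bound on simultaneous requests that fit in the KV budget."""
    total_tokens = kv_budget_bytes // bytes_per_token
    return total_tokens // tokens_per_request


# Illustrative numbers only: 32 layers, 8 KV heads, head_dim 128, fp16 cache.
per_token = kv_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)
budget = 40 * 2**30  # assume 40 GiB remains for the KV cache after weights
print(per_token)                                 # bytes per cached token
print(max_concurrency(budget, per_token, 4096))  # requests at 4096 tokens each
```

In practice the budget would be taken from the memory figures a serving framework reports at startup, and the result is an upper bound because it ignores fragmentation and any other runtime allocations.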
model-compute-simulation
Data scientists

Build an operator-level compute template for an LLM and estimate FLOPs/MFU for a serving shape. Use when you need tensor shapes, per-op FLOPs, kernel-to-op MFU mapping, or parallelism what-if analysis.

2026-05-20
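
The per-op FLOPs and MFU estimate mentioned in the model-compute-simulation entry above can be illustrated with a short sketch. It is an assumption-laden example rather than the skill's own template: the helper names are hypothetical, it covers only a single dense linear layer, and the elapsed time and peak throughput are placeholder values. It uses the usual convention that a matrix multiply of shape (m, k) by (k, n) costs 2·m·k·n floating-point operations.

```python
def linear_flops(num_tokens, in_features, out_features):
    """FLOPs for one dense linear layer applied to num_tokens tokens.

    Each output element needs in_features multiplies and adds,
    hence the factor 2 (hypothetical helper for illustration).
    """
    return 2 * num_tokens * in_features * out_features


def mfu(flops, elapsed_seconds, peak_flops_per_second):
    """Model FLOPs utilization: achieved throughput over hardware peak."""
    achieved = flops / elapsed_seconds
    return achieved / peak_flops_per_second


# Placeholder serving shape: 4096 tokens through a 4096 x 4096 projection.
op_flops = linear_flops(num_tokens=4096, in_features=4096, out_features=4096)
# Placeholder measurements: 1 ms kernel time on a device with 300 TFLOP/s peak.
print(op_flops)
print(mfu(op_flops, elapsed_seconds=1e-3, peak_flops_per_second=300e12))
```

Summing such per-op counts over a model's layers and dividing by measured kernel time gives the kernel-to-op MFU mapping the description refers to.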
llm-serving-auto-benchmark
Software developers

Framework-independent LLM serving benchmark skill for comparing SGLang, vLLM, TensorRT-LLM, or another serving framework. Use when a user wants to find the best deployment command for one model across multiple serving frameworks under the same workload, GPU budget, and latency SLA.

2026-05-16
Showing top 8 of 12 collected skills in this repository.
Showing 1 of 1 repositories
All repositories loaded