flashinfer-ai Agent Skills

skill

Occupation

Description

Last updated

Auto-collect workloads from SGLang inference runs using the FlashInfer logging API. Dumps tensors, sanitizes them according to kernel definitions, and submits a PR to the flashinfer-trace workload repo.

2026-05-01

discover-models

Software Developers

Discover candidate LLMs and produce a kernel inventory — required definitions, classified as existing/new and fi_supported/fi_missing — for onboarding. Use as Phase 1 of /onboard-model, or standalone to plan onboarding work.

2026-05-01

extract-kernel-definitions

Software Developers

Generate Definition JSON files for the flashinfer-trace HuggingFace dataset by harvesting them from a short SGLang inference pass (FlashInfer's @flashinfer_api(trace=...) dumper) — or, as a fallback, by manually transcribing the schema from SGLang sources when FlashInfer doesn't yet have a trace template. Use when adding a new model, extracting GPU kernels (MLA, MoE, GQA, RMSNorm, GEMM, GDN, RoPE, sampling), or filling gaps in the dataset.

2026-05-01

onboard-model

Software Developers

End-to-end pipeline for discovering new LLMs with novel kernels and onboarding them into FlashInfer-Bench. Orchestrates repo updates, model discovery, kernel definition generation, workload collection, and PR submission.

2026-05-01

add-reference-tests

Software Quality Assurance Analysts and Testers

Add pytest tests to validate reference implementations in the flashinfer-trace HuggingFace dataset against FlashInfer or SGLang ground truth. Use when validating kernel definitions, adding tests for new op_types, or verifying reference implementations are correct.

2026-04-28

clone-repos

Software Developers

Clone SGLang, FlashInfer, sgl-cookbook, and flashinfer-trace repositories to tmp/. Use when setting up the project, preparing for kernel extraction, or when the user needs the source repositories.

2026-04-28

submit-onboarding-prs

Software Developers

Open the per-definition pair of PRs that publishes a model onboarding — PR 2 to the HuggingFace flashinfer-trace dataset (definition + reference test + baseline solution + workloads + blobs + eval traces) and PR 1 to flashinfer-bench (docs/model_coverage.mdx update only). Use as Phase 4 of /onboard-model.

2026-04-28

track-models

Software Developers

Track popular/new open-source LLMs and update docs/model_coverage.mdx with their kernel support status. Use when discovering new models to add to the coverage tracker, checking if a specific model is covered, or refreshing model coverage documentation.

2026-04-28

Showing the top 8 of 9 skills collected in this repository.

flashinfer-ai

Where skills are located

Repositories and skills represented