Run any Skill in Manus with one click

cookbook-migrate-model

Migrate a legacy-template SGLang cookbook page (monolithic per-model generator under docs_new/src/snippets/autoregressive/) onto the config-driven template (shared _deployment.jsx / _playground.jsx engines + per-model config). Use when asked to migrate, convert, or port an existing cookbook page — NOT for brand-new models (use cookbook-add-model for those). Run with /cookbook-migrate-model <Model page name, e.g. GLM-5.1>.

Run Skill in Manus

Stars29,023

Forks6,534

UpdatedJune 13, 2026 at 18:46

Source

sgl-project

sgl-project/sglang

View GitHub Repository View Creator Repositories

Install command

Download

Run Skill in Manus

Useful forSOC

Software DevelopersComputer and Mathematical Occupations15-1252L4

File Explorer

2 files

SKILL.md

readonly

Cookbook Migrate Model

Convert one legacy cookbook page to the config-driven format, faithfully. The legacy page — its generator widget and its measured benchmark blocks — is the single source of truth. You are transcribing it into the new data model, not improving it.

Reuses the cookbook-add-model skill's assets (read them on demand):

../cookbook-add-model/templates/config.jsx.tmpl, page.mdx.tmpl, benchmarks.jsx.tmpl
../cookbook-add-model/references/authoring-reference.md (config/cells/playground contract)
../cookbook-add-model/references/mintlify-authoring.md (MDX rules)

Migration-specific references in this skill:

references/dimension-mapping.md — legacy-control → new-dimension mapping rules, command rewrite table, per-family strategy sets, and the Qwen3.5 pilot as a worked example (PR #27848).

The round's per-model inventory (scope, batch order, quirks, measured-data survey) is tracked by the migration maintainer outside the repo — expect it in your dispatch prompt, or ask for it.

Hard rules (non-negotiable)

Never modernize. Env vars, flags, TP values, docker tags, version strings are copied verbatim from the legacy page — even when today's defaults differ (e.g. SGLANG_ENABLE_SPEC_V2=1 is now default; keep it anyway). The recipe that was verified is the recipe as written. Allowed normalizations are ONLY the five alias rewrites in dimension-mapping.md §2 (launch_server→sglang serve, --model→--model-path, --tp-size→--tp, abbreviated --speculative-algo→--speculative-algorithm, --expert-parallel-size→--ep). Accuracy-degrading flags (--kv-cache-dtype fp8_e4m3, W4A4-style runtime quant) follow a deterministic rule — enforced in migration, no asking: offered as a legacy selectable option → never select it (cells mirror the accuracy-safe side), and the option survives as a Playground axis — an existing one where it fits, else add one via a separate prior engine PR (rule 4); a legacy choice never degrades to a tips mention. Baked into the recipe's default/unconditional command → keep it verbatim (the recipe was measured with it, and fp8 KV halves KV memory — stripping could OOM it). See dimension-mapping.md §2 caveats.
Never invent versions or numbers. Benchmark numbers only from the legacy page's measured blocks. Speed measurements migrate ONLY when the legacy page pins an exact, reproducible build (a release tag or a commit hash) — drifting strings like "main branch" are no version anchor: drop the speed numbers AND the entry's sglang_version, keep accuracy (far less build-sensitive) and benchmarkCommands so ⚡Reproduce guides re-measurement against a pinned release; confirm ambiguous strings (e.g. "0.5.8+") with the maintainer. When kept, sglang_version is the legacy page's string verbatim. Docker tags only the ones the legacy page pinned (unmapped hw falls back to :dev).
Verified policy (strictest tier). verified: true ONLY when (a) the legacy page has concrete measured data for that exact 5-dim combo AND (b) the cell's flags equal the deployment command used for that measurement — modulo {{HOST_IP}}/{{PORT}}, the five alias rewrites, and parser flags: --reasoning-parser/--tool-call-parser are stripped from every cell (Playground-only feature; when the measured run had them on, say so in the benchmarks file header). When the measured command diverges from the generator default, the verified cell follows the measured command; the generator default stays as the sibling strategy/cell or a tips note. Everything else is unverified (yellow) — including combos that look memory-infeasible; keep them verbatim and list them in the PR body for the re-verification track.
Engines are read-only. _deployment.jsx / _playground.jsx must not change in a migration PR. Model-specific features are config DATA consumed by generic axis handlers (MegaMoE precedent), so they need NO engine change — only a genuinely new control shape does, and then as a one-time generic primitive (never a model-named handler) on a separate prior PR (engine-axis.md). KV Cache DType / mamba-cache select trigger this once for the whole round; afterwards both are config.
github.cookbookModel must be set (<hf-org>/<page-slug>, e.g. qwen/qwen3.5) and the block never pruned — without it Submit ↗ mislabels as deepseek-v4. The issue template itself needs NO edits (free-form input).
Strategy tiers are signal-driven. A cell goes under low-latency / high-throughput ONLY on a signal present in the legacy source (an explicit performance toggle, a named recipe, or prose stating the operating point); no signal → balanced. Never derive a slant from your own hardware intuition — re-tiering on measured evidence is the hardware owner's follow-up PR, not part of a migration (dimension-mapping.md §4).

Workflow (one model = one PR)

1. Inventory the legacy assets

Read the legacy generator (docs_new/src/snippets/autoregressive/<slug>-deployment.jsx) end-to-end: every option dimension (radio vs checkbox vs dynamic), every gate/SUPPORT matrix, the full emitted command per reachable combo (env prefixes, # Error pseudo-commands included).
Read the legacy MDX: §2 install docker tags → dockerImages (pinned, not upgraded); §3.2 tips → new §2; §4 invocation examples → new §3 (keep real Output Examples verbatim); §5 benchmark blocks → transcribe each measured block: deploy command used, bench command (dataset/isl/osl/num-prompts/ concurrency), Mean TTFT/TPOT, output tok/s, hardware, version string.
Inbound-anchor sweep: grep -rn "<PageName>" docs_new/ --include='*.mdx' — find links/#fragments into this page (mint broken-links does NOT check fragments). Fix referrers or add <a id="old-anchor" /> shims in the same PR.
Check the maintainer-provided inventory notes for this model's known quirks — but treat them (and the §4 family table) as a survey snapshot: re-verify every dimension against the live legacy files before mapping. Pages keep receiving updates (precedent: Kimi-K2.6's live generator has a speculative toggle the 2026-06-10 survey notes lack).

2. Design the 5-dim mapping

Apply references/dimension-mapping.md. Key decision — which legacy toggle becomes the strategies dimension: a toggle that changes other parts of the command (TP, mem) must (the Playground can't do coupled changes), and so does a toggle the legacy page itself labels with operating-point words — e.g. a dpattention radio whose options are subtitled "Low Latency" / "High Throughput" (GLM-5.1 / Kimi-K2.6 pattern) — even when its flags are uncoupled (--dp N --enable-dp-attention is a pure flag add). Any other toggle that only adds/removes its own flags becomes a Playground axis with the flags baked into cells when the legacy default was ON — EXCEPT parsers: --reasoning-parser/--tool-call-parser are NEVER baked into cells, they are Playground-only (DSv4 convention). Every legacy control survives as an interactive control — a dimension or a Playground axis, never a tips-only mention — and a model-specific control is config data, not engine code (MegaMoE W4A4 is all DSv4 config on the existing moe axis). It's pure config whenever it fits an existing axis's data schema. Only a genuinely new shape (Nemotron3's "KV Cache DType" — a titled single-select stripping a flag family no axis manages) needs the engine, and then as a ONE-TIME generic config-parameterized primitive (never a model-named handler) on a separate PRIOR engine PR, keeping the migration PR data-only (hard rule 4, engine-axis.md). The strategy count follows the page's operating points: one recipe → a single balanced; two → low-latency + high-throughput; three → the full trio (the ideal). The tiers apply per (hw × variant × quant) combination — a single-recipe combination on a multi-strategy page parks under its semantically honest tier (no latency/throughput slant → balanced; the page's list is the union). When the legacy toggle is MTP / speculative decoding, the direction is a deterministic default — apply without asking: MTP on → low-latency, MTP off → high-throughput (reversed only with maintainer confirmation). Tier placement is signal-driven (hard rule 6). Never invent a recipe just to fill strategy chips (see dimension-mapping.md §4). Record the outcome as a strategy mapping table for the PR body — one row per group of combinations sharing the same legacy signal (e.g. "all GPU combos: MTP toggle → low-latency / high-throughput"; "xeon: (none) → balanced"), with a one-line rationale each; don't enumerate 60 identical rows. The table is what hardware owners sign off on at review.

3. Generate the config (codegen, then audit)

For >~30 cells, port the legacy generateCommand() into a throwaway Node script that enumerates combos and emits the cells:[...] literal (output must stay a pure literal — Mintlify forbids runtime spreads/calls). Apply the verified-cell override in the script. See the pilot scripts embedded in the worked example of dimension-mapping.md §5.
Independent equivalence audit (required): extract the ORIGINAL generator from git (git show main:<path> — NOT HEAD:, the migration branch deletes the file, see dimension-mapping.md §5 item 7), stub React hooks, run it for every combo, and diff token-by-token against the new cells. Expected deltas only: the appended --host {{HOST_IP}}/--port {{PORT}}, the engine-injected multi-node trio, the §2 alias rewrites (the entrypoint rewrite doesn't appear in cell tokens — cells hold flags only; the audit script normalizes it on the legacy side), and the intentional verified-cell override. Paste the PASS count + the audit script in the PR body (collapsed <details>).
Hand-author the non-cells fields per authoring-reference.md. Structural self-checks: every cell resolves a modelNames key; no --nnodes/--node-rank/ --dist-init-addr/--host/--port literals; every {{KEY}} declared; every supportedHardware id has ≥1 cell.

4. Benchmarks file

One entry per measured block only (cells without entries already render "pending" — bare {match} stubs are unnecessary). tokens_per_sec_per_gpu = output tok/s ÷ (tp × nnodes); TTFT/TPOT take the Mean rows; put the workload's num_prompts into workload. config.accuracyLabels is required whenever the benchmarks carry accuracy data — the engine ships no default eval set (#27842), so missing labels means the accuracy rows silently don't render; extra context (sample counts, suites that don't fit) goes in the entry's notes. Zero-measured-data pages: skip the file and the benchmarks prop entirely, but keep benchmarkCommands so ⚡Reproduce still guides users.

5. Rewrite the MDX

From page.mdx.tmpl: keep the original title (nav identity), write a fresh SEO description (top-level — delete any legacy metatags.description), no tag: NEW (a migration is not a launch), no mode:. Install accordion carries the legacy install content + pinned images. Keep the template's DSv4-style strategy bullets — serving semantics first (single-user chat / typical multi-user / batch throughput), trimmed to the strategies the page ships, plus at most a one-line note on what each strategy changes on this model; do NOT rewrite them as toggle-/migration-centric explanations. Legacy §5 benchmark prose is deleted (numbers → benchmark card, commands → ⚡Reproduce); legacy prose deploy commands are deleted (doc↔config parity — fold their unique flags into §2 tips). Invocation examples + real outputs carry over verbatim, but wrapped in Accordions — §3 commands and outputs are collapsible (required, DeepSeek-V4 pattern): code in an <Accordion title="… (Python)">, output in a following <Accordion title="Example Output">; legacy pages kept them inline.

6. Delete the legacy generator

Remove docs_new/src/snippets/autoregressive/<slug>-deployment.jsx and its import. grep -rn "<slug>-deployment" docs_new/ must return nothing (config provenance comments must not name the deleted path). Site wiring needs no changes: docs.json path/title unchanged, vendor card + logo already exist.

7. Validate

grep -rn '__[A-Z_]*__' on the new files (no template tokens).
cd docs_new && mint validate && mint broken-links (pre-existing breaks on main are not yours — say so in the PR).
mint dev browser smoke: initial selection = the verified cell (first in cells[]) with green badge; multi-node cells show the injected trio + header; AMD cells show env prefixes; Docker mode wraps with the pinned image and passes cell env as --env; condition-hidden combos grey out; benchmark card values; NO parser flags in any Deploy command; Playground parser toggles ADD the parser flags (green additions) while spec toggles strike the baked spec flags (red); Submit ↗ prefills this model. Probe pitfall: drive at most ONE programmatic click per evaluation and wait for React to settle — multiple clicks in one synchronous script batch and read stale DOM.
Token-level audit from step 3 passes.

8. PR + review

One PR per model. PR body: migration framing, verified policy applied, the strategy mapping table (step 2), the audit PASS count + script, any inherited-infeasible combos flagged for re-verification. Then run /cookbook-review-pr <N> and fix findings. FYI: docs previews only build for in-repo (sgl-project/sglang) branches — a fork-headed PR is perfectly fine but renders no preview; a maintainer can re-push the branch in-repo if a preview is wanted for review.

9. Keep this skill current

Any new convention, engine behavior, or pitfall you discover while migrating (naming decisions, audit-script gotchas, review-rule conflicts, …) MUST be fed back into this skill — same PR if it's skill-file-only, or an immediate follow-up commit on the skill's branch/PR. The next agent runs on what's written here, not on your session's context.

cookbook-migrate-model

More from this repository

Cookbook Migrate Model

Hard rules (non-negotiable)

Workflow (one model = one PR)

1. Inventory the legacy assets

2. Design the 5-dim mapping

3. Generate the config (codegen, then audit)

4. Benchmarks file

5. Rewrite the MDX

6. Delete the legacy generator

7. Validate

8. PR + review

9. Keep this skill current

Cookbook Migrate Model

Hard rules (non-negotiable)

Workflow (one model = one PR)

1. Inventory the legacy assets

2. Design the 5-dim mapping

3. Generate the config (codegen, then audit)

4. Benchmarks file

5. Rewrite the MDX

6. Delete the legacy generator

7. Validate

8. PR + review

9. Keep this skill current

More from this repository