ワンクリックでManusで任意のスキルを実行

$pwd:

jetson-model-catalog-inference

Name: Jetson Model Catalog Inference
Author: NVIDIA-AI-IOT

// Configure Jetson AI Lab model pages — frontmatter for the Model Details sidebar (parameters, modalities, context length, license, precision pills), the Run on Jetson serve/call/one-shot UI, the three "block" levers (hide_run_button, matrix_modules_disabled, per-engine modules_supported), and the join to src/data/benchmarks.json (benchmark_key, benchmark_series, OOM vs DNR vs null). Use when adding or editing model pages, inference engines, serve/run commands, one-shot blocks, the details sidebar, benchmark wiring, or deciding whether to mark a Jetson module as OOM, blocked, or simply untested.

Manusで実行

$ git log --oneline --stat

stars:177

forks:49

updated:2026年5月8日 18:28

SKILL.md

readonly

related-skills.json

同じリポジトリ

jetson-ai-lab-verify-build.md

from "NVIDIA-AI-IOT/jetson-ai-lab"

Run npm run build on jetson-ai-lab and triage Astro content collection / MDX / model frontmatter / benchmarks.json failures after edits. Use when verifying the site builds after content, model, tutorial, benchmark, or asset changes.

2026-05-12177

jetson-ai-lab-code-samples.md

from "NVIDIA-AI-IOT/jetson-ai-lab"

Place downloadable tutorial scripts under public/code-samples/ and link them from tutorial pages. Use when adding, moving, or linking downloadable scripts, assets, or code files for Jetson AI Lab tutorials.

2026-04-14177

package.json

"author": "NVIDIA-AI-IOT"

"repository": "NVIDIA-AI-IOT/jetson-ai-lab"

GitHub リポジトリを開く Creator のリポジトリを見る

$ install --global

$ download --local

Manusで実行

$ useful --forSOC

ウェブ開発者コンピュータ・数学職15-1254L4

ワンクリックで任意のスキルを実行

name	jetson-model-catalog-inference
description	Configure Jetson AI Lab model pages — frontmatter for the Model Details sidebar (parameters, modalities, context length, license, precision pills), the Run on Jetson serve/call/one-shot UI, the three "block" levers (hide_run_button, matrix_modules_disabled, per-engine modules_supported), and the join to src/data/benchmarks.json (benchmark_key, benchmark_series, OOM vs DNR vs null). Use when adding or editing model pages, inference engines, serve/run commands, one-shot blocks, the details sidebar, benchmark wiring, or deciding whether to mark a Jetson module as OOM, blocked, or simply untested.
license	Apache-2.0
compatibility	Requires Node.js and npm for build verification of model schema changes.

Jetson AI Lab: model catalog page configuration

Reality check

Jetson AI Lab is a static Astro site. It does not host inference APIs. Users copy commands from the site and run them on a Jetson (or other machine).
Model pages are Markdown files in the Astro content collection models: src/content/models/<slug>.md.
The file slug (filename without .md) becomes the URL segment: published route /models/<slug> (see src/pages/models/[slug].astro).
Almost every visible element on a model page is driven by YAML frontmatter in that file (Hero, Model Details sidebar, Run on Jetson tabs, Call snippets, one-shot tabs, Benchmark chart). Body Markdown only fills the bottom Model Details prose section.

Do not invent a base URL for "the Jetson AI Lab inference API" — there isn't one.

Page anatomy (top-to-bottom)

Section	Driven by	Component / file
Hero (title, badges, "Command to Run on Jetson" / "Benchmark" / "Model Details" chips)	`title`, `is_new`, `type`, `short_description`, presence of engines and `benchmark_key`	`src/pages/models/[slug].astro`
Specs sidebar (Parameters, Modalities, Context Length, License, Precision pills)	`parameters` (or `model_size`), `modalities` (or `vision_capable`), `context_length`, `license`, `precision` (per-token)	`src/pages/models/[slug].astro` L137–193
Run on Jetson → Serve matrix	`serving.entries` (preferred) or `supported_inference_engines`, plus `matrix_modules_disabled`	`ModelServeSection.astro`
Call the model	`client_call.{shell,python,intro}` (or auto-built from engines + `hf_checkpoint`)	`ModelCallSection.astro`
One-shot inference (per-module tabs)	`one_shot_inference.*`	`ModelOneShotSection.astro` + `src/lib/evalRunModal.ts`
Benchmark chart	`benchmark_key` (+ optional `benchmark_series`) → joins `src/data/benchmarks.json`	`ModelBenchmarkSection.astro`
Model Details prose + external links (`Try on build.nvidia.com`, `View on HuggingFace`)	Body Markdown + `build_nvidia_url`, `huggingface_url` (or `hf_checkpoint`)	`src/pages/models/[slug].astro` L242–289

Frontmatter reference

Every model page is driven by its YAML frontmatter. Validation lives in src/content/config.ts (modelsSchema).

Always-include tags

Either schema-required (build fails without them) or used by every model in the repo. Include all when adding a new model.

Tag	Schema	What it does
`title`	required	Display name on the model page and catalog cards.
`short_description`	required	One-line summary in the catalog grid and Hero subtitle.
`precision`	required	Slash-or-comma separated quant tokens (e.g. `"NVFP4 / FP8 / Q4_K_M GGUF"`). Each token renders as a colored pill in the sidebar (`precisionPill` in `[slug].astro`).
`family`	optional but ubiquitous	Groups models in the catalog. Special value `"Google Gemma4"` triggers the Gemma 4 tutorial CTA banner above the prose.
`model_id`, `icon`, `is_new`, `order`, `type`, `hf_checkpoint`, `huggingface_url`, `minimum_jetson`	optional but ubiquitous	Catalog card / hero / external links. `huggingface_url` falls back to `https://huggingface.co/${hf_checkpoint}` when omitted.

Model Details sidebar tags (drive the dark spec card on the right of the Hero)

Skipping these causes the row to disappear from the sidebar (no error, just less polish).

Tag	Sidebar row	Notes
`parameters` (or fallback to `model_size`)	"Parameters"	Free text, e.g. `"~9B"` or `"30B total / 3B active"`.
`modalities: [...]`	"Modalities"	Array of `"Text" \| "Image" \| "Audio" \| "Video"` — each renders with an icon. If omitted, falls back to `vision_capable` (Yes → "Text, Image", No → "Text").
`vision_capable` (boolean)	(fallback for Modalities)	Set explicitly even when `modalities` is present — used elsewhere for catalog filters.
`context_length`	"Context Length"	Free text, e.g. `"128K"`, `"262K"`.
`license`	"License"	Free text, e.g. `"Apache 2.0"`, `"NVIDIA Open Model License"`.
`model_size`	(fallback for Parameters)	e.g. `"12GB"`.
`memory_requirements`	not shown in sidebar but used by other UI	e.g. `"16GB RAM"`.

Situational tags

Only add these when the model needs them.

Tag	When to use	Repo usage
`build_nvidia_url`	Model has an NVIDIA Build page → adds the Try on build.nvidia.com button.	~67% of models
`matrix_modules_disabled`	Some matrix module tabs should be grayed out / non-clickable for every engine of this model.	~10% of models
`hide_run_button`	Hide the entire Run on Jetson UI (Serve + Call + one-shot) even when engine data is present.	~5% of models
`benchmark_key`	Model has measured perf in `src/data/benchmarks.json`. Value must equal the `name` field of that benchmarks entry.	~30% of models
`benchmark_series: [...]`	Show extra side-by-side comparison bars (e.g. compare against the previous generation of the same family). Each value is another `name` from `benchmarks.json`.	A handful

Inference blocks (at least one required unless `hide_run_button: true`)

Block	Purpose
`serving.entries`	New-style engine list — preferred for new models. Overrides `supported_inference_engines` in the UI when non-empty.
`supported_inference_engines`	Legacy engine array. Still works and is the most common shape today.
`one_shot_inference`	One-shot run commands (shell/python/per-module variants). Renders a separate "One-shot inference" tabbed section below Call.
`client_call.{shell,python,intro}`	Optional override for the Call the model snippet. If omitted, the page auto-builds a curl example from engine commands and `hf_checkpoint`.

Run on Jetson: the three "block" levers (when to OOM vs hide vs gray)

These are layered. The serve matrix evaluates them in this order:

hide_run_button: true (frontmatter, top-level)
- Hides the entire Run on Jetson section (Serve + Call + one-shot).
- Use when the model exists in the catalog but is not yet launchable from the site (e.g. requires NGC download, gated checkpoint, pending validation).
- Example: src/content/models/cosmos-reason2-8b.md — keeps engine data for documentation, but the run UI stays hidden.
matrix_modules_disabled: [moduleId, ...] (frontmatter, top-level)
- Grays out the listed module tabs across all engines for this model. Tab is shown but not clickable.
- Use when a Jetson module is theoretically possible but not validated yet for this model.
engines[].modules_supported: [moduleId, ...] (per-engine, inside serving.entries[] or supported_inference_engines[])
- Allowlist: only listed module ids are enabled for that engine. Cells outside the list show grayed for that engine row.
- Omit the field entirely → engine supports all matrix modules. An empty array means the engine supports nothing (rare).
- Use when the model runs everywhere but a specific engine only works on a subset (e.g. vLLM Thor-only, Ollama Orin+Thor).

OOM is a different concept and lives in src/data/benchmarks.json — see Benchmarks below.

You want to express…	Use…
"This model isn't ready to be run from the site at all"	`hide_run_button: true`
"This model can't run on Orin Nano (any engine)"	`matrix_modules_disabled: [orin_nano_8]`
"vLLM only works on Thor for this model; llama.cpp works everywhere"	`vLLM` entry: `modules_supported: [thor_t5000, thor_t4000]`; omit on `llama.cpp`
"We benchmarked this and it ran out of memory on T4000"	Add `"oom": ["t4000"]` to that quant in `benchmarks.json` (does not belong in model frontmatter)
"We benchmarked this and the model just doesn't run on Orin Nano"	Add `"dnr": ["orin_nano_8gb"]` to that quant in `benchmarks.json`
"We haven't measured this combination yet"	Set the value to `null` in `benchmarks.json`. No marker is drawn; the bar is just absent.

Inference data: serve matrix (engines)

Engine list is built by getSupportedInferenceEngines in src/lib/modelFrontmatterAdapter.ts:

If serving.entries is present and non-empty → those entries are used (preferred for new work).
Otherwise supported_inference_engines (legacy array) is used.

Each engine row maps to SupportedEngineEntry in src/lib/inferenceCommands.ts. Prefer:

serve_command_orin / serve_command_thor for long-running serve commands (vLLM, Ollama, etc.).
Legacy run_command_orin / run_command_thor are deprecated but still accepted as fallbacks.

Optional fields:

modules_supported — see block lever #3 above.
run_commands_by_module: per-cell override map { moduleId: command } for Run matrix cells.
install_command / install_command_orin / install_command_thor: prefixed as # Installation block above the serve command.

Engine display order is fixed by sortEnginesForUi in modelFrontmatterAdapter.ts: vLLM → SGLang → llama.cpp → Ollama → Edge-LLM → everything else (alphabetical). Don't try to control order from frontmatter.

Canonical Jetson module ids

Defined in JETSON_MATRIX_MODULES in src/lib/inferenceCommands.ts. Use these exact ids in modules_supported / matrix_modules_disabled / one_shot_inference.modules_supported.

Module id	Matrix tab?	Platform
`thor_t5000`	Yes	thor
`thor_t4000`	Yes	thor
`orin_agx_64`	Yes	orin
`orin_nx_16`	Yes	orin
`orin_nano_8`	Yes	orin
`orin_agx_32`	data-only (no tab)	orin
`orin_nx_8`	data-only (no tab)	orin
`orin_nano_4`	data-only (no tab)	orin

"Data-only" ids (showInMatrixUi: false) are valid frontmatter and resolve commands by platformKey, but no Serve/Run matrix tab is shown. Don't invent new ids — extending the list requires a code change in JETSON_MATRIX_MODULES.

Examples in repo:

serving.entries (new style): src/content/models/ministral3-8b-reasoning.md
Legacy supported_inference_engines with rich per-engine modules_supported: src/content/models/nemotron-3-nano-omni.md
hide_run_button: true: src/content/models/cosmos-reason2-8b.md

One-shot inference

Optional frontmatter block one_shot_inference (schema in src/content/config.ts). Fields:

run_command_orin, run_command_thor — single-command run per platform.
shell, python — non-platform-specific snippets.
shell_by_module, python_by_module — per-module override maps keyed by canonical module id.
modules_supported — restrict which module tabs appear.
intro — optional Markdown intro shown above the tabs.

Visibility: ModelOneShotSection.astro shows the section when at least one of the above command fields is non-empty (hasContent check).

Snippet/module resolution lives in src/lib/evalRunModal.ts (see evalSnippetByModule, evalMatrixModuleSpecs).

Call the model (`client_call`)

Optional frontmatter block — overrides the auto-generated curl example shown in Call the model.

client_call:
  intro: "Once the server is up:"
  shell: |-
    curl -s http://${JETSON_HOST}:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{ "model": "my_model", "messages": [{"role":"user","content":"hi"}] }'
  python: |-
    import openai
    ...

Omit client_call entirely to let ModelCallSection.astro synthesize an example from hf_checkpoint and the first engine's serve command.

Benchmarks data

Benchmark numbers do not live in model frontmatter. They live in src/data/benchmarks.json under models[]. The model page joins via the frontmatter field benchmark_key, which must match models[].name exactly.

Adding benchmark data

Add an entry to models[] in src/data/benchmarks.json:

{
  "family": "Family Name",
  "name": "Exact Name (this is the join key)",
  "releaseDate": "YYYY-MM-DD",
  "capabilities": { "language": true, "vision": false, "audio": false, "vla": false },
  "isl": 2048,
  "osl": 128,
  "engines": {
    "vLLM": {
      "NVFP4": {
        "concurrency1": { "t5000": 63.3, "t4000": 51.8, "agx_orin_64gb": null, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": 59.6 },
        "concurrency8": { "t5000": 181.2, "t4000": 147.4, "agx_orin_64gb": null, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": 191.2 }
      },
      "FP8": {
        "concurrency1": { "t5000": 54.7, "t4000": null, "agx_orin_64gb": null, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": 52.3 },
        "concurrency8": { "t5000": 129.8, "t4000": null, "agx_orin_64gb": null, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": 138.6 },
        "oom": ["t4000"]
      }
    },
    "Ollama": {
      "Q4_K_M": {
        "concurrency1": { "t5000": 27, "t4000": null, "agx_orin_64gb": 17, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": null },
        "concurrency8": { "t5000": 28, "t4000": null, "agx_orin_64gb": 17, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": null },
        "oom": ["orin_nx_16gb", "orin_nano_8gb"]
      }
    }
  }
}

In the model frontmatter add:

benchmark_key: "Exact Name (this is the join key)"
benchmark_series:
  - "Predecessor Family Name"   # optional — overlays comparison bars

Benchmark product ids ≠ matrix module ids (critical gotcha)

The benchmark axis uses different ids than the Run/Serve matrix:

Hardware	Matrix module id (frontmatter)	Benchmark product id (`benchmarks.json`)
Jetson Thor T5000	`thor_t5000`	`t5000`
Jetson Thor T4000	`thor_t4000`	`t4000`
Jetson AGX Orin 64GB	`orin_agx_64`	`agx_orin_64gb`
Jetson Orin NX 16GB	`orin_nx_16`	`orin_nx_16gb`
Jetson Orin Nano 8GB	`orin_nano_8`	`orin_nano_8gb`
DGX Spark GB10	(not in matrix)	`dgx_spark`

Use the right ids for the right file. Don't paste matrix ids into benchmarks.json or vice versa.

`oom` vs `dnr` vs `null` (per-quant markers)

Inside any engines[engine][quant] you may add either array; both expect benchmark product ids:

Marker	Frontmatter shape	UI rendering	Use when
Numeric value	`"t5000": 63.3`	Real bar at that value	We measured throughput.
`null` value	`"t5000": null`	No marker, no bar (label grayed)	We haven't measured this combo yet.
`"oom": [productId, ...]`	Sibling of `concurrency1` / `concurrency8`	Small green-tinted OOM pill in place of the bar	We attempted the run and it OOM'd. Soft failure — keep visible to communicate "tried, ran out of memory".
`"dnr": [productId, ...]`	Sibling of `concurrency1` / `concurrency8`	Red ✕ in place of the bar	Hard failure — model genuinely cannot run on this hardware (architectural / driver / unsupported precision).

Rule of thumb: OOM = "the box was the wrong size", DNR = "the box was the wrong shape", null = "we didn't try the box".

OOM and DNR are read by ModelBenchmarkSection.astro (lines ~494–516). They have no effect on the Serve / Run matrix above — that only respects hide_run_button / matrix_modules_disabled / per-engine modules_supported.

Comparison series (`benchmark_series`)

Each value in the array must be another models[].name from benchmarks.json. The page page renders them as paler bars next to the main model and auto-links them to the corresponding /models/<slug>#model-benchmark page when one exists.

Discoverability artifacts (`/llms.txt`, `/llms-full.txt`)

Two text routes surface the model catalog to external clients (search agents, LLMs, plain curl consumers). Both are auto-generated from the content collection at build time — adding a model page is enough; no manual index edit required.

Route	Source file	What it contains
`/llms.txt`	`src/pages/llms.txt.ts`	Short index. Curated tutorial header (static template literal) + auto-generated Supported Models section grouped by vendor. Models grouped by `family` per `MODEL_SECTIONS`, sorted by frontmatter `order`.
`/llms-full.txt`	`src/pages/llms-full.txt.ts`	Full long-form index. Iterates `getCollection('tutorials')` and `getCollection('models')`, emits frontmatter + full Markdown body for every page.

When you add a new model:

Set family to one of the existing families listed in MODEL_SECTIONS (in src/pages/llms.txt.ts). The model lands in the right section automatically.
Set short_description clearly — it is the line text rendered in /llms.txt.
Set order if you want a specific position within the family (lower = higher in the list).

When you introduce a brand-new vendor / family (no existing entry matches): edit MODEL_SECTIONS in src/pages/llms.txt.ts to add a new section and list its family values. Without this edit the new model still appears, but in a fallback ### Other Models section at the bottom.

There is no static public/llms.txt — the dynamic route lives at src/pages/llms.txt.ts. Don't reintroduce a static file; Astro would let it shadow the route silently.

Validation (build will fail if wrong)

src/content/config.ts superRefine on the models collection (skipped when hide_run_button: true):

You must provide at least one of:

Non-empty serving.entries, or
Non-empty supported_inference_engines, or
one_shot_inference with at least one trimmed non-empty command among run_command_orin, run_command_thor, shell, python, or values inside shell_by_module / python_by_module.

Otherwise Zod reports a custom error on supported_inference_engines:

Provide supported_inference_engines and/or serving.entries (non-empty), one_shot_inference commands, or set hide_run_button: true.

Checklist before you finish

Required model fields present (title, short_description, precision).
Sidebar fields filled where applicable (parameters, modalities or vision_capable, context_length, license, memory_requirements).
Either serving.entries or supported_inference_engines or meaningful one_shot_inference, or hide_run_button: true.
serve_command_* preferred over deprecated run_command_* for serve flows.
All matrix ids in modules_supported / matrix_modules_disabled / one_shot_inference.modules_supported are from JETSON_MATRIX_MODULES.
If benchmark_key is set, it exactly matches a models[].name in src/data/benchmarks.json. All product ids in concurrency* / oom / dnr arrays are valid benchmark product ids (not matrix ids).
OOM vs DNR vs null chosen per the marker semantics above.
family is set to a value listed in MODEL_SECTIONS (src/pages/llms.txt.ts). If the family is brand-new (new vendor), add a section to MODEL_SECTIONS so /llms.txt groups it correctly.
Run npm run build (see skill jetson-ai-lab-verify-build).

Source of truth (repo files)

Purpose	Path
Model schema + Zod validation	`src/content/config.ts` (`modelsSchema`, `superRefine` ~L144–165)
Frontmatter → engine list adapter	`src/lib/modelFrontmatterAdapter.ts` (`getSupportedInferenceEngines`, `servingEntryToSupportedEngine`, `sortEnginesForUi`)
Engine types, matrix module ids, serve/run helpers	`src/lib/inferenceCommands.ts` (`SupportedEngineEntry`, `JETSON_MATRIX_MODULES`, `JETSON_MATRIX_TAB_MODULE_IDS`, matrix helpers)
Model page layout (Hero, sidebar, sections)	`src/pages/models/[slug].astro`
Serve matrix UI	`src/components/ModelServeSection.astro`
Call / client examples	`src/components/ModelCallSection.astro`
One-shot UI	`src/components/ModelOneShotSection.astro`
One-shot snippet logic	`src/lib/evalRunModal.ts`
Benchmark chart UI (OOM / DNR rendering)	`src/components/ModelBenchmarkSection.astro`
Benchmark data file	`src/data/benchmarks.json`
`/llms.txt` short index (auto-generated, vendor-grouped)	`src/pages/llms.txt.ts` (edit `MODEL_SECTIONS` for new vendors)
`/llms-full.txt` long-form index (auto-generated)	`src/pages/llms-full.txt.ts`

name	jetson-model-catalog-inference
description	Configure Jetson AI Lab model pages — frontmatter for the Model Details sidebar (parameters, modalities, context length, license, precision pills), the Run on Jetson serve/call/one-shot UI, the three "block" levers (hide_run_button, matrix_modules_disabled, per-engine modules_supported), and the join to src/data/benchmarks.json (benchmark_key, benchmark_series, OOM vs DNR vs null). Use when adding or editing model pages, inference engines, serve/run commands, one-shot blocks, the details sidebar, benchmark wiring, or deciding whether to mark a Jetson module as OOM, blocked, or simply untested.
license	Apache-2.0
compatibility	Requires Node.js and npm for build verification of model schema changes.

Jetson AI Lab: model catalog page configuration

Reality check

Jetson AI Lab is a static Astro site. It does not host inference APIs. Users copy commands from the site and run them on a Jetson (or other machine).
Model pages are Markdown files in the Astro content collection models: src/content/models/<slug>.md.
The file slug (filename without .md) becomes the URL segment: published route /models/<slug> (see src/pages/models/[slug].astro).
Almost every visible element on a model page is driven by YAML frontmatter in that file (Hero, Model Details sidebar, Run on Jetson tabs, Call snippets, one-shot tabs, Benchmark chart). Body Markdown only fills the bottom Model Details prose section.

Do not invent a base URL for "the Jetson AI Lab inference API" — there isn't one.

Page anatomy (top-to-bottom)

Section	Driven by	Component / file
Hero (title, badges, "Command to Run on Jetson" / "Benchmark" / "Model Details" chips)	`title`, `is_new`, `type`, `short_description`, presence of engines and `benchmark_key`	`src/pages/models/[slug].astro`
Specs sidebar (Parameters, Modalities, Context Length, License, Precision pills)	`parameters` (or `model_size`), `modalities` (or `vision_capable`), `context_length`, `license`, `precision` (per-token)	`src/pages/models/[slug].astro` L137–193
Run on Jetson → Serve matrix	`serving.entries` (preferred) or `supported_inference_engines`, plus `matrix_modules_disabled`	`ModelServeSection.astro`
Call the model	`client_call.{shell,python,intro}` (or auto-built from engines + `hf_checkpoint`)	`ModelCallSection.astro`
One-shot inference (per-module tabs)	`one_shot_inference.*`	`ModelOneShotSection.astro` + `src/lib/evalRunModal.ts`
Benchmark chart	`benchmark_key` (+ optional `benchmark_series`) → joins `src/data/benchmarks.json`	`ModelBenchmarkSection.astro`
Model Details prose + external links (`Try on build.nvidia.com`, `View on HuggingFace`)	Body Markdown + `build_nvidia_url`, `huggingface_url` (or `hf_checkpoint`)	`src/pages/models/[slug].astro` L242–289

Frontmatter reference

Every model page is driven by its YAML frontmatter. Validation lives in src/content/config.ts (modelsSchema).

Always-include tags

Either schema-required (build fails without them) or used by every model in the repo. Include all when adding a new model.

Tag	Schema	What it does
`title`	required	Display name on the model page and catalog cards.
`short_description`	required	One-line summary in the catalog grid and Hero subtitle.
`precision`	required	Slash-or-comma separated quant tokens (e.g. `"NVFP4 / FP8 / Q4_K_M GGUF"`). Each token renders as a colored pill in the sidebar (`precisionPill` in `[slug].astro`).
`family`	optional but ubiquitous	Groups models in the catalog. Special value `"Google Gemma4"` triggers the Gemma 4 tutorial CTA banner above the prose.
`model_id`, `icon`, `is_new`, `order`, `type`, `hf_checkpoint`, `huggingface_url`, `minimum_jetson`	optional but ubiquitous	Catalog card / hero / external links. `huggingface_url` falls back to `https://huggingface.co/${hf_checkpoint}` when omitted.

Model Details sidebar tags (drive the dark spec card on the right of the Hero)

Skipping these causes the row to disappear from the sidebar (no error, just less polish).

Tag	Sidebar row	Notes
`parameters` (or fallback to `model_size`)	"Parameters"	Free text, e.g. `"~9B"` or `"30B total / 3B active"`.
`modalities: [...]`	"Modalities"	Array of `"Text" \| "Image" \| "Audio" \| "Video"` — each renders with an icon. If omitted, falls back to `vision_capable` (Yes → "Text, Image", No → "Text").
`vision_capable` (boolean)	(fallback for Modalities)	Set explicitly even when `modalities` is present — used elsewhere for catalog filters.
`context_length`	"Context Length"	Free text, e.g. `"128K"`, `"262K"`.
`license`	"License"	Free text, e.g. `"Apache 2.0"`, `"NVIDIA Open Model License"`.
`model_size`	(fallback for Parameters)	e.g. `"12GB"`.
`memory_requirements`	not shown in sidebar but used by other UI	e.g. `"16GB RAM"`.

Situational tags

Only add these when the model needs them.

Tag	When to use	Repo usage
`build_nvidia_url`	Model has an NVIDIA Build page → adds the Try on build.nvidia.com button.	~67% of models
`matrix_modules_disabled`	Some matrix module tabs should be grayed out / non-clickable for every engine of this model.	~10% of models
`hide_run_button`	Hide the entire Run on Jetson UI (Serve + Call + one-shot) even when engine data is present.	~5% of models
`benchmark_key`	Model has measured perf in `src/data/benchmarks.json`. Value must equal the `name` field of that benchmarks entry.	~30% of models
`benchmark_series: [...]`	Show extra side-by-side comparison bars (e.g. compare against the previous generation of the same family). Each value is another `name` from `benchmarks.json`.	A handful

Inference blocks (at least one required unless `hide_run_button: true`)

Block	Purpose
`serving.entries`	New-style engine list — preferred for new models. Overrides `supported_inference_engines` in the UI when non-empty.
`supported_inference_engines`	Legacy engine array. Still works and is the most common shape today.
`one_shot_inference`	One-shot run commands (shell/python/per-module variants). Renders a separate "One-shot inference" tabbed section below Call.
`client_call.{shell,python,intro}`	Optional override for the Call the model snippet. If omitted, the page auto-builds a curl example from engine commands and `hf_checkpoint`.

Run on Jetson: the three "block" levers (when to OOM vs hide vs gray)

These are layered. The serve matrix evaluates them in this order:

hide_run_button: true (frontmatter, top-level)
- Hides the entire Run on Jetson section (Serve + Call + one-shot).
- Use when the model exists in the catalog but is not yet launchable from the site (e.g. requires NGC download, gated checkpoint, pending validation).
- Example: src/content/models/cosmos-reason2-8b.md — keeps engine data for documentation, but the run UI stays hidden.
matrix_modules_disabled: [moduleId, ...] (frontmatter, top-level)
- Grays out the listed module tabs across all engines for this model. Tab is shown but not clickable.
- Use when a Jetson module is theoretically possible but not validated yet for this model.
engines[].modules_supported: [moduleId, ...] (per-engine, inside serving.entries[] or supported_inference_engines[])
- Allowlist: only listed module ids are enabled for that engine. Cells outside the list show grayed for that engine row.
- Omit the field entirely → engine supports all matrix modules. An empty array means the engine supports nothing (rare).
- Use when the model runs everywhere but a specific engine only works on a subset (e.g. vLLM Thor-only, Ollama Orin+Thor).

OOM is a different concept and lives in src/data/benchmarks.json — see Benchmarks below.

You want to express…	Use…
"This model isn't ready to be run from the site at all"	`hide_run_button: true`
"This model can't run on Orin Nano (any engine)"	`matrix_modules_disabled: [orin_nano_8]`
"vLLM only works on Thor for this model; llama.cpp works everywhere"	`vLLM` entry: `modules_supported: [thor_t5000, thor_t4000]`; omit on `llama.cpp`
"We benchmarked this and it ran out of memory on T4000"	Add `"oom": ["t4000"]` to that quant in `benchmarks.json` (does not belong in model frontmatter)
"We benchmarked this and the model just doesn't run on Orin Nano"	Add `"dnr": ["orin_nano_8gb"]` to that quant in `benchmarks.json`
"We haven't measured this combination yet"	Set the value to `null` in `benchmarks.json`. No marker is drawn; the bar is just absent.

Inference data: serve matrix (engines)

Engine list is built by getSupportedInferenceEngines in src/lib/modelFrontmatterAdapter.ts:

If serving.entries is present and non-empty → those entries are used (preferred for new work).
Otherwise supported_inference_engines (legacy array) is used.

Each engine row maps to SupportedEngineEntry in src/lib/inferenceCommands.ts. Prefer:

serve_command_orin / serve_command_thor for long-running serve commands (vLLM, Ollama, etc.).
Legacy run_command_orin / run_command_thor are deprecated but still accepted as fallbacks.

Optional fields:

modules_supported — see block lever #3 above.
run_commands_by_module: per-cell override map { moduleId: command } for Run matrix cells.
install_command / install_command_orin / install_command_thor: prefixed as # Installation block above the serve command.

Canonical Jetson module ids

Defined in JETSON_MATRIX_MODULES in src/lib/inferenceCommands.ts. Use these exact ids in modules_supported / matrix_modules_disabled / one_shot_inference.modules_supported.

Module id	Matrix tab?	Platform
`thor_t5000`	Yes	thor
`thor_t4000`	Yes	thor
`orin_agx_64`	Yes	orin
`orin_nx_16`	Yes	orin
`orin_nano_8`	Yes	orin
`orin_agx_32`	data-only (no tab)	orin
`orin_nx_8`	data-only (no tab)	orin
`orin_nano_4`	data-only (no tab)	orin

Examples in repo:

serving.entries (new style): src/content/models/ministral3-8b-reasoning.md
Legacy supported_inference_engines with rich per-engine modules_supported: src/content/models/nemotron-3-nano-omni.md
hide_run_button: true: src/content/models/cosmos-reason2-8b.md

One-shot inference

Optional frontmatter block one_shot_inference (schema in src/content/config.ts). Fields:

run_command_orin, run_command_thor — single-command run per platform.
shell, python — non-platform-specific snippets.
shell_by_module, python_by_module — per-module override maps keyed by canonical module id.
modules_supported — restrict which module tabs appear.
intro — optional Markdown intro shown above the tabs.

Visibility: ModelOneShotSection.astro shows the section when at least one of the above command fields is non-empty (hasContent check).

Snippet/module resolution lives in src/lib/evalRunModal.ts (see evalSnippetByModule, evalMatrixModuleSpecs).

Call the model (`client_call`)

Optional frontmatter block — overrides the auto-generated curl example shown in Call the model.

client_call:
  intro: "Once the server is up:"
  shell: |-
    curl -s http://${JETSON_HOST}:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{ "model": "my_model", "messages": [{"role":"user","content":"hi"}] }'
  python: |-
    import openai
    ...

Omit client_call entirely to let ModelCallSection.astro synthesize an example from hf_checkpoint and the first engine's serve command.

Benchmarks data

Adding benchmark data

Add an entry to models[] in src/data/benchmarks.json:

{
  "family": "Family Name",
  "name": "Exact Name (this is the join key)",
  "releaseDate": "YYYY-MM-DD",
  "capabilities": { "language": true, "vision": false, "audio": false, "vla": false },
  "isl": 2048,
  "osl": 128,
  "engines": {
    "vLLM": {
      "NVFP4": {
        "concurrency1": { "t5000": 63.3, "t4000": 51.8, "agx_orin_64gb": null, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": 59.6 },
        "concurrency8": { "t5000": 181.2, "t4000": 147.4, "agx_orin_64gb": null, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": 191.2 }
      },
      "FP8": {
        "concurrency1": { "t5000": 54.7, "t4000": null, "agx_orin_64gb": null, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": 52.3 },
        "concurrency8": { "t5000": 129.8, "t4000": null, "agx_orin_64gb": null, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": 138.6 },
        "oom": ["t4000"]
      }
    },
    "Ollama": {
      "Q4_K_M": {
        "concurrency1": { "t5000": 27, "t4000": null, "agx_orin_64gb": 17, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": null },
        "concurrency8": { "t5000": 28, "t4000": null, "agx_orin_64gb": 17, "orin_nx_16gb": null, "orin_nano_8gb": null, "dgx_spark": null },
        "oom": ["orin_nx_16gb", "orin_nano_8gb"]
      }
    }
  }
}

In the model frontmatter add:

benchmark_key: "Exact Name (this is the join key)"
benchmark_series:
  - "Predecessor Family Name"   # optional — overlays comparison bars

Benchmark product ids ≠ matrix module ids (critical gotcha)

The benchmark axis uses different ids than the Run/Serve matrix:

Hardware	Matrix module id (frontmatter)	Benchmark product id (`benchmarks.json`)
Jetson Thor T5000	`thor_t5000`	`t5000`
Jetson Thor T4000	`thor_t4000`	`t4000`
Jetson AGX Orin 64GB	`orin_agx_64`	`agx_orin_64gb`
Jetson Orin NX 16GB	`orin_nx_16`	`orin_nx_16gb`
Jetson Orin Nano 8GB	`orin_nano_8`	`orin_nano_8gb`
DGX Spark GB10	(not in matrix)	`dgx_spark`

Use the right ids for the right file. Don't paste matrix ids into benchmarks.json or vice versa.

`oom` vs `dnr` vs `null` (per-quant markers)

Inside any engines[engine][quant] you may add either array; both expect benchmark product ids:

Marker	Frontmatter shape	UI rendering	Use when
Numeric value	`"t5000": 63.3`	Real bar at that value	We measured throughput.
`null` value	`"t5000": null`	No marker, no bar (label grayed)	We haven't measured this combo yet.
`"oom": [productId, ...]`	Sibling of `concurrency1` / `concurrency8`	Small green-tinted OOM pill in place of the bar	We attempted the run and it OOM'd. Soft failure — keep visible to communicate "tried, ran out of memory".
`"dnr": [productId, ...]`	Sibling of `concurrency1` / `concurrency8`	Red ✕ in place of the bar	Hard failure — model genuinely cannot run on this hardware (architectural / driver / unsupported precision).

Rule of thumb: OOM = "the box was the wrong size", DNR = "the box was the wrong shape", null = "we didn't try the box".

Comparison series (`benchmark_series`)

Discoverability artifacts (`/llms.txt`, `/llms-full.txt`)

Route	Source file	What it contains
`/llms.txt`	`src/pages/llms.txt.ts`	Short index. Curated tutorial header (static template literal) + auto-generated Supported Models section grouped by vendor. Models grouped by `family` per `MODEL_SECTIONS`, sorted by frontmatter `order`.
`/llms-full.txt`	`src/pages/llms-full.txt.ts`	Full long-form index. Iterates `getCollection('tutorials')` and `getCollection('models')`, emits frontmatter + full Markdown body for every page.

When you add a new model:

Set family to one of the existing families listed in MODEL_SECTIONS (in src/pages/llms.txt.ts). The model lands in the right section automatically.
Set short_description clearly — it is the line text rendered in /llms.txt.
Set order if you want a specific position within the family (lower = higher in the list).

There is no static public/llms.txt — the dynamic route lives at src/pages/llms.txt.ts. Don't reintroduce a static file; Astro would let it shadow the route silently.

Validation (build will fail if wrong)

src/content/config.ts superRefine on the models collection (skipped when hide_run_button: true):

You must provide at least one of:

Non-empty serving.entries, or
Non-empty supported_inference_engines, or
one_shot_inference with at least one trimmed non-empty command among run_command_orin, run_command_thor, shell, python, or values inside shell_by_module / python_by_module.

Otherwise Zod reports a custom error on supported_inference_engines:

Provide supported_inference_engines and/or serving.entries (non-empty), one_shot_inference commands, or set hide_run_button: true.

Checklist before you finish

Required model fields present (title, short_description, precision).
Sidebar fields filled where applicable (parameters, modalities or vision_capable, context_length, license, memory_requirements).
Either serving.entries or supported_inference_engines or meaningful one_shot_inference, or hide_run_button: true.
serve_command_* preferred over deprecated run_command_* for serve flows.
All matrix ids in modules_supported / matrix_modules_disabled / one_shot_inference.modules_supported are from JETSON_MATRIX_MODULES.
If benchmark_key is set, it exactly matches a models[].name in src/data/benchmarks.json. All product ids in concurrency* / oom / dnr arrays are valid benchmark product ids (not matrix ids).
OOM vs DNR vs null chosen per the marker semantics above.
family is set to a value listed in MODEL_SECTIONS (src/pages/llms.txt.ts). If the family is brand-new (new vendor), add a section to MODEL_SECTIONS so /llms.txt groups it correctly.
Run npm run build (see skill jetson-ai-lab-verify-build).

Source of truth (repo files)

Purpose	Path
Model schema + Zod validation	`src/content/config.ts` (`modelsSchema`, `superRefine` ~L144–165)
Frontmatter → engine list adapter	`src/lib/modelFrontmatterAdapter.ts` (`getSupportedInferenceEngines`, `servingEntryToSupportedEngine`, `sortEnginesForUi`)
Engine types, matrix module ids, serve/run helpers	`src/lib/inferenceCommands.ts` (`SupportedEngineEntry`, `JETSON_MATRIX_MODULES`, `JETSON_MATRIX_TAB_MODULE_IDS`, matrix helpers)
Model page layout (Hero, sidebar, sections)	`src/pages/models/[slug].astro`
Serve matrix UI	`src/components/ModelServeSection.astro`
Call / client examples	`src/components/ModelCallSection.astro`
One-shot UI	`src/components/ModelOneShotSection.astro`
One-shot snippet logic	`src/lib/evalRunModal.ts`
Benchmark chart UI (OOM / DNR rendering)	`src/components/ModelBenchmarkSection.astro`
Benchmark data file	`src/data/benchmarks.json`
`/llms.txt` short index (auto-generated, vendor-grouped)	`src/pages/llms.txt.ts` (edit `MODEL_SECTIONS` for new vendors)
`/llms-full.txt` long-form index (auto-generated)	`src/pages/llms-full.txt.ts`

jetson-model-catalog-inference

このリポジトリの他の Skills

Jetson AI Lab: model catalog page configuration

Reality check

Page anatomy (top-to-bottom)

Frontmatter reference

Always-include tags

Model Details sidebar tags (drive the dark spec card on the right of the Hero)

Situational tags

Inference blocks (at least one required unless hide_run_button: true)

Run on Jetson: the three "block" levers (when to OOM vs hide vs gray)

Inference data: serve matrix (engines)

Canonical Jetson module ids

One-shot inference

Call the model (client_call)

Benchmarks data

Adding benchmark data

Benchmark product ids ≠ matrix module ids (critical gotcha)

oom vs dnr vs null (per-quant markers)

Comparison series (benchmark_series)

Discoverability artifacts (/llms.txt, /llms-full.txt)

Validation (build will fail if wrong)

Checklist before you finish

Source of truth (repo files)

Jetson AI Lab: model catalog page configuration

Reality check

Page anatomy (top-to-bottom)

Frontmatter reference

Always-include tags

Model Details sidebar tags (drive the dark spec card on the right of the Hero)

Situational tags

Inference blocks (at least one required unless hide_run_button: true)

Run on Jetson: the three "block" levers (when to OOM vs hide vs gray)

Inference data: serve matrix (engines)

Canonical Jetson module ids

One-shot inference

Call the model (client_call)

Benchmarks data

Adding benchmark data

Benchmark product ids ≠ matrix module ids (critical gotcha)

oom vs dnr vs null (per-quant markers)

Comparison series (benchmark_series)

Discoverability artifacts (/llms.txt, /llms-full.txt)

Validation (build will fail if wrong)

Checklist before you finish

Source of truth (repo files)

このリポジトリの他の Skills

Inference blocks (at least one required unless `hide_run_button: true`)

Call the model (`client_call`)

`oom` vs `dnr` vs `null` (per-quant markers)

Comparison series (`benchmark_series`)

Discoverability artifacts (`/llms.txt`, `/llms-full.txt`)

Inference blocks (at least one required unless `hide_run_button: true`)

Call the model (`client_call`)

`oom` vs `dnr` vs `null` (per-quant markers)

Comparison series (`benchmark_series`)

Discoverability artifacts (`/llms.txt`, `/llms-full.txt`)