serving-runtime-config

Updated: May 26, 2026, 11:17

Configure custom ServingRuntime CRs on OpenShift AI for model serving frameworks not covered by built-in runtimes. Use when: - "Create a custom serving runtime" - "I need a runtime for ONNX / Triton / custom framework" - "Customize vLLM runtime parameters" - "What serving runtimes are available?" - "Add a custom container image for model serving" Handles listing existing runtimes, creating new ServingRuntime CRs, and validating compatibility with target models. NOT for deploying models (use /model-deploy after runtime is configured). NOT for NIM platform setup (use /nim-setup).

Source: RHEcosystemAppEng/agentic-collections

name	serving-runtime-config
description	Configure custom ServingRuntime CRs on OpenShift AI for model serving frameworks not covered by built-in runtimes. Use when: - "Create a custom serving runtime" - "I need a runtime for ONNX / Triton / custom framework" - "Customize vLLM runtime parameters" - "What serving runtimes are available?" - "Add a custom container image for model serving" Handles listing existing runtimes, creating new ServingRuntime CRs, and validating compatibility with target models. NOT for deploying models (use /model-deploy after runtime is configured). NOT for NIM platform setup (use /nim-setup).
model	inherit
color	blue
license	Apache-2.0
allowed-tools	resources_get resources_list resources_create_or_update list_serving_runtimes create_serving_runtime list_data_science_projects list_models

/serving-runtime-config Skill

Configure custom ServingRuntime custom resources on Red Hat OpenShift AI. Use when built-in runtimes (vLLM, NIM, Caikit+TGIS) do not support the target model framework, or when customizing an existing runtime's parameters (env vars, model format, container image).

Prerequisites

Required MCP Server: openshift (OpenShift MCP Server)

Required MCP Tools (from openshift):

resources_get - Inspect existing ServingRuntime CRs in detail
resources_list - List ServingRuntime and ClusterServingRuntime CRs (OpenShift fallback)
resources_create_or_update - Create fully custom ServingRuntime CR (when not using templates, or as fallback)

Preferred MCP Server: rhoai (RHOAI MCP Server) — used when available, automatic OpenShift fallback on failure

Preferred MCP Tools (from rhoai):

list_serving_runtimes - List available runtimes and platform templates with supported model formats
create_serving_runtime - Instantiate a serving runtime from a platform template (no YAML needed)
list_data_science_projects - Validate namespace is an RHOAI project

Optional MCP Server: ai-observability (AI Observability MCP)

Optional MCP Tools (from ai-observability):

list_models - Verify deployed models use the new runtime

Common prerequisites (KUBECONFIG, OpenShift+RHOAI cluster, KServe, verification protocol): See skill-conventions.md.

Fallback templates: See openshift-fallback-templates.md for OpenShift YAML templates used when RHOAI tools are unavailable.

When to Use This Skill

Use this skill when you need to:

Create a custom ServingRuntime for a framework not covered by built-in runtimes
Customize an existing runtime's parameters (env vars, container image, model format)
Instantiate a platform template runtime into a namespace
List and compare available serving runtimes and templates

Do NOT use this skill when:

You want to deploy a model using an existing runtime (use /model-deploy)
You need NIM platform setup (use /nim-setup)
You need to troubleshoot a deployment (use /debug-inference)

Workflow

Step 1: Validate Target Namespace

Ask the user for:

Namespace: Target namespace for the ServingRuntime

MCP Tool: list_data_science_projects (from rhoai)

Parameters: none

Verify the user-specified namespace is an RHOAI Data Science Project.

If rhoai unavailable or returns error: Use resources_list (from openshift) with apiVersion: v1, kind: Namespace, labelSelector: opendatahub.io/dashboard=true.

Error Handling:

If namespace not found in project list -> Report: "Namespace [namespace] is not an RHOAI Data Science Project. Use /ds-project-setup to create one, or specify a different namespace." WAIT for user decision.
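
The namespace check above can be sketched as follows. This is a minimal illustration, assuming the project list arrives as a list of objects with a `name` field; the real response shape of list_data_science_projects is not documented here, and `check_namespace` is a hypothetical helper.

```python
from typing import Iterable, Mapping, Optional


def check_namespace(namespace: str, projects: Iterable[Mapping[str, str]]) -> Optional[str]:
    """Return None when the namespace is a known project, else the message to report.

    Assumption: each project entry exposes its namespace under the key "name".
    """
    known = {project["name"] for project in projects}
    if namespace in known:
        return None
    return (
        f"Namespace {namespace} is not an RHOAI Data Science Project. "
        "Use /ds-project-setup to create one, or specify a different namespace."
    )


# Usage: a non-None result means stop and WAIT for the user's decision.
message = check_namespace("team-b", [{"name": "team-a"}])
```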

Step 2: Gather Requirements

Ask the user for:

Use case: What framework/model needs serving? (e.g., "ONNX model", "custom TensorRT engine", "vLLM with custom args")
Intent: New runtime from scratch, or customize an existing one?

Document Consultation (read before listing runtimes):

Action: Read supported-runtimes.md using the Read tool to understand available runtimes and their capabilities
Output to user: "I consulted supported-runtimes.md to understand available runtimes."

MCP Tool: list_serving_runtimes (from rhoai)

Parameters:

namespace: validated namespace from Step 1 - REQUIRED
include_templates: true - REQUIRED (shows both existing runtimes and platform templates)

If rhoai unavailable or returns error: Use resources_list (from openshift) with apiVersion: serving.kserve.io/v1alpha1, kind: ServingRuntime, namespace: [namespace] for namespace runtimes, and kind: ClusterServingRuntime for platform templates. Filter by label opendatahub.io/dashboard=true and check spec.supportedModelFormats for compatibility.
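
As a rough sketch of the fallback filtering described above, the raw CRs returned by resources_list could be reduced to table rows like this. It assumes each item is a plain dictionary in the usual Kubernetes object layout; `to_rows` is a hypothetical helper, not part of either MCP server.

```python
from typing import Any, Dict, List

DASHBOARD_LABEL = "opendatahub.io/dashboard"


def to_rows(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep dashboard-labelled runtimes and summarise their model formats."""
    rows = []
    for item in items:
        metadata = item.get("metadata", {})
        labels = metadata.get("labels") or {}
        if labels.get(DASHBOARD_LABEL) != "true":
            continue  # not surfaced in the dashboard, so skip it
        formats = [
            fmt["name"]
            for fmt in item.get("spec", {}).get("supportedModelFormats", [])
        ]
        # ClusterServingRuntime entries are treated as platform templates.
        is_template = item.get("kind") == "ClusterServingRuntime"
        rows.append(
            {
                "name": metadata.get("name"),
                "formats": formats,
                "source": "template" if is_template else "namespace",
                "requires_instantiation": is_template,
            }
        )
    return rows
```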

Present findings in a table:

Runtime Name	Model Format	Source	Requires Instantiation
[name]	[format]	namespace / template	[true/false]

The response distinguishes between:

Existing runtimes (source: "namespace") - ready to use with /model-deploy
Platform templates (source: "template", requires_instantiation: true) - must be instantiated first

If an existing runtime fits the user's need, recommend using it directly with /model-deploy. If a platform template fits, offer to instantiate it (Step 5 alternative). Otherwise, proceed to Step 3 for custom runtime creation.

WAIT for user to confirm whether to create a new runtime, instantiate a template, or customize an existing one.
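
The recommendation logic in this step might look like the sketch below. The entry fields (`name`, `formats`, `source`) are assumptions about how the listing is represented, and the result is only a suggestion to present; the user still confirms the choice.

```python
from typing import Any, Dict, List, Optional, Tuple


def suggest_next_step(
    entries: List[Dict[str, Any]], wanted_format: str
) -> Tuple[str, Optional[str]]:
    """Prefer an existing runtime, then a template, then a custom runtime."""
    # Exact, case-sensitive comparison of model format names.
    matches = [e for e in entries if wanted_format in e["formats"]]
    for entry in matches:
        if entry["source"] == "namespace":
            return ("use-existing-with-model-deploy", entry["name"])
    for entry in matches:
        if entry["source"] == "template":
            return ("instantiate-template", entry["name"])
    return ("create-custom-runtime", None)
```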

Step 3: Determine Runtime Configuration

Based on the user's framework and model requirements, determine the ServingRuntime spec.

If customizing an existing runtime:

MCP Tool: resources_get (from openshift)

Parameters:

apiVersion: "serving.kserve.io/v1alpha1" - REQUIRED
kind: "ServingRuntime" - REQUIRED
namespace: user-specified namespace - REQUIRED
name: name of the existing runtime to customize - REQUIRED

Extract the current spec as a starting point. Present the current configuration and ask what the user wants to change.

If the user requests a runtime for an unfamiliar framework -> Trigger live doc lookup:

Action: Read live-doc-lookup.md using the Read tool for the lookup protocol
Output to user: "Framework [name] is not in my cached runtimes. I'll look up its serving requirements."
Use WebFetch to retrieve specs from Red Hat OpenShift AI documentation
Extract: container image, model format name, supported protocols, required env vars
Output to user: "I looked up [framework] on [source] to confirm its runtime requirements: [summary]"

Collect runtime parameters:

Parameter	Value	Source
Runtime name	[name]	user input
Container image	[image:tag]	user input / doc lookup
Model format name	[format]	user input / doc lookup
Supported protocol versions	[v1, v2, grpc-v2]	user input / default
Multi-model serving	[true/false]	default: false (single-model)
Environment variables	[list]	user input
GPU resource requirements	[limits]	user input

WAIT for user to confirm or modify parameters.
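
One way to hold the collected parameters before confirmation is a small record type, sketched here. Which fields are mandatory is an assumption made for illustration (name, image, and model format); only the single-model default comes from the table above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RuntimeParams:
    """Hypothetical container for the Step 3 parameter table."""

    name: str = ""
    image: str = ""
    model_format: str = ""
    protocol_versions: List[str] = field(default_factory=list)
    multi_model: bool = False  # default: single-model
    env: Dict[str, str] = field(default_factory=dict)
    gpu_count: str = ""

    def missing(self) -> List[str]:
        """List the assumed-mandatory fields that are still empty."""
        required = {
            "name": self.name,
            "image": self.image,
            "model_format": self.model_format,
        }
        return [key for key, value in required.items() if not value]
```

If `missing()` returns anything, ask the user for those values before moving on to Step 4.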

Step 4: Generate ServingRuntime YAML

Generate the ServingRuntime manifest using values from Steps 2-3.

apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: [runtime-name]
  namespace: [namespace]
  labels:
    opendatahub.io/dashboard: "true"
  annotations:
    openshift.io/display-name: "[Display Name]"
spec:
  supportedModelFormats:
    - name: [model-format-name]
      version: "[version]"
      autoSelect: true
  multiModel: false
  containers:
    - name: kserve-container
      image: [container-image:tag]
      ports:
        - containerPort: 8080
          protocol: TCP
      env:
        - name: [ENV_VAR_NON_SECRET]
          value: "[non-sensitive-value]"
        - name: [SECRET_ENV_VAR]
          valueFrom:
            secretKeyRef:
              name: [k8s-secret-name]
              key: [secret-key-name]
      resources:
        limits:
          nvidia.com/gpu: "[gpu-count]"
        requests:
          cpu: "[cpu]"
          memory: "[memory]"

Display the ServingRuntime YAML to the user, redacting any sensitive values.

Ask: "Proceed with creating this ServingRuntime? (yes/no/modify)"

WAIT for explicit confirmation.

If yes -> Proceed to Step 5
If no -> Abort
If modify -> Ask what to change, regenerate YAML, return to this step
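
Redacting sensitive values before display could be approximated as below. The name pattern is a heuristic chosen for this sketch, not a rule from the skill; values referenced through secretKeyRef never appear inline, so only literal `value` entries need masking.

```python
import copy
import re
from typing import Any, Dict

# Assumption: env var names containing these words hold sensitive values.
SENSITIVE_NAME = re.compile(r"TOKEN|PASSWORD|SECRET|KEY", re.IGNORECASE)


def redact_for_display(manifest: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the manifest with suspicious literal env values masked."""
    shown = copy.deepcopy(manifest)  # never mutate the manifest that gets applied
    for container in shown.get("spec", {}).get("containers", []):
        for var in container.get("env", []):
            if "value" in var and SENSITIVE_NAME.search(var.get("name", "")):
                var["value"] = "[REDACTED]"
    return shown
```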

Step 5: Create ServingRuntime

If instantiating from a platform template (user chose a template from Step 2):

MCP Tool: create_serving_runtime (from rhoai)

Parameters:

namespace: target namespace - REQUIRED
template_name: name of the template to instantiate (e.g., "vllm-cuda-runtime-template") - REQUIRED

The response includes the created runtime name, display name, and supported model formats.

If rhoai unavailable or returns error: Use resources_get (from openshift) to fetch the ClusterServingRuntime template, copy its spec to a namespace-scoped ServingRuntime, and create via resources_create_or_update (from openshift). See openshift-fallback-templates.md for the pattern.
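
The copy step in this fallback can be sketched as follows. The authoritative pattern lives in openshift-fallback-templates.md, which is not reproduced here, so treat `instantiate_template` as a hypothetical helper that only shows the general shape.

```python
import copy
from typing import Any, Dict, Optional


def instantiate_template(
    template: Dict[str, Any], namespace: str, name: Optional[str] = None
) -> Dict[str, Any]:
    """Build a namespace-scoped ServingRuntime from a ClusterServingRuntime."""
    return {
        "apiVersion": "serving.kserve.io/v1alpha1",
        "kind": "ServingRuntime",
        # Fresh metadata: server-set fields such as uid are deliberately not copied.
        "metadata": {
            "name": name or template["metadata"]["name"],
            "namespace": namespace,
            "labels": {"opendatahub.io/dashboard": "true"},
        },
        "spec": copy.deepcopy(template["spec"]),
    }
```

The resulting manifest is then passed to resources_create_or_update like any other custom runtime.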

If creating a fully custom runtime (custom container image, non-template configuration):

MCP Tool: resources_create_or_update (from openshift)

Parameters:

manifest: full ServingRuntime manifest as JSON string - REQUIRED
namespace: user-specified namespace - REQUIRED
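
Because the manifest parameter is a JSON string, the Step 4 template has to be serialized before the call. A minimal sketch, assuming a reduced set of fields; the runtime name, namespace, and image below are placeholders, not real values.

```python
import json
from typing import Any, Dict, Optional


def build_manifest(
    name: str,
    namespace: str,
    image: str,
    model_format: str,
    port: int = 8080,
    gpu_count: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a ServingRuntime manifest mirroring the Step 4 template."""
    container: Dict[str, Any] = {
        "name": "kserve-container",
        "image": image,
        "ports": [{"containerPort": port, "protocol": "TCP"}],
    }
    if gpu_count:
        # GPU limits are written as strings, as in the template.
        container["resources"] = {"limits": {"nvidia.com/gpu": str(gpu_count)}}
    return {
        "apiVersion": "serving.kserve.io/v1alpha1",
        "kind": "ServingRuntime",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "labels": {"opendatahub.io/dashboard": "true"},
        },
        "spec": {
            "supportedModelFormats": [{"name": model_format, "autoSelect": True}],
            "multiModel": False,
            "containers": [container],
        },
    }


# Placeholder values for illustration only.
manifest_json = json.dumps(
    build_manifest("my-runtime", "my-namespace", "registry.example.com/server:1.0", "onnx", gpu_count=1)
)
```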

Error Handling:

If namespace not found -> Report error, suggest creating namespace or using /ds-project-setup
If runtime name already exists -> Ask user: "ServingRuntime [name] already exists. Update it? (yes/no)"
If CRD not found -> Report: "ServingRuntime CRD not available. Ensure Red Hat OpenShift AI operator is installed."
If RBAC error -> Report insufficient permissions

Step 6: Validate Runtime

MCP Tool: list_serving_runtimes (from rhoai)

Parameters:

namespace: user-specified namespace - REQUIRED
include_templates: false

Verify the runtime appears in the namespace runtime list.

For detailed inspection:

MCP Tool: resources_get (from openshift)

Parameters:

apiVersion: "serving.kserve.io/v1alpha1" - REQUIRED
kind: "ServingRuntime" - REQUIRED
namespace: user-specified namespace - REQUIRED
name: the created runtime name - REQUIRED

Report results showing: runtime name, namespace, model format, container image, and next steps (/model-deploy to deploy a model using this runtime).

Common Issues

For common issues (GPU scheduling, OOMKilled, image pull errors, RBAC), see common-issues.md.

Issue 1: InferenceService Cannot Find Runtime

Error: InferenceService status shows "Unknown" or runtime not matched

Cause: The modelFormat.name in the InferenceService does not match any supportedModelFormats[].name in available ServingRuntimes.

Solution:

Verify the model format name matches exactly (case-sensitive)
Check the runtime is in the same namespace as the InferenceService
Ensure the runtime has the opendatahub.io/dashboard: "true" label
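
The three checks above can be combined into one diagnostic pass, sketched here under the assumption that the runtimes are available as plain Kubernetes-style dictionaries. `diagnose_runtime_match` is a hypothetical helper.

```python
from typing import Any, Dict, List

DASHBOARD_LABEL = "opendatahub.io/dashboard"


def diagnose_runtime_match(
    model_format: str, isvc_namespace: str, runtimes: List[Dict[str, Any]]
) -> List[str]:
    """Return findings explaining why no runtime matched; empty means a match exists."""
    findings = []
    for runtime in runtimes:
        meta = runtime.get("metadata", {})
        if meta.get("namespace") != isvc_namespace:
            continue  # runtimes in other namespaces cannot serve this InferenceService
        formats = [
            fmt["name"]
            for fmt in runtime.get("spec", {}).get("supportedModelFormats", [])
        ]
        labelled = (meta.get("labels") or {}).get(DASHBOARD_LABEL) == "true"
        if model_format in formats and labelled:
            return []
        if model_format in formats:
            findings.append(f"{meta.get('name')}: missing label {DASHBOARD_LABEL}=true")
        elif any(fmt.lower() == model_format.lower() for fmt in formats):
            findings.append(f"{meta.get('name')}: format name differs only by case")
    return findings or [
        f"No ServingRuntime in {isvc_namespace} lists model format {model_format}"
    ]
```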

Issue 2: Runtime Port Mismatch

Error: InferenceService created but health checks fail, endpoint returns connection refused

Cause: The containerPort in the ServingRuntime does not match the port the serving framework actually listens on.

Solution:

Check the framework's documentation for its default serving port
Update the containerPort in the ServingRuntime spec
Or set an environment variable to configure the framework's listen port to match
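
A quick way to spot this mismatch is to compare the declared container ports with the port the framework is known to listen on. The sketch below assumes the ServingRuntime is available as a dictionary and that the listen port has been looked up in the framework's documentation.

```python
from typing import Any, Dict, List


def port_mismatches(manifest: Dict[str, Any], listen_port: int) -> List[str]:
    """Report containers whose declared ports do not include the listen port."""
    problems = []
    for container in manifest.get("spec", {}).get("containers", []):
        declared = [p.get("containerPort") for p in container.get("ports", [])]
        if listen_port not in declared:
            problems.append(
                f"{container.get('name')}: declares {declared}, "
                f"framework listens on {listen_port}"
            )
    return problems
```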

Dependencies

MCP Tools

See Prerequisites for the complete list of required and optional MCP tools.

Related Skills

/model-deploy - Deploy a model using the configured runtime
/nim-setup - NIM platform setup (if NIM runtime is needed instead)
/debug-inference - Troubleshoot InferenceService failures after deployment

Reference Documentation

supported-runtimes.md - Runtime capabilities and model format names
live-doc-lookup.md - Protocol for fetching specs for unknown frameworks

Critical: Human-in-the-Loop Requirements

See skill-conventions.md for general HITL and security conventions.

Skill-specific checkpoints:

After namespace validation (Step 1): confirm namespace or redirect to /ds-project-setup
After listing existing runtimes (Step 2): confirm whether to create a new runtime, instantiate a template, or customize an existing one
After collecting parameters (Step 3): confirm runtime configuration
Before creating ServingRuntime (Step 4): display full YAML, confirm
NEVER overwrite an existing ServingRuntime without user confirmation