nemo-gym-reward-profiling

Name: Nemo Gym Reward Profiling
Author: NVIDIA-NeMo

Use to help users get started with Nemo Gym reward profiling. Covers the basic ng_run, ng_collect_rollouts, and ng_reward_profile workflow, repeated rollouts, materialized inputs, rollout JSONL artifacts, task and rollout identity, output inspection, partial profiling, and rollout_infos. For failed jobs, prefer nemo-gym-debugging.

stars: 914

forks: 151

updated: May 19, 2026 at 17:46

SKILL.md

name	nemo-gym-reward-profiling
description	Use to help users get started with Nemo Gym reward profiling. Covers the basic ng_run, ng_collect_rollouts, and ng_reward_profile workflow, repeated rollouts, materialized inputs, rollout JSONL artifacts, task and rollout identity, output inspection, partial profiling, and rollout_infos. For failed jobs, prefer nemo-gym-debugging.

Nemo Gym Reward Profiling

Invocation Check

Use this skill when the user wants to run, understand, or lightly modify Nemo Gym reward profiling. Keep the answer oriented around the normal workflow:

ng_run starts model/resource servers, ng_collect_rollouts writes rollout artifacts, and ng_reward_profile generates profiling output from those artifacts.

If the user is primarily debugging a failed job or stack trace, use the nemo-gym-debugging skill first.

Basic Workflow

1. Identify the environment config paths and input JSONL.
2. Start Gym servers with ng_run.
3. Collect rollouts with ng_collect_rollouts; this writes rollouts.jsonl and *_materialized_inputs.jsonl.
4. Run ng_reward_profile on the materialized inputs and rollout JSONL to generate *_reward_profiling.jsonl.
5. Inspect line counts and profile rows.
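
The final inspection step can be sketched in Python. This is a minimal illustration rather than part of Nemo Gym: `count_rows` and `check_counts` are hypothetical helper names. The checks assume only what is described here: one rollout row per materialized input row, and at most one profile row per original task.

```python
def count_rows(lines):
    """Count non-blank JSONL rows in an iterable of lines (for example, an open file)."""
    return sum(1 for line in lines if line.strip())


def check_counts(n_materialized, n_rollouts, n_profile_rows):
    """Return a list of warnings about row-count mismatches between artifacts.

    Hypothetical helper: the expectations below are inferred from the
    artifact descriptions, not taken from the Nemo Gym source.
    """
    warnings = []
    if n_rollouts < n_materialized:
        missing = n_materialized - n_rollouts
        warnings.append(f"{missing} materialized input row(s) have no rollout")
    if n_rollouts > n_materialized:
        warnings.append("more rollouts than materialized inputs; files may come from different runs")
    if n_profile_rows > n_rollouts:
        warnings.append("more profile rows than rollouts; files may come from different runs")
    return warnings
```

To count a file on disk, pass an open file handle, for example `with open("rollouts.jsonl", encoding="utf-8") as f: count_rows(f)`. An empty warning list means the counts are mutually consistent; it does not validate row contents.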

Repeated rollouts are the main profiling lever. num_repeats=1 is valid, but per-task averages and variance are only meaningful with multiple rollouts per task.
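
To illustrate why a single rollout per task is uninformative, here is a rough sketch that groups rewards by task and computes mean and population variance. It assumes each rollout row exposes `_ng_task_index` and a numeric top-level `reward` field. The `reward` field name is an assumption, so check it against the actual rollout rows before relying on it.

```python
import json
from collections import defaultdict
from statistics import mean, pvariance


def per_task_stats(rollout_lines):
    """Group rewards by _ng_task_index and summarize each group.

    Assumes a top-level numeric "reward" field on every rollout row
    (hypothetical field name; verify against real rollouts.jsonl rows).
    """
    rewards = defaultdict(list)
    for line in rollout_lines:
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        rewards[row["_ng_task_index"]].append(float(row["reward"]))
    return {
        task: {"n": len(vals), "mean": mean(vals), "variance": pvariance(vals)}
        for task, vals in rewards.items()
    }
```

With one rollout per task, `n` is 1 and the variance is always 0.0, so the mean is just that single reward and the variance says nothing about task difficulty or model stability.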

Core Concepts

*_materialized_inputs.jsonl: expanded collection inputs after repeat expansion, agent defaults, and task/rollout id assignment.
rollouts.jsonl: one completed rollout/result per materialized input row.
*_reward_profiling.jsonl: one summarized profile row per original task with at least one completed rollout.
_ng_task_index: original task/sample id.
_ng_rollout_index: repeated rollout id for that task.
rollout_infos: compact per-rollout info inside each task profile row, including reward, token usage, and numeric rollout metrics when available.

Keep reward-to-length or reward-to-token analysis keyed by both _ng_task_index and _ng_rollout_index.
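
A small sketch of keying by both indices follows. The helper names and the `reward` and `output_tokens` field names are hypothetical placeholders. Only `_ng_task_index` and `_ng_rollout_index` come from the description above.

```python
import json


def index_by_identity(rollout_lines):
    """Map (_ng_task_index, _ng_rollout_index) to the parsed rollout row."""
    indexed = {}
    for line in rollout_lines:
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        key = (row["_ng_task_index"], row["_ng_rollout_index"])
        if key in indexed:
            raise ValueError(f"duplicate rollout identity: {key}")
        indexed[key] = row
    return indexed


def reward_vs_length(indexed, reward_field="reward", length_field="output_tokens"):
    """Return (task, rollout, reward, length) tuples sorted by identity.

    Both field names are assumptions for illustration; rows that lack
    either field are skipped rather than guessed at.
    """
    points = []
    for (task, rollout), row in sorted(indexed.items()):
        if reward_field in row and length_field in row:
            points.append((task, rollout, row[reward_field], row[length_field]))
    return points
```

Keying by `_ng_task_index` alone would overwrite repeated rollouts of the same task, which is why the pair is used as the dictionary key and duplicates are treated as an error.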

Reference Loading

Load references only when the user needs that detail:

Read references/quick-start.md for a generic command template and the minimal run sequence.
Read references/output-format.md to explain materialized inputs, rollout JSONL, reward profile rows, rollout_infos, and partial profiling.

Practical Defaults

Treat ng_reward_profile as the reward profiling step; rollout collection does not write reward profile files.
Run strict profiling by default. If rollout collection stopped early, use ++allow_partial_rollouts=True to profile completed rollouts and drop original input rows with no completed rollout.
Trust the target checkout's CLI help and nemo_gym/reward_profile.py over memory if flags differ.
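
Before turning on ++allow_partial_rollouts=True, it can help to see which original tasks would be dropped. The sketch below is a hypothetical helper, not part of the Nemo Gym CLI. It assumes both the materialized inputs and the rollout rows carry `_ng_task_index`.

```python
import json


def tasks_without_rollouts(materialized_lines, rollout_lines):
    """Return sorted task indices present in materialized inputs but absent from rollouts.

    These are the original input rows that partial profiling would drop,
    under the assumption that both files carry _ng_task_index.
    """
    expected = {
        json.loads(line)["_ng_task_index"]
        for line in materialized_lines
        if line.strip()
    }
    completed = {
        json.loads(line)["_ng_task_index"]
        for line in rollout_lines
        if line.strip()
    }
    return sorted(expected - completed)
```

An empty result means every task has at least one completed rollout, though individual tasks may still have fewer rollouts than requested.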

related-skills.json

Same repository

nemo-gym-debugging.md

from "NVIDIA-NeMo/Gym"

Use when debugging a Nemo Gym run or reward profiling job. Covers rollout collection failures, empty or partial JSONL outputs, stale materialized inputs, verifier/schema errors, Ray or Slurm issues, vLLM readiness, judge failures, tool/sandbox failures, cache problems, and throughput bottlenecks.

2026-05-19 · 914 stars

nemo-gym-pivot-datasets.md

from "NVIDIA-NeMo/Gym"

Use when creating, validating, or documenting Nemo Gym pivot datasets from rollout, trajectory, chat-completion, Responses API, or tool-call artifacts. Covers Gym Responses-style row conversion, pivot selection, single-step tool-use configs, agent_ref alignment, verifier knobs, expected-action row contracts, and train/eval usage.

2026-05-19 · 914 stars

nemo-gym-pivot-datasets.md

from "NVIDIA-NeMo/Gym"

2026-05-12 · 914 stars

nemo-gym-reward-profiling.md

from "NVIDIA-NeMo/Gym"

Use to help users get started with Nemo Gym reward profiling. Covers the basic ng_run, ng_collect_rollouts, and ng_reward_profile workflow, repeated rollouts, materialized inputs, rollout JSONL artifacts, task and rollout identity, output inspection, partial profiling, and rollout_infos. For failed jobs, prefer nemo-gym-debugging.

2026-05-11 · 914 stars

nemo-gym-docs.md

from "NVIDIA-NeMo/Gym"

Maintain the NeMo Gym Fern docs site — add, update, move, or remove pages under fern/. Use for any documentation change. Triggered by: "edit docs", "add doc page", "update docs", "rename page", "fix broken link", "add redirect", "preview docs", "publish docs", any request that touches `fern/`.

2026-05-11 · 914 stars

nemo-gym-debugging.md

from "NVIDIA-NeMo/Gym"

2026-04-28 · 914 stars

package.json

"author": "NVIDIA-NeMo"

"repository": "NVIDIA-NeMo/Gym"

Useful for (SOC): Data Scientists · Computer and Mathematical Occupations · 15-2051 · L4
