원클릭으로 Manus에서 모든 스킬 실행

$pwd:

nemo-gym-pivot-datasets

Name: Nemo Gym Pivot Datasets
Author: NVIDIA-NeMo

// Use when creating, validating, or documenting Nemo Gym pivot datasets from rollout, trajectory, chat-completion, Responses API, or tool-call artifacts. Covers Gym Responses-style row conversion, pivot selection, single-step tool-use configs, agent_ref alignment, verifier knobs, expected-action row contracts, and train/eval usage.

Manus에서 실행

$ git log --oneline --stat

stars:914

forks:151

updated:2026년 5월 12일 22:51

파일 탐색기

10 개 파일

SKILL.md

readonly

name	nemo-gym-pivot-datasets
description	Use when creating, validating, or documenting Nemo Gym pivot datasets from rollout, trajectory, chat-completion, Responses API, or tool-call artifacts. Covers Gym Responses-style row conversion, pivot selection, single-step tool-use configs, agent_ref alignment, verifier knobs, expected-action row contracts, and train/eval usage.

Nemo Gym Pivot Datasets

Paper Reference

This skill operationalizes PivotRL: create local single-step pivot datasets from successful trajectories, prefer informative mixed-reward states, and train with verifier-based local rewards rather than exact trajectory imitation.

Invocation Check

Use this skill when the task is to turn existing agent trajectories or rollout artifacts into a Nemo Gym pivot dataset, or to validate whether a pivot JSONL/config pair can be used for single-step local RL or evaluation.

Before writing a converter, inspect representative source rows and the target resource server. Do not assume the source field names are the contract. Convert by reconstructing the semantic pieces needed by Gym's Responses-style row format.

Core Workflow

Inspect the source data shape and count the candidate assistant decision points.
Identify the semantic fields needed for each pivot:

model-call input context before the pivot action
available tools at that decision point
expected assistant action
reward/verifier target if it is separate from the demonstrated action
optional provenance such as task id, source trajectory id, rollout id, uuid, depth, and original metadata

Convert each accepted decision point into one pivot JSONL row.
Generate or update the matching Gym config so the pivot-format JSONL can be used directly.
Validate with the bundled validator and, when available, the target Gym resource-server models.
Write metrics that make skipped rows, action types, tool names, depth, and provenance coverage easy to inspect.

Row Shape

Read references/row-contract.md when implementing or reviewing a converter. For single_step_tool_use_with_argument_comparison, the essential row fields are:

responses_create_params: Responses API-style input and tool specs for the model call.
expected_action: one function_call or one message.
agent_ref: row-level agent routing that matches the generated config.

Do not copy optional null fields into responses_create_params; omit them unless the target contract explicitly wants them.

expected_action is singular. If a source assistant turn has more than one tool call, filter that turn out of the pivot dataset and keep it only in a skipped-row audit if it needs review.

Conversion Patterns

Read references/conversion-patterns.md when the source data is not already in pivot shape. The rule is to normalize by meaning, not by source container.

Useful reference scripts live under scripts/reference/. They are copied from real conversions and may contain dataset-specific paths, assumptions, or older branch behavior, so treat them as examples to borrow from rather than canonical commands to run unchanged:

generic_pivot_dataset_reference.py: generic source rows to pivot rows.
chat_messages_to_pivot_dataset_reference.py: chat-completion messages to pivot rows.
conversational_messages_to_pivot_dataset_reference.py: conversational message trajectories to pivot rows with reasoning/provenance handling.
tool_messages_to_pivot_dataset_reference.py: message/tool-use style rows to pivot rows.

Pivot Selection

Use clean, positive source trajectories for the demonstrated pivots. When multiple source trajectories exist for a task, prefer tasks whose source trajectory group has mixed rewards instead of all success or all failure; this avoids spending data on tasks that were trivial or impossible for the source model. Treat that source-task filter as preferred, not mandatory, because the source model and downstream policy may have different capabilities.

When possible, profile candidate pivots with local on-policy rollouts from the downstream or initial policy. Use at least 8 sampled local rollouts per candidate as the default. Keep candidates with mixed local rewards, discard all-1 and all-0 reward groups, and if data is abundant, drop the easiest/high-pass-rate pivots first so training concentrates on hard but learnable states.

Config And Training

Read references/config-training-and-agent-ref.md when creating the Gym YAML or explaining how to train/evaluate from the dataset.

Key points:

The pivot JSONL is the training/eval dataset; point the config's train dataset entry directly at it.
agent_ref.name in each row must match the agent block used by the config unless the launcher overrides routing intentionally.
word_count_similarity_threshold is the main string-argument matching knob for the single-step tool-use verifier.
Use tool_choice: "auto" for these rows; tool_choice: "required" can route some inference engines into structured decoding paths.
Validate configs and datasets together; a valid JSONL file can still be unusable if the agent/resource-server names do not line up.

Validation

Run the bundled validator before calling a pivot dataset done:

python scripts/validate_pivot_dataset.py --path /path/to/pivot.jsonl --agent-ref expected_agent_name

When the Gym repo is available, also validate against the resource-server Pydantic models:

python scripts/validate_pivot_dataset.py \
  --path /path/to/pivot.jsonl \
  --agent-ref expected_agent_name \
  --gym-repo /path/to/Gym-github

Use --require-field and --require-any-field only when a dataset-specific workflow needs extra provenance checks. Provenance is useful for debugging and filtering, but it is not required by the resource-server request model.

The validator accepts both supported expected-action types by default (function_call and message) and prints an end summary split between tool-call and message pivots.

related-skills.json

같은 저장소

nemo-gym-debugging.md

from "NVIDIA-NeMo/Gym"

Use when debugging a Nemo Gym run or reward profiling job. Covers rollout collection failures, empty or partial JSONL outputs, stale materialized inputs, verifier/schema errors, Ray or Slurm issues, vLLM readiness, judge failures, tool/sandbox failures, cache problems, and throughput bottlenecks.

2026-05-19914

nemo-gym-pivot-datasets.md

from "NVIDIA-NeMo/Gym"

Use when creating, validating, or documenting Nemo Gym pivot datasets from rollout, trajectory, chat-completion, Responses API, or tool-call artifacts. Covers Gym Responses-style row conversion, pivot selection, single-step tool-use configs, agent_ref alignment, verifier knobs, expected-action row contracts, and train/eval usage.

2026-05-19914

nemo-gym-reward-profiling.md

from "NVIDIA-NeMo/Gym"

Use to help users get started with Nemo Gym reward profiling. Covers the basic ng_run, ng_collect_rollouts, and ng_reward_profile workflow, repeated rollouts, materialized inputs, rollout JSONL artifacts, task and rollout identity, output inspection, partial profiling, and rollout_infos. For failed jobs, prefer nemo-gym-debugging.

2026-05-19914

nemo-gym-reward-profiling.md

from "NVIDIA-NeMo/Gym"

2026-05-11914

nemo-gym-docs.md

from "NVIDIA-NeMo/Gym"

Maintain the NeMo Gym Fern docs site — add, update, move, or remove pages under fern/. Use for any documentation change. Triggered by: "edit docs", "add doc page", "update docs", "rename page", "fix broken link", "add redirect", "preview docs", "publish docs", any request that touches `fern/`.

2026-05-11914

nemo-gym-debugging.md

from "NVIDIA-NeMo/Gym"

2026-04-28914

package.json

"author": "NVIDIA-NeMo"

"repository": "NVIDIA-NeMo/Gym"

GitHub 저장소 열기 Creator 저장소 보기

$ install --global

$ download --local

Manus에서 실행

$ useful --forSOC

데이터 과학자컴퓨터 및 수학직15-2051L4

name	nemo-gym-pivot-datasets
description	Use when creating, validating, or documenting Nemo Gym pivot datasets from rollout, trajectory, chat-completion, Responses API, or tool-call artifacts. Covers Gym Responses-style row conversion, pivot selection, single-step tool-use configs, agent_ref alignment, verifier knobs, expected-action row contracts, and train/eval usage.

Nemo Gym Pivot Datasets

Paper Reference

Invocation Check

Core Workflow

Inspect the source data shape and count the candidate assistant decision points.
Identify the semantic fields needed for each pivot:

model-call input context before the pivot action
available tools at that decision point
expected assistant action
reward/verifier target if it is separate from the demonstrated action
optional provenance such as task id, source trajectory id, rollout id, uuid, depth, and original metadata

Convert each accepted decision point into one pivot JSONL row.
Generate or update the matching Gym config so the pivot-format JSONL can be used directly.
Validate with the bundled validator and, when available, the target Gym resource-server models.
Write metrics that make skipped rows, action types, tool names, depth, and provenance coverage easy to inspect.

Row Shape

Read references/row-contract.md when implementing or reviewing a converter. For single_step_tool_use_with_argument_comparison, the essential row fields are:

responses_create_params: Responses API-style input and tool specs for the model call.
expected_action: one function_call or one message.
agent_ref: row-level agent routing that matches the generated config.

Do not copy optional null fields into responses_create_params; omit them unless the target contract explicitly wants them.

expected_action is singular. If a source assistant turn has more than one tool call, filter that turn out of the pivot dataset and keep it only in a skipped-row audit if it needs review.

Conversion Patterns

Read references/conversion-patterns.md when the source data is not already in pivot shape. The rule is to normalize by meaning, not by source container.

generic_pivot_dataset_reference.py: generic source rows to pivot rows.
chat_messages_to_pivot_dataset_reference.py: chat-completion messages to pivot rows.
conversational_messages_to_pivot_dataset_reference.py: conversational message trajectories to pivot rows with reasoning/provenance handling.
tool_messages_to_pivot_dataset_reference.py: message/tool-use style rows to pivot rows.

Pivot Selection

Config And Training

Read references/config-training-and-agent-ref.md when creating the Gym YAML or explaining how to train/evaluate from the dataset.

Key points:

The pivot JSONL is the training/eval dataset; point the config's train dataset entry directly at it.
agent_ref.name in each row must match the agent block used by the config unless the launcher overrides routing intentionally.
word_count_similarity_threshold is the main string-argument matching knob for the single-step tool-use verifier.
Use tool_choice: "auto" for these rows; tool_choice: "required" can route some inference engines into structured decoding paths.
Validate configs and datasets together; a valid JSONL file can still be unusable if the agent/resource-server names do not line up.

Validation

Run the bundled validator before calling a pivot dataset done:

python scripts/validate_pivot_dataset.py --path /path/to/pivot.jsonl --agent-ref expected_agent_name

When the Gym repo is available, also validate against the resource-server Pydantic models:

python scripts/validate_pivot_dataset.py \
  --path /path/to/pivot.jsonl \
  --agent-ref expected_agent_name \
  --gym-repo /path/to/Gym-github

The validator accepts both supported expected-action types by default (function_call and message) and prints an end summary split between tool-call and message pivots.

nemo-gym-pivot-datasets

Nemo Gym Pivot Datasets

Paper Reference

Invocation Check

Core Workflow

Row Shape

Conversion Patterns

Pivot Selection

Config And Training

Validation

이 저장소의 다른 Skills

이 저장소의 다른 Skills

Nemo Gym Pivot Datasets

Paper Reference

Invocation Check

Core Workflow

Row Shape

Conversion Patterns

Pivot Selection

Config And Training

Validation