Name: Configs
Author: PrimeIntellect-ai

Description: How the prime-rl config system works: TOML files, CLI overrides, composition, and special patterns. Use when creating configs, debugging config errors, or overriding values via CLI.

Stars: 1,413

Forks: 302

Updated: May 22, 2026, 04:09

SKILL.md


Configs

prime-rl uses pydantic-config — a Pydantic-based TOML + CLI config system (no tyro). Every entrypoint accepts TOML files via @ and CLI overrides.

Loading and composition

uv run rl @ examples/reverse_text/rl.toml                                  # single TOML
uv run rl @ examples/reverse_text/rl.toml --max-steps 50                   # CLI override
uv run rl @ base.toml @ overlay.toml                                       # left-to-right merge
uv run rl --model @ model.toml --data @ data.toml                          # nested section files
uv run rl @ base.toml --trainer @ trainer.toml --trainer.lr 1e-3           # mixed

Resolution order: CLI > config files (left-to-right) > class defaults. Merging is deep — unset fields in an overlay are preserved from the base.
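
The resolution order can be pictured as a stack of dictionaries merged one layer at a time. The following is a minimal sketch of that idea, not pydantic-config's actual implementation; `deep_merge`, `resolve`, and the `warmup` field are hypothetical names used only for illustration.

```python
from copy import deepcopy
from typing import Any


def deep_merge(base: dict[str, Any], overlay: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which overlay values win and nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are tables: keep base keys the overlay does not set.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are replaced wholesale, never concatenated.
            merged[key] = deepcopy(value)
    return merged


def resolve(defaults: dict, config_files: list[dict], cli_overrides: dict) -> dict:
    """Apply layers from lowest to highest priority: defaults, files left to right, CLI."""
    resolved = defaults
    for layer in [*config_files, cli_overrides]:
        resolved = deep_merge(resolved, layer)
    return resolved


# Hypothetical layers standing in for class defaults, base.toml, overlay.toml and CLI flags.
defaults = {"max_steps": 100, "trainer": {"lr": 1e-4, "warmup": 10}}
base = {"trainer": {"lr": 3e-4}, "env": [{"id": "reverse-text"}]}
overlay = {"trainer": {"warmup": 0}, "env": [{"id": "math-env"}]}
cli = {"max_steps": 50}

resolved = resolve(defaults, [base, overlay], cli)
print(resolved)
```

In this sketch `trainer.lr` survives from the base layer because the overlay never sets it, while the `env` list from the overlay replaces the base list entirely.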

Naming: CLI uses kebab-case (--model.max-model-len); TOML uses snake_case (max_model_len).
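
As a quick way to translate between the two spellings, the mapping can be sketched as a pair of string helpers. Both function names are hypothetical and are not part of prime-rl or pydantic-config; the sketch assumes field names contain no hyphens or underscores other than word separators.

```python
def cli_flag_to_toml_path(flag: str) -> str:
    """Map a kebab-case CLI flag to the dotted snake_case path used in TOML."""
    return flag.removeprefix("--").replace("-", "_")


def toml_path_to_cli_flag(path: str) -> str:
    """Map a dotted snake_case path to the kebab-case CLI flag."""
    return "--" + path.replace("_", "-")


print(cli_flag_to_toml_path("--model.max-model-len"))
print(toml_path_to_cli_flag("model.max_model_len"))
```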

Inspect & validate

uv run rl --help                                  # all fields and defaults
uv run rl @ rl.toml --dry-run --output-dir /tmp/x # write resolved TOML to /tmp/x/configs

Validators

Incompatible combinations (e.g. CP requires flash attention) must raise in a model_validator at resolve time, not at runtime. When renaming a field, emit a deprecation warning with a migration hint — never silently drop.
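
Both rules can be expressed with Pydantic v2's `model_validator`. The sketch below is an illustration under assumptions, not code from prime-rl: the class, the `attn` and `cp` fields, and the old field name `max_len` are all hypothetical.

```python
import warnings
from typing import Literal, Optional

from pydantic import BaseModel, model_validator


class ModelConfig(BaseModel):
    # All field names here are hypothetical stand-ins.
    attn: Literal["sdpa", "flash_attention_2"] = "sdpa"
    cp: int = 1  # context-parallel degree
    max_model_len: Optional[int] = None

    @model_validator(mode="before")
    @classmethod
    def migrate_renamed_fields(cls, data):
        # Accept the old name, but warn with a migration hint instead of dropping it.
        if isinstance(data, dict) and "max_len" in data:
            warnings.warn(
                "'max_len' is deprecated; rename it to 'max_model_len'",
                DeprecationWarning,
                stacklevel=2,
            )
            data = dict(data)
            data.setdefault("max_model_len", data.pop("max_len"))
        return data

    @model_validator(mode="after")
    def check_cp_requires_flash_attention(self):
        # Fail while the config is being resolved, before any training code runs.
        if self.cp > 1 and self.attn != "flash_attention_2":
            raise ValueError("cp > 1 requires attn = 'flash_attention_2'")
        return self
```

With this shape, `ModelConfig(cp=2)` fails immediately with a validation error, and `ModelConfig(max_len=128)` still loads but tells the user which name to switch to.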

Special syntax

Booleans — CLI --flag / --no-flag; TOML must be explicit (enforce_eager = true).

None — TOML has no null, use the string "None" (max_model_len = "None"); CLI: --model.max-model-len None.
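
One way to picture the `"None"` convention is as a post-processing pass over the parsed TOML data. How pydantic-config handles this internally is not stated here, so the helper below is only an assumed illustration, and `none_strings_to_none` and the `name` field are hypothetical.

```python
from typing import Any


def none_strings_to_none(value: Any) -> Any:
    """Recursively replace the literal string "None" with Python's None."""
    if isinstance(value, dict):
        return {key: none_strings_to_none(item) for key, item in value.items()}
    if isinstance(value, list):
        return [none_strings_to_none(item) for item in value]
    return None if value == "None" else value


# What a TOML parser would return for: [model] max_model_len = "None"
raw = {"model": {"max_model_len": "None", "name": "my-model"}}
print(none_strings_to_none(raw))
```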

Lists — TOML uses array of tables; later config files replace lists wholesale, so overlays must include the full desired list:

[[orchestrator.env]]
id = "reverse-text"

CLI: --env.0.id reverse-text --env.1.id math-env.

Dicts — TOML uses a section; CLI takes a JSON string: --vllm-extra '{"key1": "value1"}'.
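
The JSON string has to be valid JSON, which means double-quoted keys and strings, so the whole value is wrapped in single quotes for the shell. A minimal sketch of the parsing step, using only the standard library; `parse_dict_flag` is a hypothetical helper, not pydantic-config's parser.

```python
import json


def parse_dict_flag(raw: str) -> dict:
    """Parse the value of a dict-typed CLI flag given as a JSON object string."""
    value = json.loads(raw)
    if not isinstance(value, dict):
        raise ValueError(f"expected a JSON object, got {type(value).__name__}")
    return value


print(parse_dict_flag('{"key1": "value1"}'))
```

Passing Python-style single-quoted keys (`{'key1': 'value1'}`) fails in `json.loads`, which is a common source of errors with this kind of flag.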

Discriminated unions — set the type field to pick the variant ([trainer.loss] type = "sft"). Omit type to keep the default variant.

BaseModel | None fields — bare flag enables defaults; nested override enables and sets:

--model.compile             # enables compile with defaults
--model.compile.fullgraph   # enables and sets fullgraph=true

In TOML, an empty section header ([ckpt]) does the same.
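
The reason an empty section is enough can be seen in plain Pydantic: an empty mapping validates into a sub-model with all defaults, while a missing key leaves the field at `None`. This is a sketch of that behavior with hypothetical classes; it assumes, rather than confirms, that prime-rl's config classes are shaped like this.

```python
from typing import Optional

from pydantic import BaseModel


class CompileConfig(BaseModel):
    fullgraph: bool = False  # hypothetical default


class ModelConfig(BaseModel):
    compile: Optional[CompileConfig] = None  # disabled unless something enables it


# No section at all: the feature stays off.
disabled = ModelConfig.model_validate({})

# An empty TOML table such as [compile] parses to an empty dict: defaults apply.
enabled = ModelConfig.model_validate({"compile": {}})

# A nested override both enables the sub-config and sets the field.
overridden = ModelConfig.model_validate({"compile": {"fullgraph": True}})

print(disabled.compile, enabled.compile, overridden.compile)
```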

RL trainer token exports

For rollout debugging, enable trainer-side token export under trainer.experimental.token_export (or experimental.token_export when running the trainer entrypoint directly). It writes one JSONL record per exported sequence under output_dir/token_exports/step_<step>/rank_<rank>.jsonl. Each record stores aligned per-token arrays for token ids, loss mask, advantage, reward, entropy, mismatch KL, inference/trainer logprobs, importance ratios, probability deltas, and masking diagnostics. It does not decode token text in the trainer.

[trainer.experimental.token_export]

Leave it unset for normal training. When enabled, it exports every sequence from each exporting rank.
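
Since each export file is plain JSONL, the records for a step can be read back with the standard library. The sketch below relies only on the directory layout described above; `iter_token_export_records` is a hypothetical helper, and it deliberately does not assume any particular key names inside a record.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_token_export_records(output_dir: str, step: int) -> Iterator[tuple[int, dict]]:
    """Yield (rank, record) pairs from output_dir/token_exports/step_<step>/rank_<rank>.jsonl."""
    step_dir = Path(output_dir) / "token_exports" / f"step_{step}"

    def rank_of(path: Path) -> int:
        # "rank_10" -> 10, so files are ordered numerically rather than lexically.
        return int(path.stem.split("_", 1)[1])

    for path in sorted(step_dir.glob("rank_*.jsonl"), key=rank_of):
        with path.open() as handle:
            for line in handle:
                if line.strip():  # skip blank lines
                    yield rank_of(path), json.loads(line)
```

A typical use would be to loop over `iter_token_export_records(output_dir, step)` and print `sorted(record)` for the first record to see which per-token arrays are present before writing any analysis code.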

Key files

packages/prime-rl-configs/src/prime_rl/ — config classes under configs/; utils/config.py re-exports BaseConfig and cli
configs/debug/ — minimal debug configs
configs/private/ — private configs submodule (internal)
examples/ — full example configs

Repository: PrimeIntellect-ai/prime-rl
