change-grug

Stars: 1,129

Forks: 133

Updated: June 19, 2026 at 21:26

Modify or upstream a Grug/Grugformer experiment variant.

Source

marin-community

marin-community/marin

SKILL.md

name	change-grug
description	Modify or upstream a Grug/Grugformer experiment variant.

Skill: Changing Grug (Template-First)

Grug is intentionally template-first: the canonical edit surface lives in experiments/grug/base/, not in a shared levanter.grug trainer stack.

This skill covers two phases: trying a change in an experiment copy, and upstreaming it into the base template once it proves out.

Source Of Truth

Canonical template: experiments/grug/base/ — model.py, train.py, launch.py.
Variants: experiments/grug/<variant>/ — copy from base and modify locally (e.g. MoE).
One-off speedruns: experiments/speedrun/... — useful for exploration, not canonical.
Reference branch for array-stacked grug variant wiring: https://github.com/marin-community/marin/tree/codex/array-stacked-grug-variant-pointer — useful for perf-focused experiments, especially improving compile times and reducing peak HBM.

Workflow

1) Pick one change bucket

Keep each pass scoped to one bucket:

attention/masking
block wiring/norm ordering
MLP/activation
loss kernel behavior
optimizer/training loop behavior

2) Experiment in a copy

Copy experiments/grug/base to a new variant directory.
Keep edits local and explicit (copy/paste over abstraction).
Avoid introducing reusable framework surface unless there's clear repeated use.
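The copy step can be scripted. The following is a minimal sketch, not part of the repository: the helper name `copy_base_to_variant` is hypothetical, and it assumes only the directory layout described above (experiments/grug/base and experiments/grug/<variant>).

```python
import shutil
from pathlib import Path


def copy_base_to_variant(repo_root: Path, variant: str) -> Path:
    """Copy experiments/grug/base to experiments/grug/<variant>.

    Hypothetical helper for illustration; it refuses to overwrite an
    existing variant so earlier experiments are never clobbered.
    """
    grug_dir = repo_root / "experiments" / "grug"
    src = grug_dir / "base"
    dst = grug_dir / variant
    if not src.is_dir():
        raise FileNotFoundError(f"template not found: {src}")
    if dst.exists():
        raise FileExistsError(f"variant already exists: {dst}")
    # copytree creates dst itself; bytecode caches are skipped.
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns("__pycache__"))
    return dst
```

Calling `copy_base_to_variant(Path("."), "my-variant")` from the repository root would create the new directory; all later edits then happen inside that copy.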

3) Record the experiment

Update docs/reports/grug-archive.md with:

path
origin (base, moe, or another source variant)
commit SHA (when known)
purpose
status (active, superseded, or deleted)
diff link (prefer the CI-posted PR comment link; fall back to the local report path)
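To keep archive entries consistent, the fields can be collected in a small record before being written out. This is only a sketch: the `ArchiveEntry` class and the Markdown layout it emits are assumptions made for illustration, and the real docs/reports/grug-archive.md may use a different layout.

```python
from dataclasses import dataclass
from typing import Optional

# Status values listed in the skill text.
VALID_STATUSES = ("active", "superseded", "deleted")


@dataclass
class ArchiveEntry:
    """Hypothetical record holding the archive fields for one variant."""

    path: str
    origin: str
    purpose: str
    status: str
    diff_link: str
    commit_sha: Optional[str] = None  # recorded only when known

    def to_markdown(self) -> str:
        if self.status not in VALID_STATUSES:
            raise ValueError(f"status must be one of {VALID_STATUSES}")
        # Assumed layout: one heading per variant, one bullet per field.
        lines = [
            f"### {self.path}",
            f"- origin: {self.origin}",
            f"- commit: {self.commit_sha or 'unknown'}",
            f"- purpose: {self.purpose}",
            f"- status: {self.status}",
            f"- diff: {self.diff_link}",
        ]
        return "\n".join(lines)
```

Validating the status up front catches typos such as "superceded" before they reach the archive.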

For PRs that add a new experiments/grug/<variant>/, CI posts a visual diff comment automatically — copy that link into the archive entry.

For a local fallback, generate the diff report manually and link the report in the archive entry:

uv run python scripts/ci/grug_dir_diff.py \
  experiments/grug/base \
  experiments/grug/<variant> \
  --out /tmp/grug-diff

4) Upstream to base if it wins

Port the successful change back into experiments/grug/base/model.py, train.py, and launch.py. Keep it grug-style:

plain JAX arrays and explicit sharding
Equinox modules with init + __call__
minimal config knobs
legibility first; if a block gets hard to read, introduce a small local helper instead of framework indirection
when HBM is tight, use docs/references/hbm-optimization.md before bespoke memory hacks
when compile time or peak HBM is the bottleneck, evaluate an array-stacked variant first (see reference branch above)

5) Delete stale paths

After upstreaming, delete superseded experiment code; keep only the archive trail in docs/reports/grug-archive.md.

6) Validate

./infra/pre-commit.py --all-files
uv run pytest tests/test_grug_variant_contracts.py

Add focused tests for any behavior changes.
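The two validation commands can be wrapped so they run in order and stop at the first failure. This is a sketch under stated assumptions: `run_validation` is a hypothetical helper, the commands are the ones listed above, and it is meant to be run from the repository root.

```python
import subprocess

# The two checks from the Validate step, as argument lists.
VALIDATION_COMMANDS = [
    ["./infra/pre-commit.py", "--all-files"],
    ["uv", "run", "pytest", "tests/test_grug_variant_contracts.py"],
]


def run_validation(commands=VALIDATION_COMMANDS, runner=subprocess.run) -> bool:
    """Run each command in order; return False at the first non-zero exit.

    `runner` is injectable so the helper can be exercised without
    actually invoking pre-commit or pytest.
    """
    for cmd in commands:
        print("$", " ".join(cmd))
        result = runner(cmd)
        if result.returncode != 0:
            return False
    return True
```

Calling `run_validation()` with the defaults executes both checks; a `False` return means the change is not ready to upstream.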

This workflow is inspired by modded-nanogpt: iterate quickly in copy-paste experiments, then upstream only what stays simple and useful.
