parameter-golf-submission

Stars: 317

Forks: 25

Updated: May 26, 2026 at 06:30

Prepare and validate Parameter Golf record folders: self-contained train_gpt.py, README.md, submission.json, FineWeb SP1024 BPB accounting, artifact-size logging, run logs, and PR-ready folder hygiene.

Source: mkurman/zorai

SKILL.md

name	parameter-golf-submission
description	Prepare and validate Parameter Golf record folders: self-contained train_gpt.py, README.md, submission.json, FineWeb SP1024 BPB accounting, artifact-size logging, run logs, and PR-ready folder hygiene.
tags	["parameter-golf","competition","fineweb","bpb","model-craft","submission"]

Parameter Golf Submission

Use this skill when creating or reviewing a Parameter Golf submission folder, independent of the cloud provider used for the run.

Record Folder Contract

A submission folder must contain:

records/<track>/<submission-name>/
  README.md
  submission.json
  train_gpt.py
  train.log          # after a real run

train_gpt.py must compile and run from inside this folder in a clean Parameter Golf checkout.
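
The folder contract above can be checked mechanically before opening a PR. The following is a minimal sketch, not part of the official tooling: the function name `check_record_folder` and its `after_run` flag are hypothetical, and the syntax check uses the built-in `compile()` as a stand-in for `python -m py_compile` so that no `.pyc` file is written into the record folder.

```python
from pathlib import Path

# File names taken from the record folder contract; train.log only exists after a real run.
REQUIRED_FILES = ("README.md", "submission.json", "train_gpt.py")
POST_RUN_FILES = ("train.log",)


def check_record_folder(folder, after_run=False):
    """Return a list of problems; an empty list means the folder passed this check."""
    folder = Path(folder)
    expected = REQUIRED_FILES + (POST_RUN_FILES if after_run else ())
    problems = [f"missing {name}" for name in expected if not (folder / name).is_file()]

    script = folder / "train_gpt.py"
    if script.is_file():
        try:
            # Syntax-only check: the script is parsed and compiled, never executed.
            compile(script.read_text(encoding="utf-8"), str(script), "exec")
        except (SyntaxError, ValueError) as exc:
            problems.append(f"train_gpt.py does not compile: {exc}")
    return problems
```

This only covers the "compile" half of the contract. Whether the script actually runs from inside the folder in a clean checkout still has to be confirmed by running it.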

Competition Constraints To Preserve

Artifact cap: 16,000,000 decimal bytes.
Training cap for leaderboard records: 10 minutes on 8xH100 SXM-class hardware.
Evaluation metric: FineWeb validation bits per byte (val_bpb).
No validation-set leakage. If test-time training is implemented, it may only use validation tokens that have already been scored.
No hidden downloads/network calls during evaluation.
No local repository imports unless included and counted in the record folder.
If tokenizer changes, prove BPB accounting carefully; stock SP1024 is safest for first participation.
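
Because the metric is bits per byte rather than per-token loss, the conversion depends on both the token count and the byte count of the scored text. The sketch below shows the arithmetic only. It assumes `val_loss_nats` is the mean cross-entropy per token in nats and that `num_bytes` comes from the tokenizer's byte accounting; the function name and parameters are illustrative, and how the byte count is obtained is not shown here.

```python
import math


def bits_per_byte(val_loss_nats, num_tokens, num_bytes):
    """Convert a mean per-token loss in nats to bits per byte.

    total_bits = val_loss_nats * num_tokens / ln(2)
    val_bpb    = total_bits / num_bytes
    """
    if num_tokens <= 0 or num_bytes <= 0:
        raise ValueError("token and byte counts must be positive")
    total_bits = val_loss_nats * num_tokens / math.log(2)
    return total_bits / num_bytes
```

For example, a loss of ln(2) nats per token is one bit per token, so 1,000 tokens covering 4,000 bytes give 0.25 bits per byte. A tokenizer that packs more bytes into each token lowers `num_tokens / num_bytes`, which is why a tokenizer change needs its byte accounting proven.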

Self-Contained Script Checklist

Before running:

python -m py_compile train_gpt.py passes.
Imports are standard/allowed environment packages only (torch, numpy, sentencepiece, etc.).
DATA_PATH and TOKENIZER_PATH are env-configurable.
Script loads fineweb_train_*.bin and fineweb_val_*.bin with the Parameter Golf binary header format.
Script computes validation BPB from SentencePiece byte accounting, not just token loss.
Script logs parameter count and artifact-size estimate.
Script writes a compressed artifact, usually final_model.int8.ptz.
Script reloads/dequantizes the compressed artifact and evaluates the round-trip model.
Final log includes final_int8_zlib_roundtrip_exact.
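
The last three checklist items describe a quantize, compress, reload, evaluate round trip. The following is a minimal standard-library sketch of that idea, assuming symmetric per-tensor int8 quantization and a made-up byte layout (an 8-byte scale followed by the int8 payload). It is not the format of final_model.int8.ptz, which is not specified here, and a real script would operate on torch tensors rather than Python lists.

```python
import array
import struct
import zlib


def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q, q in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = array.array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return scale, q


def pack(scale, q):
    """Serialize the scale plus the int8 payload, then zlib-compress the result."""
    return zlib.compress(struct.pack("<d", scale) + q.tobytes(), level=9)


def unpack(blob):
    """Decompress and dequantize; these are the weights that should be evaluated."""
    raw = zlib.decompress(blob)
    (scale,) = struct.unpack("<d", raw[:8])
    q = array.array("b")
    q.frombytes(raw[8:])
    return [scale * v for v in q]


weights = [0.5, -1.0, 0.25, 0.0]
scale, q = quantize_int8(weights)
blob = pack(scale, q)
restored = unpack(blob)
print("compressed bytes:", len(blob))
print("max abs error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

The point of the round trip is that the reported metric is computed from `restored`, not from the original weights, so any quantization loss is already included in the number.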

README Contents

The README must include:

short architecture summary
dataset/tokenizer used
exact command
run hardware and time budget
final metrics after run
artifact-size line after run
caveats if the run is smoke/non-record/pending verification

submission.json Contents

Use actual values after the run, not placeholders:

{
  "run_name": "...",
  "author": "...",
  "github_id": "...",
  "track": "track_10min_16mb or track_non_record_16mb",
  "val_bpb": 1.2345,
  "val_loss": 2.1234,
  "artifact_size_bytes": 12345678,
  "command": "...",
  "status": "completed_non_record"
}

Add architecture fields as useful, but avoid claiming record eligibility unless the log proves it.
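
Leftover placeholders are easy to miss in review, so a small validator can help. This is a sketch under assumptions: the required keys are simply the ones in the example above, `validate_submission` is a hypothetical helper, and `artifact_size_bytes` is assumed to be the total that counts against the 16,000,000-byte cap.

```python
import json

ARTIFACT_CAP_BYTES = 16_000_000  # artifact cap from the competition constraints
REQUIRED_KEYS = (
    "run_name", "author", "github_id", "track",
    "val_bpb", "val_loss", "artifact_size_bytes", "command", "status",
)


def validate_submission(text):
    """Return a list of problems found in the text of a submission.json file."""
    data = json.loads(text)
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in data]
    for key, value in data.items():
        # "..." and empty strings are treated as leftover placeholders.
        if isinstance(value, str) and value.strip() in ("", "..."):
            problems.append(f"placeholder value for {key}")
    for key in ("val_bpb", "val_loss"):
        value = data.get(key)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if key in data and (not is_number or value <= 0):
            problems.append(f"{key} must be a positive number")
    if "artifact_size_bytes" in data:
        size = data["artifact_size_bytes"]
        if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
            problems.append("artifact_size_bytes must be a positive integer")
        elif size > ARTIFACT_CAP_BYTES:
            problems.append(f"artifact_size_bytes {size} exceeds cap {ARTIFACT_CAP_BYTES}")
    return problems
```

The check cannot tell whether the numbers match the log; it only catches values that are structurally wrong or obviously unfilled.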

Post-Run Extraction

After a run, extract these lines:

grep -E "final_int8_zlib_roundtrip_exact|Total submission size int8\+zlib|stopping_early|train_time|model_params" train.log

Update:

submission.json.val_bpb
submission.json.val_loss
submission.json.artifact_size_bytes
README metrics section
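
The same extraction can be done from Python when the values are going to be copied into submission.json by a script. The sketch below reuses the grep alternatives verbatim and returns the raw matching lines only. It deliberately does not parse numbers out of them, because the exact log line format is not specified here; `extract_summary_lines` is a hypothetical helper name.

```python
import re

# Same alternatives as the grep -E pattern in this section.
MARKERS = re.compile(
    r"final_int8_zlib_roundtrip_exact|Total submission size int8\+zlib|"
    r"stopping_early|train_time|model_params"
)


def extract_summary_lines(log_text):
    """Return the log lines that contain any marker, in file order."""
    return [line for line in log_text.splitlines() if MARKERS.search(line)]
```

A typical call would be `extract_summary_lines(open("train.log", encoding="utf-8").read())`, run from inside the record folder.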

Status Labels

Use a precise status label:

prepared_pending_run: folder created, no real run yet
smoke_passed: short/non-final run passed
completed_non_record: full run but not leaderboard-valid or not SOTA
completed_record_candidate: 8xH100 10-minute compliant run with full log and artifact under cap
failed: include failure reason and last good checkpoint/log line
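
One way to keep the label honest is to derive it from recorded facts about the run instead of choosing it by hand. The sketch below encodes the label definitions above as a decision function. `RunFacts` and all of its field names are hypothetical, and the values must be filled in from the actual log and hardware, including the judgment of whether the result beats the current best score.

```python
from dataclasses import dataclass

ARTIFACT_CAP_BYTES = 16_000_000       # artifact cap from the competition constraints
RECORD_TIME_CAP_SECONDS = 10 * 60     # training cap for leaderboard records


@dataclass
class RunFacts:
    """Hypothetical summary of a run, filled in from train.log and the run setup."""
    has_run: bool = False
    failed: bool = False
    full_run: bool = False            # False for smoke or shortened runs
    on_8xh100: bool = False
    train_seconds: float = 0.0
    total_bytes: int = 0              # code + compressed artifact
    has_full_log: bool = False
    beats_current_best: bool = False


def choose_status(facts):
    """Map run facts to one of the status labels listed above."""
    if not facts.has_run:
        return "prepared_pending_run"
    if facts.failed:
        return "failed"
    if not facts.full_run:
        return "smoke_passed"
    record_valid = (
        facts.on_8xh100
        and facts.train_seconds <= RECORD_TIME_CAP_SECONDS
        and 0 < facts.total_bytes <= ARTIFACT_CAP_BYTES
        and facts.has_full_log
        and facts.beats_current_best
    )
    return "completed_record_candidate" if record_valid else "completed_non_record"
```

Every field defaults to the conservative value, so an incompletely filled `RunFacts` can never produce `completed_record_candidate`.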

Common Failure Modes

Accidentally importing local model code (from src...) not present in record folder.
Forgetting to copy logs/<RUN_ID>.txt into the record folder as train.log.
Reporting pre-quant BPB instead of int8 round-trip BPB.
Exceeding the 16,000,000-byte cap after counting code + compressed artifact.
Running on 1 GPU and calling it leaderboard-valid.
Using a custom tokenizer without exact byte accounting proof.
