stars:1,162
forks:134
updated:March 8, 2026 at 03:42
SKILL.md
| name | experiment |
| description | Plan and run a series of training experiments, then compare results |
| disable-model-invocation | true |
| allowed-tools | Bash, Read, Glob, Grep |
| argument-hint | ["experiment description"] |
Plan, execute, and analyze a series of training runs based on the user's experiment description in $ARGUMENTS.
| Run | Name | Key Changes | Command |
|-----|------|-------------|---------|
CRITICAL: Run training jobs SEQUENTIALLY, one at a time. NEVER run jobs in parallel — the machine is compute-limited and parallel training will degrade performance for all runs.
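The sequential rule above could be enforced with a small driver like the following. This is only a sketch: `run_sequentially`, the `(name, argv)` pair format, and the result dictionary keys are hypothetical names chosen for illustration, and the demo commands are stand-ins rather than real training jobs. The 600-second default mirrors the 10-minute timeout the skill suggests.

```python
import subprocess
import sys

TIMEOUT_S = 600  # 600000 ms / 10 min, the timeout suggested by the skill


def run_sequentially(commands, timeout_s=TIMEOUT_S):
    """Run each (name, argv) pair to completion before starting the next.

    Hypothetical helper: subprocess.run blocks until the child exits,
    so two jobs can never overlap.
    """
    results = []
    for name, argv in commands:
        try:
            proc = subprocess.run(
                argv, capture_output=True, text=True, timeout=timeout_s
            )
            results.append({
                "name": name,
                "returncode": proc.returncode,
                "stdout": proc.stdout,
                "timed_out": False,
            })
        except subprocess.TimeoutExpired as exc:
            # Partial output may be bytes or None depending on the platform.
            out = exc.stdout or ""
            if isinstance(out, bytes):
                out = out.decode(errors="replace")
            results.append({
                "name": name,
                "returncode": None,
                "stdout": out,
                "timed_out": True,
            })
    return results


if __name__ == "__main__":
    # Stand-in commands; a real experiment would pass the training argv here.
    demo = [
        ("run_a", [sys.executable, "-c", "print('a done')"]),
        ("run_b", [sys.executable, "-c", "print('b done')"]),
    ]
    for result in run_sequentially(demo):
        print(result["name"], result["returncode"], result["stdout"].strip())
```

Because each call blocks, a failed or timed-out run is recorded and the loop simply moves on to the next one, which keeps the remaining runs comparable.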
For each run:
/train skill conventions:
- `RAY_ADDRESS= uv run python run_experiment.py train --env <ENV> ...`
- `--logdir /tmp/experiments/<experiment_name>/<run_name>` for organized output
- … `run_in_background`). Use a generous timeout (600000 ms / 10 min).

After each run completes, extract these metrics from the training stdout:
Per-iteration metrics (from the table printed each iteration):
- `Mean Eprew`: episode reward
- `Mean Eplen`: episode length
- `Actor loss`, `Critic loss`
- `Mean KL Div`: policy divergence
- `Mean Entropy`: exploration
- `Clip Fraction`: PPO clipping rate
- `Mean noise std`: action noise

Summary metrics (from eval and timing lines):
- `fps`: frames per second

Anomaly detection (flag these issues):
- `nan` or `inf` in any metric

For each completed run, report:
After all runs complete, produce a comparison summary:
Comparison table:
| Run | Final Reward | Peak Eval Reward | Peak Iter | Stable? | Key Hyperparam Diffs |
|-----|-------------|-----------------|-----------|---------|---------------------|
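One way the rows of this table might be generated from per-run summaries is sketched below. The function name `comparison_table` and the dictionary keys (`name`, `final_reward`, `peak_eval_reward`, `peak_iter`, `stable`, `diffs`) are assumptions made for illustration, and the values in the demo are made-up placeholders, not results from any real run.

```python
HEADER = "| Run | Final Reward | Peak Eval Reward | Peak Iter | Stable? | Key Hyperparam Diffs |"
SEPARATOR = "|-----|-------------|-----------------|-----------|---------|---------------------|"


def comparison_table(runs):
    """Render per-run summary dicts (hypothetical schema) as a Markdown table."""
    lines = [HEADER, SEPARATOR]
    for run in runs:
        lines.append(
            "| {name} | {final:.1f} | {peak:.1f} | {peak_iter} | {stable} | {diffs} |".format(
                name=run["name"],
                final=run["final_reward"],
                peak=run["peak_eval_reward"],
                peak_iter=run["peak_iter"],
                stable="yes" if run["stable"] else "no",
                diffs=run["diffs"],
            )
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    example_runs = [
        {"name": "baseline", "final_reward": 1.0, "peak_eval_reward": 2.5,
         "peak_iter": 50, "stable": True, "diffs": "none"},
    ]
    print(comparison_table(example_runs))
```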
Analysis:
- … `--n-itr 100-500` with `--eval-freq 50`
- … `--no-mirror`
- … `--num-procs` consistent across runs in the same experiment for fair FPS comparison
- … `gamma095`, `lr1e3`)
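The anomaly check from the metric-extraction step (flagging `nan` or `inf` in any metric) might look like the sketch below. It assumes the metrics have already been parsed out of the training stdout into a name-to-value mapping; the function name `non_finite_metrics` and that mapping are hypothetical, since the exact stdout format is not specified here.

```python
import math


def non_finite_metrics(metrics):
    """Return the names of metrics whose value is nan, inf, or -inf.

    Values may be numbers or numeric strings such as "nan" or "inf",
    as they might appear when scraped from stdout.
    """
    flagged = []
    for name, value in metrics.items():
        if not math.isfinite(float(value)):
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Placeholder values for illustration only.
    sample = {"Mean Eprew": 1.0, "Actor loss": float("nan"), "Critic loss": "inf"}
    print(non_finite_metrics(sample))
```

A run with any flagged metric would then be marked as unstable in the comparison table rather than silently included.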