| name | autoloop |
| description | Universal autonomous optimization loop. Use this skill when the user wants to: continuously optimize any metric (latency, bundle size, test coverage, benchmark scores, build time, prompt quality, algorithm speed), run overnight experiments, set up an autonomous improvement loop on any codebase, or apply the autoresearch pattern to a new domain. Also trigger when the user mentions "autoloop", "optimize in a loop", "run experiments overnight", "fitness function", "keep or discard", "autonomous optimization", or references Karpathy's autoresearch pattern for non-ML workloads. |
| argument-hint | [recipe name or custom description] |
Autoloop — Universal Autonomous Optimization
You are running an autonomous optimization loop — the generalized form of Karpathy's autoresearch. The pattern is simple and universal:
- Define genome files — the files you are allowed to modify
- Define a fitness command — a command that produces a measurable score
- Loop forever: mutate the genome, measure fitness, keep improvements, discard regressions
This works for ANY optimization problem that has a numeric score.
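As a rough illustration, the keep-or-discard rule can be sketched in Python. The names `is_improvement` and `run_loop` are hypothetical, and `propose` is a stand-in for "mutate the genome, then run the fitness command". The sketch is bounded so that it terminates; it is not the loop-runner's actual implementation.

```python
from typing import Callable, List, Tuple


def is_improvement(candidate: float, best: float, direction: str) -> bool:
    """Return True when candidate strictly beats best for the given direction."""
    if direction == "minimize":
        return candidate < best
    if direction == "maximize":
        return candidate > best
    raise ValueError(f"unknown direction: {direction!r}")


def run_loop(
    baseline: float,
    propose: Callable[[int], float],
    direction: str,
    max_experiments: int,
) -> Tuple[float, List[str]]:
    """Run a bounded keep-or-discard loop.

    propose(i) is a hypothetical callback that returns the score
    measured for experiment number i.
    """
    best = baseline
    statuses: List[str] = []
    for i in range(max_experiments):
        score = propose(i)
        if is_improvement(score, best, direction):
            best = score  # keep: the new score becomes the bar to beat
            statuses.append("keep")
        else:
            statuses.append("discard")  # discard: best stays unchanged
    return best, statuses


# Toy run: four made-up latency scores measured against a baseline of 100.0.
scores = [120.0, 95.0, 101.0, 90.0]
best, statuses = run_loop(100.0, lambda i: scores[i], "minimize", len(scores))
print(best, statuses)  # 90.0 ['discard', 'keep', 'discard', 'keep']
```

Ties count as discards here, which is one possible choice; a real loop might treat an equal score with simpler code as a win.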
Quick Start
When the user invokes /autoloop, follow this protocol:
1. Detect or Ask
Determine what to optimize. Check for signals:
- Is there an autoloop.config.json in the project root? If so, load it.
- Does the project have benchmarks, tests, or build scripts? Suggest a recipe.
- Did the user specify a recipe name? Load it from references/recipes/.
- Otherwise, ask the user three questions:
  - What files can I modify? (genome)
  - What command measures success? (fitness)
  - Should I minimize or maximize the score? (direction)
2. Configure
Create or update autoloop.config.json in the project root. See the schema below.
3. Branch
git checkout -b autoloop/<tag>
Use a tag based on today's date and the optimization target (e.g., autoloop/mar21-latency).
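One way to build such a branch name is sketched below. `branch_name` is a hypothetical helper, and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) is an assumption rather than something the skill prescribes. The date in the demo is arbitrary.

```python
import datetime
import re

# Explicit month abbreviations avoid any dependence on the system locale.
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def branch_name(target: str, today: datetime.date) -> str:
    """Build autoloop/<tag> from a date and an optimization target."""
    # Assumed slug rule: lowercase, runs of other characters become "-".
    slug = re.sub(r"[^a-z0-9]+", "-", target.lower()).strip("-")
    tag = f"{MONTHS[today.month - 1]}{today.day:02d}-{slug}"
    return f"autoloop/{tag}"


print(branch_name("latency", datetime.date(2025, 3, 21)))  # autoloop/mar21-latency
```

The result can be passed straight to `git checkout -b`.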
4. Baseline
Run the fitness command once without modifications. Record the result as the baseline in the results log.
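A minimal sketch of one measurement follows, assuming the metric is read from the combined stdout and stderr of the fitness command. `extract_score` and `measure` are hypothetical names, and the sample output and pattern are invented for the demo.

```python
import re
import subprocess


def extract_score(output: str, metric_pattern: str) -> float:
    """Pull the numeric score out of command output using one capture group."""
    regex = re.compile(metric_pattern)
    if regex.groups != 1:
        raise ValueError("metric_pattern must contain exactly one capture group")
    match = regex.search(output)
    if match is None:
        raise ValueError("metric_pattern did not match the fitness output")
    return float(match.group(1))


def measure(command: str, metric_pattern: str, timeout_seconds: int = 300) -> float:
    """Run the fitness command in a shell and return its score.

    subprocess.run raises subprocess.TimeoutExpired if the command
    runs longer than timeout_seconds.
    """
    result = subprocess.run(
        command,
        shell=True,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
    return extract_score(result.stdout + result.stderr, metric_pattern)


# Demo on a canned output string (invented), so nothing is executed here.
sample_output = "requests: 1000\nlatency_ms: 42.5\n"
print(extract_score(sample_output, r"latency_ms:\s*([0-9.]+)"))  # 42.5
```

Calling `measure` on the unmodified checkout would give the baseline score; failing loudly when the pattern does not match keeps a broken benchmark from being logged as a result.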
5. Loop
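Hand off to the loop-runner agent. It runs autonomously, without pausing for confirmation, until budget.max_experiments is reached (0 means no limit) or the user interrupts it.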
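Every experiment, the baseline included, ends with a row in the results log. The sketch below appends rows to a TSV file using the default column names. `append_result` is a hypothetical helper, the commit hashes and descriptions are placeholders, and the status values "baseline" and "keep" are assumed labels.

```python
import csv
import os
import tempfile

COLUMNS = ["commit", "score", "status", "description"]


def append_result(path: str, row: dict) -> None:
    """Append one experiment row, writing the header if the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
        )
        if is_new:
            writer.writeheader()
        writer.writerow(row)


# Demo in a temporary directory so no project files are touched.
log_path = os.path.join(tempfile.mkdtemp(), "autoloop-results.tsv")
append_result(log_path, {"commit": "abc1234", "score": 42.5,
                         "status": "baseline", "description": "unmodified code"})
append_result(log_path, {"commit": "def5678", "score": 40.1,
                         "status": "keep", "description": "cache parsed config"})
with open(log_path) as handle:
    print(handle.read(), end="")
```

Appending rather than rewriting means an interrupted run still leaves a usable log.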
Config Schema
{
  "name": "string: human-readable name for this optimization run",
  "genome": {
    "files": ["glob patterns for files the agent may modify"],
    "constraints": ["natural language rules the agent must follow"]
  },
  "fitness": {
    "command": "shell command that produces output containing the metric",
    "metric_pattern": "regex with one capture group extracting the numeric score",
    "direction": "minimize | maximize",
    "timeout_seconds": 300
  },
  "budget": {
    "time_per_experiment": 300,
    "max_experiments": 0
  },
  "logging": {
    "file": "autoloop-results.tsv",
    "columns": ["commit", "score", "status", "description"]
  },
  "guardrails": {
    "max_score_regression": 0.0,
    "required_tests": "shell command that must pass before fitness is measured",
    "max_files_changed": 10
  }
}
Field Reference
| Field | Required | Description |
|---|---|---|
| name | Yes | Descriptive name for the run |
| genome.files | Yes | Glob patterns for modifiable files. Everything else is off-limits. |
| genome.constraints | No | Natural-language rules (e.g., "Do not change the public API") |
| fitness.command | Yes | Command whose stdout/stderr contains the metric |
| fitness.metric_pattern | Yes | Regex with one capture group () that extracts the numeric value |
| fitness.direction | Yes | "minimize" (lower is better) or "maximize" (higher is better) |
| fitness.timeout_seconds | No | Kill the fitness command after this many seconds (default: 300) |
| budget.time_per_experiment | No | Expected wall-clock seconds per experiment (for estimation) |
| budget.max_experiments | No | Stop after N experiments. 0 = infinite (default) |
| logging.file | No | Path to TSV results log (default: autoloop-results.tsv) |
| logging.columns | No | Column names for the TSV (default shown above) |
| guardrails.required_tests | No | Command that must exit 0 before fitness is measured |
| guardrails.max_score_regression | No | Maximum allowed regression from best score (for multi-metric) |
| guardrails.max_files_changed | No | Refuse to run if more than N files are modified in one experiment |
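The required fields and defaults listed here could be enforced with a small normalization step, sketched below. `normalize_config` is a hypothetical helper, and the example values (glob, command, pattern) are invented. In practice the dict would come from `json.load` on autoloop.config.json.

```python
import copy

# Defaults taken from the field reference.
DEFAULTS = {
    "fitness": {"timeout_seconds": 300},
    "budget": {"max_experiments": 0},
    "logging": {
        "file": "autoloop-results.tsv",
        "columns": ["commit", "score", "status", "description"],
    },
}

# Dotted paths of the fields marked as required.
REQUIRED = [
    ("name",),
    ("genome", "files"),
    ("fitness", "command"),
    ("fitness", "metric_pattern"),
    ("fitness", "direction"),
]


def normalize_config(raw: dict) -> dict:
    """Validate required fields and fill in documented defaults."""
    config = copy.deepcopy(raw)
    for path in REQUIRED:
        node = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                raise ValueError("missing required field: " + ".".join(path))
            node = node[key]
    if config["fitness"]["direction"] not in ("minimize", "maximize"):
        raise ValueError("fitness.direction must be 'minimize' or 'maximize'")
    for section, values in DEFAULTS.items():
        target = config.setdefault(section, {})
        for key, value in values.items():
            target.setdefault(key, copy.deepcopy(value))
    return config


# Invented example config with only the required fields set.
example = {
    "name": "api latency",
    "genome": {"files": ["src/**/*.ts"]},
    "fitness": {
        "command": "npm run bench",
        "metric_pattern": r"latency_ms:\s*([0-9.]+)",
        "direction": "minimize",
    },
}
config = normalize_config(example)
print(config["fitness"]["timeout_seconds"], config["logging"]["file"])
# 300 autoloop-results.tsv
```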
Recipes
Pre-built configurations for common domains. Load with /autoloop <recipe>.
| Recipe | Optimizes | Typical Fitness |
|---|---|---|
| web-perf | Lighthouse / Core Web Vitals scores | npx lighthouse --output json |
| api-latency | HTTP endpoint response time | wrk or autocannon benchmarks |
| test-coverage | Code coverage percentage | jest --coverage, pytest --cov |
| prompt-engineering | LLM output quality via eval harness | Custom eval script |
| build-optimization | Build time or artifact size | time npm run build |
| algorithm-speed | Benchmark execution time | hyperfine, pytest-benchmark |
| bundle-size | JavaScript bundle size in bytes | size-limit or webpack stats |
| security-scan | Security finding count | npm audit, bandit, semgrep |
| custom | Anything (user defines everything) | User-provided |
Multi-Metric Strategy
Sometimes you want to optimize one metric while keeping others from regressing. Use guardrails.required_tests for hard constraints and track secondary metrics in the log.
See references/strategies.md for advanced patterns:
- Pareto optimization (multiple fitness functions)
- Staged optimization (optimize A first, then B)
- Constrained optimization (optimize A subject to B > threshold)
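To make the constrained case concrete, here is a small sketch of an acceptance rule for "optimize A subject to B > threshold". It is an illustration only and is not taken from references/strategies.md. The function name `accept` and the numbers in the demo are hypothetical.

```python
def accept(
    primary: float,
    best_primary: float,
    secondary: float,
    threshold: float,
    direction: str,
) -> bool:
    """Keep a candidate only if B clears the threshold and A improves."""
    if secondary <= threshold:
        return False  # hard constraint violated, discard regardless of A
    if direction == "minimize":
        return primary < best_primary
    if direction == "maximize":
        return primary > best_primary
    raise ValueError(f"unknown direction: {direction!r}")


# Invented example: shrink bundle size (A) while coverage (B) stays above 80.
print(accept(180_000, 200_000, 85.0, 80.0, "minimize"))  # True
print(accept(150_000, 200_000, 78.0, 80.0, "minimize"))  # False
```

Checking the constraint first means a large gain in A can never buy its way past a failed B.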
The Autoresearch Lineage
This plugin generalizes the pattern from Karpathy's autoresearch:
| Autoresearch | Autoloop |
|---|---|
| train.py | Any genome files |
| val_bpb | Any fitness metric |
| uv run train.py | Any fitness command |
| ML hyperparameters | Any code modifications |
| GPU training | Any compute workload |
The core loop is identical; the abstraction extends it to any workload that produces a numeric score.