Run any Skill in Manus with one click

$pwd:

lab-autoresearch

Name: Lab Autoresearch
Author: oliver-kriska

// Self-improving loop for plugin skills. Reads program.md, proposes one mutation per iteration, evaluates against deterministic scorer, keeps improvements via git, reverts failures. Targets weakest skill+dimension. Use with /loop for overnight runs.

$ git log --oneline --stat

stars:351

forks:25

updated: May 27, 2026, 18:29

File explorer

13 files

SKILL.md

readonly

name	lab:autoresearch
description	Self-improving loop for plugin skills. Reads program.md, proposes one mutation per iteration, evaluates against deterministic scorer, keeps improvements via git, reverts failures. Targets weakest skill+dimension. Use with /loop for overnight runs.
effort	high
argument-hint	[--skill NAME] [--strategy targeted|sweep|random] [--dry-run] [--max-iterations N]
disable-model-invocation	true

Autoresearch — Plugin Skill Self-Improvement

Iteratively improve plugin skills via the autoresearch pattern: propose one mutation -> eval -> keep/revert -> repeat.
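The propose -> eval -> keep/revert cycle can be illustrated with a toy example. This is only a sketch of the general pattern, not the skill's implementation: the `score` function, the integer "parameters", and the rule that ties are reverted are all assumptions made for illustration.

```python
import random


def score(params):
    # Hypothetical deterministic scorer: 1.0 when every value equals 10.
    return 1.0 - sum(abs(10 - p) for p in params) / (10 * len(params))


def run_loop(params, iterations, seed=0):
    rng = random.Random(seed)  # seeded so the run is reproducible
    params = list(params)
    best = score(params)
    for _ in range(iterations):
        i = rng.randrange(len(params))
        old = params[i]
        params[i] = old + rng.choice([-1, 1])  # exactly ONE mutation
        new = score(params)
        if new > best:
            best = new       # keep the improvement
        else:
            params[i] = old  # revert: no improvement (assumption: ties revert)
    return params, best
```

Because the scorer is deterministic and every non-improving change is undone, the returned score can never be lower than the starting one.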

Usage

/lab:autoresearch                           # Targeted: attack weakest skill+dimension
/lab:autoresearch --skill review            # Focus on one skill
/lab:autoresearch --strategy sweep          # Process all skills alphabetically
/lab:autoresearch --dry-run                 # Show what would change, don't commit

For overnight runs:

/loop 5m /lab:autoresearch --strategy sweep --max-iterations 200

Iron Laws

ONE mutation per iteration — if description needs "and", split into two
NEVER mutate read-only files — check program.md before every write
EVAL is deterministic — always use the wrapper script, never LLM-judge
REVERT on regression OR checks failure — no exceptions
LOG every iteration — use keep or revert command (never skip)
CHECK ideas.md before proposing — don't rediscover known optimizations

Wrapper Script Commands

All eval/git/journal operations go through ONE script. Do NOT run these manually.

# Find the weakest skill+dimension
python3 lab/autoresearch/scripts/run-iteration.py target --strategy targeted

# Score a skill (before mutation, to get baseline)
python3 lab/autoresearch/scripts/run-iteration.py score <skill-name>

# After mutation: score + checks + compare → verdict (KEEP or REVERT)
python3 lab/autoresearch/scripts/run-iteration.py eval <skill-name>

# Act on verdict:
python3 lab/autoresearch/scripts/run-iteration.py keep <skill> <dim> <old> <new> \
  --desc "what changed" --asi '{"hypothesis": "why", "mechanism": "how"}'

python3 lab/autoresearch/scripts/run-iteration.py revert <skill> <dim> <old> <new> \
  --desc "what was attempted" --asi '{"hypothesis": "why", "regression": "what broke", "avoid": "do not retry this"}'

# Check overall progress
python3 lab/autoresearch/scripts/run-iteration.py status

Core Loop (ONE iteration)

Step 1: Read State

Read lab/autoresearch/program.md (goals, mutable surface, rules)
Read lab/autoresearch/ideas.md if it exists (deferred optimizations)
Run: python3 lab/autoresearch/scripts/run-iteration.py status

Step 2: Select Target

Run: python3 lab/autoresearch/scripts/run-iteration.py target --strategy targeted

Parse the JSON: skill, dimension, failing_checks. If all_perfect → STOP.
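A minimal sketch of this parsing step, assuming the `target` command prints a single flat JSON object. The field names `skill`, `dimension`, `failing_checks`, and `all_perfect` come from the text above; the exact shape of the output and the sample values below are assumptions.

```python
import json


def parse_target(raw):
    """Return (skill, dimension, failing_checks), or None when there is nothing left to fix."""
    data = json.loads(raw)
    if data.get("all_perfect"):
        return None  # signal to the caller that the loop should stop
    # Assumption: failing_checks is a JSON array; default to empty if absent.
    return data["skill"], data["dimension"], list(data.get("failing_checks", []))


# Hypothetical sample output, for illustration only.
sample = '{"skill": "review", "dimension": "clarity", "failing_checks": ["c1"]}'
print(parse_target(sample))
```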

Step 3: Read + Propose

Read target SKILL.md and its references/ listing
Read eval definition from lab/eval/evals/{skill}.json
Check ideas.md for deferred ideas about this skill
Check recent journal entries for prior failures on this skill (avoid repeats)
Consult ${CLAUDE_SKILL_DIR}/references/mutation-strategies.md
Propose exactly ONE change targeting the failing checks

Step 4: Apply + Evaluate

Apply the mutation via Edit tool
Run: python3 lab/autoresearch/scripts/run-iteration.py eval <skill-name>
Parse JSON → check verdict field

Step 5: Keep or Revert

If verdict is KEEP:

python3 lab/autoresearch/scripts/run-iteration.py keep <skill> <dim> <old> <new> \
  --desc "..." --asi '{"hypothesis": "...", "mechanism": "..."}'

If verdict is REVERT:

python3 lab/autoresearch/scripts/run-iteration.py revert <skill> <dim> <old> <new> \
  --desc "..." --asi '{"hypothesis": "...", "regression": "...", "avoid": "..."}'
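Hand-writing the `--asi` JSON inside shell quotes is easy to get wrong. The sketch below builds the keep/revert argument list from the verdict and serializes the payload with `json.dumps`. The subcommands and flags are the ones shown above; the helper names are hypothetical, and `<old>` and `<new>` are passed through as opaque strings because the source does not define them further.

```python
import json
import shlex

SCRIPT = "lab/autoresearch/scripts/run-iteration.py"


def build_command(verdict, skill, dim, old, new, desc, asi):
    """Map a verdict to the matching keep/revert argv (hypothetical helper)."""
    actions = {"KEEP": "keep", "REVERT": "revert"}
    if verdict not in actions:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    return [
        "python3", SCRIPT, actions[verdict],
        skill, dim, str(old), str(new),
        "--desc", desc,
        "--asi", json.dumps(asi),  # valid JSON regardless of quotes in the values
    ]


def as_shell(cmd):
    # Render the argv as a single, safely quoted shell line.
    return shlex.join(cmd)
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting altogether; `as_shell` is only useful for logging or pasting.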

Step 6: Ideas Backlog

If during analysis you discovered a promising optimization you can't act on now:

Append it to lab/autoresearch/ideas.md as a bullet
On next resume: prune stale/tried ideas, experiment with the rest
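Appending to the backlog can be sketched as follows. The bullet format (`- ` prefix) and the duplicate check are assumptions; the source only says ideas are appended to `ideas.md` as bullets.

```python
from pathlib import Path


def append_idea(path, idea):
    """Append one bullet to the ideas file; return False if it is already listed."""
    path = Path(path)
    bullet = f"- {idea.strip()}"
    existing = path.read_text(encoding="utf-8") if path.exists() else ""
    if bullet in existing.splitlines():
        return False  # assumption: skip exact duplicates
    # Make sure the new bullet starts on its own line.
    sep = "" if existing == "" or existing.endswith("\n") else "\n"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"{sep}{bullet}\n")
    return True
```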

Step 7: Continue or Stop

All targets >= 0.95? Print "AUTORESEARCH_COMPLETE"
Max iterations reached? Print "AUTORESEARCH_COMPLETE"
50 consecutive discards? Print "AUTORESEARCH_STUCK"
Otherwise: immediately start Step 1 again
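The stop conditions above can be expressed as one decision function. This is a sketch: the 0.95 threshold, the 50-discard limit, and the two sentinel strings come from the list above, while the function name, its parameters, and the `"CONTINUE"` return value are hypothetical.

```python
def next_action(scores, iteration, max_iterations, consecutive_discards,
                threshold=0.95, stuck_after=50):
    """Decide what to do after an iteration, checking conditions in the listed order."""
    if all(s >= threshold for s in scores.values()):
        return "AUTORESEARCH_COMPLETE"  # every target meets the threshold
    if iteration >= max_iterations:
        return "AUTORESEARCH_COMPLETE"  # iteration budget exhausted
    if consecutive_discards >= stuck_after:
        return "AUTORESEARCH_STUCK"     # too many reverts in a row
    return "CONTINUE"                   # hypothetical label: go back to Step 1
```

Note that an empty `scores` mapping counts as complete, since `all()` of an empty sequence is true.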

References

${CLAUDE_SKILL_DIR}/references/mutation-strategies.md — mutation type catalog
${CLAUDE_SKILL_DIR}/references/state-management.md — git protocol, journaling
lab/autoresearch/program.md — research agenda (read every iteration)

related-skills.json

same repository

session-scan.md

from "oliver-kriska/claude-elixir-phoenix"

Compute metrics for Claude Code sessions. Discovers via ccrider, filters trivial, computes friction/opportunity/fingerprint scores. Use for broad session triage.

2026-05-25 (stars: 351)

phx-audit.md

from "oliver-kriska/claude-elixir-phoenix"

Project health audit and health check — architecture, performance, tests, dependencies, code quality. Use when assessing overall project health, before releases, or after refactors.

2026-05-25 (stars: 351)

phx-permissions.md

from "oliver-kriska/claude-elixir-phoenix"

Recommend safe Bash permissions for Elixir mix commands in settings.json. Use when permission prompts slow workflow, "fix permissions", "reduce prompts", "auto-allow mix".

2026-05-20 (stars: 351)

phx-brainstorm.md

from "oliver-kriska/claude-elixir-phoenix"

Brainstorm Elixir/Phoenix features — explore ideas, compare approaches, gather requirements. Use when vague idea, not sure how to approach, or want to discuss before plan.

2026-05-20 (stars: 351)

phx-perf.md

from "oliver-kriska/claude-elixir-phoenix"

Analyze Elixir/Phoenix performance — N+1 queries, assign bloat, ecto optimization, genserver bottlenecks. Use when slowness, timeouts, or high memory reported.

2026-05-20 (stars: 351)

phx-pr-review.md

from "oliver-kriska/claude-elixir-phoenix"

Address PR review comments on Elixir/Phoenix code — fetch comments, draft responses, optionally fix code. Use when the user shares a PR URL or mentions reviewer feedback.

2026-05-20 (stars: 351)

package.json

"author": "oliver-kriska"

"repository": "oliver-kriska/claude-elixir-phoenix"
