loop-supervisor

Updated: June 20, 2026, 23:00

Scaffold a SUPERVISOR.md runbook for watching a long-running agent loop in tmux. Use when asked to "supervise a loop", "watch a loop", "babysit a loop", "set up a supervisor", or when a task-loop / rl run needs someone operating the harness around it.

Source: connorads/dotfiles (creator: connorads)

name	loop-supervisor
description	Scaffold a SUPERVISOR.md runbook for watching a long-running agent loop in tmux. Use when asked to "supervise a loop", "watch a loop", "babysit a loop", "set up a supervisor", or when a task-loop / rl run needs someone operating the harness around it.

Loop Supervisor

Scaffold a SUPERVISOR.md runbook that captures how to supervise a specific agent loop. A fresh agent session (or a future you) reads that file, launches or attaches to the loop in tmux, watches state files, and intervenes sparingly per a project-specific taxonomy.

The supervisor's golden rule is always the same shape: operate the harness, don't do the inner loop's work. What counts as "harness" vs "inner work" is project-specific — that's what discovery pins down.

When to use

Invoke when the user has (or is about to have):

A task-loop / ralph-loop / rl-style outer runner driving fresh agent sessions against a PROMPT.md or equivalent contract
State artefacts like run-log.md, loop-state.md, backlog.md, or a domain-specific index (hypothesis tree, frontier state, etc.)
A tmux session (or intent to launch one) where the loop runs

The output is a single file: TASKS/<name>/SUPERVISOR.md, co-located with the loop's other artefacts. Git-tracked — it's a contract that evolves with the project between runs, not ephemeral state.

What SUPERVISOR.md contains

Eight sections. The taxonomy and golden rule are written out in full so the file stands alone; the mechanics are covered by reference (the runbook trusts that the consumer has the tmux skill loaded rather than inlining capture-pane syntax).

Use references/runbook-template.md as the skeleton when generating the file.

Role + golden rule — one line setting the frame.
Mission + stop conditions — what success looks like, what exhaustion looks like, when to stop.
State files to watch — paths the supervisor reads each cycle.
Intervention taxonomy — project-specific triggers + responses.
Out-of-scope / don't-touch — the inner loop's domain.
Budgets — max interventions, poll cadence.
Escalation — Ctrl-C the loop pane, explain in the final message. No dedicated escalation file; the supervisor's last chat turn is the report.
Launch — tmux session name + launch command. Consumer runs tmux has-session -t <name> first; launches if absent, attaches if present (safety against double-start).
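The double-start guard in the Launch section can be sketched as below. This is only an illustration of the check a consumer would perform, since the skill itself bundles no scripts. The function names and the injectable `runner` parameter are assumptions made here so the logic can be exercised without a real tmux server; the tmux subcommands used are `has-session` and `new-session`.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code, discarding its output."""
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False
    )
    return completed.returncode


def session_exists(name: str, runner: Runner = run_command) -> bool:
    """`tmux has-session` exits 0 when the session exists, non-zero otherwise."""
    return runner(["tmux", "has-session", "-t", name]) == 0


def launch_or_attach(name: str, launch_command: str, runner: Runner = run_command) -> str:
    """Start the loop in a detached session only if no session has that name."""
    if session_exists(name, runner):
        return "attach"  # a loop is already running; never start a second one
    runner(["tmux", "new-session", "-d", "-s", name, launch_command])
    return "launched"
```

A consumer would call something like `launch_or_attach("<name>-loop", "<launch command from the runbook>")` and then attach or begin watching state files, depending on the returned value.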

Process

1. Locate the loop

Find the loop this supervisor will watch:

Glob TASKS/*/ for PROMPT.md + run-log.md pairs (task-loop shape)
If multiple loops exist, ask which one
If none, ask whether to scaffold one first via /task-loop — don't try to supervise a loop that doesn't exist yet
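As a rough sketch of this step, the glob for PROMPT.md + run-log.md pairs and the three resulting branches might look like the following. The function names and the returned action labels are hypothetical, chosen for illustration only.

```python
from pathlib import Path
from typing import List


def find_loops(repo_root: Path) -> List[Path]:
    """Return TASKS/<name>/ directories that hold both PROMPT.md and run-log.md."""
    tasks_dir = repo_root / "TASKS"
    if not tasks_dir.is_dir():
        return []
    return sorted(
        d for d in tasks_dir.iterdir()
        if d.is_dir() and (d / "PROMPT.md").is_file() and (d / "run-log.md").is_file()
    )


def next_step(loops: List[Path]) -> str:
    """Map the number of candidate loops to the action step 1 describes."""
    if not loops:
        return "offer-task-loop"  # nothing to supervise yet
    if len(loops) > 1:
        return "ask-which-loop"
    return "use-only-loop"
```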

2. Auto-infer silently

Before asking any questions, read what's already on disk:

Tmux session name — tmux ls to see if a session is running that matches the loop's directory name; if none, derive one from TASKS/<name>/ (e.g. <name>-loop)
State file paths — anything in TASKS/<name>/ that looks like state: run-log.md, loop-state.md, backlog.md, plus any INDEX.md, frontier-state.*, or similar domain-specific indices
Stop token — whatever the loop's own contract declares as its completion signal. Grep PROMPT.md / README.md / project docs for file-existence markers (e.g. FOUND_SECRET.txt) and emit tokens (e.g. __PROMISE_RL_DONE__ if the loop is task-loop / rl-shaped). Inherit what the loop already says rather than imposing a default.
Launch command — if the loop ships a run command in its README or PROMPT.md, use it verbatim. For task-loop / rl-shaped loops this typically looks like rl <N> -- cxys '<PROMPT.md path>'. Iteration count defaults to 100 unless specified or mentioned.
Existing contract / preconditions — read the loop's PROMPT.md (or equivalent) end-to-end. Absorb its declared preconditions (e.g. "build must be green before committing"), its own stop conditions, its scope boundaries. The supervisor should inherit these organically — they're not separate rules, they're already in the loop's contract.
Context priors — read AGENTS.md / CLAUDE.md / project README for terminology, conventions, existing supervision patterns.

Don't bother the user with any of this if it can be inferred.
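A few of the inferences above are mechanical enough to sketch. The example below covers the fallback session name, state-file discovery, and a search for emitted stop tokens. The helper names are hypothetical, and the token regex is an assumption modelled on the `__PROMISE_RL_DONE__` example; file-existence markers such as FOUND_SECRET.txt have no fixed shape and would still need a read of the contract itself.

```python
import re
from pathlib import Path
from typing import List

# File names the text lists as typical state artefacts.
KNOWN_STATE_FILES = ("run-log.md", "loop-state.md", "backlog.md", "INDEX.md")
# Assumed token shape, modelled on the __PROMISE_RL_DONE__ example.
TOKEN_PATTERN = re.compile(r"__[A-Z][A-Z0-9_]*__")


def derive_session_name(loop_dir: Path) -> str:
    """Fallback tmux session name when no running session matches: <name>-loop."""
    return f"{loop_dir.name}-loop"


def find_state_files(loop_dir: Path) -> List[Path]:
    """Collect known state files plus frontier-state.* indices in the loop directory."""
    found = [loop_dir / n for n in KNOWN_STATE_FILES if (loop_dir / n).is_file()]
    found.extend(sorted(loop_dir.glob("frontier-state.*")))
    return found


def find_stop_tokens(contract: Path) -> List[str]:
    """List distinct emit-token candidates in the contract, in first-seen order."""
    text = contract.read_text(encoding="utf-8")
    return list(dict.fromkeys(TOKEN_PATTERN.findall(text)))
```

Anything these helpers cannot find is a candidate for the interview in step 3 rather than a reason to guess.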

3. Interview (grill-me style, one question per turn)

Ask only what needs human judgement. Aim for ~5 focused questions. Surface the catalogues from references/ as menus — the user picks from worked examples rather than generating from scratch.

Q1 — Golden rule (one line). What's the inner loop's domain, and what's the supervisor's domain? Frame with a motivating example: hackmonty's was "operate the harness, don't do the research"; for a source-port project it might be "keep lanes balanced and worktrees clean". Read references/golden-rule-examples.md for seed material.

Q2 — Top 3–5 project-specific triggers. Offer the catalogue from references/trigger-examples.md and ask which apply, plus any bespoke ones. Each trigger is a pair: detection signal (what you'd see in state files) + response (what the supervisor does).

Q3 — Authority stance. Pick from references/authority-stances.md: escalate-only (read + Ctrl-C + report), harness-only (edit loop contract, commit infra fixes, mark indices — never touch loop-domain artefacts), or autonomous (may commit anything, highest risk). Different projects justifiably want different stances.

Q4 — Intervention budget. Default 3 before hard stop. Lower for tight supervisors, higher for long runs where more drift is expected.

Q5 — Out-of-scope paths. What the supervisor must never touch. For harness-only: probe code, task implementations, hypothesis bodies. For autonomous: still worth listing anything sacred (secrets, migrations, production configs).

Skip any question the auto-inference in step 2 already answered.
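One way to picture what the interview produces is a small record holding the five answers. The sketch below is an assumption about shape only: the class and field names are invented here, and reading "default 3 before hard stop" as "stop once three interventions have been made" is one interpretation of Q4.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class AuthorityStance(Enum):
    ESCALATE_ONLY = "escalate-only"  # read, Ctrl-C, report
    HARNESS_ONLY = "harness-only"    # edit loop contract and infra, never loop-domain artefacts
    AUTONOMOUS = "autonomous"        # may commit anything, highest risk


@dataclass(frozen=True)
class Trigger:
    signal: str    # what the supervisor would see in the state files
    response: str  # what the supervisor does about it


@dataclass
class SupervisorConfig:
    golden_rule: str
    triggers: List[Trigger]
    stance: AuthorityStance
    out_of_scope: List[str] = field(default_factory=list)
    max_interventions: int = 3  # Q4 default: 3 before hard stop

    def must_stop(self, interventions_so_far: int) -> bool:
        """True once the intervention budget is spent."""
        return interventions_so_far >= self.max_interventions
```

Keeping each trigger as an explicit signal/response pair mirrors Q2, and makes it harder to record a detection signal without saying what the supervisor should do about it.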

4. Write SUPERVISOR.md

Fill in references/runbook-template.md with the interview results and inferred values. Write to TASKS/<name>/SUPERVISOR.md.

The consumer (a future agent session, potentially you, potentially a human) reads this file top-to-bottom. Make it self-contained enough that a fresh agent with just the tmux skill loaded can execute it.
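Filling the template and writing the file could be sketched as follows. The contents of references/runbook-template.md are not shown in this document, so `EXAMPLE_TEMPLATE` and its placeholder names are a hypothetical stand-in, and `string.Template` is simply one convenient way to do the substitution.

```python
from pathlib import Path
from string import Template
from typing import Mapping

# Hypothetical stand-in for references/runbook-template.md; the real template's
# layout and placeholder names are not shown in this document.
EXAMPLE_TEMPLATE = """\
# Supervisor runbook: $name

Golden rule: $golden_rule
Tmux session: $session
Launch command: $launch_command
Intervention budget: $budget
"""


def render_runbook(template_text: str, values: Mapping[str, str]) -> str:
    """Fill placeholders; Template.substitute raises KeyError if any value is missing."""
    return Template(template_text).substitute(values)


def write_runbook(repo_root: Path, name: str, content: str) -> Path:
    """Write the runbook to TASKS/<name>/SUPERVISOR.md and return its path."""
    target = repo_root / "TASKS" / name / "SUPERVISOR.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target
```

Using `substitute` rather than `safe_substitute` means a missing interview answer fails loudly instead of leaving an unfilled placeholder in a file that a fresh agent will later follow literally.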

5. Present and hand off

Tell the user:

Where the file was written
Which triggers + authority stance got captured
How to start supervising: open a fresh agent session (or tell the current one) to "read TASKS/<name>/SUPERVISOR.md and follow it." The runbook handles the rest — checks for the tmux session, launches if absent, attaches if present, begins supervision.

If the user wants to run it right now in the current session, just read the runbook back in and execute it. Otherwise hand off.

Composition

tmux skill — the consumer loads this for session management. SUPERVISOR.md never inlines capture-pane / send-keys — it describes what to watch, not how to read a pane.
task-loop skill — scaffolds the loop SUPERVISOR.md watches. If no loop exists yet, suggest /task-loop first.
task-plan skill — produces the backlog that task-loop consumes. Upstream of this skill by two steps.
grill-me skill — the interview flow in step 3 follows its one-question-per-turn discipline.

What this skill does not do

No runtime execution. This skill only scaffolds the runbook. Starting the loop + watching it happens when a consumer reads SUPERVISOR.md — not here.
No runtime scripts bundled. Pure prompt + reference material. The consumer uses its own tool access (tmux, filesystem, git).
No live-update of SUPERVISOR.md mid-run. The supervisor treats it as read-only during a run. New failure classes surface in the supervisor's final message; the human folds them in before the next run. Keeps the contract stable within a run.
No generic shipped taxonomy. Everything in the generated SUPERVISOR.md is project-specific. The references/ directory holds examples to pick from, not defaults to inherit.
