branch-review
Pre-push branch reviewer: runs lint+typecheck+tests, then fans /code-cleanup, /test-review, /docs-review at the branch diff, merges findings by file
| name | branch-review |
| description | Pre-push branch reviewer — runs lint+typecheck+tests, then fans /code-cleanup, /test-review, /docs-review at the branch diff, merges findings by file |
| user-invocable | true |
Pre-push sanity check. Runs correctness gates first (Pass 0), then invokes the enabled review skills (up to three) against the branch diff (origin/main...HEAD), collects their native outputs verbatim, and appends a merge-by-file index. Use before git push / PR.
Do not auto-fix. Each sibling refuses to auto-edit; this skill preserves that.
Sibling to /launch-prep — that skill runs the full release pipeline (CHANGELOG, version bump, PR to main); this one is the lightweight pre-push counterpart.
- /branch-review (default): Pass 0 + /code-cleanup + /test-review (docs skipped)
- /branch-review --full: includes /docs-review
- /branch-review --fast: skips Pass 0's test run; uses --recent on code-cleanup where useful
- /branch-review --code-only / --tests-only / --docs-only: single-skill shortcut
- /branch-review --base=<ref>: non-default base (default: origin/main)
- /branch-review --include-dirty: add staged+unstaged to the diff scope

- git fetch origin main --quiet (or <base>). Stale local main re-surfaces upstream-merged commits.
- git status --porcelain=2 --branch → look for in progress). Warn if working tree is dirty; offer --include-dirty to add staged+unstaged to scope, else diff is branch-only.
- git diff --name-only origin/main...HEAD (+ git diff --name-only HEAD if --include-dirty).
- --fast and print a "large branch" warning.
- --base=<recent-commit>), scope to subtree (--paths=src/benchflow/agents), or invoke the siblings directly.

Route changed files by path:

- /test-review: tests/**/test_*.py, tests/**/*_test.py
- /code-cleanup: src/benchflow/**/*.py
- /docs-review: *.md, docs/**/*, .dev-docs/**/*, src/benchflow/**/*.md

Skip silently (no routing, no findings, no warning): uv.lock, .venv/**, __pycache__/**, *.egg-info/**, dist/**, build/**, .pytest_cache/**, generated files.
Flag in rollup but do not review — config changes the author should eyeball themselves: pyproject.toml, .pre-commit-config.yaml, .github/**, .claude/**, ruff.toml, pytest.ini.
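The routing, skip, and flag rules above can be sketched as a single classifier. This is a minimal illustration, not the skill's actual implementation: the function name `route`, the `"unrouted"` fallback, and the order in which the rules are checked are all assumptions, and "generated files" is left out because the source does not define it.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Names taken from the rules above; the grouping into sets is illustrative.
SKIP_DIRS = {".venv", "__pycache__", "dist", "build", ".pytest_cache"}
FLAG_FILES = {"pyproject.toml", ".pre-commit-config.yaml", "ruff.toml", "pytest.ini"}
FLAG_DIRS = {".github", ".claude"}
DOC_DIRS = {"docs", ".dev-docs"}


def route(path: str) -> str:
    """Classify one changed path (hypothetical helper; check order is assumed)."""
    p = PurePosixPath(path)
    parts, name = p.parts, p.name
    in_src = parts[:2] == ("src", "benchflow")

    # Skip silently. Assumption: skip directories match at any depth.
    if path == "uv.lock" or any(
        d in SKIP_DIRS or d.endswith(".egg-info") for d in parts[:-1]
    ):
        return "skip"
    # Flag in the rollup, but do not review. Assumption: config files sit at the repo root.
    if path in FLAG_FILES or parts[0] in FLAG_DIRS:
        return "flag"
    if parts[0] == "tests" and (
        fnmatchcase(name, "test_*.py") or fnmatchcase(name, "*_test.py")
    ):
        return "test-review"
    if in_src and p.suffix == ".py":
        return "code-cleanup"
    # Assumption: the bare *.md pattern refers to root-level files only.
    if (len(parts) == 1 and p.suffix == ".md") or parts[0] in DOC_DIRS or (
        in_src and p.suffix == ".md"
    ):
        return "docs-review"
    return "unrouted"
```

Paths are matched on their components rather than with `fnmatch` globs because `fnmatch` does not treat `**/` as "zero or more directories", so a pattern like src/benchflow/**/*.py would miss files directly under src/benchflow/.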
Renames: if git diff -M --name-status shows R100 (pure rename, no content delta) → surface as rename, route nothing. R<100 → route the new path normally.
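A sketch of the rename rule, assuming the tab-separated output of `git diff -M --name-status` has already been captured as a string. The helper name `split_renames` is hypothetical, and dropping deleted files from routing is an assumption the source does not state.

```python
def split_renames(name_status: str) -> tuple[list[tuple[str, str]], list[str]]:
    """Separate pure renames (R100) from paths that should be routed.

    Hypothetical helper. Each input line looks like "M<TAB>path" or
    "R087<TAB>old<TAB>new", as printed by `git diff -M --name-status`.
    """
    pure_renames: list[tuple[str, str]] = []
    to_route: list[str] = []
    for line in name_status.splitlines():
        if not line.strip():
            continue
        status, *paths = line.split("\t")
        if status == "R100":
            # Pure rename, no content delta: surface it, route nothing.
            pure_renames.append((paths[0], paths[1]))
        elif status.startswith("D"):
            # Assumption (not in the source): deleted files have nothing to review.
            continue
        else:
            # Includes R<100: the last field is the new path.
            to_route.append(paths[-1])
    return pure_renames, to_route
```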
Test-orphan detection: if src/benchflow/X.py changed but tests/test_X.py (or tests/**/test_X.py) did not, include the matching test file in the test partition. Benchflow's test layout is flat under tests/ with some subdirs — glob for test_<basename>.py rather than assuming co-location. Do not extrapolate beyond direct name match.
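Test-orphan detection might look like the following sketch. The function name `orphan_tests` and its arguments are illustrative; it takes the full list of test files as input (for example from `Path("tests").rglob("test_*.py")`) so that it stays free of filesystem access, and it matches on the direct name test_<basename>.py only.

```python
from pathlib import PurePosixPath


def orphan_tests(changed: list[str], all_test_files: list[str]) -> list[str]:
    """Return unchanged test files whose name directly matches a changed src module.

    Hypothetical helper: `all_test_files` is assumed to list every test file
    under tests/, at any depth.
    """
    changed_set = set(changed)
    extra: list[str] = []
    for path in changed:
        p = PurePosixPath(path)
        if p.parts[:2] != ("src", "benchflow") or p.suffix != ".py":
            continue
        wanted = f"test_{p.stem}.py"  # direct name match only, no extrapolation
        for test_file in all_test_files:
            if (
                PurePosixPath(test_file).name == wanted
                and test_file not in changed_set
                and test_file not in extra
            ):
                extra.append(test_file)
    return extra
```

The returned paths would be appended to the test partition before Pass 0 and /test-review run.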
--fast)

Run in parallel Bash calls. CI gates all four per AGENTS.md, so this skill mirrors CI:

- .venv/bin/ruff format --check src/ tests/ (formatting).
- .venv/bin/ruff check src/ tests/ (lint).
- .venv/bin/ty check src/ (typecheck; whole-src, cheap).
- .venv/bin/python -m pytest <changed test files + orphan-detected siblings> (scoped to the test partition, not the full suite). Unit only; do not pass -m live (requires Docker + API key, not appropriate for a pre-push gate).

With --fast: skip the pytest run only. Keep ruff + ty; they're fast enough to always run.
If any gate fails, surface at the top of the report as BLOCKERS and short-circuit the review: there's no point polishing a branch that doesn't lint, compile, or pass tests. Author fixes those, then re-runs.
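As a rough sketch of Pass 0, the gates can be run concurrently and any non-zero exit collected as a blocker. The helper names (`build_gates`, `run_gate`, `find_blockers`) are hypothetical, and skipping pytest when the test partition is empty is an assumption, made so that an empty argument list does not trigger the full suite.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_gates(test_files: list[str], fast: bool = False) -> dict[str, list[str]]:
    """Commands from the Pass 0 list; pytest is dropped under --fast."""
    gates = {
        "ruff format": [".venv/bin/ruff", "format", "--check", "src/", "tests/"],
        "ruff check": [".venv/bin/ruff", "check", "src/", "tests/"],
        "ty": [".venv/bin/ty", "check", "src/"],
    }
    # Assumption: with no test files, skip pytest rather than run everything.
    if not fast and test_files:
        gates["pytest"] = [".venv/bin/python", "-m", "pytest", *test_files]
    return gates


def run_gate(cmd: list[str]) -> tuple[int, str]:
    """Run one gate and return (exit code, combined output)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def find_blockers(gates: dict[str, list[str]], runner=run_gate) -> dict[str, str]:
    """Run all gates in parallel; return the output of each failing gate."""
    with ThreadPoolExecutor() as pool:
        results = dict(zip(gates, pool.map(runner, gates.values())))
    return {name: out for name, (code, out) in results.items() if code != 0}
```

If `find_blockers` returns a non-empty mapping, the report would open with the Blockers section and the review passes would not run.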
Invoke the enabled siblings via the Skill tool in the main thread, one after another. Pass each the space-separated partition as the argument. Each sibling already runs its internal subagent fan-out in parallel, so within-skill concurrency is preserved; the orchestrator just waits on each skill's native report.
Example invocations:
- Skill("code-cleanup", "<src paths>")
- Skill("test-review", "<test paths>")
- Skill("docs-review", "<doc paths>") (only if --full)

With --fast, prefer Skill("code-cleanup", "--recent") when the branch touches >10 src files; it caps subagent work. test-review and docs-review take explicit path lists; pass the partition directly.
Collect the three native outputs verbatim. Do not summarize, rewrite, or translate their vocabularies — each skill's bucketed structure (cleanup's 7 categories, test-review's delete/collapse/loosen/add, docs-review's drift/stale/duplicate) encodes its calibration. Preserving it lets the author re-use their mental model from direct invocations.
[If BLOCKERS from Pass 0]
## Blockers
- ruff format: <files needing reformat>
- ruff check: <rule violations, file:line>
- ty: <type errors, file:line>
- pytest: <failing test names>
[Else continue to reviews]
<verbatim code-cleanup report>
<verbatim test-review report>
<verbatim docs-review report, if --full>
## Merge-by-file index
Author-oriented cross-reference. Does NOT re-rank; lists which skills flagged which files so co-located fixes surface together.
### src/benchflow/agents/registry.py
- [cleanup] 2 findings (1 verified, 1 needs-judgment)
- [test] tests/test_registry.py: 1 delete, 1 loosen
### docs/architecture.md
- [docs] 1 blocker, 2 stale
### Flagged config (not reviewed)
- pyproject.toml, .github/workflows/ci.yml
...
End with a one-line rollup: N files reviewed · <cleanup-count> cleanup · <test-count> test · <docs-count> docs · <blocker-count> blockers.
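The index and rollup could be assembled along these lines. This is a sketch under assumptions: the `(skill, file, summary)` tuple shape and both function names are hypothetical, and the caller is assumed to decide which file heading a finding belongs under (for example, filing a test finding under the source file it covers).

```python
from collections import defaultdict


def merge_by_file(findings: list[tuple[str, str, str]]) -> str:
    """Group (skill, file, summary) findings under one heading per file.

    Hypothetical helper. Files keep first-seen order; nothing is re-ranked.
    """
    by_file: dict[str, list[str]] = defaultdict(list)
    for skill, path, summary in findings:
        by_file[path].append(f"- [{skill}] {summary}")
    lines = ["## Merge-by-file index"]
    for path, entries in by_file.items():
        lines.append(f"### {path}")
        lines.extend(entries)
    return "\n".join(lines)


def rollup(n_files: int, cleanup: int, test: int, docs: int, blockers: int) -> str:
    """Format the closing one-line summary."""
    return (
        f"{n_files} files reviewed · {cleanup} cleanup · {test} test"
        f" · {docs} docs · {blockers} blockers"
    )
```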
- /arch-audit directly.
- --fast. A clean review of uncompileable code is worse than useless.
- pytest -m live. Live tests need Docker + an API key; pre-push gates should be hermetic and fast.
- <path> argument, --recent) to re-verify just the touched files.
- /launch-prep. If the user is cutting a release (version bump, CHANGELOG, PR to main), route them there instead.