Use when the user wants to re-evaluate a previous arksim simulation with different metrics, thresholds, or judge model without re-running the agent. Cheaper than re-simulating.
Use when the user wants to inspect arksim evaluation results, debug specific failures turn by turn, or compare two runs to measure improvement.
Use when the user wants to generate, edit, or extend arksim test scenarios. Reads the agent's source code to derive realistic scenarios; can build regression scenarios from past failures.
Use when the user wants to simulate multi-turn conversations against an AI agent. Alias for the arksim-test skill; the canonical flow lives there.
Use when the user wants to test, simulate, or evaluate an AI agent against multi-turn scenarios (also exposed as the arksim-simulate alias). Discovers the agent, generates scenarios, runs simulation and evaluation, surfaces failures.
Use when the user wants to launch the arksim web dashboard to browse evaluation results visually rather than in CLI output.
| Field | Value |
| --- | --- |
| name | pre-review |
| description | Check your branch before opening a PR (lint, test, title validation) |
| disable-model-invocation | true |
| context | fork |
| agent | Explore |
| allowed-tools | ["Bash","Read","Grep","Glob"] |
Run local checks on the current branch before opening a pull request.
Gather branch context
BRANCH=$(git branch --show-current)
BASE="main"
git log --oneline "$BASE"..HEAD
git diff --stat "$BASE"..HEAD
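As a sketch of how the file list from this diff could be bucketed into the areas named in the next step, the snippet below classifies paths such as those printed by `git diff --name-only "$BASE"..HEAD`. The function names and the definition of "config" (the `CONFIG_NAMES` and `CONFIG_SUFFIXES` sets) are assumptions made for illustration; the skill itself does not define them.

```python
from __future__ import annotations

from collections import defaultdict
from pathlib import PurePosixPath

# Assumption: these names and suffixes count as "config".
# The skill text does not say what belongs in that bucket.
CONFIG_NAMES = {"pyproject.toml", "setup.cfg", ".pre-commit-config.yaml"}
CONFIG_SUFFIXES = {".toml", ".yaml", ".yml", ".ini", ".cfg"}


def classify(path: str) -> str:
    """Return the area for one changed path (repo-relative, POSIX separators)."""
    p = PurePosixPath(path)
    top = p.parts[0] if p.parts else ""
    # Top-level directory wins, so arksim/settings.yaml is still "source".
    if top == "arksim":
        return "source"
    if top == "tests":
        return "tests"
    if top == "docs":
        return "docs"
    if p.name in CONFIG_NAMES or p.suffix in CONFIG_SUFFIXES:
        return "config"
    return "other"


def group_by_area(paths: list[str]) -> dict[str, list[str]]:
    """Group changed paths by area, preserving input order within each group."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        groups[classify(path)].append(path)
    return dict(groups)
```

Checking the top-level directory before the suffix keeps YAML or TOML files that live inside `arksim/`, `tests/`, or `docs/` in their own area rather than in "config".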
Group changed files by area
Source (arksim/), tests (tests/), docs (docs/), config, other
Draft and validate title
^(feat|fix|docs|chore|ci|build|refactor|test|perf|style|revert)(\([a-z][a-z0-9_-]*\))?!?: .+$
Run ruff check
ruff check . 2>&1
Run ruff format check
ruff format --check . 2>&1
Run unit tests
pytest tests/unit/ -x --tb=short 2>&1
Check code quality in diff
For .py files in the diff, verify:
# SPDX-License-Identifier: Apache-2.0 header present
from __future__ import annotations present
Check changelog
Look for an entry under [Unreleased]
Print a readiness checklist:
Pre-review checklist
--------------------
[ ] Branch: <branch-name>
[x] Ruff check: PASS / FAIL (N issues)
[x] Ruff format: PASS / FAIL (N files)
[x] Unit tests: PASS / FAIL (N passed, N failed)
[x] License headers: PASS / FAIL / N/A
[x] Future annotations: PASS / FAIL / N/A
[x] Type hints: PASS / WARN / N/A
[x] Absolute imports: PASS / FAIL / N/A
[x] Changelog updated: YES / NO / N/A
[x] Suggested title: <title>
Ready to open PR: YES / NO
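Two of the checks above are purely mechanical and can be sketched in a few lines: validating the suggested title against the pattern from the title step, and confirming that a .py file carries the license header and the future-annotations import. This is a minimal illustration, not the skill's implementation; `is_valid_title` and `missing_headers` are hypothetical names, and matching on whole stripped lines is an assumption about how strictly the headers should be checked.

```python
from __future__ import annotations

import re

# Pattern copied from the "Draft and validate title" step.
TITLE_RE = re.compile(
    r"^(feat|fix|docs|chore|ci|build|refactor|test|perf|style|revert)"
    r"(\([a-z][a-z0-9_-]*\))?!?: .+$"
)

LICENSE_LINE = "# SPDX-License-Identifier: Apache-2.0"
FUTURE_LINE = "from __future__ import annotations"


def is_valid_title(title: str) -> bool:
    """True if the title has the form type(scope)!: summary, scope and ! optional."""
    return TITLE_RE.match(title) is not None


def missing_headers(source: str) -> list[str]:
    """Return the required lines that are absent from one .py file's text.

    Assumption: each required line must appear on its own line,
    ignoring surrounding whitespace.
    """
    lines = {line.strip() for line in source.splitlines()}
    return [req for req in (LICENSE_LINE, FUTURE_LINE) if req not in lines]
```

For example, `is_valid_title("feat(cli): add pre-review skill")` is accepted, while `"Feat: add skill"` is rejected because the type must be lowercase. An empty list from `missing_headers` corresponds to PASS for both the "License headers" and "Future annotations" rows of the checklist.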