team-shinchan-eval
Use when you need to view agent evaluation history or detect performance regressions.
| Field | Value |
|---|---|
| name | team-shinchan:eval |
| description | Use when you need to view agent evaluation history or detect performance regressions. |
| user-invocable | false |
View agent evaluations, detect regressions, and compare performance.
/team-shinchan:eval # All agents summary
/team-shinchan:eval --agent bo # Single agent detail
/team-shinchan:eval --regression # Regression report only
/team-shinchan:eval --compare # Side-by-side comparison
| Arg | Default | Description |
|---|---|---|
| `--agent {name}` | (all) | Show evaluation for a specific agent |
| `--regression` | false | Show only agents with detected regressions |
| `--compare` | false | Side-by-side comparison of all agents |
Execute `node src/regression-detect.js .shinchan-docs/eval-history.jsonl --format table`.
If `--agent` is provided, add `--agent {name}`.
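The command assembly described above can be sketched as a small helper. This is a minimal illustration, not the skill's actual implementation: `buildDetectCommand` and its options object are hypothetical names, and only the script path, the history path, `--format table`, and `--agent {name}` come from the text.

```javascript
// Hypothetical helper: builds the argv for the regression detector.
// Only the paths and flags below appear in the skill text; the function
// name and the options object are assumptions for illustration.
function buildDetectCommand(options = {}) {
  const argv = [
    "node",
    "src/regression-detect.js",
    ".shinchan-docs/eval-history.jsonl",
    "--format",
    "table",
  ];
  // Append the agent filter only when the caller supplied one.
  if (options.agent) {
    argv.push("--agent", options.agent);
  }
  return argv;
}

console.log(buildDetectCommand().join(" "));
console.log(buildDetectCommand({ agent: "bo" }).join(" "));
```

Returning an argument array rather than a single string keeps the agent name as one argument if the command is later passed to a process launcher.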
If the file does not exist or is empty, display:
No evaluation history found.
Evaluations are recorded automatically during auto-retrospective.
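The missing-or-empty check could look roughly like the sketch below. The function names are hypothetical, and treating a file with only blank lines as empty is an assumption; the two message lines are taken from the text above.

```javascript
const fs = require("fs");

// Hypothetical check: true when the history file exists and has at least
// one non-blank line. Treating whitespace-only files as empty is an assumption.
function hasEvalHistory(path) {
  if (!fs.existsSync(path)) return false;
  const text = fs.readFileSync(path, "utf8");
  return text.split("\n").some((line) => line.trim() !== "");
}

// The fallback message, as worded in the skill text.
function emptyHistoryMessage() {
  return [
    "No evaluation history found.",
    "Evaluations are recorded automatically during auto-retrospective.",
  ].join("\n");
}

const historyPath = ".shinchan-docs/eval-history.jsonl";
if (!hasEvalHistory(historyPath)) {
  console.log(emptyHistoryMessage());
}
```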
Default (all agents):
Evaluation Summary
Agent | Evals | Correctness | Efficiency | Compliance | Quality
bo | 12 | 4.2 | 4.5 | 4.0 | 4.3
aichan | 8 | 4.0 | 3.8 | 4.2 | 4.1
...
--agent (single): Show full history with trend arrows and latest notes.
--regression:
Filter to only agents where has_regression is true.
Show dimension, latest score, moving average, and delta.
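As a rough sketch of the `--regression` filter: only the `has_regression` field is named in the skill text. The other field names, the sample values, and the definition of delta as latest score minus moving average are assumptions made for illustration.

```javascript
// Made-up sample entries. Only `has_regression` is named in the skill text;
// `agent`, `dimension`, `latest`, and `movingAvg` are hypothetical field names.
const report = [
  { agent: "bo", dimension: "efficiency", latest: 3.0, movingAvg: 4.5, has_regression: true },
  { agent: "aichan", dimension: "quality", latest: 4.1, movingAvg: 4.0, has_regression: false },
];

// Keeps only regressed entries and derives the columns the report shows.
function regressionRows(entries) {
  return entries
    .filter((entry) => entry.has_regression === true)
    .map((entry) => ({
      agent: entry.agent,
      dimension: entry.dimension,
      latest: entry.latest,
      movingAvg: entry.movingAvg,
      // Assumed definition: delta is the latest score minus the moving average.
      delta: Number((entry.latest - entry.movingAvg).toFixed(2)),
    }));
}

console.table(regressionRows(report));
```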
--compare:
Agent Comparison (last 5 evaluations)
Dimension | bo | aichan | buriburi | masao
correctness | 4.2 | 4.0 | 3.8 | 4.5
efficiency | 4.5 | 3.8 | 4.2 | 4.0
compliance | 4.0 | 4.2 | 4.0 | 3.9
quality | 4.3 | 4.1 | 4.3 | 4.2
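One way to build such a comparison is to average each dimension over each agent's most recent five records. The sketch below assumes a hypothetical record shape (`{ agent, scores }`) and chronological ordering; the actual layout of `eval-history.jsonl` is not specified here, and the sample records are made up.

```javascript
const DIMENSIONS = ["correctness", "efficiency", "compliance", "quality"];

// Averages each dimension over the most recent `windowSize` records per agent.
// Assumes `records` is ordered oldest first and uses a hypothetical shape:
// { agent: "bo", scores: { correctness: 4, efficiency: 5, ... } }
function compareAgents(records, windowSize = 5) {
  const byAgent = new Map();
  for (const record of records) {
    if (!byAgent.has(record.agent)) byAgent.set(record.agent, []);
    byAgent.get(record.agent).push(record.scores);
  }
  const table = {};
  for (const dimension of DIMENSIONS) {
    table[dimension] = {};
    for (const [agent, history] of byAgent) {
      const recent = history.slice(-windowSize);
      const mean = recent.reduce((sum, scores) => sum + scores[dimension], 0) / recent.length;
      table[dimension][agent] = Number(mean.toFixed(1));
    }
  }
  return table;
}

// Made-up sample records, for illustration only.
const sample = [
  { agent: "bo", scores: { correctness: 4, efficiency: 5, compliance: 4, quality: 4 } },
  { agent: "bo", scores: { correctness: 5, efficiency: 4, compliance: 4, quality: 5 } },
  { agent: "masao", scores: { correctness: 5, efficiency: 4, compliance: 4, quality: 4 } },
];
console.table(compareAgents(sample));
```

Keying the result by dimension first gives one row per dimension and one column per agent, which matches the layout shown above.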
If any regressions are detected, display:
!! Regression detected for {agent} in {dimension}
Latest: {score} | Avg: {avg} | Delta: {delta}
Action: Review recent {agent} outputs and adjust prompts.
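The alert template above can be filled in with a small formatter. This is a sketch: `formatRegressionAlert` is a hypothetical name, the parameter names mirror the placeholders in the template, and the values passed in the example call are made up.

```javascript
// Hypothetical formatter for the three-line regression alert.
// Parameter names mirror the placeholders in the template text.
function formatRegressionAlert({ agent, dimension, score, avg, delta }) {
  return [
    `!! Regression detected for ${agent} in ${dimension}`,
    `Latest: ${score} | Avg: ${avg} | Delta: ${delta}`,
    `Action: Review recent ${agent} outputs and adjust prompts.`,
  ].join("\n");
}

// Example call with made-up values.
console.log(
  formatRegressionAlert({ agent: "bo", dimension: "efficiency", score: 3.2, avg: 4.5, delta: -1.3 })
);
```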
.shinchan-docs/eval-history.jsonl
src/regression-detect.js