team-shinchan-eval
Use when you need to view agent evaluation history or detect performance regressions.
| name | team-shinchan:eval |
| description | Use when you need to view agent evaluation history or detect performance regressions. |
| user-invocable | false |
View agent evaluations, detect regressions, and compare performance.
/team-shinchan:eval # All agents summary
/team-shinchan:eval --agent bo # Single agent detail
/team-shinchan:eval --regression # Regression report only
/team-shinchan:eval --compare # Side-by-side comparison
| Arg | Default | Description |
|---|---|---|
| --agent {name} | (all) | Show evaluation for a specific agent |
| --regression | false | Show only agents with detected regressions |
| --compare | false | Side-by-side comparison of all agents |
Execute node src/regression-detect.js .shinchan-docs/eval-history.jsonl --format table
If --agent is provided, append --agent {name} to the command.
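As a rough illustration of the step above, the argument list could be assembled as follows. The script path, history path, and flags are taken from the text; the helper name `buildDetectArgs` and its options object are hypothetical and not part of the skill itself.

```javascript
// Hypothetical helper: assembles the argument list for the detector script.
// Paths and flags come from the skill text; the function name and the
// shape of `options` are illustrative assumptions.
function buildDetectArgs(options = {}) {
  const args = [
    "src/regression-detect.js",
    ".shinchan-docs/eval-history.jsonl",
    "--format",
    "table",
  ];
  // Forward --agent only when a non-empty name was supplied.
  if (typeof options.agent === "string" && options.agent.length > 0) {
    args.push("--agent", options.agent);
  }
  return args;
}

// Example: print the full command line for a single agent.
console.log(["node", ...buildDetectArgs({ agent: "bo" })].join(" "));
```

Keeping the arguments as an array (rather than one concatenated string) lets them be handed to Node's `child_process.execFile` or `spawn` without shell quoting concerns around the agent name.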
If the file does not exist or is empty, display:
No evaluation history found.
Evaluations are recorded automatically during auto-retrospective.
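The empty-history check above might look like the sketch below. It assumes the caller has already read the file and passes `null` when the file is missing; the function name `historyOrMessage` and the return shape are hypothetical, and the record fields in eval-history.jsonl are not specified by the text.

```javascript
// The two-line message shown when there is nothing to report (from the text above).
const EMPTY_MESSAGE = [
  "No evaluation history found.",
  "Evaluations are recorded automatically during auto-retrospective.",
].join("\n");

// Hypothetical helper: turns raw JSONL text into records, or into the
// fallback message when the file is missing (text == null) or has no entries.
// A malformed line makes JSON.parse throw; that case is left to the caller.
function historyOrMessage(text) {
  if (text == null) {
    return { records: [], message: EMPTY_MESSAGE };
  }
  const records = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // skip blank lines
    .map((line) => JSON.parse(line)); // one JSON object per line
  return records.length > 0
    ? { records, message: null }
    : { records: [], message: EMPTY_MESSAGE };
}
```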
Default (all agents):
Evaluation Summary
Agent | Evals | Correctness | Efficiency | Compliance | Quality
bo | 12 | 4.2 | 4.5 | 4.0 | 4.3
aichan | 8 | 4.0 | 3.8 | 4.2 | 4.1
...
--agent (single): Show full history with trend arrows and latest notes.
--regression:
Filter to only agents where has_regression is true.
Show dimension, latest score, moving average, and delta.
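To make the reported fields concrete, here is one way latest score, moving average, and delta could relate to has_regression. This is a sketch under assumptions: the window of 5, the 0.5 threshold, and the choice to exclude the latest score from the average are all illustrative, and the actual rule in src/regression-detect.js may differ.

```javascript
// Hypothetical regression check for one agent and one dimension.
// `scores` is assumed to be in chronological order (oldest first).
// `window` and `threshold` are illustrative defaults, not documented values.
function detectRegression(scores, { window = 5, threshold = 0.5 } = {}) {
  if (scores.length < 2) {
    // Not enough history to compare against.
    const latest = scores.length === 1 ? scores[0] : null;
    return { has_regression: false, latest, average: null, delta: null };
  }
  const latest = scores[scores.length - 1];
  // Moving average over up to `window` scores preceding the latest one.
  const previous = scores.slice(0, -1).slice(-window);
  const average = previous.reduce((sum, s) => sum + s, 0) / previous.length;
  const delta = latest - average;
  // Flag a regression when the latest score drops by at least `threshold`.
  return { has_regression: delta <= -threshold, latest, average, delta };
}

// Example: a drop from a steady 4 to 3 is flagged.
console.log(detectRegression([4, 4, 4, 4, 3]));
```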
--compare:
Agent Comparison (last 5 evaluations)
Dimension | bo | aichan | buriburi | masao
correctness | 4.2 | 4.0 | 3.8 | 4.5
efficiency | 4.5 | 3.8 | 4.2 | 4.0
compliance | 4.0 | 4.2 | 4.0 | 3.9
quality | 4.3 | 4.1 | 4.3 | 4.2
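The comparison above could be produced by averaging each agent's most recent evaluations per dimension. The sketch below assumes each history record looks like `{ agent, scores: { correctness, efficiency, compliance, quality } }` and that records are stored oldest first; that record shape and the name `compareAgents` are assumptions, since the text does not define the JSONL schema.

```javascript
const DIMENSIONS = ["correctness", "efficiency", "compliance", "quality"];

// Hypothetical pivot: dimension -> agent -> mean of the last `lastN` scores.
// Assumes records are in chronological order and carry a `scores` object.
function compareAgents(records, lastN = 5) {
  // Group score objects by agent, preserving order.
  const perAgent = new Map();
  for (const rec of records) {
    if (!perAgent.has(rec.agent)) perAgent.set(rec.agent, []);
    perAgent.get(rec.agent).push(rec.scores);
  }
  const table = {};
  for (const dim of DIMENSIONS) {
    table[dim] = {};
    for (const [agent, history] of perAgent) {
      const recent = history.slice(-lastN); // keep only the latest evaluations
      const mean = recent.reduce((sum, s) => sum + s[dim], 0) / recent.length;
      table[dim][agent] = Number(mean.toFixed(1)); // one decimal, as in the table
    }
  }
  return table;
}
```

Each key of the returned object corresponds to one row of the comparison, and each nested key to one agent column.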
If any regressions are detected, display:
!! Regression detected for {agent} in {dimension}
Latest: {score} | Avg: {avg} | Delta: {delta}
Action: Review recent {agent} outputs and adjust prompts.
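The alert template above can be filled in with a small formatter such as the following. The function name and the one-decimal number formatting are assumptions made for illustration; only the three message lines come from the text.

```javascript
// Hypothetical formatter for the regression alert.
// Rounding to one decimal is an assumption that mirrors the sample tables.
function formatRegressionAlert({ agent, dimension, score, avg, delta }) {
  return [
    `!! Regression detected for ${agent} in ${dimension}`,
    `Latest: ${score.toFixed(1)} | Avg: ${avg.toFixed(1)} | Delta: ${delta.toFixed(1)}`,
    `Action: Review recent ${agent} outputs and adjust prompts.`,
  ].join("\n");
}

console.log(
  formatRegressionAlert({ agent: "bo", dimension: "efficiency", score: 3.5, avg: 4.2, delta: -0.7 })
);
```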
.shinchan-docs/eval-history.jsonl
src/regression-detect.js