
team-shinchan-eval

Stars: 8

Forks: 2

Updated: June 4, 2026, 07:43

Use when you need to view agent evaluation history or detect performance regressions.

Source

seokan-jeong

seokan-jeong/team-shinchan

Eval Skill

View agent evaluations, detect regressions, and compare performance.

Usage

/team-shinchan:eval                     # All agents summary
/team-shinchan:eval --agent bo          # Single agent detail
/team-shinchan:eval --regression        # Regression report only
/team-shinchan:eval --compare           # Side-by-side comparison

Arguments

| Arg | Default | Description |
| --- | --- | --- |
| `--agent {name}` | (all) | Show evaluation for a specific agent |
| `--regression` | false | Show only agents with detected regressions |
| `--compare` | false | Side-by-side comparison of all agents |

Process

Step 1: Run Regression Detection

Execute `node src/regression-detect.js .shinchan-docs/eval-history.jsonl --format table`.

If `--agent` is provided, append `--agent {name}` to the command.

If the history file does not exist or is empty, display:

No evaluation history found.
Evaluations are recorded automatically during auto-retrospective.
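
Step 1 can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the helper names `buildDetectArgs` and `historyIsMissingOrEmpty` are hypothetical, and the sketch assumes "empty" means a file with no non-whitespace content. Only the script path, the history path, and the flags come from the text above.

```javascript
// Sketch only: helper names are hypothetical, not taken from the repository.
const fs = require("node:fs");

const HISTORY_PATH = ".shinchan-docs/eval-history.jsonl";

// Build the argument list for `node`, appending --agent only when a name is given.
function buildDetectArgs(agent) {
  const args = ["src/regression-detect.js", HISTORY_PATH, "--format", "table"];
  if (agent) args.push("--agent", agent);
  return args;
}

// Assumption: a missing file or one with only whitespace counts as "empty".
function historyIsMissingOrEmpty(path) {
  if (!fs.existsSync(path)) return true;
  return fs.readFileSync(path, "utf8").trim().length === 0;
}

// Message shown instead of running the detection script.
const NO_HISTORY_MESSAGE = [
  "No evaluation history found.",
  "Evaluations are recorded automatically during auto-retrospective.",
].join("\n");

if (historyIsMissingOrEmpty(HISTORY_PATH)) {
  console.log(NO_HISTORY_MESSAGE);
} else {
  console.log(["node", ...buildDetectArgs("bo")].join(" "));
}
```

Checking the file before spawning the script keeps the "no history" case separate from real script failures.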

Step 2: Display Results

Default (all agents):

Evaluation Summary
  Agent       | Evals | Correctness | Efficiency | Compliance | Quality
  bo          |    12 |     4.2     |    4.5     |    4.0     |   4.3
  aichan      |     8 |     4.0     |    3.8     |    4.2     |   4.1
  ...

--agent (single): Show full history with trend arrows and latest notes.

--regression: Filter to only agents where has_regression is true. Show dimension, latest score, moving average, and delta.
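
The `--regression` filter could look something like the sketch below. Only the `has_regression` field is named in the text; the other field names (`agent`, `dimension`, `latest`, `movingAvg`), the sign convention `delta = latest - movingAvg`, and the sample rows are assumptions made for illustration.

```javascript
// Hypothetical row shape: only `has_regression` is named in the skill text.
function regressionRows(results) {
  return results
    .filter((r) => r.has_regression === true)
    .map(({ agent, dimension, latest, movingAvg }) => ({
      agent,
      dimension,
      latest,
      movingAvg,
      // Assumed convention: a negative delta means the latest score dropped.
      delta: Number((latest - movingAvg).toFixed(2)),
    }));
}

// Made-up sample data, not real evaluation results.
const sample = [
  { agent: "bo", dimension: "quality", latest: 3.0, movingAvg: 4.0, has_regression: true },
  { agent: "aichan", dimension: "quality", latest: 4.0, movingAvg: 4.0, has_regression: false },
];

console.log(regressionRows(sample));
```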

--compare:

Agent Comparison (last 5 evaluations)
  Dimension    | bo   | aichan | buriburi | masao
  correctness  | 4.2  | 4.0    | 3.8      | 4.5
  efficiency   | 4.5  | 3.8    | 4.2      | 4.0
  compliance   | 4.0  | 4.2    | 4.0      | 3.9
  quality      | 4.3  | 4.1    | 4.3      | 4.2
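
One way to pivot per-agent averages into the dimension-by-agent layout above is sketched here. The input shape and the `comparisonRows` helper are hypothetical, and the column width is arbitrary. The sample values for `bo` and `aichan` are copied from the example table.

```javascript
const DIMENSIONS = ["correctness", "efficiency", "compliance", "quality"];

// averages: { agentName: { correctness: 4.2, ... } } (hypothetical input shape).
function comparisonRows(averages) {
  const agents = Object.keys(averages);
  const header = ["Dimension", ...agents];
  const rows = DIMENSIONS.map((dim) => [
    dim,
    ...agents.map((a) => averages[a][dim].toFixed(1)),
  ]);
  // Pad every cell to a fixed width so the columns line up.
  return [header, ...rows].map((cells) =>
    cells.map((c) => c.padEnd(12)).join("| ").trimEnd()
  );
}

const averages = {
  bo: { correctness: 4.2, efficiency: 4.5, compliance: 4.0, quality: 4.3 },
  aichan: { correctness: 4.0, efficiency: 3.8, compliance: 4.2, quality: 4.1 },
};

console.log(comparisonRows(averages).join("\n"));
```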

Step 3: Warnings

If any regressions are detected, display:

!! Regression detected for {agent} in {dimension}
   Latest: {score} | Avg: {avg} | Delta: {delta}
   Action: Review recent {agent} outputs and adjust prompts.
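
The warning template above can be filled in with a small formatter like this one. The function name and the shape of its argument are assumptions. The three output lines mirror the template.

```javascript
// Hypothetical helper: renders the three-line warning for one regression.
function formatRegressionWarning({ agent, dimension, score, avg, delta }) {
  return [
    `!! Regression detected for ${agent} in ${dimension}`,
    `   Latest: ${score} | Avg: ${avg} | Delta: ${delta}`,
    `   Action: Review recent ${agent} outputs and adjust prompts.`,
  ].join("\n");
}

// Made-up values, for illustration only.
console.log(
  formatRegressionWarning({ agent: "bo", dimension: "quality", score: 3.0, avg: 4.0, delta: -1.0 })
);
```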

Important

Eval history: .shinchan-docs/eval-history.jsonl
Detection script: src/regression-detect.js
Dimensions: correctness, efficiency, compliance, quality (1-5 scale)
Moving average window: last 5 evaluations per agent
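
As a rough illustration of the moving-average window, the sketch below averages the last 5 scores for one agent and one dimension and compares the latest score against that average. The text does not say whether the latest evaluation is part of the window or what delta counts as a regression. This sketch assumes the latest score is included and leaves the threshold out. The function names are hypothetical.

```javascript
const WINDOW = 5; // "last 5 evaluations per agent"

// scores: chronological list of 1-5 scores for one agent and one dimension.
function movingAverage(scores, window = WINDOW) {
  const recent = scores.slice(-window);
  if (recent.length === 0) return null; // no evaluations yet
  return recent.reduce((sum, s) => sum + s, 0) / recent.length;
}

// Assumed convention: delta = latest score minus the moving average.
function latestDelta(scores, window = WINDOW) {
  const avg = movingAverage(scores, window);
  if (avg === null) return null;
  return scores[scores.length - 1] - avg;
}

// Made-up history: four steady scores followed by a drop.
const history = [4, 4, 4, 4, 2];
console.log(movingAverage(history), latestDelta(history));
```

Older scores fall out of the window, so a single early outlier stops influencing the average after five newer evaluations.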

Other Skills in this repository

team-shinchan-bigproject

seokan-jeong/team-shinchan

Use when you have a large-scale, multi-phase project requiring orchestrated execution.

2026-06-23 | Stars: 8

team-shinchan-ralph

seokan-jeong/team-shinchan

Use when you need persistent looping until a task is fully complete.

2026-06-23 | Stars: 8

team-shinchan-fierce-review

seokan-jeong/team-shinchan

Deterministic adversarial code review for high-stakes scope — independent per-dimension review, a non-skippable per-finding refutation, completeness + interaction critics, and a deterministic 3-lens rubric judge panel. Opt-in main-loop Workflow tier.

2026-06-23 | Stars: 8

team-shinchan-skill-feedback

seokan-jeong/team-shinchan

Use when the user wants to review accumulated skill feedback, verdict trends, or improvement candidates collected during Stage 4 retrospectives. Trigger on "show skill feedback", "스킬 피드백 보여줘", or finding which skills need /writing-skills work.

2026-06-19 | Stars: 8

team-shinchan-fierce-compete

seokan-jeong/team-shinchan

Deterministic competitive code tournament — N builders independently solve one task and return patches, an Action-Kamen judge scores them head-to-head, the winner is picked by score and applied. Opt-in main-loop Workflow tier.

2026-06-19 | Stars: 8

team-shinchan-fierce-debate

seokan-jeong/team-shinchan

Deterministic adversarial debate for high-stakes or irreversible decisions — mandatory refutation plus a scored judge panel. Opt-in main-loop Workflow tier.

2026-06-19 | Stars: 8

name: team-shinchan:eval
description: Use when you need to view agent evaluation history or detect performance regressions.
user-invocable: false