adversarial-evaluator

Updated June 20, 2026 at 11:04

Evaluates agent output adversarially before committing — prevents self-assessment bias

Source

CleanExpo

CleanExpo/Unite-Group

SKILL.md

name	adversarial-evaluator
description	Evaluates agent output adversarially before committing — prevents self-assessment bias

Adversarial Evaluator Skill

When to invoke

After any agent produces output that will be committed, published, or acted on. Do NOT self-evaluate — spawn a separate evaluator instance.

Evaluation contract

The generator agent produces output in this format:

GENERATOR OUTPUT:
[the work product]

GENERATOR CLAIM:
[what the generator says was achieved]

EVALUATION CRITERIA:
[the rubric — 3-5 specific, checkable criteria]
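A minimal sketch of how an orchestrator might assemble this packet. The `GeneratorPacket` class and its `render` method are hypothetical names introduced for illustration; the skill itself only defines the three labelled sections and the 3-5 criteria range.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneratorPacket:
    """Hypothetical container for the three sections of the contract."""
    output: str          # the work product
    claim: str           # what the generator says was achieved
    criteria: List[str]  # the rubric

    def render(self) -> str:
        # The skill asks for 3-5 specific, checkable criteria.
        if not 3 <= len(self.criteria) <= 5:
            raise ValueError(f"expected 3-5 criteria, got {len(self.criteria)}")
        rubric = "\n".join(f"{i}. {c}" for i, c in enumerate(self.criteria, 1))
        return (
            f"GENERATOR OUTPUT:\n{self.output}\n\n"
            f"GENERATOR CLAIM:\n{self.claim}\n\n"
            f"EVALUATION CRITERIA:\n{rubric}"
        )
```

Rejecting a packet with too few or too many criteria at render time keeps a vague rubric from ever reaching the evaluator.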

The evaluator agent (spawned separately, no access to the generator's reasoning) must:

Read ONLY the output and criteria — not the generator's process
For each criterion: PASS / FAIL / PARTIAL with one-sentence evidence
Produce a final VERDICT: ACCEPT / REVISE / REJECT
For REVISE: list exactly what must change (no vague "improve X")
For REJECT: state the blocking failure
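The evaluator's obligations above can be checked mechanically before a verdict is trusted. The following is a sketch under assumptions: the type names (`Result`, `Verdict`, `CriterionResult`, `EvaluationReport`) and the `problems` method are hypothetical and not part of the skill, which only specifies the labels and what each verdict must carry.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Result(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    PARTIAL = "PARTIAL"


class Verdict(Enum):
    ACCEPT = "ACCEPT"
    REVISE = "REVISE"
    REJECT = "REJECT"


@dataclass
class CriterionResult:
    criterion: str
    result: Result
    evidence: str  # one sentence of evidence, per the contract


@dataclass
class EvaluationReport:
    results: List[CriterionResult]
    verdict: Verdict
    required_changes: List[str] = field(default_factory=list)  # used by REVISE
    blocking_failure: str = ""                                 # used by REJECT

    def problems(self) -> List[str]:
        """Return contract violations; an empty list means the report is well-formed."""
        found = []
        for r in self.results:
            if not r.evidence.strip():
                found.append(f"no evidence for criterion: {r.criterion}")
        if self.verdict is Verdict.REVISE and not self.required_changes:
            found.append("REVISE verdict lists no required changes")
        if self.verdict is Verdict.REJECT and not self.blocking_failure.strip():
            found.append("REJECT verdict states no blocking failure")
        return found
```

This only checks the shape of a report. Whether the evidence is true, or the requested changes are specific enough, still needs a reader.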

Usage in workspace-dispatch

After the worker agent completes a task, call this skill:

Package the output + original task as GENERATOR OUTPUT
Spawn a fresh Claude instance with only this skill + the packaged output
Do not pass the worker's chat history to the evaluator
Only an ACCEPT verdict proceeds to the next task in the DAG
REVISE sends the specific changes back to the worker (max 2 revision loops)
REJECT escalates to the orchestrator
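The dispatch flow above can be sketched as a small loop. The `Worker` and `Evaluator` callables are hypothetical stand-ins for however workspace-dispatch actually spawns agents. One assumption is made explicit in the code: the skill does not say what happens when the output still needs revision after two loops, so this sketch escalates in that case too.

```python
from typing import Callable, List, Tuple

MAX_REVISIONS = 2  # "max 2 revision loops" from the skill

# Hypothetical interfaces; the skill does not define them.
Worker = Callable[[str, List[str]], str]                 # (task, requested changes) -> output
Evaluator = Callable[[str, str], Tuple[str, List[str]]]  # (task, output) -> (verdict, changes)


def run_with_evaluation(task: str, worker: Worker, evaluator: Evaluator) -> Tuple[str, str]:
    """Return (status, output), where status is 'ACCEPT' or 'ESCALATE'."""
    output = worker(task, [])
    for attempt in range(MAX_REVISIONS + 1):
        # The evaluator receives only the task and the output,
        # never the worker's chat history.
        verdict, changes = evaluator(task, output)
        if verdict == "ACCEPT":
            return "ACCEPT", output
        # REJECT escalates immediately. Assumption: running out of
        # revision loops escalates as well.
        if verdict == "REJECT" or attempt == MAX_REVISIONS:
            break
        output = worker(task, changes)
    return "ESCALATE", output
```

Keeping the evaluator's signature to `(task, output)` makes the isolation rule structural: there is no parameter through which the worker's reasoning could leak.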

Evaluation rubrics by task type

Code implementation

Does it compile / type-check?
Does it match the spec exactly (no scope creep)?
Is it free of security issues (XSS, injection, exposed secrets)?
Are existing tests passing?

Content / copy

Does it satisfy the stated purpose?
Is it factually accurate (no hallucinated claims)?
Does it use the correct tone and brand voice?
Is it complete — no placeholders or [TODO] markers?

Research / analysis

Are all claims sourced or tagged [INFERENCE]/[UNCONFIRMED]?
Is the conclusion supported by the evidence presented?
Are counterarguments addressed?
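One way to make these rubrics reusable is to keep them as data keyed by task type. This is only a sketch: the `RUBRICS` mapping, its keys, and `criteria_for` are hypothetical names, and the criteria are the questions above restated as short checkable statements.

```python
from typing import Dict, List

# Rubrics from this skill, restated as checkable statements.
RUBRICS: Dict[str, List[str]] = {
    "code": [
        "Compiles / type-checks",
        "Matches the spec exactly (no scope creep)",
        "No security issues (XSS, injection, exposed secrets)",
        "Existing tests pass",
    ],
    "content": [
        "Satisfies the stated purpose",
        "Factually accurate (no hallucinated claims)",
        "Uses the correct tone and brand voice",
        "Complete, with no placeholders or [TODO] markers",
    ],
    "research": [
        "All claims sourced or tagged [INFERENCE]/[UNCONFIRMED]",
        "Conclusion supported by the evidence presented",
        "Counterarguments addressed",
    ],
}


def criteria_for(task_type: str) -> List[str]:
    """Return a copy of the rubric for a task type, or raise ValueError."""
    if task_type not in RUBRICS:
        raise ValueError(f"no rubric for task type {task_type!r}")
    return list(RUBRICS[task_type])
```

Returning a copy lets a caller append task-specific criteria without mutating the shared rubric.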

Anti-patterns (evaluator must flag these)

"Looks good overall" without criterion-by-criterion evidence
Passing output that contains [TODO], [placeholder], or stub functions
Accepting output that adds scope beyond what was requested
Letting "probably works" pass — only verified pass or fail
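Two of these anti-patterns lend themselves to a cheap automated pre-check. The sketch below assumes plain-text input; the function names and the `VAGUE_PHRASES` list are hypothetical. It does not attempt to detect stub functions or scope creep, which need language-aware or spec-aware analysis.

```python
import re
from typing import List

# Markers named in the skill, matched case-insensitively.
PLACEHOLDER_RE = re.compile(r"\[(?:TODO|placeholder)\]", re.IGNORECASE)

# Hypothetical list of vague phrases, seeded from the examples above.
VAGUE_PHRASES = ("looks good overall", "probably works")


def find_placeholders(work_product: str) -> List[str]:
    """Return every [TODO] / [placeholder] marker found in the work product."""
    return [m.group(0) for m in PLACEHOLDER_RE.finditer(work_product)]


def find_vague_phrases(evaluation_text: str) -> List[str]:
    """Return the vague phrases that appear in an evaluator's write-up."""
    lowered = evaluation_text.lower()
    return [p for p in VAGUE_PHRASES if p in lowered]
```

A non-empty result from either function is a reason to look closer, not a verdict in itself; an empty result does not show the output is clean.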
