adversarial-debate-truthseeking

Stars: 378

Forks: 27

Updated: June 21, 2026, 14:06

Strategy: Dialectic engine retuned for truth-seeking, not survival. A defender steelmans a claim into its MOST falsifiable form, a critic attacks to refute it, a judge classifies the exchange into BROKEN/CORROBORATED/UNFALSIFIABLE — the judge does NOT pick a winner or score persuasiveness. Methods: Irving debate (repurposed), Toulmin argumentation, Mayo severe testing.

Source

yogsoth-ai

yogsoth-ai/de-anthropocentric-research-engine

Adversarial Debate (Truth-Seeking Variant)

A retuning of classic critic-defender-judge debate. The classic version asks "which side argued better?" and outputs a survival/resilience verdict, which is a persuasiveness metric. That is the wrong target for research: a claim can win a debate by being slippery (impossible to pin down) while being scientifically empty. This variant changes the roles and the judge's job so that the debate produces a falsifiability classification, not a winner.

What changed from the original (multiagent-debate)

Element	Original (publication)	This variant (truth-seeking)
Defender's job	Make the claim look strong / survive	State the claim in its MOST falsifiable form — maximize what it forbids
Critic's job	Find weaknesses to score against	Find one concrete observation/computation that would refute it
Judge's verdict	Winner + resilience score	Bucket: BROKEN / CORROBORATED / UNFALSIFIABLE
Success	Artifact survives	We learn whether the claim is even testable, and if so whether it holds
Slippery claim	Wins (un-attackable)	Flagged UNFALSIFIABLE (worst outcome)

Roles

Defender (steelman-to-falsifiable)

The defender does NOT defend a comfortable or vague version of the claim. The defender's sole job is to restate the claim in the sharpest, most forbidding form that is still faithful to what we actually meant. "The unification is elegant" is not a defendable form because it forbids nothing. "Under intervention X the law collapses to form A and NOT form B" is defendable: it stakes out something an observation could contradict. If the defender cannot produce a forbidding form, that itself is the verdict (UNFALSIFIABLE), and the defender must report this honestly rather than retreat to vagueness.
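A minimal sketch of how the defender's output might be represented, assuming a forbidding form can be written as a statement plus an explicit list of outcomes it rules out. The `FalsifiableForm` class and its field names are hypothetical and are not part of the skill itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FalsifiableForm:
    """A claim as restated by the defender, plus the outcomes it rules out."""
    statement: str
    forbids: tuple[str, ...] = field(default_factory=tuple)

    @property
    def is_falsifiable(self) -> bool:
        # A form that forbids nothing cannot be contradicted by any observation.
        return len(self.forbids) > 0


vague = FalsifiableForm("The unification is elegant.")
sharp = FalsifiableForm(
    "Under intervention X the law collapses to form A and NOT form B.",
    forbids=("form B observed under intervention X",),
)
print(vague.is_falsifiable, sharp.is_falsifiable)  # prints: False True
```

Making the forbidden outcomes an explicit field means an empty list is visible immediately, which is the honest UNFALSIFIABLE report the role asks for.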

Critic (refuter)

The critic does NOT accumulate debating points. The critic tries to produce ONE of: (a) a counterexample (a case the claim forbids but that occurs / could occur), (b) a derivation error (the claim does not follow from its stated premises), (c) a demonstration that the claim's "forbidden" set is empty (it forbids nothing → UNFALSIFIABLE). The critic must commit to a specific refuter before arguing, to prevent goalpost-shifting.
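The commit-before-arguing rule could be sketched as follows. This is an illustrative assumption about how a commitment might be recorded, with hypothetical names (`RefuterKind`, `CommittedRefuter`, `matches_commitment`). The three kinds mirror options (a) to (c) above.

```python
from dataclasses import dataclass
from enum import Enum


class RefuterKind(Enum):
    COUNTEREXAMPLE = "counterexample"          # (a) a forbidden case that occurs or could occur
    DERIVATION_ERROR = "derivation_error"      # (b) the claim does not follow from its premises
    EMPTY_FORBIDDEN_SET = "empty_forbidden_set"  # (c) the claim forbids nothing


@dataclass(frozen=True)  # frozen: the commitment cannot be edited after the fact
class CommittedRefuter:
    kind: RefuterKind
    target: str  # the specific thing the critic says it will show


def matches_commitment(committed: CommittedRefuter, kind: RefuterKind, target: str) -> bool:
    """Return True only if the attack actually made is the one committed to."""
    return committed.kind is kind and committed.target == target


commitment = CommittedRefuter(RefuterKind.COUNTEREXAMPLE, "form B under intervention X")
print(matches_commitment(commitment, RefuterKind.COUNTEREXAMPLE, "form B under intervention X"))
print(matches_commitment(commitment, RefuterKind.DERIVATION_ERROR, "form B under intervention X"))
```

The second call returns `False`: switching refuter type mid-debate is exactly the goalpost-shifting the commitment is meant to expose.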

Judge (classifier, not picker)

The judge reads the exchange and assigns a bucket. The judge explicitly does NOT decide who argued better. The judge asks three questions in order:

1. Did the defender produce a falsifiable form? If NO → UNFALSIFIABLE.
2. Did the critic produce a valid refuter against that form? If YES → BROKEN (record the refutation).
3. Did the falsifiable form survive a severe critic attack (one that would likely have succeeded if the claim were false)? If YES → CORROBORATED (record what was forbidden-and-held). Survival against a weak or lazy attack is NOT corroboration; the judge sends it back for a more severe round.
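The ordered questions above can be sketched as a small decision function. The boolean inputs are assumed to have been settled by the judge beforehand, and the names `Bucket` and `judge` are hypothetical. A return value of `None` stands for "send back for a more severe round".

```python
from enum import Enum
from typing import Optional


class Bucket(Enum):
    UNFALSIFIABLE = "UNFALSIFIABLE"
    BROKEN = "BROKEN"
    CORROBORATED = "CORROBORATED"


def judge(has_falsifiable_form: bool, valid_refuter: bool, attack_was_severe: bool) -> Optional[Bucket]:
    """Apply the three questions in order; None means no verdict yet."""
    if not has_falsifiable_form:
        return Bucket.UNFALSIFIABLE
    if valid_refuter:
        return Bucket.BROKEN
    if attack_was_severe:
        return Bucket.CORROBORATED
    return None  # survived only a weak attack: escalate, do not corroborate


print(judge(True, False, False))  # prints: None
```

Note that no argument measures how well either side argued. Persuasiveness has no input into the function at all.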

Execution (per claim)

1. debate-architect (import): set rounds (budget), pick the single claim, brief the three roles on the truth-seeking variant explicitly.
2. Defender turn: produce the most-falsifiable form. Record it verbatim.
3. Critic turn: commit to a refuter type, then attack.
4. Cross-examination (import cross-examination SOP): judge probes the critic's refuter for validity and the defender's form for hidden escape hatches (the moves a slippery claim uses to dodge: redefining terms mid-debate, retreating to a weaker claim, "it's just a model").
5. Severity check: judge asks: if the claim were false, would this attack have caught it? If not, escalate (more severe critic) before any CORROBORATED verdict.
6. Bucketing: assign and record.
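The control flow of the steps above might look like the sketch below. It assumes each role can be modeled as a plain callable and, for brevity, folds cross-examination into the critic's boolean result. `run_debate` and its parameters are hypothetical, and the `"UNTESTED"` fallback for an exhausted round budget is an assumption drawn from the severity rubric.

```python
from typing import Callable, Optional


def run_debate(
    claim: str,
    defender: Callable[[str], Optional[str]],
    critic: Callable[[str, int], bool],
    is_severe: Callable[[int], bool],
    max_rounds: int,
) -> str:
    """Return a bucket name, or 'UNTESTED' if the round budget runs out."""
    form = defender(claim)  # defender turn: most-falsifiable form, or None
    if form is None:
        return "UNFALSIFIABLE"
    for level in range(1, max_rounds + 1):
        if critic(form, level):  # critic turn: True means a valid refuter was found
            return "BROKEN"
        if is_severe(level):     # severity check gates any CORROBORATED verdict
            return "CORROBORATED"
        # Otherwise escalate: the next iteration uses a more severe critic.
    return "UNTESTED"


verdict = run_debate(
    "claim",
    defender=lambda c: "sharp form of " + c,
    critic=lambda form, level: False,   # never finds a refuter
    is_severe=lambda level: level >= 2,  # only level 2 and up counts as severe
    max_rounds=3,
)
print(verdict)  # prints: CORROBORATED
```

With `max_rounds=1` the same inputs would return `"UNTESTED"`, since the only attack made was not severe.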

Severity rubric (Mayo)

A test/attack is severe to the degree that the claim would probably have FAILED it if the claim were false. Cheap severity wins: prefer the attack that, had the claim been wrong, would most obviously have broken it. A claim that passes only weak tests stays at "untested," never "corroborated."
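One way to put a number on the rubric, offered as a sketch: treat severity as one minus the estimated probability that the claim would pass the attack even if it were false. The estimate itself has to come from the judge, and the 0.9 threshold is an arbitrary illustrative assumption, not a value the rubric prescribes.

```python
def severity(p_pass_if_false: float) -> float:
    """Severity of a passed attack: the chance it would have caught a false claim."""
    if not 0.0 <= p_pass_if_false <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return 1.0 - p_pass_if_false


def status_after_passing(p_pass_if_false: float, threshold: float = 0.9) -> str:
    # Passing a weak attack leaves the claim "untested", never "corroborated".
    return "corroborated" if severity(p_pass_if_false) >= threshold else "untested"


print(status_after_passing(0.05))  # prints: corroborated
print(status_after_passing(0.60))  # prints: untested
```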

Anti-patterns the judge must catch

Definitional retreat — defender redefines a term to escape the counterexample. → counts as BROKEN of the original claim; the redefined claim is a NEW claim needing its own debate.
Persuasiveness creep — either side scoring rhetorical points. → judge ignores; only refuters and forbidding-content count.
Model-shield — "it's only a model, of course it's idealized" used to deflect a refuter that targets a load-bearing idealization. → does not save the claim; the idealization is the claim under test.
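As a rough illustration of the definitional-retreat rule, the sketch below compares the defender's verbatim recorded form with the form being defended after a counterexample was raised. Plain text comparison is a crude stand-in for the judge's reading of whether a term was redefined, and `audit_restatement` is a hypothetical helper.

```python
def _normalize(text: str) -> str:
    # Ignore case and whitespace so only real wording changes count.
    return " ".join(text.split()).casefold()


def audit_restatement(recorded_form: str, current_form: str) -> dict:
    """Compare the form defended now against the verbatim record."""
    if _normalize(recorded_form) == _normalize(current_form):
        return {"original_bucket": None, "new_claims": []}
    # The claim moved mid-debate: the original counts as BROKEN, and the
    # changed wording is queued as a new claim needing its own debate.
    return {"original_bucket": "BROKEN", "new_claims": [current_form]}


result = audit_restatement("All swans are white.", "All European swans are white.")
print(result["original_bucket"])  # prints: BROKEN
```

This is why the defender's form is recorded verbatim: without the record there is nothing to compare a later restatement against.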

Output

DebateBucketing: per claim → {most-falsifiable form produced (or "none — UNFALSIFIABLE") | critic's committed refuter | cross-exam findings | severity of attack | bucket | recorded refutation-or-forbidden-content}.
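A possible shape for one DebateBucketing record, sketched as a dataclass. The field names are assumptions chosen to mirror the items listed above; the skill does not specify a concrete schema or serialization format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DebateBucketing:
    claim: str
    most_falsifiable_form: Optional[str]  # None means no form was produced (UNFALSIFIABLE)
    committed_refuter: Optional[str]
    cross_exam_findings: list[str]
    attack_severity: str
    bucket: str                           # BROKEN / CORROBORATED / UNFALSIFIABLE
    recorded_content: Optional[str]       # the refutation, or what was forbidden and held

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


record = DebateBucketing(
    claim="The unification is elegant.",
    most_falsifiable_form=None,
    committed_refuter="empty forbidden set",
    cross_exam_findings=["no observation named that would contradict the claim"],
    attack_severity="n/a",
    bucket="UNFALSIFIABLE",
    recorded_content=None,
)
print(record.to_json())
```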

Other Skills in this repository

isomorphism-falsification

yogsoth-ai/de-anthropocentric-research-engine

Strategy: Attack an isomorphism claim by demanding an explicit structure-preserving map and trying to break it. Targets any multi-language claim of the form 'X ≅ Y ≅ … across N mathematical languages'. Forces the claim to either earn the word 'isomorphism' or be demoted to 'analogy'. Methods: category theory (functor/natural-iso criteria), model theory, Lakatos monster-barring.

2026-06-21 · 378 stars

circular-validation-audit

yogsoth-ai/de-anthropocentric-research-engine

Strategy: Run BEFORE building any validator (sandbox/simulation/benchmark). Builds a non-circularity matrix of theory-claim × validator-assumption to detect when a validator would 'confirm' a theory only because it was built on the theory's own premises. A circular validator's PASS carries zero evidential weight. Methods: Cartwright nomological machines, Winsberg sanctioning-of-simulations, tautology detection.

2026-06-21 · 378 stars

elegance-trap-probe

yogsoth-ai/de-anthropocentric-research-engine

Strategy: Attack a beautiful unified result on the suspicion that its beauty is the bug. Distinguishes EARNED simplicity (forbids/predicts/subsumes) from DECORATIVE simplicity (re-describes/relabels/accommodates). Directly serves the Occam aesthetic by making it a falsifiable bar, not a vibe. Methods: Sober parsimony-as-evidence, MDL, Meehl risky prediction, accommodation-vs-prediction.

2026-06-21 · 378 stars

falsification-first-stress-test

yogsoth-ai/de-anthropocentric-research-engine

Campaign: Truth-seeking adversarial validation for scientific research artifacts (NOT publication defense). Core question: Where have we fooled ourselves, and is each load-bearing claim even falsifiable? Win-condition is INVERTED from survival/resilience to active refutation. Methods: Popper falsificationism, Lakatos Proofs and Refutations, Mayo severe testing, Platt strong inference.

2026-06-21 · 378 stars

independent-convergence-audit

yogsoth-ai/de-anthropocentric-research-engine

Strategy: Attack the evidential weight of an 'independent convergence' claim. When N reasoning paths all reach the same conclusion, the confidence boost is real only if the paths were actually independent. Measures shared-prior / shared-blindspot contamination and corrects the over-counted confidence. Methods: Bayesian agreement-as-evidence, correlated-error analysis, jury theorem assumptions.

2026-06-21 · 378 stars

red-team-truthseeking

yogsoth-ai/de-anthropocentric-research-engine

Strategy: Systematic adversarial probing retuned for truth-seeking. Threat surface = the set of load-bearing claims. Output is NOT a resilience score and NOT a hardening list — it is, per claim, the specific observation/computation that would refute it, plus which attacks succeeded. Methods: UFMCS Key Assumptions Check (repurposed), CIA Devil's Advocacy, Platt strong inference.

2026-06-21 · 378 stars

name	adversarial-debate-truthseeking
description	Strategy: Dialectic engine retuned for truth-seeking, not survival. A defender steelmans a claim into its MOST falsifiable form, a critic attacks to refute it, a judge classifies the exchange into BROKEN/CORROBORATED/UNFALSIFIABLE — the judge does NOT pick a winner or score persuasiveness. Methods: Irving debate (repurposed), Toulmin argumentation, Mayo severe testing.
type	strategy
produces	DebateBucketing
dependencies	{"sops":["cross-examination","debate-architect"]}
