autoresearch-reason

Name: Autoresearch Reason
Author: wjgoarxiv

// Adversarial multi-round reasoning with blind-judge panel to reach rigorous conclusions. TRIGGER when: user wants rigorous reasoning or argument evaluation; user wants a decision analyzed from multiple angles; user wants devil's advocate critique; user asks "what are the strongest arguments for/against"; user wants a structured debate; user wants to avoid groupthink or anchoring; user invokes /autoresearch:reason. DO NOT TRIGGER when: user wants a simple recommendation; user wants a quick summary; user wants factual lookup; user just wants pros/cons without adversarial pressure.

stars:16

forks:2

updated: April 5, 2026, 06:56

SKILL.md

name	autoresearch:reason
description	Adversarial multi-round reasoning with blind-judge panel to reach rigorous conclusions. TRIGGER when: user wants rigorous reasoning or argument evaluation; user wants a decision analyzed from multiple angles; user wants devil's advocate critique; user asks "what are the strongest arguments for/against"; user wants a structured debate; user wants to avoid groupthink or anchoring; user invokes /autoresearch:reason. DO NOT TRIGGER when: user wants a simple recommendation; user wants a quick summary; user wants factual lookup; user just wants pros/cons without adversarial pressure.
allowed-tools	["Read","Write","Edit","Bash","WebFetch","WebSearch"]

autoresearch:reason

Adversarial multi-round reasoning loop with a blind-judge panel. Arguments are assigned crypto-random IDs before critique so judges evaluate logic, not author identity. Runs until convergence or until the budget is exhausted.

Autonomy Directive

You are an autonomous reasoning agent. Once the debate begins:

NEVER STOP to ask permission between rounds.
NEVER ASK "should I continue?" mid-debate.
NEVER DECLARE CONVERGENCE prematurely — a single unchallenged round is not convergence.
The loop runs until: all positions converge OR budget exhausted OR user interrupts.
If none of these conditions holds, begin the next round immediately.

Setup

Step 1 — Define the question

If not provided, ask once: "What is the question or decision to reason about?" The question must be specific enough to allow falsifiable positions. Refuse vague inputs like "think about AI" — ask for a concrete framing.

Step 2 — Set parameters

Ask if not provided:

Number of positions (default: 3). Minimum 2, maximum 5.
Max rounds (default: 4). Minimum 2.
Convergence threshold: "Converged" = all judges rate the top position ≥8/10 on their respective dimensions AND no position has an unanswered rebuttal.
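The parameter bounds above can be captured in a small validated container. This is a minimal sketch, not part of the skill itself: the class name `DebateParams` and its field names are hypothetical, and it enforces the stated minimum of 2 rounds even though the Edge Cases table also describes a 1-round budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebateParams:
    """Hypothetical container for the Step 2 parameters."""

    num_positions: int = 3    # default 3, allowed range 2..5
    max_rounds: int = 4       # default 4, minimum 2
    score_threshold: int = 8  # every judge must score the top position >= this

    def __post_init__(self) -> None:
        # Reject values outside the ranges stated in Step 2.
        if not 2 <= self.num_positions <= 5:
            raise ValueError("num_positions must be between 2 and 5")
        if self.max_rounds < 2:
            raise ValueError("max_rounds must be at least 2")
```

Calling `DebateParams()` yields the defaults; `DebateParams(num_positions=6)` raises `ValueError`.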

Round Structure

Each round follows this exact sequence:

Phase 1 — Propose Positions (Round 1 only)

Generate N distinct positions on the question. Positions must:

Be mutually distinguishable (not minor variations of each other)
Be stated as falsifiable claims, not vague preferences
Cover the genuine range of defensible views (not strawmen)

Write each position to reason/rounds.md as: Position [PENDING-ID]: [statement]

Phase 2 — Assign Crypto-Random IDs

Assign each position a random alphanumeric ID (e.g., ARG-7F3A, ARG-2C91). Write the mapping to reason/id-map.md — this file is sealed until the end (not read during debate). Replace all position labels in reason/rounds.md with their assigned IDs. From this point forward, all debate references use IDs only — never "Position 1" or author names.
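One way to produce IDs in the `ARG-7F3A` style is Python's standard `secrets` module. The sketch below assumes a four-hex-digit suffix (inferred from the examples above) and a hypothetical helper name `assign_ids`; the returned mapping is what would be written to reason/id-map.md.

```python
import secrets


def assign_ids(labels: list[str]) -> dict[str, str]:
    """Map a fresh random ID (e.g. 'ARG-7F3A') to each original position label."""
    id_map: dict[str, str] = {}
    for label in labels:
        arg_id = f"ARG-{secrets.token_hex(2).upper()}"
        while arg_id in id_map:  # regenerate on the rare collision
            arg_id = f"ARG-{secrets.token_hex(2).upper()}"
        id_map[arg_id] = label
    return id_map
```

`secrets.token_hex(2)` returns four hexadecimal characters drawn from the operating system's cryptographic random source, so the IDs carry no information about position order.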

Phase 3 — Blind Critique Round

For each argument ID, write a rigorous critique:

Identify the strongest assumption and test whether it holds
Find the weakest logical link in the chain
Propose a concrete counterexample or falsifying scenario
Rate logical strength: 1–10 (where 10 = logically airtight)

Critiques reference IDs only: "ARG-7F3A assumes X, which fails when Y..." Write all critiques to reason/rounds.md under ## Round [N] — Critiques.

Phase 4 — Rebuttal Round

Each argument ID responds to the critiques it received:

Acknowledge valid critiques (update or narrow the claim)
Rebut invalid critiques with specific evidence or reasoning
A position may concede partially — update the statement if weakened by critique

Write rebuttals to reason/rounds.md under ## Round [N] — Rebuttals.

Phase 5 — Judge Evaluation

Three independent judges evaluate all arguments:

Judge A: Logic and internal consistency
Judge B: Evidence quality and falsifiability
Judge C: Practical applicability

Each judge scores every argument ID on their dimension (1–10) and identifies the strongest argument. Judges reference IDs only. Write scores to reason/rounds.md under ## Round [N] — Judgment.
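The judge panel's output can be pictured as a score sheet keyed by judge and argument ID. The sketch below assumes the overall strongest argument is the one with the highest mean across judges, which matches the "avg judge score" field in verdict.md but is not spelled out in this phase; the name `strongest_argument` and the sample scores are illustrative only.

```python
from statistics import mean

# Hypothetical score sheet: judge -> {argument ID -> score on that judge's dimension}
Scores = dict[str, dict[str, int]]


def strongest_argument(scores: Scores) -> str:
    """Return the argument ID with the highest mean score across all judges."""
    arg_ids = next(iter(scores.values())).keys()
    return max(arg_ids, key=lambda arg_id: mean(s[arg_id] for s in scores.values()))


example: Scores = {
    "A": {"ARG-7F3A": 9, "ARG-2C91": 6},  # logic and internal consistency
    "B": {"ARG-7F3A": 8, "ARG-2C91": 7},  # evidence quality and falsifiability
    "C": {"ARG-7F3A": 8, "ARG-2C91": 9},  # practical applicability
}
```

With the sample sheet, `strongest_argument(example)` returns `"ARG-7F3A"`, whose mean (about 8.3) exceeds that of `ARG-2C91` (about 7.3).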

Phase 6 — Convergence Check

After each round:

If the top-scoring argument scores ≥8/10 from all judges AND no unanswered rebuttal exists → converged.
If max rounds are reached → stop with budget exhausted (reported as not converged in verdict.md).
Otherwise → begin next round (return to Phase 3 with updated positions).
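The three branches above reduce to a small decision function. This is a sketch under stated assumptions: `check_round` and its parameter names are hypothetical, "unanswered" is taken to be a count of critiques that received no rebuttal, and the `min_rounds` guard reflects the Edge Cases rule that a dominant position in round 1 still requires a second round.

```python
def check_round(
    top_scores: list[int],
    unanswered: int,
    round_no: int,
    max_rounds: int,
    threshold: int = 8,
    min_rounds: int = 2,
) -> str:
    """Return 'converged', 'budget_exhausted', or 'next_round'."""
    # Every judge must rate the top argument at or above the threshold.
    agreed = all(score >= threshold for score in top_scores)
    if agreed and unanswered == 0 and round_no >= min_rounds:
        return "converged"
    if round_no >= max_rounds:
        return "budget_exhausted"
    return "next_round"
```

For example, `check_round([9, 8, 8], 0, 1, 4)` returns `"next_round"` because round 1 alone is never treated as convergence, while the same scores in round 2 return `"converged"`.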

Output Files

All output goes to a reason/ folder in the working directory.

File	Purpose
`reason/rounds.md`	Per-round arguments, critiques, rebuttals, and scores (IDs only during debate)
`reason/verdict.md`	Final synthesis: winning argument with reasoning, minority positions summarized
`reason/id-map.md`	Revealed at end only: maps each ID → original position label

verdict.md structure

# Verdict: [Question]
Rounds completed: [N]
Convergence: [yes/no — budget exhausted]

## Winning Argument
ID: [ARG-XXXX] (revealed: [original position])
Score: [avg judge score]/10
Summary: [2-3 sentence synthesis]
Key evidence: [bullet points]

## Minority Positions
- [ARG-YYYY]: [why it lost — specific logical weakness identified]
- [ARG-ZZZZ]: [why it lost]

## Synthesis
[2-3 paragraphs on what the debate revealed — including any nuances that
don't fit cleanly into the winning argument]

## Confidence
[Low / Medium / High] — with explicit statement of remaining uncertainty

Anti-Anchoring Protocol

The ID system exists to prevent these failure modes:

Position ordering bias — "the first argument always sounds best"
Framing bias — "the argument labeled 'conservative' gets dismissed without evaluation"
Authority bias — "the argument by the named expert wins by default"

Judges MUST NOT reference position order, original labels, or authorship until after verdict.md is written and id-map.md is revealed.
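A simple text check can flag judge or critique text that slips back into label-style references. The sketch below is illustrative only: the pattern list is an assumption, not an exhaustive definition of what counts as a leak, and `leaks_identity` is a hypothetical helper name.

```python
import re

# Assumed patterns for references that reveal ordering or original labels.
FORBIDDEN = re.compile(
    r"\b(?:position\s+\d+|first argument|second argument|third argument)\b",
    re.IGNORECASE,
)


def leaks_identity(text: str) -> list[str]:
    """Return every forbidden label-style reference found in the text."""
    return [match.group(0) for match in FORBIDDEN.finditer(text)]
```

Text such as "ARG-7F3A assumes X, which fails when Y" passes with an empty list, while "Position 1 is stronger than ARG-2C91" yields `["Position 1"]`.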

Edge Cases

Situation	Handling
Two positions are identical	Merge them; reduce N by 1
One position dominates all others in round 1	Still run min 2 rounds — premature convergence is bias
Judges disagree strongly (spread ≥5 points)	Note as "contested" in verdict.md — no forced winner
Budget = 1 round	Complete one full cycle, write verdict with low confidence
Question has a factual answer	Note this upfront — reason is for judgment calls, not facts
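Two of the rows above are mechanical enough to sketch in code. The helper names are hypothetical, the spread is assumed to be the gap between the highest and lowest judge score for the same argument, and "identical" is approximated here by case- and whitespace-insensitive text equality, which is only one possible reading.

```python
def is_contested(judge_scores: list[int], spread_limit: int = 5) -> bool:
    """True when judges' scores for one argument differ by spread_limit or more."""
    return max(judge_scores) - min(judge_scores) >= spread_limit


def merge_identical(positions: list[str]) -> list[str]:
    """Drop positions that repeat an earlier one after normalizing case and spacing."""
    seen: dict[str, str] = {}
    for position in positions:
        key = " ".join(position.lower().split())
        seen.setdefault(key, position)  # keep the first wording encountered
    return list(seen.values())
```

After merging, the number of positions N is simply the length of the returned list.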

related-skills.json

same repository

autoresearch-skill.md

from "wjgoarxiv/autoresearch-skill"

Autonomous research and experimentation toolkit with 10 commands. Core loop inspired by Karpathy's autoresearch — generalizes to any domain with mechanical evaluation, overnight persistence, and zero dependencies. TRIGGER when: user wants autonomous experiments; user mentions "autoresearch" or "auto-research"; user wants iterative optimization; user wants a research loop; user mentions "research.md"; user wants to iterate until some condition; user wants to optimize code, prompts, configs, or parameters iteratively; user invokes any /autoresearch:* subcommand. DO NOT TRIGGER when: user wants a one-shot answer; user wants manual step-by-step guidance; user just wants to read a single paper; user wants a simple web search.

2026-04-05 | stars: 16

pdf.md

from "wjgoarxiv/autoresearch-skill"

Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. Use when the LLM (Claude, ChatGPT, Gemini, or others) needs to fill in a PDF form or to programmatically process, generate, or analyze PDF documents at scale.

2026-04-05 | stars: 16

autoresearch.md

from "wjgoarxiv/autoresearch-skill"

Core autonomous research loop. Reads research.md, proposes hypotheses, runs experiments, evaluates results mechanically, keeps improvements, discards failures, and iterates until the target metric is achieved or the iteration budget is exhausted. TRIGGER when: user invokes "autoresearch" (no subcommand); research.md exists; user wants the 5-stage loop; user wants iterative optimization overnight.

2026-04-05 | stars: 16

autoresearch-debug.md

from "wjgoarxiv/autoresearch-skill"

Scientific bug hunting using falsifiable hypotheses. Forms hypotheses, designs falsifying tests, eliminates candidates systematically, and logs the full investigation trail in a structured debug/ folder. TRIGGER when: user has a bug to investigate scientifically; user wants systematic root-cause analysis; user says "debug", "investigate", "root cause", "why is this failing"; user invokes /autoresearch:debug. DO NOT TRIGGER when: user wants to optimize a metric (use /autoresearch); user wants to fix a known error automatically (use /autoresearch:fix); user just wants a quick one-line answer about what a function does.

2026-04-05 | stars: 16

autoresearch-fix.md

from "wjgoarxiv/autoresearch-skill"

Iterative error-crusher loop that auto-stops at 0 errors. Cascade-aware: fixes dependency errors before their dependents. Refuses anti-patterns that hide errors instead of fixing them. TRIGGER when: user has errors or failures to fix iteratively; user asks to "fix all errors"; user has a failing test suite; user has compilation errors; user has linter errors; user wants systematic error elimination; user invokes /autoresearch:fix. DO NOT TRIGGER when: user wants a one-shot fix for a single obvious bug; user wants debugging guidance only; user wants code review without fixing.

2026-04-05 | stars: 16

autoresearch-plan.md

from "wjgoarxiv/autoresearch-skill"

7-step setup wizard that produces a complete, ready-to-run research.md without executing the research loop. Walks the user through goal, metric, search space, constraints, evaluator design, and baseline measurement, then writes the file. TRIGGER when: user wants to set up a research project; user wants to plan before running the loop; user says "plan my research"; user has a goal but no research.md; user invokes /autoresearch:plan. DO NOT TRIGGER when: research.md already exists and the user wants to run the loop; user wants a one-shot answer; user wants to debug, not optimize.

2026-04-05 | stars: 16

package.json

"author": "wjgoarxiv"

"repository": "wjgoarxiv/autoresearch-skill"
