Autonomous research and experimentation toolkit with 10 commands. Core loop inspired by Karpathy's autoresearch — generalizes to any domain with mechanical evaluation, overnight persistence, and zero dependencies. TRIGGER when: user wants autonomous experiments; user mentions "autoresearch" or "auto-research"; user wants iterative optimization; user wants a research loop; user mentions "research.md"; user wants to iterate until some condition; user wants to optimize code, prompts, configs, or parameters iteratively; user invokes any /autoresearch:* subcommand. DO NOT TRIGGER when: user wants a one-shot answer; user wants manual step-by-step guidance; user just wants to read a single paper; user wants a simple web search.
Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. When the LLM (Claude, ChatGPT, Gemini, or others) needs to fill in a PDF form or programmatically process, generate, or analyze PDF documents at scale.
Core autonomous research loop. Reads research.md, proposes hypotheses, runs experiments, evaluates results mechanically, keeps improvements, discards failures, and iterates until the target metric is achieved or the iteration budget is exhausted. TRIGGER when: user invokes "autoresearch" (no subcommand); research.md exists; user wants the 5-stage loop; user wants iterative optimization overnight.
Scientific bug hunting using falsifiable hypotheses. Forms hypotheses, designs falsifying tests, eliminates candidates systematically, and logs the full investigation trail in a structured debug/ folder. TRIGGER when: user has a bug to investigate scientifically; user wants systematic root-cause analysis; user says "debug", "investigate", "root cause", "why is this failing"; user invokes /autoresearch:debug. DO NOT TRIGGER when: user wants to optimize a metric (use /autoresearch); user wants to fix a known error automatically (use /autoresearch:fix); user just wants a quick one-line answer about what a function does.
Iterative error-crusher loop that auto-stops at 0 errors. Cascade-aware: fixes dependency errors before their dependents. Refuses anti-patterns that hide errors instead of fixing them. TRIGGER when: user has errors or failures to fix iteratively; user asks to "fix all errors"; user has a failing test suite; user has compilation errors; user has linter errors; user wants systematic error elimination; user invokes /autoresearch:fix. DO NOT TRIGGER when: user wants a one-shot fix for a single obvious bug; user wants debugging guidance only; user wants code review without fixing.
7-step setup wizard that produces a complete, ready-to-run research.md without executing the research loop. Walks the user through goal, metric, search space, constraints, evaluator design, and baseline measurement, then writes the file. TRIGGER when: user wants to set up a research project; user wants to plan before running the loop; user says "plan my research"; user has a goal but no research.md; user invokes /autoresearch:plan. DO NOT TRIGGER when: research.md already exists and the user wants to run the loop; user wants a one-shot answer; user wants to debug, not optimize.
Multi-perspective deliberation engine. Gathers independent positions from diverse personas, runs cross-examination and rebuttal rounds, detects herd behavior, and synthesizes a neutral judge verdict with confidence levels. TRIGGER when: user wants multi-perspective prediction, forecasting, scenario analysis, decision analysis, "what will happen if", "should we", "predict the outcome of", structured devil's advocacy, or any question benefiting from adversarial deliberation.
Adversarial multi-round reasoning with blind-judge panel to reach rigorous conclusions. TRIGGER when: user wants rigorous reasoning or argument evaluation; user wants a decision analyzed from multiple angles; user wants devil's advocate critique; user asks "what are the strongest arguments for/against"; user wants a structured debate; user wants to avoid groupthink or anchoring; user invokes /autoresearch:reason. DO NOT TRIGGER when: user wants a simple recommendation; user wants a quick summary; user wants factual lookup; user just wants pros/cons without adversarial pressure.