| name | Science |
| description | The scientific method as a universal problem-solving algorithm — goal-first, hypothesis-plural, falsifiable experiments, honest measurement. Seven core workflows: DefineGoal, GenerateHypotheses (minimum 3 required — single-hypothesis testing is confirmation bias), DesignExperiment, MeasureResults, AnalyzeResults, Iterate, and FullCycle. Two diagnostic shortcuts: QuickDiagnosis (15-minute rule for fast debugging) and StructuredInvestigation (complex multi-factor issues). Scales across micro (TDD, minutes), meso (feature validation, hours-days), and macro (MVP launch, weeks-months). Reference files: METHODOLOGY.md (deep dive on each phase), Protocol.md (how other skills invoke Science), Templates.md (goal/hypothesis/experiment/results templates), Examples.md (worked examples across scales). Integrates with Council (hypothesis validation), Evals (measurement), Development (parallel experiment worktrees), and RedTeam (stress-test hypotheses). RootCauseAnalysis applies Science to failure investigation — pair them when investigating incidents. NOT FOR multi-angle lens passes on requirements (use IterativeDepth for pre-build exploration). USE WHEN think about, figure out, experiment, iterate, improve, optimize, hypothesis, science, full cycle, quick diagnosis, structured investigation, what might work, how do we test, what happened, analyze results. |
| effort | high |
Customization
Before executing, check for user customizations at:
~/.claude/PAI/USER/SKILLCUSTOMIZATIONS/Science/
If this directory exists, load and apply any PREFERENCES.md, configurations, or resources found there. These override default behavior. If the directory does not exist, proceed with skill defaults.
🚨 MANDATORY: Voice Notification (REQUIRED BEFORE ANY ACTION)
You MUST send this notification BEFORE doing anything else when this skill is invoked.
- Send voice notification:
  curl -s -X POST http://localhost:31337/notify \
    -H "Content-Type: application/json" \
    -d '{"message": "Running the WORKFLOWNAME workflow in the Science skill to ACTION"}' \
    > /dev/null 2>&1 &
- Output text notification:
  Running the **WorkflowName** workflow in the **Science** skill to ACTION...
This is not optional. Execute this curl command immediately upon skill invocation.
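The placeholder substitution in the two notifications can be wrapped in small helpers. This is a minimal sketch, not part of the skill: the function names `science_notify_payload`, `science_notify`, and `science_notify_text` are hypothetical, and the payload builder assumes the workflow name and action contain no double quotes or backslashes.

```shell
# Hypothetical helpers; the names are illustrative and not defined by the skill.

# Build the JSON body for the voice notification.
# Assumes $1 (workflow) and $2 (action) contain no double quotes or backslashes.
science_notify_payload() {
  printf '{"message": "Running the %s workflow in the Science skill to %s"}' "$1" "$2"
}

# Send the voice notification in the background and discard all output,
# so the caller does not wait on the notification server.
science_notify() {
  curl -s -X POST http://localhost:31337/notify \
    -H "Content-Type: application/json" \
    -d "$(science_notify_payload "$1" "$2")" \
    > /dev/null 2>&1 &
}

# Print the matching text notification.
science_notify_text() {
  printf 'Running the **%s** workflow in the **Science** skill to %s...\n' "$1" "$2"
}
```

For example, `science_notify QuickDiagnosis "debug stale filters"` followed by `science_notify_text QuickDiagnosis "debug stale filters"` would emit both notifications with the same wording.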
Science - The Universal Algorithm
The scientific method applied to everything. The meta-skill that governs all other skills.
The Universal Cycle
GOAL -----> What does success look like?
|
OBSERVE --> What is the current state?
|
HYPOTHESIZE -> What might work? (Generate MULTIPLE)
|
EXPERIMENT -> Design and run the test
|
MEASURE --> What happened? (Data collection)
|
ANALYZE --> How does it compare to the goal?
|
ITERATE --> Adjust hypothesis and repeat
|
+------> Back to HYPOTHESIZE
The goal is CRITICAL. Without clear success criteria, you cannot judge results.
Workflow Routing
Output when executing: Running the **WorkflowName** workflow in the **Science** skill to ACTION...
Core Workflows
| Trigger | Workflow |
|---|---|
| "define the goal", "what are we trying to achieve" | Workflows/DefineGoal.md |
| "what might work", "ideas", "hypotheses" | Workflows/GenerateHypotheses.md |
| "how do we test", "experiment design" | Workflows/DesignExperiment.md |
| "what happened", "measure", "results" | Workflows/MeasureResults.md |
| "analyze", "compare to goal" | Workflows/AnalyzeResults.md |
| "iterate", "try again", "next cycle" | Workflows/Iterate.md |
| Full structured cycle | Workflows/FullCycle.md |
Diagnostic Workflows
| Trigger | Workflow |
|---|---|
| Quick debugging (15-min rule) | Workflows/QuickDiagnosis.md |
| Complex investigation | Workflows/StructuredInvestigation.md |
Resource Index
| Resource | Description |
|---|---|
| METHODOLOGY.md | Deep dive into each phase |
| Protocol.md | How skills implement Science |
| Templates.md | Goal, Hypothesis, Experiment, Results templates |
| Examples.md | Worked examples across scales |
Domain Applications
| Domain | Manifestation | Related Skill |
|---|---|---|
| Coding | TDD (Red-Green-Refactor) | Development |
| Products | MVP -> Measure -> Iterate | Development |
| Research | Question -> Study -> Analyze | Research |
| Prompts | Prompt -> Eval -> Iterate | Evals |
| Decisions | Options -> Council -> Choose | Council |
Scale of Application
| Level | Cycle Time | Example |
|---|---|---|
| Micro | Minutes | TDD: test, code, refactor |
| Meso | Hours-Days | Feature: spec, implement, validate |
| Macro | Weeks-Months | Product: MVP, launch, measure PMF |
Integration Points
| Phase | Skills to Invoke |
|---|---|
| Goal | Council for validation |
| Observe | Research for context |
| Hypothesize | Council for ideas, RedTeam for stress-test |
| Experiment | Development (Worktrees) for parallel tests |
| Measure | Evals for structured measurement |
| Analyze | Council for multi-perspective analysis |
Key Principles (Quick Reference)
- Goal-First - Define success before starting
- Hypothesis Plurality - NEVER just one idea (minimum 3)
- Minimum Viable Experiments - Smallest test that teaches
- Falsifiability - Experiments must be able to fail
- Measure What Matters - Only goal-relevant data
- Honest Analysis - Compare to goal, not expectations
- Rapid Iteration - Cycle speed > perfect experiments
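The Hypothesis Plurality principle can be enforced mechanically before any experiment starts. The sketch below is one possible gate, not something the skill ships: the function name `require_hypotheses` and the file format (one hypothesis per non-empty line) are assumptions made for illustration.

```shell
# Hypothetical gate: refuse to proceed with fewer than 3 hypotheses.
# Assumed input format: a plain-text file, one hypothesis per non-empty line.
require_hypotheses() {
  # Count lines that contain at least one non-whitespace character.
  count=$(grep -c '[^[:space:]]' "$1" 2>/dev/null || true)
  count=${count:-0}
  if [ "$count" -lt 3 ]; then
    echo "Only $count hypotheses listed; at least 3 are required before testing." >&2
    return 1
  fi
  echo "$count hypotheses ready for experiment design."
}
```

Calling `require_hypotheses hypotheses.txt` (a hypothetical file name) returns a non-zero status until the file lists three or more candidates.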
Anti-Patterns
| Bad | Good |
|---|---|
| "Make it better" | "Reduce load time from 3s to 1s" |
| "I think X will work" | "Here are 3 approaches: X, Y, Z" |
| "Prove I'm right" | "Design test that could disprove" |
| "Pretend failure didn't happen" | "What did we learn?" |
| "Keep experimenting forever" | "Ship and learn from production" |
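A goal phrased like "Reduce load time from 3s to 1s" can be checked directly, which is what makes an experiment falsifiable. The following is a minimal sketch under stated assumptions: `meets_goal` is a hypothetical name, and it assumes a lower-is-better metric with the measured value and target given in the same unit.

```shell
# Hypothetical check for a lower-is-better metric.
# Usage: meets_goal MEASURED TARGET   (same unit for both, e.g. seconds)
# Exit status 0 means the goal is met; non-zero means it is not.
meets_goal() {
  # awk handles the decimal comparison that plain `[ ... ]` cannot.
  awk -v m="$1" -v t="$2" 'BEGIN { exit !(m + 0 <= t + 0) }'
}
```

Usage might look like `if meets_goal 0.9 1; then echo "goal met"; else echo "goal not met"; fi`, where `0.9` is an illustrative measured load time in seconds.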
Quick Start
- Goal - What does success look like?
- Observe - What do we know?
- Hypothesize - At least 3 ideas
- Experiment - Minimum viable tests
- Measure - Collect goal-relevant data
- Analyze - Compare to success criteria
- Iterate - Adjust and repeat
The answer emerges from the cycle, not from guessing.
Gotchas
- Hypothesis-test-analyze is the core loop. Don't skip the hypothesis step — going straight to testing is just trial-and-error, not science.
- Minimum 3 hypotheses before testing. Single-hypothesis testing is confirmation bias.
- Measurements must be specific and reproducible. "It seems better" is not a measurement.
- The FullCycle workflow is for systematic investigation. For quick debugging, use the QuickDiagnosis workflow instead.
Examples
Example 1: Quick diagnosis
User: "figure out why Surface time filters show stale items"
→ Quick diagnosis mode
→ Hypothesis: timestamp format mismatch in D1
→ Test: query D1 for actual stored format
→ Analyze: compare stored vs expected format
→ Result: ISO string vs Unix timestamp mismatch
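The "Analyze" step in Example 1 comes down to deciding which format a stored value is in. Here is a hedged sketch of that comparison: `classify_timestamp` is a hypothetical helper, the actual D1 query is not shown in this document, and any sample values passed to it are illustrative only.

```shell
# Hypothetical classifier: report whether a value looks like a Unix timestamp
# (digits only) or an ISO 8601 string (starts with YYYY-MM-DDT).
classify_timestamp() {
  case "$1" in
    '' | *[!0-9]*)
      # Empty, or contains at least one non-digit: test for the ISO date prefix.
      case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T*) echo "iso" ;;
        *) echo "unknown" ;;
      esac
      ;;
    *) echo "unix" ;;
  esac
}
```

If the stored value classifies as `iso` while the filter code expects `unix` (or the reverse), the hypothesis of a format mismatch is supported; if both agree, it is falsified and the next hypothesis should be tested.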
Example 2: Full systematic investigation
User: "experiment with different prompt structures for better output"
→ Full cycle mode
→ 3+ hypotheses generated
→ Controlled experiments with measurements
→ Analysis identifies winning approach
→ Iterates until convergence
Execution Log
After completing any workflow, append a single JSONL entry:
echo '{"ts":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","skill":"Science","workflow":"WORKFLOW_USED","input":"8_WORD_SUMMARY","status":"ok|error","duration_s":SECONDS}' >> ~/.claude/PAI/MEMORY/SKILLS/execution.jsonl
Replace WORKFLOW_USED with the workflow executed, 8_WORD_SUMMARY with a brief input description, and SECONDS with approximate wall-clock time. Set status to "ok" on success or "error" if the workflow failed; do not log the literal "ok|error" placeholder.
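The placeholder substitution for the log entry can also be done by a small function that rejects invalid values before writing. This is a sketch under stated assumptions: `log_skill_run` is a hypothetical name, the optional fifth argument (an alternate log path) is an addition for testing, and the text fields are assumed to contain no double quotes or backslashes.

```shell
# Hypothetical wrapper around the logging one-liner.
# Usage: log_skill_run WORKFLOW SUMMARY STATUS SECONDS [LOGFILE]
# Assumes WORKFLOW and SUMMARY contain no double quotes or backslashes.
log_skill_run() {
  logfile=${5:-$HOME/.claude/PAI/MEMORY/SKILLS/execution.jsonl}
  # Only the two documented status values are accepted.
  case "$3" in
    ok | error) ;;
    *) echo "status must be ok or error" >&2; return 1 ;;
  esac
  # duration_s is written unquoted, so it must be a whole number of seconds.
  case "$4" in
    '' | *[!0-9]*) echo "duration must be a whole number of seconds" >&2; return 1 ;;
  esac
  printf '{"ts":"%s","skill":"Science","workflow":"%s","input":"%s","status":"%s","duration_s":%s}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" "$2" "$3" "$4" >> "$logfile"
}
```

A call such as `log_skill_run QuickDiagnosis "stale time filter debugging" ok 42` would append one line to the default log file.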