| name | science-review-pipeline |
| description | Critically review a pipeline plan against an evidence rubric — coverage, assumptions, data availability, identifiability, reproducibility, validation, scope. Use when the user wants to review or audit a pipeline before implementation. |
# Review Pipeline
Converted from Claude command /science:review-pipeline.
## Science Codex Command Preamble
Before executing any research command:
- Resolve project profile: Read science.yaml and identify the project's profile. Use the canonical layout for that profile:
  - research → doc/, specs/, tasks/, knowledge/, papers/, models/, data/, code/
  - software → doc/, specs/, tasks/, knowledge/, plus native implementation roots such as src/ and tests/
- Load role prompt: .ai/prompts/<role>.md if present, else references/role-prompts/<role>.md.
- Load the science-research-methodology and science-scientific-writing Codex skills. If native skill loading is unavailable, use codex-skills/INDEX.md to map canonical Science skill names to generated skill files and source paths.
- Read specs/research-question.md for project context when it exists.
- Load project aspects: Read aspects from science.yaml (default: empty list). For each declared aspect, resolve the aspect file in this order:
  - aspects/<name>/<name>.md (canonical Science aspects)
  - .ai/aspects/<name>.md (project-local aspect override or addition)

  If neither path exists (the project declares an aspect that isn't shipped with Science and has no project-local definition), do not block: log a single line like `aspect "<name>" declared in science.yaml but no definition found — proceeding without it` and continue. Suggest the user either (a) drop the aspect from science.yaml, (b) author it under .ai/aspects/<name>.md, or (c) align the name with one shipped under aspects/.

  When executing command steps, incorporate the additional sections, guidance, and signal categories from loaded aspects. Aspect-contributed sections are whole sections inserted at the placement indicated in each aspect file.
- Check for missing aspects: Scan for structural signals that suggest aspects the project could benefit from but hasn't declared (a minimal detection sketch follows this preamble):

  | Signal | Suggests |
  |---|---|
  | Files in specs/hypotheses/ | hypothesis-testing |
  | Files in models/ (.dot, .json DAG files) | causal-modeling |
  | Workflow files, notebooks, or benchmark scripts in code/ | computational-analysis |
  | Package manifests (pyproject.toml, package.json, Cargo.toml) at project root with project source code (not just tool dependencies) | software-development |

  If a signal is detected and the corresponding aspect is not in the aspects list, briefly note it to the user before proceeding:

  "This project has [signal] but the [aspect] aspect isn't enabled. This would add [brief description of what the aspect contributes]. Want me to add it to science.yaml?"

  If the user agrees, add the aspect to science.yaml and load the aspect file before continuing. If they decline, proceed without it. Only check once per command invocation: do not re-prompt for the same aspect if the user has previously declined it in this session.
- Resolve templates: When a command says "Read .ai/templates/<name>.md", check the project's .ai/templates/ directory first. If not found, read from templates/<name>.md. If neither exists, warn the user and proceed without a template; the command's Writing section provides sufficient structure.
- Resolve science CLI invocation: When a command says to run science, prefer the project-local install path: `uv run science <command>`. This assumes the root pyproject.toml includes science as a dev dependency installed via `uv add --dev --editable "$SCIENCE_TOOL_PATH"` (the distribution is science; the entry point it installs is science). If that fails (no root pyproject.toml or science not in dependencies), fall back to `uv run --with <science-plugin-root>/science science <command>`.
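As referenced in the missing-aspects step above, here is a minimal detection sketch. It assumes the command runs from the project root; the signals come from the table, while the helper name and exact glob patterns are illustrative guesses, not part of the Science tooling.

```python
from pathlib import Path

# Map the structural signals from the table above to the aspect each one suggests.
# Glob patterns are illustrative; adjust them to the project's actual layout.
def suggest_aspects(root: Path = Path(".")) -> set[str]:
    suggestions: set[str] = set()

    hypotheses = root / "specs" / "hypotheses"
    if hypotheses.is_dir() and any(hypotheses.iterdir()):
        suggestions.add("hypothesis-testing")

    models = root / "models"
    if models.is_dir() and (list(models.glob("*.dot")) or list(models.glob("*.json"))):
        suggestions.add("causal-modeling")

    code = root / "code"
    if code.is_dir() and (list(code.rglob("*.ipynb")) or list(code.rglob("Snakefile"))):
        suggestions.add("computational-analysis")

    manifests = ("pyproject.toml", "package.json", "Cargo.toml")
    if any((root / m).is_file() for m in manifests):
        # Still confirm the manifest covers project source code, not just tool dependencies.
        suggestions.add("software-development")

    return suggestions

if __name__ == "__main__":
    print(suggest_aspects())
```

Any suggestion it prints should be surfaced to the user as described above, once per command invocation.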
## Prerequisites

- Read docs/proposition-and-evidence-model.md and docs/specs/2026-03-01-knowledge-graph-design.md for graph/entity semantics
- Load the science-research-methodology Codex skill
- Read the discussant role prompt from prompts/roles/discussant.md (if available)
## Overview

This command performs a systematic review of an inquiry and its pipeline plan. It operates as a critical discussant, looking for weaknesses, missing evidence, and unjustified assumptions. The review is NOT a rubber stamp: it should surface problems the user hasn't considered.
## Tool invocation
All science commands below use this pattern:
uv run science <command>
## Rules

- MUST run structural validation first (`inquiry validate`)
- MUST evaluate all 9 rubric dimensions
- MUST be critical: surface weaknesses, don't just confirm the plan is good
- MUST provide specific, actionable recommendations for each issue
- MUST save the review report to doc/inquiries/<slug>-review.md
- SHOULD cross-reference claims against existing literature (LLM knowledge + web search)
- MUST NOT change the inquiry or plan; only report findings
## Workflow

### Step 1: Load inquiry and plan
science inquiry show "<slug>" --format table
science inquiry validate "<slug>" --format json
Also read:

- doc/inquiries/<slug>.md (inquiry document)
- doc/plans/*<slug>* (implementation plan, if it exists)
- specs/scope-boundaries.md (project scope)
Sub-plan handling: If the plan being reviewed is a sub-plan of a larger inquiry (e.g., Tasks 2-3 of a broader inquiry), the inquiry-level validation may pass trivially. In this case:
- Apply the rubric dimensions to the plan's internal consistency, not just the parent inquiry's structure.
- Dimensions 1 (Evidence Coverage) and 7 (Scope Check) may be marked N/A (inherited from the parent) when the parent plan has already passed review on these dimensions; reference the parent plan's review document.
- Focus review effort on dimensions specific to the sub-plan: validation criteria (Dim 6), assumption audit (Dim 2), integration boundaries (Dim 8).
### Step 2: Evaluate each rubric dimension
#### Dimension 1: Evidence Coverage

- Does every non-trivial parameter have sci:paramSource and sci:paramRef?
- Are there [UNVERIFIED] markers in the inquiry doc? (A minimal marker scan is sketched below.)
- Do any sci:Unknown nodes remain?

Scoring: PASS (all params sourced), WARN (some missing refs), FAIL (unsourced causal claims)
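A minimal sketch of that marker scan, assuming the inquiry document is plain Markdown at doc/inquiries/<slug>.md. Counting literal "sci:Unknown" strings in the document is only a heuristic, since graph nodes may not appear verbatim there; the graph itself remains the authoritative source.

```python
import re
import sys
from pathlib import Path

# Usage: python scan_markers.py <slug>
slug = sys.argv[1]
doc = Path("doc/inquiries") / f"{slug}.md"
text = doc.read_text(encoding="utf-8")

unverified = len(re.findall(re.escape("[UNVERIFIED]"), text))
unknowns = text.count("sci:Unknown")  # heuristic only; check the graph for remaining Unknown nodes

print(f"{doc}: {unverified} [UNVERIFIED] marker(s), {unknowns} sci:Unknown mention(s)")
```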
#### Dimension 2: Assumption Audit
For each sci:Assumption and scic:causes edge:
- Is the assumption justified with evidence?
- Could confounders explain the relationship?
- Is the causal direction justified?
Scoring: PASS (all justified), WARN (minor gaps), FAIL (unjustified causal claims)
#### Dimension 3: Data Availability

For each input data source (every BoundaryIn node or data-acquisition step in the plan), check the following (a minimal gate sketch follows the scoring rules below):

- Does it resolve to a dataset:<slug> entity?
- Per origin (verification gate):
  - external: access.verified: true OR access.exception.mode != "". access.source_url populated when verified. access.last_reviewed within the last 12 months.
  - derived: derivation.workflow_run exists; symmetric produces: edge present; derivation.inputs transitively pass.
- Runtime stageability (separate gate, runs in addition to verification):
  - At least one of entity.datapackage or entity.local_path is populated AND the referenced runtime file exists on disk.
  - consumed_by includes plan:<this-plan-file-stem>.
- All eleven state invariants hold (see the spec at docs/specs/2026-04-19-dataset-entity-lifecycle-design.md).
Scoring:

- PASS: all sources resolve; verification gate satisfied per origin; runtime stageability satisfied; backlink present; freshness OK; invariants hold.
- WARN: stale last_reviewed (> 12 months); missing canonical plan:<stem> backlink; cached-field drift between entity and runtime (ontology_terms/license/update_cadence only); lineage drift.
- FAIL, on any of:
  - A source does not resolve to a dataset entity.
  - External access.verified: false with access.exception.mode: "".
  - External access.verified: true but verification_method: "" or no last_reviewed.
  - Derived missing the workflow_run entity, an asymmetric produces: edge, or a broken transitive input chain.
  - Runtime stageability fails: neither datapackage nor local_path populated, OR the referenced runtime file does not exist on disk.
  - A plan references an umbrella entity (non-empty siblings:).
  - Origin/block-exclusion violation (#7 or #8).
  - research-package symmetry violation (#11).
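As referenced above, a minimal sketch of the verification and stageability gates for a single external-origin dataset entity. It assumes the entity has already been parsed into a dict whose keys mirror the field names above (access.verified, access.exception.mode, access.source_url, access.last_reviewed, verification_method, datapackage, local_path, consumed_by) with last_reviewed as an ISO date string; how the entity file is located and loaded, and the derived-origin and invariant checks, are out of scope here.

```python
from datetime import date, timedelta
from pathlib import Path

FRESHNESS = timedelta(days=365)  # last_reviewed must be within the last 12 months

def check_external_entity(entity: dict, plan_stem: str, today: date) -> list[str]:
    """Return gate failures/warnings for one external-origin dataset entity."""
    issues = []
    access = entity.get("access", {})

    # Verification gate: verified, or an explicit exception mode.
    if not access.get("verified") and not access.get("exception", {}).get("mode"):
        issues.append("FAIL: access.verified is false and no exception mode is set")
    if access.get("verified") and not access.get("source_url"):
        issues.append("FAIL: verified but access.source_url is empty")
    if access.get("verified") and not access.get("verification_method"):
        issues.append("FAIL: verified but verification_method is empty")

    reviewed = access.get("last_reviewed")  # assumed ISO date string, e.g. "2025-06-01"
    if access.get("verified") and not reviewed:
        issues.append("FAIL: verified but no access.last_reviewed")
    elif reviewed and today - date.fromisoformat(reviewed) > FRESHNESS:
        issues.append("WARN: access.last_reviewed is older than 12 months")

    # Runtime stageability gate: a datapackage or local_path that exists on disk.
    runtime = entity.get("datapackage") or entity.get("local_path")
    if not runtime or not Path(runtime).exists():
        issues.append("FAIL: neither datapackage nor local_path resolves to a file on disk")

    # Backlink: the reviewed plan must be listed as a consumer.
    if f"plan:{plan_stem}" not in entity.get("consumed_by", []):
        issues.append("WARN: missing plan backlink in consumed_by")

    return issues
```

Run it per input source and aggregate: any FAIL line fails the dimension, while WARN-only results map to WARN.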
#### Dimension 4: Identifiability

- Is every BoundaryOut reachable from BoundaryIn via directed edges? (A reachability sketch follows.)
- Are there disconnected components?
- Can the target hypothesis actually be tested?

Scoring: PASS (fully connected), FAIL (disconnected or unreachable)
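A minimal reachability sketch for the first two checks, assuming the pipeline graph can be exported as a list of directed (source, target) edge pairs plus the sets of BoundaryIn and BoundaryOut node IDs; the export step is not shown and the node names in the example are hypothetical.

```python
from collections import defaultdict, deque

def unreachable_outputs(edges, boundary_in, boundary_out):
    """Return the BoundaryOut nodes not reachable from any BoundaryIn node."""
    adj = defaultdict(list)
    for src, dst in edges:
        adj[src].append(dst)

    # BFS forward from all inputs at once.
    seen = set(boundary_in)
    queue = deque(boundary_in)
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    return set(boundary_out) - seen

# Hypothetical example: one output is fed by the input chain, one is dangling.
edges = [("in:cohort", "t:clean"), ("t:clean", "t:fit"), ("t:fit", "out:effect")]
print(unreachable_outputs(edges, {"in:cohort"}, {"out:effect", "out:forecast"}))
# -> {'out:forecast'}
```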
#### Dimension 5: Reproducibility

- Are random seeds specified? (A minimal seeding sketch follows.)
- Are software versions pinned?
- Are environments reproducible?

Scoring: PASS (fully specified), WARN (partial), FAIL (no reproducibility measures)
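A minimal sketch of what "random seeds specified" can look like in the analysis code itself; the libraries shown (random, numpy) are illustrative assumptions about the stack, and version pinning and environment reproducibility still need to be checked separately in pyproject.toml or lock files.

```python
import os
import random

import numpy as np

SEED = 20260419  # single documented seed, recorded alongside the results

random.seed(SEED)
rng = np.random.default_rng(SEED)          # pass rng explicitly into analysis functions
os.environ["PYTHONHASHSEED"] = str(SEED)   # only effective if set before the interpreter starts

sample = rng.normal(size=5)
print(sample)
```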
#### Dimension 6: Validation Criteria

- Does every sci:Transformation have a sci:validatedBy check? (A minimal coverage check is sketched below.)
- Is the check specific enough to catch failures?

Scoring: PASS (all steps validated), WARN (gaps), FAIL (no validation)
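As referenced above, a minimal coverage sketch for the first question. It assumes the graph can be exported as typed node and edge lists; the exact export mechanism and type strings should be taken from the knowledge-graph spec listed in the prerequisites, and the example names are hypothetical.

```python
def unvalidated_transformations(nodes, edges):
    """Return transformation node IDs with no outgoing sci:validatedBy edge.

    nodes: list of (node_id, node_type) tuples
    edges: list of (source_id, edge_type, target_id) tuples
    """
    transformations = {nid for nid, ntype in nodes if ntype == "sci:Transformation"}
    validated = {src for src, etype, _ in edges if etype == "sci:validatedBy"}
    return transformations - validated

# Hypothetical example: the fit step has a check, the cleaning step does not.
nodes = [("t:clean", "sci:Transformation"), ("t:fit", "sci:Transformation")]
edges = [("t:fit", "sci:validatedBy", "check:heldout-rmse")]
print(unvalidated_transformations(nodes, edges))  # -> {'t:clean'}
```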
#### Dimension 7: Scope Check

- Does the inquiry stay within specs/scope-boundaries.md?
- Are there scope-creep risks?
Scoring: PASS (in scope), WARN (borderline), FAIL (out of scope)
#### Dimension 8: Integration Boundary Check

- Does the plan's output format match the consuming module's input format? (A schema-comparison sketch follows.)
- Check tensor dimensions, data schemas, and API contracts across module boundaries
- Verify that intermediate representations are compatible between pipeline stages
- Read the actual code at integration points (model input shapes, data loader expectations, etc.)

Scoring: PASS (all boundaries verified), WARN (some unchecked), FAIL (mismatches found)
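A minimal schema-comparison sketch for one boundary, assuming each side's expectations can be written down as a column-name-to-dtype mapping; in practice those mappings should be read out of the actual producer and consumer code, as the last bullet says, and the example schemas are hypothetical.

```python
def boundary_mismatches(producer_schema: dict, consumer_schema: dict) -> list[str]:
    """Compare a producer's declared output schema with a consumer's expected input schema."""
    problems = []
    for column, expected in consumer_schema.items():
        if column not in producer_schema:
            problems.append(f"missing column: {column}")
        elif producer_schema[column] != expected:
            problems.append(f"dtype mismatch on {column}: {producer_schema[column]} != {expected}")
    return problems

# Hypothetical boundary: stage A writes a table, stage B's data loader expects these columns.
stage_a_output = {"subject_id": "int64", "dose_mg": "float64", "visit": "object"}
stage_b_input = {"subject_id": "int64", "dose_mg": "float32", "outcome": "float64"}
print(boundary_mismatches(stage_a_output, stage_b_input))
# -> ['dtype mismatch on dose_mg: float64 != float32', 'missing column: outcome']
```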
#### Dimension 9: Manifest Completeness

- Does the workflow produce a datapackage.json manifest in its output directory? (A minimal completeness check is sketched below.)
- Are all output resources listed?
- Are entity cross-references specified?
- Is the provenance DAG included?

Scoring: PASS (complete manifest with resources + entities + provenance), WARN (manifest present but incomplete), FAIL (no manifest generation)
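A minimal completeness sketch, assuming the manifest follows the standard datapackage.json shape for resources and that entity cross-references and provenance live under top-level keys; those key names (entities, provenance) are assumptions, so align them with the project's actual manifest spec before relying on this check.

```python
import json
from pathlib import Path

def manifest_score(output_dir: str) -> str:
    manifest_path = Path(output_dir) / "datapackage.json"
    if not manifest_path.exists():
        return "FAIL: no datapackage.json in output directory"

    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    resources = manifest.get("resources", [])

    # Every file in the output directory (other than the manifest) should be listed.
    listed = {Path(r.get("path", "")).name for r in resources}
    on_disk = {p.name for p in Path(output_dir).iterdir()
               if p.is_file() and p.name != "datapackage.json"}
    missing = on_disk - listed

    gaps = []
    if missing:
        gaps.append(f"unlisted outputs: {sorted(missing)}")
    if not manifest.get("entities"):     # assumed key for entity cross-references
        gaps.append("no entity cross-references")
    if not manifest.get("provenance"):   # assumed key for the provenance DAG
        gaps.append("no provenance DAG")

    return "PASS" if not gaps else "WARN: " + "; ".join(gaps)
```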
### Step 3: Write review report
Save to doc/inquiries/<slug>-review.md:
# Pipeline Review: {{label}}
- **Inquiry:** {{slug}}
- **Date:** {{date}}
- **Overall:** {{PASS|WARN|FAIL}}
## Summary
{{2-3 sentence assessment}}
## Rubric Results
| Dimension | Score | Issues |
|---|---|---|
| Evidence coverage | {{score}} | {{brief}} |
| Assumption audit | {{score}} | {{brief}} |
| Data availability | {{score}} | {{brief}} |
| Identifiability | {{score}} | {{brief}} |
| Reproducibility | {{score}} | {{brief}} |
| Validation criteria | {{score}} | {{brief}} |
| Scope check | {{score}} | {{brief}} |
| Integration boundaries | {{score}} | {{brief}} |
| Manifest completeness | {{score}} | {{brief}} |
## Detailed Findings
### {{Dimension with issues}}
{{Specific findings with actionable recommendations}}
## Recommendations
1. {{Highest priority action}}
2. {{Next priority}}
## Strengths
{{What's done well}}
Update the inquiry status to reviewed.
### Step 4: Present to user
Show the summary table and top recommendations. Ask if they want to:
- Address the findings (modify inquiry/plan)
- Accept the risks and proceed
- Discuss specific findings in more depth
## Important Notes
- Be genuinely critical. The value is in finding problems before implementation.
- Cross-check claims. Use LLM knowledge and web search to verify factual claims.
- Look for circular reasoning. If A justifies B and B justifies A, flag it.
- Consider failure modes. For each transformation: what happens if it fails?
## Process Reflection
Reflect on the template and workflow used above.
If you have feedback (friction, gaps, suggestions, or things that worked well),
report each item via:
science feedback add \
--target "command:review-pipeline" \
--category <friction|gap|guidance|suggestion|positive> \
--summary "<one-line summary>" \
--detail "<optional prose>"
Guidelines:
- One entry per distinct issue (not one big dump)
- If the same issue has occurred before, the tool will detect it and
increment recurrence automatically
- Skip if everything worked smoothly — no feedback is valid feedback
- For template-specific issues, use --target "template:<name>" instead