一键在 Manus 中运行任何 Skill

$pwd:

consensus-interpret

Name: Consensus Interpret
Author: TianGzlab

// LLM-grounded biological interpretation of a verified typed consensus run. Reads the typed run dir + the original adata, runs inline per-cluster DE, looks up markers in a bundled tissue-keyed marker DB, and asks the chair LLM to (γ) name each cluster's likely cell type with mandatory marker citations and (β) recommend top-3 next-step skills with mandatory evidence_refs. Output banner [A+I: Interpreted on verified consensus]. Failure-mode contract per ADR 0012.

在 Manus 中运行

$ git log --oneline --stat

stars:142

forks:24

updated:2026年5月20日 07:47

文件资源管理器

30 个文件

SKILL.md

readonly

related-skills.json

同仓库

sc-consensus-clustering.md

from "TianGzlab/OmicsClaw"

Multi-resolution typed consensus over sc-clustering. Fans out leiden / louvain at several resolutions in parallel, scores members by silhouette + cross-method NMI, runs kmode / weighted / LCA consensus on the surviving base clusterings, and emits a verified report carrying the mandatory A-path banner per ADR 0010.

2026-05-19142

consensus-domains.md

from "TianGzlab/OmicsClaw"

Multi-method consensus over spatial-domains. Fans out 5 methods in parallel, computes a SACCELERATOR-style base-clustering ranking, runs typed consensus (kmode / weighted / LCA), and emits a verified consensus report with the mandatory A-path banner per ADR 0010.

2026-05-19142

omics-skill-builder.md

from "TianGzlab/OmicsClaw"

Load when scaffolding a NEW OmicsClaw skill from a natural-language request — generates the skill directory layout (SKILL.md, parameters.yaml, references/, tests/) under the chosen domain. Skip when modifying an existing skill (edit its files directly) or when only routing a query (use `orchestrator`).

2026-05-15142

orchestrator.md

from "TianGzlab/OmicsClaw"

Load when routing a natural-language omics query to the correct domain skill across spatial / singlecell / genomics / proteomics / metabolomics / bulkrna domains via keyword / LLM / hybrid matching. Skip when the target skill is already known — invoke that skill directly.

2026-05-15142

skill-template.md

from "TianGzlab/OmicsClaw"

Load when copying this directory to bootstrap a new OmicsClaw v2 skill (rename, fill in, then `git add`). Skip when an existing skill already covers the request.

2026-05-12142

sc-gene-programs.md

from "TianGzlab/OmicsClaw"

Load when extracting gene programs (NMF / cNMF factorisation) and per-cell program usage scores from a non-negative scRNA AnnData. Skip when ranking marker genes per cluster (use sc-markers) or for inferring TF → target regulons (use sc-grn).

2026-05-11142

package.json

"author": "TianGzlab"

"repository": "TianGzlab/OmicsClaw"

打开 GitHub 仓库查看创作者相关仓库

$ install --global

$ download --local

在 Manus 中运行

$ useful --forSOC

数据科学家计算机与数学类职业15-2051L4

name	consensus-interpret
description	LLM-grounded biological interpretation of a verified typed consensus run. Reads the typed run dir + the original adata, runs inline per-cluster DE, looks up markers in a bundled tissue-keyed marker DB, and asks the chair LLM to (γ) name each cluster's likely cell type with mandatory marker citations and (β) recommend top-3 next-step skills with mandatory evidence_refs. Output banner [A+I: Interpreted on verified consensus]. Failure-mode contract per ADR 0012.
version	0.1.0
author	OmicsClaw
license	Apache-2.0
tags	["spatial","consensus","interpreted-layer","biology-annotation","marker-grounded","backward-proof-driven-recommendation"]
requires	["anndata","scanpy","numpy","pandas","scikit-learn"]

consensus-interpret

When to use

The user has just finished a verified typed consensus run (consensus-domains or sc-consensus-clustering) and wants the next manual step (read cross_method_nmi.csv → run spatial-de → cross-reference markers → name cell types → decide downstream skill) done automatically with falsifiable evidence binding every LLM claim.

This skill does NOT replace the typed run. It is a strictly downstream consumer: it reads <typed_run_dir>/{plan.json, consensus_labels.tsv, member_scores.csv, cross_method_nmi.csv} plus the original adata, and writes its output to a different directory under analysis://interpreted/<typed_run_id>. The verified-vs-exploratory boundary established by ADR 0010 is preserved.

Skip when:

The typed run did not produce consensus_labels.tsv (i.e. consensus-domains exited non-zero — fix the typed run first).
You are looking for a generic query → skill dispatcher; use orchestrator for that (forward direction).
You want to refine the consensus itself based on LLM judgment; that is explicitly forbidden by §11.4 "LLM never participates in statistical merging" and would be rejected by this skill's T3 invariants.

Inputs & Outputs

Input	Format	Required
Typed run directory	`--input <typed_run_dir>` (must contain `plan.json` written by `consensus-domains` / `sc-consensus-clustering`)	yes
Output directory	`--output <interpreted_dir>`	yes
Tissue hint	`--tissue {brain, immune, kidney, liver}` (selects bundled marker DB)	yes (unless `--markers` provided)
Original AnnData path	`--adata <preprocessed.h5ad>`	no — defaults to `plan.json` `input_path`
User-provided marker DB	`--markers <file.tsv>` (overrides bundled DB; same schema)	no
Disable LLM (structural-only)	`--no-llm`	no — default fails-fast on LLM unavailability
LLM seed (for self-consistency runs)	`--seed 0`	no
Top-K markers reported per cluster	`--top-k-markers 20`	no
Top-K next-step recommendations	`--top-k-next-steps 3`	no (capped at 3 by ADR 0012)
Coverage floor for T2 escalation	`--coverage-floor 0.5`	no

Output	Path	Notes
Interpreted report	`interpreted_report.md`	First line: `[A+I: Interpreted on verified consensus]` (or `[I-noLLM: ...]` if `--no-llm`). Non-configurable per ADR 0012.
Structured cell-type assignments	`interpreted_assignments.json`	machine-readable; schema below
Per-cluster DE table	`de_per_cluster.csv`	inline `scanpy.tl.rank_genes_groups` output; consensus-interpret computes this
Contradiction regions	`contradiction_regions.csv`	rows where cross_method_nmi indicates disagreement; LLM-narrated in markdown
Audit	`audit.json`	`typed_run_id`, adata checksum, marker DB used, LLM model/seed, `interpreted_namespace`, `evidence_base_namespace`

`interpreted_assignments.json` schema

{
  "schema_version": "0.1",
  "typed_run_id": "<plan.json:run_id>",
  "evidence_base_namespace": "analysis://typed/<run_id>",
  "interpreted_namespace": "analysis://interpreted/<run_id>",
  "banner": "[A+I: Interpreted on verified consensus]",
  "operator": "<from typed plan.json>",
  "clusters": [
    {
      "id": 5,
      "n_cells": 778,
      "interpretation_status": "interpreted | low_confidence | failed",
      "cell_type": "CA1 pyramidal",
      "confidence": 0.84,
      "evidence": {
        "markers": [
          {"gene": "Pvrl3", "de_rank": 1, "db_source": "panglaodb_brain", "db_celltype": "CA1 pyramidal neuron", "weight": 0.9},
          {"gene": "Wfs1",  "de_rank": 3, "db_source": "panglaodb_brain", "db_celltype": "CA1 pyramidal neuron", "weight": 0.85}
        ],
        "mean_local_purity": 0.617,
        "member_agreement": [
          {"member": "leiden_resolution-0.5", "label_overlap": 0.92},
          {"member": "leiden_resolution-1.0", "label_overlap": 0.78}
        ]
      },
      "narrative_md_anchor": "#cluster-5"
    }
  ],
  "next_steps": [
    {
      "skill": "spatial-de",
      "args_hint": "--groupby consensus_kmode --comparisons cluster_3_vs_5",
      "priority": 1,
      "evidence_refs": [
        "cross_method_nmi.csv:row=leiden_resolution-0.5,col=leiden_resolution-1.5,value=0.597"
      ],
      "reason": "Lowest pair-wise NMI in matrix; marker disambiguation between clusters 3 and 5 will resolve whether these are sub-types of CA1 or a transition zone."
    }
  ]
}

Flow

1. Preflight (T1 — fail-fast if any fail)
   ├─ Load plan.json from --input; assert schema_version + typed run integrity
   ├─ Locate adata at plan.json.input_path (or --adata override); check exists
   ├─ Load consensus_labels.tsv; assert observation column ⊆ adata.obs.index
   ├─ Resolve marker DB:
   │    --markers <path> if given;
   │    else bundled `data/markers/panglaodb_<tissue>.tsv` for --tissue;
   │    else exit 5 (MarkerDBUnavailable)
   └─ If LLM required and unreachable AND --no-llm not set → exit 6 (LLMUnavailable)

2. Per-cluster differential expression (deterministic, scanpy)
   └─ scanpy.tl.rank_genes_groups(adata, groupby=consensus_<operator>, method='wilcoxon')
       → de_per_cluster.csv with top-K markers per cluster (K=20 default)

3. Marker → cell-type lookup (deterministic, pre-LLM)
   └─ For each cluster, compute candidate cell types by ranking DB entries
       whose gene appears in the cluster's top-K markers (weighted by db.weight × 1/de_rank).

4. LLM grounded interpretation (γ + β; one call per cluster + one synthesis call)
   ├─ Prompt template embeds (per cluster):
   │    cluster_id, n_cells, top-K DE markers,
   │    DB candidate cell types (ranked),
   │    member_agreement summary, cross_method_nmi neighbors
   ├─ LLM must return JSON conforming to interpreted_assignments.json
   │   schema; mandatory evidence.markers[] with non-empty
   │   {gene, db_source, db_celltype}
   └─ After all clusters: one synthesis call to produce next_steps[]
       with mandatory evidence_refs[] (capped at top-3 by priority)

5. Invariant enforcement (T3 — fail-fast if violated)
   ├─ Every cluster.evidence.markers != []      → else exit 7
   ├─ Every next_steps[*].evidence_refs != []   → else exit 7
   └─ Banner present and matches one of two allowed values → else exit 7

6. Coverage check (T2 — escalate to T1 if floor breached)
   └─ interpretable_cluster_frac < --coverage-floor → exit 8

7. Artifact writes
   ├─ interpreted_report.md (banner enforced in format_interpreted_report)
   ├─ interpreted_assignments.json
   ├─ de_per_cluster.csv
   ├─ contradiction_regions.csv
   └─ audit.json

Failure modes (per ADR 0012)

Exit	Name	Meaning
0	success	All clusters interpreted (or `low_confidence`), invariants intact, no degradation triggered
2	argparse	CLI error
3	TypedRunInvalid	`plan.json` missing / malformed / not from a typed run
4	AdataMismatch	adata `obs` index disjoint from `consensus_labels.tsv` `observation`
5	MarkerDBUnavailable	`--tissue` not in bundled DBs and `--markers` not provided
6	LLMUnavailable	LLM endpoint unreachable and `--no-llm` not given
7	InvariantViolation	LLM violated marker-grounding or evidence-ref contract (T3)
8	CoverageBelowThreshold	< 50% of clusters interpretable (after T2 degradation)

Examples

Default usage (after a typed run completes)

oc run consensus-domains --input preprocessed.h5ad --output run1/ \
  --members banksy,graphst,leiden:resolution=0.5,leiden:resolution=1.0 \
  --non-interactive --operator kmode --seed 0

oc run consensus-interpret --input run1/ --output run1_interpreted/ \
  --tissue brain
# → run1_interpreted/interpreted_report.md begins with [A+I: ...]

CI / offline (structural-only)

oc run consensus-interpret --input run1/ --output run1_struct/ \
  --tissue brain --no-llm
# → run1_struct/interpreted_report.md begins with [I-noLLM: ...]
# → no cell-type claims, only cluster sizes / NMI summary / contradiction regions

User-provided marker DB (non-bundled tissue)

oc run consensus-interpret --input run1/ --output run1_interp/ \
  --markers ~/markers/mouse_intestine.tsv
# → bypasses --tissue requirement; uses user's custom DB

Related skills

Direction	Skill	Relationship
upstream (required)	`consensus-domains` / `sc-consensus-clustering`	produces the typed run this skill interprets
upstream (auto-chained, internal)	none — DE is computed inline with `scanpy.tl.rank_genes_groups`; we do not invoke `spatial-de` as a subprocess (see ADR 0012 rejected alternatives §4)
sibling (semantically distinct)	`orchestrator`	forward `query → skill`; this skill does backward `result → (skill, evidence)`
downstream (β suggests these)	`spatial-de`, `spatial-deconv`, `spatial-communication`, `spatial-trajectory`, etc.	next-step skills the LLM may recommend; each recommendation MUST cite specific typed-run evidence

References

ADR 0010 — typed-vs-narrative consensus runtime (boundary integrity)
ADR 0011 — typed consensus evaluation protocol (composite member score; DLPFC hero)
ADR 0012 — this skill's evaluation protocol (4-axis panel + T3 invariants)
docs/CONTEXT.md "Cross-reference: Consensus runtime" — canonical vocabulary

consensus-interpret

同仓库更多 Skills

同仓库更多 Skills

consensus-interpret

When to use

Inputs & Outputs

interpreted_assignments.json schema

Flow

Failure modes (per ADR 0012)

Examples

Default usage (after a typed run completes)

CI / offline (structural-only)

User-provided marker DB (non-bundled tissue)

Related skills

References

consensus-interpret

When to use

Inputs & Outputs

interpreted_assignments.json schema

Flow

Failure modes (per ADR 0012)

Examples

Default usage (after a typed run completes)

CI / offline (structural-only)

User-provided marker DB (non-bundled tissue)

Related skills

References

`interpreted_assignments.json` schema

`interpreted_assignments.json` schema