Exécutez n'importe quel Skill dans Manus
en un clic

Exécutez n'importe quel Skill dans Manus en un clic

$pwd:

h-reason

Name: H Reason
Author: m0n0x41d

// Think before building. Use when the user asks to reason about, analyze, evaluate, compare options, make an architecture decision, choose between approaches, think through a problem, or assess trade-offs. Also use when the user asks 'why did we...', 'should we...', 'what are our options', 'is this the right approach', or wants to frame/reframe a problem.

Exécuter dans Manus

$ git log --oneline --stat

stars:1 335

forks:105

updated:13 mai 2026 à 16:13

SKILL.md

readonly

package.json

"author": "m0n0x41d"

"repository": "m0n0x41d/haft"

Ouvrir le dépôt GitHub Voir les dépôts du créateur

$ install --global

$ download --local

Exécuter dans Manus

$ useful --forSOC

Développeurs de logicielsProfessions informatiques et mathématiques15-1252L4

Exécutez n'importe quel Skill en un clic

name	h-reason
description	Think before building. Use when the user asks to reason about, analyze, evaluate, compare options, make an architecture decision, choose between approaches, think through a problem, or assess trade-offs. Also use when the user asks 'why did we...', 'should we...', 'what are our options', 'is this the right approach', or wants to frame/reframe a problem.
argument-hint	[problem, decision, architecture question, trade-off, or 'what's stale?']

Haft Reasoning — Think Before You Build

Haft helps with 5 engineering jobs: Understand, Explore, Choose, Execute, Verify. Plus Note for quick micro-decisions. This skill activates structured engineering reasoning powered by these modes.

When to use: any non-trivial engineering question. Architecture choices, library selection, API design, data model changes, infrastructure decisions, process changes. Also: "think about", "reason about", "evaluate", or "compare" anything significant.

When NOT to use: obvious bug fixes, formatting, tiny refactors with clear acceptance.

Context-aware entry — how much the agent drives

Before doing anything, assess the user's intent. The 5 engineering modes (Understand/Explore/Choose/Execute/Verify) describe WHAT you're doing. The 4 interaction modes below describe HOW MUCH the agent drives. They're orthogonal.

Direct response / direct action

Trigger: "think about X", "what do you think about X", "analyze X", "is this the right approach?", "what are our options?", "save this as md", "make this a ranked list", "turn this into a checklist", "move this to .context", "summarize what you found"

Reason through the problem or do the direct artifact work with normal tools. Do not call Haft MCP tools unless the user explicitly asks to persist something. "Use FPF in your thinking" means reasoning discipline, not artifact workflow.

Research / prepare-and-wait

Trigger: "/h-reason [topic], prepare for framing", "let's think about X before deciding", "I want to reason through X"

Gather context (read code, search existing decisions, research). Present findings. Stop and wait for the user to decide the next step.

Delegated reasoning

Trigger: "/h-reason [topic], go ahead", "work through the options and bring me a recommendation", natural-language delegation like "do it" / "go ahead"

Drive frame → explore → compare in one pass. Do not stop after frame. Do not require a manual /h-explore or /h-compare step. Stop after compare, show the Pareto front, and ask the human to choose. The Transformer Mandate applies at the Choose → Execute boundary: the agent may frame/explore/compare when delegated, but the human still chooses before /h-decide.

Autonomous execution

Trigger: "/h-reason [topic] and implement" ONLY when autonomous mode is already enabled for the session (Ctrl+Q / interaction=autonomous).

Full cycle including decide + implement without pauses. If autonomous mode is OFF, phrases like "figure out the best approach and do it" or "fix everything" are NOT enough to skip the compare → decide pause. Treat them as delegated reasoning.

If unclear: default to research / prepare-and-wait. Never default to autonomous execution.

What you have

Haft tools (MCP) — persist reasoning as artifacts

Tool	What it does	Slash command
`haft_note`	Record micro-decisions with rationale validation	`/h-note`
`haft_problem`	Frame problems and persist characterization dimensions on the ProblemCard	`/h-frame`, `/h-char`
`haft_solution`	Explore variants, compare and identify Pareto front	`/h-explore`, `/h-compare`
`haft_decision`	Decide with formal rationale; record measurement results	`/h-decide`
`haft_commission`	Create/list/claim WorkCommissions for execution harnesses	`/h-commission`
`haft_refresh`	Detect stale decisions, manage lifecycle	`/h-verify`
`haft_query`	Search, status dashboard, file-to-decision lookup, FPF spec lookup, deterministic audience projections	`/h-search`, `/h-status`, `/h-view`

FPF spec lookup — prefer MCP when available

haft_query(action="fpf", query="A.6")
haft_query(action="fpf", query="How do I route boundary statements?", limit=3, explain=true)
haft_query(action="fpf", query="Boundary Norm Square", full=true)

In shell-only environments: haft fpf search "<query>" with optional --full.

Feature maturity

Computed: Pareto front (non-dominated set from comparison data), Refresh (valid_until expiry detection). Tracked (persisted but not computed): problem framing, characterization dimensions, WLNK label on variants, stepping-stone flag, CL on evidence, measurement records. Textual (stored as rules, not enforced): parity plan — you ensure it yourself.

Key rule: don't describe textual features as if they compute something. "WLNK bounds quality" means the user identified what bounds quality, not that the system calculated it. These skill instructions are L1 — detection and questions; the tools (L2) persist and enforce. Don't treat a prompt-based check as verified evidence.

The 5 engineering modes

Understand — "What's actually going on?"

Frame before solving. Problem quality dominates solution speed in outcome.

Framing protocol — ask these questions before recording:

"What observation doesn't fit?" — the signal, not the assumed cause. "Webhook retries hit 15%" not "we need a new queue."
"What have you already tried?" — avoids re-treading dead ends.
"Who owns this problem?" — a specific person with authority, not "the team."
"What would solved look like?" — measurable acceptance, not "it should be better."
"What constraints are non-negotiable?" — hard limits no variant can violate.
"How reversible is this? What's the blast radius?" — determines depth (tactical vs standard vs deep).
"What should we watch but NOT optimize?" — Anti-Goodhart indicators.

Problem typing: Before exploring, classify:

Optimization — working system, want it better on a known dimension
Diagnosis — something's broken, don't know why
Search — need to find something that doesn't exist yet
Synthesis — need to combine existing elements into something new

Each type suggests different exploration strategies and ceremony levels.

Language precision triggers:

If the signal uses ambiguous terms (service, process, function, quality, component), unpack to precise meaning before recording. "Which service — the OAuth provider, the token endpoint, or the session store?"
If the user says "it should do X," clarify: hard constraint, preference, or observation?
If two stakeholders use the same word differently, resolve before framing.

Characterization: Define the characteristic space before evaluating options.

State the selection policy BEFORE seeing results
Ensure parity — same inputs, same scope, same budget across all options
Keep it multi-dimensional — never collapse to a single score unless the fold is explicit

Persist with: haft_problem(action="frame"), haft_problem(action="characterize") Commands: /h-frame, /h-char Goldilocks check: When multiple problems are active, use haft_problem(action="select") to pick the one in the growth zone.

Explore — "What are the real options?"

>=2 variants that differ in kind, not degree (3+ preferred)
Each variant gets a weakest link label — what bounds its quality
Mark stepping stones — options that open future possibilities even if not optimal now

Diversity self-check: Before submitting variants, verify they aren't disguised copies of the same approach. If all variants are cache-based, at least one should explore a genuinely different direction (e.g., restructure data flow to eliminate caching need).

Hypothesis discipline: When investigating during exploration, separate what you observe from what you hypothesize. State hypotheses explicitly. "I observe high latency on the /auth endpoint. Hypothesis: the bottleneck is the token validation call, not the database query."

Past solutions: Use haft_solution(action="similar") to search for relevant precedents from past or cross-project work.

Persist with: haft_solution(action="explore") Command: /h-explore

Choose — "Which is better — and should I even decide now?"

Probe-or-commit gate: Before jumping to comparison, assess readiness:

Are all key dimensions covered with data for all variants?
Do the variants span genuinely different approaches?
Is there a specific investigation that could change the ranking?

If 1+2 yes and 3 no → commit to comparison. If 3 yes → probe — name the specific next investigation, what comparison defect it repairs, and estimated cost. If 2 no → widen — explore more variants first. If the burden shifted (this isn't really a choice problem) → reroute to Understand.

Fair comparison:

Identify the non-dominated set (Pareto front) — variants not strictly worse on all dimensions
Apply the pre-declared selection policy
Constraint elimination: Constraints are hard limits. A variant that violates a constraint dimension is eliminated before Pareto computation, not merely scored lower.
Record what was compared, what won, and why

Language precision in comparison: If comparison dimensions use subjective terms (maintainable, simple, scalable), ask for measurable specifics before scoring. "Maintainability" could mean: fewer dependencies? Lower cyclomatic complexity? Team already knows the stack? These point to different winners.

Reroute upstream: If comparison reveals that the framing was ambiguous, that the problem type was wrong, or that variants address different problems → reroute to Understand. Don't force a choice on a broken comparison basis.

Transformer Mandate: The human chooses at the Choose → Execute boundary. Agent may frame/explore/compare when delegated, but the human confirms the selection before recording a decision. Exception: autonomous mode.

Persist with: haft_solution(action="compare") Command: /h-compare

Execute — "Let's do it — carefully."

The decision record should contain:

Invariants — what MUST hold at all times
Pre-conditions — what must be true before implementation begins
Post-conditions — checklist for implementation completion
Admissibility — what is NOT acceptable
Evidence requirements — what proof the verification loop must gather
Predictions — falsifiable claims with claim, observable, and threshold
Refresh triggers — concrete signals that should reopen the decision
Valid-until date — when to re-evaluate automatically
Weakest link — what most plausibly breaks this choice

Async evidence: When a claim can't be verified immediately (e.g., "error rate drops 30% after 1 week of production"), add a verify_after date to the prediction. This surfaces automatically in Verify mode when the date passes.

Baseline: After implementation, call haft_decision(action="baseline") to snapshot affected file hashes for drift detection.

Projections for handoff: Use projections when reasoning already exists and you need a boundary-crossing view:

/h-view brief for delegated-agent implementation handoff
/h-view rationale for PR/change rationale
/h-view audit for evidence/assurance review
/h-view compare for the current Pareto/trade-off surface

Projections render the same artifact graph for a different audience — no new semantics.

Persist with: haft_decision(action="decide"), haft_decision(action="baseline"), haft_commission(...) Commands: /h-decide, /h-view, /h-commission

Verify — "Did it work? Is it still valid?"

Post-implementation verification:

Baseline first — if the decision has affected_files, call haft_decision(action="baseline")
Verify inductively — run tests, read affected files, or ask the user to confirm
Attach evidence — haft_decision(action="evidence") with type/verdict/CL
Record measurement — haft_decision(action="measure") with findings and verdict

Calling measure from memory without verification is a violation. Calling measure without baseline degrades evidence to CL1 self-evidence. Evidence without measure doesn't close the loop.

Ongoing verification:

Claim status: Which predictions were verified, which aren't?
Drift detection: Files changed since baseline → classify as cosmetic / incidental / material
Staleness scan: Expired valid_until, decayed evidence
Pending verifications: Claims with verify_after dates that have passed — surface proactively

Entity preservation: When summarizing verification results, preserve entity count and identity. Don't merge "5 claims" into "several verified items."

Lifecycle management:

haft_refresh(action="waive") — extend validity with evidence
haft_refresh(action="reopen") — start new problem cycle from a decision
haft_refresh(action="supersede") — replace one artifact with another
haft_refresh(action="deprecate") — archive as no longer relevant

When reopening a stale decision, the new ProblemCard inherits lineage.

Persist with: haft_decision(action="measure/evidence"), haft_refresh(...) Commands: /h-verify, /h-status

The loop and legitimate reroutes

Understand → Explore → Choose → Execute → Verify
    ↑            ↑        ↑        ↑        │
    └────────────┴────────┴────────┴────────┘

Choose → Understand — comparison reveals bad framing or ambiguity
Explore → Understand — exploration reveals wrong problem type
Execute → Choose — implementation reveals chosen option doesn't work
Verify → any — verification shows problem was misframed, option failed, or comparison basis invalidated

Reroutes are not failures. They prevent compounding errors downstream. The simple forward arrow is the common case; reroutes are the important case.

Fast path: Note

Not everything needs the 5-mode ceremony. Quick micro-decisions go to haft_note: captures what was decided and why, stays in the artifact graph for future recall and conflict detection.

Depth calibration

Depth	When	What changes
note	Micro-decisions during coding	`/h-note` — done
tactical	Reversible, <2 weeks blast radius	Compact Understand + Explore + Choose, standard Execute
standard	Most architectural decisions	Full Understand with characterization, 3+ variants in Explore, full Choose with Pareto
deep	Irreversible, security, cross-team	All standard + rich parity rules, runbook, refresh triggers, evidence requirements

Default is tactical. Escalate when: hard to reverse, multiple teams affected, or problem framing is unclear. Depth changes how much evidence and structure you show, not whether the modes exist.

Key distinctions (always maintain)

Object ≠ Description ≠ Carrier — the system, its spec, and its code are three things
Plan ≠ Reality — a model is not the thing it models
Target system ≠ Enabling system — what must work vs who builds/maintains it
Design-time ≠ Run-time — stored reasoning artifacts ≠ verified system behavior

Proactive agent behavior

Auto-capture mode (always on)

Record notes automatically when you observe decisions in conversation. Don't ask — just call haft_note. Validation rejects bad notes.

Triggers: "I'm going with X", "let's use Y instead of Z", config choice, library pick, approach selection. Do NOT auto-capture: formatting choices, import ordering, variable naming.

Proactive checks

At session start: call haft_query(action="status") to surface stale decisions and active problems
When code changed after a decision: read git diff, classify as cosmetic / incidental / material
If status returns zero artifacts on a project with code: suggest /h-onboard
When dev works on files: call haft_query(action="related", file="path") to find linked decisions
When dev says "let's just do X" without rationale: ask "why X?" before recording
When auto-captured note conflicts with active decision: surface the conflict

User steering

Slash commands are course corrections, not mandatory triggers:

In research / prepare-and-wait mode, the user triggers /h-frame, /h-explore, etc. when ready.
In delegated reasoning mode, natural-language delegation continues through modes without extra commands.
Direct operational requests stay direct. Don't escalate them into /h-frame just because they touch engineering content.

NavStrip interpretation

The ── Haft ── strip appended to tool responses shows current state and available actions.

"Available:" = menu for the user, not instructions for the agent.
Mode determines ceremony depth, not whether Choose may bypass Explore.
In research + wait mode, do not auto-advance. In delegated reasoning, you may advance through modes without extra commands.

FPF pattern retrieval

FPF pattern hints are auto-injected into reasoning tool responses — you see them for free when you call haft_problem, haft_solution, haft_decision. Each hint lists the pattern IDs relevant to the current phase with short labels.

When a hint names a pattern you want to apply, retrieve the full text with haft_query(action="fpf", query="<PATTERN-ID>") (e.g. FRAME-01, CHR-02). Cite specific patterns when applying them. Do not pre-fetch the whole phase before every reasoning step — that is ceremony creep.

FPF Micro-Patterns (always-available baseline)

These compressed patterns are your floor — always in context. For full detail, use haft_query(action="fpf").

Frame

FRAME-01 Signal typing: Make the trigger explicit — anomaly, opportunity, probe, or cue. Typed signals prevent drift.
FRAME-02 Scope boundary: Declare what's in-scope AND out-of-scope. Prevents silent inflation.
FRAME-03 Acceptance criteria: What observable condition signals solved? Bridge framing to evidence.
FRAME-05 Problem typing: Classify: optimization / diagnosis / search / synthesis. Each needs different exploration.
FRAME-07 Goldilocks: Select problems in the zone of proximal development — beyond current capability but reachable.
FRAME-08 Reading checklist: Before acting on ANY incoming artifact, run 6 questions: object of talk / context / statement type / lexicon vs term / re-expression vs reinterpretation / result purpose.
FRAME-09 Strict distinction quad: Role ≠ Capability ≠ Method ≠ Work. Assigned ≠ can do ≠ should do ≠ did.

Characterize

CHR-01 Indicator roles: Every dimension = constraint (hard limit), target (optimize 1-3), observation (watch, Anti-Goodhart).
CHR-02 Pipeline: normalize > indicatorize > score > compare > select. Never average disparate scales.
CHR-04 Assurance tuple: F (formality) + G (scope) + R (reliability) + CL (congruence). Snapshot of trust.
CHR-08 L1/L2/L3 ambiguity: L1=flag vague terms. L2=persist disambiguations. L3=check entity preservation.
CHR-09 Parity plan: Equal budgets, time windows, eval protocol, data freshness across variants.
CHR-10 Boundary norm square (L/A/D/E): Decompose mixed boundary statements into Law (definition) / Admissibility (gate) / Deontics (duty) / Evidence (carrier). Split before acting.
CHR-11 Relational precision pipeline: umbrella-word → ground by relations → math lens → restored ontology → precise lexicon. Lexical fix alone isn't enough.
CHR-12 Umbrella specializations: Quality / action-invitation / service / sameness / wholeness / relation-slot / basedness — each family has its own repair.

Explore

EXP-01 Abduction: Frame prompt > generate rivals > apply filters > select prime hypothesis. Keep rivals visible.
EXP-04 WLNK per variant: "What breaks first?" Name the Achilles' heel. Focus evidence on testing it.
EXP-05 Stepping stones: Non-optimal but opens future search space. Allocate 1-2 bets.
EXP-07 Portfolio thinking: Pareto front = set of tradeoffs, not one winner. NQD guides selection.
EXP-08 NQD: Is this genuinely new? If yes, existing templates may mislead.

Compare

CMP-01 Parity: Same budget, assumptions, scope for all variants. Rigged comparison = no comparison.
CMP-02 Selection policy up front: Declare rule BEFORE scoring. Separates judgment from evaluation.
CMP-03 Pareto front: Identify non-dominated set. Dominated variants eliminated with rationale.
CMP-06 CL across options: CL3=exact context, CL2=similar, CL1=related, CL0=opposed. Low CL = lower trust.

Decide

DEC-01 Record structure: Problem frame + Decision + Rationale + Consequences. Traceable.
DEC-04 Invariants: Load-bearing constraints: before, during, after. Violation = rollback.
DEC-05 Rollback: Triggers + Steps + Blast radius + Timeline. Honest about reversibility.
DEC-06 Predictions: "If X, we see Y within Z under K." Falsifiable. Become measurement targets. Predictions are causal claims — check realizability: can you actually sample the target distribution under physical / ethical / operational constraints? See C.27 / C.28 for causal-use calculus and CounterfactualSamplingRealizabilityProfile.
DEC-08 Counterargument: Strongest genuine attack on the chosen option. Self-deception check.

Verify

VER-01 Evidence graph: Every claim anchored to evidence. Typed links + CL. No floating claims.
VER-02 Decay: valid_until on all evidence. Expired = 0.1 (weak, not absent). Epistemic debt.
VER-03 R_eff: min(Rs) - penalty(CL_min). Never average. Weakest link dominates.
VER-07 Refresh triggers: Evidence expiry, context change, WLNK failure, competing alternative.

Cross-cutting

X-WLNK: System reliability <= min(component reliabilities). Never average. Invest in weakest.
X-SCOPE: Every claim has explicit where + under what + when. "Always fast" = scope inflation.
X-TRANSFORMER: External agent decides. System doesn't self-improve. Human is the principal.
X-STATEMENT-TYPE: Every load-bearing sentence = rule / promise / explanation / gate / evidence. Mixed = L1 error; decompose.
X-FANOUT-AUDIT: On concept rename, sweep all carriers: prose + filenames + manifests + review bundles + provenance + tests + schemas. Fixed-point until clean.
X-SOURCE-RESTORATION (A.15.4): Before relying on a dashboard, generated explanation, credential view, projection output, or any agent-produced artifact as evidence — recover the project source that makes the reliance admissible. The carrier may look ready for action before the underlying source is verifiable. Object ≠ Description ≠ Carrier is detection; source restoration is the operational rule.

name	h-reason
description	Think before building. Use when the user asks to reason about, analyze, evaluate, compare options, make an architecture decision, choose between approaches, think through a problem, or assess trade-offs. Also use when the user asks 'why did we...', 'should we...', 'what are our options', 'is this the right approach', or wants to frame/reframe a problem.
argument-hint	[problem, decision, architecture question, trade-off, or 'what's stale?']