ワンクリックでManusで任意のスキルを実行

adjustment-proof

Use when proving, rejecting, or deleting backend adjustment registry entries with real LLM evidence and independent disabled-adjustment worktrees.

Manusで実行

スター27

フォーク4

更新日2026年6月2日 06:26

ソース

jleechanorg

jleechanorg/claude-commands

GitHub リポジトリを開く Creator のリポジトリを見る

インストールコマンド

ダウンロード

Manusで実行

役立つ用途SOC

ソフトウェア開発者コンピュータ・数学職15-1252L4

SKILL.md

readonly

name	adjustment-proof
description	Use when proving, rejecting, or deleting backend adjustment registry entries with real LLM evidence and independent disabled-adjustment worktrees.

Backend Adjustment Proof Protocol

Purpose

Use this skill before accepting any claim that a backend adjustment is proven necessary or safe to delete. A subagent report, PR prose, or one successful organic run is not proof. Proof requires an isolated red/green comparator for each adjustment.

This skill extends:

.claude/skills/zfc-adjuster/SKILL.md
.claude/skills/evidence-standards.md
.claude/skills/zfc-leveling-roadmap/SKILL.md for level-up/rewards changes

Verdicts

Allowed verdicts:

proven: Current-head green evidence passes, and a separate worktree with exactly one adjustment disabled fails for the matching behavior.
delete_candidate: Current-head green evidence passes, and a separate worktree with exactly one adjustment disabled also passes. This is not enough to delete by itself; it means the adjustment lacks positive necessity proof.
insufficient: Required artifacts are missing, stale, non-real, not isolated, or do not demonstrate the claimed behavior.

Do not use delete from a single negative run. Deletion requires a separate design decision after reviewing broader coverage, risk, and whether the code is now redundant.

Non-Negotiable Proof Shape

For each adjustment ID:

Start from the live PR HEAD or the exact SHA being reviewed.
Create one green worktree with the adjustment enabled and unchanged.
Create one red worktree with only that adjustment disabled.
Run the same real-server, real-LLM evidence command in both worktrees.
Save a manifest with:
- adjustment ID
- base SHA
- green worktree path and evidence directory
- red worktree path and evidence directory
- the red worktree diff that disables the adjustment
- exact commands used
- raw evidence paths
Validate the manifest with scripts/adjustment_proof_matrix.py validate.

One disabled worktree per adjustment is mandatory. A bundle named for one adjustment cannot prove or delete another adjustment.

Required Artifacts

Each green and red evidence directory must contain:

run.json
metadata.json
real HTTP request/response capture, such as http_request_responses.jsonl
real LLM capture, such as llm_request_responses.jsonl
matching git SHA provenance for the worktree that produced the evidence
checksums when the harness provides them

For LLM-interacting level-up/rewards paths, unit tests are supporting evidence only. They cannot establish proven or delete_candidate.

How To Run A Proof

Use the script to list registered adjustments:

./scripts/adjustment_proof_matrix.py list

Create a deterministic proof plan for one adjustment:

./scripts/adjustment_proof_matrix.py plan \
  --adjustment-id level_up_atomicity.suppress_unpaired_rewards_box \
  --test-command 'MCP_TEST_MODE=real MOCK_SERVICES_MODE=false ./venv/bin/python testing_mcp/core/test_level_up_organic.py --level-up-scenario single-organic'

Follow the generated commands to create independent worktrees. In the red worktree, disable only the named adjustment. Prefer the narrowest possible edit: remove or bypass the exact correction/suppression branch that implements the registry entry. Do not change prompts, test harnesses, unrelated adjusters, or scenario inputs in the red worktree.

After both runs finish, collect and validate:

./scripts/adjustment_proof_matrix.py collect \
  --adjustment-id level_up_atomicity.suppress_unpaired_rewards_box \
  --green-worktree /tmp/your-project.com/adjustment-proof/<sha>/<slug>/green \
  --red-worktree /tmp/your-project.com/adjustment-proof/<sha>/<slug>/red-disabled \
  --green-evidence /tmp/your-project.com/<branch>/<green-run>/iteration_001 \
  --red-evidence /tmp/your-project.com/<branch>/<red-run>/iteration_001 \
  --test-command 'MCP_TEST_MODE=real MOCK_SERVICES_MODE=false ./venv/bin/python testing_mcp/core/test_level_up_organic.py --level-up-scenario single-organic'

./scripts/adjustment_proof_matrix.py validate \
  /tmp/your-project.com/adjustment-proof/<sha>/<slug>/proof_manifest.json

The validator emits one of the allowed verdicts. Use that verdict in the PR body or registry update; do not upgrade it manually.

Review Rules

Fail the proof as insufficient when:

the green or red bundle is missing run.json or metadata.json
the bundle has no real LLM capture
the evidence SHA does not match the worktree SHA and staleness tolerance was not explicitly justified
the red worktree diff is empty
the red worktree diff disables more than the named adjustment
the same red run is reused for multiple adjustments
a red run passes but the report calls the adjustment proven
a red run fails for harness setup, auth, server boot, missing classifier, or mock-mode contamination instead of the adjustment behavior

If red and green both pass, record delete_candidate and recommend either deleting in a separate cleanup PR with broader evidence or keeping the registry entry as runtime evidence missing until that cleanup decision is made.

PR Body Language

Use precise wording:

proven: "Disabling <adjustment_id> alone caused <failure> in red while the same command passed on green at <sha>."
delete_candidate: "Disabling <adjustment_id> alone did not reproduce a failure under this proof command. This does not prove deletion is safe across all paths."
insufficient: "Evidence did not isolate <adjustment_id> or lacked required real-LLM artifacts."

Never write "all blockers resolved" or "all adjustments proven" unless every registered adjustment has its own validated manifest.

このリポジトリの他の Skills

同じリポジトリ

ao-operator-discipline

jleechanorg/claude-commands

Use when spawning, steering, or auditing Agent Orchestrator workers, especially when the user specifies exact AO parameters such as codex, runtime, project, claim-pr, or PR targets.

2026-06-0227

ao-worker-dispatch

jleechanorg/claude-commands

Checklist for dispatching AO workers — python venv, commit discipline, branch drift, and post-push CodeRabbit verification

2026-06-0227

claw-dispatch

jleechanorg/claude-commands

Use when dispatching work through the Hermes gateway with /claw, especially when the task may resolve slash commands or hand off into AO worker orchestration.

2026-06-0227

code-standards

jleechanorg/claude-commands

Dispatch independent adversarial reviews for ZFC, ZFC leveling, and root-cause-first without duplicating their standards.

2026-06-0227

dark-factory

jleechanorg/claude-commands

Run the Dark Factory DOT pipeline runner against a goal. Slash command: /factory. Implements StrongDM's Attractor pattern as an external Python runner — .dot files are the versioned artifact, sealed holdouts live in a separate repo, every step is recorded to CXDB, and the Healer clusters failures into diagnoses. Use when you want the goal_harness idea executed as a reproducible external pipeline instead of in-Claude subagent dispatch.

2026-06-0227

disk-audit

jleechanorg/claude-commands

Run disk usage analysis and cleanup preview on the local Mac. Always validate snapshot coverage before quoting. Never delete without user approval.

2026-06-0227

name	adjustment-proof
description	Use when proving, rejecting, or deleting backend adjustment registry entries with real LLM evidence and independent disabled-adjustment worktrees.

Backend Adjustment Proof Protocol

Purpose

This skill extends:

.claude/skills/zfc-adjuster/SKILL.md
.claude/skills/evidence-standards.md
.claude/skills/zfc-leveling-roadmap/SKILL.md for level-up/rewards changes

Verdicts

Allowed verdicts:

proven: Current-head green evidence passes, and a separate worktree with exactly one adjustment disabled fails for the matching behavior.
delete_candidate: Current-head green evidence passes, and a separate worktree with exactly one adjustment disabled also passes. This is not enough to delete by itself; it means the adjustment lacks positive necessity proof.
insufficient: Required artifacts are missing, stale, non-real, not isolated, or do not demonstrate the claimed behavior.

Do not use delete from a single negative run. Deletion requires a separate design decision after reviewing broader coverage, risk, and whether the code is now redundant.

Non-Negotiable Proof Shape

For each adjustment ID:

Start from the live PR HEAD or the exact SHA being reviewed.
Create one green worktree with the adjustment enabled and unchanged.
Create one red worktree with only that adjustment disabled.
Run the same real-server, real-LLM evidence command in both worktrees.
Save a manifest with:
- adjustment ID
- base SHA
- green worktree path and evidence directory
- red worktree path and evidence directory
- the red worktree diff that disables the adjustment
- exact commands used
- raw evidence paths
Validate the manifest with scripts/adjustment_proof_matrix.py validate.

One disabled worktree per adjustment is mandatory. A bundle named for one adjustment cannot prove or delete another adjustment.

Required Artifacts

Each green and red evidence directory must contain:

run.json
metadata.json
real HTTP request/response capture, such as http_request_responses.jsonl
real LLM capture, such as llm_request_responses.jsonl
matching git SHA provenance for the worktree that produced the evidence
checksums when the harness provides them

For LLM-interacting level-up/rewards paths, unit tests are supporting evidence only. They cannot establish proven or delete_candidate.

How To Run A Proof

Use the script to list registered adjustments:

./scripts/adjustment_proof_matrix.py list

Create a deterministic proof plan for one adjustment:

./scripts/adjustment_proof_matrix.py plan \
  --adjustment-id level_up_atomicity.suppress_unpaired_rewards_box \
  --test-command 'MCP_TEST_MODE=real MOCK_SERVICES_MODE=false ./venv/bin/python testing_mcp/core/test_level_up_organic.py --level-up-scenario single-organic'

After both runs finish, collect and validate:

./scripts/adjustment_proof_matrix.py collect \
  --adjustment-id level_up_atomicity.suppress_unpaired_rewards_box \
  --green-worktree /tmp/your-project.com/adjustment-proof/<sha>/<slug>/green \
  --red-worktree /tmp/your-project.com/adjustment-proof/<sha>/<slug>/red-disabled \
  --green-evidence /tmp/your-project.com/<branch>/<green-run>/iteration_001 \
  --red-evidence /tmp/your-project.com/<branch>/<red-run>/iteration_001 \
  --test-command 'MCP_TEST_MODE=real MOCK_SERVICES_MODE=false ./venv/bin/python testing_mcp/core/test_level_up_organic.py --level-up-scenario single-organic'

./scripts/adjustment_proof_matrix.py validate \
  /tmp/your-project.com/adjustment-proof/<sha>/<slug>/proof_manifest.json

The validator emits one of the allowed verdicts. Use that verdict in the PR body or registry update; do not upgrade it manually.

Review Rules

Fail the proof as insufficient when:

the green or red bundle is missing run.json or metadata.json
the bundle has no real LLM capture
the evidence SHA does not match the worktree SHA and staleness tolerance was not explicitly justified
the red worktree diff is empty
the red worktree diff disables more than the named adjustment
the same red run is reused for multiple adjustments
a red run passes but the report calls the adjustment proven
a red run fails for harness setup, auth, server boot, missing classifier, or mock-mode contamination instead of the adjustment behavior

PR Body Language

Use precise wording:

proven: "Disabling <adjustment_id> alone caused <failure> in red while the same command passed on green at <sha>."
delete_candidate: "Disabling <adjustment_id> alone did not reproduce a failure under this proof command. This does not prove deletion is safe across all paths."
insufficient: "Evidence did not isolate <adjustment_id> or lacked required real-LLM artifacts."

Never write "all blockers resolved" or "all adjustments proven" unless every registered adjustment has its own validated manifest.