build-dod
| name | build-dod |
| description | Use when converting a spec, requirements document, or goal statement into a Definition of Done with acceptance criteria and integration test scenarios |
A DoD converts a spec into pass/fail gates. Its power is in integration tests — scenarios that prove the deliverable works by exercising it the way a user would.
Tests aren't there to be passed. They're there to prove results. Verify the deliverable through integration scenarios that exercise it end-to-end, not through unit tests that verify internals.
When this skill is used inside an Attractor run, scratch outputs should be written under `.ai/runs/$KILROY_RUN_ID/...`. The root `.ai` directory is not implicitly ingested.
Use the graph in two passes:
Each AC is a single, testable assertion using observable language: "exists", "returns", "displays", "produces", "exits 0".
Group by concern (e.g. Build, Output, Behavior, Integration). Number hierarchically: AC-1.1, AC-1.2, AC-2.1.
ACs describe what must be true. They are proven by integration test scenarios, not by individual unit tests.
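The hierarchical numbering can be checked mechanically. The following is a minimal sketch, assuming IDs follow exactly the `AC-<group>.<item>` form shown in the examples; the function name `group_acs` is hypothetical and not part of this skill.

```python
import re
from collections import defaultdict

# Assumed ID shape, taken from the examples AC-1.1, AC-1.2, AC-2.1.
AC_ID = re.compile(r"^AC-(\d+)\.(\d+)$")


def group_acs(ac_ids):
    """Group AC IDs by their concern number, rejecting malformed IDs."""
    groups = defaultdict(list)
    for ac_id in ac_ids:
        match = AC_ID.match(ac_id)
        if match is None:
            raise ValueError(f"not a hierarchical AC ID: {ac_id!r}")
        groups[int(match.group(1))].append(int(match.group(2)))
    return dict(groups)


print(group_acs(["AC-1.1", "AC-1.2", "AC-2.1"]))  # {1: [1, 2], 2: [1]}
```

Grouping by the first number makes it easy to see at a glance whether a concern area has any criteria at all.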
Integration tests are the primary verification mechanism. Each scenario exercises the delivered artifact directly, proving multiple acceptance criteria simultaneously.
When a digraph exists, write scenarios around high-level intent coverage, not exhaustive graph traversal. Prefer:
Test the delivered artifact in its delivery form. At least one scenario must exercise the full delivery path:
If the deliverable is a browser app and no scenario loads it in a browser, the DoD is incomplete.
Validate every user-facing message. Help text, error messages, status displays, feedback strings, prompts, and warnings are promises to the user. Inventory all of them from the spec, then ensure each one is triggered and validated in at least one scenario:
This means all messages, not a sample. If the spec describes 20 distinct message surfaces, 20 must be tested.
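One way to enforce the "all messages, not a sample" rule is a set difference between the inventory and what the scenarios validate. This is a minimal sketch with hypothetical data; the `MSG-N` and `IT-N` ID shapes follow the template in this document, and the function name is an assumption.

```python
def uncovered_messages(inventory, scenario_messages):
    """Return inventory message IDs that no scenario triggers and validates.

    inventory: iterable of message IDs (for example "MSG-1").
    scenario_messages: dict mapping scenario ID -> set of message IDs it validates.
    """
    validated = set()
    for message_ids in scenario_messages.values():
        validated.update(message_ids)
    return sorted(set(inventory) - validated)


# Hypothetical inventory and coverage data, for illustration only.
inventory = ["MSG-1", "MSG-2", "MSG-3"]
scenario_messages = {"IT-1": {"MSG-1"}, "IT-2": {"MSG-1", "MSG-3"}}
print(uncovered_messages(inventory, scenario_messages))  # ['MSG-2']
```

A non-empty result means the DoD is incomplete until a scenario triggers and validates each listed message.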
When one artifact references another, verify both. A source file that references an output is evidence of intent; confirm the output itself is present and valid.
For each primary way the deliverable is used, write a scenario with:
Scenarios should cross multiple AC groups. A browser app scenario might cover loading, display, input, and state persistence in one flow.
Each scenario is self-contained — it sets up its own preconditions within the test rather than depending on externally pre-computed inputs or manual preparation.
Each scenario becomes a named automated test in the DoD, verified by its test command exiting 0.
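A self-contained scenario might look like the sketch below. The deliverable here is a stand-in (a one-line Python program that counts lines in a file), and the scenario name `run_it_1` is hypothetical; a real scenario would invoke the actual delivered artifact instead.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_it_1():
    """Hypothetical IT-1: return 0 if the stand-in deliverable behaves as expected."""
    with tempfile.TemporaryDirectory() as workdir:
        # The precondition is created inside the test, not prepared by hand.
        input_path = Path(workdir) / "input.txt"
        input_path.write_text("alpha\nbeta\n", encoding="utf-8")

        # Stand-in for the delivered artifact: prints the number of lines in a file.
        command = [
            sys.executable,
            "-c",
            "import sys; print(len(open(sys.argv[1]).read().splitlines()))",
            str(input_path),
        ]
        result = subprocess.run(command, capture_output=True, text=True)

        # Verify observable behavior: exit code and output, not internals.
        passed = result.returncode == 0 and result.stdout.strip() == "2"
        return 0 if passed else 1


print(run_it_1())
```

Wrapping the call in `sys.exit(run_it_1())` would make the script's own exit code the verification.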
For checks that require judgment, write a concrete semantic verification with:
Every DoD must define deterministic, reviewable test artifacts for each integration scenario.
- Evidence root: `.ai/runs/$KILROY_RUN_ID/test-evidence/latest/`
- Scenario folder pattern: `.ai/runs/$KILROY_RUN_ID/test-evidence/latest/IT-<id>/`
- Manifest: `.ai/runs/$KILROY_RUN_ID/test-evidence/latest/manifest.json`

Manifest entries must map each scenario ID to:

- status (pass/fail)

Example manifest shape:
```json
{
  "version": 1,
  "scenarios": [
    {
      "id": "IT-1",
      "status": "pass",
      "artifacts": [
        { "type": "log", "path": ".ai/runs/$KILROY_RUN_ID/test-evidence/latest/IT-1/test.log" }
      ],
      "notes": []
    }
  ]
}
```
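A manifest of this shape can be sanity-checked before review. The following is a minimal sketch, assuming the field names from the example above and that `status` is limited to `pass` or `fail`; the function name `manifest_problems` is hypothetical.

```python
import json

ALLOWED_STATUSES = {"pass", "fail"}  # assumed from the pass/fail wording


def manifest_problems(manifest):
    """Return a list of human-readable problems; an empty list means the shape looks fine."""
    problems = []
    if "version" not in manifest:
        problems.append("manifest: missing version")
    for index, scenario in enumerate(manifest.get("scenarios", [])):
        label = scenario.get("id", f"scenarios[{index}]")
        if "id" not in scenario:
            problems.append(f"{label}: missing id")
        if scenario.get("status") not in ALLOWED_STATUSES:
            problems.append(f"{label}: status must be pass or fail")
        artifacts = scenario.get("artifacts", [])
        if not artifacts:
            problems.append(f"{label}: no artifacts listed")
        for artifact in artifacts:
            if not artifact.get("type") or not artifact.get("path"):
                problems.append(f"{label}: artifact needs type and path")
    return problems


example = json.loads("""
{
  "version": 1,
  "scenarios": [
    {
      "id": "IT-1",
      "status": "pass",
      "artifacts": [
        {"type": "log", "path": ".ai/runs/$KILROY_RUN_ID/test-evidence/latest/IT-1/test.log"}
      ],
      "notes": []
    }
  ]
}
""")
print(manifest_problems(example))  # []
```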
Artifact requirements:
- Every `IT-*` scenario must produce at least one artifact.
- `surface=ui`: visually rendered user interface is exercised.
- `surface=non_ui`: no visually rendered user interface is exercised.
- `surface=mixed`: both visual UI and non-UI interfaces are exercised.
- UI scenarios (`surface=ui` or `surface=mixed`) must include screenshot artifacts (`.png` or `.jpg`) proving key states.
- Non-UI scenarios (`surface=non_ui`) must include text or structured evidence artifacts (for example logs, stdout captures, JSON reports).

Framework policy:
Before finalizing each scenario, confirm:
- `.ai/runs/$KILROY_RUN_ID/test-evidence/latest/IT-<id>/`

After writing all ACs and integration scenarios, review:
Per scenario:
Per AC:
5. Confirm at least one scenario proves this AC
6. If no scenario covers an AC, add coverage or justify the gap
Overall:
7. Confirm at least one scenario tests the deliverable in its delivery form
8. Confirm every user-facing message from the inventory is triggered and validated by at least one scenario
9. Confirm the scenarios collectively cover every AC group
10. Confirm `.ai/runs/$KILROY_RUN_ID/test-evidence/latest/manifest.json` covers every scenario ID with at least one artifact path
11. If a digraph exists, confirm no required intent path is missing or unreachable
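Parts of this review lend themselves to automation, in particular the AC coverage check and the manifest coverage check. The sketch below assumes the coverage mapping has already been extracted from the DoD tables into plain Python data; the function name `review_gaps` and the sample data are hypothetical.

```python
def review_gaps(ac_ids, scenario_coverage, manifest):
    """Return gaps found by the AC-coverage and manifest-coverage checks.

    ac_ids: iterable of AC IDs from the DoD.
    scenario_coverage: dict mapping scenario ID -> set of AC IDs it proves.
    manifest: parsed manifest.json content.
    """
    gaps = []

    # Every AC needs at least one scenario that proves it.
    proven = set().union(*scenario_coverage.values())
    for ac_id in ac_ids:
        if ac_id not in proven:
            gaps.append(f"{ac_id}: no scenario proves this AC")

    # Every scenario needs a manifest entry with at least one artifact.
    with_artifacts = {
        entry["id"]
        for entry in manifest.get("scenarios", [])
        if entry.get("artifacts")
    }
    for scenario_id in scenario_coverage:
        if scenario_id not in with_artifacts:
            gaps.append(f"{scenario_id}: no manifest entry with an artifact path")

    return gaps


# Hypothetical review input, for illustration only.
ac_ids = ["AC-1.1", "AC-1.2", "AC-2.1"]
scenario_coverage = {"IT-1": {"AC-1.1", "AC-2.1"}, "IT-2": {"AC-1.1"}}
manifest = {
    "version": 1,
    "scenarios": [
        {"id": "IT-1", "status": "pass", "artifacts": [{"type": "log", "path": "IT-1/test.log"}]}
    ],
}
for gap in review_gaps(ac_ids, scenario_coverage, manifest):
    print(gap)
```

With this input the sketch reports `AC-1.2` as unproven and `IT-2` as missing from the manifest. The remaining checks, such as delivery-form coverage and digraph intent paths, still need human judgment.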
# [Project] — Definition of Done
## Scope
### In Scope
[What the deliverable covers]
### Out of Scope
[Explicit exclusions]
### Assumptions
[Prerequisites and environment]
## Deliverables
| Artifact | Location | Description |
|----------|----------|-------------|
| ... | ... | ... |
## Acceptance Criteria
### [Concern Area]
| ID | Criterion | Covered by |
|----|-----------|------------|
| AC-N.M | [Observable assertion] | IT-X, IT-Y |
## User-Facing Message Inventory
| ID | Message surface | Trigger condition | Covered by |
|----|----------------|-------------------|------------|
| MSG-N | [What the user sees] | [What causes it] | IT-X |
## Test Evidence Contract
| Item | Requirement |
|------|-------------|
| Evidence root | `.ai/runs/$KILROY_RUN_ID/test-evidence/latest/` |
| Scenario folder pattern | `.ai/runs/$KILROY_RUN_ID/test-evidence/latest/IT-<id>/` |
| Manifest | `.ai/runs/$KILROY_RUN_ID/test-evidence/latest/manifest.json` |
| UI scenarios (`surface=ui` or `surface=mixed`) | Include screenshot evidence proving key states |
| Non-UI scenarios (`surface=non_ui`) | Include text/structured evidence (log/stdout/json) |
| Failure behavior | Emit best-effort artifacts and manifest entry; record missing artifacts explicitly |
## Integration Test Scenarios
| ID | Scenario | Steps | Verification | Evidence Artifacts |
|----|----------|-------|--------------|--------------------|
| IT-N | [User journey name] | 1. [action] → [expected] 2. [action] → [expected] ... | `test command` exits 0 | `surface=<ui|non_ui|mixed>`; `[type:path, ...]` |