mcp-integration
Which MCP tool to call and in what order — gate ritual, recording discipline, action tools for all development agents.
| name | mcp-integration |
| description | Which MCP tool to call and in what order — gate ritual, recording discipline, action tools for all development agents. |
| type | mandatory |
| domain | development |
| owners | ["architect","conductor","devops","product_manager","reviewer","senior_full_stack","tester","ux_ui_designer"] |
| gates | ["PM","UX","ARCH","DEV","REV","OPS","TEST","RG"] |
| tech | ["mcp"] |
| topic | ["general"] |
| triggers | ["record_decision","request_decision","classify_gate","start_task","advance_gate","submit_artifact","list_skills","get_skill","load_role","MCP","code-ai MCP","gate flow"] |
| related | ["karpathy-guidelines","code-review-checklist","adr-log"] |
| budget_lines | 250 |
| schema_version | 1 |
| license | MIT |
Rules for working with the code-ai MCP server for all development agents. Which tool to call when, in what order, what to record, what mistakes to avoid.
If the MCP server is not registered in .mcp.json, fall back to file reads with a report_exception note. Do not stay silent.
Triggers: any task in the development domain that passes through gates (PM/UX/ARCH/DEV/REV/OPS/TEST/RG). In practice, that is almost any task.
Output: the tool was invoked correctly, its result was used in the current phase, and recorded via record_decision or submit_artifact when needed.
A local MCP server (stdio transport) with 26 tools. State lives in .code-ai/state/ (JsonlStore — append-only, files decisions.jsonl, exceptions.jsonl, artifacts/<task>.jsonl). MempalaceStore is an optional mirror, not the source of truth.
The server is registered by the installer (.mcp.json is written automatically on --target=claude). If the installer did not run — records can be made manually in .code-ai/state/, but the MCP tools are preferred: they validate input via zod.
list_skills — get the full list of domain skills with frontmatter. Call at task start to understand what you have. The parser is permissive — it skips skills with broken frontmatter (see run_drift_audit if something was expected but did not appear).
get_skill — get a SKILL.md by name. Use instead of manual file reading — get_skill returns frontmatter already parsed and the markdown body separately.
load_role — get an agent profile (prompt + list of skills the agent may call). Call first when a task lands on your role.
regenerate_manifest — rebuild manifest.json after skill edits. Call when you added/removed/renamed a skill.
Order at task start: load_role → list_skills → pick the ones you need → get_skill for each.
This is the gate-passage ritual. Every gate must follow these steps in order.
start_task — create the run with an explicit mode (full/bugfix/hotfix) and its initial gate. The single creation path — call it first; current_gate no longer creates the task.
classify_gate — classify the task before starting work on a gate. Returns auto_resolve (green path, artifact still required for audit), fork (needs a user decision — request it via request_decision) or exception (an automated check failed — write a breakdown to exceptions).
request_decision — request a decision from the user. Creates an ADR-PENDING-* entry in decisions.jsonl. Must be called before record_decision if the decision requires user approval. Do not stay silent and do not choose on your own — that violates karpathy-guidelines §1.
current_gate — where the task currently is. Useful when context was lost or you switched between tasks.
advance_gate — push the task into the next gate. Only after all artifacts of the current gate are submitted and sign_off is in place.
sign_off — sign the current gate. Sign_off without a prior submit_artifact is an anti-pattern (see §8). Result first, signature second. Who signs depends on the gate's sign_off_policy: user gates (PM/UX/ARCH/RG) require signer="user" — STOP and request the user's sign-off, one gate at a time, no batching, and do not start the next gate until the current one is signed; execution gates (DEV/REV/OPS/TEST, policy=either) may be signed as signer="mcp" once DoD is met. The machine rejects an mcp sign-off of a user gate.
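The sign-off rule can be read as a small pre-check. The sketch below is an illustrative reading of that rule, not the server's code: the `checkSignOff` function, the `SignOffRequest` shape and its field names are hypothetical, and only the gate names, the `user`/`either` policies and the `user`/`mcp` signers come from the text.

```typescript
// Hypothetical pre-check mirroring the sign_off rule described above.
// Gate names and signer values come from the skill text; everything else is assumed.
type Gate = "PM" | "UX" | "ARCH" | "DEV" | "REV" | "OPS" | "TEST" | "RG";
type Signer = "user" | "mcp";
type SignOffPolicy = "user" | "either";

const SIGN_OFF_POLICY: Record<Gate, SignOffPolicy> = {
  PM: "user", UX: "user", ARCH: "user", RG: "user",          // user gates
  DEV: "either", REV: "either", OPS: "either", TEST: "either", // execution gates
};

interface SignOffRequest {
  gate: Gate;
  signer: Signer;
  artifactsSubmitted: number; // assumed: number of submit_artifact calls for this gate
  dodMet: boolean;            // assumed: every DoD claim for this gate is verified
}

type SignOffCheck = { ok: true } | { ok: false; reason: string };

function checkSignOff(req: SignOffRequest): SignOffCheck {
  // Result first, signature second.
  if (req.artifactsSubmitted === 0) {
    return { ok: false, reason: "sign_off without a prior submit_artifact" };
  }
  // User gates can only be signed by the user.
  if (SIGN_OFF_POLICY[req.gate] === "user" && req.signer !== "user") {
    return { ok: false, reason: req.gate + " is a user gate: stop and request the user's sign-off" };
  }
  // Execution gates may be signed by mcp only once DoD is met.
  if (req.signer === "mcp" && !req.dodMet) {
    return { ok: false, reason: "mcp may sign an execution gate only once DoD is met" };
  }
  return { ok: true };
}
```

Running such a check locally before calling sign_off would surface the rejection early; the server's own validation remains the authority.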
Canonical gate ritual: start_task (create the run with a mode, once) → current_gate (understand where you are) → classify_gate → if fork: request_decision → wait → record_decision → continue → submit_artifact → verify_claim (where applicable) → sign_off → advance_gate.
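As a rough illustration, the canonical ritual can be written as a planner that returns the ordered tool names for one gate. This is a sketch under assumptions: `planGateRitual` and its input shape are hypothetical, the `wait_for_user` marker is not a tool, and stopping after report_exception on an `exception` classification is one reading of the text rather than documented behaviour.

```typescript
// Hypothetical planner: returns the tool names in the order the ritual prescribes.
type Classification = "auto_resolve" | "fork" | "exception";

interface GatePlanInput {
  taskAlreadyStarted: boolean;    // start_task is called once per run
  classification: Classification; // what classify_gate would return
  automatedClaims: string[];      // DoD claim_type values that have an automated check
}

function planGateRitual(input: GatePlanInput): string[] {
  const steps: string[] = [];
  if (!input.taskAlreadyStarted) {
    steps.push("start_task");
  }
  steps.push("current_gate", "classify_gate");

  if (input.classification === "exception") {
    // Assumption: an exception halts the green path until it is resolved.
    steps.push("report_exception");
    return steps;
  }
  if (input.classification === "fork") {
    // A fork needs the user: request, wait, then record.
    steps.push("request_decision", "wait_for_user", "record_decision");
  }

  // auto_resolve still requires an artifact for audit.
  steps.push("submit_artifact");
  for (const claim of input.automatedClaims) {
    steps.push("verify_claim:" + claim);
  }
  steps.push("sign_off", "advance_gate");
  return steps;
}
```

For a fresh task classified as `fork` with two automated claims, the plan starts with start_task and ends with sign_off followed by advance_gate.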
Tools that do something in the project. Each answers one DoD question.
run_tests — a vitest/jest/pytest wrapper. Call when the DoD says "tests are green". Returns numPassedTests, numFailedTests, failureMessages. Used by verify_claim for claim_type=tests_pass.
check_lint — linter check. Returns clean: true/false plus a list of lint failures. DoD claim_type lint_clean.
build — tsc or equivalent. Parses TSxxxx errors. DoD claim_type build_succeeds.
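To make "parses TSxxxx errors" concrete, here is a minimal sketch of such a parser. It assumes the plain (non-pretty) `tsc` diagnostic format `file(line,col): error TSxxxx: message`; the `parseTscOutput` function and the `TscError` shape are hypothetical and not taken from the server.

```typescript
// Hypothetical parser for plain tsc diagnostics such as:
//   src/a.ts(12,5): error TS2322: Type 'string' is not assignable to type 'number'.
interface TscError {
  file: string;
  line: number;
  column: number;
  code: string;    // e.g. "TS2322"
  message: string;
}

const TSC_ERROR_LINE = /^(.+?)\((\d+),(\d+)\): error (TS\d+): (.*)$/;

function parseTscOutput(output: string): TscError[] {
  const errors: TscError[] = [];
  for (const rawLine of output.split(/\r?\n/)) {
    const match = TSC_ERROR_LINE.exec(rawLine.trim());
    if (!match) {
      continue; // summary lines and blank lines are ignored
    }
    errors.push({
      file: match[1],
      line: Number(match[2]),
      column: Number(match[3]),
      code: match[4],
      message: match[5],
    });
  }
  return errors;
}

// A build_succeeds verdict would then be "no parsed errors".
function buildSucceeded(output: string): boolean {
  return parseTscOutput(output).length === 0;
}
```

Diagnostics without a file position (for example, configuration errors) are not matched by this pattern and would need a second rule.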
apply_diff — apply a git diff from stdin. Cleaner than manual edits across many Edit/Write calls — especially for multi-file patches.
git_commit — commit (with paths or -a). Uses a tempfile for the message — heredoc with quotes is unreliable on Windows.
run_drift_audit — find divergence between skills on disk and AGENTS.yaml/manifest.json. The parser is permissive — it sees broken skills (unlike list_skills which skips them).
e2e_playwright — Playwright runner. Browsers are not downloaded by default (.npmrc skip flag) — install manually if needed.
docker_compose — wrapper over docker compose (up/down/ps/logs). Skipped if the Docker daemon is not running.
dependency_supply_chain — npm audit --json parser. Returns vulnerabilities with severity. DoD claim_type no_critical_vulns.
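A minimal sketch of how a `no_critical_vulns` verdict could be derived from `npm audit --json` output. It assumes the report shape used by npm 7 and later (a top-level `vulnerabilities` object keyed by package name, each entry carrying a `severity`); the `checkNoCriticalVulns` function and the verdict shape are hypothetical.

```typescript
// Hypothetical reader for `npm audit --json` output (npm 7+ report shape assumed).
interface AuditReport {
  vulnerabilities?: Record<string, { severity: string }>;
}

interface VulnVerdict {
  claim_type: "no_critical_vulns";
  passed: boolean;
  critical: string[]; // package names with severity "critical"
}

function checkNoCriticalVulns(auditJson: string): VulnVerdict {
  const report = JSON.parse(auditJson) as AuditReport;
  const vulns = report.vulnerabilities || {};
  const critical = Object.keys(vulns)
    .filter((pkg) => vulns[pkg].severity === "critical")
    .sort();
  return {
    claim_type: "no_critical_vulns",
    passed: critical.length === 0,
    critical: critical,
  };
}
```

Lower severities are deliberately ignored here, since the claim only concerns critical findings.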
verify_claim — meta-tool. Takes a claim_type (tests_pass / lint_clean / build_succeeds / no_critical_vulns / e2e_passes / docker_runs / custom), calls the right action tool, returns a structured verdict. custom is still a stub — for it, human verification (see DEV-103).
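The dispatch performed by verify_claim can be pictured as a lookup from claim_type to action tool. In the sketch below the first four pairings are stated in the tool descriptions; pairing `e2e_passes` with e2e_playwright and `docker_runs` with docker_compose is an assumption based on the names, and `routeClaim` itself is a hypothetical helper.

```typescript
// Hypothetical routing table for verify_claim.
type ClaimType =
  | "tests_pass" | "lint_clean" | "build_succeeds" | "no_critical_vulns"
  | "e2e_passes" | "docker_runs" | "custom";

const CLAIM_TO_TOOL: Record<Exclude<ClaimType, "custom">, string> = {
  tests_pass: "run_tests",
  lint_clean: "check_lint",
  build_succeeds: "build",
  no_critical_vulns: "dependency_supply_chain",
  e2e_passes: "e2e_playwright", // assumed pairing
  docker_runs: "docker_compose", // assumed pairing
};

type ClaimRoute =
  | { kind: "automated"; tool: string }
  | { kind: "human"; note: string };

function routeClaim(claim: ClaimType): ClaimRoute {
  if (claim === "custom") {
    // custom is still a stub, so it falls back to human verification.
    return { kind: "human", note: "custom claims need human verification" };
  }
  return { kind: "automated", tool: CLAIM_TO_TOOL[claim] };
}
```

Typing the table with `Exclude<ClaimType, "custom">` makes the compiler flag any automated claim type that is added without a tool.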
audit_budget_compliance — file budget compliance check: declared_budget > schema_max (catches schema rejection latent bugs — DEV-107 RoleFrontmatter case) and actual_lines > declared_budget. Across all agent.md + SKILL.md in the given domain (RU + EN). Call periodically or before substantial edits to agent.md / SKILL.md.
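The two budget conditions are simple comparisons, sketched below. The `auditBudget` function, the `BudgetedFile` shape and the finding names are hypothetical, and `schemaMax` is passed in because the text does not state its value.

```typescript
// Hypothetical re-statement of the two checks named above.
interface BudgetedFile {
  path: string;
  declaredBudget: number; // budget_lines from frontmatter
  actualLines: number;    // line count of the file on disk
}

interface BudgetFinding {
  path: string;
  kind: "declared_over_schema_max" | "actual_over_declared"; // assumed names
  detail: string;
}

function auditBudget(files: BudgetedFile[], schemaMax: number): BudgetFinding[] {
  const findings: BudgetFinding[] = [];
  for (const file of files) {
    // A declared budget above the schema limit would be rejected by schema validation.
    if (file.declaredBudget > schemaMax) {
      findings.push({
        path: file.path,
        kind: "declared_over_schema_max",
        detail: file.declaredBudget + " > " + schemaMax,
      });
    }
    // The file has outgrown the budget it declares.
    if (file.actualLines > file.declaredBudget) {
      findings.push({
        path: file.path,
        kind: "actual_over_declared",
        detail: file.actualLines + " > " + file.declaredBudget,
      });
    }
  }
  return findings;
}
```

A single file can trigger both findings, which is why the checks are independent rather than an if/else chain.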
audit_bilocale_parity — RU/EN locale parity check: pairs each agent.md / SKILL.md with its sibling in the other locale and reports declared_mismatch (differing budget_lines), actual_mismatch (differing line counts — the design-intake drift in DEV-114), and orphan (file present in one locale only). Read-only. Call before/after edits that touch a single locale.
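The three parity findings can be illustrated with a small comparison over two per-locale indexes. This is a sketch, not the tool's code: the `auditParity` function and the index shape (relative path mapped to declared and actual line counts) are hypothetical, while the finding names come from the description.

```typescript
// Hypothetical parity check over two locale indexes keyed by relative path.
interface LocaleFile {
  budgetLines: number; // declared budget_lines
  actualLines: number; // actual line count
}
type LocaleIndex = Record<string, LocaleFile>;

interface ParityFinding {
  file: string;
  kind: "declared_mismatch" | "actual_mismatch" | "orphan";
}

function hasOwn(index: LocaleIndex, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(index, key);
}

function auditParity(ru: LocaleIndex, en: LocaleIndex): ParityFinding[] {
  // Union of paths from both locales, sorted and de-duplicated.
  const sorted = Object.keys(ru).concat(Object.keys(en)).sort();
  const paths = sorted.filter((p, i) => i === 0 || sorted[i - 1] !== p);

  const findings: ParityFinding[] = [];
  for (const file of paths) {
    if (!hasOwn(ru, file) || !hasOwn(en, file)) {
      findings.push({ file: file, kind: "orphan" }); // present in one locale only
      continue;
    }
    if (ru[file].budgetLines !== en[file].budgetLines) {
      findings.push({ file: file, kind: "declared_mismatch" });
    }
    if (ru[file].actualLines !== en[file].actualLines) {
      findings.push({ file: file, kind: "actual_mismatch" });
    }
  }
  return findings;
}
```

Like the tool it imitates, the function only reads its inputs and reports; fixing the drift is a separate edit.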
aggregate_run_metrics — the Auditor's data foundation. Computes deterministic per-agent (gate→role via pipeline.yaml) and per-workflow (mode) statistics from the completed-run ledger (.code-ai/state/audit/runs.jsonl): first-try rate, reworks, rollbacks, circuit-breaker trips, exceptions, classification breakdown. min_runs (default 3) guards small samples. Read-only, numbers only — judgment belongs to the Auditor agent.
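A minimal sketch of one such statistic, the per-workflow first-try rate with the `min_runs` guard. The ledger record shape is hypothetical (the text does not describe the fields of runs.jsonl), and treating "first try" as zero reworks and zero rollbacks is an assumption made for illustration.

```typescript
// Hypothetical ledger record; the real runs.jsonl schema is not given in the text.
interface RunRecord {
  mode: "full" | "bugfix" | "hotfix";
  reworks: number;
  rollbacks: number;
}

interface ModeStats {
  runs: number;
  firstTry: number;
  firstTryRate: number | null; // null when the sample is below min_runs
}

function firstTryRateByMode(ledger: RunRecord[], minRuns: number = 3): Record<string, ModeStats> {
  const stats: Record<string, ModeStats> = {};
  for (const run of ledger) {
    if (!Object.prototype.hasOwnProperty.call(stats, run.mode)) {
      stats[run.mode] = { runs: 0, firstTry: 0, firstTryRate: null };
    }
    const entry = stats[run.mode];
    entry.runs += 1;
    // Assumed definition of "first try": no reworks and no rollbacks.
    if (run.reworks === 0 && run.rollbacks === 0) {
      entry.firstTry += 1;
    }
  }
  for (const mode of Object.keys(stats)) {
    const entry = stats[mode];
    // Guard small samples: report a rate only with enough completed runs.
    entry.firstTryRate = entry.runs >= minRuns ? entry.firstTry / entry.runs : null;
  }
  return stats;
}
```

Returning `null` instead of a rate for small samples keeps the output to numbers only and leaves the judgment about thin data to the Auditor agent.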
propose_change — record an Auditor proposal (a draft change to an agent/skill) as a pending entry in the local store (.code-ai/state/audit/proposals.jsonl). Carries change_kind (edit_minor/add_asset/destructive → risk tier), rationale, evidence, threshold_met, and the inline draft. Pure surfacing — touches no asset; inert until approved/applied (item 4b).
list_proposals — list Auditor proposals (newest first), filters status/risk/domain. Read-only.
review_proposal — authorize a proposal status transition (approve/reject a pending one; mark an approved one applied) plus a mandatory report. Applies the autonomy matrix + the .code-ai/config.json toggle: decided_by='auditor_auto' may approve only low/additive AND only when the gate is OFF; destructive (high) and gate-ON always require user. Auto-adding a new skill also runs an additive-dedup guard — on overlap with an existing skill it routes to user instead of auto-adding. Authorization only — the byte write into the asset is a separate submit_artifact/edit step (see next_step).
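The approval rules can be condensed into one decision function. This is a sketch of the autonomy matrix as described, not the server's implementation: `mayApprove`, the input field names and the mapping of `change_kind` to risk tier (edit_minor to low, add_asset to additive, destructive to high) are assumptions inferred from the wording.

```typescript
// Hypothetical condensation of the autonomy matrix for approving a pending proposal.
type ChangeKind = "edit_minor" | "add_asset" | "destructive";
type RiskTier = "low" | "additive" | "high";
type DecidedBy = "user" | "auditor_auto";

// Assumed mapping from change_kind to risk tier.
const RISK_OF: Record<ChangeKind, RiskTier> = {
  edit_minor: "low",
  add_asset: "additive",
  destructive: "high",
};

interface ApprovalInput {
  changeKind: ChangeKind;
  decidedBy: DecidedBy;
  gateOn: boolean;                // the .code-ai/config.json toggle (key name not given in the text)
  overlapsExistingSkill: boolean; // result of the additive-dedup guard
}

type ApprovalVerdict =
  | { allowed: true }
  | { allowed: false; routeTo: "user"; reason: string };

function mayApprove(input: ApprovalInput): ApprovalVerdict {
  if (input.decidedBy === "user") {
    return { allowed: true }; // the user may approve any tier
  }
  if (input.gateOn) {
    return { allowed: false, routeTo: "user", reason: "gate is ON" };
  }
  const risk = RISK_OF[input.changeKind];
  if (risk === "high") {
    return { allowed: false, routeTo: "user", reason: "destructive changes always require the user" };
  }
  if (risk === "additive" && input.overlapsExistingSkill) {
    return { allowed: false, routeTo: "user", reason: "new skill overlaps an existing one" };
  }
  return { allowed: true };
}
```

Approval here is authorization only; writing the change into the asset remains a separate step.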
render_diff — render a unified diff (e.g. git diff output) into a colored, per-file, line-numbered HTML review page in the system temp dir. Returns the path + a file:// URL to open in a browser. Call it at the REV gate to present code changes for review. Informational only — the file lives in temp (cleared on reboot), never in the project.
record_decision — write an ADR decision to decisions.jsonl. Only after request_decision if the decision needs the user's involvement. If the decision is mechanical (signer=mcp) — direct is fine, but set signer: "mcp" explicitly.
recent_decisions — last N decisions (filters by domain/signer/since). Use to understand the current task context.
audit_trail — the full audit trail of a task: all decisions + artifacts + exceptions in chronological order. Call before sign_off to make sure nothing was forgotten.
submit_artifact — submit an artifact of the current gate (e.g. spec.md, ADR draft, design doc). Without it you cannot sign_off.
list_artifacts / get_artifact — see what was already submitted in the task. Use to avoid duplicates.
report_exception — record an exception (gate-check failed). Do not use instead of an honest fix — exception means "I tried, didn't work, reason X, need a fork", not "workaround".
Storage lives in JsonlStore (append-only). No "overwrite" — only a new entry with invalidates: <prev_id> via kg_invalidate if a fact is outdated.
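The append-only rule can be illustrated with a small in-memory model. This is a sketch of the idea only: the `AppendOnlyLog` class, its entry shape and its id scheme are hypothetical and do not reproduce JsonlStore or kg_invalidate. It shows that an outdated fact is never rewritten; a new entry that points at it through `invalidates` supersedes it.

```typescript
// Hypothetical in-memory model of an append-only log with invalidation.
interface LogEntry {
  id: string;
  fact: string;
  invalidates?: string; // id of the entry this one supersedes
}

class AppendOnlyLog {
  private entries: LogEntry[] = [];

  // Entries are only ever added; nothing is edited or removed.
  append(fact: string, invalidates?: string): LogEntry {
    if (invalidates !== undefined && !this.entries.some((e) => e.id === invalidates)) {
      throw new Error("cannot invalidate unknown entry " + invalidates);
    }
    const entry: LogEntry = { id: "e" + (this.entries.length + 1), fact: fact };
    if (invalidates !== undefined) {
      entry.invalidates = invalidates;
    }
    this.entries.push(entry);
    return entry;
  }

  // The current view hides entries that a later entry has invalidated.
  current(): LogEntry[] {
    const superseded: Record<string, boolean> = {};
    for (const e of this.entries) {
      if (e.invalidates !== undefined) {
        superseded[e.invalidates] = true;
      }
    }
    return this.entries.filter((e) => !superseded[e.id]);
  }

  // One JSON object per line, in insertion order.
  toJsonl(): string {
    return this.entries.map((e) => JSON.stringify(e) + "\n").join("");
  }
}
```

The full history stays in the serialized log, so an audit can still see the outdated fact and which entry replaced it.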
1. load_role (understand what I can do)
2. list_skills (what exists in the domain)
3. current_gate (where the task is)
4. get_skill <name> (for each relevant skill)
5. classify_gate (start gate ritual)
1. verify_claim (for each DoD claim_type with an automated check)
2. submit_artifact (artifacts of the gate)
3. audit_trail (last check — is everything in place)
4. sign_off (signature)
5. advance_gate (transition)
1. request_decision (creates ADR-PENDING-*)
2. (wait for user approval)
3. record_decision (finalizes the ADR with signer=user)
4. apply_diff (apply the changes as one patch if possible)
5. git_commit (commit with adr_id in the message)
- record_decision without request_decision for decisions that need user approval: breaks governance, breaks the audit trail.
- advance_gate without sign_off: the gate stays "unsigned", and the next agent cannot tell what is ready.
- apply_diff without git_commit: changes sit in the working tree, and the next session loses them or claims them as its own.
- Manual file reads instead of get_skill: you skip frontmatter validation and can run into a latent bug (see the DEV-103 list_skills lesson).
- report_exception instead of an honest fix: an exception is for "the user decides next", not for "I worked around it".

- submit_artifact
- record_decision (with request_decision where user approval is required)
- report_exception (if any)
- verify_claim called for every DoD claim with an automated check