一键导入
agentic-harness-patterns
// Harness patterns for coding agents — memory, permissions, context engineering, delegation, skills, hooks, bootstrap.
// Harness patterns for coding agents — memory, permissions, context engineering, delegation, skills, hooks, bootstrap.
| name | agentic-harness-patterns |
| description | Harness patterns for coding agents — memory, permissions, context engineering, delegation, skills, hooks, bootstrap. |
| when_to_use | Triggers on: harness engineering, tool safety, permission pipeline, agent memory, memory persistence, delegation pattern, context budget, bootstrap sequence, skill runtime, hook lifecycle, tool orchestration, agent harness, context engineering. |
| license | MIT |
Production AI coding agents are not just an LLM calling tools in a loop. The harness — memory, skills, safety, context control, delegation, and extensibility — is what separates a demo from a production system.
For: Engineers building or extending coding-agent runtimes, custom agents, or advanced multi-agent workflows. Not for: Prompt engineering, model selection, generic software architecture, or LLM API basics.
All principles are distilled from production runtime decisions. Claude Code is used as grounding evidence, not as the only possible implementation.
| If you want to... | Read |
|---|---|
| Make the agent remember and improve over time | Memory |
| Package reusable workflows and expertise | Skills |
| Let the agent use tools powerfully but not dangerously | Tools and Safety |
| Give the agent the right context at the right cost | Context Engineering |
| Split work across multiple agents without losing control | Multi-agent Coordination |
| Extend behavior with hooks, background tasks, or startup logic | Lifecycle and Extensibility |
Before you start building: Read the Gotchas — these are the non-obvious failure modes that cost the most time.
User problem: "My agent forgets corrections and project rules between sessions."
Golden rule: Separate what the agent knows (instruction memory) from what the agent learns (auto-memory) from what the agent extracts (session memory). Each layer has different persistence, trust, and review needs.
When to use: Any agent that operates across multiple sessions or needs to accumulate project-specific knowledge over time.
How it works:
Start here: Define your memory layers (instruction, auto, extraction). Implement the two-step save invariant (topic file, then index). Add background extraction only after the core write path is stable.
In Claude Code: Use
/rememberto audit and promote auto-memory entries across layers.
Tradeoffs:
Go deeper: references/memory-persistence-pattern.md
User problem: "I want my agent to reuse workflows and domain knowledge without re-explaining them every time."
Golden rule: Skills are lazy-loaded instruction sets, not eagerly injected prompts. Discovery must be cheap (metadata only); the full body loads only on activation.
When to use: Any agent that needs reusable, composable workflows activating on matching user intent.
How it works:
Start here: Choose a metadata format (frontmatter recommended). Implement two-phase discovery: cheap listing at startup, lazy body loading on invocation. Set a per-entry character cap before your catalog grows.
Tradeoffs:
Go deeper: references/skill-runtime-pattern.md
User problem: "I want my agent to use tools powerfully, but not dangerously."
Golden rule: Default to fail-closed. Tools are serial and gated unless explicitly marked safe for concurrency and approved by the permission pipeline.
When to use: Any agent runtime that needs tool registration, concurrency control, or permission gating.
How it works:
Start here: Route every tool call through one permission gate. Default to fail-closed (deny/ask). Add bypass-immune rules for protected paths before shipping any auto-approve mode.
In Claude Code: Use
/update-configto configure permission rules and hooks.
Tradeoffs:
Go deeper: references/tool-registry-pattern.md | references/permission-gate-pattern.md
User problem: "My agent either sees too much, too little, or the wrong thing."
Golden rule: Treat context as a budget, not a dump. Every token in the window should earn its place through one of four operations: select, write, compress, or isolate.
When to use: Any agent whose performance degrades in long sessions, whose delegated work pollutes the parent context, or whose startup is slow due to eager context loading.
How it works:
Start here: Audit your current context cost per turn. Apply hard caps to every variable-length block. Add truncation recovery pointers (tell the model which tool to call for full output) before enabling any compression.
Tradeoffs:
Go deeper: references/context-engineering-pattern.md (index) | select | compress | isolate
User problem: "I want parallelism, specialization, and coordination without chaos."
Golden rule: The coordinator must synthesize, not delegate understanding. "Based on your findings, fix it" is an anti-pattern — the coordinator should digest worker results into precise specs before dispatching implementation.
When to use: When a task is too large for a single agent, when you need parallel exploration, or when you want persistent specialized teammates.
How it works:
Three delegation patterns serve different task shapes:
| Pattern | Context sharing | Best for |
|---|---|---|
| Coordinator | None — workers start fresh | Complex multi-phase tasks (research → synthesize → implement → verify) |
| Fork | Full — child inherits parent history | Quick parallel splits sharing loaded context |
| Swarm | Peer-to-peer via shared task list | Long-running independent workstreams |
Key constraints:
Start here: Pick one delegation pattern and implement it fully before mixing patterns. Write every sub-agent prompt as a self-contained document. Add a synthesis step between research and implementation workers — this is where the orchestrator adds value.
Implementation checklist for the coordinator pattern:
Tradeoffs:
Go deeper: references/agent-orchestration-pattern.md
User problem: "I need hooks, background tasks, and a clean startup sequence."
Golden rule: Extensibility is an injection point, not an inheritance hierarchy. Hooks attach side effects at lifecycle moments; tasks track async work with strict state machines; bootstrap layers initialization sequentially with memoized stages.
When to use: When you need to extend agent behavior without modifying core code, track long-running background work, or structure initialization across multiple entry modes.
How it works:
Start here: Route all hooks through a single dispatch point. Implement the trust gate before adding any external hook type. Register cleanup handlers during init, not at usage sites.
In Claude Code: Use
/update-configto configure hooks (pre/post tool execution, prompt submission).
Tradeoffs:
Go deeper: references/hook-lifecycle-pattern.md | references/task-decomposition-pattern.md | references/bootstrap-sequence-pattern.md
Non-obvious principles that will cause bugs if you violate them:
Concurrency classification is per-call, not per-tool. A tool may be safe for some inputs and unsafe for others. Don't assume a tool's concurrency behavior is static — the runtime decides per invocation.
Permission evaluation has side effects. The permission checker tracks denials, transforms modes, and updates state. Don't treat it as a pure lookup function.
Most async work skips the "pending" state. In practice, work units register directly as "running." Don't build UIs that assume every work unit starts pending.
Fork children must not fork. The recursive guard preserves a single-level invariant. The fork tool stays in the child's tool pool (for prompt cache sharing) but is blocked at call time.
Context builders are memoized but manually invalidated. Add a context source without adding a corresponding invalidation point, and the model sees stale data for the entire session.
Memory indexes have hard caps. Entries beyond the cap are silently truncated. Without periodic cleanup, recent entries become invisible.
Skill listing budgets are tight. Descriptions are concatenated and capped per entry. Front-load the most distinctive trigger language — the tail gets cut.
Hook trust is all-or-nothing. If the workspace is untrusted, the entire hook system is disabled, not just individual suspicious hooks.
The default permission for tools is "allow." Tools that don't implement custom permission logic delegate entirely to the rule-based system. Override only when you need tool-specific gates (path ACLs, quotas, etc.).
Eviction requires notification. A terminal work unit is only GC-eligible after the parent has received the completion signal. Evicting before notification creates a race where the parent can never read the result.
This skill is about the harness around an agent, not:
If your question is about the model itself rather than the system around it, this skill does not apply.