Ejecuta cualquier Skill en Manus
con un clic

Ejecuta cualquier Skill en Manus con un clic

$pwd:

agentic-flow-best-practices

Name: Agentic Flow Best Practices
Author: bgauryy

// Use when the user asks to design, review, implement, debug, or evaluate an agentic workflow or AI-agent harness: MCP tools/resources/prompts, multi-agent routing or handoffs, agent memory/context/cache, Zod/JSON-schema protocols, human gates, observability, evals, or production safety. Do not use for ordinary app code, generic prompt writing, or model comparison unless an agentic flow boundary is involved.

Ejecutar en Manus

$ git log --oneline --stat

stars:854

forks:73

updated:17 de mayo de 2026, 00:27

Explorador de archivos

4 archivos

SKILL.md

readonly

related-skills.json

mismo repositorio

octocode-research.md

from "bgauryy/octocode"

Use when the user asks to "research code", "how does X work", "where is Y defined", "who calls Z", "trace code flow", "find usages", "explore this library", "understand the codebase", or needs deep code exploration with HTTP-based tool orchestration. For direct MCP tool research without the HTTP server, use octocode-researcher instead.

2026-05-18854

octocode-search-skill.md

from "bgauryy/octocode"

Use this skill when the user asks to find, evaluate, preview, install, rate, review, score, improve, refactor, or synthesize Agent Skills (the `SKILL.md` folder format) across GitHub, local skill folders, and skill marketplaces. Covers searching for a skill for a task, deep-diving a candidate, installing one or more skills into one or more agents at user or project scope, rating or reviewing an existing SKILL.md, refactoring a skill, or creating a new local skill from researched patterns. Do NOT activate for general package search (npm, PyPI, cargo), web search, or code research not involving SKILL.md files.

2026-05-18854

octocode-engineer.md

from "bgauryy/octocode"

System-engineering skill for codebase understanding, bug investigation, refactors, PR safety, architecture review, and RFC validation. Enforces Clean Architecture and Clean Code with AST, LSP, and scanner evidence. Produces a flows / boundaries / architecture-health artifact with file:line citations before recommending action.

2026-05-17854

octocode-stats.md

from "bgauryy/octocode"

Render an Octocode MCP usage dashboard from `${OCTOCODE_HOME}/stats.json` or `~/.octocode/stats.json`. Use when the user asks to show Octocode stats, usage, tokens/chars saved, cache hits, errors, rate limits, or visualize `stats.json`.

2026-05-17854

octocode-slides.md

from "bgauryy/octocode"

Generates polished multi-file HTML presentations. Six-phase flow: brief → research → outline → design → implementation → review. Each slide is a standalone HTML file loaded via iframe. Use when asked to 'create slides', 'make a presentation', 'generate HTML slides', 'build a deck', or turn notes/docs/code into a polished presentation.

2026-05-11854

octocode-chrome-devtools.md

from "bgauryy/octocode"

Use Chrome DevTools Protocol (CDP) for browser debugging, inspection, and automation when DevTools-grade evidence is needed: network, console, performance, DOM/CSS, screenshots/PDF, security, storage, auth-gated, live-page, and source-traced findings. Opens or attaches to Chrome, runs sandboxed CDP scripts, and keeps sessions reusable. Prefer lighter browser tools for simply opening a page.

2026-05-07854

package.json

"author": "bgauryy"

"repository": "bgauryy/octocode"

Abrir repositorio de GitHub Ver repositorios del creador

$ install --global

$ download --local

Ejecutar en Manus

$ useful --forSOC

Desarrolladores de softwareOcupaciones informáticas y matemáticas15-1252L4

name

agentic-flow-best-practices

description

Use when the user asks to design, review, implement, debug, or evaluate an agentic workflow or AI-agent harness: MCP tools/resources/prompts, multi-agent routing or handoffs, agent memory/context/cache, Zod/JSON-schema protocols, human gates, observability, evals, or production safety. Do not use for ordinary app code, generic prompt writing, or model comparison unless an agentic flow boundary is involved.

Agentic Flow Best Practices

This skill is a compact thinking framework for agentic systems. It should help an agent choose the simplest reliable design, reason about the harness, and avoid wasting context.

Reference Routing

Load references when they help the current decision:

references/resources.md: research launchpad with official docs, GitHub/source repos, and search hints for MCP, agent frameworks, context, memory, caching, and long-context risks.
references/pr-review-agent-example.md: compact build-ready workflow example.
references/agent-collaboration-patterns.md: compact guide to flows between agents, roles, handoffs, agents-as-tools, routing, review, and shared workspaces.

When framework/protocol behavior matters, verify from official docs or source. Prefer project-local conventions first; use Octocode MCP tools when available. Carry evidence only when it changes the design.

Operating Loop

Use this loop for substantial agentic-flow work:

Understand the user-visible outcome, non-goals, risk, and deployment surface.
Map what each model/agent sees and what each tool, memory, cache, or side effect can change.
Choose the smallest pattern that satisfies the outcome; name the rejected heavier option when useful.
Define protocols at boundaries: inputs, outputs, tools, handoffs, memory writes, gates, traces, and errors.
Design state, context, memory, cache, permissions, and ownership before adding agents.
Add gates, evals, observability, and rollout controls proportional to risk.
Output a compact design or review with assumptions, unknowns, and next verification steps.

Stop when the chosen pattern, contracts, gates, and verification plan are clear enough for the user's next decision. Ask one focused question only when a missing fact changes architecture, permissions, or side effects.

Agentic Flow Brain

For any agentic flow, focus on the dimensions that affect the decision:

Dimension	Question
Outcome	What user-visible job is done, and what is out of scope?
User experience	What should the user approve, see, interrupt, resume, or recover from?
Autonomy	Prompt, workflow, graph, tool-using agent, or multi-agent?
Runtime	Who owns the loop: code, protocol client, framework runtime, graph engine, or custom harness?
Tools/MCP	What tools/resources/prompts exist, who may call them, and what needs approval?
Identity	Which user, tenant, agent, service account, or policy grants each action?
Visible surface	What does each agent actually see: messages, instructions, MCP prompts/resources, tools, schemas, memory, retrieval, and examples?
Protocols	Which Zod schemas govern input, node output, tool result, handoff, memory, cache, gate, trace, and errors?
Context	What is static, retrieved, summarized, offloaded, isolated in subagents, or passed by artifact reference?
Memory	What is session state, checkpoint, agent-private, shared, user/project/org memory, artifact, or cache?
Data	What data can cross tenants, tools, models, logs, caches, memories, and artifacts?
Models	Which model class, reasoning effort, output budget, temperature, and fallback fit each node?
Safety	Where can the flow leak data, trust poisoned context, take irreversible action, or write bad memory?
Reliability	What retries, idempotency keys, timeouts, cancellation, and concurrency controls exist?
Observability	Can traces show route, schema version, model config, tool calls, cache decisions, errors, tokens, cost?
Evals	What golden, adversarial, schema, memory, and tool-failure tests prove behavior?
Rollout	How will versions, migrations, flags, canaries, and rollback work?

Think first, then choose the smallest design that satisfies the outcome. The answer should feel like tradeoff reasoning, not a completed form.

Pattern Choice

Start simple; add autonomy only when it solves a real constraint.

Direct answer or augmented LLM: low-risk response, one model call, retrieval, tools, or memory.
Deterministic pipeline or prompt chain: fixed parsing, transforms, validation, idempotency, and gated subtasks.
Router/model router: known categories need distinct prompts, tools, or models.
Parallel, map-reduce, or orchestrator-workers: independent facets, large inputs, speed, or runtime-discovered subtasks.
Evaluator-optimizer or reviewer: clear criteria and independent quality/safety review help.
Graph/state machine or autonomous loop: durable execution, checkpoints, resume, human gates, or plan -> act -> observe -> update -> stop.
Agents as tools, handoff, supervisor, or blackboard: separation needs distinct context, tools, permissions, ownership, or shared artifacts.
Background consolidation: memory, evals, or summaries can run after the user-facing response.

Framework fit is contextual, not a ranking: plain code for deterministic/small workflows; MCP for exposing tools/resources/prompts, not orchestration; SDKs or ADK-style runtimes when lifecycle, sessions, tools, tracing, and structured outputs are modeled; graph runtimes for durable state, persistence, human-in-loop, and long-running agents. Use names like OpenAI Agents SDK, Google ADK, LangChain, and LangGraph as examples to research, not required dependencies.

Coverage Checks

Check these areas when they affect the decision:

Product: user intent, interruption/resume, preview UX, explanation, and escalation path.
Access: auth, tenant isolation, tool allowlists, service accounts, and audit trails.
Data: PII/secrets handling, retention, redaction, training/logging policy, and cross-model sharing.
Operations: deployment target, queueing, concurrency, cancellation, idempotency, rollback, and cost/latency budgets.
Governance: human gates, policy checks, memory approval, provenance, and delete/export behavior.
Evolution: prompt/schema/tool versioning, cache invalidation, migrations, eval drift, and compatibility.

Nodes And State

Describe the flow as nodes, not transcript history. For important nodes, capture:

purpose: why this node exists.
inputSchema / outputSchema: Zod schemas or artifact contracts.
tools: allowed tools and why.
state: fields read/written and owner.
cache: key, scope, freshness, invalidation, or none.
memory: search/write/ignore plus consent and retention.
failure: retry, ask, fallback, stop, or gate.
sideEffects: files, APIs, messages, deploys, payments, memory writes.
runControl: timeout, cancellation, resume, idempotency, and concurrency behavior.

Common nodes: classify, retrieve, compute, reason, delegate, review, act, observe.

Zod Protocols

Agentic systems communicate through protocols. Agent-to-agent, node-to-node, agent-to-tool, cache, memory, gate, trace, and side-effect boundaries should have runtime-validated schemas. In TypeScript, use Zod as the source of truth; in other runtimes, use an equivalent schema library and keep JSON Schema interoperability.

Define the contracts that exist:

FlowInput, NodeInput, NodeOutput
AgentTask, AgentResult, AgentHandoff
ToolCall, ToolResult, ErrorEnvelope
MemoryQuery, MemoryCandidate, MemoryWrite
CacheEntry, HumanGate, TraceEvent
PolicyDecision, RunControl, ArtifactRef

Zod rules:

Define Zod first; infer TypeScript types from schemas.
safeParse runtime boundaries before trusting data.
Generate JSON Schema from Zod/equivalent schemas for model/tool/MCP structured-output APIs.
Version schemas with prompts, tools, cache, memory, and model changes.
Use discriminated unions for routes, outcomes, and errors.
Represent ambiguity explicitly: unknown, not_found, not_allowed, redacted.
Keep schemas small and composable; avoid giant shared state blobs.
Attach provenance to decision facts: source path/URL/tool id/trace id/timestamp.

Minimal packet shape: AgentTask should include goal, inputs, knownFacts, constraints, allowedTools, expectedOutput, and stopConditions. AgentResult should include status, output, evidence, unknowns, and memoryCandidates.

Context Engineering

Attention is finite. Large context windows raise the ceiling but do not guarantee the model uses all tokens well; relevant facts buried in the middle can be missed. More context can lower signal-to-noise, duplicate facts, increase cost/latency, and create conflicting instructions.

Pass packets, not full history:

goal, inputs, knownFacts, decisions, openQuestions, constraints,
artifacts, expectedOutput

Before expanding context:

Reuse validated cache.
Retrieve the smallest relevant chunk.
Summarize with sources and omissions.
Ask if the missing fact changes architecture.
Pass broader context only when narrower context fails.

Token efficiency rules:

Dedupe, rank, chunk, summarize, and retrieve on demand.
Keep one source of truth per fact; pass the freshest cited version plus references.
Prefer artifact paths, URLs, cache keys, trace ids over copied content.
Keep stable instructions/examples before dynamic input for prompt caching; do not rewrite cacheable prefixes casually.
Strip secrets, stale facts, speculation, and irrelevant history.
Use context isolation: subagents inspect heavy artifacts and return compact Zod-valid results.
Reserve output budget for visible answer and hidden reasoning tokens.

Agent-Visible Surface

Review what each agent/model call actually receives, not only what the architecture diagram intends:

Messages: system, developer, user, framework wrappers, skill text, node prompt.
MCP surface: server instructions, tool names/descriptions, input schemas, output schemas, prompts, resources, and error text.
Context surface: retrieved chunks, memory snippets, cache summaries, examples, prior decisions, artifacts, and hidden runtime facts exposed to the LLM.
Action surface: allowed tools, approval gates, side effects, retries, and fallback instructions.

Check for duplicated or conflicting instructions across prompts, skills, MCP prompts, tool descriptions, schemas, memory, and retrieved docs. Keep one source of truth; place stable invariant instructions in the harness or stable prefix, and pass changing facts through compact task packets.

Memory Scopes

Separate memory, state, cache, and artifacts.

Runtime and LLM-visible context: clients/auth/request ids stay outside the model unless needed; task packets and retrieved chunks stay small and fresh.
Working state, session state, and checkpoints: keep run/thread/resume data separate from durable facts.
Agent-private, shared-flow, and user/project/org memory: define owner, scope, consent, retention, and conflict behavior.
External knowledge, artifacts, and cache: cite retrieval, version artifacts, and key caches by tenant/scope/freshness without secrets.

Memory rules:

Prefer working/session state until a fact proves it should persist.
Long-term memory needs scope, owner, source, createdAt, retention, delete behavior, confidence.
Read shared memory as evidence unless it is a trusted policy source.
Prefer one writer per memory scope; workers propose MemoryCandidates.
Use append-only or conflict-aware shared memory when multiple agents write.
Test should-remember and should-not-remember cases.

MCP, Tools, And Skills

MCP exposes capabilities to AI apps:

tools: actions such as search, file/API/db calls, comments, deploys.
resources: context data such as files, schemas, tickets, logs.
prompts: reusable templates or interaction starters.

Design tools like APIs: narrow names, clear docs, Zod/JSON schemas, example usage, error shapes, auth/retention notes, tool allowlists, and approval gates for destructive/costly/external actions.

Use skills for reusable procedural knowledge. Keep SKILL.md concise; move optional depth to references/scripts.

Multi-Agent Flow

Use multiple agents when separation helps: expertise, tools, permissions, context windows, latency, review, or user-facing ownership. If specialists do not need separate tools/context, they may be prompt sections. For collaboration techniques, read references/agent-collaboration-patterns.md.

Handoff packet:

goal, inputs, knownFacts, constraints, allowedTools,
expectedOutput, stopConditions

Common issues:

Issue	Fix
vague ownership	one owner per state field/artifact
full-history handoff	handoff packet + artifact refs
supervisor bottleneck	workers decide locally inside bounds
tool bleed	per-agent tool allowlist
infinite delegation	max turns, budgets, terminal states
prose return	Zod-valid `AgentResult`
self-approval	separate reviewer/test/human gate
memory conflict	one memory writer or approval gate

Prompting And Models

Node prompts should define role, node goal, inputs, tool permissions, decision criteria, output schema, failure behavior, and evidence rules. Leave thinking room for ambiguous architecture, memory scope, tool choice, or tradeoffs; use firm instructions at schema/tool/memory/cache/side-effect boundaries.

Model/config checks:

Smaller model: classification, extraction, formatting.
Stronger reasoning model: planning, coding, tradeoffs, high-risk decisions.
Long-context model: retrieval-heavy tasks, still curated.
max_output_tokens: includes visible output and, for reasoning models, hidden reasoning.
reasoning effort: more deliberation for harder decisions, more cost/latency.
temperature: higher for ideation, lower for structured consistency.
Trace model, prompt/schema version, tool version, cache decision, token use, latency, cost.

Cache, Gates, Verification

Cache small validated results keyed by normalized input, prompt/model/tool/data versions, scope, and freshness. Cache safety invariant: no secrets, raw private data without policy, tenant-crossing data, or hidden unkeyed state. Memory and cache solve different problems.

Prompt-cache safety:

Provider prompt caching depends on stable message/prefix/tool-schema bytes; changing system/developer messages, tool descriptions, schemas, MCP prompts, or prefix order can invalidate reuse or make cache assumptions unsafe.
Keep cacheable instructions and examples stable, then append volatile user input, retrieval, memory, and runtime facts later.
Include prompt version, model/config, tool description/schema version, MCP server/tool version, data source version, and tenant/scope in app cache keys.
Do not cache LLM outputs when hidden inputs, retrieved context, or tool results that shaped the answer are missing from the key.

Consider gates before destructive/costly/external side effects, durable memory writes, cross-tenant cache risk, new auth/billing/deploy ownership, conflicting evidence, or vague prompts that trigger action.

Verification by risk:

Sketch: chosen architecture, rejected alternative, biggest risk.
Implementation: Zod tests, cache hit/miss/stale, golden path, tool permissions, gates.
Production: evals, adversarial cases, traces, memory privacy, idempotency, retry, cost/latency budgets.

Output Scheme

Use this as a flexible schema, not a required template. Select fields that help the user understand and build the flow.

Include the useful fields: Goal, Decision, Runtime, Roles, Protocols, Context, Memory, Tools/MCP, Lifecycle, Models, Flow, Run control, Rollout, and Verification.

Recovery

If requirements are broad, first return a sketch with assumptions and one decisive question.
If framework behavior is uncertain, mark it unknown and verify against official docs/source before locking architecture.
If tool, auth, tenant, or memory policy is unclear, gate side effects and design for least privilege.
If context is too large, switch to retrieval, map-reduce, or subagent summaries with cited artifact refs.
If eval criteria are missing, propose golden, adversarial, and tool-failure cases before implementation.

Anti-Patterns

Multi-agent before the core loop is proven.
Full history where a compact packet works.
Session state, memory, cache, and artifacts mixed together.
Agent communication through prose instead of Zod protocols.
Broad MCP/tools without auth, retention, validation, and gates.
More context instead of better retrieval/dedupe/summarization.
Larger model instead of fixing context, tools, schemas, or evals.
A reason node deciding and executing risky action.

agentic-flow-best-practices

Más de este repositorio

Agentic Flow Best Practices

Reference Routing

Operating Loop

Agentic Flow Brain

Pattern Choice

Coverage Checks

Nodes And State

Zod Protocols

Context Engineering

Agent-Visible Surface

Memory Scopes

MCP, Tools, And Skills

Multi-Agent Flow

Prompting And Models

Cache, Gates, Verification

Output Scheme

Recovery

Anti-Patterns

Agentic Flow Best Practices

Reference Routing

Operating Loop

Agentic Flow Brain

Pattern Choice

Coverage Checks

Nodes And State

Zod Protocols

Context Engineering

Agent-Visible Surface

Memory Scopes

MCP, Tools, And Skills

Multi-Agent Flow

Prompting And Models

Cache, Gates, Verification

Output Scheme

Recovery

Anti-Patterns

Más de este repositorio