Design cost-efficient AI agent architectures. Use when optimizing token usage, selecting model tiers, budgeting compute costs, implementing caching strategies, or designing plan-and-execute patterns for cost reduction. Covers model tiering (frontier for planning, cheap for execution), token budgeting, response caching, the plan-and-execute cost reduction pattern (up to 90% savings), and cost monitoring. Based on emerging FinOps-for-AI trends, heterogeneous model architectures, and production cost optimization practices.
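As a rough illustration of the model-tiering idea above (frontier model for planning, cheap model for execution), the following sketch compares a tiered plan-and-execute run against running everything on the frontier tier. The tier names, per-token prices, and token counts are hypothetical placeholders, not real provider pricing, and the resulting savings figure depends entirely on those assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Tier:
    name: str
    usd_per_1k_tokens: float  # hypothetical price, not a real provider rate


# Assumed tiers for illustration only.
FRONTIER = Tier("frontier", 0.03)
CHEAP = Tier("cheap", 0.001)


def cost(tier: Tier, tokens: int) -> float:
    """Cost in USD of processing `tokens` tokens on `tier`."""
    return tier.usd_per_1k_tokens * tokens / 1000


def plan_and_execute_cost(plan_tokens: int, step_tokens: Sequence[int]) -> float:
    """Plan once on the frontier tier, then run every step on the cheap tier."""
    return cost(FRONTIER, plan_tokens) + sum(cost(CHEAP, t) for t in step_tokens)


def all_frontier_cost(plan_tokens: int, step_tokens: Sequence[int]) -> float:
    """Baseline: planning and every step run on the frontier tier."""
    return cost(FRONTIER, plan_tokens + sum(step_tokens))


def savings_fraction(plan_tokens: int, step_tokens: Sequence[int]) -> float:
    """Fraction of the baseline cost avoided by tiering (0.0 to 1.0)."""
    baseline = all_frontier_cost(plan_tokens, step_tokens)
    tiered = plan_and_execute_cost(plan_tokens, step_tokens)
    return 1 - tiered / baseline


if __name__ == "__main__":
    steps = [4000] * 5  # five execution steps of 4,000 tokens each (made up)
    print(f"tiered:   ${plan_and_execute_cost(2000, steps):.3f}")
    print(f"baseline: ${all_frontier_cost(2000, steps):.3f}")
    print(f"savings:  {savings_fraction(2000, steps):.1%}")
```

The savings grow with the share of tokens spent in execution rather than planning, which is why the pattern suits tasks with a short plan and many routine steps.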
Design and implement multi-agent systems with proven coordination patterns. Use when building agent teams, delegation architectures, inter-agent communication, lead-agent orchestration, or agent swarm coordination. Covers 5 orchestration topologies (hub-and-spoke, pipeline, broadcast, hierarchical, mesh), delegation protocols, state sharing across agent boundaries, conflict resolution, and the plan-and-execute pattern. Based on patterns from Kimi Agent Swarm, Devin, BabyAGI, MetaGPT, Google A2A, and DyLAN architectures.
Design safety architectures for AI agents: autonomy tiers, permission zones, command approval gates, secret handling, escalation paths, and observability. Use when building agents that execute code, modify files, access networks, handle credentials, or make consequential decisions. Covers three autonomy tiers (full-auto, supervised, human-led), container security models, tool safety classifications, and audit logging. Based on patterns from Kimi's 4-layer container model, Claude Code's approval workflows, Devin's data security, and Windsurf's safety protocols.
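One way the three autonomy tiers might combine with a tool safety classification is a small approval gate like the sketch below. The tier names come from the description above; the `Risk` classes and the specific policy (which tier approves which risk level) are assumptions made for illustration, not a documented rule from any of the named systems.

```python
from enum import Enum


class Autonomy(Enum):
    FULL_AUTO = "full-auto"
    SUPERVISED = "supervised"
    HUMAN_LED = "human-led"


class Risk(Enum):
    # Hypothetical tool safety classes, ordered from least to most dangerous.
    READ_ONLY = 1
    MUTATING = 2
    DESTRUCTIVE = 3


def needs_approval(tier: Autonomy, risk: Risk) -> bool:
    """Return True when a human must approve the tool call before it runs.

    Assumed policy:
      - human-led: every call is approved by a human
      - supervised: only read-only calls run without approval
      - full-auto: only destructive calls are gated
    """
    if tier is Autonomy.HUMAN_LED:
        return True
    if tier is Autonomy.SUPERVISED:
        return risk is not Risk.READ_ONLY
    return risk is Risk.DESTRUCTIVE


if __name__ == "__main__":
    for tier in Autonomy:
        gated = [r.name for r in Risk if needs_approval(tier, r)]
        print(f"{tier.value}: approval required for {gated}")
```

Keeping the policy in one pure function makes it straightforward to log each decision for auditing and to test the gate independently of the tools it protects.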
Design agent memory architectures and context window optimization strategies. Use when building persistent memory systems, context budgeting, dynamic context loading, knowledge retrieval, or managing token limits. Covers three-tier memory (episodic, semantic, procedural), context priority frameworks, just-in-time loading patterns, cache invalidation, and provider-agnostic context layers. Based on patterns from Kimi's skill injection, Cursor's scratchpad, BabyAGI's graph memory, and emerging context engineering practices.
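A context priority framework with a token budget can be sketched as a greedy packer: sort candidate items by priority and include each one that still fits. The `ContextItem` fields, the priority scale, and the token counts below are hypothetical, and a real system might summarize or truncate an oversized item instead of skipping it.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ContextItem:
    label: str
    tokens: int
    priority: int  # assumed convention: lower number = more important


def pack_context(items: Sequence[ContextItem], budget: int) -> List[ContextItem]:
    """Greedily select items in priority order without exceeding `budget` tokens.

    Items that do not fit are skipped, so a smaller lower-priority item can
    still be included after a larger one is rejected.
    """
    chosen: List[ContextItem] = []
    remaining = budget
    for item in sorted(items, key=lambda i: i.priority):
        if item.tokens <= remaining:
            chosen.append(item)
            remaining -= item.tokens
    return chosen


if __name__ == "__main__":
    candidates = [
        ContextItem("retrieved_docs", 800, 3),
        ContextItem("system_prompt", 500, 0),
        ContextItem("chat_history", 3000, 2),
        ContextItem("current_task", 1500, 1),
    ]
    for item in pack_context(candidates, budget=3000):
        print(item.label, item.tokens)
```

Greedy selection is simple and predictable but not optimal in general; it is used here only to make the budgeting idea concrete.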
Design production-grade tool specifications for AI agents. Use when defining tool interfaces, parameter schemas, safety flags, error handling, MCP compatibility, or tool composition rules. Covers three specification formats (XML, JSON Schema, Markdown), 6 quality indicators, safety classification, error recovery patterns, and MCP interoperability. Based on analysis of 40+ tool specs from Cursor, Replit, Devin, Kimi, Windsurf, and the Model Context Protocol standard.
Generate, audit, and optimize system prompts for AI agents using 8 proven architectural patterns extracted from 16+ production systems (including Kimi, Cursor, Devin, Kiro, Claude Code, v0, Windsurf, Lovable, Replit, Traycer, and Manus). Use when creating new agent system prompts, auditing existing prompts for quality and completeness, optimizing prompt architecture for specific use cases, or designing multi-agent workflows. Covers skill injection, persona replacement, state machine planning, structured scratchpad, todo tracking, XML response protocols, design system enforcement, and prompt structure blueprints.