Run any Skill in Manus with one click

$pwd:

mcp-builder

Name: Mcp Builder
Author: duc01226

// [AI & Tools] Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).

Run Skill in Manus

$ git log --oneline --stat

stars:6

forks:1

updated:May 6, 2026 at 08:19

File Explorer

10 files

SKILL.md

readonly

package.json

"author": "duc01226"

"repository": "duc01226/EasyPlatform"

View GitHub Repository

$ install --globalskills.sh

$ download --local

Run Skill in Manus

[HINT] Download the complete skill directory including SKILL.md and all related files

Run any Skill with one click

name	mcp-builder
description	[AI & Tools] Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).
disable-model-invocation	true

Codex compatibility note:

Invoke repository skills with $skill-name in Codex; this mirrored copy rewrites legacy Claude /skill-name references.

Prefer the plan-hard skill for planning guidance in this Codex mirror.

Task tracker mandate: BEFORE executing any workflow or skill step, create/update task tracking for all steps and keep it synchronized as progress changes.

User-question prompts mean to ask the user directly in Codex.

Ignore Claude-specific mode-switch instructions when they appear.

Strict execution contract: when a user explicitly invokes a skill, execute that skill protocol as written.

Subagent authorization: when a skill is user-invoked or AI-detected and its protocol requires subagents, that skill activation authorizes use of the required spawn_agent subagent(s) for that task.

Do not skip, reorder, or merge protocol steps unless the user explicitly approves the deviation first.

For workflow skills, execute each listed child-skill step explicitly and report step-by-step evidence.

If a required step/tool cannot run in this environment, stop and ask the user before adapting.

Codex Project-Reference Loading (No Hooks)

Codex does not receive Claude hook-based doc injection. When coding, planning, debugging, testing, or reviewing, open project docs explicitly using this routing.

Always read:

docs/project-config.json (project-specific paths, commands, modules, and workflow/test settings)
docs/project-reference/docs-index-reference.md (routes to the full docs/project-reference/* catalog)
docs/project-reference/lessons.md (always-on guardrails and anti-patterns)

Situation-based docs:

Backend/CQRS/API/domain/entity changes: backend-patterns-reference.md, domain-entities-reference.md, project-structure-reference.md
Frontend/UI/styling/design-system: frontend-patterns-reference.md, scss-styling-guide.md, design-system/README.md
Spec/test-case planning or TC mapping: feature-docs-reference.md
Integration test implementation/review: integration-test-reference.md
E2E test implementation/review: e2e-test-reference.md
Code review/audit work: code-review-rules.md plus domain docs above based on changed files

Do not read all docs blindly. Start from docs-index-reference.md, then open only relevant files for the task.

Quick Summary

Goal: Create high-quality MCP (Model Context Protocol) servers enabling LLMs to interact with external services through well-designed tools.

Workflow:

Deep Research — Study agent-centric design principles, MCP protocol, SDK docs, API docs exhaustively
Implementation Plan — Tool selection, shared utilities, input/output design, error handling
Implementation — Project structure, core infrastructure, tools with proper schemas/docs, language-specific best practices
Review & Refine — Code quality review (DRY, composability, consistency), test & build
Create Evaluations — 10 realistic, complex, read-only evaluation questions with verified answers

Key Rules:

Build for Workflows: Not just API wrappers, consolidate related operations
Optimize for Context: Return high-signal info, support concise/detailed formats
Actionable Errors: Guide agents toward correct usage with specific suggestions
Scripts First: Execute detection/fetch/analyze scripts for zero-token overhead

Be skeptical. Apply critical thinking, sequential thinking. Every claim needs traced proof, confidence percentages (Idea should be more than 80%).

MCP Server Development Guide

Overview

To create high-quality MCP (Model Context Protocol) servers that enable LLMs to effectively interact with external services, use this skill. An MCP server provides tools that allow LLMs to access external services and APIs. The quality of an MCP server is measured by how well it enables LLMs to accomplish real-world tasks using the tools provided.

Process

🚀 High-Level Workflow

Creating a high-quality MCP server involves four main phases:

Phase 1: Deep Research and Planning

1.1 Understand Agent-Centric Design Principles

Before diving into implementation, understand how to design tools for AI agents by reviewing these principles:

Build for Workflows, Not Just API Endpoints:

Don't simply wrap existing API endpoints - build thoughtful, high-impact workflow tools
Consolidate related operations (e.g., schedule_event that both checks availability and creates event)
Focus on tools that enable complete tasks, not just individual API calls
Consider what workflows agents actually need to accomplish

Optimize for Limited Context:

Agents have constrained context windows - make every token count
Return high-signal information, not exhaustive data dumps
Provide "concise" vs "detailed" response format options
Default to human-readable identifiers over technical codes (names over IDs)
Consider the agent's context budget as a scarce resource

Design Actionable Error Messages:

Error messages should guide agents toward correct usage patterns
Suggest specific next steps: "Try using filter='active_only' to reduce results"
Make errors educational, not just diagnostic
Help agents learn proper tool usage through clear feedback

Follow Natural Task Subdivisions:

Tool names should reflect how humans think about tasks
Group related tools with consistent prefixes for discoverability
Design tools around natural workflows, not just API structure

Use Evaluation-Driven Development:

Create realistic evaluation scenarios early
Let agent feedback drive tool improvements
Prototype quickly and iterate based on actual agent performance

1.3 Study MCP Protocol Documentation

Fetch the latest MCP protocol documentation:

Use WebFetch to load: https://modelcontextprotocol.io/llms-full.txt

This comprehensive document contains the complete MCP specification and guidelines.

1.4 Study Framework Documentation

Load and read the following reference files:

MCP Best Practices: 📋 View Best Practices - Core guidelines for all MCP servers

For Python implementations, also load:

Python SDK Documentation: Use WebFetch to load https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md
🐍 Python Implementation Guide - Python-specific best practices and examples

For Node/TypeScript implementations, also load:

TypeScript SDK Documentation: Use WebFetch to load https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md
⚡ TypeScript Implementation Guide - Node/TypeScript-specific best practices and examples

1.5 Exhaustively Study API Documentation

To integrate a service, read through ALL available API documentation:

Official API reference documentation
Authentication and authorization requirements
Rate limiting and pagination patterns
Error responses and status codes
Available endpoints and their parameters
Data models and schemas

To gather comprehensive information, use web search and the WebFetch tool as needed.

1.6 Create a Comprehensive Implementation Plan

Based on your research, create a detailed plan that includes:

Tool Selection:

List the most valuable endpoints/operations to implement
Prioritize tools that enable the most common and important use cases
Consider which tools work together to enable complex workflows

Shared Utilities and Helpers:

Identify common API request patterns
Plan pagination helpers
Design filtering and formatting utilities
Plan error handling strategies

Input/Output Design:

Define input validation models (Pydantic for Python, Zod for TypeScript)
Design consistent response formats (e.g., JSON or Markdown), and configurable levels of detail (e.g., Detailed or Concise)
Plan for large-scale usage (thousands of users/resources)
Implement character limits and truncation strategies (e.g., 25,000 tokens)

Error Handling Strategy:

Plan graceful failure modes
Design clear, actionable, LLM-friendly, natural language error messages which prompt further action
Consider rate limiting and timeout scenarios
Handle authentication and authorization errors

Phase 2: Implementation

Now that you have a comprehensive plan, begin implementation following language-specific best practices.

2.1 Set Up Project Structure

For Python:

Create a single .py file or organize into modules if complex (see 🐍 Python Guide)
Use the MCP Python SDK for tool registration
Define Pydantic models for input validation

For Node/TypeScript:

Create proper project structure (see ⚡ TypeScript Guide)
Set up package.json and tsconfig.json
Use MCP TypeScript SDK
Define Zod schemas for input validation

2.2 Implement Core Infrastructure First

To begin implementation, create shared utilities before implementing tools:

API request helper functions
Error handling utilities
Response formatting functions (JSON and Markdown)
Pagination helpers
Authentication/token management

2.3 Implement Tools Systematically

For each tool in the plan:

Define Input Schema:

Use Pydantic (Python) or Zod (TypeScript) for validation
Include proper constraints (min/max length, regex patterns, min/max values, ranges)
Provide clear, descriptive field descriptions
Include diverse examples in field descriptions

Write Comprehensive Docstrings/Descriptions:

One-line summary of what the tool does
Detailed explanation of purpose and functionality
Explicit parameter types with examples
Complete return type schema
Usage examples (when to use, when not to use)
Error handling documentation, which outlines how to proceed given specific errors

Implement Tool Logic:

Use shared utilities to avoid code duplication
Follow async/await patterns for all I/O
Implement proper error handling
Support multiple response formats (JSON and Markdown)
Respect pagination parameters
Check character limits and truncate appropriately

Add Tool Annotations:

readOnlyHint: true (for read-only operations)
destructiveHint: false (for non-destructive operations)
idempotentHint: true (if repeated calls have same effect)
openWorldHint: true (if interacting with external systems)

2.4 Follow Language-Specific Best Practices

At this point, load the appropriate language guide:

For Python: Load 🐍 Python Implementation Guide and ensure the following:

Using MCP Python SDK with proper tool registration
Pydantic v2 models with model_config
Type hints throughout
Async/await for all I/O operations
Proper imports organization
Module-level constants (CHARACTER_LIMIT, API_BASE_URL)

For Node/TypeScript: Load ⚡ TypeScript Implementation Guide and ensure the following:

Using server.registerTool properly
Zod schemas with .strict()
TypeScript strict mode enabled
No any types - use proper types
Explicit Promise return types
Build process configured (npm run build)

Phase 3: Review and Refine

After initial implementation:

3.1 Code Quality Review

To ensure quality, review the code for:

DRY Principle: No duplicated code between tools
Composability: Shared logic extracted into functions
Consistency: Similar operations return similar formats
Error Handling: All external calls have error handling
Type Safety: Full type coverage (Python type hints, TypeScript types)
Documentation: Every tool has comprehensive docstrings/descriptions

3.2 Test and Build

Important: MCP servers are long-running processes that wait for requests over stdio/stdin or sse/http. Running them directly in your main process (e.g., python server.py or node dist/index.js) will cause your process to hang indefinitely.

Safe ways to test the server:

Use the evaluation harness (see Phase 4) - recommended approach
Run the server in tmux to keep it outside your main process
Use a timeout when testing: timeout 5s python server.py

For Python:

Verify Python syntax: python -m py_compile your_server.py
Check imports work correctly by reviewing the file
To manually test: Run server in tmux, then test with evaluation harness in main process
Or use the evaluation harness directly (it manages the server for stdio transport)

For Node/TypeScript:

Run npm run build and ensure it completes without errors
Verify dist/index.js is created
To manually test: Run server in tmux, then test with evaluation harness in main process
Or use the evaluation harness directly (it manages the server for stdio transport)

3.3 Use Quality Checklist

To verify implementation quality, load the appropriate checklist from the language-specific guide:

Python: see "Quality Checklist" in 🐍 Python Guide
Node/TypeScript: see "Quality Checklist" in ⚡ TypeScript Guide

Phase 4: Create Evaluations

After implementing your MCP server, create comprehensive evaluations to test its effectiveness.

Load ✅ Evaluation Guide for complete evaluation guidelines.

4.1 Understand Evaluation Purpose

Evaluations test whether LLMs can effectively use your MCP server to answer realistic, complex questions.

4.2 Create 10 Evaluation Questions

To create effective evaluations, follow the process outlined in the evaluation guide:

Tool Inspection: List available tools and understand their capabilities
Content Exploration: Use READ-ONLY operations to explore available data
Question Generation: Create 10 complex, realistic questions
Answer Verification: Solve each question yourself to verify answers

4.3 Evaluation Requirements

Each question must be:

Independent: Not dependent on other questions
Read-only: Only non-destructive operations required
Complex: Requiring multiple tool calls and deep exploration
Realistic: Based on real use cases humans would care about
Verifiable: Single, clear answer that can be verified by string comparison
Stable: Answer won't change over time

4.4 Output Format

Create an XML file with this structure:

<evaluation>
  <qa_pair>
    <question>Find discussions about AI model launches with animal codenames. One model needed a specific safety designation that uses the format ASL-X. What number X was being determined for the model named after a spotted wild cat?</question>
    <answer>3</answer>
  </qa_pair>
<!-- More qa_pairs... -->
</evaluation>

Reference Files

📚 Documentation Library

Load these resources as needed during development:

Core MCP Documentation (Load First)

MCP Protocol: Fetch from https://modelcontextprotocol.io/llms-full.txt - Complete MCP specification
📋 MCP Best Practices - Universal MCP guidelines including:
- Server and tool naming conventions
- Response format guidelines (JSON vs Markdown)
- Pagination best practices
- Character limits and truncation strategies
- Tool development guidelines
- Security and error handling standards

SDK Documentation (Load During Phase 1/2)

Python SDK: Fetch from https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md
TypeScript SDK: Fetch from https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md

Language-Specific Implementation Guides (Load During Phase 2)

🐍 Python Implementation Guide - Complete Python/FastMCP guide with:
- Server initialization patterns
- Pydantic model examples
- Tool registration with @mcp.tool
- Complete working examples
- Quality checklist
⚡ TypeScript Implementation Guide - Complete TypeScript guide with:
- Project structure
- Zod schema patterns
- Tool registration with server.registerTool
- Complete working examples
- Quality checklist

Evaluation Guide (Load During Phase 4)

✅ Evaluation Guide - Complete evaluation creation guide with:
- Question creation guidelines
- Answer verification strategies
- XML format specifications
- Example questions and answers
- Running an evaluation with the provided scripts

mcp-management
claude-code

[IMPORTANT] Use task tracking to break ALL work into small tasks BEFORE starting — including tasks for each file read. This prevents context loss from long files. For simple tasks, AI MUST ATTENTION ask user whether to skip.

AI Mistake Prevention — Failure modes to avoid on every task: Check downstream references before deleting. Deleting components causes documentation and code staleness cascades. Map all referencing files before removal. Verify AI-generated content against actual code. AI hallucinates APIs, class names, and method signatures. Always grep to confirm existence before documenting or referencing. Trace full dependency chain after edits. Changing a definition misses downstream variables and consumers derived from it. Always trace the full chain. Trace ALL code paths when verifying correctness. Confirming code exists is not confirming it executes. Always trace early exits, error branches, and conditional skips — not just happy path. When debugging, ask "whose responsibility?" before fixing. Trace whether bug is in caller (wrong data) or callee (wrong handling). Fix at responsible layer — never patch symptom site. Assume existing values are intentional — ask WHY before changing. Before changing any constant, limit, flag, or pattern: read comments, check git blame, examine surrounding code. Verify ALL affected outputs, not just the first. Changes touching multiple stacks require verifying EVERY output. One green check is not all green checks. Holistic-first debugging — resist nearest-attention trap. When investigating any failure, list EVERY precondition first (config, env vars, DB names, endpoints, DI registrations, data preconditions), then verify each against evidence before forming any code-layer hypothesis. Surgical changes — apply the diff test. Bug fix: every changed line must trace directly to the bug. Don't restyle or improve adjacent code. Enhancement task: implement improvements AND announce them explicitly. Surface ambiguity before coding — don't pick silently. If request has multiple interpretations, present each with effort estimate and ask. Never assume all-records, file-based, or more complex path.

Critical Thinking Mindset — Apply critical thinking, sequential thinking. Every claim needs traced proof, confidence >80% to act. Anti-hallucination: Never present guess as fact — cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, stay skeptical of own confidence — certainty without evidence root of all hallucination.

MUST ATTENTION apply critical thinking — every claim needs traced proof, confidence >80% to act. Anti-hallucination: never present guess as fact.

MUST ATTENTION apply AI mistake prevention — holistic-first debugging, fix at responsible layer, surface ambiguity before coding, re-read files after compaction.

Closing Reminders

IMPORTANT MUST ATTENTION break work into small todo tasks using task tracking BEFORE starting IMPORTANT MUST ATTENTION search codebase for 3+ similar patterns before creating new code IMPORTANT MUST ATTENTION cite file:line evidence for every claim (confidence >80% to act) IMPORTANT MUST ATTENTION add a final review todo task to verify work quality

[TASK-PLANNING] Before acting, analyze task scope and systematically break it into small todo tasks and sub-tasks using task tracking.

Hookless Prompt Protocol Mirror (Auto-Synced)

Source: .claude/hooks/lib/prompt-injections.cjs + .claude/.ck.json

[WORKFLOW-EXECUTION-PROTOCOL] [BLOCKING] Workflow Execution Protocol — MANDATORY IMPORTANT MUST CRITICAL. Do not skip for any reason.

DETECT: Match prompt against workflow catalog
ANALYZE: Find best-match workflow AND evaluate if a custom step combination would fit better
ASK (REQUIRED FORMAT): Use a direct user question with this structure:
- Question: "Which workflow do you want to activate?"
- Option 1: "Activate [BestMatch Workflow] (Recommended)"
- Option 2: "Activate custom workflow: [step1 → step2 → ...]" (include one-line rationale)
ACTIVATE (if confirmed): Call $workflow-start <workflowId> for standard; sequence custom steps manually
CREATE TASKS: task tracking for ALL workflow steps
EXECUTE: Follow each step in sequence [CRITICAL-THINKING-MINDSET] Apply critical thinking, sequential thinking. Every claim needs traced proof, confidence >80% to act. Anti-hallucination principle: Never present guess as fact — cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, stay skeptical of own confidence — certainty without evidence root of all hallucination. AI Attention principle (Primacy-Recency): Put the 3 most critical rules at both top and bottom of long prompts/protocols so instruction adherence survives long context windows.

Learned Lessons

Lessons Learned

[CRITICAL] Hard-won project debugging/architecture rules. MUST ATTENTION apply BEFORE forming hypothesis or writing code.

Quick Summary

Goal: Prevent recurrence of known failure patterns — debugging, architecture, naming, AI orchestration, environment.

Top Rules (apply always):

MUST ATTENTION verify ALL preconditions (config, env, DB names, DI regs) BEFORE code-layer hypothesis
MUST ATTENTION fix responsible layer — NEVER patch symptom sites with caller-specific defensive code
MUST ATTENTION use ExecuteInjectScopedAsync for parallel async + repo/UoW — NEVER ExecuteUowTask
MUST ATTENTION name by PURPOSE not CONTENT — adding member forces rename = abstraction broken
MUST ATTENTION persist sub-agent findings incrementally after each file — NEVER batch at end
MUST ATTENTION Windows bash: verify Python alias (where python/where py) — NEVER assume python/python3 resolves

Debugging & Root Cause Reasoning

[2026-04-11] Holistic-first: verify environment before code. Failure → list ALL preconditions (config, env vars, DB names, endpoints, DI regs, credentials, permissions, data prerequisites) → verify each via evidence (grep/cat/query) BEFORE code-layer hypothesis. Worst rabbit holes: diving nearest layer while bug sits elsewhere — e.g., hours debugging "sync timeout", real cause: test appsettings pointing wrong DB. Cheapest check first.
[2026-04-01] Ask "whose responsibility?" before fixing. Trace: bug in caller (wrong data) or callee (wrong handling)? Fix responsible layer — NEVER patch symptom site masking real issue.
[2026-04-01] Trace data lifecycle, not error site. Follow data: creation → transformation → consumption. Bug usually where data created wrong, not consumed.
[2026-04-01] Code is caller-agnostic. Functions/handlers/consumers don't know who invokes them. Comments/guards/messages describe business intent — NEVER reference specific callers (tests, seeders, scripts).

Architecture Invariants

[2026-03-31] ParallelAsync + repo/UoW MUST use ExecuteInjectScopedAsync, NEVER ExecuteUowTask. ExecuteUowTask creates new UoW but reuses outer DI scope (same DbContext) — parallel iterations sharing non-thread-safe DbContext silently corrupt data. ExecuteInjectScopedAsync creates new UoW + new DI scope (fresh repo per iteration).
[2026-03-31] Bus message naming MUST include service name prefix — core services NEVER consume feature events. Prefix declares schema ownership (AccountUserEntityEventBusMessage = Accounts owns). Core services (Accounts, Communication) are leaders. Feature services (Growth, Talents) sending to core MUST use {CoreServiceName}...RequestBusMessage — never define own event for core to consume.

Naming & Abstraction

[2026-04-12] Name PURPOSE not CONTENT — "OrXxx" anti-pattern. HrManagerOrHrOrPayrollHrOperationsPolicy names set members, not what it guards. Add role → rename = broken abstraction. Rule: names express DOES/GUARDS, not CONTAINS. Test: adding/removing member forces rename? YES = content-driven = bad → rename to purpose (e.g., HrOperationsAccessPolicy). Nuance: "Or" fine in behavioral idioms (FirstOrDefault, SuccessOrThrow) — expresses HAPPENS, not membership.

Environment & Tooling

[2026-04-20] Windows bash: NEVER assume python/python3 resolves — verify alias first. Python may not be in bash PATH under those names. Check: where python / where py. Prefer py (Windows Python Launcher) for one-liners, node if JS alternative exists.

Test-specific lessons → docs/project-reference/integration-test-reference.md Lessons Learned section. Production-code anti-patterns → docs/project-reference/backend-patterns-reference.md Anti-Patterns section. Generic debugging/refactoring reminders → System Lessons in .claude/hooks/lib/prompt-injections.cjs.

Closing Reminders

IMPORTANT MUST ATTENTION holistic-first: verify ALL preconditions (config, env, DB names, endpoints, DI regs) BEFORE code-layer hypothesis — cheapest check first
IMPORTANT MUST ATTENTION fix responsible layer — NEVER patch symptom site; trace caller (wrong data) vs callee (wrong handling), fix root owner
IMPORTANT MUST ATTENTION parallel async + repo/UoW → ALWAYS ExecuteInjectScopedAsync, NEVER ExecuteUowTask (shared DbContext = silent data corruption)
IMPORTANT MUST ATTENTION bus message prefix = schema ownership; feature services NEVER define events for core services — use {CoreServiceName}...RequestBusMessage
IMPORTANT MUST ATTENTION name by PURPOSE — adding/removing member forces rename = broken abstraction
IMPORTANT MUST ATTENTION sub-agents MUST write findings after each file/section — NEVER batch all findings into one final write
IMPORTANT MUST ATTENTION Windows bash: NEVER assume python/python3 resolves — run where python/where py first, use py launcher or node

[LESSON-LEARNED-REMINDER] [BLOCKING] Task Planning & Continuous Improvement — MANDATORY. Do not skip.

Break work into small tasks (task tracking) before starting. Add final task: "Analyze AI mistakes & lessons learned".

Extract lessons — ROOT CAUSE ONLY, not symptom fixes:

Name the FAILURE MODE (reasoning/assumption failure), not symptom — "assumed API existed without reading source" not "used wrong enum value".
Generality test: does this failure mode apply to ≥3 contexts/codebases? If not, abstract one level up.
Write as a universal rule — strip project-specific names/paths/classes. Useful on any codebase.
Consolidate: multiple mistakes sharing one failure mode → ONE lesson.
Recurrence gate: "Would this recur in future session WITHOUT this reminder?" — No → skip $learn.
Auto-fix gate: "Could $code-review/$code-simplifier/$security/$lint catch this?" — Yes → improve review skill instead.
BOTH gates pass → ask user to run $learn. [TASK-PLANNING] [MANDATORY] BEFORE executing any workflow or skill step, create/update task tracking for all planned steps, then keep it synchronized as each step starts/completes.

name	mcp-builder
description	[AI & Tools] Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).
disable-model-invocation	true

Codex compatibility note:

Invoke repository skills with $skill-name in Codex; this mirrored copy rewrites legacy Claude /skill-name references.

Prefer the plan-hard skill for planning guidance in this Codex mirror.

Task tracker mandate: BEFORE executing any workflow or skill step, create/update task tracking for all steps and keep it synchronized as progress changes.

User-question prompts mean to ask the user directly in Codex.

Ignore Claude-specific mode-switch instructions when they appear.

Strict execution contract: when a user explicitly invokes a skill, execute that skill protocol as written.

Subagent authorization: when a skill is user-invoked or AI-detected and its protocol requires subagents, that skill activation authorizes use of the required spawn_agent subagent(s) for that task.

Do not skip, reorder, or merge protocol steps unless the user explicitly approves the deviation first.

For workflow skills, execute each listed child-skill step explicitly and report step-by-step evidence.

If a required step/tool cannot run in this environment, stop and ask the user before adapting.

Codex Project-Reference Loading (No Hooks)

Codex does not receive Claude hook-based doc injection. When coding, planning, debugging, testing, or reviewing, open project docs explicitly using this routing.

Always read:

docs/project-config.json (project-specific paths, commands, modules, and workflow/test settings)
docs/project-reference/docs-index-reference.md (routes to the full docs/project-reference/* catalog)
docs/project-reference/lessons.md (always-on guardrails and anti-patterns)

Situation-based docs:

Backend/CQRS/API/domain/entity changes: backend-patterns-reference.md, domain-entities-reference.md, project-structure-reference.md
Frontend/UI/styling/design-system: frontend-patterns-reference.md, scss-styling-guide.md, design-system/README.md
Spec/test-case planning or TC mapping: feature-docs-reference.md
Integration test implementation/review: integration-test-reference.md
E2E test implementation/review: e2e-test-reference.md
Code review/audit work: code-review-rules.md plus domain docs above based on changed files

Do not read all docs blindly. Start from docs-index-reference.md, then open only relevant files for the task.

Quick Summary

Goal: Create high-quality MCP (Model Context Protocol) servers enabling LLMs to interact with external services through well-designed tools.

Workflow:

Deep Research — Study agent-centric design principles, MCP protocol, SDK docs, API docs exhaustively
Implementation Plan — Tool selection, shared utilities, input/output design, error handling
Implementation — Project structure, core infrastructure, tools with proper schemas/docs, language-specific best practices
Review & Refine — Code quality review (DRY, composability, consistency), test & build
Create Evaluations — 10 realistic, complex, read-only evaluation questions with verified answers

Key Rules:

Build for Workflows: Not just API wrappers, consolidate related operations
Optimize for Context: Return high-signal info, support concise/detailed formats
Actionable Errors: Guide agents toward correct usage with specific suggestions
Scripts First: Execute detection/fetch/analyze scripts for zero-token overhead

Be skeptical. Apply critical thinking, sequential thinking. Every claim needs traced proof, confidence percentages (Idea should be more than 80%).

MCP Server Development Guide

Overview

Process

🚀 High-Level Workflow

Creating a high-quality MCP server involves four main phases:

Phase 1: Deep Research and Planning

1.1 Understand Agent-Centric Design Principles

Before diving into implementation, understand how to design tools for AI agents by reviewing these principles:

Build for Workflows, Not Just API Endpoints:

Don't simply wrap existing API endpoints - build thoughtful, high-impact workflow tools
Consolidate related operations (e.g., schedule_event that both checks availability and creates event)
Focus on tools that enable complete tasks, not just individual API calls
Consider what workflows agents actually need to accomplish

Optimize for Limited Context:

Agents have constrained context windows - make every token count
Return high-signal information, not exhaustive data dumps
Provide "concise" vs "detailed" response format options
Default to human-readable identifiers over technical codes (names over IDs)
Consider the agent's context budget as a scarce resource

Design Actionable Error Messages:

Error messages should guide agents toward correct usage patterns
Suggest specific next steps: "Try using filter='active_only' to reduce results"
Make errors educational, not just diagnostic
Help agents learn proper tool usage through clear feedback

Follow Natural Task Subdivisions:

Tool names should reflect how humans think about tasks
Group related tools with consistent prefixes for discoverability
Design tools around natural workflows, not just API structure

Use Evaluation-Driven Development:

Create realistic evaluation scenarios early
Let agent feedback drive tool improvements
Prototype quickly and iterate based on actual agent performance

1.3 Study MCP Protocol Documentation

Fetch the latest MCP protocol documentation:

Use WebFetch to load: https://modelcontextprotocol.io/llms-full.txt

This comprehensive document contains the complete MCP specification and guidelines.

1.4 Study Framework Documentation

Load and read the following reference files:

MCP Best Practices: 📋 View Best Practices - Core guidelines for all MCP servers

For Python implementations, also load:

Python SDK Documentation: Use WebFetch to load https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md
🐍 Python Implementation Guide - Python-specific best practices and examples

For Node/TypeScript implementations, also load:

TypeScript SDK Documentation: Use WebFetch to load https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md
⚡ TypeScript Implementation Guide - Node/TypeScript-specific best practices and examples

1.5 Exhaustively Study API Documentation

To integrate a service, read through ALL available API documentation:

Official API reference documentation
Authentication and authorization requirements
Rate limiting and pagination patterns
Error responses and status codes
Available endpoints and their parameters
Data models and schemas

To gather comprehensive information, use web search and the WebFetch tool as needed.

1.6 Create a Comprehensive Implementation Plan

Based on your research, create a detailed plan that includes:

Tool Selection:

List the most valuable endpoints/operations to implement
Prioritize tools that enable the most common and important use cases
Consider which tools work together to enable complex workflows

Shared Utilities and Helpers:

Identify common API request patterns
Plan pagination helpers
Design filtering and formatting utilities
Plan error handling strategies

Input/Output Design:

Define input validation models (Pydantic for Python, Zod for TypeScript)
Design consistent response formats (e.g., JSON or Markdown), and configurable levels of detail (e.g., Detailed or Concise)
Plan for large-scale usage (thousands of users/resources)
Implement character limits and truncation strategies (e.g., 25,000 tokens)

Error Handling Strategy:

Plan graceful failure modes
Design clear, actionable, LLM-friendly, natural language error messages which prompt further action
Consider rate limiting and timeout scenarios
Handle authentication and authorization errors

Phase 2: Implementation

Now that you have a comprehensive plan, begin implementation following language-specific best practices.

2.1 Set Up Project Structure

For Python:

Create a single .py file or organize into modules if complex (see 🐍 Python Guide)
Use the MCP Python SDK for tool registration
Define Pydantic models for input validation

For Node/TypeScript:

Create proper project structure (see ⚡ TypeScript Guide)
Set up package.json and tsconfig.json
Use MCP TypeScript SDK
Define Zod schemas for input validation

2.2 Implement Core Infrastructure First

To begin implementation, create shared utilities before implementing tools:

API request helper functions
Error handling utilities
Response formatting functions (JSON and Markdown)
Pagination helpers
Authentication/token management

2.3 Implement Tools Systematically

For each tool in the plan:

Define Input Schema:

Use Pydantic (Python) or Zod (TypeScript) for validation
Include proper constraints (min/max length, regex patterns, min/max values, ranges)
Provide clear, descriptive field descriptions
Include diverse examples in field descriptions

Write Comprehensive Docstrings/Descriptions:

One-line summary of what the tool does
Detailed explanation of purpose and functionality
Explicit parameter types with examples
Complete return type schema
Usage examples (when to use, when not to use)
Error handling documentation, which outlines how to proceed given specific errors

Implement Tool Logic:

Use shared utilities to avoid code duplication
Follow async/await patterns for all I/O
Implement proper error handling
Support multiple response formats (JSON and Markdown)
Respect pagination parameters
Check character limits and truncate appropriately

Add Tool Annotations:

readOnlyHint: true (for read-only operations)
destructiveHint: false (for non-destructive operations)
idempotentHint: true (if repeated calls have same effect)
openWorldHint: true (if interacting with external systems)

2.4 Follow Language-Specific Best Practices

At this point, load the appropriate language guide:

For Python: Load 🐍 Python Implementation Guide and ensure the following:

Using MCP Python SDK with proper tool registration
Pydantic v2 models with model_config
Type hints throughout
Async/await for all I/O operations
Proper imports organization
Module-level constants (CHARACTER_LIMIT, API_BASE_URL)

For Node/TypeScript: Load ⚡ TypeScript Implementation Guide and ensure the following:

Using server.registerTool properly
Zod schemas with .strict()
TypeScript strict mode enabled
No any types - use proper types
Explicit Promise return types
Build process configured (npm run build)

Phase 3: Review and Refine

After initial implementation:

3.1 Code Quality Review

To ensure quality, review the code for:

DRY Principle: No duplicated code between tools
Composability: Shared logic extracted into functions
Consistency: Similar operations return similar formats
Error Handling: All external calls have error handling
Type Safety: Full type coverage (Python type hints, TypeScript types)
Documentation: Every tool has comprehensive docstrings/descriptions

3.2 Test and Build

Safe ways to test the server:

Use the evaluation harness (see Phase 4) - recommended approach
Run the server in tmux to keep it outside your main process
Use a timeout when testing: timeout 5s python server.py

For Python:

Verify Python syntax: python -m py_compile your_server.py
Check imports work correctly by reviewing the file
To manually test: Run server in tmux, then test with evaluation harness in main process
Or use the evaluation harness directly (it manages the server for stdio transport)

For Node/TypeScript:

Run npm run build and ensure it completes without errors
Verify dist/index.js is created
To manually test: Run server in tmux, then test with evaluation harness in main process
Or use the evaluation harness directly (it manages the server for stdio transport)

3.3 Use Quality Checklist

To verify implementation quality, load the appropriate checklist from the language-specific guide:

Python: see "Quality Checklist" in 🐍 Python Guide
Node/TypeScript: see "Quality Checklist" in ⚡ TypeScript Guide

Phase 4: Create Evaluations

After implementing your MCP server, create comprehensive evaluations to test its effectiveness.

Load ✅ Evaluation Guide for complete evaluation guidelines.

4.1 Understand Evaluation Purpose

Evaluations test whether LLMs can effectively use your MCP server to answer realistic, complex questions.

4.2 Create 10 Evaluation Questions

To create effective evaluations, follow the process outlined in the evaluation guide:

Tool Inspection: List available tools and understand their capabilities
Content Exploration: Use READ-ONLY operations to explore available data
Question Generation: Create 10 complex, realistic questions
Answer Verification: Solve each question yourself to verify answers

4.3 Evaluation Requirements

Each question must be:

Independent: Not dependent on other questions
Read-only: Only non-destructive operations required
Complex: Requiring multiple tool calls and deep exploration
Realistic: Based on real use cases humans would care about
Verifiable: Single, clear answer that can be verified by string comparison
Stable: Answer won't change over time

4.4 Output Format

Create an XML file with this structure:

<evaluation>
  <qa_pair>
    <question>Find discussions about AI model launches with animal codenames. One model needed a specific safety designation that uses the format ASL-X. What number X was being determined for the model named after a spotted wild cat?</question>
    <answer>3</answer>
  </qa_pair>
<!-- More qa_pairs... -->
</evaluation>

Reference Files

📚 Documentation Library

Load these resources as needed during development:

Core MCP Documentation (Load First)

MCP Protocol: Fetch from https://modelcontextprotocol.io/llms-full.txt - Complete MCP specification
📋 MCP Best Practices - Universal MCP guidelines including:
- Server and tool naming conventions
- Response format guidelines (JSON vs Markdown)
- Pagination best practices
- Character limits and truncation strategies
- Tool development guidelines
- Security and error handling standards

SDK Documentation (Load During Phase 1/2)

Python SDK: Fetch from https://raw.githubusercontent.com/modelcontextprotocol/python-sdk/main/README.md
TypeScript SDK: Fetch from https://raw.githubusercontent.com/modelcontextprotocol/typescript-sdk/main/README.md

Language-Specific Implementation Guides (Load During Phase 2)

🐍 Python Implementation Guide - Complete Python/FastMCP guide with:
- Server initialization patterns
- Pydantic model examples
- Tool registration with @mcp.tool
- Complete working examples
- Quality checklist
⚡ TypeScript Implementation Guide - Complete TypeScript guide with:
- Project structure
- Zod schema patterns
- Tool registration with server.registerTool
- Complete working examples
- Quality checklist

Evaluation Guide (Load During Phase 4)

✅ Evaluation Guide - Complete evaluation creation guide with:
- Question creation guidelines
- Answer verification strategies
- XML format specifications
- Example questions and answers
- Running an evaluation with the provided scripts

mcp-management
claude-code

[IMPORTANT] Use task tracking to break ALL work into small tasks BEFORE starting — including tasks for each file read. This prevents context loss from long files. For simple tasks, AI MUST ATTENTION ask user whether to skip.

Critical Thinking Mindset — Apply critical thinking, sequential thinking. Every claim needs traced proof, confidence >80% to act. Anti-hallucination: Never present guess as fact — cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, stay skeptical of own confidence — certainty without evidence root of all hallucination.

MUST ATTENTION apply critical thinking — every claim needs traced proof, confidence >80% to act. Anti-hallucination: never present guess as fact.

MUST ATTENTION apply AI mistake prevention — holistic-first debugging, fix at responsible layer, surface ambiguity before coding, re-read files after compaction.

Closing Reminders

[TASK-PLANNING] Before acting, analyze task scope and systematically break it into small todo tasks and sub-tasks using task tracking.

Hookless Prompt Protocol Mirror (Auto-Synced)

Source: .claude/hooks/lib/prompt-injections.cjs + .claude/.ck.json

[WORKFLOW-EXECUTION-PROTOCOL] [BLOCKING] Workflow Execution Protocol — MANDATORY IMPORTANT MUST CRITICAL. Do not skip for any reason.

DETECT: Match prompt against workflow catalog
ANALYZE: Find best-match workflow AND evaluate if a custom step combination would fit better
ASK (REQUIRED FORMAT): Use a direct user question with this structure:
- Question: "Which workflow do you want to activate?"
- Option 1: "Activate [BestMatch Workflow] (Recommended)"
- Option 2: "Activate custom workflow: [step1 → step2 → ...]" (include one-line rationale)
ACTIVATE (if confirmed): Call $workflow-start <workflowId> for standard; sequence custom steps manually
CREATE TASKS: task tracking for ALL workflow steps
EXECUTE: Follow each step in sequence [CRITICAL-THINKING-MINDSET] Apply critical thinking, sequential thinking. Every claim needs traced proof, confidence >80% to act. Anti-hallucination principle: Never present guess as fact — cite sources for every claim, admit uncertainty freely, self-check output for errors, cross-reference independently, stay skeptical of own confidence — certainty without evidence root of all hallucination. AI Attention principle (Primacy-Recency): Put the 3 most critical rules at both top and bottom of long prompts/protocols so instruction adherence survives long context windows.

Learned Lessons

Lessons Learned

[CRITICAL] Hard-won project debugging/architecture rules. MUST ATTENTION apply BEFORE forming hypothesis or writing code.

Quick Summary

Goal: Prevent recurrence of known failure patterns — debugging, architecture, naming, AI orchestration, environment.

Top Rules (apply always):

MUST ATTENTION verify ALL preconditions (config, env, DB names, DI regs) BEFORE code-layer hypothesis
MUST ATTENTION fix responsible layer — NEVER patch symptom sites with caller-specific defensive code
MUST ATTENTION use ExecuteInjectScopedAsync for parallel async + repo/UoW — NEVER ExecuteUowTask
MUST ATTENTION name by PURPOSE not CONTENT — adding member forces rename = abstraction broken
MUST ATTENTION persist sub-agent findings incrementally after each file — NEVER batch at end
MUST ATTENTION Windows bash: verify Python alias (where python/where py) — NEVER assume python/python3 resolves

Debugging & Root Cause Reasoning

[2026-04-11] Holistic-first: verify environment before code. Failure → list ALL preconditions (config, env vars, DB names, endpoints, DI regs, credentials, permissions, data prerequisites) → verify each via evidence (grep/cat/query) BEFORE code-layer hypothesis. Worst rabbit holes: diving nearest layer while bug sits elsewhere — e.g., hours debugging "sync timeout", real cause: test appsettings pointing wrong DB. Cheapest check first.
[2026-04-01] Ask "whose responsibility?" before fixing. Trace: bug in caller (wrong data) or callee (wrong handling)? Fix responsible layer — NEVER patch symptom site masking real issue.
[2026-04-01] Trace data lifecycle, not error site. Follow data: creation → transformation → consumption. Bug usually where data created wrong, not consumed.
[2026-04-01] Code is caller-agnostic. Functions/handlers/consumers don't know who invokes them. Comments/guards/messages describe business intent — NEVER reference specific callers (tests, seeders, scripts).

Architecture Invariants

[2026-03-31] ParallelAsync + repo/UoW MUST use ExecuteInjectScopedAsync, NEVER ExecuteUowTask. ExecuteUowTask creates new UoW but reuses outer DI scope (same DbContext) — parallel iterations sharing non-thread-safe DbContext silently corrupt data. ExecuteInjectScopedAsync creates new UoW + new DI scope (fresh repo per iteration).
[2026-03-31] Bus message naming MUST include service name prefix — core services NEVER consume feature events. Prefix declares schema ownership (AccountUserEntityEventBusMessage = Accounts owns). Core services (Accounts, Communication) are leaders. Feature services (Growth, Talents) sending to core MUST use {CoreServiceName}...RequestBusMessage — never define own event for core to consume.

Naming & Abstraction

[2026-04-12] Name PURPOSE not CONTENT — "OrXxx" anti-pattern. HrManagerOrHrOrPayrollHrOperationsPolicy names set members, not what it guards. Add role → rename = broken abstraction. Rule: names express DOES/GUARDS, not CONTAINS. Test: adding/removing member forces rename? YES = content-driven = bad → rename to purpose (e.g., HrOperationsAccessPolicy). Nuance: "Or" fine in behavioral idioms (FirstOrDefault, SuccessOrThrow) — expresses HAPPENS, not membership.

Environment & Tooling

[2026-04-20] Windows bash: NEVER assume python/python3 resolves — verify alias first. Python may not be in bash PATH under those names. Check: where python / where py. Prefer py (Windows Python Launcher) for one-liners, node if JS alternative exists.

Test-specific lessons → docs/project-reference/integration-test-reference.md Lessons Learned section. Production-code anti-patterns → docs/project-reference/backend-patterns-reference.md Anti-Patterns section. Generic debugging/refactoring reminders → System Lessons in .claude/hooks/lib/prompt-injections.cjs.

Closing Reminders

IMPORTANT MUST ATTENTION holistic-first: verify ALL preconditions (config, env, DB names, endpoints, DI regs) BEFORE code-layer hypothesis — cheapest check first
IMPORTANT MUST ATTENTION fix responsible layer — NEVER patch symptom site; trace caller (wrong data) vs callee (wrong handling), fix root owner
IMPORTANT MUST ATTENTION parallel async + repo/UoW → ALWAYS ExecuteInjectScopedAsync, NEVER ExecuteUowTask (shared DbContext = silent data corruption)
IMPORTANT MUST ATTENTION bus message prefix = schema ownership; feature services NEVER define events for core services — use {CoreServiceName}...RequestBusMessage
IMPORTANT MUST ATTENTION name by PURPOSE — adding/removing member forces rename = broken abstraction
IMPORTANT MUST ATTENTION sub-agents MUST write findings after each file/section — NEVER batch all findings into one final write
IMPORTANT MUST ATTENTION Windows bash: NEVER assume python/python3 resolves — run where python/where py first, use py launcher or node

[LESSON-LEARNED-REMINDER] [BLOCKING] Task Planning & Continuous Improvement — MANDATORY. Do not skip.

Break work into small tasks (task tracking) before starting. Add final task: "Analyze AI mistakes & lessons learned".

Extract lessons — ROOT CAUSE ONLY, not symptom fixes:

Name the FAILURE MODE (reasoning/assumption failure), not symptom — "assumed API existed without reading source" not "used wrong enum value".
Generality test: does this failure mode apply to ≥3 contexts/codebases? If not, abstract one level up.
Write as a universal rule — strip project-specific names/paths/classes. Useful on any codebase.
Consolidate: multiple mistakes sharing one failure mode → ONE lesson.
Recurrence gate: "Would this recur in future session WITHOUT this reminder?" — No → skip $learn.
Auto-fix gate: "Could $code-review/$code-simplifier/$security/$lint catch this?" — Yes → improve review skill instead.
BOTH gates pass → ask user to run $learn. [TASK-PLANNING] [MANDATORY] BEFORE executing any workflow or skill step, create/update task tracking for all planned steps, then keep it synchronized as each step starts/completes.

mcp-builder

Codex Project-Reference Loading (No Hooks)

Quick Summary

MCP Server Development Guide

Overview

Process

🚀 High-Level Workflow

Phase 1: Deep Research and Planning

1.1 Understand Agent-Centric Design Principles

1.3 Study MCP Protocol Documentation

1.4 Study Framework Documentation

1.5 Exhaustively Study API Documentation

1.6 Create a Comprehensive Implementation Plan

Phase 2: Implementation

2.1 Set Up Project Structure

2.2 Implement Core Infrastructure First

2.3 Implement Tools Systematically

2.4 Follow Language-Specific Best Practices

Phase 3: Review and Refine

3.1 Code Quality Review

3.2 Test and Build

3.3 Use Quality Checklist

Phase 4: Create Evaluations

4.1 Understand Evaluation Purpose

4.2 Create 10 Evaluation Questions

4.3 Evaluation Requirements

4.4 Output Format

Reference Files

📚 Documentation Library

Core MCP Documentation (Load First)

SDK Documentation (Load During Phase 1/2)

Language-Specific Implementation Guides (Load During Phase 2)

Evaluation Guide (Load During Phase 4)

Related

Closing Reminders

Hookless Prompt Protocol Mirror (Auto-Synced)

[WORKFLOW-EXECUTION-PROTOCOL] [BLOCKING] Workflow Execution Protocol — MANDATORY IMPORTANT MUST CRITICAL. Do not skip for any reason.

Learned Lessons

Lessons Learned

Quick Summary

Debugging & Root Cause Reasoning

Architecture Invariants

Naming & Abstraction

Environment & Tooling

Closing Reminders

[LESSON-LEARNED-REMINDER] [BLOCKING] Task Planning & Continuous Improvement — MANDATORY. Do not skip.

Codex Project-Reference Loading (No Hooks)

Quick Summary

MCP Server Development Guide

Overview

Process

🚀 High-Level Workflow

Phase 1: Deep Research and Planning

1.1 Understand Agent-Centric Design Principles

1.3 Study MCP Protocol Documentation

1.4 Study Framework Documentation

1.5 Exhaustively Study API Documentation

1.6 Create a Comprehensive Implementation Plan

Phase 2: Implementation

2.1 Set Up Project Structure

2.2 Implement Core Infrastructure First

2.3 Implement Tools Systematically

2.4 Follow Language-Specific Best Practices

Phase 3: Review and Refine

3.1 Code Quality Review

3.2 Test and Build

3.3 Use Quality Checklist

Phase 4: Create Evaluations

4.1 Understand Evaluation Purpose

4.2 Create 10 Evaluation Questions

4.3 Evaluation Requirements

4.4 Output Format

Reference Files

📚 Documentation Library

Core MCP Documentation (Load First)

SDK Documentation (Load During Phase 1/2)

Language-Specific Implementation Guides (Load During Phase 2)

Evaluation Guide (Load During Phase 4)

Related

Closing Reminders