| name | baml-codegen |
| description | Generates production-ready BAML applications from natural language requirements. Creates complete type definitions, functions, clients, tests, and framework integrations for data extraction, classification, RAG, and agent workflows. Queries official BoundaryML repositories via MCP for real-time patterns. Supports multimodal inputs (images, audio), 6 programming languages (Python, TypeScript, Ruby, Java, Go, C#), 10+ frameworks, 50-70% token optimization, and 95%+ compilation success. |
Version: 1.2.0
Status: Production (Two-Phase MCP Validation)
Token Budget: ~3800 tokens (validated)
Generate production-ready BAML applications by querying official BoundaryML repositories through MCP servers. This skill extracts patterns, validates syntax, and synthesizes complete solutions including types, functions, tests, and integrations.
Required MCP Servers:
baml_Docs: Core BAML repository access
baml_Examples: Production pattern examples (optional)
Validation: Check MCP availability before proceeding. If unavailable, use the embedded fallback cache.
Algorithm: Multi-stage pattern detection
INPUT: Natural language requirement
OUTPUT: Pattern category + matched examples
STEPS:
1. Tokenize requirement text
2. Match trigger keywords:
- extraction: [extract, parse, structure, data]
- classification: [classify, categorize, label]
- rag: [search, retrieve, context, citation]
- agents: [agent, plan, execute, tool]
3. Score categories (frequency + position weighting)
4. Select primary category (threshold > 0.5)
5. **EXECUTE MCP QUERIES** (Critical for freshness):
- mcp__baml_Examples__search_baml_examples_code("{category} extension:baml")
- Fetch top 2-3 results via mcp__baml_Docs__fetch_generic_url_content
- Parse types, functions, prompts from real examples
- VALIDATE: Query baml_Docs to verify syntax is current (not deprecated)
- Output: "Found {X} patterns from BoundaryML/baml-examples"
- Fall back to cache only if MCP unavailable
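Steps 1 to 4 above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation. The source does not say how "frequency + position weighting" is computed, so the linear position decay, the prefix matching, and the normalization are assumptions, and the function names `score_categories` and `detect_pattern` are hypothetical.

```python
import re
from typing import Optional

# Trigger keywords copied from step 2 above.
TRIGGERS = {
    "extraction": ["extract", "parse", "structure", "data"],
    "classification": ["classify", "categorize", "label"],
    "rag": ["search", "retrieve", "context", "citation"],
    "agents": ["agent", "plan", "execute", "tool"],
}


def score_categories(requirement: str) -> dict:
    """Score each category by keyword frequency, weighting early tokens higher."""
    tokens = re.findall(r"[a-z]+", requirement.lower())
    raw = {category: 0.0 for category in TRIGGERS}
    for position, token in enumerate(tokens):
        # Assumed weighting: linear decay from 1.0 (first token) to 0.5 (last token).
        weight = 1.0 - 0.5 * position / max(len(tokens) - 1, 1)
        for category, keywords in TRIGGERS.items():
            # Prefix match so simple inflections ("extracts", "tools") still count.
            if any(token.startswith(keyword) for keyword in keywords):
                raw[category] += weight
    total = sum(raw.values())
    # Normalize so the scores sum to 1.0 whenever any keyword matched.
    return {c: (s / total if total else 0.0) for c, s in raw.items()}


def detect_pattern(requirement: str, threshold: float = 0.5) -> Optional[str]:
    """Return the primary category, or None if no score exceeds the threshold."""
    scores = score_categories(requirement)
    category, score = max(scores.items(), key=lambda item: item[1])
    return category if score > threshold else None
```

With this weighting, `detect_pattern("Extract and parse invoice data from a PDF")` returns `"extraction"`, and a requirement with no trigger keywords returns `None`, which is where a caller would ask the user to clarify or fall back to a default pattern.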
Template-Based Generation:
PHASE 1: Type Inference
- Extract entities from requirements
- Map to BAML types (class, enum, primitive)
- Generate type definitions with @description
PHASE 2: Function Generation
- Determine function signature
- Select appropriate client/model
- Synthesize prompt template
- Inject ctx.output_format
PHASE 3: Client Configuration
- Match model to use case:
* Multimodal → vision model
* Classification → fast model (GPT-5-mini)
* Complex → reasoning model (GPT-5)
- Generate client block with options
PHASE 4: Validation
- Syntax check (BAML grammar)
- Type safety (all refs defined)
- Completeness (all components present)
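The "Type safety (all refs defined)" check in Phase 4 can be illustrated as follows. This is a rough sketch that assumes the generated types are already available as plain dictionaries; it does not parse BAML source. It only understands the `Type[]`, `Type?`, and `Type|Type` forms, and its primitive set is limited to the four primitives named in this document, so it is not a complete model of the BAML type system. The helper names are hypothetical.

```python
import re

# Primitives named in this document; the real language has more built-in types.
PRIMITIVES = {"string", "int", "float", "bool"}


def referenced_names(type_expr: str) -> set:
    """Return the bare type names used in an expression such as 'LineItem[] | string?'."""
    names = set()
    for part in type_expr.split("|"):
        # Strip trailing array and optional markers: "Item[]?" -> "Item".
        name = re.sub(r"(\[\]|\?)+$", "", part.strip())
        if name:
            names.add(name)
    return names


def undefined_references(classes: dict, enums: set) -> list:
    """List 'Class.field: Type' entries whose type is neither primitive nor defined."""
    known = PRIMITIVES | set(classes) | set(enums)
    problems = []
    for class_name, fields in classes.items():
        for field_name, type_expr in fields.items():
            for name in sorted(referenced_names(type_expr) - known):
                problems.append(f"{class_name}.{field_name}: {name}")
    return problems
```

An empty result means every field type resolves to a primitive, a class, or an enum; any entry in the list points at a definition that still has to be generated before the code can compile.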
Available Tools:
mcp__baml_Docs__fetch_baml_documentation:
Purpose: Fetch complete README
Use: General context, fallback
Cost: ~3000 tokens
mcp__baml_Docs__search_baml_documentation:
Purpose: Semantic search
Use: Find specific features
Cost: 200-1000 tokens
mcp__baml_Docs__search_baml_code:
Purpose: GitHub code search
Use: Find implementation examples
Cost: ~500 tokens (30 results/page)
mcp__baml_Docs__fetch_generic_url_content:
Purpose: Fetch specific file
Use: Detailed reference
Cost: 500-2000 tokens
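To compare query plans before running them, the approximate costs above can be summed per plan. The sketch below is illustrative only: the numbers are the rough estimates from the tool list, the short tool keys omit the `mcp__baml_Docs__` prefix, and `estimate_cost` is a hypothetical helper, not part of any MCP server.

```python
# (low, high) approximate token cost per call, taken from the tool list above.
TOOL_COSTS = {
    "fetch_baml_documentation": (3000, 3000),
    "search_baml_documentation": (200, 1000),
    "search_baml_code": (500, 500),
    "fetch_generic_url_content": (500, 2000),
}


def estimate_cost(plan: list) -> tuple:
    """Return the (low, high) token estimate for a sequence of tool calls."""
    low = sum(TOOL_COSTS[tool][0] for tool in plan)
    high = sum(TOOL_COSTS[tool][1] for tool in plan)
    return low, high
```

For example, one code search, two file fetches, and one documentation search give an estimate of 1700 to 5500 tokens, so a targeted plan is not automatically cheaper than fetching the full README once.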
Query Strategy:
Core: class, enum, function, client
Types: string, int, float, bool, Type[], Type?, Type|Type
Key Patterns: @description("..."), {{ param }}, {{ ctx.output_format }}
Providers: openai, anthropic, gemini, vertex, bedrock
Query MCP for complete syntax: mcp__baml_Docs__search_baml_documentation("syntax")
Token Reduction:
Performance:
5-Layer Validation:
Success Criteria:
User Request
↓
[1] Analyze Requirements
- Parse text
- Identify pattern (extraction/classification/rag/agents)
- Extract entities and constraints
↓
[2] Pattern Matching **MCP REQUIRED**
- Execute: mcp__baml_Examples__search_baml_examples_code
- Fetch: mcp__baml_Docs__fetch_generic_url_content
- Parse: Extract types/functions/prompts from real code
- Rank by similarity (>0.7 threshold)
- Select top 3 candidates
- Output: "Found {X} patterns from BoundaryML/baml-examples"
↓
[2.5] Syntax Validation **baml_Docs**
- Query: mcp__baml_Docs__search_baml_documentation("syntax {feature}")
- Compare: Example syntax vs canonical docs from BoundaryML/baml
- Modernize: Update deprecated patterns to current spec
- Output: "✅ Validated against BoundaryML/baml" OR "🔧 Modernized {N} patterns"
↓
[3] Code Generation
- Generate types (classes, enums)
- Generate function (signature, prompt, client)
- Generate client configuration
↓
[4] Test Generation
- Happy path tests
- Edge case tests
- Error handling tests
↓
[5] Integration Generation
- Python/TypeScript/Ruby client code
- Framework-specific endpoints
- Deployment configuration
↓
[6] Validation & Optimization
- Validate syntax and types
- Optimize for tokens
- Estimate costs
↓
[6.5] Error Recovery **🔧 IF ERRORS**
- Extract: Parse error message from validation
- Query: mcp__baml_Docs__search_baml_documentation("{error}")
- Fetch: Current syntax spec from BoundaryML/baml
- Fix: Update code to match canonical specification
- Retry: Re-validate (max 2 attempts)
- Output: "🔧 Fixed {N} errors using BoundaryML/baml docs"
↓
[7] Deliver Artifacts
- BAML code
- Test suite
- Integration code
- Deployment config
- Performance metadata
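The error-recovery loop in step [6.5] can be sketched as a bounded retry. This is a minimal illustration under two assumptions: "max 2 attempts" is read as at most two fix-and-revalidate cycles, and the `validate` and `fix` callables are hypothetical stand-ins for the real validator and the documentation-driven repair step.

```python
from typing import Callable, List, Tuple


def validate_with_recovery(
    code: str,
    validate: Callable[[str], List[str]],
    fix: Callable[[str, List[str]], str],
    max_retries: int = 2,
) -> Tuple[str, List[str], int]:
    """Validate code, applying fix and re-validating up to max_retries times.

    Returns (final code, remaining errors, number of fix attempts made).
    """
    errors = validate(code)
    attempts = 0
    while errors and attempts < max_retries:
        # In the workflow above, this is where the error text would be sent
        # to the documentation search before the code is patched.
        code = fix(code, errors)
        attempts += 1
        errors = validate(code)
    return code, errors, attempts
```

If errors remain after the last attempt, the caller still receives them, so the failure can be reported to the user instead of being silently dropped.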
CRITICAL: Always query MCP for fresh patterns during code generation.
Observable Indicators (show user MCP usage):
Execution Order:
Multi-Tier Cache:
Tier 1 (Embedded): Core syntax (500 tokens, never expires)
Tier 2 (Session): Recent patterns (15 min TTL, LRU eviction)
Tier 3 (Persistent): Top 20 patterns (7 day TTL)
Tier 4 (MCP): Live queries (no cache, always fresh)
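As an illustration of Tier 2, the sketch below combines a 15 minute TTL with LRU eviction. It is one possible reading of the description, not the skill's actual cache: the `capacity` limit is an assumption (the source gives no size for this tier), and the `SessionCache` class name and injectable `clock` are hypothetical.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class SessionCache:
    """Tier 2 sketch: entries expire after a TTL; the least recently used is evicted at capacity."""

    def __init__(
        self,
        ttl_seconds: float = 15 * 60,
        capacity: int = 32,  # assumed; the source does not give a size limit
        clock: Callable[[], float] = time.monotonic,
    ):
        self.ttl = ttl_seconds
        self.capacity = capacity
        self.clock = clock
        self._entries = OrderedDict()  # key -> (stored_at, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self.clock(), value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
```

A miss at this tier would fall through to Tier 3 and then to a live MCP query. Reading an entry refreshes its LRU position but not its timestamp, so a pattern is never served for more than one TTL after it was stored.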
Invalidation:
MCP Unavailable:
Pattern Not Found:
Validation Failure:
Generation Timeout:
Always include:
User: "Generate BAML to extract invoice data"
Output: Invoice class, ExtractInvoice function, tests, FastAPI integration
User: "Create sentiment classifier"
Output: Sentiment enum, ClassifySentiment function, confidence scoring
User: "Build citation-aware search"
Output: Citation class, SearchWithCitations function, source tracking
User: "Create planning agent"
Output: Task/Plan types, PlanExecution function, state management
Pattern Updates:
Skill Updates:
Token Count: ~3800 tokens (validated)
Last Updated: 2025-01-30
Version: 1.2.0 - Two-phase MCP validation (baml_Examples → baml_Docs)
Ready for Production: Yes