---
name: Prompt Manager
description: Optimize and manage AILANG teaching prompts for maximum conciseness and accuracy. Use when user asks to create/update prompts, optimize prompt length, or verify prompt accuracy.
---
Mission: Create concise, accurate teaching prompts with maximum information density.
- Target: ~4000 tokens per prompt (currently ~8000+)
- Strategy: Reference external docs, use tables, consolidate examples
- Validation: Maintain eval success rates while reducing prompt size
Invoke when the user mentions creating or updating prompts, optimizing prompt length, or verifying prompt accuracy.

NEW: Prompts are now accessible via the `ailang prompt` command (single source of truth).
```bash
# Get current/active prompt
ailang prompt

# Get specific version
ailang prompt --version v0.3.24

# List all available versions
ailang prompt --list

# Show metadata
ailang prompt --version v0.4.2 --info

# Save prompt to file for editing
ailang prompt > temp_prompt.md

# Pipe to pager for reading
ailang prompt | less

# Quick syntax reference
ailang prompt | grep -A 20 "Quick Reference"
```
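To review what changed between two versions, you can diff their outputs (a usage sketch built only on the flags shown above; assumes both versions are registered):

```bash
# Diff two registered prompt versions; head keeps the output readable
diff <(ailang prompt --version v0.3.16) <(ailang prompt --version v0.3.17) | head -40
```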
Implementation:

- `internal/prompt/loader.go` (reads from `prompts/versions.json`)
- `cmd/ailang/prompt.go`
- `internal/prompt` package (single source of truth)

IMPORTANT: When you edit a prompt file (e.g., `prompts/v0.4.2.md`), you MUST update its hash in `prompts/versions.json` for downstream users!
```bash
# 1. Edit the prompt file
vim prompts/v0.4.2.md

# 2. Update the hash in versions.json (REQUIRED!)
.claude/skills/prompt-manager/scripts/update_hash.sh v0.4.2

# 3. Verify downstream users see the change
ailang prompt --version v0.4.2 | head -20

# 4. If this is the active version, verify default users see it
ailang prompt | head -20
```
Why this matters:

- `ailang prompt` reads from `prompts/versions.json` → uses the `File` field to locate the prompt
- `internal/prompt` package → same `versions.json` source
- Single Source of Truth: `prompts/versions.json` is the registry. Update it, and everyone sees the change.
Note: The eval harness's legacy `PromptLoader` (different from `internal/prompt`) DOES validate hashes. We're migrating to the simpler loader that doesn't validate (for easier development iteration).
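If you ever need to recompute a hash by hand, something like the following should match what `update_hash.sh` automates (the `.hash` key path into `versions.json` is an assumed schema, so check the real field names before relying on this):

```bash
# Recompute the prompt file's SHA256 (update_hash.sh automates this)
hash=$(shasum -a 256 prompts/v0.4.2.md | cut -d' ' -f1)

# Hypothetical jq update -- verify the actual key names in versions.json
jq --arg h "$hash" '.["v0.4.2"].hash = $h' prompts/versions.json > /tmp/versions.json \
  && mv /tmp/versions.json prompts/versions.json
```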
Available scripts:

- `.claude/skills/prompt-manager/scripts/create_prompt_version.sh <new_version> <base_version> "<description>"`: creates the versioned prompt file, computes its hash, and updates `versions.json`
- `.claude/skills/prompt-manager/scripts/update_hash.sh <version>`: recomputes the SHA256 after edits
- `.claude/skills/eval-analyzer/scripts/verify_prompt_accuracy.sh <version>`: catches prompt-code mismatches and false limitations
- `.claude/skills/prompt-manager/scripts/check_examples_coverage.sh <version>`: verifies that features used in working examples are documented in the prompt
- `.claude/skills/prompt-manager/scripts/analyze_prompt_size.sh prompts/v0.3.17.md`: shows word count, section sizes, code blocks, tables, and optimization opportunities
- `.claude/skills/prompt-manager/scripts/test_prompt.sh v0.3.18`: runs an AILANG-only eval (no Python) with dev models to test prompt effectiveness
Example:

```bash
.claude/skills/prompt-manager/scripts/analyze_prompt_size.sh prompts/v0.3.16.md
```

Sample output:

```text
Total words: 4358 (target: <4000)
Total lines: 1214 (target: <200)
⚠️ OVER TARGET by 358 words (8%)
Code blocks: 60 (target: 5-10 comprehensive)
Table rows: 0 (target: 10+ tables)

Top sections by size:
  719 words - Effect System
  435 words - List Operations
  368 words - Algebraic Data Types
```
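Rough manual equivalents of those metrics, useful for spot-checking the script (a sketch; the script's exact counting rules are an assumption):

```bash
wc -w prompts/v0.3.16.md            # total words
wc -l prompts/v0.3.16.md            # total lines
grep -c '^```' prompts/v0.3.16.md   # fence lines (two per code block)
grep -c '^|' prompts/v0.3.16.md     # markdown table rows
```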
The script surfaces high-ROI optimization areas (in the sample above, the largest sections). Create a new version before optimizing:

```bash
.claude/skills/prompt-manager/scripts/create_prompt_version.sh v0.3.17 v0.3.16 "Optimize for conciseness (-50% tokens)"
```
Reference `resources/prompt_optimization.md` for the key optimization techniques.
⚠️ CRITICAL: Must validate AFTER each optimization step!
```bash
# 1. CHECK ALL CODE EXAMPLES (NEW REQUIREMENT!)
#    Extract and test every AILANG code block in the prompt.
#    This catches syntax errors that cause regressions.
.claude/skills/prompt-manager/scripts/validate_all_code.sh prompts/v0.3.17.md

# 2. Check new size
.claude/skills/prompt-manager/scripts/analyze_prompt_size.sh prompts/v0.3.17.md

# 3. Verify accuracy (no false limitations)
.claude/skills/eval-analyzer/scripts/verify_prompt_accuracy.sh v0.3.17

# 4. Check examples coverage (NEW - v0.4.1+)
#    Ensures working examples are documented in the prompt.
.claude/skills/prompt-manager/scripts/check_examples_coverage.sh v0.3.17

# 5. Update hash
.claude/skills/prompt-manager/scripts/update_hash.sh v0.3.17

# 6. TEST PROMPT EFFECTIVENESS (CRITICAL!)
#    Runs AILANG-only eval (no Python baseline) with dev models.
#    Target: >40% AILANG success rate.
.claude/skills/prompt-manager/scripts/test_prompt.sh v0.3.17
```
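For reference, the code-extraction step presumably looks something like this sketch (both the fence language tag it matches and the `ailang check` subcommand are hypothetical stand-ins for whatever `validate_all_code.sh` actually uses):

```bash
# Extract each fenced AILANG block into its own file. Assumes blocks are
# tagged with an "ailang" fence; adjust the pattern if fences are bare.
awk '/^```ailang/{f=1; n++; file=sprintf("/tmp/block_%02d.ail", n); next}
     /^```/{f=0}
     f{print > file}' prompts/v0.3.17.md

# Parse-check each block. "ailang check" is a HYPOTHETICAL subcommand,
# standing in for whatever validate_all_code.sh actually invokes.
for f in /tmp/block_*.ail; do
  ailang check "$f" || echo "FAILED: $f"
done
```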
Success criteria: all code examples validate, size is at or under target, accuracy and coverage checks pass, and the AILANG success rate holds (>40%).

⚠️ If the success rate drops >10%, REVERT and try a smaller optimization.
Add a header to the optimized prompt:

```text
---
Version: v0.3.17
Optimized: 2025-10-22
Token reduction: -54% (8200 → 3800 tokens)
Baseline: v0.3.16 → v0.3.17 success rate maintained
---
```
```bash
git add prompts/v0.3.17.md prompts/versions.json
git commit -m "feat: Optimize v0.3.17 prompt for conciseness

- Reduced tokens: 8200 → 3800 (-54%)
- Builtin docs: prose → tables + reference ailang builtins list
- Examples: 24 scattered → 8 consolidated comprehensive
- Type system: moved details to docs/guides/types.md
- Added quick reference section at top
- Validated: eval success rate maintained"
```
Full guide: `resources/prompt_optimization.md`
Detailed workflows: `resources/workflow_guide.md`
- Fix a false limitation: Create version → Remove "❌ NO X" → Add "✅ X" with examples → Verify → Commit
- Document a new feature: Create version → Add to capabilities table → Add consolidated example → Verify → Commit
- Optimize prompt size: Analyze size → Identify high-ROI sections → Apply techniques → Validate success rate → Document metrics → Commit
Benchmarks use TWO different fields for prompts:

| Field | Effect | When to Use |
|---|---|---|
| `prompt:` | REPLACES the teaching prompt | Only for language-agnostic tasks |
| `task_prompt:` | APPENDS to the teaching prompt | Use this for AILANG benchmarks! |
Example - WRONG (teaching prompt ignored):

```yaml
prompt: |
  Write a program that parses JSON...
```

Example - CORRECT (teaching prompt + task):

```yaml
task_prompt: |
  Write a program that parses JSON...
```
Why this matters: if `prompt:` is used, AILANG models don't see the teaching prompt at all; they only see the task description, so they won't know AILANG syntax!
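A quick way to catch this mistake across a benchmark suite (the `benchmarks/` directory and `.yml` extension are assumptions about the repo layout):

```bash
# Flag benchmark specs that replace the teaching prompt instead of appending
grep -rn '^prompt:' --include='*.yml' benchmarks/
```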
Best practice: Always load the current AILANG teaching prompt (`ailang prompt`) when editing prompts or benchmarks, so you understand what models will see.
Before modifying the AILANG teaching prompt, load it to understand the syntax:
```bash
ailang prompt > /tmp/current_prompt.md
# Read and understand AILANG syntax patterns
# Then make informed edits
```
This prevents introducing syntax errors or patterns that don't match AILANG's actual capabilities.
Target prompt profile: <4000 words, 5-10 comprehensive code blocks, 10+ tables, and a quick reference section at the top.
Case study 1 - over-aggressive optimization:

What happened: Optimized v0.3.17 → v0.3.18 with a -59% reduction (5189 → 2126 words).
Result: The AILANG success rate collapsed to 4.8% (from an expected ~40-60%).
Root causes: see `OPTIMIZATION_FAILURE_ANALYSIS.md` for the full breakdown.
Case study 2 - syntax errors in prompt examples:

What happened: The prompt had 3 syntax errors: (1) `match { | pattern =>` (wrong), (2) `import "std/io"` (wrong), (3) `let (x, y) = tuple` (wrong).
Result: A -4.8% regression (40.0% → 35.2%); 18 benchmarks failed with `PAR_001` compile errors.
Root cause: No validation that code examples in the prompt actually work with the AILANG parser.
Critical lessons: validate every code example in the prompt against the AILANG parser (`validate_all_code.sh`), and measure effectiveness with `test_prompt.sh` before committing.
Full analysis: `OPTIMIZATION_FAILURE_ANALYSIS.md`