| name | template-optimizer |
| description | Optimize YAML templates for Hyper-Extract.
Use when: "optimize template", "fix YAML issues", "improve quality", "lint template"
Trigger: After creating templates or during review
Skip: Creating new templates (use brainstorm + designer instead)
|
Template Optimizer
Automatically analyze and optimize YAML templates by applying best practices and fixing common issues.
Workflow
1. Parse YAML → 2. Analyze Issues → 3. Apply Fixes → 4. Generate Report
Step 1: Parse YAML
Load the YAML template and validate basic structure.
Step 2: Analyze Issues
Check the template against the rules listed under Detection Rules (Rules 1 to 5).
Step 3: Apply Fixes
Apply fixes based on optimization level:
| Level | Description | Example |
|---|---|---|
| Auto-fix | Safe changes that always improve | relation_type → type |
| Suggest | Changes that may need review | Field count > 5 |
| Review | Design decisions | Relation type openness |
Step 4: Generate Report
List every change with a short explanation, so template authors can learn from the report.
Detection Rules
Rule 1: Multi-language Consistency
❌ Pattern: `[a-zA-Z]+\([^)]+\)` in zh fields
❌ Pattern: Chinese chars in en fields
Fix: Separate language content, use pure Chinese in zh, pure English in en.
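A minimal sketch of how the two Rule 1 patterns could be checked. The function name `check_language_consistency` and the `{"zh": ..., "en": ...}` input shape are assumptions for illustration, and "Chinese chars" is approximated here by the basic CJK Unified Ideographs block.

```python
import re

# Latin word directly followed by an ASCII parenthetical, e.g. "Company(公司)".
LATIN_WITH_PARENS = re.compile(r"[a-zA-Z]+\([^)]+\)")
# Assumption: "Chinese chars" means the basic CJK Unified Ideographs block.
CHINESE_CHAR = re.compile(r"[\u4e00-\u9fff]")


def check_language_consistency(text_by_lang):
    """Return (lang, message) issues for a {'zh': str, 'en': str} mapping."""
    issues = []
    zh = text_by_lang.get("zh", "")
    en = text_by_lang.get("en", "")
    if LATIN_WITH_PARENS.search(zh):
        issues.append(("zh", "Latin term with parenthetical found in zh field"))
    if CHINESE_CHAR.search(en):
        issues.append(("en", "Chinese characters found in en field"))
    return issues
```

Detection only reports the problem; separating the content into pure zh and pure en text still needs a human or a translation step.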
Rule 2: Field Naming
❌ relation_type → type
❌ event_date → time
❌ entity_type → type
Fix: Standardize to concise names.
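The renames above can be sketched as a lookup table. This assumes each field is a dict with a `name` key, which is an illustrative shape rather than the confirmed Hyper-Extract template layout; `standardize_field_names` is a hypothetical helper.

```python
# Rename table taken from Rule 2.
RENAMES = {
    "relation_type": "type",
    "event_date": "time",
    "entity_type": "type",
}


def standardize_field_names(fields):
    """Return (new_fields, changes) without mutating the input list.

    Assumption: each field is a dict with a "name" key.
    """
    taken = {f["name"] for f in fields}
    new_fields, changes = [], []
    for field in fields:
        old = field["name"]
        new = RENAMES.get(old, old)
        # Skip the rename if the target name is already used by another field.
        if new != old and new in taken:
            new = old
        if new != old:
            taken.discard(old)
            taken.add(new)
            changes.append((old, new))
        new_fields.append({**field, "name": new})
    return new_fields, changes
```

Skipping a rename when the target name already exists is a design choice of this sketch: it keeps the auto-fix safe, and the collision can be reported for manual review instead.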
Rule 3: Field Count
⚠️ entities.fields > 5 → Flag for review
⚠️ relations.fields > 5 → Flag for review
Fix: Simplify to essential fields, use priority: Essential → Important → Optional.
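A small sketch of the field-count check, assuming the parsed template is a dict with `entities.fields` and `relations.fields` lists (the paths named in Rule 3). The function name `flag_field_counts` is hypothetical.

```python
MAX_FIELDS = 5  # limit stated in Rule 3


def flag_field_counts(template, max_fields=MAX_FIELDS):
    """Return one warning per component whose field list exceeds max_fields."""
    warnings = []
    for component in ("entities", "relations"):
        # Tolerate a missing or empty component section.
        section = template.get(component) or {}
        fields = section.get("fields") or []
        if len(fields) > max_fields:
            warnings.append(
                f"{component}.fields has {len(fields)} fields "
                f"(> {max_fields}): flag for review"
            )
    return warnings
```

The check only flags; deciding which fields are Essential, Important, or Optional stays a review decision.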
Rule 4: Schema-Guideline Separation
❌ Repetition of field definitions in guideline
❌ Schema descriptions in rules
Fix: Schema defines WHAT, Guideline defines HOW TO DO WELL.
Rule 5: Hypergraph Grouping
❌ relation_members: participants (string) + role field exists
❌ relation_id contains participant field
Fix: Use nested grouping relation_members: [group_a, group_b] when entities partition by roles.
Design Principles
Schema vs Guideline
| Schema Defines | Guideline Defines |
|---|---|
| Field names | Extraction strategy |
| Field types | Quality requirements |
| Field descriptions | Creation conditions |
| Required/optional | Common mistakes |
❌ Wrong: Guideline repeats schema definitions
✅ Correct: Guideline explains how to extract well
Information Density
For entities and relations:
| Field Priority | Examples |
|---|---|
| Essential | source, target, participants |
| Important | type, time, location |
| Optional | description, metadata |
Max: 5 fields per component
Naming Conventions
| For | Use | Example |
|---|---|---|
| Template name | CamelCase | EarningsSummary |
| Field names | snake_case | company_name |
| Tags | lowercase | finance, investor |
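The naming conventions in the table can be checked with a few regular expressions. This is a sketch: the exact patterns (for example, allowing digits in names) and the `check_naming` helper are assumptions, not a confirmed part of the optimizer.

```python
import re

# Upper CamelCase such as "EarningsSummary".
CAMEL_CASE = re.compile(r"^(?:[A-Z][a-z0-9]+)+$")
# snake_case such as "company_name".
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)*$")


def check_naming(template_name, field_names, tags):
    """Return a list of human-readable naming violations."""
    problems = []
    if not CAMEL_CASE.match(template_name):
        problems.append(f"template name '{template_name}' is not CamelCase")
    for name in field_names:
        if not SNAKE_CASE.match(name):
            problems.append(f"field name '{name}' is not snake_case")
    for tag in tags:
        # Assumption: "lowercase" simply means the tag has no uppercase letters.
        if tag != tag.lower():
            problems.append(f"tag '{tag}' is not lowercase")
    return problems
```

For example, `check_naming("EarningsSummary", ["company_name"], ["finance", "investor"])` returns an empty list.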
Reference Files
Output Format
# Optimization Report
## Changes Made
| File | Issue | Fix | Level |
|------|-------|-----|-------|
| template.yaml | `relation_type` | renamed to `type` | Auto-fix |
| template.yaml | Mixed language | Fixed zh/en separation | Auto-fix |
| template.yaml | 7 relation fields | Suggested simplification | Suggest |
## Summary
- Auto-fix: 2
- Suggestions: 1
- Manual review: 0
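As an illustration, the report layout above could be produced from a list of change records. The record keys (`file`, `issue`, `fix`, `level`) and the function name `render_report` are assumptions made for this sketch.

```python
from collections import Counter


def render_report(changes):
    """Render change records as the markdown report shown above.

    Assumption: each change is a dict with "file", "issue", "fix" and
    "level", where level is "Auto-fix", "Suggest" or "Review".
    """
    lines = [
        "# Optimization Report",
        "## Changes Made",
        "| File | Issue | Fix | Level |",
        "|------|-------|-----|-------|",
    ]
    for c in changes:
        lines.append(f"| {c['file']} | {c['issue']} | {c['fix']} | {c['level']} |")
    # Counter returns 0 for levels that never occur.
    counts = Counter(c["level"] for c in changes)
    lines += [
        "## Summary",
        f"- Auto-fix: {counts['Auto-fix']}",
        f"- Suggestions: {counts['Suggest']}",
        f"- Manual review: {counts['Review']}",
    ]
    return "\n".join(lines)
```

Keeping the level on every record lets the summary counts be derived from the change table rather than tracked separately.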
Integration
Full workflow: brainstorm → designer → optimizer → validator
The optimizer stage applies best practices and auto-fixes common issues.
When to use:
- After creating new templates
- Before validator
- During template review
- Batch optimization of existing templates