| name | trainer-train-prompt |
| description | Own the end-to-end trainer loop for prompt-like files (*.prompt.md, *.prompty, *.instructions.md, system prompts, and other natural-language instruction artifacts). Use this whenever the caller needs to research, synthesize datasets, optimize, validate, and write back a trained candidate for a prompt-type target. Prefer this specialized loop for any file whose primary content is natural-language instructions rather than code, skill configuration, or agent contracts. |
| argument-hint | Describe the target prompt file, the repository root, the validation command, the available stage capabilities (researcher, synthesizer, optimizer, elector), and any existing dataset or workspace artifacts. |
| license | MIT |
| compatibility | Python 3.11+. Works in any repository that keeps trainer artifacts in `.trainer-workspace/` next to the selected target. |
| metadata | {"author":"Tyler Kendrick","version":"0.1.0"} |
Trainer Train - Prompt
Use this skill as the orchestration contract for one trainer run against a prompt-like target: any *.prompt.md, *.prompty, *.instructions.md, system prompt, or natural-language instruction file.
Read references/prompt-loop-contract.md for the full routing table, judge-mode rules, and prompt-specific validation constraints before any stage execution.
When to use this skill
- The selected target is a prompt file, instruction file, or prompty artifact.
- The caller needs to initialize or resume a trainer workspace for a prompt target.
- Missing datasets or eval manifests need to be synthesized before optimization.
- The optimization stage returns a manual follow-up payload and the loop must continue.
- A winning candidate needs to be validated and written back to the source prompt file.
Do not use this skill for code files, skill files, or agent contract files. Read the parent trainer skill's references/target-routing.md to identify the appropriate specialist for those target types.
Required inputs
- One selected prompt-like target file.
- Repository root or enough path context to derive the local trainer workspace.
- The validation command for the repository (e.g., python -m pytest -q).
- The concrete stage capability map: researcher, synthesizer, optimizer, elector.
- The currently available specialist-agent roster.
- Any existing workspace artifacts to reuse.
Prompt-specific loop rules
Judge mode
Prompt targets are almost always open-ended, so default to llm_judge mode. The exception is a dataset row that explicitly declares scoring (for example, scoring: exact_match): if any row declares scoring, treat that declaration as authoritative.
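The judge-mode rule can be sketched as a small helper. This is a minimal illustration, not the trainer's actual implementation: the function names are hypothetical, rows are assumed to be plain dictionaries, and raising an error when rows inside one split declare different scoring values is an assumption that this skill does not spell out.

```python
from collections.abc import Iterable, Mapping

DEFAULT_JUDGE_MODE = "llm_judge"  # default for prompt targets, per this skill


def infer_judge_mode(rows: Iterable[Mapping[str, object]]) -> str:
    """Return the judge mode implied by one dataset split."""
    # Collect every explicit row-level scoring declaration.
    declared = {str(row["scoring"]) for row in rows if row.get("scoring")}
    if len(declared) > 1:
        # Assumption: conflicting declarations inside one split are an error.
        raise ValueError(f"conflicting scoring declarations: {sorted(declared)}")
    # An explicit declaration is authoritative; otherwise use the default.
    return next(iter(declared), DEFAULT_JUDGE_MODE)


def infer_run_judge_mode(
    train: Iterable[Mapping[str, object]],
    validation: Iterable[Mapping[str, object]],
) -> str:
    """Infer one mode for the run, refusing to continue if the splits disagree."""
    train_mode = infer_judge_mode(train)
    validation_mode = infer_judge_mode(validation)
    if train_mode != validation_mode:
        raise ValueError(
            f"train implies {train_mode!r} but validation implies {validation_mode!r}"
        )
    return train_mode
```

A caller would typically catch the ValueError and surface it as a blocker rather than letting optimization start.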
Placeholder preservation
Never remove, rename, or reorder template placeholders (e.g., {{variable}}, {input}) during optimization or write-back. Confirm that the placeholder set is unchanged before any candidate write-back.
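One way to check placeholder preservation is to extract placeholders from the source and the candidate and compare them in order. The sketch below assumes only the two syntaxes named above, {{variable}} and {input}. The regular expression and function names are illustrative, and treating a changed number of occurrences as a violation is a conservative assumption.

```python
import re

# Assumed syntaxes: {{ name }} and {name}. Other template styles need more patterns.
PLACEHOLDER_RE = re.compile(
    r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}"  # double-brace form, whitespace tolerated
    r"|\{([A-Za-z_][\w.]*)\}"           # single-brace form
)


def extract_placeholders(text: str) -> list[str]:
    """Return normalized placeholders in order of appearance."""
    found = []
    for match in PLACEHOLDER_RE.finditer(text):
        double, single = match.group(1), match.group(2)
        found.append("{{" + double + "}}" if double else "{" + single + "}")
    return found


def placeholders_preserved(source: str, candidate: str) -> bool:
    """True when nothing was removed, renamed, or reordered."""
    # Ordered comparison: a reordering or a dropped repeat fails the check.
    return extract_placeholders(source) == extract_placeholders(candidate)


source = "Summarize {{document}} for {audience}."
candidate = "You are a concise editor. Summarize {{ document }} for {audience}."
print(placeholders_preserved(source, candidate))
```

Running the same check immediately before write-back keeps the confirmation tied to the exact text that will be written.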
Evaluator field isolation
Keep expected, reference, criteria, and scoring fields out of the prompt-visible render path. These are evaluator-only fields and must not appear in the optimized prompt text.
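A simple way to enforce this isolation is to filter rows before rendering and to scan rendered text for leaked values. The field names come from the rule above. The helper names are hypothetical, and the verbatim-substring leak check is a rough heuristic that can produce false positives on very short values.

```python
from collections.abc import Mapping

# Evaluator-only fields named by this skill.
EVALUATOR_ONLY_FIELDS = frozenset({"expected", "reference", "criteria", "scoring"})


def prompt_visible_fields(row: Mapping[str, object]) -> dict[str, object]:
    """Drop evaluator-only fields before a row reaches the render path."""
    return {key: value for key, value in row.items() if key not in EVALUATOR_ONLY_FIELDS}


def leaked_evaluator_fields(row: Mapping[str, object], rendered: str) -> list[str]:
    """Name evaluator-only fields whose string value appears verbatim in rendered text."""
    leaked = []
    for name in sorted(EVALUATOR_ONLY_FIELDS):
        value = row.get(name)
        # Heuristic: only non-empty string values are checked.
        if isinstance(value, str) and value and value in rendered:
            leaked.append(name)
    return leaked
```

For example, a row with input, expected, and scoring keys would render with only input visible, and any candidate text that contains the expected value verbatim would be flagged for review.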
Few-shot and chain-of-thought patterns
When the dataset rows expose example pairs or step-by-step reasoning traces, preserve those structural patterns in the optimized candidate. Do not flatten multi-turn or chain-of-thought structures into a single instruction block.
Core workflow
Follow this order. Consult references/prompt-loop-contract.md when artifact paths, scoring mode, or stage boundaries are uncertain.
- Resolve target and workspace. Derive <prompt-name> by stripping .prompty entirely or, for other files, stripping the final extension. Use <target-dir>/.trainer-workspace/<prompt-name>/ as workspace root. If state indicates a resumed run, audit tracked artifact pointers and skip only stages that already produced valid outputs.
- Require the workspace review checkpoint. Confirm the engineering review artifact exists before optimization starts. Report a blocker if it is absent.
- Initialize or refresh workspace. Create or update workflow-status.json, the source snapshot, the review subdirectory, and the inputs/source/ and iterations/ directories.
- Inspect existing datasets and evals. Prefer reuse when train, validation, and authored eval assets already fit the prompt target and scoring shape. Keep authored evals, train data, and validation data as separate artifacts in separate files.
- Run the missing-data path if needed. If any required dataset or eval is absent, or the validation split is not a genuine holdout, pause optimization and gather the missing assets via the caller-supplied researcher and synthesizer before continuing.
- Infer judge mode. Inspect representative dataset rows. Default to llm_judge for prompt targets. Treat an explicit row-level scoring declaration as authoritative. Stop and report the inconsistency if train and validation splits imply different modes.
- Run at least one optimization pass. Pass the inferred judge mode and the prompt-specific constraints (placeholder preservation, evaluator field isolation) to the optimizer.
- Handle manual follow-up if returned. Save the report as manual-followup-report.json, answer the model-facing prompt, persist optimized-prompt.md, and continue the loop.
- Run election if multiple candidates exist. Use the caller-supplied elector when optimization produces more than one defensible candidate.
- Publish iteration artifacts. Stage steering, candidate bundles, validation logs, and a decision summary under the active iteration directory.
- Write back only when validation passes. Confirm placeholder preservation and validation success before writing the winning candidate back to the source prompt file.
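The first workflow step, resolving the workspace from the target path, can be sketched as follows. This reads the naming rule literally, so a multi-part name such as summarize.prompt.md loses only its final .md. If references/prompt-loop-contract.md treats compound suffixes differently, that document takes precedence. The function names are illustrative.

```python
from pathlib import Path


def derive_prompt_name(target: Path) -> str:
    """Strip .prompty entirely, otherwise strip only the final extension."""
    name = target.name
    if name.endswith(".prompty"):
        return name[: -len(".prompty")]
    # Path.stem removes only the last suffix (literal reading of the rule).
    return target.stem


def workspace_root(target: Path) -> Path:
    """Return <target-dir>/.trainer-workspace/<prompt-name>/ for a target file."""
    return target.parent / ".trainer-workspace" / derive_prompt_name(target)


print(workspace_root(Path("prompts/chat.prompty")))
```

Keeping the workspace next to the target means the same derivation works when a run is resumed from a different working directory, provided the target path is resolved first.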
Blocker-first rule
Stop and report a clear blocker before any optimization or rewrite when:
- The workspace review artifact is absent.
- Required datasets or authored evals are missing.
- Tracked artifact pointers from a resumed run are missing or inconsistent.
- Train and validation splits imply different judge modes.
A blocker report must name the missing artifact, explain why the loop cannot advance, and leave workflow-status.json in a resumable checkpoint state.
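A blocker report of this shape could be recorded as shown below. The schema of workflow-status.json is not defined in this skill, so every key used here (state, resume_from, blockers) is a hypothetical placeholder, as are the class and function names.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class Blocker:
    missing_artifact: str  # which artifact is absent or inconsistent
    reason: str            # why the loop cannot advance without it


def record_blocker(status_path: Path, blocker: Blocker, resume_stage: str) -> dict:
    """Append a blocker and leave the status file in a resumable state."""
    if status_path.exists():
        status = json.loads(status_path.read_text(encoding="utf-8"))
    else:
        status = {}
    # Hypothetical keys: the real workflow-status.json schema may differ.
    status["state"] = "blocked"
    status["resume_from"] = resume_stage
    status.setdefault("blockers", []).append(asdict(blocker))
    status_path.write_text(json.dumps(status, indent=2) + "\n", encoding="utf-8")
    return status
```

Recording the stage to resume from alongside the blocker lets a later run pick up at the checkpoint instead of repeating completed stages.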
Output contract
Return:
- Workspace status and any active blockers.
- Current-turn decisions: reuse choice, judge mode, selected branch, blockers.
- Optimization or manual follow-up status with artifact paths.
- Placeholder preservation confirmation.
- Validation status.
- Write-back decision and justification.
- Next required action to resume or continue the loop.