Appends structured decision entries to the yahoo-mlb decision log (tracker/decisions-log.md) on behalf of any agent in the MLB team. Validates entries against the authoritative schema, serializes concurrent writes from parallel agents, and runs the Monday calibration pass to fill in outcomes and update the variant scoreboard. Use when any MLB agent needs to record a decision, when the coach requests "log decision", "append to decision log", "record variant outcome", or runs the "calibration pass".
Scenario: Three agents fire in a single morning brief on Friday 2026-04-17. Each agent hands a decision payload to this skill. On Monday 2026-04-20, the coach runs the calibration pass over Friday's lineup decision.
Sequence:
mlb-lineup-optimizer submits a start/sit decision for Junior Caminero.
mlb-waiver-analyst submits a FAAB bid on a two-start streamer.
mlb-streaming-strategist submits a stream pickup.
Skill behavior:
Reads the current tail of tracker/decisions-log.md.
Auto-assigns decision_id by counting same-day, same-type entries and incrementing (2026-04-17-lineup-01, 2026-04-17-waiver-01, 2026-04-17-stream-01).
Validates each payload against the schema in context/frameworks/decision-log-format.md.
Appends entries in timestamp order, one at a time, re-reading the tail between writes.
Returns the assigned decision_id to the calling agent.
Monday calibration (2026-04-20):
Reads all entries where will_verify_on <= 2026-04-20 and outcome_recorded_on is empty.
For each entry, the agent web-searches the outcome and passes the result to this skill.
Skill edits the target entry in place (only the three outcome fields) and increments the matching row in tracker/variant-scoreboard.md.
Recomputes tilt per agent using the advocate/critic counts over the most recent 20 verified decisions.
After the Monday calibration, the entry's last three fields are filled in. The scoreboard row for mlb-lineup-optimizer increments Total decisions, Advocate correct (if a 1-for-4 line with an RBI is counted as a success), and Synthesis correct, and Tilt is then recomputed.
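The tilt recomputation can be sketched as follows. The 20-decision window and the 10-decision minimum come from this document; the `margin` parameter and its default are placeholders, since the exact thresholds live in resources/methodology.md and are not reproduced here. The function name and the input shape (a list of variant_that_was_right values, oldest first) are also assumptions.

```python
def compute_tilt(recent_results, total_decisions, window=20, min_sample=10, margin=3):
    """Return 'advocate+', 'critic+', or 'neutral' for one agent.

    recent_results: variant_that_was_right values for verified decisions,
    oldest first. margin=3 is a placeholder, not the documented threshold.
    """
    # Too few decisions overall: always report neutral.
    if total_decisions < min_sample:
        return "neutral"
    recent = recent_results[-window:]
    # 'both' counts toward each side; 'neither' counts toward none.
    advocate = sum(r in ("advocate", "both") for r in recent)
    critic = sum(r in ("critic", "both") for r in recent)
    if advocate - critic >= margin:
        return "advocate+"
    if critic - advocate >= margin:
        return "critic+"
    return "neutral"
```

Counting `both` toward each side mirrors the scoreboard rule, so the tilt reflects the same tallies the scoreboard shows.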
Workflow
Copy this checklist and track progress for each invocation:
Check every required field. Reject (do not write) if any are missing or malformed. The authoritative schema is in context/frameworks/decision-log-format.md; resources/template.md mirrors it verbatim.
timestamp_iso8601 is UTC ISO 8601 (YYYY-MM-DDTHH:MM:SSZ).
recommendation starts with an action verb (START, SIT, ADD, DROP, BID $X, ACCEPT, COUNTER, REJECT, STREAM, HOLD).
signals_in references at least one signal by name.
variants has both advocate and critic entries (or explicit n/a for bootstrap).
confidence in [0.00, 1.00].
will_verify_on is a date, end of week N, or end of season.
For calibrate mode: outcome in {happened, did not happen, partial}, variant_that_was_right in {advocate, critic, both, neither}.
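A minimal sketch of these checks, assuming the payload arrives as a dict keyed by the field names above. The function names, the shape of `variants` (a mapping with advocate and critic keys, or the string n/a), and the shape of `signals_in` (a list of names) are illustrative assumptions; the authoritative schema remains context/frameworks/decision-log-format.md.

```python
import re
from datetime import datetime

# Action verbs listed in the checklist; BID must be followed by a dollar amount.
REC_RE = re.compile(
    r"(START|SIT|ADD|DROP|BID \$\d+(\.\d+)?|ACCEPT|COUNTER|REJECT|STREAM|HOLD)\b"
)
VERIFY_RE = re.compile(r"\d{4}-\d{2}-\d{2}|end of week \d+|end of season")
OUTCOMES = {"happened", "did not happen", "partial"}
VARIANT_RESULTS = {"advocate", "critic", "both", "neither"}


def validate_append(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry may be written."""
    errors = []
    try:
        datetime.strptime(str(payload.get("timestamp_iso8601", "")), "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        errors.append("timestamp_iso8601: expected UTC YYYY-MM-DDTHH:MM:SSZ")
    if not REC_RE.match(str(payload.get("recommendation", ""))):
        errors.append("recommendation: must start with an action verb")
    signals = payload.get("signals_in")
    if not (isinstance(signals, list) and signals):
        errors.append("signals_in: at least one named signal required")
    variants = payload.get("variants")
    has_pair = isinstance(variants, dict) and {"advocate", "critic"} <= variants.keys()
    if variants != "n/a" and not has_pair:
        errors.append("variants: need advocate and critic, or explicit n/a")
    conf = payload.get("confidence")
    is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
    if not (is_number and 0.0 <= conf <= 1.0):
        errors.append("confidence: must be in [0.00, 1.00]")
    if not VERIFY_RE.fullmatch(str(payload.get("will_verify_on", ""))):
        errors.append("will_verify_on: date, 'end of week N', or 'end of season'")
    return errors


def validate_calibrate(payload: dict) -> list[str]:
    """Check the two enumerated outcome fields used in calibrate mode."""
    errors = []
    if payload.get("outcome") not in OUTCOMES:
        errors.append("outcome: must be happened, did not happen, or partial")
    if payload.get("variant_that_was_right") not in VARIANT_RESULTS:
        errors.append("variant_that_was_right: must be advocate, critic, both, or neither")
    return errors
```

Returning every problem at once, rather than stopping at the first, lets the skill reject with a single structured error and write nothing.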
Step 4: Serialize write
append: assign decision_id using format {YYYY-MM-DD}-{decision_type}-{NN} where NN is last_same_type_same_day_index + 1, zero-padded to 2 digits. Append the formatted entry plus a trailing separator line.
calibrate: locate the target entry by decision_id, replace only the three outcome fields, preserve everything else byte-for-byte.
Never overwrite the full file. Never reorder entries. Never edit fields other than the three outcome fields.
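The decision_id rule for append mode can be sketched as below. The helper name is hypothetical, and it assumes that ids appear verbatim in the log text, which must be freshly re-read immediately before the call.

```python
import re


def next_decision_id(log_text: str, day: str, decision_type: str) -> str:
    """Build {YYYY-MM-DD}-{decision_type}-{NN} from the ids already in the log."""
    pattern = re.compile(
        rf"\b{re.escape(day)}-{re.escape(decision_type)}-(\d{{2}})\b"
    )
    # Highest index already used for this day and type; 0 if none yet.
    last_index = max((int(m.group(1)) for m in pattern.finditer(log_text)), default=0)
    next_index = last_index + 1
    return f"{day}-{decision_type}-{next_index:02d}"


log_tail = "decision_id: 2026-04-17-lineup-01\ndecision_id: 2026-04-17-waiver-01\n"
print(next_decision_id(log_tail, "2026-04-17", "lineup"))  # 2026-04-17-lineup-02
print(next_decision_id(log_tail, "2026-04-17", "stream"))  # 2026-04-17-stream-01
```

Taking the maximum existing index, rather than counting matches, keeps ids unique even if an id is mentioned more than once in the log.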
Increment Advocate correct, Critic correct, or both (for variant_that_was_right = both). Increment Synthesis correct when the synthesized recommendation matched reality.
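As an illustration of the increment rule, the sketch below models one scoreboard row in memory. The `ScoreboardRow` class and `apply_outcome` function are hypothetical names; the real row lives in tracker/variant-scoreboard.md and its on-disk layout is not assumed here.

```python
from dataclasses import dataclass

VARIANT_RESULTS = {"advocate", "critic", "both", "neither"}


@dataclass
class ScoreboardRow:
    total_decisions: int = 0
    advocate_correct: int = 0
    critic_correct: int = 0
    synthesis_correct: int = 0


def apply_outcome(row: ScoreboardRow, variant_that_was_right: str,
                  synthesis_matched: bool) -> ScoreboardRow:
    """Apply one verified decision to an agent's scoreboard row."""
    if variant_that_was_right not in VARIANT_RESULTS:
        raise ValueError(f"unknown variant result: {variant_that_was_right!r}")
    row.total_decisions += 1
    # 'both' increments both columns; 'neither' increments none.
    if variant_that_was_right in ("advocate", "both"):
        row.advocate_correct += 1
    if variant_that_was_right in ("critic", "both"):
        row.critic_correct += 1
    if synthesis_matched:
        row.synthesis_correct += 1
    return row
```

The caller is expected to invoke this once per decision_id; a retried calibration call should skip it when the entry already has outcome_recorded_on populated.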
Coach launches lineup-optimizer, waiver-analyst, and streaming-strategist in one run. Each completes its variant pair, synthesizes, and calls this skill. The skill processes the calls serially in arrival order, re-reading the log tail before each write and assigning each entry the next unused decision_id for its type.
Pattern 2: Monday calibration pass
Coach opens the log, filters for entries where will_verify_on <= today and outcome_recorded_on is empty, and for each one: web-searches the outcome, builds the calibration payload, calls this skill in calibrate mode. Each call edits one entry and updates one scoreboard row.
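The filter in this pattern might look like the sketch below, assuming entries have already been parsed into dicts. The `symbolic_dates` mapping is a hypothetical way to resolve values such as end of week N to calendar dates; entries whose verify date cannot be resolved simply stay open.

```python
from datetime import date


def due_for_calibration(entries, today, symbolic_dates=None):
    """Return entries with will_verify_on <= today and no recorded outcome."""
    symbolic_dates = symbolic_dates or {}
    due = []
    for entry in entries:
        if entry.get("outcome_recorded_on"):
            continue  # already calibrated
        raw = entry["will_verify_on"]
        try:
            verify_date = date.fromisoformat(raw)
        except ValueError:
            # Not a calendar date, e.g. "end of season": look it up if possible.
            verify_date = symbolic_dates.get(raw)
            if verify_date is None:
                continue
        if verify_date <= today:
            due.append(entry)
    return due
```

Each returned entry then gets its own calibrate call, so one call still edits one entry and updates one scoreboard row.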
mlb-trade-analyzer fires when a trade offer arrives. The decision is REJECT, ACCEPT, or COUNTER. will_verify_on is typically end of week N or end of season because trade value plays out over weeks. The skill logs the entry normally; the trade stays "open" in the calibration queue until its verify date.
Pattern 4: Bootstrap / ad-hoc entry
For team initialization or meta-decisions (changing a weighting, amending a framework), set decision_type = ad-hoc or bootstrap, variants = n/a, and variant_that_was_right = neither. The skill accepts this as a special shape.
Guardrails
Append only. Never rewrite the log. Never reorder entries. The only edit allowed is filling in the three outcome fields on an existing entry during calibration. Any other mutation is a bug.
Re-read before every write. Parallel agents share one file. Reading the tail immediately before writing is the serialization primitive. Do not cache the tail from earlier in the run.
Validate before writing. A malformed entry poisons downstream calibration. Reject with a structured error; do not attempt partial writes or auto-fix missing fields.
decision_id is deterministic. Same day, same type, sequential NN. If another agent has already used 2026-04-17-lineup-01, this agent gets 02. Never reuse an id.
Outcome-editing is surgical. When filling outcome fields, replace only the three target lines. Preserve timestamps, decision_id, recommendation, variants, synthesis, red-team, confidence, and will_verify_on byte-for-byte. If you have to reformat to write, you are doing it wrong.
Scoreboard updates are idempotent per decision. Each decision_id contributes exactly one row increment. If a calibration call is retried, re-check whether the entry already has outcome_recorded_on populated; if so, do not double-count.
Tilt needs sample size. Report neutral when Total decisions < 10 for the agent. Only switch to advocate+ / critic+ once both sample size and margin are met. See resources/methodology.md for exact thresholds.
Agents never write to the log directly. Every MLB agent routes through this skill. If the coach finds a malformed entry not produced here, treat it as a bug and log it under tracker/calibration-review.md.
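The surgical-edit and idempotency guardrails can be illustrated together. This sketch assumes each entry is stored as `key: value` lines beginning with a `decision_id:` line and ending at a `---` separator; that layout is an assumption for illustration, not the documented format. It also assumes the three outcome fields are outcome, variant_that_was_right, and outcome_recorded_on. The function name is hypothetical.

```python
OUTCOME_FIELDS = ("outcome", "variant_that_was_right", "outcome_recorded_on")


def fill_outcome(log_text: str, decision_id: str, values: dict):
    """Replace only the three outcome lines of one entry.

    Returns (new_text, changed). changed is False when the entry was already
    calibrated, which tells the caller to skip the scoreboard increment.
    """
    if set(values) != set(OUTCOME_FIELDS):
        raise ValueError("values must contain exactly the three outcome fields")
    lines = log_text.splitlines(keepends=True)
    target = f"decision_id: {decision_id}"
    start = next((i for i, ln in enumerate(lines) if ln.strip() == target), None)
    if start is None:
        raise KeyError(decision_id)
    end = start + 1
    while end < len(lines) and lines[end].strip() != "---":
        end += 1  # assumed separator line closes the entry

    # Idempotency: a populated outcome_recorded_on means this is a retry.
    for ln in lines[start:end]:
        key, _, value = ln.partition(":")
        if key.strip() == "outcome_recorded_on" and value.strip():
            return log_text, False

    replaced = set()
    for i in range(start, end):
        key, sep, _ = lines[i].partition(":")
        name = key.strip()
        if sep and name in values:
            ending = lines[i][len(lines[i].rstrip("\r\n")):]  # keep original newline
            lines[i] = f"{key}: {values[name]}{ending}"
            replaced.add(name)
    if replaced != set(OUTCOME_FIELDS):
        raise ValueError("entry is missing one or more outcome fields")
    return "".join(lines), True
```

Every line outside the three targets is passed through untouched, so timestamps, the recommendation, and neighboring entries are preserved byte-for-byte.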