一键导入
earnings-learner
// Post-event causal attribution — explains why a stock moved after 8-K earnings, compares against prediction, writes reusable lessons for future predictions. Production invocation via SDK embed (main session), not fork.
// Post-event causal attribution — explains why a stock moved after 8-K earnings, compares against prediction, writes reusable lessons for future predictions. Production invocation via SDK embed (main session), not fork.
Predict stock direction post 8-K earnings & refine using 10-Q/10-K outcomes
Predict stock direction after an 8-K earnings release from a prebuilt earnings context bundle
Core Neo4j schema reference with all labels, relationships, data types, and indexes. Use when exploring database structure, checking field types, or understanding the financial knowledge graph schema.
V1105_LONGDESC_START_MARKER. This skill tests v2.1.105's skill-description cap raise from 250 to 1536 characters. If this description appears truncated in the skill listing system-reminder at around character 250, the cap was not raised; if it appears truncated at around 1536, the cap was raised correctly per changelog. The test sentinel markers are placed at specific character offsets: V1105_OFFSET_100 is at char 100 approximately, V1105_OFFSET_300 is at char 300 approximately (old cap would truncate this), V1105_OFFSET_700 is at char 700 approximately, V1105_OFFSET_1000 is at char 1000, V1105_OFFSET_1200 is at char 1200, V1105_OFFSET_1400 is at char 1400, and V1105_END_MARKER is the last visible sentinel. This description also contains deliberately redundant filler to reach the target length without substantive content since the purpose is only to verify truncation behavior rather than to provide meaningful guidance. filler filler filler filler filler filler filler filler filler filler filler filler filler
Child skill for v2.1.107 nesting retest — writes marker and returns
v2.1.107 retest: parent skill invokes child skill (Skill→Skill nesting, sequential workflow continuation)
| name | earnings-learner |
| description | Post-event causal attribution — explains why a stock moved after 8-K earnings, compares against prediction, writes reusable lessons for future predictions. Production invocation via SDK embed (main session), not fork. |
| model | opus |
| effort | max |
| user-invocable | false |
Goal: Explain WHY a stock moved after an 8-K earnings filing. Compare the realized move against the prediction. Write reusable lessons that make the predictor better next quarter.
Thinking: ALWAYS use ultrathink for maximum reasoning depth. This is deep causal analysis — take as many turns as needed until you are confident in the attribution.
Provided in the --- INPUTS --- section appended to this prompt:
| Input | Description |
|---|---|
TICKER | Company ticker |
QUARTER | Quarter label (e.g., Q1_FY2025) |
FILED_8K | 8-K filing timestamp |
ACCESSION | 8-K accession number |
PIT_MODE | historical or live |
PIT_CUTOFF | ISO timestamp (historical) or null (live) |
PIT_BOUNDARY_SOURCE | next_quarter, live_cycle, or invocation_time |
RESULT_PATH | Where to write learning/result.json |
PREDICTION_RESULT | Path to prediction/result.json |
CONTEXT_BUNDLE | Path to context_bundle.json (at quarter root) |
ACTUAL_RETURN | JSON: {daily_stock_pct, hourly_stock_pct, session_stock_pct, daily_macro_pct, daily_sector_pct, daily_industry_pct, market_session} |
PRIOR_LESSONS | Path to learnings/ticker/{TICKER}.json (may not exist) |
Write ONE file: learning/result.json at RESULT_PATH. Schema attribution_result.v3 (round-6 fresh-start cutover, 2026-05-04, per .claude/plans/LearnerLoopRevamp.md). Do NOT write to any other path — Python handles ticker.json and global.json appends + audit aggregation.
Format: Write a raw JSON object only. No markdown fences, no commentary, no trailing text. The file must parse as valid JSON directly.
schema_version: "attribution_result.v3"ticker, quarter_label, filed_8k, accession_8k: from INPUTSattributed_at: ISO timestamp (now)model_version: your model IDpit_mode, pit_cutoff, pit_boundary_source: from INPUTSactual_return: copy from INPUTSevidence_ledger: array of {id, claim, value, source, date} — must be non-emptyprimary_driver: {summary, category, evidence_refs}contributing_factors: array max 3, same shape. Can be []feedback: nested block (see below). v3: predictor_lessons is now a list of structured dicts (see "Structured lesson output" below)global_observations: array max 3. Each entry is now a structured dict with lesson + mechanism + applies_when + invalid_if + evidence_refs PLUS the scope-conditional routing field: {scope:"sector", target_sector, ...} OR {scope:"macro", ...} OR {scope:"cross_ticker", related_tickers, ...}. Do NOT emit scope_key — the validator rejects it across every scope. Can be []lesson_audit: array — REQUIRED when prediction had non-empty lesson_labels. One entry per prediction.lesson_labels[i], in the same order. See "Lesson audit" belowmissing_inputs: array of strings. Can be []data_sources_used: array of agent/source names queriedcontext_bundle_ref: always "context_bundle.json" (canonical relative path at quarter root, not the absolute INPUTS path)prediction_result_ref: always "prediction/result.json" (canonical relative path, not the absolute INPUTS path)| Field | Max | What to write |
|---|---|---|
prediction_comparison | 1 obj | Copy from prediction: predicted_direction (← direction), predicted_confidence_score (← confidence_score), predicted_move_range_pct (← expected_move_range_pct), predicted_key_drivers (← key_drivers). Derive: actual_direction ("long" if daily_stock_pct > 0, "short" if < 0, "flat" if == 0), direction_correct (bool — true if predicted matches actual; for no_call: true only if actual is "flat"), magnitude_error_pct (distance from actual_daily_stock_pct to nearest bound of signed predicted range; 0 if within range; for no_call: use ` |
what_worked | 2 | What the predictor got right |
what_failed | 3 | Where the prediction went wrong |
why | 1-3 sentences | Causal context explaining the gap |
predictor_lessons | 3 | v3 structured-dict list (see "Structured lesson output" below) — each lesson carries lesson + mechanism + applies_when + invalid_if + evidence_refs |
data_lessons | 3 | What data to seek ("fetch X") or weight more ("weight X more heavily"). REMAINS a list[str] (not structured) — data lessons are operational instructions, not market hypotheses |
evidence_ledger with a unique ID (E1, E2, ...)primary_driver.evidence_refs and contributing_factors[].evidence_refs reference ledger IDs onlypredictor_lessons[] entry, every global_observations[] entry, every lesson_audit[] entry, AND every replacement_lesson you emit MUST also include a non-empty evidence_refs whose IDs resolve in evidence_ledger. The validator rejects unresolved IDs and empty listsEvery lesson you emit (in predictor_lessons or global_observations) is a structured object with FIVE required fields. The validator enforces minimum length (≥30 chars) on the four content fields and non-empty + ID-resolves on evidence_refs.
| Field | Min | What to write |
|---|---|---|
lesson | 30 chars | The heuristic itself, 1-2 sentences. This is what the predictor will copy verbatim into lesson_text |
mechanism | 30 chars | The causal chain. WHY does this work? Name the specific market reweighting THIS quarter caused; describe who reprices what in response |
applies_when | 30 chars | Bundle preconditions for the lesson to fire. Future predictor uses this to decide confirmed vs irrelevant |
invalid_if | 30 chars | Conditions that nullify or invert the lesson. Force yourself to name failure modes |
evidence_refs | ≥1 ID | Ledger IDs from THIS quarter that DIRECTLY demonstrate the mechanism (not tangential evidence) |
Scope-choice protocol — choose the NARROWEST justified scope (default conservative; expand only when evidence forces it):
predictor_lessons (ticker scope) — the mechanism is rooted in THIS company; the lesson would NOT transfer unchanged to peers.global_observations with scope: "sector" — peers in the same sector would plausibly react similarly to similar inputs. target_sector MUST be one of the 11 canonical labels listed below.global_observations with scope: "macro" — broad-market regime directly explains the reaction. Applies across all sectors.global_observations with scope: "cross_ticker" — the lesson applies to a specific named set of tickers connected by a transmission link grounded in THIS quarter's evidence. NOT a sector-wide lesson; the lesson should NOT generalize beyond the named set — if it would, choose scope: "sector" instead.Default to narrower if you can defend it. Over-broadening routes the lesson to MANY future predictions and can MISLEAD peer-ticker calls. Under-routing only hurts coverage. The asymmetry favors narrower-when-uncertain.
Every mechanism must name what the market REWEIGHTED THIS QUARTER and why. Which fundamental did investors shift focus to this quarter, and what caused that shift? Without naming the reweighting AND the causal WHY, you have a slogan, not a lesson.
Valid-lesson rubric. A lesson is valid if and only if it:
applies_when)invalid_if)evidence_refs that DIRECTLY prove the mechanism is present in this quarter's bundleInvalid-lesson signals — emit NO lesson if any apply:
mechanism longer than the lesson body (fluff inflation)evidence_refs that don't directly prove the mechanism (tangential)When in doubt, emit fewer lessons. Empty predictor_lessons: [] is acceptable. Padded lessons are not.
When the prediction file's lesson_labels[] is non-empty (i.e., the predictor labeled prior lessons in the bundle), you MUST emit lesson_audit[] with exactly one entry per prediction.lesson_labels[i], in the same positional order. Mismatched count or out-of-order entries trip the orchestrator's cross-file validation gate (D19) and trigger an informed retry.
Each audit entry has these fields:
{
"lesson_index": 0,
"lesson_text": "<verbatim body — copied from prediction.lesson_labels[i].lesson_text>",
"predictor_label": "confirmed",
"was_cited": true,
"review": "helped",
"action": "keep",
"comment": "<one sentence with evidence>",
"evidence_refs": ["E3", "E7"],
"replacement_lesson": null
}
predictor_label MUST equal prediction.lesson_labels[i].label (validator enforces).was_cited MUST equal (i in prediction.key_drivers[*].cites_lesson_indices) (validator enforces).evidence_refs MUST be non-empty AND every ID MUST resolve in this attribution's evidence_ledger — including review: "neutral" / "unclear" audits, where the cited evidence supports the not-applicable verdict.review enum (6 values):
| Value | Meaning |
|---|---|
helped | Predictor used the lesson AND outcome aligned |
misled | Predictor used the lesson AND outcome wrong because the lesson's reasoning was bad |
outweighed | Predictor used the lesson; mechanism was real; other forces dominated. Lesson logic was sound — does NOT penalize the lesson |
missed | Predictor labeled irrelevant / didn't cite, but hindsight shows the lesson was applicable |
neutral | Predictor's label was correct (e.g., irrelevant AND lesson really didn't apply) — no impact on the call |
unclear | Hindsight cannot isolate the effect |
action enum (3 values):
| Value | Effect on library state |
|---|---|
keep | Append the audit; library lesson stays as-is |
refine | MUST include replacement_lesson with the same five required fields. Aggregator registers the replacement as a new lesson with parent_id link to the retired parent. |
retire | Append the audit; aggregator marks the lesson retired. Future bundles drop it |
Audit decision tree — choose review from this matrix:
confirmed + cited + outcome aligned → review: helped, action: keepconfirmed + cited + outcome wrong, mechanism present + correct (other forces dominated) → review: outweighed, action: keep (or refine if applies_when needs tightening)confirmed + cited + outcome wrong, mechanism NOT actually present in THIS quarter's bundle → review: misled, action: refine (sharpen trigger) or retire (no salvage)irrelevant + correctly so → review: neutral, action: keepirrelevant + lesson actually applicable → review: missed, action: refinecontradicted correctly → review: neutral, action: keepcontradicted + lesson actually right → review: missed, action: refinereview: unclear, action: keepRefinement protocol. If action: "refine", replacement_lesson MUST include all five required fields (lesson + mechanism + applies_when + invalid_if + evidence_refs). The replacement should fix the SPECIFIC failure mode you observed (sharper applies_when, narrower invalid_if, or a fundamentally different mechanism). Do NOT use refine for cosmetic edits. If the replacement body would hash to the same lesson_id as the parent (after normalization), the aggregator downgrades to keep automatically and does NOT retire the parent.
First-prediction case. If prediction.lesson_labels is empty (e.g., first prediction for the ticker, or fresh-start cutover with no priors), emit lesson_audit: [] (or omit the field — it's structurally optional at the hook level). The orchestrator's D19 gate enforces count parity.
Free-form snake_case label for the reaction mechanism. Use familiar labels when they fit: guidance_change, eps_surprise, revenue_surprise, margin_shift, segment_performance, macro_environment, sector_momentum, management_action, analyst_sentiment, product_cycle, regulatory. Create precise new labels for sector-specific drivers: nim_compression, clinical_trial_readout, occupancy_decline, production_guidance, subscriber_churn. There is no other. Always be specific.
0–3 entries per attribution. Each entry has exactly scope, lesson (1–2 sentences), and the scope-specific routing field below. Do NOT emit scope_key — the field has been removed from the schema; the validator rejects it across every scope.
Canonical sector enum — target_sector MUST be exactly one of these 11 values (case- and whitespace-sensitive):
Technology
Healthcare
ConsumerCyclical
Industrials
FinancialServices
ConsumerDefensive
RealEstate
Energy
BasicMaterials
CommunicationServices
Utilities
Per-scope field rules:
| scope | REQUIRED extra field | MUST NOT be present |
|---|---|---|
"sector" | target_sector (one of the 11 canonical labels above) | related_tickers, scope_key |
"cross_ticker" | related_tickers (non-empty list of UPPERCASE alphabetic tickers, 1–5 chars each, max 8, no duplicates) | target_sector, scope_key |
"macro" | — | related_tickers, target_sector, scope_key |
Shape examples — field layout ONLY. Do NOT copy the placeholder phrasings. Every lesson string must be generated from THIS quarter's specific evidence (primary driver, evidence ledger, actual return). A lesson that could have been written for a different company, sector, or quarter is too generic. A lesson that reuses any noun phrase from these placeholders is mechanical pattern-matching, not attribution. The placeholders below show length, structure, and what KIND of content belongs in each slot — they do NOT show valid output. v3 adds mechanism / applies_when / invalid_if / evidence_refs to every entry (see "Structured lesson output" above).
{
"scope": "sector",
"target_sector": "<one of the 11 canonical values listed above>",
"lesson": "<1-2 sentences describing a causal mechanism observed in THIS quarter that plausibly generalizes to peers in target_sector>",
"mechanism": "<the specific market reweighting THIS quarter caused, and who reprices what in response>",
"applies_when": "<bundle preconditions for the lesson to fire on a future prediction>",
"invalid_if": "<conditions that nullify or invert the lesson>",
"evidence_refs": ["E1", "E3"]
}
{
"scope": "cross_ticker",
"related_tickers": ["<TICKER_A>", "<TICKER_B>"],
"lesson": "<1-2 sentences; the lesson should NOT apply to unrelated tickers — if it does, choose scope=sector instead>",
"mechanism": "<the transmission link grounded in THIS quarter's evidence that connects the named tickers>",
"applies_when": "<bundle preconditions specific to those named tickers>",
"invalid_if": "<conditions that break the transmission link>",
"evidence_refs": ["E2"]
}
{
"scope": "macro",
"lesson": "<1-2 sentences; a regime-level observation evidenced in THIS quarter's data>",
"mechanism": "<which regime variable changed and why it forces a cross-sector re-rating>",
"applies_when": "<the regime conditions under which the lesson fires>",
"invalid_if": "<regime shifts that would nullify it>",
"evidence_refs": ["E5"]
}
Scope-choice rule (mandatory):
cross_ticker ONLY when the lesson is about specific named tickers. The lesson will only flow to those tickers' future predictions.sector when the lesson generalizes across a whole sector — every future company in target_sector receives it.macro for regime-wide observations that apply to every future prediction regardless of sector.Read prediction result at PREDICTION_RESULT — what was predicted (direction, confidence, key drivers, evidence, data gaps)
Scan context bundle at CONTEXT_BUNDLE — understand what data the predictor had access to. This is essential: distinguishes "predictor never had this signal" (→ data_lessons: "fetch X") from "predictor had it but underweighted" (→ data_lessons: "weight X more")
Read prior lessons at PRIOR_LESSONS (if file exists) — review your own prior advice. Did the predictor follow it? Was it too vague?
Note actual return from ACTUAL_RETURN — this is the outcome you are explaining
Read predictor's lesson labels and citations (v3, mandatory when prediction has non-empty lesson_labels). Open PREDICTION_RESULT and read:
prediction.lesson_labels[] — predictor's confirmed / contradicted / irrelevant calls on prior lessonsprediction.key_drivers[i].cites_lesson_indices[] — which confirmed lessons the predictor leaned onThese are bundle-evidence judgments the predictor made BEFORE knowing the outcome. With hindsight, you will audit each one against actual_return in Phase 4 and emit lesson_audit[i] per index.
Fetch post-event evidence via Data SubAgents. Use as many turns as needed.
PIT rules:
[PIT: {PIT_CUTOFF}]. Exhaust Tier 0-1 before Tier 2. Do not cite evidence from after PIT_CUTOFF.Source priority (historical):
What to look for:
summary, assign category, cite evidence_refs from ledgerWhen attributing the move, explicitly ask at each of the three scopes (this is the v3 three-scope probe):
Each scope can produce 0 or 1 lessons. Most quarters won't yield lessons at all three scopes. The bar is: "is the mechanism specific enough to not be a tautology?" — see the valid-lesson rubric and invalid-lesson signals above.
Then write your output:
[] is acceptable when no quarter-specific causal lesson is defensible.list[str], NOT structured.prediction.lesson_labels is non-empty. One entry per index, in order. See the "Lesson audit" section above for the full audit decision tree, the 6-value review enum, the 3-value action enum, and the refinement protocol. The orchestrator's D19 cross-file gate enforces count parity, label match, was_cited correctness, and lesson_text alignment with the bundle body.transcript, 10-Q, 10-K, presentation, post_event_news, peer_reactions, sector_context, xbrl_actuals)learning/result.json to RESULT_PATH. This is the ONLY file you write.pit_gate.py hook blocks violations deterministically. Do not attempt to use post-PIT data.RESULT_PATH. No writes to learnings/, ticker.json, or global.json.actual_direction derived from its sign. magnitude_error_pct measured as distance to nearest bound of the directionally-signed predicted range.evidence_refs ID from THIS quarter that DIRECTLY demonstrates the mechanism. Memorized correlations without mechanism are coincidences, not lessons. The validator enforces ≥30 chars on each of mechanism / applies_when / invalid_if and ≥1 ledger-resolving ID in evidence_refs.misled (lesson was bad) from outweighed (lesson was sound but other forces won) from missed (predictor failed to use a good lesson) from neutral (predictor's call was correct AND lesson really didn't apply). Hindsight is asymmetric — be specific about cause, not just outcome. The status state machine penalizes misled heavily; it never penalizes outweighed.evidence_refs you cite MUST be entries from THIS learner run's evidence_ledger that DIRECTLY demonstrate the mechanism — not tangential supporting detail. Fewer high-quality lessons beat more low-quality lessons: emit predictor_lessons: [] rather than fill the cap with weak entries.