novatrade-parity-check

Verify that NovaTrade live FTMO trades match the IRB v5 backtest for a given window. Pulls live deals from MetaApi, runs the same window through the backtest engine, runs the parity diff, then classifies each divergence as uptime-gap / bar-feed / price-drift / strategy-drift / real-mismatch and reports a headline. Invoke on 'parity check', 'verify this week's trades', 'are live trades matching the backtest', 'live vs backtest for [date range]', 'did we trade what we should have', or any time the operator wants to know if live execution matches the validated baseline.

Install command

npx skills add https://github.com/demarcushill20/Nova-Core --skill novatrade-parity-check

Copy and paste this command into Claude Code to install the skill

Source

demarcushill20/Nova-Core

Updated April 26, 2026 at 13:06

NovaTrade Live vs Backtest Parity Check

Overview

Answer a single question: for a given date window, do the live FTMO trades match what the IRB v5 backtest would have produced on the same bars? "Match" here is not strict equality — bar-feed jitter and execution slippage make pip-perfect agreement impossible. The skill grades divergences against known failure modes so the operator immediately sees whether a discrepancy is a real strategy bug or a known-and-classified artifact.

Announce at start: "I'm using the novatrade-parity-check skill to verify live vs backtest for <window>."

When to use

- Operator asks to "verify this week's trades", "run a parity check", "compare live and backtest", "are live trades matching the backtest"
- After a multi-day live run, before claiming the live engine is faithfully executing the validated strategy
- After a code change to novatrade/strategy/live_engine.py, signal generation, or the backtest engine, to confirm parity didn't regress
- Before any decision that depends on live trades being a faithful proxy for the backtest (sizing changes, risk-budget reallocation, FTMO challenge progression)

Don't use for single-trade post-mortems (read the journal directly), or when the live engine wasn't running at all in the window (there's nothing to compare).

The workflow

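The workflow has five steps. Step 2 runs three deterministic command-line tools (live fetch, backtest, parity diff), which are bundled into scripts/run_parity_week.sh so the model doesn't have to remember argument order or stamp filenames. Steps 3 to 5 are the interpretation: read the counters, classify the divergences, write the report.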

Step 1: Pick the window

Default: the current trading week (Monday 00:00 UTC → now), which is also what "this week" means. The operator may instead specify another window such as "last 3 days", "Apr 20–24", or a single date.

Convert to two pieces:

- --days N for the live fetcher (covers now − N days back through now)
- --start YYYY-MM-DD --end YYYY-MM-DD for the backtest (end is exclusive)

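Forex markets are closed over the weekend (roughly Friday evening to Sunday evening, New York time), so a window that includes a weekend will have backtest trades almost entirely on weekdays. That's fine; don't be confused if Saturday produces zero rows and Sunday produces zero or only a few late-evening UTC rows.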
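The conversion above can be sketched in Python. This is a minimal illustration, not part of the skill: the function names are hypothetical, and it assumes the live fetcher's --days counts whole days back from now, so the value is rounded up to cover the window start.

```python
import math
from datetime import date, datetime, timedelta, timezone


def _window(start_day: date, end_exclusive: date, now: datetime) -> dict:
    """Build the three wrapper arguments for a window (hypothetical helper)."""
    start_dt = datetime(start_day.year, start_day.month, start_day.day,
                        tzinfo=timezone.utc)
    # --days looks back from "now", so round up to reach the window start.
    days = max(1, math.ceil((now - start_dt).total_seconds() / 86400))
    return {
        "start": start_day.isoformat(),
        "end": end_exclusive.isoformat(),  # exclusive, as the backtest expects
        "days": days,
    }


def current_week_window(now: datetime) -> dict:
    """Default window: Monday 00:00 UTC of the current week through now."""
    now = now.astimezone(timezone.utc)
    monday = now.date() - timedelta(days=now.weekday())
    # End is exclusive, so use tomorrow's date to include today.
    return _window(monday, now.date() + timedelta(days=1), now)


def explicit_window(first_day: date, last_day: date, now: datetime) -> dict:
    """Operator-specified inclusive range such as "Apr 20-24"."""
    now = now.astimezone(timezone.utc)
    return _window(first_day, last_day + timedelta(days=1), now)
```

For a run on Sunday 2026-04-26, `current_week_window` yields the same `--start 2026-04-20 --end 2026-04-27 --days 7` used in the Step 2 example. For an explicit range that ends before today, `--days` still reaches back from now, so the live fetch covers more than the window.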

Step 2: Run the wrapper

bash .claude/skills/novatrade-parity-check/scripts/run_parity_week.sh \
    --start 2026-04-20 \
    --end 2026-04-27 \
    --days 7

The wrapper runs three things in sequence and prints the three output paths:

1. python3 scripts/fetch_ftmo_deal_history.py --days <N> --env-file OUTPUT/novatrade.env.updated → OUTPUT/parity/live_trades_<stamp>.csv
2. python3 scripts/backtest_window_for_parity.py --start <S> --end <E> --warmup-days 2 --env-file OUTPUT/novatrade.env.updated → OUTPUT/parity/backtest_trades_<stamp>.csv
3. python3 scripts/parity_report.py --live <live_csv> --backtest <bt_csv> --window-start <S> --window-end <E> --tolerance-min 15 → OUTPUT/parity/parity_report_<stamp>.csv

If any step fails, stop and surface the error to the operator — don't try to fabricate results from partial data.
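The fail-fast sequencing can be sketched in Python as follows. This is an illustration of the behaviour described above, not the contents of run_parity_week.sh. The helper names are hypothetical; the flags are only the ones listed above, and the CSV paths for the third command are passed in by the caller because the source does not say how the wrapper derives `<stamp>`.

```python
import subprocess

ENV_FILE = "OUTPUT/novatrade.env.updated"


def fetch_cmd(days: int) -> list[str]:
    return ["python3", "scripts/fetch_ftmo_deal_history.py",
            "--days", str(days), "--env-file", ENV_FILE]


def backtest_cmd(start: str, end: str) -> list[str]:
    return ["python3", "scripts/backtest_window_for_parity.py",
            "--start", start, "--end", end,
            "--warmup-days", "2", "--env-file", ENV_FILE]


def parity_cmd(live_csv: str, bt_csv: str, start: str, end: str) -> list[str]:
    return ["python3", "scripts/parity_report.py",
            "--live", live_csv, "--backtest", bt_csv,
            "--window-start", start, "--window-end", end,
            "--tolerance-min", "15"]


def run_in_sequence(commands: list[list[str]]) -> int:
    """Run commands in order and return how many completed.

    check=True raises subprocess.CalledProcessError on the first non-zero
    exit, so later steps never run on partial data.
    """
    completed = 0
    for cmd in commands:
        subprocess.run(cmd, check=True)
        completed += 1
    return completed
```

Letting the exception propagate is deliberate: the caller sees which command failed and with what exit code, which is the error to surface to the operator.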

Step 3: Read the headline counters

The parity script prints a JSON summary like:

{ "live_count": 21, "backtest_count": 32, "matched": 10,
  "outcome_match": 9, "outcome_flip": 1,
  "extra_in_live": 11, "missing_in_live": 22,
  "outcome_match_rate": 0.9 }

Capture these numbers. They're the headline for the report.
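Before quoting the counters, a quick consistency check can catch a truncated or corrupted summary. The sketch below assumes the identities that hold in the example above (matched pairs split into match and flip; every live entry is either matched or extra; every backtest entry is either matched or missing). These identities are inferred from the example, not confirmed against parity_report.py, and `check_summary` is a hypothetical helper.

```python
import json


def check_summary(summary: dict) -> list[str]:
    """Return a list of inconsistencies found in the headline counters."""
    problems = []
    if summary["outcome_match"] + summary["outcome_flip"] != summary["matched"]:
        problems.append("outcome_match + outcome_flip != matched")
    if summary["matched"] + summary["extra_in_live"] != summary["live_count"]:
        problems.append("matched + extra_in_live != live_count")
    if summary["matched"] + summary["missing_in_live"] != summary["backtest_count"]:
        problems.append("matched + missing_in_live != backtest_count")
    if summary["matched"] > 0:
        expected_rate = summary["outcome_match"] / summary["matched"]
        # Allow for rounding in the printed rate (tolerance is an assumption).
        if abs(summary["outcome_match_rate"] - expected_rate) > 0.005:
            problems.append("outcome_match_rate != outcome_match / matched")
    return problems


example = json.loads("""
{ "live_count": 21, "backtest_count": 32, "matched": 10,
  "outcome_match": 9, "outcome_flip": 1,
  "extra_in_live": 11, "missing_in_live": 22,
  "outcome_match_rate": 0.9 }
""")
print(check_summary(example))
```

An empty list means the counters agree with each other; it says nothing about whether the divergences themselves are benign, which is what Step 4 is for.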

Step 4: Classify each divergence row

Open parity_report_<stamp>.csv. Every row has a status. Assign one of the categories below to each non-MATCH row, and to any MATCH row whose price drift exceeds the Category C threshold. The categories are not arbitrary buckets: they map to known failure modes documented in OUTPUT/parity/bar_divergence_analysis_*.md and our incident history. The goal is to separate "this is the system as-known" from "this is a new bug".

Category A: Uptime gap (live engine was offline)

Signal: A run of MISSING_IN_LIVE rows — typically 3+ entries within a single calendar day or across consecutive trading hours — when no live trades exist in the same span.

Why it shows up: the live daemon (novacore-novatrade.service) wasn't running, or was disconnected from MetaApi, so the backtest fired signals the live engine never received. Verify by spot-checking the journal:

sudo journalctl -u novacore-novatrade.service \
    --since "2026-04-20 00:00 UTC" --until "2026-04-21 00:00 UTC" \
    | head -30

If the journal is silent or shows reconnect loops in that span, the gap is uptime, not strategy.
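Finding candidate uptime gaps in the parity CSV can be sketched as a run detector: consecutive MISSING_IN_LIVE rows, in time order, with no row in between that involves a live trade. The column name `ts` (an ISO-8601 timestamp) is an assumption, since the source only names the `status` column; adjust it to the real header. The journal check is still what confirms the gap.

```python
from datetime import datetime


def uptime_gap_candidates(rows: list[dict], min_run: int = 3) -> list[tuple]:
    """Return (first_ts, last_ts, count) for each run of MISSING_IN_LIVE rows.

    Any other status (MATCH, OUTCOME_FLIP, EXTRA_IN_LIVE) means a live trade
    exists at that point, so it ends the current run.
    """
    ordered = sorted(rows, key=lambda r: datetime.fromisoformat(r["ts"]))
    runs, current = [], []
    for row in ordered:
        if row["status"] == "MISSING_IN_LIVE":
            current.append(row)
            continue
        if len(current) >= min_run:
            runs.append(current)
        current = []
    if len(current) >= min_run:
        runs.append(current)
    return [(run[0]["ts"], run[-1]["ts"], len(run)) for run in runs]
```

Each returned span gives the `--since` / `--until` bounds to feed into the journalctl spot-check above.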

Category B: Bar-feed divergence (known issue)

Signal: EXTRA_IN_LIVE rows that don't pair with anything in the backtest, and the live engine was demonstrably running at that timestamp (journal has bar closed lines).

Why it shows up: OUTPUT/parity/bar_divergence_analysis_2026-04-24.md documents that ~96% of M5 bars between the live engine's real-time MetaApi feed and the retrospective HistoricalFetcher differ by 0.5–3 pips on at least one of OHLC. Different bars → different IRB inside-bar detection → live takes setups the retrospective backtest doesn't see, and vice versa. This is a known platform-level limitation, not a strategy bug.

If you see EXTRA_IN_LIVE setups, reference that analysis explicitly in the report. Don't re-investigate it — point to the existing artifact.
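For orientation only, the statistic quoted above (share of bars that differ on at least one of OHLC) could be computed along these lines. This is not something to rerun during a parity check, and it is not the method used in the referenced analysis, which the source does not describe. The bar representation, the 0.0001 pip size (a four-decimal pair; JPY pairs would use 0.01), and the function name are all assumptions.

```python
PIP = 0.0001  # assumption: pair quoted to four decimals


def divergent_fraction(live_bars: dict, hist_bars: dict,
                       threshold_pips: float = 0.5) -> float:
    """Fraction of shared bars whose O, H, L, or C differs by >= threshold.

    Both arguments map a bar timestamp to an (open, high, low, close) tuple.
    Bars present in only one feed are ignored.
    """
    shared = live_bars.keys() & hist_bars.keys()
    if not shared:
        return 0.0

    def max_diff_pips(a, b):
        return max(abs(x - y) for x, y in zip(a, b)) / PIP

    differing = sum(
        1 for ts in shared
        if max_diff_pips(live_bars[ts], hist_bars[ts]) >= threshold_pips
    )
    return differing / len(shared)
```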

Category C: Price drift on a paired trade

Signal: A MATCH or OUTCOME_FLIP row where entry_price_diff_pips or sl_diff_pips is greater than ~2 pips in absolute value.

Why it shows up: Same signal fired in both runs, but the live engine got a different fill (slippage, spread, MetaApi vs. retrospective bar OHLC for the trigger bar). On OUTCOME_FLIP rows specifically, this is the most common cause — the SL was placed a pip or two differently and one bar wick caught one but not the other.

A price-drift OUTCOME_FLIP is not strategy drift — same signal, different fill. Note it but don't escalate.
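The Category C signal reduces to a simple predicate over a parity row. The sketch below uses the `status`, `entry_price_diff_pips`, and `sl_diff_pips` columns named above and treats the "~2 pips" guideline as a hard threshold, which is a simplification. The function name is hypothetical, and it assumes both diff columns are populated on paired rows.

```python
DRIFT_THRESHOLD_PIPS = 2.0  # the "~2 pips" guideline, taken literally


def has_price_drift(row: dict, threshold: float = DRIFT_THRESHOLD_PIPS) -> bool:
    """True for a paired row whose entry or SL differs by more than threshold."""
    if row["status"] not in {"MATCH", "OUTCOME_FLIP"}:
        return False  # unpaired rows have no live/backtest price to compare
    entry_drift = abs(float(row["entry_price_diff_pips"]))
    sl_drift = abs(float(row["sl_diff_pips"]))
    return entry_drift > threshold or sl_drift > threshold
```

`float()` is applied because `csv.DictReader` yields strings; negative diffs are handled by taking the absolute value.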

Category D: Strategy drift (real bug)

Signal: OUTCOME_FLIP rows where entry and SL drift are both small (≤2 pips) yet the trade resolved differently. Or EXTRA_IN_LIVE / MISSING_IN_LIVE rows where the live engine was up, the bars agree (no Category B evidence), and the signal logic itself disagreed.

Why it matters: This is the only category that implies the live strategy code has actually diverged from the validated baseline. Stop and investigate before recommending any further live trading.

Category E: Real mismatch (unclassified)

Anything not fitting A–D. Treat as Category D until proven otherwise.
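Putting categories A to E together, the decision order can be sketched as one function. The two evidence flags are inputs the classifier cannot derive from the CSV: `engine_up` comes from the journal check and `bars_agree` from comparing the bars around the row; both parameter names are hypothetical. Where the categories above do not say what to do (for example, a MISSING_IN_LIVE row while the engine was up and the bars disagree), the sketch takes the conservative reading and returns E.

```python
def classify(row: dict, engine_up: bool, bars_agree: bool,
             threshold_pips: float = 2.0):
    """Return "A".."E" for a divergent row, or None for a clean MATCH."""
    status = row["status"]
    if status in {"MATCH", "OUTCOME_FLIP"}:
        drift = max(abs(float(row["entry_price_diff_pips"])),
                    abs(float(row["sl_diff_pips"])))
        if drift > threshold_pips:
            return "C"  # same signal, different fill
        # Small drift: a flip is strategy drift, a match is simply fine.
        return "D" if status == "OUTCOME_FLIP" else None
    if status == "MISSING_IN_LIVE" and not engine_up:
        return "A"  # the live engine never received the signal
    if status == "EXTRA_IN_LIVE" and engine_up and not bars_agree:
        return "B"  # known bar-feed divergence
    if status in {"EXTRA_IN_LIVE", "MISSING_IN_LIVE"} and engine_up and bars_agree:
        return "D"  # engine up, same bars, different signal
    return "E"  # unclassified: treat as D until proven otherwise
```

Checking price drift before anything else on paired rows mirrors the rule that a price-drift OUTCOME_FLIP is noted but not escalated.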

Step 5: Write the report

Use this exact template — operator scans it for the "Strategy drift?" line first:

## Parity check — <window>

**Headline:** <live_count> live / <backtest_count> backtest entries.
Matched <matched> pairs, outcomes agree <outcome_match>/<matched> (<outcome_match_rate>).

| Category | Count | Notes |
|---|---|---|
| Uptime gap | <N> | <which days/hours, e.g., "Apr 20 all day; Apr 24 10:20 UTC onward"> |
| Bar-feed divergence | <N> | Known issue per `OUTPUT/parity/bar_divergence_analysis_*.md` |
| Price drift on paired trades | <N> | Avg entry drift <X> pips, SL drift <Y> pips |
| Strategy drift (REAL BUG) | <N> | <list rows or "none"> |
| Unclassified | <N> | <list rows or "none"> |

**Strategy drift?** <Yes / No>. <One-line action: "investigate row at <ts>" / "no live-engine code change required">.

**Artifacts:**
- Live: `OUTPUT/parity/live_trades_<stamp>.csv`
- Backtest: `OUTPUT/parity/backtest_trades_<stamp>.csv`
- Parity: `OUTPUT/parity/parity_report_<stamp>.csv`

Keep it tight. The operator already trusts the workflow — they want the answer to "should I worry?" first, supporting evidence second.
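Filling the template mechanically avoids drifting from its exact wording. The sketch below is one way to do that; `render_report` and its parameters are hypothetical, `counts` and `notes` are keyed by category letter, and the caller supplies the one-line action. Because Category E is treated as D until proven otherwise, the sketch answers "Yes" when either count is non-zero, which is an interpretation rather than something the template states.

```python
def render_report(window: str, summary: dict, counts: dict, notes: dict,
                  stamp: str, action: str) -> str:
    """Fill the parity report template from Step 3 counters and Step 4 counts."""
    drift = "Yes" if counts.get("D", 0) or counts.get("E", 0) else "No"
    labels = [
        ("Uptime gap", "A"),
        ("Bar-feed divergence", "B"),
        ("Price drift on paired trades", "C"),
        ("Strategy drift (REAL BUG)", "D"),
        ("Unclassified", "E"),
    ]
    table = "\n".join(
        f"| {label} | {counts.get(key, 0)} | {notes.get(key, 'none')} |"
        for label, key in labels
    )
    return (
        # \u2014 is the dash used in the template heading.
        f"## Parity check \u2014 {window}\n\n"
        f"**Headline:** {summary['live_count']} live / "
        f"{summary['backtest_count']} backtest entries.\n"
        f"Matched {summary['matched']} pairs, outcomes agree "
        f"{summary['outcome_match']}/{summary['matched']} "
        f"({summary['outcome_match_rate']}).\n\n"
        "| Category | Count | Notes |\n|---|---|---|\n"
        f"{table}\n\n"
        f"**Strategy drift?** {drift}. {action}\n\n"
        "**Artifacts:**\n"
        f"- Live: `OUTPUT/parity/live_trades_{stamp}.csv`\n"
        f"- Backtest: `OUTPUT/parity/backtest_trades_{stamp}.csv`\n"
        f"- Parity: `OUTPUT/parity/parity_report_{stamp}.csv`\n"
    )
```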

Edge cases

- Zero live trades, zero backtest trades: weekend/holiday window, or live engine fully down with no signals in the backtest either. Report "no trades in window" and stop; there's nothing to grade.
- Zero live trades, many backtest trades: the live engine never opened a position. Either it was down the whole window (Category A) or there's a guard/halt blocking entries. Check HardRiskSupervisor halt state and the journal before assuming Category A.
- Live trades in a window the backtest can't fetch: HistoricalFetcher occasionally returns a short bar count. If m5_bars looks low for the window length (a 5-day window should produce ~1400 M5 bars on weekdays), rerun the backtest before grading, because partial bar data will produce false MISSING_IN_LIVE rows.
- OUTCOME_FLIP with massive PnL difference: the backtest uses position sizing from irb_v5_m5_champion.yaml; live sizing is set by the FTMO challenge runtime. PnL magnitudes will not match. Compare outcome signs (WIN/LOSS), not dollar amounts.
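The short-bar-count check can be made concrete with a rough expectation: 288 M5 bars per weekday (24 hours × 12 bars per hour), which gives 1,440 for five weekdays, in line with the ~1400 figure above. The sketch below counts weekdays in a window with an exclusive end date; the 0.9 ratio for "looks low" and the function names are assumptions, and the estimate ignores holidays, the Friday close, and any warmup bars included in m5_bars.

```python
from datetime import date, timedelta

BARS_PER_WEEKDAY = 24 * 60 // 5  # 288 five-minute bars in a full day


def expected_m5_bars(start: date, end_exclusive: date) -> int:
    """Upper-bound estimate of M5 bars on the weekdays in [start, end)."""
    span = (end_exclusive - start).days
    weekdays = sum(
        1 for i in range(span)
        if (start + timedelta(days=i)).weekday() < 5  # Mon=0 .. Fri=4
    )
    return weekdays * BARS_PER_WEEKDAY


def bar_count_looks_short(m5_bars: int, start: date, end_exclusive: date,
                          min_ratio: float = 0.9) -> bool:
    """True when the fetched bar count is well below the estimate."""
    expected = expected_m5_bars(start, end_exclusive)
    return expected > 0 and m5_bars < min_ratio * expected
```

A True result is a reason to rerun the backtest before grading, not proof that the fetch failed.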

Why this skill exists (theory of operation)

The locked baseline strategy ("Rob Hoffman IRB v5 — Relaxed Reliable Build") was validated against 24 years of historical data. The live engine is meant to be a faithful runtime of that backtest. Every divergence is either (1) a known platform artifact we already understand and have decided to live with, or (2) evidence the runtime has drifted from the validated logic — which means the validation no longer applies.

The classification step is what separates this skill from a generic "diff two CSVs" report: it answers the operator's actual question, which is never "are the trades identical" (they can't be) but "is the live engine still running the strategy I validated?". Any divergence that lands in Category D invalidates that claim until investigated.
