| name | clickhouse-antipatterns |
| description | ClickHouse SQL anti-patterns and performance constraints discovered during Gen200-Gen600 barrier framework. Use when writing ClickHouse SQL, creating new barrier/pattern generations, modifying array functions, window functions, parameter sweeps, forward arrays, or encountering slow queries, OOM, NULL entry prices, wrong barrier detection, arrayFirstIndex returning 0, or aligning SQL results with backtesting.py. TRIGGERS - ClickHouse SQL, barrier SQL, array function, window function, trailing stop SQL, parameter sweep, slow query, OOM, arrayFirstIndex, leadInFrame, groupArray, arrayFold, arrayScan, threshold-relative, anti-pattern, performance constraint, forward arrays, self-join, O(N×M), SQL vs Python, SQL vs backtesting. |
ClickHouse Anti-Patterns for Range Bar Pattern SQL
Discovered during Gen200-Gen600 Triple Barrier + Hybrid Feature Sweep framework implementation. Each anti-pattern has been validated through production failures and resolved with tested workarounds.
Companion skills: quant-research:backtesting-py-oracle (Python-side anti-patterns) | sweep-methodology (sweep design)
GitHub Issue: #8 - Anti-Pattern Registry
Quick Lookup
For detailed descriptions with code examples, see references/anti-patterns.md.
For infrastructure-specific issues, see references/infrastructure.md.
Critical Rules (Never Violate)
1. Window-Based Forward Arrays (NOT Self-Join)
base_bars AS (
SELECT *,
arraySlice(groupArray(high) OVER (
ORDER BY close_time_ms ROWS BETWEEN CURRENT ROW AND 101 FOLLOWING
), 2, 101) AS fwd_highs,
arraySlice(groupArray(low) OVER (
ORDER BY close_time_ms ROWS BETWEEN CURRENT ROW AND 101 FOLLOWING
), 2, 101) AS fwd_lows,
FROM open_deviation_bars WHERE ...
)
forward_arrays AS (
SELECT s.*, groupArray(b.high) AS fwd_highs ...
FROM signals s INNER JOIN base_bars b ON b.rn BETWEEN s.rn + 1 AND s.rn + 101
GROUP BY ...
)
Memory tradeoff: Window approach uses ~10x more memory (1.5 GB vs 165 MB) because it computes arrays for ALL bars, not just signals. At 16 parallel queries: 1.5 GB × 16 = 24 GB — safe on 61 GB hosts. For memory-constrained hosts, the self-join is acceptable for sparse patterns (<2% signal coverage).
Gen600 Production Confirmation (2026-02-11): AP-14 window approach confirmed at scale — 284K+ results collected at 3.2 queries/sec (xargs -P16), 3-5s per query regardless of pattern density (sparse 1.2% to dense 49%), zero errors, memory stable at ~24 GB peak (1.5 GB/query × 16 parallel).
Historical note: AP-01 originally recommended self-join over window approach because Gen200 had 1.4M bars × 51 elements = 2.36 GB with the WRONG window frame (ROWS BETWEEN 1 FOLLOWING AND 51 FOLLOWING on ALL bars). The CORRECT window approach uses arraySlice(..., 2, 101) on ROWS BETWEEN CURRENT ROW AND 101 FOLLOWING — slicing off the current row. Gen600 benchmarking proved the window approach is 11x faster for dense patterns (36K+ signals) where the self-join becomes the dominant bottleneck.
2. Pre-Compute Barrier Prices as Columns
param_with_prices AS (
SELECT *, entry_price * (1.0 + tp_mult * 0.025) AS tp_price FROM param_expanded
),
barrier_scan AS (
SELECT arrayFirstIndex(x -> x >= tp_price, ...) AS raw_tp_bar FROM param_with_prices
)
SELECT arrayFirstIndex(x -> x >= entry_price * (1.0 + tp_mult * 0.025), fwd_highs)
3. Always Guard arrayFirstIndex with > 0
CASE
WHEN raw_sl_bar > 0 AND raw_tp_bar > 0 AND raw_sl_bar <= raw_tp_bar THEN 'SL'
WHEN raw_sl_bar > 0 AND raw_tp_bar > 0 AND raw_tp_bar < raw_sl_bar THEN 'TP'
WHEN raw_sl_bar > 0 AND raw_tp_bar = 0 THEN 'SL'
WHEN raw_tp_bar > 0 AND raw_sl_bar = 0 THEN 'TP'
WHEN window_bars >= max_bars THEN 'TIME'
ELSE 'INCOMPLETE'
END
CASE WHEN raw_tp_bar <= raw_sl_bar THEN 'TP' ELSE 'SL' END
4. leadInFrame Requires UNBOUNDED FOLLOWING
leadInFrame(open, 1) OVER (
ORDER BY close_time_ms
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS entry_price
leadInFrame(open, 1) OVER (ORDER BY close_time_ms) AS entry_price
5. Threshold-Relative Parameters
entry_price * (1.0 + tp_mult * 0.025) AS tp_price
entry_price * (1.0 + tp_mult * 0.05) AS tp_price
entry_price * (1.0 + 0.01) AS tp_price
6. NEVER Expanding Window — Always Rolling 1000-Bar
quantileExactExclusive(0.95)(trade_intensity) OVER (
ORDER BY close_time_ms
ROWS BETWEEN 999 PRECEDING AND 1 PRECEDING
) AS ti_p95_rolling
quantileExactExclusive(0.95)(trade_intensity) OVER (
ORDER BY close_time_ms
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
) AS ti_p95_expanding
Post-Change Checklist
After modifying ANY Gen200+ SQL file: