with one click
clickhouse-migrations
ClickHouse migration patterns and rules. Use when creating or modifying ClickHouse migrations.
Menu
ClickHouse migration patterns and rules. Use when creating or modifying ClickHouse migrations.
| name | clickhouse-migrations |
| description | ClickHouse migration patterns and rules. Use when creating or modifying ClickHouse migrations. |
Read posthog/clickhouse/migrations/AGENTS.md for comprehensive patterns, cluster setup, examples, and ingestion layer details.
operations = [
run_sql_with_exceptions(
SQL_FUNCTION(),
node_roles=[...],
sharded=False, # True for sharded tables
is_alter_on_replicated_table=False # True for ALTER on replicated tables
),
]
[NodeRole.DATA]: Sharded tables (data nodes only)[NodeRole.DATA, NodeRole.COORDINATOR]: Non-sharded data tables, distributed read tables, replicated tables, views, dictionaries[NodeRole.INGESTION_SMALL]: Writable tables, Kafka tables, materialized views on ingestion layerMergeTree engines:
AggregatingMergeTree(table, replication_scheme=ReplicationScheme.SHARDED) for sharded tablesReplacingMergeTree(table, replication_scheme=ReplicationScheme.REPLICATED) for non-shardedCollapsingMergeTree, ReplacingMergeTreeDeletedDistributed engine:
Distributed(data_table="sharded_events", sharding_key="sipHash64(person_id)")Distributed(data_table="my_table", cluster=settings.CLICKHOUSE_SINGLE_SHARD_CLUSTER)ON CLUSTER clause in SQL statementsIF EXISTS / IF NOT EXISTS clausesDROP TABLE IF EXISTS ... SYNCon_cluster=Falsesharded=True when altering sharded tablesis_alter_on_replicated_table=True when altering non-sharded replicated tablesDROP COLUMN migration yourself — DROP COLUMN can get stuck in ClickHouse and block releases. Column removal is a two-step process: (1) the ClickHouse team drops the column directly on the cluster, then (2) you write a migration with the matching DROP COLUMN so the codebase schema stays in sync. Never initiate the drop from a migration without the ClickHouse team having done step 1 first.kafka_events_json_ws or events_json_ws_mv — these tables are a no-go zone. The MV definition differs significantly between US prod, EU prod, and dev (dozens of environment-specific mat_* columns) and those differences are not reflected in the repo. Dropping and recreating from repo SQL would destroy the environment-specific schema and break event ingestion. Any change must go through the ClickHouse team.A PR that contains a ClickHouse migration must be migration-only. Do not mix migration files with feature code, API changes, model changes, or frontend changes in the same PR. Migration-related files are:
posthog/clickhouse/migrations/0NNN_*.py)posthog/clickhouse/sql/*.py, table engine helpers)If you need both a schema change and application code that uses the new schema, ship the migration first in its own PR and merge it before the application-code PR.
No table should exist only in the cloud. Every table created via migration must also exist in a local dev environment.
Some migrations are cloud-guarded and skipped in local/hobby dev:
operations = (
[]
if settings.CLOUD_DEPLOYMENT not in ("US", "EU", "DEV")
else [...]
)
If you create a new table inside such a guard, you must also add its SQL function to posthog/clickhouse/schema.py in the appropriate tuple so the table gets created locally:
| Table type | Tuple in schema.py |
|---|---|
| MergeTree / base table | CREATE_MERGETREE_TABLE_QUERIES |
| Distributed / writable | CREATE_DISTRIBUTED_TABLE_QUERIES |
| Kafka consumer | CREATE_KAFKA_TABLE_QUERIES |
| Materialized view | CREATE_MV_TABLE_QUERIES |
| Non-materialized view | CREATE_VIEW_QUERIES |
| Dictionary | CREATE_DICTIONARY_QUERIES |
The only exception is tables whose definition intentionally differs per environment and is not tracked in the repo (e.g. the no-go zone events_json_ws_mv table).
Delete entry from infi_clickhouse_orm_migrations table to re-run a migration.
Clarify how to visualize change over a time range before building a trend. Use whenever the user asks how much something changed, grew, dropped, improved, or regressed between two points or periods — "how much did X change from A to B", "before vs after", "start vs end", "week over week", "compare this month to last", "change over time" — or mentions a "slope chart" / "slopegraph". Two readings of "change" need different charts: the whole trend (a line, every interval) versus just the two endpoints (a slope, start vs end). Ask which they want, then render it. Not for choosing a saved insight ChartDisplayType in the insight editor.
Explore PostHog MCP intent clusters — agent goals grouped by semantic similarity, with each cluster's tool distribution and error rates. Use when the user asks "what are agents trying to do with the MCP?", "group the intents", "which goals fail most?", "what does each cluster route to?", wants to recompute the clustering, or pastes an MCP analytics intent-clustering URL.
Investigate individual PostHog MCP sessions — the sequence of tool calls a single agent made in one run, what it was trying to do, and where it went wrong. Use when the user asks "what did this MCP session do?", "show me the tool calls for session X", "what was the agent's goal?", "which sessions had errors?", or pastes an MCP analytics sessions URL.
Investigate the quality of PostHog MCP tool calls — error rates, latency, reach, and which tools are failing or slow. Use when the user asks "which MCP tool has the highest error rate?", "what's the slowest tool?", "which tools fail most often?", "how reliable is tool X?", wants a tool-quality matrix, or pastes an MCP analytics tool-quality / dashboard URL and asks what it shows.
Guides PostHog engineers through dashboard widget platform work — ship a new widget_type (WIDGET_REGISTRY, catalog, run_widgets, WidgetCard) or update a shipped type (config, query, layout, RBAC, tile filter bar, list footer, titleHref, throttles). Use for WidgetSpec, widget_specs/, widget-configs.zod.ts, hogli build:openapi, error_tracking_list, session_replay_list, widgetFilters, formatWidgetListCountFooter, widget_query_throttle, or WidgetCard composition. New types need widget-intake confirmation first. Not for MCP batch-add of existing types or adding tiles to a dashboard.
Assesses what a page's heatmap is telling you and recommends concrete changes. Pulls click / rageclick / scroll-depth data for a URL, names the hot elements by cross-referencing autocapture events on the same page, and can create a saved heatmap the user opens in PostHog, then summarizes the behavior and proposes improvements. TRIGGER when: user asks what a heatmap shows, why people aren't clicking something, where users rage-click, how far they scroll, what to change on a page based on heatmap/click data, or to 'analyze/assess/review the heatmap' for a URL. DO NOT TRIGGER when: the user only wants to create a saved heatmap screenshot with no analysis (use heatmaps-saved-create directly), or is asking about session replay in general (use investigating-replay).