| name | database-migration-plan |
| description | Write a safe, zero-downtime database migration plan for a schema change. Use when asked to plan a database migration, design a zero-downtime schema change, document an expand/contract migration, produce a rollback procedure for a database change, or coordinate a database schema update with a deployment. Produces a structured migration plan covering migration objectives, backward compatibility analysis, expand/contract phase breakdown, exact SQL, rollback steps per phase, data validation queries, and a deployment runbook. |
Database Migration Plan Skill
Produce a complete, safe database migration plan for a schema change. A migration plan is not just the SQL — it is a coordinated sequence of steps that ensures the application stays available, data stays consistent, and every step can be rolled back independently.
The expand/contract pattern is the default approach: expand the schema to support both old and new states, migrate the application, then contract to remove the old state. Never combine schema changes and data backfills in a single migration that runs during deployment.
Required Inputs
Ask for these if not already provided:
- Current schema state — the DDL or description of the table(s) as they are now
- Target schema state — the DDL or description of what the table(s) should look like after migration
- Migration reason — why this change is being made (new feature, performance fix, normalization, compliance)
- Database engine — PostgreSQL, MySQL, SQLite, CockroachDB, etc.
- Estimated data volume — approximate number of rows in affected tables
- Deployment constraints — is any downtime allowed? What is the expected traffic level during migration? Are there multiple app instances running?
- Rollback window — how long after deploy can the team roll back before the migration becomes irreversible?
Output Format
Database Migration Plan: [Migration Name]
Service: [Name] | Team: [Team name]
Author: [Name] | Reviewed by: [Name / DBA]
Date: [Date] | Target deploy date: [Date]
Database engine: [PostgreSQL X.X / MySQL X.X]
Ticket: [JIRA-XXX]
1. Migration Overview
What is changing:
[1–2 sentences: the specific schema change — e.g. "Adding a non-nullable organisation_id column to the users table and backfilling it from the accounts table."]
Why:
[1–2 sentences: the business or technical reason driving the change.]
Migration type: [Additive only / Additive + backfill / Column rename / Column type change / Table restructure / Index change]
Zero-downtime: [Yes — using expand/contract / No — requires maintenance window — state duration]
Estimated migration duration:
- Expand phase: [~X minutes]
- Data backfill: [~X minutes/hours — based on X rows at Y rows/second]
- Contract phase: [~X minutes after app version deployed]
2. Backward Compatibility Analysis
Before writing a single line of SQL, assess whether each change is backward compatible with the currently deployed application code.
| Change | Backward compatible? | Risk | Notes |
|---|---|---|---|
| [e.g. Add nullable column org_id] | Yes | Low | Old app ignores new column |
| [e.g. Backfill org_id] | Yes | Medium | Old app unaffected; new app reads backfilled values |
| [e.g. Add NOT NULL constraint to org_id] | No | High | Old app that inserts without org_id will fail |
| [e.g. Drop old column account_id] | No | High | Old app that reads account_id will fail |
| [e.g. Add index on org_id] | Yes | Low | Additive; no breaking change |
| [e.g. Rename column] | No | High | Never rename in one step; use expand/contract |
Summary: [e.g. "This migration requires the expand/contract pattern across 3 deployment phases because steps 3 and 4 are not backward compatible."]
3. Expand/Contract Phases
Phase Overview
Phase 1 — EXPAND
Deploy migration: add new column (nullable), create new indexes
Old app: continues to work (ignores new column)
New app: not yet deployed
Duration: [~X min] | Rollback: trivial — drop new column
│
▼
Phase 2 — BACKFILL + DUAL-WRITE
Deploy app update: writes to both old and new columns
Run backfill: populate new column for existing rows
Validate: confirm 100% of rows have non-null new column
Duration: [~X hours depending on data volume]
Rollback: deploy previous app version; new column is still nullable
│
▼
Phase 3 — ENFORCE + SWITCH
Deploy migration: add NOT NULL constraint on new column (old column and its index are dropped later, in Phase 4)
Deploy app update: reads only from new column
Duration: [~X min] | Rollback: drop the NOT NULL constraint; the old column is still present
│
▼
Phase 4 — CONTRACT (optional cleanup)
Deploy migration: drop deprecated columns, rename if needed
Final state matches target schema
Rollback: not recommended — contract changes are destructive
Phase 1 — Expand Schema
Goal: Add the new column and structures without breaking the existing application.
Deploy order: Run migration first, then (optionally) deploy app.
Application state: Old app running; no app changes required yet.
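-- Step 1: add the column inside a transaction
BEGIN;
ALTER TABLE users
  ADD COLUMN org_id UUID NULL
  REFERENCES organisations(id) ON DELETE RESTRICT;
COMMIT;
-- Step 2: build the index outside any transaction block.
-- CREATE INDEX CONCURRENTLY fails if run between BEGIN and COMMIT.
CREATE INDEX CONCURRENTLY users_org_id_idx ON users (org_id);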
Validation after Phase 1:
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'users' AND column_name = 'org_id';
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'users' AND indexname = 'users_org_id_idx';
Rollback (Phase 1 only):
-- Run outside a transaction block: DROP INDEX CONCURRENTLY cannot run inside one
DROP INDEX CONCURRENTLY IF EXISTS users_org_id_idx;
ALTER TABLE users DROP COLUMN IF EXISTS org_id;
Phase 2 — Backfill Existing Data
Goal: Populate the new column for all existing rows before enforcing NOT NULL.
When to run: After Phase 1 is live and stable. Can be run as a background job or a one-time script.
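Application state: The app version that dual-writes to both old and new columns must be deployed before the backfill starts; otherwise rows inserted after the backfill finishes are left without a value in the new column.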
App code change required:
// All INSERT and UPDATE operations must now set BOTH old_column and new_column
// until Phase 3 is complete. This ensures new rows are populated during the backfill window.
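The dual-write rule can be sketched in application code. This is a minimal illustration, not the service's actual data layer: it assumes the running example (old column `account_id`, new column `org_id`), the helper names `org_id_for_account` and `insert_user` are hypothetical, and Python's built-in `sqlite3` module is used only so the sketch runs without a database server.

```python
import sqlite3


def org_id_for_account(conn, account_id):
    """Look up the organisation that owns an account (hypothetical helper)."""
    row = conn.execute(
        "SELECT organisation_id FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0] if row else None


def insert_user(conn, email, account_id):
    """Dual-write: populate both the old column (account_id) and the new one (org_id)."""
    org_id = org_id_for_account(conn, account_id)
    conn.execute(
        "INSERT INTO users (email, account_id, org_id) VALUES (?, ?, ?)",
        (email, account_id, org_id),
    )
    conn.commit()


def demo():
    # In-memory schema that mirrors the example tables (assumed shape).
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, organisation_id TEXT);
        CREATE TABLE users (
            id INTEGER PRIMARY KEY, email TEXT, account_id INTEGER, org_id TEXT
        );
        INSERT INTO accounts (id, organisation_id) VALUES (1, 'org-a');
        """
    )
    insert_user(conn, "new@example.com", 1)
    return conn.execute("SELECT account_id, org_id FROM users").fetchall()
```

Every UPDATE path that changes `account_id` would need the same treatment, so that the two columns cannot drift apart while both exist.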
Backfill script — batch processing:
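-- Run outside an explicit transaction block (PostgreSQL 11+) so that the COMMIT
-- inside the loop ends each batch's transaction and releases its row locks.
-- Assumes users has a primary key column named id.
DO $$
DECLARE
  batch_size INT := 1000;
  affected INT;
BEGIN
  LOOP
    -- PostgreSQL's UPDATE has no LIMIT clause, so the batch is chosen in a subquery.
    -- Only rows that will receive a non-null value are selected, so every pass makes progress.
    UPDATE users u
    SET org_id = a.organisation_id
    FROM accounts a
    WHERE a.id = u.account_id
      AND u.id IN (
        SELECT u2.id
        FROM users u2
        JOIN accounts a2 ON a2.id = u2.account_id
        WHERE u2.org_id IS NULL
          AND a2.organisation_id IS NOT NULL
        LIMIT batch_size
      );
    GET DIAGNOSTICS affected = ROW_COUNT;
    EXIT WHEN affected = 0;
    COMMIT;
    PERFORM pg_sleep(0.1);
  END LOOP;
END $$;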
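The batching can also be driven from application code, which makes the commit after each batch explicit. The following is a sketch under stated assumptions rather than a production job: `backfill_org_id`, the `users.id` primary key, and the demo data are hypothetical. Python's built-in `sqlite3` module is used only so the example runs anywhere, and the placeholder style (`?` here) depends on the driver used against the real database.

```python
import sqlite3
import time


def backfill_org_id(conn, batch_size=1000, pause_seconds=0.1):
    """Backfill users.org_id in small batches, committing after each one."""
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE users
            SET org_id = (
                SELECT a.organisation_id FROM accounts a
                WHERE a.id = users.account_id
            )
            WHERE id IN (
                SELECT u.id
                FROM users u
                JOIN accounts a ON a.id = u.account_id
                WHERE u.org_id IS NULL
                  AND a.organisation_id IS NOT NULL
                LIMIT ?
            )
            """,
            (batch_size,),
        )
        conn.commit()  # end the transaction so locks are held for one batch only
        if cur.rowcount == 0:
            break  # nothing left that can be backfilled
        total += cur.rowcount
        time.sleep(pause_seconds)  # give regular traffic room between batches
    return total


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, organisation_id TEXT);
        CREATE TABLE users (id INTEGER PRIMARY KEY, account_id INTEGER, org_id TEXT);
        INSERT INTO accounts VALUES (1, 'org-a'), (2, 'org-b');
        INSERT INTO users (account_id) VALUES (1), (1), (2), (2), (NULL);
        """
    )
    updated = backfill_org_id(conn, batch_size=2, pause_seconds=0)
    remaining = conn.execute(
        "SELECT COUNT(*) FROM users WHERE org_id IS NULL"
    ).fetchone()[0]
    return updated, remaining
```

In the demo, the user with no account is left with a NULL `org_id`. Rows like this have no source value to copy, so they need a decision of their own before NOT NULL can be enforced.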
Monitoring during backfill:
SELECT
COUNT(*) FILTER (WHERE org_id IS NOT NULL) AS backfilled,
COUNT(*) FILTER (WHERE org_id IS NULL) AS remaining,
COUNT(*) AS total,
ROUND(
100.0 * COUNT(*) FILTER (WHERE org_id IS NOT NULL) / NULLIF(COUNT(*), 0), 2
) AS pct_complete
FROM users;
Backfill completion validation:
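-- Must return 0 before Phase 3 can start
SELECT COUNT(*) AS unbackfilled_rows
FROM users
WHERE org_id IS NULL;
-- Must return 0; a non-zero count means new rows are still being written
-- without org_id, i.e. the dual-write app version is not fully deployed
SELECT COUNT(*) AS recent_missing
FROM users
WHERE org_id IS NULL
AND created_at > now() - INTERVAL '1 hour';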
Rollback (Phase 2 — app only):
- Deploy previous app version (single-write to old column)
- org_id column remains nullable; no data is lost
- Backfilled values remain; harmless
Phase 3 — Enforce Constraints
Goal: Add NOT NULL constraint and remove dependency on the old column.
Prerequisites: Phase 2 backfill must be 100% complete (zero rows with org_id IS NULL).
Deploy order: Run migration, then deploy app version that reads only from org_id.
PostgreSQL — use NOT VALID + VALIDATE for large tables:
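-- 1. Add the check without scanning existing rows (brief lock)
ALTER TABLE users
ADD CONSTRAINT users_org_id_not_null
CHECK (org_id IS NOT NULL) NOT VALID;
-- 2. Scan existing rows while reads and writes continue
ALTER TABLE users
VALIDATE CONSTRAINT users_org_id_not_null;
-- 3. On PostgreSQL 12+, SET NOT NULL uses the validated check to skip a full table scan
ALTER TABLE users
ALTER COLUMN org_id SET NOT NULL;
-- 4. The check constraint is now redundant
ALTER TABLE users
DROP CONSTRAINT users_org_id_not_null;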
Validation after Phase 3:
SELECT column_name, is_nullable
FROM information_schema.columns
WHERE table_name = 'users' AND column_name = 'org_id';
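-- Expect this INSERT to fail with a not-null violation on org_id;
-- the ROLLBACK guarantees no test row is left behind either way
BEGIN;
INSERT INTO users (email) VALUES ('test@example.com');
ROLLBACK;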
Rollback (Phase 3):
ALTER TABLE users ALTER COLUMN org_id DROP NOT NULL;
Phase 4 — Contract (Remove Old Column)
Goal: Remove the old column once the app no longer references it.
Prerequisites: Phase 3 fully deployed and stable for at least [X days/hours rollback window].
Warning: This phase is destructive — the old column's data is permanently deleted.
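Pre-drop validation:
-- Number of rows whose account_id value will be permanently deleted by the drop
SELECT COUNT(*) FROM users WHERE account_id IS NOT NULL;
Migration (run only after the validation above has been reviewed):
BEGIN;
DROP INDEX IF EXISTS users_account_id_idx;
ALTER TABLE users DROP COLUMN account_id;
COMMIT;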
Rollback: Not straightforward — dropped column data cannot be recovered. Only proceed to Phase 4 after the rollback window has passed and the change is confirmed stable.
4. Data Validation Plan
Run these queries before and after the full migration to confirm data integrity.
Pre-migration baseline:
SELECT COUNT(*) AS total_users FROM users;
SELECT COUNT(*) AS total_orgs FROM organisations;
SELECT MIN(created_at), MAX(created_at) FROM users;
SELECT COUNT(*) AS users_without_account
FROM users WHERE account_id IS NULL;
Post-backfill integrity check:
SELECT COUNT(*) AS orphaned_org_refs
FROM users u
WHERE u.org_id IS NOT NULL
AND NOT EXISTS (
SELECT 1 FROM organisations o WHERE o.id = u.org_id
);
SELECT COUNT(*) AS mismatched_backfill
FROM users u
JOIN accounts a ON u.account_id = a.id
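-- IS DISTINCT FROM also counts rows where exactly one side is NULL, which != would skip
WHERE u.org_id IS DISTINCT FROM a.organisation_id;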
SELECT COUNT(*) AS total_users_after FROM users;
Post-contract final check:
SELECT COUNT(*) FROM information_schema.columns
WHERE table_name = 'users' AND column_name = 'account_id';
SELECT is_nullable FROM information_schema.columns
WHERE table_name = 'users' AND column_name = 'org_id';
5. Performance Impact Assessment
| Step | Lock type | Lock duration | Traffic impact |
|---|---|---|---|
| Add nullable column | ACCESS EXCLUSIVE | Milliseconds | Negligible |
| CREATE INDEX CONCURRENTLY | SHARE UPDATE EXCLUSIVE | Minutes (proportional to table size) | Reads and writes continue |
| Batch backfill | Row-level locks only | <5s per batch | Low if batches are small |
| ADD CONSTRAINT NOT VALID | ACCESS EXCLUSIVE | Milliseconds | Negligible |
| VALIDATE CONSTRAINT | SHARE UPDATE EXCLUSIVE | Minutes | Reads and writes continue |
| ALTER COLUMN SET NOT NULL | ACCESS EXCLUSIVE | Milliseconds (if check constraint validated) | Negligible |
| DROP COLUMN | ACCESS EXCLUSIVE | Milliseconds | Negligible |
Expected load increase during backfill:
- DB CPU: [estimated % increase during batch writes]
- DB I/O: [estimated increase]
- Monitoring threshold to pause backfill: [e.g. DB CPU > 80% for >2 minutes]
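A pause rule such as "DB CPU > 80% for >2 minutes" can be evaluated from a list of recent CPU samples. The sketch below is illustrative only: the function name `should_pause`, the sample format, and the default values are assumptions, and collecting the samples is left to whatever monitoring system is in use.

```python
def should_pause(samples, cpu_threshold=80.0, sustained_seconds=120):
    """Return True when CPU has stayed above the threshold for longer than the window.

    samples: list of (unix_timestamp_seconds, cpu_percent) tuples, oldest first.
    """
    if not samples:
        return False
    latest_ts = samples[-1][0]
    streak_start = None
    # Walk backwards from the newest sample while CPU stays above the threshold.
    for ts, cpu in reversed(samples):
        if cpu <= cpu_threshold:
            break
        streak_start = ts
    if streak_start is None:
        return False  # the newest sample is already at or below the threshold
    return latest_ts - streak_start > sustained_seconds


# CPU above 80% for 150 seconds: pause the backfill.
hot = [(0, 85.0), (60, 90.0), (120, 88.0), (150, 91.0)]
# A dip at t=60 resets the streak, so only 30 seconds count as sustained.
dipped = [(0, 85.0), (60, 70.0), (120, 88.0), (150, 91.0)]
```

A backfill driver could call this check between batches and sleep until it returns False again.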
Backfill rate estimate:
- Table size: [X million rows]
- Batch size: [1000 rows]
- Pause between batches: [100ms]
- Estimated total duration: [X hours at Y rows/second]
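The duration estimate follows from the batch parameters: the number of batches multiplied by the time each batch takes, including the pause. A small sketch of that arithmetic follows. The function name `estimate_backfill` is hypothetical, and the 0.4 seconds per batch is an assumed measurement that should be replaced with a timing taken from a test run.

```python
import math


def estimate_backfill(total_rows, batch_size=1000, batch_seconds=0.4, pause_seconds=0.1):
    """Estimate backfill duration from row count, batch size, and per-batch timing."""
    batches = math.ceil(total_rows / batch_size)
    total_seconds = batches * (batch_seconds + pause_seconds)
    rows_per_second = total_rows / total_seconds if total_seconds else 0.0
    return {
        "batches": batches,
        "hours": total_seconds / 3600,
        "rows_per_second": rows_per_second,
    }


# 10 million rows, 1000 rows per batch, 0.4 s per batch (assumed) plus a 100 ms pause
estimate = estimate_backfill(10_000_000)
```

With these assumed numbers the estimate is 10,000 batches at 0.5 seconds each: about 1.4 hours at 2,000 rows per second.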
6. Deployment Runbook
Follow this checklist on the day of migration. Mark each step as done before proceeding.
Pre-migration (day before):
Phase 1 — Expand (T+0):
Phase 2 — Backfill (T+[X hours]):
Phase 3 — Enforce (T+[X days]):
Phase 4 — Contract (T+[X days after rollback window]):
Quality Checks
Anti-Patterns