| name | database-schema-designer |
| description | Design robust, scalable database schemas for SQL and NoSQL databases. Provides normalization guidelines, indexing strategies, migration patterns, constraint design, and performance optimization. Ensures data integrity, query performance, and maintainable data models. |
| license | MIT |
Database Schema Designer
Design production-ready database schemas with best practices built-in.
Quick Start
Just describe your data model:
design a schema for an e-commerce platform with users, products, orders
You'll get a complete SQL schema like:
CREATE TABLE users (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE orders (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(id),
total DECIMAL(10,2) NOT NULL,
INDEX idx_orders_user (user_id)
);
What to include in your request:
- Entities (users, products, orders)
- Key relationships (users have orders, orders have items)
- Scale hints (high-traffic, millions of records)
- Database preference (SQL/NoSQL) - defaults to SQL if not specified
Triggers
| Trigger | Example |
|---|
design schema | "design a schema for user authentication" |
database design | "database design for multi-tenant SaaS" |
create tables | "create tables for a blog system" |
schema for | "schema for inventory management" |
model data | "model data for real-time analytics" |
I need a database | "I need a database for tracking orders" |
design NoSQL | "design NoSQL schema for product catalog" |
Key Terms
| Term | Definition |
|---|
| Normalization | Organizing data to reduce redundancy (1NF → 2NF → 3NF) |
| 3NF | Third Normal Form - no transitive dependencies between columns |
| OLTP | Online Transaction Processing - write-heavy, needs normalization |
| OLAP | Online Analytical Processing - read-heavy, benefits from denormalization |
| Foreign Key (FK) | Column that references another table's primary key |
| Index | Data structure that speeds up queries (at cost of slower writes) |
| Access Pattern | How your app reads/writes data (queries, joins, filters) |
| Denormalization | Intentionally duplicating data to speed up reads |
Quick Reference
| Task | Approach | Key Consideration |
|---|
| New schema | Normalize to 3NF first | Domain modeling over UI |
| SQL vs NoSQL | Access patterns decide | Read/write ratio matters |
| Primary keys | INT or UUID | UUID for distributed systems |
| Foreign keys | Always constrain | ON DELETE strategy critical |
| Indexes | FKs + WHERE columns | Column order matters |
| Migrations | Always reversible | Backward compatible first |
Process Overview
Your Data Requirements
|
v
+-----------------------------------------------------+
| Phase 1: ANALYSIS |
| * Identify entities and relationships |
| * Determine access patterns (read vs write heavy) |
| * Choose SQL or NoSQL based on requirements |
+-----------------------------------------------------+
|
v
+-----------------------------------------------------+
| Phase 2: DESIGN |
| * Normalize to 3NF (SQL) or embed/reference (NoSQL) |
| * Define primary keys and foreign keys |
| * Choose appropriate data types |
| * Add constraints (UNIQUE, CHECK, NOT NULL) |
+-----------------------------------------------------+
|
v
+-----------------------------------------------------+
| Phase 3: OPTIMIZE |
| * Plan indexing strategy |
| * Consider denormalization for read-heavy queries |
| * Add timestamps (created_at, updated_at) |
+-----------------------------------------------------+
|
v
+-----------------------------------------------------+
| Phase 4: MIGRATE |
| * Generate migration scripts (up + down) |
| * Ensure backward compatibility |
| * Plan zero-downtime deployment |
+-----------------------------------------------------+
|
v
Production-Ready Schema
Commands
| Command | When to Use | Action |
|---|
design schema for {domain} | Starting fresh | Full schema generation |
normalize {table} | Fixing existing table | Apply normalization rules |
add indexes for {table} | Performance issues | Generate index strategy |
migration for {change} | Schema evolution | Create reversible migration |
review schema | Code review | Audit existing schema |
Workflow: Start with design schema → iterate with normalize → optimize with add indexes → evolve with migration
Core Principles
| Principle | WHY | Implementation |
|---|
| Model the Domain | UI changes, domain doesn't | Entity names reflect business concepts |
| Data Integrity First | Corruption is costly to fix | Constraints at database level |
| Optimize for Access Pattern | Can't optimize for both | OLTP: normalized, OLAP: denormalized |
| Plan for Scale | Retrofitting is painful | Index strategy + partitioning plan |
Anti-Patterns
| Avoid | Why | Instead |
|---|
| VARCHAR(255) everywhere | Wastes storage, hides intent | Size appropriately per field |
| FLOAT for money | Rounding errors | DECIMAL(10,2) |
| Missing FK constraints | Orphaned data | Always define foreign keys |
| No indexes on FKs | Slow JOINs | Index every foreign key |
| Storing dates as strings | Can't compare/sort | DATE, TIMESTAMP types |
| SELECT * in queries | Fetches unnecessary data | Explicit column lists |
| Non-reversible migrations | Can't rollback | Always write DOWN migration |
| Adding NOT NULL without default | Breaks existing rows | Add nullable, backfill, then constrain |
Verification Checklist
After designing a schema:
Deep Dive: Normalization (SQL)
Normal Forms
| Form | Rule | Violation Example |
|---|
| 1NF | Atomic values, no repeating groups | product_ids = '1,2,3' |
| 2NF | 1NF + no partial dependencies | customer_name in order_items |
| 3NF | 2NF + no transitive dependencies | country derived from postal_code |
1st Normal Form (1NF)
CREATE TABLE orders (
id INT PRIMARY KEY,
product_ids VARCHAR(255)
);
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT
);
CREATE TABLE order_items (
id INT PRIMARY KEY,
order_id INT REFERENCES orders(id),
product_id INT
);
2nd Normal Form (2NF)
CREATE TABLE order_items (
order_id INT,
product_id INT,
customer_name VARCHAR(100),
PRIMARY KEY (order_id, product_id)
);
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(100)
);
3rd Normal Form (3NF)
CREATE TABLE customers (
id INT PRIMARY KEY,
postal_code VARCHAR(10),
country VARCHAR(50)
);
CREATE TABLE postal_codes (
code VARCHAR(10) PRIMARY KEY,
country VARCHAR(50)
);
When to Denormalize
| Scenario | Denormalization Strategy |
|---|
| Read-heavy reporting | Pre-calculated aggregates |
| Expensive JOINs | Cached derived columns |
| Analytics dashboards | Materialized views |
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT,
total_amount DECIMAL(10,2),
item_count INT
);
Deep Dive: Data Types
String Types
| Type | Use Case | Example |
|---|
| CHAR(n) | Fixed length | State codes, ISO dates |
| VARCHAR(n) | Variable length | Names, emails |
| TEXT | Long content | Articles, descriptions |
email VARCHAR(255)
phone VARCHAR(20)
country_code CHAR(2)
Numeric Types
| Type | Range | Use Case |
|---|
| TINYINT | -128 to 127 | Age, status codes |
| SMALLINT | -32K to 32K | Quantities |
| INT | -2.1B to 2.1B | IDs, counts |
| BIGINT | Very large | Large IDs, timestamps |
| DECIMAL(p,s) | Exact precision | Money |
| FLOAT/DOUBLE | Approximate | Scientific data |
price DECIMAL(10, 2)
price FLOAT
Date/Time Types
DATE
TIME
DATETIME
TIMESTAMP
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
Boolean
is_active BOOLEAN DEFAULT TRUE
is_active TINYINT(1) DEFAULT 1
Deep Dive: Indexing Strategy
When to Create Indexes
| Always Index | Reason |
|---|
| Foreign keys | Speed up JOINs |
| WHERE clause columns | Speed up filtering |
| ORDER BY columns | Speed up sorting |
| Unique constraints | Enforced uniqueness |
CREATE INDEX idx_orders_customer ON orders(customer_id);
CREATE INDEX idx_orders_status_date ON orders(status, created_at);
Index Types
| Type | Best For | Example |
|---|
| B-Tree | Ranges, equality | price > 100 |
| Hash | Exact matches only | email = 'x@y.com' |
| Full-text | Text search | MATCH AGAINST |
| Partial | Subset of rows | WHERE is_active = true |
Composite Index Order
CREATE INDEX idx_customer_status ON orders(customer_id, status);
SELECT * FROM orders WHERE customer_id = 123;
SELECT * FROM orders WHERE customer_id = 123 AND status = 'pending';
SELECT * FROM orders WHERE status = 'pending';
Rule: Most selective column first, or column most queried alone.
Index Pitfalls
| Pitfall | Problem | Solution |
|---|
| Over-indexing | Slow writes | Only index what's queried |
| Wrong column order | Unused index | Match query patterns |
| Missing FK indexes | Slow JOINs | Always index FKs |
Deep Dive: Constraints
Primary Keys
id INT AUTO_INCREMENT PRIMARY KEY
id CHAR(36) PRIMARY KEY DEFAULT (UUID())
PRIMARY KEY (student_id, course_id)
Foreign Keys
FOREIGN KEY (customer_id) REFERENCES customers(id)
ON DELETE CASCADE
ON DELETE RESTRICT
ON DELETE SET NULL
ON UPDATE CASCADE
| Strategy | Use When |
|---|
| CASCADE | Dependent data (order_items) |
| RESTRICT | Important references (prevent accidents) |
| SET NULL | Optional relationships |
Other Constraints
email VARCHAR(255) UNIQUE NOT NULL
UNIQUE (student_id, course_id)
price DECIMAL(10,2) CHECK (price >= 0)
discount INT CHECK (discount BETWEEN 0 AND 100)
name VARCHAR(100) NOT NULL
Deep Dive: Relationship Patterns
One-to-Many
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT NOT NULL REFERENCES customers(id)
);
CREATE TABLE order_items (
id INT PRIMARY KEY,
order_id INT NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
product_id INT NOT NULL,
quantity INT NOT NULL
);
Many-to-Many
CREATE TABLE enrollments (
student_id INT REFERENCES students(id) ON DELETE CASCADE,
course_id INT REFERENCES courses(id) ON DELETE CASCADE,
enrolled_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (student_id, course_id)
);
Self-Referencing
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
manager_id INT REFERENCES employees(id)
);
Polymorphic
CREATE TABLE comments (
id INT PRIMARY KEY,
content TEXT NOT NULL,
post_id INT REFERENCES posts(id),
photo_id INT REFERENCES photos(id),
CHECK (
(post_id IS NOT NULL AND photo_id IS NULL) OR
(post_id IS NULL AND photo_id IS NOT NULL)
)
);
CREATE TABLE comments (
id INT PRIMARY KEY,
content TEXT NOT NULL,
commentable_type VARCHAR(50) NOT NULL,
commentable_id INT NOT NULL
);
Deep Dive: NoSQL Design (MongoDB)
Embedding vs Referencing
| Factor | Embed | Reference |
|---|
| Access pattern | Read together | Read separately |
| Relationship | 1:few | 1:many |
| Document size | Small | Approaching 16MB |
| Update frequency | Rarely | Frequently |
Embedded Document
{
"_id": "order_123",
"customer": {
"id": "cust_456",
"name": "Jane Smith",
"email": "jane@example.com"
},
"items": [
{ "product_id": "prod_789", "quantity": 2, "price": 29.99 }
],
"total": 109.97
}
Referenced Document
{
"_id": "order_123",
"customer_id": "cust_456",
"item_ids": ["item_1", "item_2"],
"total": 109.97
}
MongoDB Indexes
db.users.createIndex({ email: 1 }, { unique: true });
db.orders.createIndex({ customer_id: 1, created_at: -1 });
db.articles.createIndex({ title: "text", content: "text" });
db.stores.createIndex({ location: "2dsphere" });
Deep Dive: Migrations
Migration Best Practices
| Practice | WHY |
|---|
| Always reversible | Need to rollback |
| Backward compatible | Zero-downtime deploys |
| Schema before data | Separate concerns |
| Test on staging | Catch issues early |
Adding a Column (Zero-Downtime)
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
UPDATE users SET phone = '' WHERE phone IS NULL;
ALTER TABLE users MODIFY phone VARCHAR(20) NOT NULL;
Renaming a Column (Zero-Downtime)
ALTER TABLE users ADD COLUMN email_address VARCHAR(255);
UPDATE users SET email_address = email;
ALTER TABLE users DROP COLUMN email;
Migration Template
BEGIN;
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
CREATE INDEX idx_users_phone ON users(phone);
COMMIT;
BEGIN;
DROP INDEX idx_users_phone ON users;
ALTER TABLE users DROP COLUMN phone;
COMMIT;
Deep Dive: Performance Optimization
Query Analysis
EXPLAIN SELECT * FROM orders
WHERE customer_id = 123 AND status = 'pending';
| Look For | Meaning |
|---|
| type: ALL | Full table scan (bad) |
| type: ref | Index used (good) |
| key: NULL | No index used |
| rows: high | Many rows scanned |
N+1 Query Problem
orders = db.query("SELECT * FROM orders")
for order in orders:
customer = db.query(f"SELECT * FROM customers WHERE id = {order.customer_id}")
results = db.query("""
SELECT orders.*, customers.name
FROM orders
JOIN customers ON orders.customer_id = customers.id
""")
Optimization Techniques
| Technique | When to Use |
|---|
| Add indexes | Slow WHERE/ORDER BY |
| Denormalize | Expensive JOINs |
| Pagination | Large result sets |
| Caching | Repeated queries |
| Read replicas | Read-heavy load |
| Partitioning | Very large tables |
Extension Points
- Database-Specific Patterns: Add MySQL vs PostgreSQL vs SQLite variations
- Advanced Patterns: Time-series, event sourcing, CQRS, multi-tenancy
- ORM Integration: TypeORM, Prisma, SQLAlchemy patterns
- Monitoring: Query performance tracking, slow query alerts