Database Schema Designer

Name: Database Schema Designer
Author: softaworks

This skill helps you design production-ready database schemas by applying best practices such as normalization, indexing, and foreign key constraints, with support for both SQL and NoSQL models. Describe your data model, its relationships, and your scale requirements, and you receive schema definitions, migration scripts, and optimization strategies. It suits database developers, system architects, and data engineers who want efficient, scalable, and maintainable database structures.

npx skills add https://github.com/softaworks/agent-toolkit --skill database-schema-designer

Database Schema Designer

Design production-ready database schemas with best practices built-in.

Quick Start

Just describe your data model:

design a schema for an e-commerce platform with users, products, orders

You'll get a complete SQL schema like:

CREATE TABLE users (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  email VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE orders (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  user_id BIGINT NOT NULL,
  total DECIMAL(10,2) NOT NULL,
  INDEX idx_orders_user (user_id),
  -- Table-level FK: MySQL 8.0 and earlier parse but ignore an inline REFERENCES on a column
  FOREIGN KEY (user_id) REFERENCES users(id)
);

What to include in your request:

Entities (users, products, orders)
Key relationships (users have orders, orders have items)
Scale hints (high-traffic, millions of records)
Database preference (SQL/NoSQL) - defaults to SQL if not specified

Triggers

Trigger	Example
`design schema`	"design a schema for user authentication"
`database design`	"database design for multi-tenant SaaS"
`create tables`	"create tables for a blog system"
`schema for`	"schema for inventory management"
`model data`	"model data for real-time analytics"
`I need a database`	"I need a database for tracking orders"
`design NoSQL`	"design NoSQL schema for product catalog"

Key Terms

Term	Definition
Normalization	Organizing data to reduce redundancy (1NF → 2NF → 3NF)
3NF	Third Normal Form - no transitive dependencies between columns
OLTP	Online Transaction Processing - write-heavy, needs normalization
OLAP	Online Analytical Processing - read-heavy, benefits from denormalization
Foreign Key (FK)	Column that references another table's primary key
Index	Data structure that speeds up queries (at cost of slower writes)
Access Pattern	How your app reads/writes data (queries, joins, filters)
Denormalization	Intentionally duplicating data to speed up reads

Quick Reference

Task	Approach	Key Consideration
New schema	Normalize to 3NF first	Domain modeling over UI
SQL vs NoSQL	Access patterns decide	Read/write ratio matters
Primary keys	INT or UUID	UUID for distributed systems
Foreign keys	Always constrain	ON DELETE strategy critical
Indexes	FKs + WHERE columns	Column order matters
Migrations	Always reversible	Backward compatible first

Process Overview

Your Data Requirements
    |
    v
+-----------------------------------------------------+
| Phase 1: ANALYSIS                                   |
| * Identify entities and relationships               |
| * Determine access patterns (read vs write heavy)   |
| * Choose SQL or NoSQL based on requirements         |
+-----------------------------------------------------+
    |
    v
+-----------------------------------------------------+
| Phase 2: DESIGN                                     |
| * Normalize to 3NF (SQL) or embed/reference (NoSQL) |
| * Define primary keys and foreign keys              |
| * Choose appropriate data types                     |
| * Add constraints (UNIQUE, CHECK, NOT NULL)         |
+-----------------------------------------------------+
    |
    v
+-----------------------------------------------------+
| Phase 3: OPTIMIZE                                   |
| * Plan indexing strategy                            |
| * Consider denormalization for read-heavy queries   |
| * Add timestamps (created_at, updated_at)           |
+-----------------------------------------------------+
    |
    v
+-----------------------------------------------------+
| Phase 4: MIGRATE                                    |
| * Generate migration scripts (up + down)            |
| * Ensure backward compatibility                     |
| * Plan zero-downtime deployment                     |
+-----------------------------------------------------+
    |
    v
Production-Ready Schema

Commands

Command	When to Use	Action
`design schema for {domain}`	Starting fresh	Full schema generation
`normalize {table}`	Fixing existing table	Apply normalization rules
`add indexes for {table}`	Performance issues	Generate index strategy
`migration for {change}`	Schema evolution	Create reversible migration
`review schema`	Code review	Audit existing schema

Workflow: Start with `design schema` → iterate with `normalize` → optimize with `add indexes` → evolve with `migration`

Core Principles

Principle	WHY	Implementation
Model the Domain	UI changes, domain doesn't	Entity names reflect business concepts
Data Integrity First	Corruption is costly to fix	Constraints at database level
Optimize for Access Pattern	Can't optimize for both	OLTP: normalized, OLAP: denormalized
Plan for Scale	Retrofitting is painful	Index strategy + partitioning plan

Anti-Patterns

Avoid	Why	Instead
VARCHAR(255) everywhere	Wastes storage, hides intent	Size appropriately per field
FLOAT for money	Rounding errors	DECIMAL(10,2)
Missing FK constraints	Orphaned data	Always define foreign keys
No indexes on FKs	Slow JOINs	Index every foreign key
Storing dates as strings	Can't compare/sort	DATE, TIMESTAMP types
SELECT * in queries	Fetches unnecessary data	Explicit column lists
Non-reversible migrations	Can't rollback	Always write DOWN migration
Adding NOT NULL without default	Breaks existing rows	Add nullable, backfill, then constrain

Verification Checklist

After designing a schema:

Every table has a primary key
All relationships have foreign key constraints
ON DELETE strategy defined for each FK
Indexes exist on all foreign keys
Indexes exist on frequently queried columns
Appropriate data types (DECIMAL for money, etc.)
NOT NULL on required fields
UNIQUE constraints where needed
CHECK constraints for validation
created_at and updated_at timestamps
Migration scripts are reversible
Tested on staging with production data
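
Part of this checklist can be automated. The sketch below checks one item, "Indexes exist on all foreign keys", against a SQLite database using Python's built-in sqlite3 module. The helper name `unindexed_foreign_keys` and the sample tables are assumptions made for illustration, and the PRAGMA calls are SQLite-specific; other databases expose the same information through their own catalog views. Composite foreign keys are only checked column by column.

```python
import sqlite3

def unindexed_foreign_keys(conn):
    """Return (table, column) pairs that have a FK but no index led by that column."""
    missing = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
    for table in tables:
        # Collect the leading column of every index on this table.
        leading = set()
        for idx in conn.execute(f'PRAGMA index_list("{table}")').fetchall():
            cols = conn.execute(f'PRAGMA index_info("{idx[1]}")').fetchall()
            if cols:
                leading.add(cols[0][2])
        # Column 3 of foreign_key_list is the referencing ("from") column.
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall():
            if fk[3] not in leading:
                missing.append((table, fk[3]))
    return missing

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
    CREATE TABLE order_items (
        id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id)
    );
    CREATE INDEX idx_orders_customer ON orders(customer_id);
""")

missing_fk_indexes = unindexed_foreign_keys(conn)
print(missing_fk_indexes)  # order_items.order_id has no index yet
```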

Normal Forms

Form	Rule	Violation Example
1NF	Atomic values, no repeating groups	product_ids = '1,2,3'
2NF	1NF + no partial dependencies	customer_name in order_items
3NF	2NF + no transitive dependencies	country derived from postal_code

1st Normal Form (1NF)

-- BAD: Multiple values in column
CREATE TABLE orders (
  id INT PRIMARY KEY,
  product_ids VARCHAR(255)  -- '101,102,103'
);
-- GOOD: Separate table for items
CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT
);
CREATE TABLE order_items (
  id INT PRIMARY KEY,
  order_id INT REFERENCES orders(id),
  product_id INT
);

2nd Normal Form (2NF)

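-- BAD: customer_name depends only on order_id, which is just part of the key
CREATE TABLE order_items (
  order_id INT,
  product_id INT,
  customer_name VARCHAR(100),  -- Partial dependency!
  PRIMARY KEY (order_id, product_id)
);
-- GOOD: Customer data in separate table, reached through orders.customer_id
CREATE TABLE customers (
  id INT PRIMARY KEY,
  name VARCHAR(100)
);
CREATE TABLE order_items (
  order_id INT,
  product_id INT,
  PRIMARY KEY (order_id, product_id)
);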

3rd Normal Form (3NF)

-- BAD: country depends on postal_code
CREATE TABLE customers (
  id INT PRIMARY KEY,
  postal_code VARCHAR(10),
  country VARCHAR(50)  -- Transitive dependency!
);
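-- GOOD: Separate postal_codes table
CREATE TABLE postal_codes (
  code VARCHAR(10) PRIMARY KEY,
  country VARCHAR(50)
);
CREATE TABLE customers (
  id INT PRIMARY KEY,
  postal_code VARCHAR(10) REFERENCES postal_codes(code)
);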

When to Denormalize

Scenario	Denormalization Strategy
Read-heavy reporting	Pre-calculated aggregates
Expensive JOINs	Cached derived columns
Analytics dashboards	Materialized views

-- Denormalized for performance
CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT,
  total_amount DECIMAL(10,2),  -- Calculated
  item_count INT               -- Calculated
);
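
Calculated columns are only useful if they stay in step with the rows they summarize. As a rough sketch of one way to do that, the example below uses a trigger in SQLite (via Python's sqlite3 module) to update the calculated columns whenever an item is inserted. The `order_items` columns, the trigger name, and the use of integer cents in place of DECIMAL (SQLite has no exact decimal type) are assumptions for illustration. A real schema would also need matching triggers for UPDATE and DELETE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        total_cents INTEGER NOT NULL DEFAULT 0,  -- calculated
        item_count INTEGER NOT NULL DEFAULT 0    -- calculated
    );
    CREATE TABLE order_items (
        id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id),
        quantity INTEGER NOT NULL,
        unit_price_cents INTEGER NOT NULL
    );
    -- Keep the calculated columns in step with order_items on insert.
    CREATE TRIGGER order_items_after_insert AFTER INSERT ON order_items
    BEGIN
        UPDATE orders
        SET total_cents = total_cents + NEW.quantity * NEW.unit_price_cents,
            item_count = item_count + 1
        WHERE id = NEW.order_id;
    END;
""")

conn.execute("INSERT INTO orders (id, customer_id) VALUES (1, 42)")
conn.executemany(
    "INSERT INTO order_items (order_id, quantity, unit_price_cents) VALUES (?, ?, ?)",
    [(1, 2, 1500), (1, 1, 999)],
)

total_cents, item_count = conn.execute(
    "SELECT total_cents, item_count FROM orders WHERE id = 1").fetchone()
print(total_cents, item_count)  # 3999 2
```

The read path now needs no JOIN or SUM, at the cost of extra work on every write.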

String Types

Type	Use Case	Example
CHAR(n)	Fixed length	State codes, ISO country codes
VARCHAR(n)	Variable length	Names, emails
TEXT	Long content	Articles, descriptions

-- Good sizing
email VARCHAR(255)
phone VARCHAR(20)
country_code CHAR(2)

Numeric Types

Type	Range	Use Case
TINYINT	-128 to 127	Age, status codes
SMALLINT	-32K to 32K	Quantities
INT	-2.1B to 2.1B	IDs, counts
BIGINT	Very large	Large IDs, timestamps
DECIMAL(p,s)	Exact precision	Money
FLOAT/DOUBLE	Approximate	Scientific data

-- ALWAYS use DECIMAL for money
price DECIMAL(10, 2)  -- $99,999,999.99
-- NEVER use FLOAT for money
price FLOAT  -- Rounding errors!
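
The rounding problem is easy to reproduce outside the database. This small Python sketch compares binary floating point with the standard library's `Decimal`, which behaves like a SQL DECIMAL column for this purpose. The amounts are arbitrary.

```python
from decimal import Decimal

# Binary floats cannot represent 0.10 or 0.20 exactly, so the sum drifts.
float_total = 0.10 + 0.20

# Decimal keeps exact base-10 digits, like DECIMAL(p,s) in SQL.
decimal_total = Decimal("0.10") + Decimal("0.20")

print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True
```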

Date/Time Types

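DATE        -- 2025-10-31
TIME        -- 14:30:00
DATETIME    -- 2025-10-31 14:30:00 (MySQL; no time zone conversion)
TIMESTAMP   -- MySQL: stored as UTC, converted to the session time zone on read
-- Always store in UTC
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-- ON UPDATE CURRENT_TIMESTAMP is MySQL syntax; other databases typically need a trigger
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP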

Boolean

-- PostgreSQL
is_active BOOLEAN DEFAULT TRUE
-- MySQL
is_active TINYINT(1) DEFAULT 1

When to Create Indexes

Always Index	Reason
Foreign keys	Speed up JOINs
WHERE clause columns	Speed up filtering
ORDER BY columns	Speed up sorting
Unique constraints	Enforced uniqueness

-- Foreign key index
CREATE INDEX idx_orders_customer ON orders(customer_id);
-- Query pattern index
CREATE INDEX idx_orders_status_date ON orders(status, created_at);

Index Types

Type	Best For	Example
B-Tree	Ranges, equality	price > 100
Hash	Exact matches only	email = 'x@y.com'
Full-text	Text search	MATCH AGAINST
Partial	Subset of rows	WHERE is_active = true

Composite Index Order

CREATE INDEX idx_customer_status ON orders(customer_id, status);
-- Uses index (customer_id first)
SELECT * FROM orders WHERE customer_id = 123;
SELECT * FROM orders WHERE customer_id = 123 AND status = 'pending';
-- Does NOT use index (status alone)
SELECT * FROM orders WHERE status = 'pending';

Rule: Most selective column first, or column most queried alone.
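
You can confirm the leftmost-prefix behavior by asking the database for its plan. The sketch below does this with SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module. The extra `total_cents` column and the `query_plan` helper are assumptions added for the demo, and the exact plan wording varies between SQLite versions, so the code only checks whether the index name appears in the plan. MySQL and PostgreSQL have their own EXPLAIN output, and some optimizers can use skip scans in special cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        status TEXT NOT NULL,
        total_cents INTEGER NOT NULL
    );
    CREATE INDEX idx_customer_status ON orders(customer_id, status);
""")

def query_plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text

plan_prefix = query_plan("SELECT * FROM orders WHERE customer_id = 123")
plan_both = query_plan(
    "SELECT * FROM orders WHERE customer_id = 123 AND status = 'pending'")
plan_status_only = query_plan("SELECT * FROM orders WHERE status = 'pending'")

for label, plan in [("customer_id", plan_prefix),
                    ("customer_id + status", plan_both),
                    ("status only", plan_status_only)]:
    print(f"{label}: uses index = {'idx_customer_status' in plan}")
```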

Index Pitfalls

Pitfall	Problem	Solution
Over-indexing	Slow writes	Only index what's queried
Wrong column order	Unused index	Match query patterns
Missing FK indexes	Slow JOINs	Always index FKs

Primary Keys

-- Auto-increment (simple)
id INT AUTO_INCREMENT PRIMARY KEY
-- UUID (distributed systems)
id CHAR(36) PRIMARY KEY DEFAULT (UUID())
-- Composite (junction tables)
PRIMARY KEY (student_id, course_id)

Foreign Keys

-- Pick one ON DELETE action per constraint
FOREIGN KEY (customer_id) REFERENCES customers(id)
  ON DELETE CASCADE     -- Delete children with parent
  ON UPDATE CASCADE     -- Update children when parent changes
-- Alternatives to ON DELETE CASCADE:
--   ON DELETE RESTRICT    -- Prevent deletion if referenced
--   ON DELETE SET NULL    -- Set to NULL when parent deleted

Strategy	Use When
CASCADE	Dependent data (order_items)
RESTRICT	Important references (prevent accidents)
SET NULL	Optional relationships
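
To see the difference between RESTRICT and CASCADE in practice, here is a minimal sketch using SQLite through Python's sqlite3 module. The table layout and row values are assumptions for illustration. Note that `PRAGMA foreign_keys = ON` is SQLite-specific: SQLite does not enforce foreign keys unless it is set, while most other databases enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for FK enforcement in SQLite
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customers(id) ON DELETE RESTRICT
    );
    CREATE TABLE order_items (
        id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL
            REFERENCES orders(id) ON DELETE CASCADE
    );
    INSERT INTO customers (id) VALUES (1);
    INSERT INTO orders (id, customer_id) VALUES (10, 1);
    INSERT INTO order_items (id, order_id) VALUES (100, 10), (101, 10);
""")

# RESTRICT: the customer still has an order, so the delete is rejected.
try:
    conn.execute("DELETE FROM customers WHERE id = 1")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True

# CASCADE: deleting the order removes its items as well.
conn.execute("DELETE FROM orders WHERE id = 10")
remaining_items = conn.execute(
    "SELECT COUNT(*) FROM order_items").fetchone()[0]

print(restrict_blocked, remaining_items)  # True 0
```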

Other Constraints

-- Unique
email VARCHAR(255) UNIQUE NOT NULL
-- Composite unique
UNIQUE (student_id, course_id)
-- Check
price DECIMAL(10,2) CHECK (price >= 0)
discount INT CHECK (discount BETWEEN 0 AND 100)
-- Not null
name VARCHAR(100) NOT NULL
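
Constraints are worth testing like any other code. The sketch below, using SQLite through Python's sqlite3 module, tries a few inserts against a hypothetical `products` table and records which ones the database rejects. The table, the `sku` column, the `try_insert` helper, and the integer-cents price are assumptions for the demo, not part of the schemas above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        sku TEXT NOT NULL UNIQUE,
        name TEXT NOT NULL,
        price_cents INTEGER NOT NULL CHECK (price_cents >= 0),
        discount INTEGER NOT NULL DEFAULT 0
            CHECK (discount BETWEEN 0 AND 100)
    )
""")

def try_insert(sku, name, price_cents, discount=0):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO products (sku, name, price_cents, discount) "
            "VALUES (?, ?, ?, ?)",
            (sku, name, price_cents, discount),
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("A-1", "Widget", 1999),            # valid row
    try_insert("A-1", "Duplicate", 500),          # UNIQUE violation on sku
    try_insert("A-2", "Negative", -1),            # CHECK (price_cents >= 0)
    try_insert("A-3", "Big discount", 500, 150),  # CHECK on discount range
    try_insert("A-4", None, 500),                 # NOT NULL violation on name
]
print(results)  # [True, False, False, False, False]
```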

One-to-Many

CREATE TABLE orders (
  id INT PRIMARY KEY,
  customer_id INT NOT NULL REFERENCES customers(id)
);
CREATE TABLE order_items (
  id INT PRIMARY KEY,
  order_id INT NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
  product_id INT NOT NULL,
  quantity INT NOT NULL
);

Many-to-Many

-- Junction table
CREATE TABLE enrollments (
  student_id INT REFERENCES students(id) ON DELETE CASCADE,
  course_id INT REFERENCES courses(id) ON DELETE CASCADE,
  enrolled_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (student_id, course_id)
);

Self-Referencing

CREATE TABLE employees (
  id INT PRIMARY KEY,
  name VARCHAR(100) NOT NULL,
  manager_id INT REFERENCES employees(id)
);
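
A self-referencing table is usually queried with a recursive common table expression. As an illustration, this sketch walks from one employee up through their managers, using SQLite through Python's sqlite3 module. The sample rows and the `management_chain` helper are made up for the example. `WITH RECURSIVE` is also available in PostgreSQL and MySQL 8.0+.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        manager_id INTEGER REFERENCES employees(id)
    );
    INSERT INTO employees (id, name, manager_id) VALUES
        (1, 'Ada', NULL),
        (2, 'Ben', 1),
        (3, 'Cy', 2),
        (4, 'Dee', 1);
""")

def management_chain(employee_id):
    """Return names from the employee up to the top of the hierarchy."""
    rows = conn.execute("""
        WITH RECURSIVE chain(id, name, manager_id, depth) AS (
            SELECT id, name, manager_id, 0 FROM employees WHERE id = ?
            UNION ALL
            SELECT e.id, e.name, e.manager_id, chain.depth + 1
            FROM employees e
            JOIN chain ON e.id = chain.manager_id
        )
        SELECT name FROM chain ORDER BY depth
    """, (employee_id,)).fetchall()
    return [row[0] for row in rows]

chain_for_cy = management_chain(3)
print(chain_for_cy)  # ['Cy', 'Ben', 'Ada']
```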

Polymorphic

-- Approach 1: Separate FKs (stronger integrity)
CREATE TABLE comments (
  id INT PRIMARY KEY,
  content TEXT NOT NULL,
  post_id INT REFERENCES posts(id),
  photo_id INT REFERENCES photos(id),
  CHECK (
    (post_id IS NOT NULL AND photo_id IS NULL) OR
    (post_id IS NULL AND photo_id IS NOT NULL)
  )
);
-- Approach 2: Type + ID (flexible, weaker integrity)
CREATE TABLE comments (
  id INT PRIMARY KEY,
  content TEXT NOT NULL,
  commentable_type VARCHAR(50) NOT NULL,
  commentable_id INT NOT NULL
);

Embedding vs Referencing

Factor	Embed	Reference
Access pattern	Read together	Read separately
Relationship	1:few	1:many
Document size	Small	Approaching 16MB
Update frequency	Rarely	Frequently

Embedded Document

{
  "_id": "order_123",
  "customer": {
    "id": "cust_456",
    "name": "Jane Smith",
    "email": "jane@example.com"
  },
  "items": [
    { "product_id": "prod_789", "quantity": 2, "price": 29.99 }
  ],
  "total": 59.98
}

Referenced Document

{
  "_id": "order_123",
  "customer_id": "cust_456",
  "item_ids": ["item_1", "item_2"],
  "total": 109.97
}

MongoDB Indexes

// Single field
db.users.createIndex({ email: 1 }, { unique: true });
// Composite
db.orders.createIndex({ customer_id: 1, created_at: -1 });
// Text search
db.articles.createIndex({ title: "text", content: "text" });
// Geospatial
db.stores.createIndex({ location: "2dsphere" });

Migration Best Practices

Practice	WHY
Always reversible	Need to rollback
Backward compatible	Zero-downtime deploys
Schema before data	Separate concerns
Test on staging	Catch issues early

Adding a Column (Zero-Downtime)

-- Step 1: Add nullable column
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
-- Step 2: Deploy code that writes to new column
-- Step 3: Backfill existing rows
UPDATE users SET phone = '' WHERE phone IS NULL;
-- Step 4: Make required (if needed)
ALTER TABLE users MODIFY phone VARCHAR(20) NOT NULL;
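
On a large table, a single UPDATE for the backfill in Step 3 can hold locks for a long time. One common alternative is to backfill in small batches and commit between them. The sketch below shows the idea with SQLite through Python's sqlite3 module. The `backfill_phone` helper, the batch size, and the sample rows are assumptions for illustration, and the right batch size depends on your database and load.

```python
import sqlite3

def backfill_phone(conn, batch_size=1000, default=""):
    """Fill NULL phone values in small batches; return the rows updated."""
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE users SET phone = ?
            WHERE id IN (
                SELECT id FROM users WHERE phone IS NULL LIMIT ?
            )
            """,
            (default, batch_size),
        )
        conn.commit()  # end the transaction so locks are released between batches
        if cur.rowcount == 0:
            return total
        total += cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, phone TEXT)")
conn.executemany(
    "INSERT INTO users (email, phone) VALUES (?, ?)",
    [("a@example.com", None), ("b@example.com", "555-0100"),
     ("c@example.com", None), ("d@example.com", None),
     ("e@example.com", "555-0101")],
)
conn.commit()

updated = backfill_phone(conn, batch_size=2)
remaining_nulls = conn.execute(
    "SELECT COUNT(*) FROM users WHERE phone IS NULL").fetchone()[0]
print(updated, remaining_nulls)  # 3 0
```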

Renaming a Column (Zero-Downtime)

-- Step 1: Add new column
ALTER TABLE users ADD COLUMN email_address VARCHAR(255);
-- Step 2: Deploy code that writes to both columns
-- Step 3: Copy existing data
UPDATE users SET email_address = email WHERE email_address IS NULL;
-- Step 4: Deploy code reading from new column
-- Step 5: Deploy code that stops writing to old column
-- Step 6: Drop old column
ALTER TABLE users DROP COLUMN email;

Migration Template

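-- Migration: YYYYMMDDHHMMSS_description.sql
-- Note: BEGIN/COMMIT makes DDL atomic in PostgreSQL; MySQL commits implicitly on each DDL statement
-- UP
BEGIN;
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
CREATE INDEX idx_users_phone ON users(phone);
COMMIT;
-- DOWN
BEGIN;
DROP INDEX idx_users_phone ON users;  -- MySQL syntax; PostgreSQL: DROP INDEX idx_users_phone;
ALTER TABLE users DROP COLUMN phone;
COMMIT;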
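
A cheap way to check that a migration is really reversible is a round trip: snapshot the schema, run UP, run DOWN, and compare. The sketch below does this against an in-memory SQLite database with Python's sqlite3 module. The migration name, the `MIGRATIONS` dictionary, and the helper functions are hypothetical, and the SQL uses SQLite syntax, so it differs slightly from the template above.

```python
import sqlite3

# Each migration pairs an UP script with the DOWN script that undoes it.
MIGRATIONS = {
    "20250101000000_add_users_phone_index": {
        "up": "CREATE INDEX idx_users_phone ON users(phone);",
        "down": "DROP INDEX idx_users_phone;",
    },
}

def schema_snapshot(conn):
    """Return every schema object as (type, name, sql), in a stable order."""
    return conn.execute(
        "SELECT type, name, sql FROM sqlite_master ORDER BY type, name"
    ).fetchall()

def run_migration(conn, name, direction):
    """Run the 'up' or 'down' script of the named migration."""
    conn.executescript(MIGRATIONS[name][direction])

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, phone TEXT)")

name = "20250101000000_add_users_phone_index"
before = schema_snapshot(conn)
run_migration(conn, name, "up")
after_up = schema_snapshot(conn)
run_migration(conn, name, "down")
after_down = schema_snapshot(conn)

is_reversible = before == after_down
print(is_reversible)  # True
```

A round trip only compares schema objects. Migrations that change or drop data need their own checks.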

Query Analysis

EXPLAIN SELECT * FROM orders
WHERE customer_id = 123 AND status = 'pending';

Look For	Meaning
type: ALL	Full table scan (bad)
type: ref	Index used (good)
key: NULL	No index used
rows: high	Many rows scanned

N+1 Query Problem

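# `db` stands in for your database client; the placeholder style (%s, ?) depends on the driver
# BAD: N+1 queries
orders = db.query("SELECT * FROM orders")
for order in orders:
    # Bind parameters instead of formatting values into the SQL string
    customer = db.query("SELECT * FROM customers WHERE id = %s", (order.customer_id,))
# GOOD: Single JOIN
results = db.query("""
    SELECT orders.*, customers.name
    FROM orders
    JOIN customers ON orders.customer_id = customers.id
""")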

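For a runnable version of the same comparison, the sketch below counts the SELECT statements each approach issues, using SQLite through Python's sqlite3 module and its `set_trace_callback` hook. The sample rows and the function names are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
    INSERT INTO customers (id, name) VALUES (1, 'Jane'), (2, 'Omar');
    INSERT INTO orders (id, customer_id) VALUES (10, 1), (11, 2), (12, 1);
""")

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement run

def names_n_plus_one():
    """One query for the orders, then one more query per order."""
    orders = conn.execute(
        "SELECT id, customer_id FROM orders ORDER BY id").fetchall()
    result = []
    for order_id, customer_id in orders:
        name = conn.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()[0]
        result.append((order_id, name))
    return result

def names_joined():
    """The same data in a single JOIN."""
    return conn.execute("""
        SELECT orders.id, customers.name
        FROM orders
        JOIN customers ON orders.customer_id = customers.id
        ORDER BY orders.id
    """).fetchall()

def count_selects(fn):
    """Run fn and return its rows plus the number of SELECTs it issued."""
    statements.clear()
    rows = fn()
    selects = sum(1 for s in statements if s.lstrip().upper().startswith("SELECT"))
    return rows, selects

slow_rows, slow_count = count_selects(names_n_plus_one)
fast_rows, fast_count = count_selects(names_joined)
print(slow_count, fast_count)  # 4 1
```

With three orders the loop issues four queries. With a million orders it would issue a million and one, while the JOIN still issues one.
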
Optimization Techniques

Technique	When to Use
Add indexes	Slow WHERE/ORDER BY
Denormalize	Expensive JOINs
Pagination	Large result sets
Caching	Repeated queries
Read replicas	Read-heavy load
Partitioning	Very large tables
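
For the pagination technique, one option that stays fast on large tables is keyset (cursor) pagination: instead of OFFSET, each page asks for rows after the last id already seen, which an index on the sort column can answer directly. This is a minimal sketch with SQLite through Python's sqlite3 module. The `fetch_page` helper, the page size, and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO orders (id, status) VALUES (?, 'pending')",
    [(i,) for i in range(1, 8)],  # ids 1..7
)

def fetch_page(after_id=0, page_size=3):
    """Return one page of rows plus the cursor to pass for the next page."""
    rows = conn.execute(
        "SELECT id, status FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor

page1, cursor1 = fetch_page()
page2, cursor2 = fetch_page(after_id=cursor1)
page3, cursor3 = fetch_page(after_id=cursor2)

print([row[0] for row in page1])  # [1, 2, 3]
print([row[0] for row in page2])  # [4, 5, 6]
print([row[0] for row in page3])  # [7]
```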

Extension Points

Database-Specific Patterns: Add MySQL vs PostgreSQL vs SQLite variations
Advanced Patterns: Time-series, event sourcing, CQRS, multi-tenancy
ORM Integration: TypeORM, Prisma, SQLAlchemy patterns
Monitoring: Query performance tracking, slow query alerts

GitHub Owner

Owner: softaworks

GitHub Links

Website: https://softaworks.com/

SKILL.md

name: database-schema-designer
description: Design robust, scalable database schemas for SQL and NoSQL databases. Provides normalization guidelines, indexing strategies, migration patterns, constraint design, and performance optimization. Ensures data integrity, query performance, and maintainable data models.
license: MIT

