Review SQL Migrations in 30 Seconds: Seed, Migrate, Compare

You review application code with tests. You review migrations by reading SQL and hoping for the best. Here's how to actually validate what a migration does before you approve it.

It's Wednesday morning. You have three PRs to review before standup. Two are application code — you check the tests, read the logic, leave a comment about a missing null check, approve the rest. The third PR adds a migration: ALTER TABLE users ALTER COLUMN email TYPE TEXT. You read the SQL. It looks fine. You approve.

On Thursday, the migration runs against production. The users table has 4 million rows. The column type change triggers a full table rewrite. That's not the problem — the team expected that. The problem is that 23,000 rows have email addresses longer than 254 characters stored in the old VARCHAR(255) column (a data entry bug from 2023 that nobody caught). When the application code that validates email length on read suddenly gets TEXT fields with no length constraint, downstream services start choking on payloads they weren't built for.

The migration was syntactically correct. The schema change made sense. But the effect of that migration on real data was something nobody looked at. You approved a diff. You didn't approve an outcome.

The Code Review Blind Spot

Code review has a well-established loop: read the change, check the tests, verify the behavior matches the intent. When someone adds a new endpoint, you can look at the test suite and see exactly what inputs produce what outputs. When someone refactors a service, the tests prove that existing behavior is preserved.

Migrations have none of this. A typical migration PR looks like:

-- migrations/20260226_consolidate_user_names.sql

ALTER TABLE users ADD COLUMN full_name VARCHAR(200);
UPDATE users SET full_name = first_name || ' ' || last_name;
ALTER TABLE users DROP COLUMN first_name;
ALTER TABLE users DROP COLUMN last_name;

As a reviewer, you read this and think: "Okay, concatenating first and last name into a single column. Makes sense." You check for obvious mistakes — missing WHERE clause, wrong column name, bad syntax. You approve.

What you didn't check:

  • Are there rows where first_name is NULL? The concatenation produces NULL || ' ' || 'Smith' = NULL. The full_name column now has NULLs where you expected names.

  • Are there rows where first_name or last_name contain leading or trailing spaces? You now have ' John ' || ' ' || 'Smith' = ' John  Smith', a name with stray whitespace baked in.

  • Does the combined length of first_name + ' ' + last_name ever exceed 200 characters? If it does, the UPDATE fails outright in PostgreSQL; some databases silently truncate instead.

These aren't edge cases. They're the normal state of any database that's been in production for more than six months. But you can't see them by reading SQL. You can only see them by running the migration against data that has the same shape as production.
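You can see this on your own machine with nothing but SQLite, whose || operator propagates NULL the same way PostgreSQL's does. This is an illustrative sketch, not the full workflow: a handful of production-shaped rows is enough to expose a bug that clean fixtures hide.

```python
import sqlite3

# Illustrative sketch: a tiny "production-shaped" users table in SQLite,
# whose || operator propagates NULL just like PostgreSQL's.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (first_name TEXT, last_name TEXT, full_name TEXT)")
conn.executemany(
    "INSERT INTO users (first_name, last_name) VALUES (?, ?)",
    [("John", "Smith"), (None, "Ngata"), ("  Ana ", "Lopez"), ("José", None)],
)

# The migration under review, run against messy data instead of read on a screen
conn.execute("UPDATE users SET full_name = first_name || ' ' || last_name")

nulls = conn.execute(
    "SELECT count(*) FROM users WHERE full_name IS NULL").fetchone()[0]
print(f"{nulls} rows lost their name")  # 2 of 4 rows, invisible in the SQL itself
```

Reading the UPDATE tells you nothing about those two rows; running it against varied data surfaces them immediately.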

What Reviewers Actually Need

When you review application code, you don't just read the source — you read the test output. The tests tell you: "Given this input, this is what happens." You're reviewing behavior, not just syntax.

Migration reviews should work the same way. What a reviewer actually needs is:

  1. Before state: What does the data look like right now?

  2. After state: What does the data look like after the migration?

  3. Diff: What changed, and does the change match the stated intent?

If a migration says "consolidate first and last name into full_name," the reviewer should be able to see a sample of actual rows before and after. Not imagine them. See them.

The problem is that this requires data. Your local development database has 12 rows, all with perfectly formatted names like "John Smith" and "Jane Doe." That data will never reveal the NULL concatenation bug, the trailing spaces, or the truncation. It's too clean, too small, and too synthetic.

The Workflow: Seed, Migrate, Compare

The fix is a three-step workflow that takes about 30 seconds of hands-on time:

Step 1: Seed Realistic Data

Start with a database that has your current schema (before the migration) and fill it with data that has realistic variety — NULLs, edge-case strings, varying lengths, the kind of messiness that accumulates in production.

# Seed users with realistic variation
seedfast seed --scope "500 users with varied names, some with NULL first or last names, international characters"

Seedfast generates data that reflects real-world patterns: some names are short, some are long, some fields are NULL, and Unicode characters show up where they would in production. This is the critical difference from hand-crafted test fixtures: you get the variety you didn't think to test for.

Step 2: Snapshot, Migrate, Snapshot

Capture the before state, run the migration, capture the after state.

# Snapshot the before state
psql $DATABASE_URL -c "SELECT first_name, last_name FROM users LIMIT 20" > before.txt

# Run the migration
psql $DATABASE_URL -f migrations/20260226_consolidate_user_names.sql

# Snapshot the after state
psql $DATABASE_URL -c "SELECT full_name FROM users LIMIT 20" > after.txt

Step 3: Inspect the Result

Now look at what actually happened:

# Check for NULLs that shouldn't be there
psql $DATABASE_URL -c "SELECT count(*) FROM users WHERE full_name IS NULL"

# Check for unexpected whitespace
psql $DATABASE_URL -c "SELECT full_name FROM users WHERE full_name LIKE '%  %' OR full_name LIKE ' %' LIMIT 10"

# Check for truncation
psql $DATABASE_URL -c "SELECT full_name, length(full_name) FROM users ORDER BY length(full_name) DESC LIMIT 5"

If the migration is correct, these queries return clean results. If it's not, you just caught a production bug from your laptop in under a minute.
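The compare step can also be scripted as an assertion. A minimal Python sketch against SQLite, checking one invariant from the workflow: a migration that only reshapes columns must not change row counts.

```python
import sqlite3

def table_counts(conn):
    # Row count per table: a cheap invariant to compare before and after
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    return {t: conn.execute(f"SELECT count(*) FROM {t}").fetchone()[0]
            for t in tables}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (first_name TEXT, last_name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("John", "Smith"), ("Jane", "Doe"), (None, "Ngata")])

before = table_counts(conn)
conn.executescript("""
    ALTER TABLE users ADD COLUMN full_name TEXT;
    UPDATE users SET full_name = first_name || ' ' || last_name;
""")
after = table_counts(conn)
assert before == after, "migration gained or lost rows"
print(before)  # {'users': 3}
```

The same snapshot-then-compare idea works with any database driver; SQLite is used here only to keep the sketch self-contained.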

Five Scenarios Where This Catches Real Bugs

1. NULL Concatenation

The migration:

UPDATE users SET full_name = first_name || ' ' || last_name

The bug: Any row where first_name or last_name is NULL produces a NULL full_name. In PostgreSQL, NULL || anything = NULL.

What the reviewer sees after seeding and running:

 count
-------
   47
(1 row)

47 users just lost their names. The fix is obvious once you see it:

UPDATE users SET full_name = COALESCE(first_name, '') || ' ' || COALESCE(last_name, '')

But you'd never catch this by reading the original SQL — because the SQL is syntactically valid and logically reasonable. The bug is in the data, not the code.
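Re-running the seeded check against the corrected statement closes the loop. A SQLite sketch; the TRIM is an addition beyond the one-line fix above, cleaning up the stray space left when one side is empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (first_name TEXT, last_name TEXT, full_name TEXT)")
conn.executemany(
    "INSERT INTO users (first_name, last_name) VALUES (?, ?)",
    [("John", "Smith"), (None, "Ngata"), ("José", None)],
)

# The corrected migration: COALESCE defuses NULL propagation; TRIM removes
# the stray space left when one side is empty.
conn.execute("""
    UPDATE users SET full_name =
        TRIM(COALESCE(first_name, '') || ' ' || COALESCE(last_name, ''))
""")

rows = [r[0] for r in conn.execute("SELECT full_name FROM users")]
print(rows)  # ['John Smith', 'Ngata', 'José']: no NULLs, no stray spaces
```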

2. NOT NULL Constraint on a Column With Existing NULLs

The migration:

ALTER TABLE orders ALTER COLUMN shipping_address SET NOT NULL

The reviewer reads this and thinks: "Good, shipping address should be required." They approve. In production, 8% of orders are digital-only — no shipping address. The migration fails on deploy.

After seeding 500 orders with realistic variety:

psql $DATABASE_URL -c "SELECT count(*) FROM orders WHERE shipping_address IS NULL"
 count
-------
    41
(1 row)

The reviewer now knows: this migration needs a backfill step first, or it needs to handle digital orders differently.

3. Column Type Change That Loses Data

The migration:

ALTER TABLE products ALTER COLUMN price TYPE INTEGER USING price::INTEGER

The intent: "We don't need decimal precision, let's simplify." The reviewer checks the cast syntax — looks correct. Approved.

After seeding:

  price
---------
  29.99
  14.50
  99.95
  7.25
  149.99

Every one of those prices loses its cents. PostgreSQL rounds on the cast, so $29.99 becomes $30; a database that truncates would make it $29. Either way, across thousands of products, financial reports will be off by thousands of dollars. The reviewer catches this instantly by looking at the data, not the DDL.
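A quick way to make the loss concrete is to compare an aggregate before and after the cast. An illustrative SQLite sketch; SQLite truncates on CAST where PostgreSQL rounds, but either way the cents, and the totals built on them, are gone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (price REAL)")
conn.executemany("INSERT INTO products VALUES (?)",
                 [(29.99,), (14.50,), (99.95,), (7.25,), (149.99,)])

before = conn.execute("SELECT SUM(price) FROM products").fetchone()[0]
# Simulate the type change; SQLite truncates, PostgreSQL would round
conn.execute("UPDATE products SET price = CAST(price AS INTEGER)")
after = conn.execute("SELECT SUM(price) FROM products").fetchone()[0]

print(round(before, 2), after)  # drift appears on just five rows
```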

4. DEFAULT Value That Doesn't Match Business Logic

The migration:

ALTER TABLE users ADD COLUMN role VARCHAR(20) NOT NULL DEFAULT 'user'

Seems fine. But suppose role information already lives somewhere else, say a legacy permissions table, and the seeded data reflects that distribution:

   role   | count
----------+-------
 user     |   412
 admin    |    23
 manager  |    65

After this migration, all existing rows in users have role = 'user', including the 23 admins and 65 managers from the legacy source. The DEFAULT applies to existing rows because the column is new, so every row gets the default, and the migration has effectively demoted every admin to a regular user.

The fix: add the column as nullable first, backfill from the authoritative source, then add the NOT NULL constraint.
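That three-step pattern can be rehearsed on seeded data before it touches production. A SQLite sketch, with a hypothetical legacy_roles table standing in for the authoritative source; SQLite can't add a NOT NULL constraint after the fact, so that final PostgreSQL step is noted in a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE legacy_roles (user_id INTEGER, role TEXT);
    INSERT INTO users VALUES (1, 'Ada'), (2, 'Ben'), (3, 'Cal');
    INSERT INTO legacy_roles VALUES (1, 'admin'), (2, 'manager');
""")

# Step 1: add the column nullable, with no default
conn.execute("ALTER TABLE users ADD COLUMN role TEXT")
# Step 2: backfill from the authoritative source, falling back to 'user'
conn.execute("""
    UPDATE users SET role = COALESCE(
        (SELECT role FROM legacy_roles WHERE user_id = users.id), 'user')
""")
# Step 3 (PostgreSQL): ALTER TABLE users ALTER COLUMN role SET NOT NULL

roles = dict(conn.execute("SELECT id, role FROM users"))
print(roles)  # {1: 'admin', 2: 'manager', 3: 'user'}
```

The admins stay admins, and only the rows with no authoritative role fall back to the default.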

5. Unique Constraint on Non-Unique Data

The migration:

ALTER TABLE customers ADD CONSTRAINT unique_email UNIQUE (email)

After seeding:

         email          | count
------------------------+-------
 john.smith@example.com |     3
 info@company.org       |     2

The migration will fail on deploy. Duplicate emails exist — maybe from a legacy import, maybe from a bulk account creation that didn't enforce uniqueness, maybe from a bug that was fixed years ago but left dirty data behind. The constraint is correct, but the data isn't ready for it.
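The precheck that would have caught this is a one-liner; the same GROUP BY ... HAVING pattern works on any SQL database. A SQLite sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("john.smith@example.com",)] * 3
    + [("info@company.org",)] * 2
    + [("solo@example.net",)],
)

# The precheck: any row returned here means the UNIQUE constraint will fail
dupes = conn.execute("""
    SELECT email, count(*) FROM customers
    GROUP BY email HAVING count(*) > 1
    ORDER BY count(*) DESC
""").fetchall()
print(dupes)  # [('john.smith@example.com', 3), ('info@company.org', 2)]
```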

Making It Repeatable: The Review Script

Wrap the workflow into a script that any reviewer can run:

#!/bin/bash
# review-migration.sh: validate a migration against realistic data
set -e

MIGRATION_FILE=$1
DATABASE_URL=${DATABASE_URL:-"postgres://dev:dev@localhost:5432/migration_review"}

if [ -z "$MIGRATION_FILE" ]; then
  echo "Usage: ./review-migration.sh <migration-file>"
  exit 1
fi

echo "=== Step 1: Reset database ==="
dropdb --if-exists migration_review
createdb migration_review

echo "=== Step 2: Apply base schema (every migration except the one under review) ==="
for f in migrations/*.sql; do
  if [ "$f" != "$MIGRATION_FILE" ]; then
    psql "$DATABASE_URL" -f "$f" 2>/dev/null || true
  fi
done

echo "=== Step 3: Seed realistic data ==="
seedfast seed --scope "Realistic data across all tables, include NULLs and edge cases" --output plain

echo "=== Step 4: Capture before state ==="
pg_dump "$DATABASE_URL" --data-only --inserts > /tmp/before_data.sql
psql "$DATABASE_URL" -c "\d+" > /tmp/before_schema.txt

echo "=== Step 5: Run migration ==="
echo "Running: $MIGRATION_FILE"
if ! psql "$DATABASE_URL" -f "$MIGRATION_FILE" 2>/tmp/migration_errors.txt; then
  echo "MIGRATION FAILED:"
  cat /tmp/migration_errors.txt
  exit 1
fi

echo "=== Step 6: Capture after state ==="
psql "$DATABASE_URL" -c "\d+" > /tmp/after_schema.txt

echo "=== Step 7: Show schema diff ==="
diff /tmp/before_schema.txt /tmp/after_schema.txt || true

echo "=== Step 8: List NOT NULL columns without defaults (likely backfill targets) ==="
psql "$DATABASE_URL" -c "
  SELECT table_name, column_name
  FROM information_schema.columns
  WHERE table_schema = 'public'
    AND is_nullable = 'NO'
    AND column_default IS NULL
  ORDER BY table_name, column_name;
"

echo "=== Done. Review the output above. ==="

Now any reviewer can run ./review-migration.sh migrations/20260226_consolidate_user_names.sql and see the actual effect of the migration before approving the PR.

CI/CD: Automated Migration Validation in PR Checks

Don't rely on reviewers remembering to run the script. Make it part of your PR pipeline.

name: Migration Review Check

on:
  pull_request:
    paths:
      - 'migrations/**'

jobs:
  validate-migration:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_DB: review
          POSTGRES_USER: review
          POSTGRES_PASSWORD: review
        ports:
          - 5432:5432

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Apply base schema (excluding this PR's migrations)
        run: |
          NEW_MIGRATIONS=$(git diff --name-only origin/main -- migrations/)
          for f in migrations/*.sql; do
            if echo "$NEW_MIGRATIONS" | grep -qx "$f"; then
              echo "Skipping new migration: $f"
            else
              psql $DATABASE_URL -f "$f" 2>/dev/null || true
            fi
          done
        env:
          DATABASE_URL: postgres://review:review@localhost:5432/review

      - name: Seed realistic data
        run: seedfast seed --scope "Realistic dataset across all tables, include NULLs and varied data" --output plain
        env:
          SEEDFAST_API_KEY: ${{ secrets.SEEDFAST_API_KEY }}
          DATABASE_URL: postgres://review:review@localhost:5432/review

      - name: Apply new migration(s)
        run: |
          NEW_MIGRATIONS=$(git diff --name-only origin/main -- migrations/ | sort)
          for f in $NEW_MIGRATIONS; do
            echo "--- Applying: $f ---"
            psql $DATABASE_URL -f "$f"
          done
        env:
          DATABASE_URL: postgres://review:review@localhost:5432/review

      - name: Validate post-migration state
        run: |
          echo "=== Row counts after migration ==="
          for table in $(psql $DATABASE_URL -t -c "SELECT tablename FROM pg_tables WHERE schemaname='public'"); do
            psql $DATABASE_URL -c "SELECT '$table' as table_name, count(*) as rows FROM $table"
          done

          echo "=== Column nullability overview ==="
          psql $DATABASE_URL -c "
            SELECT c.table_name, c.column_name, c.is_nullable
            FROM information_schema.columns c
            WHERE c.table_schema = 'public'
            ORDER BY c.table_name, c.ordinal_position;
          "
        env:
          DATABASE_URL: postgres://review:review@localhost:5432/review

When a PR adds a migration, this pipeline seeds realistic data, runs the migration, and reports the outcome. Reviewers get concrete evidence in the PR checks instead of relying on their ability to mentally execute SQL.

The Difference Between Performance Testing and Correctness Testing

There's an important distinction here. Testing migrations against production-scale data (millions of rows) answers the question: "Will this migration be fast enough?" That's about lock duration, rewrite time, and index build speed.

Testing migrations against production-shaped data (realistic variety, NULLs, edge cases, messy strings) answers a different question: "Will this migration produce correct results?" That's about data integrity, constraint violations, and silent data loss.

You don't need 5 million rows to catch a NULL concatenation bug. You need 500 rows with realistic NULL patterns. You don't need production scale to discover that a unique constraint will fail on duplicates. You need a dataset that contains duplicates.

This article is about the second kind of testing. It's about giving reviewers the ability to see what a migration does to data, not just how long it takes. Both are important. Both require realistic data. But they answer different questions, and they fit into different parts of the development process.

Performance testing happens before deploy. Correctness testing happens during code review.

A Migration Reviewer's Checklist

Before approving any PR that includes a schema migration:

Data integrity

  • Seed realistic data and run the migration locally (or check the CI results)

  • Verify no unexpected NULLs were introduced

  • Verify no data was silently truncated or rounded

  • Verify row counts are unchanged (unless rows are intentionally removed)

  • Check that new DEFAULT values make sense for existing rows, not just new ones

Constraint safety

  • If adding NOT NULL: confirm no existing rows violate the constraint

  • If adding UNIQUE: confirm no existing duplicates

  • If adding a FOREIGN KEY: confirm all referenced rows exist

  • If changing a column type: confirm all existing values fit the new type

Behavioral correctness

  • String operations handle NULLs (use COALESCE or explicit NULL checks)

  • Backfill queries produce correct results on a sample of real-shaped data

  • The migration is idempotent or has a working rollback

Review process

  • PR includes what the migration does AND why (not just the SQL)

  • If the migration is multi-step, each step's purpose is documented

  • CI migration validation check passed
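The constraint-safety items above can be consolidated into a single pre-flight that refuses to pass while any check finds offending rows. A hypothetical sketch: the table names and check definitions are illustrative, SQLite stands in for your PostgreSQL driver.

```python
import sqlite3

# Hypothetical pre-flight: each constraint-safety check is a query that must
# return zero. Table and check names are illustrative placeholders.
CHECKS = {
    "orders.shipping_address has NULLs":
        "SELECT count(*) FROM orders WHERE shipping_address IS NULL",
    "customers.email has duplicates":
        "SELECT count(*) FROM (SELECT email FROM customers "
        "GROUP BY email HAVING count(*) > 1)",
}

def preflight(conn):
    """Return the names of every check that found offending rows."""
    return [name for name, sql in CHECKS.items()
            if conn.execute(sql).fetchone()[0] > 0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (shipping_address TEXT)")
conn.execute("CREATE TABLE customers (email TEXT)")
conn.executemany("INSERT INTO orders VALUES (?)", [("1 Main St",), (None,)])
conn.executemany("INSERT INTO customers VALUES (?)", [("a@x.io",), ("a@x.io",)])
failures = preflight(conn)
print(failures)  # both checks fire on this data
```

Wiring a script like this into the CI job above turns the checklist from a reviewer habit into a gate.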

Stop Approving Migrations You Haven't Tested

We don't approve application code by reading the source and imagining what it does. We run the tests. We look at the output. We verify behavior.

Migrations deserve the same rigor. The SQL might look correct. The intent might be sound. But the only way to know what a migration does to your data is to run it against data that looks like yours.

Seed. Migrate. Compare. It takes 30 seconds and it catches the bugs that code review alone never will.