Your seed.sql Was Outdated the Day You Wrote It

The eternal maintenance burden of static seed files — and why your team quietly stopped running them months ago.

Every codebase has one. It might be called seed.sql, fixtures.sql, dev_data.sql, or testdata/init.sql. It lives somewhere in the repo, usually committed by a developer who left the company two years ago. It was the single most helpful file in the project for about three weeks. Then a migration landed, and it’s been a source of quiet frustration ever since.

If your team has a seed file that actually works on the current schema — without modifications, without commenting out lines, without someone saying “oh yeah, you have to run the migration first and then manually fix line 847” — you are in a vanishingly small minority. Congratulations.

For everyone else: this article is about the file you all know is broken and nobody wants to fix.

The Life and Death of seed.sql

The lifecycle is so predictable it could be a template:

Week 1: Creation. A motivated developer (usually someone onboarding) writes a seed file. It inserts users, orders, products — whatever the app needs to look populated. It works. The PR gets merged. The team is grateful. Local development is smooth. People actually use the app locally instead of staring at empty states.

-- seed.sql (v1, the golden age)
INSERT INTO users (id, name, email, created_at)
VALUES
  (1, 'Alice Johnson', 'alice@example.com', '2025-01-15'),
  (2, 'Bob Smith', 'bob@example.com', '2025-02-20'),
  (3, 'Carol Davis', 'carol@example.com', '2025-03-10');

INSERT INTO orders (id, user_id, total, status, created_at)
VALUES
  (1, 1, 99.99, 'completed', '2025-01-20'),
  (2, 1, 149.50, 'completed', '2025-02-15'),
  (3, 2, 75.00, 'pending', '2025-03-01');

INSERT INTO products (id, name, price, category)
VALUES
  (1, 'Widget Pro', 49.99, 'electronics'),
  (2, 'Gadget Plus', 29.99, 'electronics'),
  (3, 'Thingamajig', 19.99, 'accessories');

Week 3: First crack. A migration adds a NOT NULL column, role, to users. The seed file doesn’t include it. New developers run the seed, get an error, and ask in Slack. Someone replies “oh, just add role to the users insert and set it to 'user'.” Nobody updates the file.

ERROR:  null value in column "role" of relation "users" violates not-null constraint
DETAIL:  Failing row contains (1, Alice Johnson, alice@example.com, 2025-01-15, null)
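
The failure is easy to reproduce in isolation. The sketch below is a minimal illustration, assuming an in-memory SQLite database instead of PostgreSQL (so the error text differs from the one above) and a hypothetical post-migration users table. It runs the week-1 insert against the week-3 schema.

```python
import sqlite3


def run_stale_seed() -> str:
    """Apply a v1-era INSERT to a schema that has since gained a NOT NULL column."""
    conn = sqlite3.connect(":memory:")
    # Assumed shape of users after the week-3 migration: `role` is now required.
    conn.execute(
        "CREATE TABLE users ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " email TEXT NOT NULL,"
        " created_at TEXT NOT NULL,"
        " role TEXT NOT NULL)"
    )
    try:
        # The seed statement exactly as written in week 1: no `role` column.
        conn.execute(
            "INSERT INTO users (id, name, email, created_at) "
            "VALUES (1, 'Alice Johnson', 'alice@example.com', '2025-01-15')"
        )
    except sqlite3.IntegrityError as exc:
        # SQLite reports the violated constraint and the offending column.
        return str(exc)
    finally:
        conn.close()
    return "ok"


if __name__ == "__main__":
    print(run_stale_seed())
```

Nothing in the seed file changed between week 1 and week 3. Only the schema moved, which is the whole problem.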

Month 2: The patch. Someone gets frustrated enough to fix it. They add the missing column. They also notice that orders now has a shipping_address_id foreign key to a new addresses table. They add an addresses insert block. The PR is 200 lines of SQL changes for a file that was supposed to be “set and forget.” It passes review because nobody wants to think about it too hard.

Month 4: The second break. The products table was renamed to catalog_items as part of a domain modeling cleanup. The seed file still references products. Someone opens an issue. The issue sits in the backlog for six weeks because it’s not a production bug, it’s “just” developer experience.

Month 6: The workaround. The seed file has broken twice in two months. A senior developer wraps it in a script:

#!/bin/bash
# run-seed.sh — "best effort" seeding
set -e  # just kidding
psql $DATABASE_URL < seed.sql 2>/dev/null || echo "Seed had errors (this is normal)"

The || echo is doing a lot of heavy lifting there. “This is normal” is doing even more.

Month 9: Abandonment. The README still says “Run ./run-seed.sh to populate your local database.” New developers try it. It fails silently on half the tables. They ask in Slack. Someone says “I just use the staging database” or “I manually insert what I need.” The seed file is effectively dead. It exists in the repo. Nobody deletes it — that would require acknowledging the problem. Nobody fixes it — that would require ongoing commitment. It just sits there, a monument to good intentions.

Month 12: The zombie. A new developer finds the seed file, spends two hours fixing it for the current schema, opens a PR, and the cycle begins again.

Why It Always Drifts

The fundamental tension is simple: your schema changes constantly, but your seed file is static.

Consider what happens during a typical sprint. A developer adds a phone_number column to users. Another developer creates a user_preferences table with a foreign key to users. A third developer changes orders.status from a text field to an enum type. A fourth developer adds a check constraint that orders.total must be positive.

Each of these changes is small. Each migration is tested. Each PR is reviewed. And none of them update the seed file, because why would they? The seed file isn’t part of the feature. It’s not in the test suite. It’s not in the CI pipeline (or if it is, it was removed six months ago because it kept breaking the build).

The result is that the seed file drifts from the schema at the exact rate that your team ships features. The more productive your team is, the faster the seed file becomes useless.
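
Drift of this kind can at least be detected mechanically. The following is a rough sketch, not a real tool: it assumes SQLite, assumes every seed statement uses the simple `INSERT INTO table (col, ...)` form shown earlier, and the function name `find_drift` is made up for illustration. It compares the columns a seed file mentions with the columns the live schema actually has.

```python
import re
import sqlite3

# Matches "INSERT INTO <table> (<col>, <col>, ...)"; more exotic SQL is not handled.
INSERT_RE = re.compile(r"INSERT\s+INTO\s+(\w+)\s*\(([^)]*)\)", re.IGNORECASE)


def find_drift(conn: sqlite3.Connection, seed_sql: str) -> list[str]:
    """List the ways a seed file disagrees with the current schema."""
    problems = []
    for table, column_list in INSERT_RE.findall(seed_sql):
        seed_cols = {col.strip() for col in column_list.split(",")}
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        info = conn.execute(f"PRAGMA table_info({table})").fetchall()
        if not info:
            problems.append(f"{table}: table does not exist")
            continue
        live_cols = {row[1] for row in info}
        for col in sorted(seed_cols - live_cols):
            problems.append(f"{table}.{col}: column no longer exists")
        for _cid, name, _type, notnull, default, pk in info:
            # A required column with no default must appear in the INSERT.
            if notnull and default is None and not pk and name not in seed_cols:
                problems.append(f"{table}.{name}: required column missing from seed")
    return problems
```

A check like this tells you that the file has drifted. It does nothing to repair it, so the maintenance burden stays exactly where it was.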

Nobody Owns It

This is the human problem underneath the technical one. Who is responsible for seed.sql?

Not the developer who wrote it — they moved to another team. Not the developer who added the new column — they’re shipping features, not maintaining test infrastructure. Not the tech lead — they have 40 other things to worry about. Not DevOps — it’s application-level data, not infrastructure.

Seed files are communal property, and communal property is everyone’s responsibility and therefore nobody’s. The same thing that happens to shared kitchen spaces in offices happens to seed files in repos: slow, inevitable decay until someone snaps and does a deep clean. Except with seed files, nobody snaps. They just route around the damage.

Migrations Are One-Way

There’s a deeper structural issue: migrations transform schema forward in time, but seed files are frozen in the past. Your migration system knows how to get from schema version 47 to version 48. It doesn’t know how to update the test data that was valid at version 47 to also be valid at version 48.

Some teams try to solve this by running seed files through the migration system — seeding at version 1, then migrating up. This works exactly until your first breaking migration, which is usually the third or fourth one. Then you need to version your seed files alongside your migrations, which means maintaining parallel histories of schema changes and data changes. Nobody does this for long.

The Hidden Cost

The seed file seems like a small thing. It’s a convenience file for local development. How much damage can a broken convenience file really do?

More than you’d think.

Onboarding Delay

A new developer joins your team. The README says to clone the repo, run migrations, and run the seed file. The seed file fails. The new developer doesn’t know if the failure is expected, if their local setup is wrong, or if they did something out of order. They spend an hour debugging before asking for help. A senior developer spends 30 minutes walking them through the workaround.

Multiply this by every new developer, every quarter. Now multiply by the morale cost: the new person’s first experience with the codebase is discovering that the documented setup doesn’t work. That’s not a great first impression of your engineering culture.

Broken Local Development

Without working seed data, local development means staring at empty states. The dashboard shows “No data found.” The list views are empty. The search returns nothing. The graph components render a flat line.

Developers start creating data manually through the UI, which takes ten minutes every time they reset their database. Or they stop resetting their database, which means their local state diverges from everyone else’s. Or they just develop against staging, which has its own problems (shared state, slow connections, risk of interfering with QA).

The empty local database is a productivity drain that’s hard to quantify because it’s spread across every developer, every day, in small increments. Five minutes here to create a test user. Ten minutes there to set up an order with the right status. Twenty minutes to create the specific data configuration needed to test a new feature. It adds up to hours per developer per week.

CI Failures

If your CI pipeline includes a seeding step (it should), a broken seed file means broken builds. The options are:

1. Fix the seed file every time it breaks. This works, but it means someone is on permanent seed-file duty, patching SQL after every migration.
2. Remove the seeding step from CI. This is what most teams actually do. The CI pipeline now tests against an empty database, which misses entire categories of bugs.
3. Make the seeding step non-fatal. The || true approach. The seed runs, fails halfway, inserts data into some tables but not others, and the test suite runs against an inconsistent partial dataset. This is arguably worse than an empty database, because the failures are intermittent and hard to diagnose.
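
If a static seed has to stay in CI for now, the partial-dataset outcome is avoidable by making the seed all-or-nothing. This is a minimal sketch, assuming SQLite and a hypothetical helper named `seed_atomically`. It runs every statement in one transaction and rolls everything back on the first error.

```python
import sqlite3


def seed_atomically(conn: sqlite3.Connection, statements: list[str]) -> None:
    """Run all seed statements in one transaction; keep nothing if any of them fails."""
    try:
        # As a context manager the connection commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            for statement in statements:
                conn.execute(statement)
    except sqlite3.Error as exc:
        raise RuntimeError(f"seed aborted, nothing was inserted: {exc}") from exc
```

The build still fails when the seed is stale, but it fails loudly and leaves an empty database rather than half of one. With psql, the --single-transaction flag combined with ON_ERROR_STOP gives similar behavior.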

The “Just Comment It Out” Culture

The most corrosive effect of a broken seed file is cultural. When developers learn that the seed file is unreliable, they develop a reflexive distrust of all shared data tooling. Suggestions to invest in better seeding infrastructure are met with “we tried that, it didn’t work.” Proposals for data-dependent integration tests are rejected with “those will just break when the seed file drifts.”

The broken seed file becomes a learned helplessness that prevents the team from investing in the thing they actually need.

The Coping Mechanisms

Teams develop creative ways to live with broken seed files. All of them are worse than fixing the root cause.

The Optional Seed

## Local Setup
1. Run `make migrate`
2. (Optional) Run `make seed` to populate test data

When the seed step is “optional,” it means “broken.” Nobody makes a working tool optional. You don’t see (Optional) Run the compiler in setup docs. The word “optional” is a signal that the team knows it doesn’t work reliably and has decided to make that someone else’s problem.

The Try-Catch Wrapper

# seed.py
for table in ["users", "orders", "products", "categories"]:
    try:
        run_sql(f"seed_{table}.sql")
    except Exception as e:
        print(f"Warning: {table} seed failed ({e}), continuing...")

Every error is swallowed. Half the tables succeed, half don’t. The developer doesn’t know which half. The local database has users but no orders, products but no categories. The app technically runs but half the features are untestable. Nobody investigates the warnings because there are always warnings.
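
Finding out which half failed does not take much. The sketch below is one possible check, assuming SQLite and a hand-maintained list of tables the app expects to be populated (the helper name `empty_tables` is hypothetical). It reports every expected table that is missing or empty after the seed has run.

```python
import sqlite3


def empty_tables(conn: sqlite3.Connection, expected_tables: list[str]) -> list[str]:
    """Return the expected tables that are missing or contain no rows."""
    existing = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    missing_or_empty = []
    for table in expected_tables:
        if table not in existing:
            missing_or_empty.append(table)
            continue
        # Table names come from our own list, not from user input.
        count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        if count == 0:
            missing_or_empty.append(table)
    return missing_or_empty
```

Exiting non-zero when the returned list is non-empty turns "there are always warnings" into a single actionable failure.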

The Versioned Seed

seeds/
  v1_initial.sql
  v2_add_roles.sql
  v3_add_addresses.sql
  v4_rename_products.sql
  v5_add_preferences.sql

This is the most disciplined approach, and it’s also the most labor-intensive. Every migration that affects seeded tables requires a corresponding seed update. In practice, this means the developer writing the migration now has two files to update and test — the migration and the seed delta. Compliance drops rapidly after the first month.

The Per-Developer Seed

Eventually, developers start maintaining their own personal seed files. Each one tailored to the features they work on. None of them complete. All of them incompatible with each other. The team now has N different versions of local state, where N is the number of developers.

“Works on my machine” takes on a new meaning when every machine has different data.

The Fundamental Problem

All of these failures stem from one root cause: static data cannot keep up with a dynamic schema.

A seed file is a snapshot. It captures the shape of your data at a single point in time. The moment your schema evolves — which it does constantly, because that’s what healthy software projects do — the snapshot is stale.

This isn’t a discipline problem. It’s not something that can be solved by “just keeping the seed file up to date” any more than you can solve clock drift by “just checking your watch more often.” The problem is structural: you’re using a static artifact to describe a moving target.

The fix isn’t a better seed file. The fix is eliminating the seed file entirely.

The Alternative: Read the Schema, Generate the Data

What if your seeding tool read your current schema every time it ran?

Not a file that was written six months ago. Not a snapshot that assumed the products table still exists. Not a script that hardcodes column names. The actual, current, live schema — with every column, constraint, foreign key, and enum that exists right now, at this moment.

seedfast seed

That’s it. No file to maintain. No columns to add after migrations. No foreign keys to wire up manually. Seedfast connects to your database, reads the schema as it exists today, and generates data that fits.

When a migration adds a NOT NULL column next week, Seedfast sees it next time it runs. When a table is renamed, Seedfast uses the new name. When a foreign key is added, Seedfast generates parent rows before child rows. When an enum type gains a new value, Seedfast includes it in the distribution.

There is no drift because there is no static artifact to drift.
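
To make the idea concrete, here is a toy version of "read the schema, then generate". It is emphatically not how Seedfast works internally, which this article does not describe. It assumes SQLite, and the helpers `placeholder_for` and `seed_table` are invented for illustration. They ignore foreign keys, enums, check constraints, and date formats.

```python
import sqlite3


def placeholder_for(col_type: str, row_number: int, column_name: str):
    """Pick a crude value based on the declared column type."""
    col_type = (col_type or "").upper()
    if "INT" in col_type:
        return row_number
    if any(token in col_type for token in ("REAL", "FLOA", "DOUB", "NUM", "DEC")):
        return float(row_number)
    # Everything else gets a unique string such as "email_3".
    return f"{column_name}_{row_number}"


def seed_table(conn: sqlite3.Connection, table: str, rows: int) -> list[str]:
    """Insert `rows` generated rows, using whatever columns the table has right now."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    if not columns:
        raise ValueError(f"unknown table: {table}")
    names = [col[1] for col in columns]
    column_sql = ", ".join(f'"{name}"' for name in names)
    placeholders = ", ".join("?" for _ in names)
    for i in range(1, rows + 1):
        values = [placeholder_for(col[2], i, col[1]) for col in columns]
        conn.execute(
            f'INSERT INTO "{table}" ({column_sql}) VALUES ({placeholders})', values
        )
    return names
```

Because the column list is read at run time, a migration that adds a required column changes what gets inserted without anyone editing a file.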

Scope Instead of SQL

Instead of writing SQL inserts, you describe what you need in plain English:

# Instead of maintaining 500 lines of INSERT statements
seedfast seed --scope "seed 1,000 users with orders, payments, and support tickets"

Seedfast reads your schema, builds a dependency graph, proposes a plan, and seeds. The scope description works today and will work next month, because it references concepts (“users with orders”) rather than column names (user_id INTEGER NOT NULL REFERENCES users(id)).

When your schema changes, the same scope produces different data — data that matches the new schema. The command is the same. The intent is the same. The output adapts automatically.
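
The dependency-graph step is the part that guarantees parent rows exist before child rows. A minimal sketch of that idea follows, assuming SQLite and the standard-library graphlib module (Python 3.9+). The function name `insertion_order` is hypothetical, and this is an illustration of the technique rather than Seedfast's own planner.

```python
import sqlite3
from graphlib import TopologicalSorter


def insertion_order(conn: sqlite3.Connection) -> list[str]:
    """Order tables so that every table comes after the tables it references."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    sorter = TopologicalSorter()
    for table in tables:
        # PRAGMA foreign_key_list rows:
        # (id, seq, table, from, to, on_update, on_delete, match)
        parents = {row[2] for row in conn.execute(f"PRAGMA foreign_key_list({table})")}
        parents.discard(table)  # a self-reference does not constrain the order
        sorter.add(table, *parents)
    # Raises graphlib.CycleError if two tables reference each other.
    return list(sorter.static_order())
```

For a schema where payments references orders and orders references users, the result places users first and payments last, whatever order the tables were created in.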

What This Looks Like in Practice

Before (the seed.sql lifecycle):

1. Developer writes seed.sql (2 hours)
2. Works for 3 weeks
3. Migration breaks it (5 minutes to discover, 30 minutes to fix)
4. Works for 2 weeks
5. Another migration breaks it (someone files an issue)
6. Issue sits in backlog for 6 weeks
7. New developer fixes it (1 hour)
8. Works for 1 week
9. Two migrations land in the same sprint, seed file breaks in multiple places
10. Someone wraps it in || true
11. Team stops using it
12. Repeat from step 7 every few months

Cumulative time: dozens of hours per year. Effective uptime: maybe 40%.

After (seedfast):

# In your README
seedfast seed --scope "seed realistic data for all tables"

# In CI
seedfast seed --scope "seed 1,000 users with orders" --output plain

There are no steps 2 through 12. The command works after every migration because it reads the current schema. Nobody maintains it. Nobody patches it. Nobody wraps it in error-swallowing scripts.

In CI/CD

The seed file in CI is where the pain compounds, because CI failures block everyone:

# Before: fragile, breaks every few sprints
- name: Seed test database
  run: psql $DATABASE_URL < seed.sql  # fingers crossed

# After: reads current schema every time
- name: Seed test database
  run: seedfast seed --scope "seed 5,000 users with orders and payments" --output plain
  env:
    SEEDFAST_API_KEY: ${{ secrets.SEEDFAST_API_KEY }}
    DATABASE_URL: ${{ secrets.DATABASE_URL }}

The --scope flag makes it non-interactive. Table skipping makes it idempotent. If the tables already have data, Seedfast skips them. Safe to re-run on every build.
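
For intuition, "skip tables that already have data" can be sketched in a few lines. This assumes the rule is simply "leave any non-empty table alone". The article does not spell out how Seedfast decides, so treat the helper `seed_if_empty` as a hypothetical illustration written against SQLite.

```python
import sqlite3


def seed_if_empty(conn: sqlite3.Connection, table: str, rows: list[tuple]) -> int:
    """Insert rows only when the table is empty; return how many were inserted."""
    count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    if count > 0 or not rows:
        return 0  # already populated (or nothing to do): leave the table untouched
    placeholders = ", ".join("?" for _ in rows[0])
    with conn:  # commit on success, roll back on error
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return len(rows)
```

Calling it twice with the same rows inserts them once and then does nothing, which is the property that makes a seeding step safe to re-run on every build.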

For Onboarding

The before-and-after for new developers is dramatic:

Before: Clone repo. Run migrations. Run seed. Seed fails. Ask Slack. Wait for response. Get workaround. Apply workaround. Half the data loads. Manually create the rest. Time: 1-3 hours.

After: Clone repo. Run migrations. Run seedfast seed. Time: 2 minutes.

No debugging. No Slack. No workarounds. The database is populated with realistic data that matches the current schema. The new developer sees a populated dashboard on their first day, not an empty state with a TODO comment.

The Uncomfortable Truth

Your seed.sql isn’t broken because your team is lazy. It’s broken because the premise is flawed. Asking a static file to keep up with a dynamic schema is asking for perpetual maintenance — and perpetual maintenance of non-production tooling is exactly the kind of work that gets deprioritized, postponed, and eventually abandoned.

The teams that have working seed files are the ones spending real engineering time maintaining them. That time could be spent on features, on tests, on the product. Maintaining seed files is not a valuable use of engineering time. It’s a tax you pay because the tool requires it.

Stop paying the tax. Delete the file. Let the schema speak for itself.
