Seed the Parent Branch, Not Every PR: Neon Branching Meets Seed Data
By the Seedfast team
Neon branches copy their parent in milliseconds. That property either saves you from the seeding problem entirely or drops a subtler one in your lap. This is how to tell which situation you're in and what to do in each.
Key Takeaways
- Neon branches inherit their parent's data via copy-on-write, so the seeding question is not "how do I seed every branch" but "how do I seed the parent well enough that branches come up populated"
- Seed the parent once, let every preview branch inherit — that turns per-PR seeding into an O(1) cost, and most teams stop there
- The pattern breaks on two kinds of branch only: schema-drift PRs (a migration adds a NOT NULL column the inherited rows don't satisfy) and schema-only branches (no data copied at all)
- For both exceptions, regenerate the data from the live schema on that branch: a static seed.sql can't describe the drifted schema, and a hand-written seed.ts has to change with every migration
- Seedfast reads the branch's schema on each run and produces a valid, connected dataset from a plain-English scope; the same CLI handles parent-seed, drift-reseed, and schema-only-branch with one command
You provisioned Neon for the branching. Every pull request gets a fresh, full-fat database in about a second, copied from the parent branch, ready for CI. The catch is that "copied from the parent" only solves the seeding problem if the parent is actually seeded, and only holds while the branch schema still matches what was in the parent. This guide walks through three seed data patterns for Neon branching that together span the full workflow: parent-once, drift-reseed, and schema-only rescue.
If you haven't read the fundamentals yet, how to seed a Neon database covers the raw SQL, Prisma, and Drizzle mechanics that this article builds on. This one is specifically about how branching changes the seeding model.
How Neon branching actually copies data
Neon's branches are copy-on-write. When you create a branch from main, the child branch points at the same storage as the parent — no bytes are duplicated until one side writes. Reads against the child see the parent's data instantly. Writes on either side diverge from that moment forward.
That model is the whole reason branching is fast. Neon's own numbers put branch creation at roughly one second regardless of database size, because no copying happens up front. You inherit one terabyte or one row at the same cost.
Two flags at branch creation decide how much you inherit:
- Full branch — copy schema and data. This is the default. Preview branches for PRs usually want this.
- Schema-only branch — copy the schema, leave the data out. You'd pick this when the parent contains sensitive data you can't replicate to every PR: production-like staging, a healthcare dataset, anything covered by an existing compliance lane.
Full branches make the seeding problem disappear for most PRs. Schema-only branches reintroduce it in a narrower, but harder, form. Most teams hit both.
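As a rough sketch of the two options above, the function below wraps the Neon CLI. It assumes neonctl is installed, that NEON_API_KEY is set in the environment, and that `branches create` accepts `--project-id`, `--name`, `--parent`, and `--schema-only` as written; confirm against `neonctl branches create --help` for your CLI version. The function name and arguments are illustrative, not part of any tool.

```sh
# Sketch: create a preview branch with the Neon CLI (flags are assumptions,
# verify them with `neonctl branches create --help`).
# Usage: create_preview_branch <project_id> <branch_name> <parent> [schema-only]
create_preview_branch() {
  project_id=$1
  branch_name=$2
  parent=$3
  mode=${4:-full}

  if [ "$mode" = "schema-only" ]; then
    # Structure only: the branch comes up with empty tables and needs seeding.
    neonctl branches create --project-id "$project_id" \
      --name "$branch_name" --parent "$parent" --schema-only
  else
    # Default: schema and data are inherited copy-on-write from the parent.
    neonctl branches create --project-id "$project_id" \
      --name "$branch_name" --parent "$parent"
  fi
}
```

Calling `create_preview_branch "$NEON_PROJECT_ID" preview/pr-42 main` would give the full-branch behaviour; appending `schema-only` as a fourth argument switches to the empty-tables case covered in Pattern 3.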
Pattern 1: Seed the parent once, let preview branches inherit
The core insight for seeding under Neon branching is that you almost never need to seed a branch. You seed the parent (usually main, dev, or a dedicated seed branch) and every branch created from it comes up populated.
# Seed the parent once, from your local machine or a manual CI job
seedfast connect
seedfast seed --scope "realistic e-commerce: 3 orgs, 200 users, 1,000 orders across 50 products"
Now the preview workflow has no seeding step at all:
# .github/workflows/preview.yml
name: Preview branch per PR
on:
  pull_request:
jobs:
  preview:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Create Neon branch from main
        id: neon
        uses: neondatabase/create-branch-action@v6
        with:
          project_id: ${{ secrets.NEON_PROJECT_ID }}
          branch_name: preview/pr-${{ github.event.pull_request.number }}
          parent_branch: main
          database: ${{ secrets.NEON_DATABASE_NAME }}
          role: ${{ secrets.NEON_ROLE_NAME }}
          api_key: ${{ secrets.NEON_API_KEY }}
      - name: Run migrations on the new branch
        run: npx prisma migrate deploy
        env:
          DIRECT_URL: ${{ steps.neon.outputs.db_url }}
      - name: Run E2E tests
        run: npm run test:e2e
        env:
          DATABASE_URL: ${{ steps.neon.outputs.db_url_pooled }}
Set the database and role inputs when your Neon project uses non-default names. Without them the action falls back to neondb / neondb_owner, and if those don't exist on your project the output connection strings carry credentials Postgres will reject. That misconfiguration stays silent until the test step tries to connect. Beyond that, the workflow has no seed step: the branch is already populated. A typical preview pipeline runs in the time it takes to deploy the app, and the database is the cheap part.
This pattern works until one of two things happens: a PR introduces a migration that changes the schema in ways the inherited rows can't satisfy, or your security review forces schema-only branching. Both cases are the rest of this article.
Pattern 2: Schema-drift branches: reseed only on the PRs that changed the schema
Here's the failure mode. A developer opens a PR that adds ALTER TABLE orders ADD COLUMN fulfillment_provider TEXT NOT NULL. The branch is created from main. The 1,000 inherited orders have no fulfillment_provider. The migration fails — or worse, it doesn't fail because someone added a default, and now your tests are running against orders where every provider is the same literal string.
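The quieter half of that failure (a default that backfills every inherited row with one literal) can be checked for after migrations run. The sketch below assumes psql is available and that the table and column names come from a trusted source, since they are interpolated into the query unquoted; `distinct_values` and `warn_if_backfilled` are hypothetical helper names.

```sh
# Sketch: flag a column whose inherited rows all share one value.
# Usage: distinct_values <dsn> <table> <column>
distinct_values() {
  # --tuples-only and --no-align make psql print just the number.
  psql "$1" --no-psqlrc --tuples-only --no-align \
    --command "SELECT count(DISTINCT $3) FROM $2"
}

# Returns 1 (and warns on stderr) when the column holds at most one distinct
# value, 2 if the query itself failed, 0 otherwise.
warn_if_backfilled() {
  n=$(distinct_values "$1" "$2" "$3") || return 2
  if [ "$n" -le 1 ]; then
    echo "warning: $2.$3 has $n distinct value(s); rows were likely backfilled by a default" >&2
    return 1
  fi
}
```

In the example from the text, `warn_if_backfilled "$DSN" orders fulfillment_provider` would be one way to decide that the branch needs a reseed even though the migration itself succeeded.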
You don't want to reseed every branch (that defeats the O(1) advantage). You want to reseed the branches where the inherited data no longer matches what the schema expects. A simple heuristic is "does this PR touch the migrations directory?"
# .github/workflows/preview.yml (excerpt)
      - name: Check if PR changes migrations
        id: migrations
        uses: tj-actions/changed-files@v46
        with:
          # Adjust the glob list to match your migrator. Cover every location
          # where schema changes can land: Prisma, Drizzle, raw SQL, Flyway,
          # Liquibase, custom folders. A missed path here means CI silently
          # runs against stale inherited data.
          files: |
            prisma/migrations/**
            drizzle/migrations/**
            db/migrations/**
            supabase/migrations/**
            **/*.sql
      - name: Run migrations on the new branch
        run: npx prisma migrate deploy
        env:
          DIRECT_URL: ${{ steps.neon.outputs.db_url }}
      - name: Reseed branch for schema-drift PRs
        if: steps.migrations.outputs.any_changed == 'true'
        run: npx seedfast seed --scope "e2e baseline: 3 orgs, 20 users, 100 orders" --output json
        env:
          SEEDFAST_API_KEY: ${{ secrets.SEEDFAST_API_KEY }}
          SEEDFAST_DSN: ${{ steps.neon.outputs.db_url }}
The reseed step only runs on branches that changed a migration file. The glob list above is the most brittle part of the workflow: if your team commits migrations somewhere the glob doesn't cover, CI silently runs against stale inherited data and green builds can ship breaking schema changes. Treat the list as a contract with your migrator and update it whenever that surface changes. The Seedfast CLI flags (--scope, --output) and environment variables (SEEDFAST_API_KEY, SEEDFAST_DSN) used here are covered in the CI/CD database seeding docs. Seedfast has a free tier that covers small schemas end to end, so you can connect your Neon project and run the Pattern 2 step on a test PR before wiring it into the real workflow.
Two failure modes the glob heuristic does not catch: (a) business-logic PRs that need different data even though no schema changed — those need a manual reseed trigger on the parent branch; (b) schema changes that happen via side effects (e.g., a PR that runs a one-off SQL script from application code rather than a migration file). For (b), the only durable fix is "all schema changes go through the migrator," which is a team discipline issue, not a CI one.
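The same "does this PR touch migrations?" heuristic can be run locally before pushing. This is a minimal sketch that mirrors the glob list from the workflow with shell `case` patterns; `touches_migrations` is a hypothetical helper name, and the base ref `origin/main` in the usage line is an assumption about your repository.

```sh
# Sketch: read changed file paths on stdin (one per line) and succeed if any
# of them looks like a schema change. Keep the patterns in sync with the
# glob list in the CI workflow.
touches_migrations() {
  while IFS= read -r path; do
    case $path in
      prisma/migrations/*|drizzle/migrations/*|db/migrations/*|supabase/migrations/*|*.sql)
        return 0 ;;
    esac
  done
  return 1
}

# Usage (assumes the PR targets origin/main):
#   git diff --name-only origin/main...HEAD | touches_migrations && echo "reseed needed"
```

In a `case` pattern `*` also matches `/`, so `prisma/migrations/*` covers nested migration folders the way `**` does in the workflow glob.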
This is the pattern where a static seed.sql stops working. The file describes the old schema — the one in main. The new branch's schema has a new column, a new table, or a renamed foreign key. The seed file can't describe the drifted schema until someone updates it, and that someone is usually the person trying to get their own PR to green. The schema-aware path — read the branch's live schema, generate valid rows against it — is what separates reseeding-on-drift from "rewriting the seed file in the PR." Why static seed files break covers the lifecycle in full; on Neon, the drift hits earlier because you feel every migration on every preview branch.
Pattern 3: The schema-only branch rescue
Schema-only branching is Neon's answer to "we can't copy that data to every PR." The parent database holds records covered by a regulatory scope your team can't replicate to every developer's preview; or staging carries sensitive payment metadata that can't leave its access boundary; or the dataset is simply too large to reasonably copy a hundred times a day. Schema-only branches inherit the structure — tables, columns, indexes, foreign keys — and nothing else. Seedfast itself generates synthetic rows and holds no compliance certification of its own; the fit here is that synthetic data replaces the copy, and whatever compliance posture your team already operates continues to apply.
That leaves you with an empty database and a real, complex schema in front of you. That's the seeding problem in its purest form.
The traditional answers are bad in this specific case:
- A committed seed.sql: out of date the day after the next migration lands, and doubly out of date on a schema-only branch where no inherited rows mask the gap. See seed file maintenance for the full argument.
- A Prisma or Drizzle seed.ts: better than SQL, but still hand-written against a schema that changes under it. Every new FK is a hand-edit.
- Copy-and-anonymize from production: the right answer for some teams, but heavy. You're running a full data pipeline just to fill a CI database.
- Faker libraries: they generate values but not relationships. Your 20-table schema becomes 20 one-table seed scripts with manual FK wiring. For real schemas this stops scaling around table ten.
Seedfast reads the branch's live schema on each run and generates rows that fit it — in topological order, with FKs pointing at real parents. You describe the dataset in a plain-English scope:
# On the schema-only branch
export SEEDFAST_DSN="postgresql://...ep-xxxx.region.aws.neon.tech/db?sslmode=require"
seedfast seed --scope \
  "small SaaS org: 2 workspaces, 15 members, 100 projects with 500 tasks and activity history"
No seed.sql. No seed.ts. The CLI connects, reads the current schema, including whatever the last migration added, and generates valid connected rows. If the schema changes next week, the same command works next week. Self-referential tables (employees.manager_id → employees.id) and circular FK chains are handled by inserting in a feasible order and filling nullable references in a second pass. Schemas that require all-or-nothing deferred constraints need SET CONSTRAINTS ALL DEFERRED inside the seeding transaction (the setting applies to the current transaction, not the whole session), which Seedfast supports when the constraint is declared DEFERRABLE.
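To make "a feasible order" concrete, here is a small illustration using the standard `tsort` utility. It is not how Seedfast is implemented; it only shows what a topological insertion order looks like. The table names and foreign keys are hypothetical.

```sh
# Sketch: derive an insertion order from foreign keys with tsort.
# Each input line is "parent child": the parent table must be filled first.
# A pair of identical names (employees employees) registers the table without
# imposing an order, which is how tsort accepts a self-referential table.
insertion_order() {
  tsort <<'EOF'
orgs users
users orders
products order_items
orders order_items
employees employees
EOF
}

insertion_order
```

Every parent table appears before its children in the output; the relative order of unrelated tables depends on the tsort implementation. A genuinely circular chain makes tsort report a loop, which is the case where nullable references have to be filled in a second pass or constraints deferred.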
The same binary that handles your parent-branch seed and your drift-reseed also handles schema-only. That's the reason most teams want one tool covering the full workflow instead of three different ones duct-taped together.
Seedfast is meant to coexist with the small set of deterministic fixtures most apps rely on — the admin@example.com account your Playwright login spec types, the feature-flag rows keyed by key, the country-code lookup table. Keep those in a short fixtures.sql or a trimmed-down seed.ts and run them after seedfast seed. Seedfast fills the bulk relational data around them; your fixture file pins the specific records your tests expect by literal value. The two layers compose cleanly because Seedfast writes rows through the same Postgres connection your fixture script uses — there is no tool-specific state to reconcile. Seedfast is free to try on any Neon connection string — run your first seed in about two minutes.
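The two-layer order described above might be wired together as follows. This sketch uses only the `seedfast seed --scope` invocation and SEEDFAST_DSN variable shown earlier, plus psql; the function name and the default `fixtures.sql` path are assumptions.

```sh
# Sketch: bulk data first, pinned fixture rows second.
# Usage: seed_with_fixtures <dsn> <scope> [fixtures_file]
seed_with_fixtures() {
  dsn=$1
  scope=$2
  fixtures=${3:-fixtures.sql}

  # Bulk relational data generated against the live schema.
  SEEDFAST_DSN="$dsn" seedfast seed --scope "$scope" || return 1

  # Deterministic rows the tests reference by literal value.
  # ON_ERROR_STOP makes psql exit non-zero on the first failing statement.
  psql "$dsn" --no-psqlrc --set ON_ERROR_STOP=1 --file "$fixtures"
}
```

Writing the fixture statements so they can run repeatedly (for example with INSERT ... ON CONFLICT DO NOTHING) keeps the second step safe on branches that already inherited those rows.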
Reset a branch to its parent: the fast-feedback loop
Sometimes a branch gets polluted mid-test. A destructive migration runs, a test accidentally writes real data, a developer wants to start over. Neon supports resetting a child branch to its parent — a one-call operation via the API or dashboard that drops the branch's divergent writes and restores parent state.
# Restore a branch to the latest state of its parent (main)
curl --request POST \
  --url "https://console.neon.tech/api/v2/projects/$PROJECT_ID/branches/$BRANCH_ID/restore" \
  --header "Authorization: Bearer $NEON_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{"source_branch_id": "'"$PARENT_BRANCH_ID"'"}'
The reset takes about a second for the same reason branch creation does — no data is copied, just a pointer update. That makes "reset to parent, rerun tests" a tight feedback loop when you're debugging flaky tests that might be polluting each other. Combined with Pattern 1 (parent already seeded), you get an instant clean environment every time.
One caveat: reset-to-parent restores the branch's state to the parent's current state, which means if the parent has not had the PR's migration applied (it usually hasn't), the next CI run will re-apply the migration and overwrite whatever you reset to. The loop is "clean environment for test-data pollution," not "clean environment for migration rollback." For migration-drift branches in Pattern 2, the correct recovery is to delete and recreate the branch rather than reset, since reset will immediately be undone by the next prisma migrate deploy step.
If the parent itself is drifting out of shape — "main has ten-month-old orders that don't reflect the current product schema" — the fix is to reseed the parent, not to reset branches. seedfast seed is idempotent against a freshly-truncated schema; the operational pattern is truncate target tables, run seedfast seed with the current scope, done.
Cost framing: why seed-every-branch is a billing mistake
Neon bills compute seconds per branch, so a branch bills for the full duration of any seed it runs. What that costs your team depends on how long your seed actually takes: a short reference-data seed is seconds; a realistic-volume seed with FK chains is minutes. The shape of the bill does not change: if every PR seeds, the line scales linearly with PR volume.
Seeding the parent once and letting branches inherit replaces that line with near-zero CI compute on most PRs. The parent-seed compute happens once after each schema change on main — usually a scheduled job or a manual workflow dispatch, not per-PR. Preview branches pay the compute cost of their actual tests, nothing more. If your team moves fast on migrations, "once after each schema change" still means a handful of parent reseeds per week, but each one replaces dozens of per-branch seeds that would otherwise run on every PR that opened during the same week.
When schema drift forces a reseed on a specific branch, you pay one seed's worth of compute for that one branch. That's the right bill: you asked the database to do work because the schema changed. The only real waste is paying for that seed on every PR, including the majority that never touch migrations. That's the line the parent-once pattern eliminates.
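A quick worked example makes the difference in shape visible. All numbers here are hypothetical (40 PRs in a week, 4 of them touching migrations, 3 parent reseeds, and a 180-second seed), and the model assumes parent and branch seeds take the same time.

```sh
# Sketch: weekly seed compute under the two strategies (hypothetical numbers).
# Usage: weekly_seed_seconds <seed_runs> <seconds_per_seed>
weekly_seed_seconds() {
  echo $(( $1 * $2 ))
}

# Strategy A: every one of 40 PRs runs a 180 s seed.
every_pr=$(weekly_seed_seconds 40 180)

# Strategy B: 3 parent reseeds plus 4 drift reseeds, 180 s each.
parent_once=$(weekly_seed_seconds $(( 3 + 4 )) 180)

echo "seed every PR: ${every_pr}s"
echo "parent once + drift reseeds: ${parent_once}s"
```

With these inputs the script prints 7200s for the seed-every-PR strategy and 1260s for parent-once; the first figure grows with PR count while the second grows only with the number of schema changes.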
Parallel tests, branch limits, and other gotchas
Neon branch caps. Every Neon plan has a branch limit (Free plans are tighter than paid). PRs that sit open for weeks accumulate branches. The delete-branch-action on pull_request: closed is required, not optional: without it you'll hit the cap mid-PR and the preview workflow will fail at branch creation. Name and treat the cleanup job as a required step, not a nice-to-have.
Parallel test isolation. If you run Jest or Playwright in parallel and all workers hit the same preview branch, they race for the same rows. For true isolation, create one branch per worker at the start of the test run and delete them at the end. The math: six workers with one-second branch creation is six seconds of overhead, vastly cheaper than six separate seeded databases.
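One way to set up the per-worker branches is to derive predictable names from the PR number and worker index, so the same list drives both creation and cleanup. The naming scheme below is an assumption, and the commented usage line relies on the Neon CLI's `branches create` flags, which you should verify for your CLI version.

```sh
# Sketch: print one branch name per test worker.
# Usage: worker_branch_names <pr_number> <worker_count>
worker_branch_names() {
  pr=$1
  workers=$2
  i=1
  while [ "$i" -le "$workers" ]; do
    echo "preview/pr-${pr}-worker-${i}"
    i=$(( i + 1 ))
  done
}

# Usage (assumed neonctl flags; the parent is the PR's already-migrated branch):
#   worker_branch_names 42 6 | while IFS= read -r name; do
#     neonctl branches create --project-id "$NEON_PROJECT_ID" \
#       --name "$name" --parent "preview/pr-42"
#   done
```

Branching the workers from the PR's preview branch rather than from main means each worker inherits the migrated schema and any drift reseed without repeating either step.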
Pooled vs unpooled for seeding. The parent-seed and drift-reseed jobs need the unpooled connection string: PgBouncer in transaction mode does not preserve session state, so session-level features such as SQL-level prepared statements can break, and large seeds can time out. The app under test uses the pooled string. This is the same rule as non-branching Neon seeding; the Neon seeding guide covers the connection-string mechanics in detail.
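A cheap guard in the seed job can catch the wrong connection string before any rows are written. Neon's pooled hostnames carry a `-pooler` suffix on the endpoint ID, so this sketch simply looks for that marker; it is a heuristic, and the function names are hypothetical.

```sh
# Sketch: refuse to seed through a pooled Neon connection string.
# Heuristic: pooled hosts look like ep-...-pooler.<region>.aws.neon.tech
is_pooled_dsn() {
  case $1 in
    *-pooler.*) return 0 ;;
    *) return 1 ;;
  esac
}

require_direct_dsn() {
  if is_pooled_dsn "$1"; then
    echo "refusing to seed: connection string points at the pooler" >&2
    return 1
  fi
}

# Usage: require_direct_dsn "$SEEDFAST_DSN" && seedfast seed --scope "..."
```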
Auto-suspend mid-seed. Neon computes suspend after an inactivity window — the default is 5 minutes on all plans, and on the Free plan it cannot be disabled. A seed that pauses between large batches can have its connection closed underneath it, surfacing as server closed the connection unexpectedly or a generic TCP reset. Paid plans can raise the suspend delay for the seed branch; Free-plan users need the seed to run continuously without long inter-batch pauses. seedfast seed runs one continuous session, so this mainly bites hand-rolled scripts that batch-and-wait.
Stale parent. The parent-seed pattern assumes the parent's data still reflects the current schema. When a migration lands on main, either (a) run the migration against the parent branch's compute and re-run the parent seed to regenerate rows that fit, or (b) accept that the parent will drift and rely on Pattern 2 (drift-reseed on every PR that touches migrations). Most teams pick (a) as a post-merge job on main.
Neon branching seed data: three patterns, one binary
In practice the setup comes down to three workflows, chosen per PR:
| Scenario | Pattern | What happens in CI |
|---|---|---|
| PR touches app code only | Inherit from parent | Create branch, run migrations (no-op), run tests |
| PR touches migrations | Drift reseed | Create branch, run migrations, run Seedfast on the branch, run tests |
| Branch is schema-only (compliance scope) | Schema-only rescue | Create schema-only branch, run migrations, run Seedfast, run tests |
All three use the same seedfast binary with different scope strings. The difference is when you invoke it, not how. That's the operational simplification that makes Neon branching seed data manageable at team scale — one tool, three call sites, one mental model.
Frequently asked questions
Do Neon branches inherit seed data from the parent?
Yes, by default. A full branch copies both schema and data from the parent via copy-on-write — the preview branch sees the parent's rows instantly and writes diverge on either side from that moment. You can also create schema-only branches that copy the structure but no data; those branches come up empty and need seeding separately.
Should I seed every Neon branch or just the parent?
Seed the parent. Preview branches inherit the data, which keeps CI time and Neon compute cost flat regardless of how many PRs you have open. Reseed a branch only when something has changed that the inherited data can't represent — typically a new migration introducing columns, tables, or FK changes.
How do I populate a schema-only Neon branch?
Run the seeder against the branch's direct connection string after migrations. Static seed.sql breaks fast on schema-only branches because the file rarely stays current with migrations. Seedfast reads the branch's live schema on each run and generates valid rows against it: seedfast seed --scope "..." works the same way it does on a full branch, it just has more empty tables to fill.
Can I reset a Neon branch back to parent state mid-test?
Yes. The Neon API's restore endpoint (POST /api/v2/projects/{project_id}/branches/{branch_id}/restore with a source_branch_id body) drops divergent writes and restores the parent's state in about a second. The tight feedback loop is useful when debugging flaky tests or accidentally polluting a preview branch.
How do I seed a Neon preview branch in GitHub Actions?
If the parent is already seeded, you don't — the branch inherits the data when neondatabase/create-branch-action runs. If the PR changes the schema, detect the migration change (e.g., with tj-actions/changed-files) and run seedfast seed only on those branches. Use the unpooled (db_url) output from the action for the seed step, not the pooled URL.
What happens if I hit Neon's branch limit?
Branch creation in CI fails. Pair create-branch-action on pull_request: opened with delete-branch-action on pull_request: closed in the same repository's workflows. For extra safety, add a scheduled cleanup that deletes branches older than N days whose PR is closed; stale preview branches are the usual culprit.
Does this pattern work with Prisma / Drizzle / Kysely migrations?
Yes. The create-branch → run-migrations → (optional seed) → run-tests pipeline is ORM-agnostic. Prisma uses prisma migrate deploy, Drizzle uses drizzle-kit migrate to apply committed migration files (use drizzle-kit push only for local schema prototyping, never in CI), Kysely uses whatever migrator you've wired up. seedfast talks to Postgres directly and doesn't care which ORM manages the schema.
Can I keep my existing seed.ts for fixture records and use Seedfast for bulk data?
Yes, and most teams do. Seedfast is good at generating the bulk relational data (orders, events, activity history) that makes realistic tests possible but that no one wants to hand-write. Fixture records your tests reference by literal value (admin users, feature flags, country codes) stay in a short fixtures.sql or trimmed-down seed.ts. Run Seedfast first for volume, then the fixture file for specifics — they share the same Postgres connection and don't conflict.
Is the data Seedfast generates reproducible across runs?
Not by default — the generator produces fresh values on each run, which is what you want when the goal is a realistic dataset. Tests that depend on specific literal values (an email, a product name) should stay in a fixture file that inserts those rows deterministically. For E2E baselines that need the exact same dataset every run, the standard pattern is to run Seedfast once, pg_dump the result, and restore the dump in CI — same approach any generator-based workflow uses.
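The dump-and-restore approach for a fixed baseline could look like the sketch below. It uses standard pg_dump and pg_restore flags; the function names and the `baseline.dump` file name are assumptions, and a data-only restore expects the target tables to exist and be empty (a schema-only branch, or one you have truncated).

```sh
# Sketch: freeze one generated dataset and replay it in CI.
# Usage: snapshot_baseline <dsn> <dump_file>
snapshot_baseline() {
  # Custom format keeps the dump compact and restorable with pg_restore.
  pg_dump --dbname="$1" --data-only --format=custom --file="$2"
}

# Usage: restore_baseline <dsn> <dump_file>
restore_baseline() {
  # One transaction: either every row lands or none do.
  pg_restore --dbname="$1" --data-only --single-transaction "$2"
}

# Usage:
#   snapshot_baseline "$SEEDFAST_DSN" baseline.dump   # once, after seedfast seed
#   restore_baseline  "$SEEDFAST_DSN" baseline.dump   # in each CI run
```

The dump only stays valid while the schema matches the one it was taken from, so it needs regenerating after migrations, just as the parent seed does.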
Related guides
- How to seed a Neon database: the three seeding methods (raw SQL, ORM, schema-aware) that this article builds on, including connection-string gotchas and per-error fixes
- How to seed a Supabase database: the Supabase counterpart covering seed.sql, supabase db reset, auth.users, and Supabase preview branches (separate-DB-per-branch model, not CoW)
- Seed file maintenance: why static seed.sql and hand-written seed.ts drift from the schema, and how that drift shows up faster on Neon preview branches
- Database seeding in CI/CD: the framework for idempotent, fast, FK-valid seeding in a pipeline, applied to any Postgres (this article is Neon-specific; that guide covers the general case)
- Get started with Seedfast: connect the CLI to your Neon project and run the three patterns in under five minutes