How to Seed a Database: PostgreSQL Practical Guide
By the Seedfast team
Key Takeaways
- Seeding a PostgreSQL database starts with psql -f seed.sql — fast and readable for stable reference data, but breaks every time a migration adds a required column
- Every ORM has a seeding mechanism (Prisma's db seed, Drizzle's custom scripts, TypeORM's data sources), but all of them require manual updates when the schema changes
- ON CONFLICT DO NOTHING is the minimum for idempotent seeds; TRUNCATE ... RESTART IDENTITY CASCADE is the fast reset option for development
- Different environments need different data volumes — what works for local dev is usually too small for staging and too slow for CI
- Schema-aware generators read the live database schema instead of requiring maintained seed files, which removes the recurring cost of keeping seeds in sync with migrations
You ran your migrations. The tables exist. Now you need data — users to log in with, products to browse, orders to see in the dashboard. Knowing how to seed database tables correctly is one of those skills you use constantly but rarely stop to think through.
This guide is Node.js and PostgreSQL-focused: psql commands, Prisma, Drizzle, TypeORM, and raw node-postgres. If you're on Laravel, EF Core, or want the conceptual background first, our guide to database seeding methods and tradeoffs covers that ground. Here we focus on the commands you actually run.
Seedfast is a PostgreSQL seeding tool that reads your schema and generates valid, connected data automatically. We'll get to that — but first, the basics.
The simplest way to seed a database in PostgreSQL
A seed file is just SQL with INSERT statements. Create a file, run it against your database.
-- seed.sql
INSERT INTO roles (id, name) VALUES
(1, 'admin'),
(2, 'editor'),
(3, 'viewer');
INSERT INTO teams (id, name, created_at) VALUES
(1, 'Engineering', NOW()),
(2, 'Design', NOW());
INSERT INTO users (id, email, team_id, role_id, created_at) VALUES
(1, 'alice@example.com', 1, 1, NOW()),
(2, 'bob@example.com', 2, 2, NOW());
Run it with psql:
psql -d mydb -f seed.sql
# or with a connection string
psql "$DATABASE_URL" -f seed.sql
For local development with Docker:
docker exec -i postgres_container psql -U postgres -d mydb -f seed.sql
That's it for a few tables with stable reference data. The problems start when the schema has 20 tables, foreign keys four levels deep, and weekly migrations that break the INSERT order.
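The insert-order part of that problem is mechanical: a table can only be seeded after every table it references. As a sketch (not part of psql or any tool below), the order falls out of a topological sort over foreign-key dependencies. The dependency map here is hand-written; in a real script the same information could be read from information_schema.table_constraints.

```typescript
// Sketch: compute a safe insert order from FK dependencies.
// Each key is a table; the value lists the tables it references.
type DepMap = Record<string, string[]>;

function insertOrder(deps: DepMap): string[] {
  const ordered: string[] = [];
  const visiting = new Set<string>();
  const done = new Set<string>();

  function visit(table: string) {
    if (done.has(table)) return;
    if (visiting.has(table)) {
      throw new Error(`circular FK chain involving "${table}"`);
    }
    visiting.add(table);
    for (const parent of deps[table] ?? []) visit(parent); // parents first
    visiting.delete(table);
    done.add(table);
    ordered.push(table);
  }

  for (const table of Object.keys(deps)) visit(table);
  return ordered;
}

// users references teams and roles, so both must come first
const order = insertOrder({
  roles: [],
  teams: [],
  users: ['teams', 'roles'],
});
console.log(order); // roles and teams before users
```

Circular foreign keys (the throw branch above) are the one case no ordering can solve; those need deferred constraints or a second UPDATE pass.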
Seeding with your ORM
Most Node.js setups use an ORM or query builder with its own seeding convention. Pick whichever matches your stack.
Prisma
Create prisma/seed.ts and configure it in prisma.config.ts:
// prisma/seed.ts
import { PrismaClient } from '@prisma/client';
const prisma = new PrismaClient();
async function main() {
const team = await prisma.team.upsert({
where: { id: 1 },
update: {},
create: { name: 'Engineering' },
});
await prisma.user.upsert({
where: { email: 'alice@example.com' },
update: {},
create: {
email: 'alice@example.com',
teamId: team.id,
roleId: 1,
},
});
}
main()
.catch((e) => {
console.error(e);
process.exitCode = 1; // make CI fail if the seed fails
})
.finally(() => prisma.$disconnect());
// prisma.config.ts
import { defineConfig, env } from 'prisma/config';
export default defineConfig({
schema: 'prisma/schema.prisma',
migrations: {
path: 'prisma/migrations',
seed: 'tsx prisma/seed.ts',
},
datasource: { url: env('DATABASE_URL') },
});
Run it:
npx prisma db seed
# Reset and reseed in one step
npx prisma migrate reset
Drizzle
Drizzle has no built-in seed command — you write a Node.js script and run it directly:
// scripts/seed.ts
import { db } from '../src/db';
import { teams, users } from '../src/schema';
async function seed() {
const [team] = await db.insert(teams)
// explicit id so the ON CONFLICT target can match on reruns
.values({ id: 1, name: 'Engineering' })
.onConflictDoUpdate({ target: teams.id, set: { name: 'Engineering' } })
.returning();
await db.insert(users)
.values({ email: 'alice@example.com', teamId: team.id, roleId: 1 })
.onConflictDoNothing();
}
seed().catch((e) => { console.error(e); process.exitCode = 1; });
tsx scripts/seed.ts
# Or add to package.json scripts: "seed": "tsx scripts/seed.ts"
npm run seed
TypeORM
// src/database/seed.ts
import { AppDataSource } from './data-source';
import { Team } from '../entities/Team';
import { User } from '../entities/User';
async function seed() {
await AppDataSource.initialize();
const teamRepo = AppDataSource.getRepository(Team);
const userRepo = AppDataSource.getRepository(User);
const team = await teamRepo.save({ id: 1, name: 'Engineering' });
await userRepo.save({ email: 'alice@example.com', team, roleId: 1 });
await AppDataSource.destroy();
}
seed().catch((e) => { console.error(e); process.exitCode = 1; });
ts-node src/database/seed.ts
Raw SQL with node-postgres
When you're not using an ORM:
// scripts/seed.ts
import { Pool } from 'pg';
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
async function seed() {
const client = await pool.connect();
try {
await client.query('BEGIN');
await client.query(`
INSERT INTO teams (id, name) VALUES (1, 'Engineering')
ON CONFLICT (id) DO NOTHING
`);
await client.query(`
INSERT INTO users (email, team_id, role_id)
VALUES ('alice@example.com', 1, 1)
ON CONFLICT (email) DO NOTHING
`);
await client.query('COMMIT');
} catch (e) {
await client.query('ROLLBACK');
throw e;
} finally {
client.release();
}
}
seed().catch((e) => { console.error(e); process.exitCode = 1; }).finally(() => pool.end());
Schema-aware seeding: no seed file to maintain
Seedfast connects to your PostgreSQL database, reads the live schema — tables, columns, types, constraints, foreign keys — and generates a complete, constraint-satisfying dataset. You describe the domain in plain English; Seedfast handles FK resolution and insert ordering.
# Install
npm install -g seedfast
# or: brew install seedfast
# Connect to your database
seedfast connect
# Generate data from the current schema
seedfast seed --scope "small engineering team with a few projects and task assignments"
When a migration adds a new column or table, the next seedfast seed picks it up. No seed file to update, no INSERT statements to fix, no 'ask Sarah what order to run things'.
# Different scopes for different environments
seedfast seed --scope "2 teams, 5 users, minimal product catalog" # dev
seedfast seed --scope "3 users with 5 completed orders each" # CI test
seedfast seed --scope "realistic e-commerce, 500 products, reviews" # staging
Seedfast works alongside popular ORMs — Prisma, Drizzle, TypeORM, Sequelize and more — because it talks directly to the database, not through ORM abstractions. Run your migrations first, then run Seedfast.
Making seeds idempotent
An idempotent seed runs successfully whether or not the data already exists. This matters because seeds run in CI on every pipeline, and you want them to succeed on retry without manual cleanup.
ON CONFLICT DO NOTHING
The simplest approach — skip the insert if the row already exists:
INSERT INTO roles (id, name) VALUES
(1, 'admin'),
(2, 'editor')
ON CONFLICT (id) DO NOTHING;
Works when you don't need to update existing rows.
ON CONFLICT DO UPDATE (upsert)
When you want the seed data to reflect the latest values:
INSERT INTO roles (id, name) VALUES
(1, 'admin'),
(2, 'editor')
ON CONFLICT (id) DO UPDATE
SET name = EXCLUDED.name;
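If you generate seed statements from code, the upsert shape is mechanical enough to build with a helper. This is a hypothetical sketch, not an ORM or Seedfast API, and it assumes table and column names come from your own code rather than user input:

```typescript
// Hypothetical helper: build a parameterized, idempotent upsert.
// Columns other than the conflict target are refreshed from EXCLUDED.
function upsertSql(
  table: string,
  columns: string[],
  conflictTarget: string,
): string {
  const cols = columns.join(', ');
  const params = columns.map((_, i) => `$${i + 1}`).join(', ');
  const updates = columns
    .filter((c) => c !== conflictTarget)
    .map((c) => `${c} = EXCLUDED.${c}`)
    .join(', ');
  return (
    `INSERT INTO ${table} (${cols}) VALUES (${params}) ` +
    `ON CONFLICT (${conflictTarget}) DO UPDATE SET ${updates}`
  );
}

const sql = upsertSql('roles', ['id', 'name'], 'id');
console.log(sql);
// → INSERT INTO roles (id, name) VALUES ($1, $2) ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name
```

The resulting statement runs with client.query(sql, [1, 'admin']) in node-postgres, or the equivalent in your ORM's raw-SQL escape hatch.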
TRUNCATE + RESTART IDENTITY
The nuclear option — wipe and reseed from scratch. Fast and reliable for development, but destructive:
TRUNCATE roles, teams, users RESTART IDENTITY CASCADE;
INSERT INTO roles (name) VALUES ('admin'), ('editor'), ('viewer');
INSERT INTO teams (name) VALUES ('Engineering'), ('Design');
INSERT INTO users (email, team_id, role_id) VALUES
('alice@example.com', 1, 1),
('bob@example.com', 2, 2);
RESTART IDENTITY resets sequences (auto-increment IDs back to 1). CASCADE truncates dependent tables in the right order so FK constraints don't block the truncate. Use this only when you're starting fresh and don't need to preserve existing data.
COPY FROM for large seed datasets
When you need to seed tens of thousands of rows, INSERT is slow. PostgreSQL's COPY protocol loads data from a file 5–10x faster:
COPY products (name, price, category_id) FROM '/tmp/products.csv' CSV HEADER;
Or pipe data directly without a file:
psql "$DATABASE_URL" -c "COPY products (name, price, category_id) FROM STDIN CSV HEADER" < products.csv
Note that COPY ... FROM '/path' reads the file on the database server, so it needs server filesystem access; from a client machine, use psql's \copy meta-command or the STDIN form above, which stream the local file over the connection. Useful for staging databases that need thousands of realistic rows. For schemas with FK constraints between tables, still seed referenced tables first.
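Producing the CSV itself can be scripted. A minimal sketch of a formatter for COPY's default CSV dialect, assuming the standard quoting rule: fields containing a comma, double quote, or newline get wrapped in double quotes, with embedded quotes doubled.

```typescript
// Sketch: render rows as CSV acceptable to COPY ... CSV.
function toCsv(rows: (string | number)[][]): string {
  const field = (v: string | number): string => {
    const s = String(v);
    // quote only when the field contains a special character
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  return rows.map((r) => r.map(field).join(',')).join('\n') + '\n';
}

const csv = toCsv([
  ['Mechanical Keyboard', 89.99, 3],
  ['Mug, 12oz', 9.5, 7], // the comma forces quoting
]);
console.log(csv);
```

Write the result to a file (or pipe it straight into the COPY ... FROM STDIN command) and load it as shown above.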
Seeding by environment
Different environments need different data volumes. Trying to run the same seed everywhere usually means either too little data in staging or too long a runtime in CI.
Local development
Small, fast, focused on the features you're actively building:
# Enough to log in and navigate
seedfast seed --scope "2 teams, 5 users per team, 10 products"
# or a hand-written file for simple schemas
psql "$DATABASE_URL" -f seeds/dev.sql
Test / CI
Focused datasets for specific scenarios. Each test suite seeds what it needs:
# Minimal dataset for CI
psql "$DATABASE_URL" -f seeds/test-baseline.sql
Or per-test using a tool that resets between runs:
# GitHub Actions
- name: Run migrations
  run: npx prisma migrate deploy
- name: Seed test database
  run: npm run seed:test
- name: Run tests
  run: npm test
Staging
Enough data to demo the application realistically. This is where hand-written seeds stop scaling — a staging database for an e-commerce app might need 500 products, realistic price ranges, reviews that reference real products, and users with order histories.
# Schema-aware staging seed (no seed file to maintain)
seedfast seed --scope "e-commerce with 500 electronics products, reviews, and 50 users with order histories"
When your seed breaks
The most common cause of a broken seed is a migration that adds a NOT NULL column. Your INSERT statements don't include the new column, the database rejects them, and suddenly nobody on the team can run the app locally until someone fixes the seed file.
This happens because hand-written seeds are static. They describe the schema at the moment you wrote them. Every subsequent migration makes them a little more wrong.
The fix is always the same: find the failing INSERT, figure out what the migration added or changed, update the seed, test it, commit it. On an active codebase with weekly migrations, this becomes a recurring task with no clear owner.
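Part of that diagnosis can be scripted. As a sketch, the required columns of a table (NOT NULL with no default) can be compared against the columns your seed INSERT provides; the set difference is what the migration broke. The information_schema query in the comment is one way to fetch the required list, and the helper itself is our own illustration, not a standard tool.

```typescript
// Sketch: report required columns a seed INSERT no longer covers.
// `required` would come from a query along the lines of:
//   SELECT column_name FROM information_schema.columns
//   WHERE table_name = $1 AND is_nullable = 'NO'
//     AND column_default IS NULL;
function missingSeedColumns(
  required: string[],
  seeded: string[],
): string[] {
  const provided = new Set(seeded);
  return required.filter((col) => !provided.has(col));
}

// After a migration adds a NOT NULL users.tenant_id column:
const missing = missingSeedColumns(
  ['email', 'team_id', 'tenant_id'],
  ['email', 'team_id', 'role_id'],
);
console.log(missing); // ['tenant_id']
```

Run against every table a seed touches, this turns "figure out what the migration changed" into a report you can read before the INSERT ever fails.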
There are two ways out:
Keep the seed file and add discipline. Pair every migration PR with a seed file update. Make it part of your PR checklist. This works for teams with stable schemas and small seed files — and it stops scaling around 20 tables and monthly migrations.
Use Seedfast — a schema-aware generator that reads your live schema on every run. No seed file to maintain, no INSERT statements to fix after migrations.
Running seeds in CI/CD
In a pipeline, seeding should be one command after migrations. No "run these three scripts in this order" steps, no shared state between parallel test runs.
# .github/workflows/test.yml
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: testdb
        ports:
          - 5432:5432
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: npm ci
      - name: Run migrations
        run: npx prisma migrate deploy
        env:
          DATABASE_URL: postgres://postgres:postgres@localhost/testdb
      - name: Seed database
        run: npm run seed
        # or: seedfast seed --scope "test baseline"
        env:
          DATABASE_URL: postgres://postgres:postgres@localhost/testdb
      - name: Run tests
        run: npm test
        env:
          DATABASE_URL: postgres://postgres:postgres@localhost/testdb
Keep seeds idempotent (ON CONFLICT DO NOTHING or upsert) so pipeline retries don't fail. For parallel test jobs, use isolated databases per job rather than a shared one — concurrent INSERTs from different test suites against the same tables produce inconsistent state.
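One way to get that isolation is to derive a per-job database name from the base connection string. A sketch, where the jobId argument stands in for whatever unique identifier your CI exposes (a run id, a matrix key):

```typescript
// Sketch: rewrite a connection string to point at a per-job database.
// Assumes the URL path is the database name, as in postgres:// URLs.
function perJobUrl(baseUrl: string, jobId: string): string {
  const url = new URL(baseUrl);
  url.pathname = `${url.pathname}_${jobId}`;
  return url.toString();
}

const url = perJobUrl('postgres://postgres:postgres@localhost/testdb', 'job42');
console.log(url); // postgres://postgres:postgres@localhost/testdb_job42
```

Each job would then createdb that name, migrate, seed, run its suite, and drop the database afterwards, so parallel suites never touch the same tables.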
For deeper CI/CD patterns, see CI/CD database seeding docs.
Frequently asked questions
How do I seed a PostgreSQL database quickly?
For a simple schema, psql -d mydb -f seed.sql with a file containing your INSERT statements is the fastest path. For Prisma projects, npx prisma db seed runs your prisma/seed.ts script. For any stack with more than 10-15 tables or complex FK relationships, Seedfast (seedfast seed --scope "...") generates valid data without a seed file.
What is the difference between seeding and migrating a database?
Migrations change database structure — creating tables, adding columns, setting constraints. Seeds populate the database with data. Migrations run in a fixed order and are versioned. Seeds should be idempotent and can run at any time after migrations. They solve different problems; keeping them separate is a best practice.
How do I make my database seed idempotent?
Use ON CONFLICT DO NOTHING or ON CONFLICT DO UPDATE in your INSERT statements so they don't fail when the row already exists. For Prisma, use upsert(). For Drizzle, use .onConflictDoNothing(). If you prefer a full reset, TRUNCATE ... RESTART IDENTITY CASCADE wipes the data and lets you re-insert from scratch.
How do I seed a database in a Docker container?
Pass the SQL file directly to psql inside the container:
docker exec -i <container_name> psql -U postgres -d mydb < seed.sql
Or mount the file and run it from inside:
docker exec -it <container_name> psql -U postgres -d mydb -f /docker-entrypoint-initdb.d/seed.sql
Why does my seed break after a migration?
Static seed files (INSERT statements with hardcoded values) describe the schema at the time they were written. When a migration adds a NOT NULL column your INSERT doesn't provide, PostgreSQL rejects the row. Fixes: add the missing column value to the seed file, or switch to a schema-aware generator that reads the current schema on every run.
How do I seed different amounts of data per environment?
Keep environment-specific seed scripts or use a generator with environment flags. A common pattern: seeds/dev.sql for local (small and fast), seeds/ci.sql for test pipelines (minimal, focused), and a generated seed for staging (larger, realistic). Seedfast's --scope flag lets you describe the dataset in plain English for each environment without separate files.
Related guides
- Database Seeding: From Your First seed.sql to Fully Automated — the conceptual companion to this guide, covering what seeding is, why seeds break, and the full spectrum of approaches
- Seed File Maintenance: When Your Seeds Fall Out of Sync — the long-term cost of maintaining static seeds on an active codebase
- Get started with Seedfast — connect to your PostgreSQL database and run your first schema-aware seed