Load Testing With an Empty Database? Here's Your Problem

By Mikhail Shytsko, Founder at Seedfast · March 22, 2026 · Updated June 5, 2026

Your k6 scripts are flawless. Your Gatling scenarios are tuned. Your Locust swarm is ready. But your database has 47 rows — and every result you're about to collect is fiction.

You spent two weeks building a load testing suite. The scenarios cover login, search, checkout, reporting. You've modeled think times, ramp-up curves, and error thresholds. You run it. The results look great: P99 under 200ms, zero errors, 3,000 requests per second. You present the numbers to the team, green-light the release, and deploy to production.

Monday morning, the dashboard is on fire. Search takes 8 seconds. The reporting endpoint is timing out. The checkout flow that handled 3,000 RPS in your test can barely sustain 400 in production.

Your load test wasn't wrong. Your database was.

Key Takeaways

An empty or near-empty database makes every load-test number fiction. Query plans, buffer cache behavior, and connection-pool dynamics are all data-dependent, so a system measured at 50 rows behaves nothing like the same system at 5 million.
The four numbers that move most: throughput typically drops 3–5x, P99 latency rises 10–50x, missing indexes surface, and the connection pool saturates — none of which an empty database reveals.
The fix isn't a better load-test script — it's realistic data volume before the test runs, so your existing k6/Gatling/Locust scenarios hit a production-shaped database.
Seedfast seeds foreign-key-valid data at production volume from your live schema in one command, so you can drop it in as a pipeline step before the load test with no script changes.

The Empty Database Illusion

Here's a dirty secret of performance engineering: most load tests run against databases with trivially small datasets. The reasons are understandable — nobody wants to spend a week building a data generation pipeline when the actual goal is testing application performance. So teams take shortcuts:

Use the default dev database with 20 rows per table
Run migrations and insert a handful of fixture records
Copy a small CSV extract from production
Skip data setup entirely and hope the application layer is the bottleneck

The problem is that databases don't behave the same way at 50 rows and 5 million rows. Not approximately the same — fundamentally different. Query planners make different decisions. Indexes that were ignored become critical. Disk I/O patterns change. Buffer cache hit rates collapse. Connection pool behavior shifts. Join strategies flip from nested loops to hash joins.

When you load test against an empty database, you're measuring the performance of a system that doesn't exist in production.

What Changes at Scale: A Technical Walkthrough

Query Plans Are Data-Dependent

PostgreSQL's query planner uses table statistics — row counts, value distributions, null fractions — to choose execution plans. With 100 rows, almost every query gets a sequential scan because reading the whole table sequentially costs less than an index lookup's random I/O. The planner correctly decides the index is not worth it.

At 10 million rows, the same query gets an index scan, a bitmap heap scan, or a parallel sequential scan — depending on selectivity. And if your indexes are missing, you get a sequential scan that reads gigabytes instead of kilobytes.

-- Illustrative timings (the shape of the change, not a benchmark of any hardware):
-- With 100 rows: Seq Scan, 0.2ms (whole table fits in cache)
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

-- With 10M rows, no index: Seq Scan, 4,200ms
-- With 10M rows, with index: Index Scan, 0.3ms

Your load test at 100 rows will show 0.2ms. Production with 10M rows and a missing index will show 4,200ms. You've measured nothing useful.

Buffer Cache Behavior Changes Completely

PostgreSQL's shared buffer cache works well when the working set fits in memory. With a small database, everything is cached. Every query hits memory. Latency is sub-millisecond. Throughput is limited only by CPU.

At production scale, the working set exceeds cache size. Queries start hitting disk. Latency spikes. Throughput drops. And it doesn't degrade linearly — it falls off a cliff when the cache hit rate drops below a threshold. The mapping below is illustrative of that cliff shape, not a benchmark of any specific hardware:

Database size: 50MB   → Cache hit rate: 99.9%  → Avg query: 0.5ms
Database size: 5GB    → Cache hit rate: 97%    → Avg query: 2ms
Database size: 50GB   → Cache hit rate: 78%    → Avg query: 45ms

The 0.5ms you measured in your load test becomes 45ms in production. Your P99 of 200ms becomes P99 of 2,000ms. Your capacity model is off by an order of magnitude.

Connection Pool Saturation

With a small database, every query completes in microseconds. Connections return to the pool almost instantly. A pool of 20 connections can handle thousands of concurrent requests because each connection is occupied for a tiny fraction of a second.

At production scale, queries take longer. Connections stay checked out for milliseconds instead of microseconds. The pool starts queuing. Latency increases. The longer latency causes more concurrent requests to stack up, which increases pool pressure further. The system enters a feedback loop that doesn't exist at small scale. The figures below assume query time dominates the connection hold time — a simplification that shows the direction, not a literal throughput ceiling:

Small DB:  Query time 0.5ms  → 20 connections handle 40,000 req/s
Large DB:  Query time 50ms   → 20 connections handle 400 req/s
                              → Requests start queuing at 400 req/s
                              → Queue adds latency → more queuing → cascade

Your load test showed the system handling 3,000 RPS with headroom. Production hits a wall at 400 RPS.

Index Usage and Join Strategies

With small tables, the query planner ignores indexes entirely. It doesn't matter, because sequential scans on small tables are fast. But this means your load test never exercises index performance at all.

At scale, missing indexes become the dominant performance factor. And compound indexes matter — a query that filters on customer_id and status might use an index on customer_id but still scan thousands of rows for the status filter if there's no compound index. The bugs that only real test data catches walk through this same index-at-scale failure from the application side.

Join strategies change too. Small tables get nested loop joins. Large tables get hash joins or merge joins. These have completely different memory requirements, CPU profiles, and performance characteristics.

-- Small tables: Nested Loop Join, 0.5ms
-- Both tables fit in cache, inner loop is trivial
SELECT o.*, c.name
FROM orders o JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending';

-- Large tables: Hash Join, 180ms (or worse)
-- Hash table may spill to disk, changing performance profile entirely

The Anti-Patterns

Every performance engineering team has hit at least one of these. Some are hitting all four.

Anti-Pattern 1: The Empty Database

The most common and most dangerous. The database has whatever the migration scripts created — empty tables with correct schemas. The load test runs. Everything is fast because there's nothing to query.

# "We tested it with k6, performance is fine"
k6 run load-test.js
# ✓ P99 < 200ms
# ✓ Error rate < 0.1%
# ✓ Throughput > 2000 RPS
# (against a database with 0 rows in every table)

This is not a load test. This is a test of how fast your application can return empty result sets.

Anti-Pattern 2: The Dozen Rows

Slightly better — someone ran the seed script from the development setup. Now there are 10-20 rows per table. The queries actually return data. But the performance profile is identical to an empty database because every table fits in a single disk page.

Anti-Pattern 3: The CSV Import Marathon

The responsible engineer exports a CSV from production, scrubs it for PII (hopefully), and writes an import script. This takes three days. The data is a snapshot from last month. Half the foreign keys are broken because the export didn't capture all related tables. The import script breaks on the next schema change and nobody updates it.

# The "proper" approach that nobody maintains
psql staging_db -c "\COPY users FROM 'users_sanitized.csv' CSV HEADER"
psql staging_db -c "\COPY orders FROM 'orders_sanitized.csv' CSV HEADER"
# ERROR: insert or update on table "orders" violates foreign key constraint
# (because the users CSV was from Monday and orders CSV was from Wednesday)

Anti-Pattern 4: The "Application Layer Is the Bottleneck" Assumption

The most seductive anti-pattern. The reasoning goes: "Our bottleneck is CPU-bound application logic, not the database. So database size doesn't matter."

This is sometimes true, and it's always temporary. The moment you optimize the application layer, the database becomes the bottleneck. And by then your load tests have established baselines against an empty database, so you can't measure the improvement accurately.

What Realistic Data Actually Reveals

Teams that switch to realistic-data load testing keep hitting the same five issues. These are patterns we hear about repeatedly, not benchmarks from a lab.

Finding 1: Your Actual Throughput Is Several Times Lower

The most common discovery. The endpoint that handled 3,000 RPS with an empty database sustains 600-1,000 RPS with realistic data. The bottleneck was always the database — you just couldn't see it without data.

Finding 2: Your P99 Latency Is Far Higher

Average latency increases modestly, but tail latency explodes. The occasional query that hits an uncached page, takes a suboptimal execution plan, or waits for a connection adds seconds to the P99. Your SLA is on the P99, not the average.

Finding 3: You Have Missing Indexes

Nearly every schema with more than 20 tables has at least one missing index that only matters at scale. The load test with realistic data triggers slow query logs that were silent with 50 rows. Missing indexes are one of six bug classes that only real test data catches — alongside N+1 queries, pagination off-by-ones, and timeout cascades.

Finding 4: Your Connection Pool Is Undersized

Pool sizing calculated against sub-millisecond queries doesn't hold when queries take 10-50ms. The real fix is usually the queries themselves, not a bigger pool — adding connections often just moves the contention into PostgreSQL's backends.

Finding 5: Background Jobs Compete With Request Traffic

At small scale, background jobs (report generation, data exports, analytics queries) complete instantly and don't interfere with request-handling queries. At production scale, they compete for the same connections, buffer cache, and I/O bandwidth. Your load test never surfaced this because the background jobs also ran against an empty database.

The Fix: Seed Before You Test

The workflow is straightforward. Before running your load test, populate the database with realistic data at the volume you expect in production.

Step 1: Seed Realistic Data

# Seed production-scale data for a load test
seedfast seed --scope "seed 500,000 users with profiles, 2,000,000 orders spread across the last 12 months, payments for each order, and product reviews for 10% of orders"

Seedfast reads your schema, resolves foreign key dependencies into a valid insert order, and generates data with realistic distributions — timestamps clustered during business hours, status fields with realistic ratios, amounts following natural distributions. Because it re-reads the schema on every run, it adapts to migrations instead of breaking like the hand-written CSV importer above.

Seeding Plan:
  public.users            — 500,000 records
  public.profiles         — 500,000 records
  public.products         — 5,000 records
  public.orders           — 2,000,000 records
  public.order_items      — 5,800,000 records
  public.payments         — 2,000,000 records
  public.reviews          — 200,000 records

Total: 11,005,000 records across 7 tables

Approve? (Y/n)

Step 2: Run Your Load Test

Now run the same k6/Gatling/Locust scripts you already have. No changes needed. The application hits the same endpoints, executes the same queries — but now the database has data in it.

# Same test, dramatically different results
k6 run load-test.js

Step 3: Compare Results

The numbers will be different. That's the point. The new numbers are closer to what production will experience. The figures below are illustrative of the shape of the change — throughput down, tail latency up, pool saturated — not a benchmark of any specific application:

                    Empty DB        500K Users
P50 Latency         12ms            45ms
P99 Latency         85ms            890ms
Throughput           3,200 RPS      780 RPS
Error Rate           0.01%          2.3%
DB Connections Used  4/20           20/20 (saturated)

An error rate like that is the kind of thing pool saturation produces; a P99 that climbs like that is what a missing index on a hot query tends to look like. Load tests exist to surface those failures, but they only appear against a realistically sized database. Seedfast generates one from your own schema, so the test runs against production-shaped data instead of an empty table — run your first seed.

Integrating Into Your Load Testing Pipeline

One-off load tests are useful, but the real value comes from running them automatically. Seed the database as a pipeline step, then run the load test. The CI/CD database seeding guide covers the full setup, including ephemeral per-PR databases and non-interactive mode.

GitHub Actions

name: Load Test
on:
  schedule:
    - cron: '0 2 * * 1'  # Weekly Monday 2AM
  workflow_dispatch:

jobs:
  load-test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_DB: loadtest
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432

    steps:
      - uses: actions/checkout@v4

      - name: Run migrations
        run: npm run migrate
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/loadtest

      - name: Seed realistic data
        run: |
          seedfast seed \
            --scope "seed 200,000 users with orders and payments" \
            --output plain
        env:
          SEEDFAST_API_KEY: ${{ secrets.SEEDFAST_API_KEY }}
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/loadtest

      - name: Run load test
        run: k6 run --out json=results.json load-tests/main.js
        env:
          BASE_URL: http://localhost:3000
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/loadtest

      - name: Check thresholds
        run: |
          # Fail the pipeline if P99 exceeds threshold
          python scripts/check-load-test-results.py results.json \
            --p99-max 500 \
            --error-rate-max 1.0

GitLab CI

load-test:
  stage: performance
  services:
    - postgres:16
  variables:
    POSTGRES_DB: loadtest
    DATABASE_URL: postgresql://postgres:postgres@postgres:5432/loadtest
  script:
    - npm run migrate
    - seedfast seed --scope "seed 200,000 users with orders and payments" --output plain
    - k6 run load-tests/main.js
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
    - if: $CI_PIPELINE_SOURCE == "web"

Nightly Performance Regressions

The most powerful pattern: seed fresh data and run load tests nightly. Compare results against previous runs. Catch performance regressions before they reach production.

#!/bin/bash
# nightly-load-test.sh

# Fresh database every run
dropdb --if-exists loadtest && createdb loadtest
npm run migrate

# Seed consistent volumes
seedfast seed \
  --scope "seed 100,000 users with orders, payments, and activity logs" \
  --output plain

# Run load test with consistent parameters
k6 run \
  --out json=results/$(date +%Y-%m-%d).json \
  load-tests/main.js

# Compare with yesterday's results
python scripts/compare-results.py \
  results/$(date -d yesterday +%Y-%m-%d).json \
  results/$(date +%Y-%m-%d).json \
  --regression-threshold 15

Sizing Your Test Data

Goal	Minimum volume	What it reveals
Basic query plans	10,000+ rows	Whether indexes are being used
Cache behavior	Data > 25% of shared_buffers	Real-world cache hit rates
Connection pool sizing	Enough to make queries take >5ms	Pool saturation points
Join performance	100,000+ rows per joined table	Hash join vs. nested loop thresholds
End-to-end realism	Match production row counts	Realistic full-pipeline performance

Start with 10x your current test data. If your dev database has 100 rows per table, seed 1,000. Run the load test. Then try 10,000. Then 100,000. Watch where the performance profile changes — that's where your production issues are hiding.

# Progressive scaling approach
seedfast seed --scope "seed 1,000 users with orders and payments"
k6 run load-test.js  # Baseline

# Drop and re-seed at higher volume
seedfast seed --scope "seed 10,000 users with orders and payments"
k6 run load-test.js  # Look for non-linear degradation

seedfast seed --scope "seed 100,000 users with orders and payments"
k6 run load-test.js  # This is usually where things get interesting

The point where latency stops scaling linearly with data size is the point where your architecture has a bottleneck. Finding that inflection point is the entire purpose of load testing with realistic data. Re-seeding between volume tiers takes minutes, not the three-day CSV marathon: in our internal runs, Seedfast generated about 1M foreign-key-valid rows on a 20-table SaaS schema in roughly 3.5 minutes (local Postgres, single run). For the highest volumes, the large-volume seeding guide covers batching and scope tuning.

The Uncomfortable Truth

Load testing without realistic data is performance theater. It produces charts, it generates reports, it checks a compliance box — but it doesn't predict production behavior. The gap between "P99 under 200ms" on an empty database and "P99 at 2 seconds" in production is the gap between a successful launch and a 2AM incident.

You already invest significant effort in writing load test scenarios, tuning parameters, and analyzing results. The missing piece is the data. One command, a few minutes, and your load tests start telling you the truth.

seedfast seed --scope "seed production-scale data across all tables"

Then run the test you already have. The results will be different. They'll be real.

Frequently asked questions

Why does an empty database make load test results inaccurate?

Database performance is data-dependent. With a near-empty table, PostgreSQL's planner picks a sequential scan because the table is cheaper to read whole, every read is served from the buffer cache, and connections return to the pool in microseconds. At production volume the same query may need an index scan or hash join, reads start hitting disk, and connections stay checked out long enough to saturate the pool. So an empty-database load test really measures how fast your application returns near-empty result sets, not how it handles production query load.

How much test data do I need for a realistic load test?

Match production row counts when you can; when you cannot, start at roughly 10x your current test volume and scale up in tiers (1,000 → 10,000 → 100,000 → 1,000,000), running the load test at each step. The thresholds differ by effect. Index usage shows up around 10,000 rows, cache behavior once the data exceeds about 25% of shared_buffers, and join-strategy changes around 100,000 rows per joined table. Watch for the volume where latency stops scaling linearly — that inflection is where your bottleneck lives.

Does database size really change PostgreSQL query plans?

Yes. The planner chooses a plan from table statistics — row counts, value distributions, null fractions. At 100 rows it favors sequential scans; at 10 million rows the same query may use an index scan, a bitmap heap scan, or a parallel scan depending on selectivity, and a missing index turns into a scan that reads gigabytes. Joins flip too: small tables get nested-loop joins, large tables get hash or merge joins with entirely different memory and CPU profiles. Run EXPLAIN ANALYZE on realistic volumes to see the plan production will actually run.

Can I just copy production data into my load test environment instead?

It's a common shortcut with two separate problems. First, whether you're even allowed to move real production records into a test environment is a compliance question — under HIPAA, PCI-DSS, GDPR, or SOC 2 it is often restricted — so check with your own compliance team, not a load-test guide. Second, it's brittle: production dumps drift from the schema and break on the next migration, and partial exports leave foreign keys dangling. Seedfast addresses the second problem by generating foreign-key-valid data from your schema, so no real customer records enter the load-test pipeline. It is not an anonymization or compliance tool — it doesn't make copying prod data safe, it removes the reason to copy it at all.

How do I add database seeding to a load testing CI pipeline?

To add database seeding to a load-testing CI pipeline, add a seed step after migrations and before the load test. Set SEEDFAST_API_KEY in the environment to run the CLI non-interactively (no confirmation prompt), pass your volume as a plain-English --scope, then run your existing k6, Gatling, or Locust scenario against the seeded database. The CI/CD database seeding guide shows the full GitHub Actions and GitLab CI setup.

Small Data, Big Lies: 6 Bugs Only Real Test Data Catches — the application-side cousin of this article: the bug classes that stay hidden until your tables hold real volume
PostgreSQL Test Data: A Syntax Cookbook — the raw SQL patterns for generating volumes and realistic distributions by hand
ALTER TABLE, 5 Million Rows, and the Deploy That Took Down the Site — what happens when a migration meets production-scale data
Enterprise Database Test Data: What It Actually Looks Like — realistic relational test data across a large, multi-table schema
Get started with Seedfast — connect your database and run your first schema-aware seed

Ready to see what your load tests are actually measuring?

Get Started | Documentation | Pricing

Seedfast generates production-scale test data from your schema. Realistic volumes, valid relationships, no seed scripts to maintain.