Seedfast

CI/CD Database Seeding

  • Use SEEDFAST_API_KEY environment variable for non-interactive CI/CD authentication
  • The --scope flag accepts natural language descriptions of what to seed
  • --output json enables machine-readable output for pipeline validation
  • Always run database migrations before seeding
  • Create separate API keys per environment (dev, staging, production)

CI/CD database seeding means generating fresh test data inside the pipeline on every run, from the database schema rather than from a copied dump. Seedfast reduces test data provisioning for a CI/CD pipeline to one step: it reads the live schema, resolves foreign keys, and writes valid rows. That gives you automated test data provisioning with no production data and no seed file to maintain. It builds on the broader database seeding workflow.

Most teams rely on one of these approaches, and each has drawbacks:

Static SQL dumps contain real or semi-anonymized production data, drift from live schemas, and require constant maintenance.

Custom seeding scripts become unmaintainable code that breaks with every schema change.

Shared test databases create race conditions and test pollution between concurrent runs.

There's also a deeper issue with any approach based on production data: even anonymized or masked records start from something real. That creates compliance exposure, because there's always something that needs protecting, auditing, or GDPR-scoping.

Seedfast generates data from scratch on every run. Your production records are never involved:

  • No production data: data is synthesized from your schema on every pipeline run, so there's nothing to anonymize or protect
  • Schema-adaptive: Seedfast reads your live schema, so you don't rewrite scripts after migrations
  • Isolated per run: each pipeline job seeds its own database, with no shared state and no race conditions

Make sure you have:

  • A Seedfast account (sign up at seedfa.st)
  • A PostgreSQL database accessible from your CI/CD environment
  • Access to configure secrets in your CI/CD platform

API keys authenticate the Seedfast CLI in non-interactive environments where browser-based login isn't possible.

  1. Log in and open the Seedfast Dashboard
  2. Click API Keys in the left sidebar
  3. Click Create new key and enter a name (e.g. "GitHub Actions")
  4. Copy the key immediately; it won't be shown again

Your API key format: sfk_live_a1b2c3d4e5f6...

Tip: Create separate keys for each environment (dev, staging, production). This enables independent rotation and revocation.

Store your API key and database credentials as encrypted secrets in your CI/CD platform.

GitHub Actions: Go to repository → Settings → Secrets and variables → Actions → New repository secret:

SEEDFAST_API_KEY=sfk_live_...
SEEDFAST_DSN=postgres://user:pass@host:5432/db

Note: Use SEEDFAST_DSN rather than DATABASE_URL. Many CI environments already set DATABASE_URL for other tools, so using SEEDFAST_DSN avoids conflicts. The CLI checks SEEDFAST_DSN first, then falls back to DATABASE_URL.
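As a minimal sketch, a preflight step can mirror that lookup order in shell, so the job log shows which variable will supply the connection string before Seedfast runs. The function name resolve_dsn_source is hypothetical and not part of the Seedfast CLI. Treating an empty variable the same as an unset one is also an assumption, since the CLI's handling of empty values isn't documented here.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper (not part of the Seedfast CLI).
# Mirrors the documented lookup order: SEEDFAST_DSN first, then DATABASE_URL.
# Assumption: an empty variable is treated the same as an unset one.
resolve_dsn_source() {
  if [ -n "${SEEDFAST_DSN:-}" ]; then
    echo "SEEDFAST_DSN"
  elif [ -n "${DATABASE_URL:-}" ]; then
    echo "DATABASE_URL"
  else
    echo "none"
    return 1
  fi
}
```

In a pipeline step, `echo "DSN source: $(resolve_dsn_source || true)"` prints only the variable name, never the connection string itself, so credentials stay out of the log.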

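GitLab CI: Go to project → Settings → CI/CD → Variables. Mark each variable as Masked and Protected. Protected variables are only exposed to pipelines that run on protected branches or tags, so leave Protected off if feature-branch pipelines need the key.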

CircleCI: Go to Project Settings → Environment Variables.

The simplest way to run Seedfast in CI/CD is via npx:

npx seedfast seed --scope "Your scope description" --output json

The --scope flag describes what data to seed in plain English:

# Seed by schema
--scope "HR schema: employees, departments, payroll"

# Seed by use case
--scope "Customers and orders for checkout flow testing"

# Seed with specificity
--scope "Single user with 3 pending orders"

When --scope is provided, Seedfast automatically approves the seeding plan without prompting, which is what makes a fully automated pipeline possible.

GitHub Actions:

name: E2E Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: testpass
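        ports:
          - 5432:5432
        # Hold the job's steps until Postgres accepts connections
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5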
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run db:migrate
      - name: Seed test data
        run: npx seedfast seed --scope "Users and orders for E2E tests" --output json
        env:
          SEEDFAST_API_KEY: ${{ secrets.SEEDFAST_API_KEY }}
          SEEDFAST_DSN: postgres://postgres:testpass@localhost:5432/postgres
      - run: npm run test:e2e

GitLab CI:

test:
  stage: test
  services:
    - postgres:15
  variables:
    POSTGRES_PASSWORD: testpass
    SEEDFAST_DSN: postgres://postgres:testpass@postgres:5432/postgres
  script:
    - npm ci
    - npm run db:migrate
    - npx seedfast seed --scope "Test dataset" --output plain
    - npm run test

The --output flag controls output format:

Mode         Use case
interactive  Colors and spinners (default, for local dev)
plain        Timestamped logs (for CI log output)
json         Machine-readable (for script validation)

Validate results with JSON output:

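# Capture the JSON report; quote it on every use so the shell doesn't word-split or glob it
RESULT=$(npx seedfast seed --scope "..." --output json)

if [ "$(printf '%s' "$RESULT" | jq -r '.success')" != "true" ]; then
  echo "Seeding failed!" >&2
  exit 1
fi

echo "Seeded $(printf '%s' "$RESULT" | jq -r '.rows') rows"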

Exit codes:

Code  Meaning
0     Success
1     Generic error
2     Authentication failed (check API key)
3     Database connection failed (check SEEDFAST_DSN)
4     Quota exceeded
5     Operation cancelled by user
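One way to use these codes in a pipeline is to translate them into a log message at the point of failure. The sketch below assumes the codes behave exactly as listed; describe_exit_code and run_and_report are hypothetical helper names, not Seedfast commands.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the Seedfast CLI.

# Map an exit code from the list above to its documented meaning.
describe_exit_code() {
  case "$1" in
    0) echo "Success" ;;
    1) echo "Generic error" ;;
    2) echo "Authentication failed (check API key)" ;;
    3) echo "Database connection failed (check SEEDFAST_DSN)" ;;
    4) echo "Quota exceeded" ;;
    5) echo "Operation cancelled by user" ;;
    *) echo "Unrecognized exit code $1" ;;
  esac
}

# Run any command, log what its exit code means, and pass the code through.
# The "|| code=$?" form keeps working in shells started with "set -e".
run_and_report() {
  local code=0
  "$@" || code=$?
  echo "seed step: $(describe_exit_code "$code")" >&2
  return "$code"
}
```

A seeding step would then call `run_and_report npx seedfast seed --scope "..." --output plain`, and the job still fails with the original exit code.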

  • Never commit API keys. Use CI/CD secret management.
  • Separate keys per environment to enable independent rotation.
  • Rotate regularly: generate a new key, update secrets, verify, then revoke the old key.
  • Limit scope: expose the API key only to the seeding step:

# Key only in the seeding step, not globally
- name: Seed database
  run: npx seedfast seed --scope "..."
  env:
    SEEDFAST_API_KEY: ${{ secrets.SEEDFAST_API_KEY }}

"Authentication failed" in CI but works locally: Verify the secret name matches exactly. Check for leading/trailing whitespace. Ensure the API key hasn't been revoked.

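The whitespace and empty-secret cases above can be caught before the CLI runs. This is a rough sketch: check_api_key is a hypothetical helper, and the sfk_ prefix check is an assumption drawn only from the example key shown earlier, so it warns rather than fails.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check, not part of the Seedfast CLI.
# Returns non-zero when the key is empty or contains whitespace.
check_api_key() {
  local key="${1-}"
  if [ -z "$key" ]; then
    echo "SEEDFAST_API_KEY is empty or unset (does the secret name match exactly?)" >&2
    return 1
  fi
  case "$key" in
    *[[:space:]]*)
      echo "SEEDFAST_API_KEY contains whitespace (check how the secret was pasted)" >&2
      return 1
      ;;
  esac
  # Assumption: keys start with "sfk_", as in the example key format.
  case "$key" in
    sfk_*) ;;
    *) echo "warning: SEEDFAST_API_KEY does not start with sfk_" >&2 ;;
  esac
  return 0
}
```

Calling `check_api_key "${SEEDFAST_API_KEY-}"` at the start of the seeding step reports the problem without printing the key. A revoked key can't be detected this way; that still has to be checked in the dashboard.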
"Connection refused" to database: Wait for database health check to pass. Use the correct hostname (localhost for GitHub Actions, postgres for GitLab CI). Check that the CI runner can reach the database port.

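Where the CI platform doesn't hold the job until the database is healthy, a small retry loop before migrations and seeding can do the waiting. The sketch below is generic shell; wait_for is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_for <attempts> <delay-seconds> <command> [args...]
wait_for() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Discard the probe's own output; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    echo "attempt $i/$attempts failed" >&2
    # Don't sleep after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_for 30 2 pg_isready -h localhost -p 5432` polls for up to about a minute. This assumes the PostgreSQL client tools are installed on the runner; use `-h postgres` on GitLab CI.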
Seeding takes too long in pipeline: Narrow the scope to only the tables your tests need. A focused scope like "users and orders for checkout tests" is faster than "all tables."
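One way to keep scopes narrow is to give each test suite its own scope string instead of sharing a broad one. In this sketch the suite names (checkout, e2e) and the scope_for_suite helper are hypothetical; the scope strings reuse examples from this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick a focused scope per test suite.
# Suite names are illustrative; adjust them to your own jobs.
scope_for_suite() {
  case "$1" in
    checkout) echo "users and orders for checkout tests" ;;
    e2e)      echo "Users and orders for E2E tests" ;;
    *)
      echo "no scope defined for suite: $1" >&2
      return 1
      ;;
  esac
}
```

A job would then run `npx seedfast seed --scope "$(scope_for_suite checkout)" --output plain`. An unknown suite fails the step rather than falling back to seeding everything.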

Run a schema-aware generator as a pipeline step. Authenticate the Seedfast CLI with SEEDFAST_API_KEY, run migrations, then call seedfast seed --scope "...". The generator reads the live schema on that run, resolves foreign keys, and writes valid rows, so a migration shipped in the same branch is picked up automatically. There is no static fixture file tracking the schema and no production dump to anonymize. One data-path detail worth noting for a security review: the schema definition (table and column names, types, constraints) is sent to an AI provider to generate the values. Row data never leaves your database, but if a table or column name is itself sensitive, confirm that path fits your policy.

A generated dataset is built from the schema and contains no real records, so it carries no production PII. That keeps personal data out of that environment and supports the confidentiality and privacy controls a GDPR or SOC 2 program cares about. (It doesn't remove the environment from audit scope by itself, since scope is the boundary your auditor defines, but it removes the real-data-handling obligations that a production copy would drag in.) A dump, even masked, starts from production and brings those obligations with it. The same generate-don't-copy logic applies to longer-lived environments. See building a staging environment without copying production.

The same seedfast seed step works for unit, integration, and E2E jobs. Point it at each job's own database so runs stay isolated. The pipeline pattern is the same on managed Postgres: see how to seed a Supabase database and how to seed a Neon database for the per-branch specifics. When you're comparing generators for a Postgres pipeline, the best Postgres test data generator guide weighs the options, and data seeding tools covers the wider category including the masking-versus-synthetic tradeoff. For the buyer's case — why generate in the pipeline at all instead of copying production or maintaining a seed file — see synthetic data for CI/CD.
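When jobs share one Postgres server rather than each starting its own service container, per-job isolation can come from a database name derived from the CI run identifier. The sketch below is an assumption about how you might set that up, not a Seedfast feature; job_db_name is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a per-job database name from a CI run identifier.
# Lowercases the id and replaces anything outside [a-z0-9_] with "_" so the
# result is safe as an unquoted PostgreSQL identifier.
job_db_name() {
  local id="$1"
  id=$(printf '%s' "$id" | tr 'A-Z' 'a-z' | tr -c 'a-z0-9_' '_')
  echo "test_${id}"
}
```

On GitHub Actions, for instance, `job_db_name "$GITHUB_RUN_ID"` gives a name you can append to SEEDFAST_DSN. The database has to exist before migrations run (for example via PostgreSQL's `createdb`), and parallel jobs within one run need an extra suffix to avoid colliding.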

CI/CD database seeding with Seedfast takes minutes to set up. Start with one test job, refine your scope, then expand to your full test suite. Try Seedfast and start seeding in minutes.