
Database Setup and GraphileWorker Contract

This document explains how the backfill library handles database setup and the contract between your application and GraphileWorker.

Automatic Schema Management

No manual database migrations required!

The backfill library uses GraphileWorker, which automatically creates and manages its own database schema. When you initialize a BackfillClient, the following happens automatically:

  1. Schema Creation: GraphileWorker creates a graphile_worker schema (or your custom schema name)
  2. Table Creation: All necessary tables are created automatically
  3. Migration Management: GraphileWorker handles its own schema versioning and migrations

Database Schema Structure

GraphileWorker creates the following tables automatically:

Core Tables

  • _private_jobs - Stores individual jobs with their payload, priority, and execution state
  • _private_tasks - Stores task type definitions (task identifiers)
  • _private_job_queues - Manages named queues for organizing work
  • _private_known_crontabs - Handles recurring/scheduled tasks
  • migrations - Tracks GraphileWorker's own schema migrations

Key Schema Details

-- Jobs table (simplified view)
_private_jobs:
  - id: bigint (primary key)
  - task_id: integer (references _private_tasks.id)
  - job_queue_id: integer (references _private_job_queues.id) 
  - payload: json
  - priority: smallint (-32768 to 32767, lower = higher priority)
  - run_at: timestamptz
  - attempts: smallint
  - max_attempts: smallint
  - key: text (unique identifier for deduplication)
  - created_at: timestamptz
  - updated_at: timestamptz

-- Task definitions
_private_tasks:
  - id: integer (primary key)
  - identifier: text (unique, your task name like "send_email")

-- Queue definitions  
_private_job_queues:
  - id: integer (primary key)
  - queue_name: text (unique, like "fast", "bulk", "emails")
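The priority and run_at columns together determine execution order: among jobs that are due, lower priority values run first, and ties break on the earlier run_at. A dependency-free Rust sketch of that ordering (illustrative only — GraphileWorker does this selection in SQL, not in application code):

```rust
// Illustrative stand-in for a row of _private_jobs.
#[derive(Debug, PartialEq)]
struct Job {
    id: i64,
    priority: i16, // lower value = higher priority
    run_at: i64,   // epoch seconds, stand-in for timestamptz
}

// Same semantics as ordering by (priority ASC, run_at ASC).
fn next_job_order(mut jobs: Vec<Job>) -> Vec<Job> {
    jobs.sort_by_key(|j| (j.priority, j.run_at));
    jobs
}

fn main() {
    let jobs = vec![
        Job { id: 1, priority: 0, run_at: 200 },
        Job { id: 2, priority: -5, run_at: 300 }, // highest priority
        Job { id: 3, priority: 0, run_at: 100 },
    ];
    let ids: Vec<i64> = next_job_order(jobs).iter().map(|j| j.id).collect();
    println!("{:?}", ids); // [2, 3, 1]
}
```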

Database Requirements

Minimum Requirements

  • PostgreSQL 11+ (GraphileWorker requirement)
  • Connection permissions: Your database user must be able to:
    • Create schemas (CREATE SCHEMA)
    • Create tables, indexes, and functions
    • Insert, update, delete, and select data

Recommended Setup

-- Create a dedicated database for your application
CREATE DATABASE myapp;

-- Create a dedicated user with appropriate permissions
CREATE USER myapp_worker WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE myapp TO myapp_worker;

-- Grant schema creation permissions
GRANT CREATE ON DATABASE myapp TO myapp_worker;

Connection String Examples

use backfill::BackfillClient;

// Basic connection (GraphileWorker will create the schema)
let client = BackfillClient::new("postgresql://user:password@localhost/mydb").await?;

// With custom schema name
let client = BackfillClient::new_with_schema(
    \"postgresql://user:password@localhost/mydb\",
    \"custom_worker_schema\"
).await?;

// With existing connection pool
let pool = PgPoolOptions::new()
    .max_connections(10)
    .connect(\"postgresql://user:password@localhost/mydb\")
    .await?;
let client = BackfillClient::with_pool(pool).await?;

What Happens on First Run

When you call any of the BackfillClient constructors:

  1. Schema Check: GraphileWorker checks if its schema exists
  2. Automatic Migration: If not, it creates the schema and all tables
  3. Version Tracking: Future runs will apply any necessary schema updates automatically
  4. Ready to Use: Your client is immediately ready to enqueue and process jobs

Development vs Production

Development

  • GraphileWorker can create the schema on-demand
  • No separate migration step needed
  • Great for rapid iteration

Production

  • Same automatic behavior, but consider:
    • Database Permissions: Ensure your app has schema creation rights, or
    • Pre-initialization: Run schema setup during deployment with elevated permissions
    • Monitoring: Monitor the graphile_worker.migrations table for schema changes

Production Deployment Options

Option 1: App-managed (Recommended)

// Your application automatically handles schema setup
let client = BackfillClient::new(&database_url).await?;

Option 2: Deployment-managed

# During deployment, run schema initialization separately
cargo run --bin setup_database  # Your custom setup binary
cargo run --bin my_app          # Your main application

Schema Customization

Custom Schema Name

let client = BackfillClient::new_with_schema(
    &database_url, 
    \"my_job_queue\"  // Custom schema name
).await?;

Multiple Environments

// Development
let client = BackfillClient::new_with_schema(&url, "worker_dev").await?;

// Staging
let client = BackfillClient::new_with_schema(&url, "worker_staging").await?;

// Production
let client = BackfillClient::new_with_schema(&url, "worker_prod").await?;

Troubleshooting

Permission Errors

Error: permission denied to create schema "graphile_worker"

Solution: Grant CREATE permission to your database user:

GRANT CREATE ON DATABASE mydb TO myuser;

Connection Issues

Error: connection to server on socket failed

Solution: Verify your connection string and PostgreSQL service:

# Test direct connection
psql \"postgresql://user:password@localhost/mydb\"

# Check if PostgreSQL is running
pg_isready -h localhost -p 5432

Schema Conflicts

Error: relation "graphile_worker._private_jobs" already exists

Solution: This usually indicates a partially created schema. Drop it and retry (warning: this also deletes any queued jobs):

DROP SCHEMA IF EXISTS graphile_worker CASCADE;

Integration with Existing Applications

Alongside Other Tables

GraphileWorker uses its own schema, so it won't interfere with your existing application tables:

mydb:
├── public schema (your app tables)
│   ├── users
│   ├── orders  
│   └── products
└── graphile_worker schema (job queue)
    ├── _private_jobs
    ├── _private_tasks
    └── _private_job_queues

Shared Database

Multiple applications can share the same PostgreSQL instance by using different schema names:

// App 1
let client = BackfillClient::new_with_schema(&url, "app1_jobs").await?;

// App 2
let client = BackfillClient::new_with_schema(&url, "app2_jobs").await?;

SQLx Compile-Time Query Verification

This library uses SQLx for all database operations, which provides excellent compile-time verification of SQL queries. This is one of Rust's greatest strengths for database applications.

Overview

SQLx macros like sqlx::query!() and sqlx::query_as!() verify your SQL queries at compile time by:

  • Connecting to your database during compilation
  • Parsing and validating SQL syntax
  • Checking table/column existence
  • Verifying parameter types and counts
  • Generating type-safe Rust code
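Conceptually, a macro like sqlx::query_as!() turns each checked query into code that maps a database row into a plain struct with the correct Rust types. A dependency-free sketch of that mapping, using the _private_tasks columns from earlier (the Task struct and tuple row are illustrative, not SQLx's actual generated code):

```rust
// Stand-in for a fetched row: (id, identifier), as in _private_tasks.
type Row = (i32, String);

#[derive(Debug, PartialEq)]
struct Task {
    id: i32,
    identifier: String,
}

// The guarantee a query_as!-style macro gives you at compile time:
// column types line up with struct fields, so this mapping cannot
// fail with a type mismatch at runtime.
fn map_row(row: Row) -> Task {
    Task { id: row.0, identifier: row.1 }
}

fn main() {
    let task = map_row((1, "send_email".to_string()));
    println!("{:?}", task);
}
```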

Setup Options

Option 1: Online Mode (Recommended for Development)

Set the DATABASE_URL environment variable to enable compile-time verification:

# Local development
export DATABASE_URL="postgresql://localhost:5432/backfill"

# Or in .env file
DATABASE_URL=postgresql://localhost:5432/backfill

Pros:

  • Always up-to-date with current schema
  • Works great in CI with test databases
  • No additional files to maintain

Cons:

  • Requires database connection during compilation
  • Slower compilation times

Option 2: Offline Mode (Recommended for Distribution)

Generate query metadata offline with the sqlx CLI:

# Install the CLI once
cargo install sqlx-cli --no-default-features --features postgres

# First, ensure DATABASE_URL is set
export DATABASE_URL="postgresql://localhost:5432/backfill"

# Generate the metadata
cargo sqlx prepare

# This creates the .sqlx/ directory of query metadata files

Pros:

  • No database required for compilation
  • Faster compilation
  • Can be committed to version control
  • Good for distributing crates

Cons:

  • Must remember to regenerate after schema changes
  • Metadata can become stale

Option 3: Hybrid Approach (Best of Both Worlds)

Use online mode in development/CI and offline mode for releases:

# Development: Use online mode
export DATABASE_URL="postgresql://localhost:5432/backfill"
cargo build

# Before release: Generate offline metadata
cargo sqlx prepare
git add .sqlx
git commit -m "Update SQLx query metadata"

# CI: Check metadata is current
cargo sqlx prepare --check

CI/CD Integration

Example GitHub Actions workflow:

name: Test

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:17
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: backfill
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432

    steps:
    - uses: actions/checkout@v4
    - uses: dtolnay/rust-toolchain@stable

    - name: Install sqlx-cli
      run: cargo install sqlx-cli --no-default-features --features postgres

    - name: Check SQLx queries
      run: cargo sqlx prepare --check
      env:
        DATABASE_URL: postgresql://postgres:postgres@localhost:5432/backfill

    - name: Run tests
      run: cargo test
      env:
        DATABASE_URL: postgresql://postgres:postgres@localhost:5432/backfill

Best Practices

  1. Always regenerate metadata after schema changes with cargo sqlx prepare
  2. Commit the .sqlx/ directory for reproducible builds
  3. Test both online and offline modes in CI
  4. Keep DATABASE_URL in .env for local development

Performance Impact

  • Compile time: +10-30% with online mode, minimal with offline
  • Runtime: Zero overhead - queries are pre-compiled
  • Binary size: Slightly larger due to generated code
  • Safety: Eliminates entire classes of runtime errors

The safety benefits far outweigh the small compilation cost!

Summary

  • No manual migrations needed - GraphileWorker handles everything
  • Automatic schema creation - Just provide a connection string
  • Production ready - Built-in migration management
  • Flexible deployment - Works in any environment setup
  • Non-intrusive - Uses separate schema from your app tables
  • Type-safe queries - SQLx compile-time verification for safety

The backfill library makes job queues as easy as connecting to a database!