Overview

By default, pipelines run as long-running deployments that continuously process data. Job mode allows you to run a pipeline as a one-time task that runs to completion and then exits.

When to Use Job Mode

Use job mode for:
  • Historical data backfills: Process a specific range of historical data once
  • One-time data migrations: Move data from one system to another
  • Scheduled batch processing: Run as a cron job for periodic processing
  • Testing and development: Quick runs without maintaining a long-running deployment
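
For the scheduled batch-processing case, a job-mode pipeline can be driven by cron: each run deletes the previous job (a pipeline that already exists as a job cannot be redeployed until it is deleted) and applies the pipeline file again. A sketch with a hypothetical pipeline name, schedule, and file path:

```shell
# crontab entry: rerun the job pipeline nightly at 02:00.
# The delete clears any previous run; without it, redeploying a
# pipeline that still exists as a job returns a conflict error.
0 2 * * * goldsky turbo delete nightly-batch; goldsky turbo apply -f /path/to/nightly-batch.yaml
```

Because jobs are auto-deleted an hour after they terminate, the delete step is often a no-op, but including it keeps the schedule robust when runs overlap the cleanup window.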

Configuration

Add job: true to your pipeline configuration:
name: solana-token-backfill
resource_size: m
job: true  # Run as a one-time job

sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_at: earliest  # Process all historical data

transforms:
  filtered_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT *
      FROM token_transfers
      WHERE slot BETWEEN 250000000 AND 250000002

sinks:
  postgres_output:
    type: postgres
    from: filtered_transfers
    schema: public
    table: token_transfers
    secret_name: MY_POSTGRES
    primary_key: id

Job Behavior

When job: true is set:
  1. Runs to Completion: The pipeline processes all available data and then exits
  2. No Restarts: Failed jobs do not automatically restart (unlike deployments)
  3. Auto-Cleanup: Jobs are automatically deleted 1 hour after termination (success or failure)
  4. Cannot Switch: Once deployed as a job, you cannot update it to a deployment without deleting it first
Job mode pipelines must be deleted before redeploying. If you try to deploy a pipeline that exists as a job, you’ll receive a conflict error. Use goldsky turbo delete <pipeline-name> first.

Example: EVM Historical Backfill

Process a specific range of Ethereum blocks:
name: ethereum-backfill
resource_size: l
job: true

sources:
  ethereum_transfers:
    type: dataset
    dataset_name: ethereum.erc20_transfers
    version: 1.0.0
    start_at: earliest

transforms:
  filtered_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT *
      FROM ethereum_transfers
      WHERE block_number BETWEEN 15000000 AND 16000000

sinks:
  postgres_output:
    type: postgres
    from: filtered_transfers
    schema: public
    table: historical_transfers
    secret_name: MY_POSTGRES
    primary_key: id

Example: Solana Block Range Processing

Process a specific range of Solana blocks:
name: solana-backfill
resource_size: l
job: true

sources:
  solana_txs:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0
    start_block: 312000000
    end_block: 312100000  # Job will stop after processing this block

transforms:
  processed_txs:
    type: sql
    primary_key: id
    sql: |
      SELECT *
      FROM solana_txs

sinks:
  clickhouse_archive:
    type: clickhouse
    from: processed_txs
    table: solana_historical_txs
    primary_key: id
    secret_name: MY_CLICKHOUSE

Monitoring Job Status

View job status with:
# Check job status
goldsky turbo list

# View job logs
goldsky turbo logs <pipeline-name>

# Check if job completed
goldsky turbo status <pipeline-name>
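
Because jobs are cleaned up an hour after termination, it can help to poll for completion and pull logs right away. The marker word and the idea of grepping the status output are assumptions, not documented goldsky behavior — substitute whatever `goldsky turbo status` actually prints:

```shell
# poll_until: run a status command repeatedly until its output contains
# a marker word, up to a maximum number of attempts. Returns 0 on match,
# 1 if the marker never appeared.
poll_until() {
  cmd=$1; marker=$2; max=$3
  i=0
  while [ "$i" -lt "$max" ]; do
    if $cmd | grep -q "$marker"; then
      return 0
    fi
    i=$((i + 1))
    sleep 5
  done
  return 1
}

# Hypothetical usage, assuming the status text contains "COMPLETED":
#   poll_until "goldsky turbo status my-backfill" COMPLETED 120 \
#     && goldsky turbo logs my-backfill > my-backfill.log
```

The helper itself is plain POSIX shell, so the same pattern works for any CLI whose status output you can match on.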

Job vs Deployment

| Feature | Job (job: true) | Deployment (job: false, default) |
| --- | --- | --- |
| Duration | Runs to completion | Continuous processing |
| Restart on failure | No | Yes (automatic) |
| Resource cleanup | Auto-deleted 1 hour after termination | Persists until deleted |
| Use case | One-time tasks, backfills | Real-time streaming |
| Update behavior | Must delete first | Can update in-place |

Best Practices

When using job mode, define clear start and end points for your data:
transforms:
  bounded_data:
    type: sql
    sql: |
      SELECT * FROM source
      WHERE block_number BETWEEN 15000000 AND 16000000
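For very large ranges, one pattern is to split the backfill into several smaller bounded jobs, so a failure only loses one chunk. A hypothetical helper for computing the sub-ranges — the chunk size and the one-job-per-chunk idea are illustrative, not a Goldsky feature:

```python
def chunk_range(start: int, end: int, size: int) -> list[tuple[int, int]]:
    """Split the inclusive block range [start, end] into inclusive
    sub-ranges of at most `size` blocks, e.g. one per job config."""
    chunks = []
    lo = start
    while lo <= end:
        hi = min(lo + size - 1, end)
        chunks.append((lo, hi))
        lo = hi + 1
    return chunks

# Split the 15M-16M example range into 250k-block jobs:
for lo, hi in chunk_range(15_000_000, 16_000_000, 250_000):
    print(f"WHERE block_number BETWEEN {lo} AND {hi}")
```

Each printed predicate can be dropped into the transform SQL of its own job-mode pipeline, run, deleted, and followed by the next chunk.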
For large backfills, use larger resource sizes to process data faster:
resource_size: l  # Use large for big historical jobs
Jobs are auto-deleted 1 hour after termination. Review or export logs before cleanup:
goldsky turbo logs <pipeline-name>
Always delete completed jobs before redeploying:
goldsky turbo delete <pipeline-name>
goldsky turbo apply -f <pipeline-file>

Next Steps