Overview

By default, pipelines run as long-running deployments that continuously process data. Job mode instead runs a pipeline as a one-time task that processes to completion and then exits.
Automatic pipeline termination is currently only supported for Solana dataset sources using the end_block configuration. For other dataset sources (EVM chains, etc.), the pipeline will not terminate on its own — see limitations below.

When to use job mode

Use job mode for:
  • Historical Solana backfills: Process a specific range of Solana blocks using end_block
  • One-time data migrations: Move data from one system to another (requires manual termination for non-Solana sources)
  • Testing and development: Quick runs without maintaining a long-running deployment

Configuration

Add job: true to your pipeline configuration along with end_block on your Solana source:
name: solana-token-backfill
resource_size: m
job: true

sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 250000000
    end_block: 250000002

transforms:
  filtered_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT *
      FROM token_transfers

sinks:
  postgres_output:
    type: postgres
    from: filtered_transfers
    schema: public
    table: token_transfers
    secret_name: MY_POSTGRES
    primary_key: id

How termination works

For Solana sources with end_block set:
  1. The source processes blocks up to end_block
  2. The engine waits for all checkpoints to finalize, ensuring data is fully persisted
  3. The pipeline terminates cleanly
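The three steps above can be sketched as a small simulation. This is a conceptual model only: the `Checkpointer` class and `run_job` function are names invented here, not the engine's actual API, and it assumes `end_block` is inclusive (matching the three-block range in the example config).

```python
# Conceptual sketch of the job-mode shutdown sequence.
# All names here are hypothetical; this is not the engine's real code.

class Checkpointer:
    """Tracks checkpoints that must be persisted before exit."""

    def __init__(self):
        self.pending = []

    def record(self, block):
        self.pending.append(block)

    def flush(self):
        # Persist every outstanding checkpoint so no data is lost.
        persisted = list(self.pending)
        self.pending.clear()
        return persisted


def run_job(start_block, end_block):
    cp = Checkpointer()
    # 1. Process blocks up to (and, we assume, including) end_block.
    for block in range(start_block, end_block + 1):
        cp.record(block)
    # 2. Wait for all checkpoints to finalize (data fully persisted).
    persisted = cp.flush()
    # 3. Terminate cleanly, returning what was persisted.
    return persisted


blocks = run_job(250_000_000, 250_000_002)
print(len(blocks))  # 3 blocks: start_block..end_block inclusive
```

The key point the sketch illustrates is step 2: termination only happens after the checkpoint state is flushed, so a clean exit implies the sink has received everything up to `end_block`.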
Setting job: true deploys the pipeline as a Kubernetes Job instead of a Deployment, which means:
  • No automatic restarts on failure
  • Auto-cleanup 1 hour after the process exits

Job behavior

When job: true is set:
  1. No restarts: Failed jobs do not automatically restart (unlike deployments)
  2. Auto-cleanup: Jobs are automatically deleted 1 hour after termination (success or failure)
  3. Cannot switch: Once deployed as a job, you cannot update to a deployment without deleting it first
Job mode pipelines must be deleted before redeploying. If you try to deploy a pipeline that exists as a job, you’ll receive a conflict error. Use goldsky turbo delete <pipeline-name> first.

Example: Solana block range processing

name: solana-backfill
resource_size: l
job: true

sources:
  solana_txs:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0
    start_block: 312000000
    end_block: 312100000

transforms:
  processed_txs:
    type: sql
    primary_key: id
    sql: |
      SELECT *
      FROM solana_txs

sinks:
  clickhouse_archive:
    type: clickhouse
    from: processed_txs
    table: solana_historical_txs
    primary_key: id
    secret_name: MY_CLICKHOUSE

Limitations

Job mode automatic termination (end_block) is currently only supported for Solana dataset sources. For other sources (EVM chains like Ethereum, Arbitrum, etc.):
  • The end_block configuration is not recognized by EVM dataset sources
  • SQL-level filtering (e.g., WHERE block_number BETWEEN ...) filters data but does not cause the pipeline to terminate — the source continues consuming data indefinitely
  • Setting job: true deploys as a Kubernetes Job, but the pipeline process itself will not exit
If you need to run a bounded job on a non-Solana dataset, you must manually stop the pipeline after it has processed the data you need:
goldsky turbo delete <pipeline-name>
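For non-Solana sources, one way to bound the run is a small watcher script that polls the sink for progress and stops the pipeline once the target block has been written. Everything below is a hypothetical sketch: `query_max_block` and `stop_pipeline` are placeholders for your own implementations (for example, a `SELECT max(block_number)` against the sink and a `goldsky turbo delete` invocation).

```python
# Hypothetical watcher for non-Solana sources: the pipeline will not
# exit on its own, so poll the sink until the target block appears,
# then stop the pipeline yourself. Both callables are placeholders.
import time


def wait_and_stop(query_max_block, stop_pipeline, target_block,
                  poll_seconds=30):
    """Block until the sink has reached target_block, then stop."""
    while query_max_block() < target_block:
        time.sleep(poll_seconds)
    stop_pipeline()


# Illustration with fakes standing in for a sink query and a CLI call:
progress = iter([100, 200, 300])
stopped = []
wait_and_stop(lambda: next(progress), lambda: stopped.append(True),
              target_block=250, poll_seconds=0)
print(stopped)  # [True] -- pipeline was "stopped" once progress >= 250
```

In practice the poll interval should be generous (tens of seconds), and the watcher should also handle the sink query failing transiently rather than crashing mid-backfill.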

Monitoring job status

View job status with:
# Check job status
goldsky turbo list

# View job logs
goldsky turbo logs <pipeline-name>

Job vs deployment

Feature            | Job (job: true)                       | Deployment (job: false, default)
-------------------|---------------------------------------|---------------------------------
Duration           | Runs to completion (Solana only)      | Continuous processing
Restart on failure | No                                    | Yes (automatic)
Resource cleanup   | Auto-deleted 1 hour after termination | Persists until deleted
Use case           | One-time Solana backfills             | Real-time streaming
Update behavior    | Must delete first                     | Can update in-place

Best practices

For Solana dataset sources, always define start_block and end_block on the source to ensure the pipeline terminates cleanly:
sources:
  solana_data:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0
    start_block: 250000000
    end_block: 251000000
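As a quick local sanity check before deploying, you could validate the parsed pipeline YAML for these fields. This is a standalone helper sketch, not part of the Goldsky CLI; `validate_job_config` is a name invented here, and it assumes the config has already been parsed into a dict.

```python
# Local helper sketch (not part of the Goldsky CLI): flag job-mode
# configs whose Solana dataset sources are missing start/end blocks.

def validate_job_config(config):
    """Return a list of problems with a job-mode pipeline config dict."""
    problems = []
    if not config.get("job"):
        problems.append("job: true is not set")
    for name, source in config.get("sources", {}).items():
        # Only Solana datasets support end_block-driven termination.
        if source.get("dataset_name", "").startswith("solana."):
            for key in ("start_block", "end_block"):
                if key not in source:
                    problems.append(f"source {name!r} is missing {key}")
    return problems


cfg = {
    "job": True,
    "sources": {
        "solana_data": {
            "type": "dataset",
            "dataset_name": "solana.transactions",
            "start_block": 250_000_000,
            # end_block forgotten -> should be flagged
        }
    },
}
print(validate_job_config(cfg))  # ["source 'solana_data' is missing end_block"]
```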
For large backfills, use larger resource sizes to process data faster:
resource_size: l  # Use large for big historical jobs
Jobs are automatically deleted 1 hour after termination, so capture any logs you need before cleanup:
goldsky turbo logs <pipeline-name>
Always delete completed jobs before redeploying:
goldsky turbo delete <pipeline-name>
goldsky turbo apply -f <pipeline-file>