# Job mode

> Run pipelines as one-time tasks instead of continuous streams

## Overview

By default, pipelines run as long-running deployments that continuously process data. Job mode allows you to run a pipeline as a one-time task that runs to completion and then exits.

<Note>
  Job mode requires sources that can signal completion. **Solana datasets** always support this, as do **EVM datasets configured for [fast scan](/turbo-pipelines/sources/evm#fast-scan)** (a `filter:` expression with `start_at: earliest`, or with `start_at` omitted). Plain EVM datasets without fast scan cannot be used with `job: true`; see [limitations](#limitations) below.
</Note>

## When to use job mode

Use job mode for:

* **Historical Solana backfills**: Process a specific range of Solana blocks using `end_block` — the pipeline self-terminates when the range completes.
* **Historical EVM backfills**: Backfill filtered EVM data using [fast scan](/turbo-pipelines/sources/evm#fast-scan). Include an upper bound on `block_number` inside the `filter:` expression (e.g., `... AND block_number <= 20000000`) and the pipeline self-terminates once the bounded scan is complete. `end_block` is **not** supported on EVM sources.
* **One-time data migrations**: Move data from one system to another.
* **Testing and development**: Quick runs without maintaining a long-running deployment.

## Configuration

Add `job: true` to your pipeline configuration and bound every source. For a Solana source, set `end_block`:

```yaml theme={null}
name: solana-token-backfill
resource_size: m
job: true

sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 250000000
    end_block: 250000002

transforms:
  filtered_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT *
      FROM token_transfers

sinks:
  postgres_output:
    type: postgres
    from: filtered_transfers
    schema: public
    table: token_transfers
    secret_name: MY_POSTGRES
    primary_key: id
```

## How termination works

Job mode requires every source in the pipeline to be **bounded**, meaning it has a finite end so the pipeline knows when it is done. If any source is unbounded, the engine rejects `job: true` at deploy time with an error naming the offending sources.

Two ways to produce a bounded source:

* **Solana datasets** are bounded by `start_block` + `end_block` (or `block_ranges` for multiple windows).
* **EVM datasets with [fast scan](/turbo-pipelines/sources/evm#fast-scan)** — a `filter:` expression with `start_at: earliest` (or omitted) — are bounded by including an upper limit on `block_number` inside the `filter:` (e.g., `... AND block_number <= 20000000`). The top-level `end_block` field is **not** supported on EVM dataset sources; it is silently ignored.

For a bounded source:

1. The source processes blocks up to and including the upper bound (Solana `end_block` or the `block_number` limit in an EVM `filter:`).
2. The engine waits for the checkpoint covering that bound to finalize, ensuring data is fully persisted to sinks.
3. The pipeline process exits cleanly.

Setting `job: true` deploys the pipeline as a Kubernetes Job instead of a Deployment, which means:

* **No automatic restarts** on failure (Kubernetes `backoffLimit: 0`).
* **Auto-cleanup** 1 hour after the process exits (success or failure).
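
For readers who know Kubernetes, the two properties above map roughly onto standard Job fields. The manifest below is only an illustrative sketch, not the manifest Goldsky generates: the metadata name, container name, and image are placeholders, and the assumption that cleanup is implemented with `ttlSecondsAfterFinished` is a guess.

```yaml
# Illustrative Kubernetes Job, NOT the manifest Goldsky generates.
apiVersion: batch/v1
kind: Job
metadata:
  name: solana-token-backfill    # hypothetical: reuses the pipeline name
spec:
  backoffLimit: 0                # no retries after a failure
  ttlSecondsAfterFinished: 3600  # assumed mechanism: delete the Job 1 hour after it finishes
  template:
    spec:
      restartPolicy: Never       # a Job's pod template must use Never or OnFailure
      containers:
        - name: pipeline         # hypothetical container name
          image: example.invalid/turbo-pipeline:latest  # placeholder image
```

You never write or apply this manifest yourself; `job: true` in the pipeline configuration is the only switch.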

## Job behavior

When `job: true` is set:

1. **No restarts**: Failed jobs do not restart automatically (unlike deployments).
2. **Auto-cleanup**: Jobs are automatically deleted 1 hour after termination (success or failure).
3. **No restart command**: `goldsky turbo restart` is not supported for jobs; delete and re-apply instead.
4. **Cannot switch modes in place**: A pipeline name deployed as a job cannot be redeployed as a deployment (or vice versa); you must delete it first.

<Warning>
  Switching between job and deployment mode requires deleting the existing pipeline first. If you try to deploy a pipeline with `job: false` over an existing job (or vice versa), you'll receive a conflict error. Run `goldsky turbo delete <pipeline-name>` before redeploying with the new mode.
</Warning>

## Example: Solana block range processing

```yaml theme={null}
name: solana-backfill
resource_size: l
job: true

sources:
  solana_txs:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0
    start_block: 312000000
    end_block: 312100000

transforms:
  processed_txs:
    type: sql
    primary_key: id
    sql: |
      SELECT *
      FROM solana_txs

sinks:
  clickhouse_archive:
    type: clickhouse
    from: processed_txs
    table: solana_historical_txs
    primary_key: id
    secret_name: MY_CLICKHOUSE
```

<Tip>
  To backfill multiple disjoint Solana slot windows in a single job, use [`block_ranges`](/turbo-pipelines/sources/solana#multiple-block-ranges) instead of `start_block`/`end_block`. It accepts a JSON array of `[start, end]` pairs and terminates cleanly once every range has been processed.
</Tip>
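
A minimal sketch of a multi-window source, assuming `block_ranges` is written inline as a JSON-style array of `[start, end]` pairs and replaces `start_block`/`end_block`. The two windows are illustrative values; check the linked Solana source page for the exact accepted form.

```yaml
sources:
  solana_txs:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0
    # Assumed syntax: a JSON array of [start, end] pairs.
    # Both windows below are made-up example ranges.
    block_ranges: [[312000000, 312100000], [312500000, 312600000]]
```

With `job: true` set at the top level, the pipeline should exit once both windows have been processed.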

## Example: EVM fast-scan backfill

```yaml theme={null}
name: base-usdc-backfill
resource_size: m
job: true

sources:
  base_usdc_logs:
    type: dataset
    dataset_name: base.raw_logs
    version: 1.0.0
    start_at: earliest
    filter: address = '0x833589fcd6edb6e08f4c7c32d4f71b54bda02913' AND block_number <= 20000000

transforms:
  parsed_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT *
      FROM base_usdc_logs

sinks:
  postgres_output:
    type: postgres
    from: parsed_transfers
    schema: public
    table: base_usdc_transfers_backfill
    primary_key: id
    secret_name: MY_POSTGRES
```

The `filter` + `start_at: earliest` combination enables [fast scan](/turbo-pipelines/sources/evm#fast-scan), which bounds the source to a backfill over the Goldsky data lake. The `block_number <= 20000000` clause inside the filter makes the backfill finite, so the pipeline terminates cleanly once block 20,000,000 is processed.

<Note>
  EVM dataset sources do **not** support the top-level `end_block` field — it is silently ignored. Always express the upper block bound inside the `filter:` expression. `end_block` works for Solana sources only.
</Note>

## Limitations

Job mode requires every source in the pipeline to be bounded. In practice that means:

* **Plain EVM datasets cannot use job mode.** Without a `filter:`, an EVM dataset is a continuous stream from the chain tip with no end. Applying it with `job: true` is rejected at deploy time with a validation error.
* **Not every chain supports fast scan.** Even with a `filter:`, EVM job mode only works on chains where the underlying datasets support fast scan. Check the **Fast Scan** column in [supported chains](/chains/supported-networks) — chains marked `✗`, or specific datasets marked with an asterisk `*`, do not support fast scan and cannot be used with `job: true`.
* **SQL-level filtering does not bound a source.** `WHERE block_number BETWEEN ...` in a transform filters rows but does not cause the source to stop — the pipeline keeps consuming data indefinitely. To terminate, the bound must be on the source itself: a `block_number` clause inside an EVM `filter:` (fast scan), or `end_block` on a Solana source.
* **`end_block` is Solana-only.** The top-level `end_block` field on EVM dataset sources is silently ignored. Express the upper bound inside the `filter:` expression.
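
The SQL-level filtering pitfall from the list above can be sketched as follows. The source and transform names and the block window are illustrative. Because the bound lives only in the transform, the source itself stays unbounded, so this configuration does not qualify for `job: true`.

```yaml
sources:
  base_logs:
    type: dataset
    dataset_name: base.raw_logs
    version: 1.0.0
    # No filter: on the source, so nothing tells it where to stop.

transforms:
  windowed_logs:
    type: sql
    primary_key: id
    # The WHERE clause drops rows outside the window,
    # but it does not make the source finite.
    sql: |
      SELECT *
      FROM base_logs
      WHERE block_number BETWEEN 19000000 AND 20000000
```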

To run a bounded EVM job, configure fast scan on the source and include a `block_number` upper bound in the filter:

```yaml theme={null}
sources:
  filtered_base_logs:
    type: dataset
    dataset_name: base.raw_logs
    version: 1.0.0
    start_at: earliest
    filter: address = '0x21552aeb494579c772a601f655e9b3c514fda960' AND block_number <= 25000000
```

If your use case can't be expressed with fast scan (e.g., you need the full, unfiltered dataset), run the pipeline as a streaming deployment and delete it once the data you need is ingested:

```bash theme={null}
goldsky turbo delete <pipeline-name>
```

## Monitoring job status

View job status with:

```bash theme={null}
# Check job status (shows JOB_MODE column)
goldsky turbo list

# View job logs (only available while the job is running or after it has Succeeded)
goldsky turbo logs <pipeline-name>
```

<Note>
  `--follow` is not useful for job logs: for a completed job the stream returns immediately, and for a failed job logs are unavailable because log requests are rejected once the status is `Failed`. Logs are also deleted when the job is auto-cleaned up 1 hour after termination, so retrieve any logs you need before that window closes.
</Note>

## Job vs deployment

| Feature                  | Job (`job: true`)                         | Deployment (`job: false`, default) |
| ------------------------ | ----------------------------------------- | ---------------------------------- |
| Duration                 | Runs to completion (bounded sources)      | Continuous processing              |
| Restart on failure       | No                                        | Yes (automatic)                    |
| Resource cleanup         | Auto-deleted 1 hour after termination     | Persists until deleted             |
| `goldsky turbo restart`  | Not supported                             | Supported                          |
| Use case                 | Solana backfills; EVM fast-scan backfills | Real-time streaming                |
| Switch to the other mode | Must delete first                         | Must delete first                  |

## Best practices

<AccordionGroup>
  <Accordion title="Bound the run so the pipeline terminates cleanly">
    For Solana sources, always define `start_block` and `end_block`:

    ```yaml theme={null}
    sources:
      solana_data:
        type: dataset
        dataset_name: solana.transactions
        version: 1.0.0
        start_block: 250000000
        end_block: 251000000
    ```

    For EVM sources, use fast scan (`filter` + `start_at: earliest`) and put the upper `block_number` bound inside the `filter:` expression. `end_block` is **not** supported on EVM dataset sources — it is silently ignored.

    ```yaml theme={null}
    sources:
      evm_data:
        type: dataset
        dataset_name: base.raw_logs
        version: 1.0.0
        start_at: earliest
        filter: address = '0x21552aeb494579c772a601f655e9b3c514fda960' AND block_number <= 25000000
    ```
  </Accordion>

  <Accordion title="Use appropriate resource sizes">
    For large backfills, use larger resource sizes to process data faster:

    ```yaml theme={null}
    resource_size: l  # Use large for big historical jobs
    ```
  </Accordion>

  <Accordion title="Monitor completion">
    Jobs auto-delete 1 hour after termination — fetch logs before the cleanup window closes:

    ```bash theme={null}
    goldsky turbo logs <pipeline-name>
    ```
  </Accordion>

  <Accordion title="Delete before redeployment">
    Always delete completed jobs before redeploying:

    ```bash theme={null}
    goldsky turbo delete <pipeline-name>
    goldsky turbo apply -f <pipeline-file>
    ```
  </Accordion>
</AccordionGroup>
