> ## Documentation Index
> Fetch the complete documentation index at: https://docs.goldsky.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Solana Sources

> Build real-time pipelines with Solana blockchain data

## Overview

Goldsky provides curated Solana datasets with full historical data, making it easy to build pipelines for blocks, transactions, instructions, and token activity. All datasets are pre-processed and optimized for common use cases.

<Note>
  **Turbo only**: Solana datasets are exclusively available on Turbo. They are not supported in Mirror v1 pipelines.
</Note>

<Note>
  **Solana configuration differs from EVM chains:**

  * Solana uses `start_block` (slot number) instead of `start_at`. Omit `start_block` to start from the latest slot.
  * Use `end_block` (not SQL `WHERE` clauses) to bound a Solana job mode pipeline.
  * `in_order` mode is available for Solana sources (not available for EVM)
  * Batch settings are not available for Solana sources
</Note>

## Quick Start

Get started with Solana data - choose the dataset that fits your use case:

**For token transfers:**

```yaml theme={null}
sources:
  solana_token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 312000000
```

**For transaction + instruction analysis:**

```yaml theme={null}
sources:
  tx_with_instructions:
    type: dataset
    dataset_name: solana.transactions_with_instructions
    version: 1.0.0
    start_block: 312000000
```

**Available datasets:**

* `solana.blocks` - Block data with leader info
* `solana.transactions` - Transaction data with accounts and balances
* `solana.transactions_with_instructions` - Transactions with nested instruction arrays
* `solana.instructions` - Individual instructions (one row per instruction)
* `solana.token_transfers` - SPL token transfers
* `solana.native_balances` - SOL balance changes
* `solana.token_balances` - SPL token balance changes
* `solana.rewards` - Records of rewards distributed to validators for securing and validating the Solana network.

## Starting position

Solana sources use `start_block` to specify a starting slot number. This differs from EVM chains which use `start_at: latest` or `start_at: earliest`.

```yaml theme={null}
sources:
  # Start from a specific slot
  solana_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 312000000

  # Start from the latest slot (omit start_block)
  solana_live:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    # No start_block = start from latest
```

<Info>
  To start from the latest slot on Solana, simply omit the `start_block` parameter. This is different from EVM chains where you would use `start_at: latest`.
</Info>

## Multiple block ranges

Solana sources accept an optional `block_ranges` field so a single pipeline can process several disjoint slot windows. This is useful for backfilling specific historical ranges without replaying everything in between, or for splitting a large backfill into several sharded pipelines.

**Syntax:** `block_ranges` takes a JSON-encoded string (not a YAML list) of `[start, end]` pairs. Both bounds are inclusive.

```yaml theme={null}
sources:
  targeted_backfill:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0
    block_ranges: "[[300000000, 300000099], [300100000, 300100099]]"
```

**Rules:**

* Ranges must be non-empty, non-overlapping, and strictly increasing. The engine panics at startup if these invariants are violated.
* A range's `start` must be at or after the network's earliest available block.
* `block_ranges` takes precedence over `start_block` / `end_block`. If you set both, the legacy fields are ignored and a warning is logged.
* After a checkpoint restore, the engine skips ahead to the next slot that falls inside any remaining range.

**Use with `job: true`** to run a bounded backfill that terminates cleanly once every range is processed — the pipeline exits after the epoch covering the final range's end slot finalizes:

```yaml theme={null}
name: solana-range-backfill
job: true
resource_size: m

sources:
  disjoint_ranges:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0
    block_ranges: "[[300000000, 300000099], [300100000, 300100099]]"

sinks:
  output:
    type: postgres
    from: disjoint_ranges
    schema: public
    table: solana_backfill
    secret_name: MY_POSTGRES
    primary_key: id
```

<Warning>
  Changing `block_ranges` on a running pipeline triggers a rewind, just like changing `start_block` or `end_block`. See [Source checkpoints and rewinds](#source-checkpoints-and-rewinds).
</Warning>

## Guide: Track Specific SPL Tokens

Monitor transfers for specific tokens like USDC:

```yaml theme={null}
name: solana-usdc-tracker
resource_size: s

sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 312000000

transforms:
  # Filter to USDC transfers only
  usdc_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        token_mint_address,
        from_token_account,
        to_token_account,
        amount,
        block_slot,
        block_timestamp,
        signature as transaction_signature,
        _gs_op
      FROM token_transfers
      WHERE token_mint_address = 'EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v'

sinks:
  postgres_usdc:
    type: postgres
    from: usdc_transfers
    schema: public
    table: solana_usdc_transfers
    secret_name: MY_POSTGRES
    primary_key: id
```

## Guide: Track Large SOL Balance Changes

Monitor accounts with significant SOL movement:

```yaml theme={null}
name: sol-whale-tracker
resource_size: s

sources:
  native_balances:
    type: dataset
    dataset_name: solana.native_balances
    version: 1.0.0
    start_block: 312000000

transforms:
  # Find large balance changes (>100 SOL)
  large_changes:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        account,
        amount_before,
        amount_after,
        amount_after - amount_before as net_change,
        block_slot,
        _gs_op
      FROM native_balances
      WHERE ABS(amount_after - amount_before) > 100

sinks:
  postgres_whales:
    type: postgres
    from: large_changes
    schema: public
    table: sol_whale_activity
    secret_name: MY_POSTGRES
    primary_key: id
```

## Guide: Decode Program Instructions

Decode Solana program instructions using IDL (Interface Definition Language):

```yaml theme={null}
name: raydium-swap-tracker
resource_size: m

sources:
  instructions:
    type: dataset
    dataset_name: solana.instructions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Decode Raydium instructions using IDL
  decoded_instructions:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        program_id,
        _gs_decode_instruction_data(
          _gs_fetch_abi('https://gist.githubusercontent.com/jeffling/a5fbae53f47570c0e66980f9229fc83d/raw/02f3bd30b742fb1b1af0fbb40897aeeb77c7b941/raydium-swap-idl.json', 'raw'),
          data
        ) as decoded,
        accounts,
        block_slot,
        block_timestamp,
        signature
      FROM instructions
      WHERE program_id = 'CPMMoo8L3F4NbTegBCKVNunggL7H1ZpdTHKxQB5qKP1C'

  # Extract decoded fields from the name and result JSON
  parsed_swaps:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        decoded.name as instruction_name,
        decoded.value as instruction_value,
        accounts[5] as token_in_account,
        accounts[6] as token_out_account
      FROM decoded_instructions
      where decoded.name = 'swap_base_input'
      or decoded.name = 'swap_base_output'

sinks:
  postgres_swaps:
    type: postgres
    from: parsed_swaps
    schema: public
    table: raydium_swaps
    secret_name: MY_POSTGRES
    primary_key: id
```

<Info>
  **Decoding functions:**

  * `_gs_decode_instruction_data(idl, data)` - Decode instruction data using an IDL
  * `_gs_decode_log_message(idl, log_messages)` - Decode program log messages using an IDL
  * `_gs_fetch_abi(url, 'raw')` - Fetch IDL from a URL

  The decoded result includes the instruction/event name and parameters.
</Info>

## Guide: Decode Program Log Messages

Decode Solana program log messages to extract structured event data. This is useful for tracking program events like swaps, liquidations, or other on-chain actions that emit logs.

```yaml theme={null}
name: drift-decoded-logs
resource_size: m

sources:
  transactions_with_instructions:
    type: dataset
    dataset_name: solana.transactions_with_instructions
    version: 1.0.0
    start_block: 380000000

transforms:
  # Decode Drift protocol log messages
  decoded_logs:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        block_slot,
        block_timestamp,
        signature,
        fee,
        log_messages,
        _gs_decode_log_message(
          _gs_fetch_abi('https://raw.githubusercontent.com/drift-labs/protocol-v2/master/sdk/src/idl/drift.json', 'raw'),
          log_messages
        ) as log_messages_decoded,
        _gs_op
      FROM transactions_with_instructions
      WHERE
        -- Filter to transactions with Drift program instructions
        array_length(
          array_filter(instructions, 'program_id', 'dRiftyHA39MWEi3m9aunc5MzRF1JYuBsbn6VPcn33UH')
        ) > 0
        AND array_length(log_messages) > 0
        AND status = 1

sinks:
  postgres_logs:
    type: postgres
    from: decoded_logs
    schema: public
    table: drift_decoded_logs
    secret_name: MY_POSTGRES
    primary_key: id
```

<Tip>
  Log message decoding works best with Anchor-based programs that emit structured events. The IDL must match the program version to decode correctly.
</Tip>

## Guide: Analyze Transaction Success Rates

Track transaction patterns and success rates using SQL:

```yaml theme={null}
name: transaction-analytics
resource_size: m

sources:
  transactions:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Categorize transactions by fee and success
  tx_analysis:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        signature,
        block_slot,
        block_timestamp,
        status = 1 as is_successful,
        fee,
        compute_units_consumed,
        CASE
          WHEN fee < 5000 THEN 'low'
          WHEN fee < 50000 THEN 'medium'
          ELSE 'high'
        END as fee_category,
        CAST(compute_units_consumed AS DOUBLE) / CAST(fee AS DOUBLE) as compute_efficiency,
        _gs_op
      FROM transactions

sinks:
  postgres_analytics:
    type: postgres
    from: tx_analysis
    schema: public
    table: solana_tx_analytics
    secret_name: MY_POSTGRES
    primary_key: id
```

## Guide: Track Specific Programs

Monitor all instructions for a specific program using SQL:

```yaml theme={null}
name: jupiter-tracker
resource_size: m

sources:
  instructions:
    type: dataset
    dataset_name: solana.instructions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Filter to Jupiter program instructions
  jupiter_instructions:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        program_id,
        data,
        accounts,
        block_slot,
        block_timestamp,
        signature,
        _gs_op
      FROM instructions
      WHERE program_id = 'JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4'

sinks:
  postgres_jupiter:
    type: postgres
    from: jupiter_instructions
    schema: public
    table: jupiter_instructions
    secret_name: MY_POSTGRES
    primary_key: id
```

## Guide: Multi-Account Monitoring with Dynamic Tables

Track transfers involving specific accounts:

```yaml theme={null}
name: account-tracker
resource_size: m

sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 312000000

transforms:
  # Dynamic table for tracked accounts
  tracked_accounts:
    type: dynamic_table
    backend_type: Postgres
    backend_entity_name: tracked_accounts
    secret_name: MY_POSTGRES

  # Filter to tracked accounts
  monitored_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        token_mint_address,
        from_token_account,
        to_token_account,
        amount,
        decimals,
        CASE
          WHEN dynamic_table_check('tracked_accounts', from_token_account) THEN 'outgoing'
          WHEN dynamic_table_check('tracked_accounts', to_token_account) THEN 'incoming'
        END as direction,
        block_slot,
        block_timestamp,
        _gs_op
      FROM token_transfers
      WHERE dynamic_table_check('tracked_accounts', from_token_account)
         OR dynamic_table_check('tracked_accounts', to_token_account)

sinks:
  postgres_monitored:
    type: postgres
    from: monitored_transfers
    schema: public
    table: monitored_transfers
    secret_name: MY_POSTGRES
    primary_key: id
```

**Add accounts to track:**

```sql theme={null}
-- Add Solana account to monitor
INSERT INTO turbo.tracked_accounts VALUES
  ('7xKXtg2CW87d97TXJSDpbD5jBkheTqA83TZRuJosgAsU');
```

## Guide: Working with Transactions and Instructions Together

The `transactions_with_instructions` dataset provides a transaction-centric view with all instructions nested in an array. This is ideal when you need both transaction-level data and instruction details without joining separate datasets.

### When to Use transactions\_with\_instructions

<CardGroup cols={2}>
  <Card title="Use transactions_with_instructions" icon="check">
    * Analyzing multi-instruction transactions - Counting instructions per
      transaction - Transaction-level aggregations with instruction filtering -
      Examining instruction sequences within transactions
  </Card>

  <Card title="Use separate datasets" icon="circle-xmark">
    * Simple instruction filtering by program - Individual instruction analysis
    * Better SQL performance for instruction-only queries - Joining instructions
      with other data
  </Card>
</CardGroup>

### Schema Overview

Each row represents one transaction with nested arrays:

**Transaction fields:** `id`, `index`, `block_slot`, `block_hash`, `block_timestamp`, `accounts`, `balance_changes`, `pre_token_balances`, `post_token_balances`, `recent_block_hash`, `signature`, `err`, `status`, `compute_units_consumed`, `fee`, `log_messages`

**Nested instruction array:** `instructions` - Array of instruction structs, each containing:

* `id`, `index`, `parent_index` (null for top-level, set for inner instructions), `signature`, `block_slot`, `block_timestamp`, `block_hash`, `tx_fee`, `tx_index`
* `program_id`, `data`, `accounts`
* `status`, `err`

You can see a sample of the transactions\_with\_instructions dataset [here](https://gist.githubusercontent.com/JavierTrujilloG/4cae29a7d993d9c0f476de2cb0cea437/raw/38c8268afd2a18d79d83deedc788d72f025cb3fe/transactions-with-instructions-sample)

### Example: Analyze Multi-Step Swap Transactions

Find transactions with multiple Jupiter swap instructions:

```yaml theme={null}
name: jupiter-multi-hop-swaps
resource_size: m

sources:
  tx_with_instructions:
    type: dataset
    dataset_name: solana.transactions_with_instructions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Filter to Jupiter transactions and count instructions
  jupiter_swaps:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        signature,
        block_slot,
        block_timestamp,
        fee,
        status,
        compute_units_consumed,
        -- Count Jupiter instructions in this transaction
        array_length(
          array_filter(instructions, 'program_id', 'JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4')
        ) as jupiter_instruction_count,
        -- Get all instructions for analysis
        instructions,
        _gs_op
      FROM tx_with_instructions
      WHERE array_length(
        array_filter(instructions, 'program_id', 'JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4')
      ) > 0

  # Filter to multi-hop swaps (more than 1 Jupiter instruction)
  multi_hop_swaps:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        signature,
        block_timestamp,
        jupiter_instruction_count,
        fee,
        compute_units_consumed,
        CAST(compute_units_consumed AS DOUBLE) / CAST(fee AS DOUBLE) as compute_efficiency,
        _gs_op
      FROM jupiter_swaps
      WHERE jupiter_instruction_count > 1

sinks:
  postgres_swaps:
    type: postgres
    from: multi_hop_swaps
    schema: public
    table: jupiter_multi_hop_swaps
    secret_name: MY_POSTGRES
    primary_key: signature
```

### Example: Examine Inner Instructions

Analyze transactions with inner instructions (Cross-Program Invocations):

```yaml theme={null}
name: inner-instruction-analyzer
resource_size: m

sources:
  tx_with_instructions:
    type: dataset
    dataset_name: solana.transactions_with_instructions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Identify transactions with inner instructions
  tx_with_cpi:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        signature,
        block_timestamp,
        -- Total instruction count
        array_length(instructions) as total_instructions,
        -- Count top-level instructions (parent_index is null)
        array_length(
          array_filter(instructions, 'parent_index', null)
        ) as top_level_count,
        -- Calculate inner instruction count
        array_length(instructions) -
        array_length(
          array_filter(instructions, 'parent_index', null)
        ) as inner_instruction_count,
        fee,
        status,
        _gs_op
      FROM tx_with_instructions
      WHERE array_length(instructions) >
            array_length(
              array_filter(instructions, 'parent_index', null)
            )

sinks:
  postgres_cpi:
    type: postgres
    from: tx_with_cpi
    schema: public
    table: transactions_with_cpi
    secret_name: MY_POSTGRES
    primary_key: id
```

### Example: Transaction Success Analysis by Program

Analyze transaction success rates grouped by the programs involved:

```yaml theme={null}
name: program-success-analysis
resource_size: m

sources:
  tx_with_instructions:
    type: dataset
    dataset_name: solana.transactions_with_instructions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Extract program involvement
  program_transactions:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        signature,
        block_timestamp,
        status = 1 as is_successful,
        -- Get first instruction's program (typically the main program)
        instructions[1]['program_id'] as primary_program,
        -- Count unique programs in transaction
        array_length(instructions) as instruction_count,
        fee,
        compute_units_consumed,
        _gs_op
      FROM tx_with_instructions
      WHERE array_length(instructions) > 0

sinks:
  postgres_analysis:
    type: postgres
    from: program_transactions
    schema: public
    table: program_transaction_analysis
    secret_name: MY_POSTGRES
    primary_key: id
```

### Working with Nested Instructions

To access individual instructions from the `instructions` array:

**Array indexing (1-based):**

```sql theme={null}
instructions[1]           -- First instruction
instructions[1].program_id -- First instruction's program
instructions[1].data      -- First instruction's data
```

**Filtering arrays:**

```sql theme={null}
-- Get only Jupiter instructions
array_filter(instructions, 'program_id', 'JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4')

-- Get only top-level instructions
array_filter(instructions, 'parent_index', null)

-- Get first token program instruction
array_filter_first(instructions, 'program_id', 'TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA')
```

**Counting:**

```sql theme={null}
-- Total instructions in transaction
array_length(instructions)

-- Instructions for specific program
array_length(array_filter(instructions, 'program_id', 'JUP6...'))
```

<Tip>
  Use `array_filter()` and `array_filter_first()` SQL functions to work
  efficiently with the nested instruction arrays. See the [SQL Functions
  Reference](/turbo-pipelines/reference/sql-functions#array-processing-functions)
  for more details.
</Tip>

## Guide: Recreating Solana's Transaction Counter

Stream every Solana transaction in real-time to power a live counter, similar to the `Total transactions to date` counter on [Solana's homepage](https://solana.com/). This example includes any and all transactions, including failed ones.

```yaml theme={null}
name: solana-transaction-counter
resource_size: m

sources:
  transactions:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0

transforms:
  # Filter transactions to reduce the payload
  minimal_txns:
    type: sql
    primary_key: signature
    sql: |
      SELECT
        signature,
        block_slot,
        block_timestamp
      FROM transactions

sinks:
  postgres_transactions: 
    type: postgres
    from: minimal_txns
    schema: public
    table: solana_transactions_counter
    secret_name: MY_POSTGRES
    primary_key: signature
```

## Source checkpoints and rewinds

Turbo pipelines use checkpoints to track processing progress. Understanding how checkpoints work helps you avoid unintended rewinds when updating pipeline configurations.

### How checkpoints work

Checkpoints use a hash of your source configuration (`start_block`, `end_block`, and `block_ranges`) to identify the user's intent:

* **Same hash**: The pipeline resumes from the existing checkpoint
* **Different hash**: The pipeline treats this as a "source change" and rewinds to the new `start_block`

### Preventing unintended rewinds

A common issue occurs when re-applying a pipeline with a different `start_block` than the currently running configuration. For example:

```yaml theme={null}
# Original configuration (pipeline already running at slot 315000000)
sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 312000000  # Original start block
```

If you re-apply with an earlier `start_block`, the pipeline will rewind:

```yaml theme={null}
# Re-applied with earlier start block - triggers rewind!
sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 310000000  # Earlier block triggers source change
```

<Warning>
  Changing `start_block`, `end_block`, or `block_ranges` triggers a deliberate rewind. The pipeline will restart from the new `start_block`, reprocessing all data.
</Warning>

### Best practices

<Steps>
  <Step title="Fetch the current configuration before re-applying">
    Always check the current pipeline definition before making changes:

    **Using the CLI:**

    ```bash theme={null}
    goldsky turbo get <pipeline-name> --output yaml
    ```

    **Using the UI:**
    Navigate to your pipeline in the dashboard to view the current configuration.
  </Step>

  <Step title="Keep start_block consistent">
    When updating other pipeline settings (transforms, sinks, resource size), keep the `start_block` the same as the running configuration to preserve your checkpoint.
  </Step>
</Steps>

<Info>
  If you intentionally want to reprocess data from a specific block, changing the `start_block` is the correct approach. The rewind behavior is by design to give you control over reprocessing.
</Info>

## Filtering by Account or Program

When you only need data for specific accounts or programs, add a `filter` to your source configuration. This enables **fast scan mode**, which skips irrelevant slots during backfills by querying an index of which accounts and programs are active in each slot. This can dramatically speed up historical data processing.

```yaml theme={null}
sources:
  token_balances:
    type: dataset
    dataset_name: solana.token_balances
    version: 1.0.0
    start_block: 312000000
    filter:
      account_ids: "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v"
```

### Filter syntax

The `filter` parameter is a YAML mapping with two optional fields:

| Field         | Description                                            |
| ------------- | ------------------------------------------------------ |
| `account_ids` | Comma-separated list of account addresses to filter by |
| `program_ids` | Comma-separated list of program addresses to filter by |

You can specify one or both fields. When both are provided, slots containing activity from **either** the specified accounts **or** the specified programs are processed (the conditions are OR'd together).

```yaml theme={null}
# Filter by accounts only
filter:
  account_ids: "addr1,addr2,addr3"

# Filter by program only
filter:
  program_ids: "progAddr"

# Filter by both accounts and programs
filter:
  account_ids: "addr1,addr2"
  program_ids: "progAddr1,progAddr2"
```

### Example: Backfill token balances for specific accounts

```yaml theme={null}
name: account-balance-backfill
resource_size: m

sources:
  token_balances:
    type: dataset
    dataset_name: solana.token_balances
    version: 1.0.0
    start_block: 312000000
    filter:
      account_ids: "5iEFdmkmdEbzfNmBdRioRUQJiPVNwsJaX4Nmh4HBZH9d,px1rbjiEWwwcq1epXSsTMJERyQy7h4vg4VopqFz2HwH"

transforms: {}

sinks:
  postgres_balances:
    type: postgres
    from: token_balances
    schema: public
    table: tracked_token_balances
    secret_name: MY_POSTGRES
    primary_key: id
```

### Example: Backfill transactions for a specific program

```yaml theme={null}
name: program-transaction-backfill
resource_size: m

sources:
  transactions:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0
    start_block: 312000000
    filter:
      program_ids: "6EF8rrecthR5Dkzon8Nwu78hRvfCKubJ14M5uBEwF6P"

transforms: {}

sinks:
  postgres_txns:
    type: postgres
    from: transactions
    schema: public
    table: program_transactions
    secret_name: MY_POSTGRES
    primary_key: id
```

<Info>
  The `filter` parameter speeds up **backfills only** — when processing historical data from a `start_block`. During real-time processing (when the pipeline has caught up to the chain tip), all slots are processed regardless of the filter.
</Info>

## Performance Tips

<AccordionGroup>
  <Accordion title="Choose the right dataset">
    Use the most specific dataset:

    * Token transfers? Use `solana.token_transfers`
    * Balance changes? Use `solana.native_balances` or `solana.token_balances`
    * Transaction-level analysis with instruction details? Use `solana.transactions_with_instructions`
    * Individual instruction analysis? Use `solana.instructions`
    * Transaction metadata only? Use `solana.transactions`
  </Accordion>

  <Accordion title="Use source filters for backfills">
    If you only need data for specific accounts or programs, use the [`filter` parameter](#filtering-by-account-or-program) on your source to skip irrelevant slots during backfills:

    ```yaml theme={null}
    sources:
      instructions:
        type: dataset
        dataset_name: solana.instructions
        version: 1.0.0
        start_block: 312000000
        filter:
          program_ids: "JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4"
    ```

    This is much faster than filtering in SQL transforms alone, as it skips entire slots that don't contain relevant data.
  </Accordion>

  <Accordion title="Filter early in SQL transforms">
    Apply filters in SQL as early as possible:

    ```yaml theme={null}
    transforms:
      # SQL filter first
      filtered:
        type: sql
        sql: SELECT * FROM instructions WHERE program_id = '...'

      # Then decode
      decoded:
        type: sql
        from: filtered
        sql: |
          SELECT
            *,
            _gs_decode_instruction_data(
              _gs_fetch_abi('https://api.example.com/idl.json', 'raw'),
              instruction_data
            ) as decoded
          FROM filtered
    ```
  </Accordion>

  <Accordion title="Use appropriate resource sizes">
    Solana has high throughput. Start with medium or large:

    ```yaml theme={null}
    resource_size: m  # or l for high-volume
    ```
  </Accordion>
</AccordionGroup>
