How to build real-time Solana data pipelines for tokens, transactions, and instructions

Prerequisites: Before starting, install the Turbo CLI extension and authenticate with goldsky login.

Solana: Solana datasets are exclusively available for Turbo pipelines and not available for Standard pipelines (Mirror v1 pipelines).

Overview

Goldsky provides curated Solana datasets with full historical data, making it easy to build pipelines for blocks, transactions, instructions, and token activity. All datasets are pre-processed and optimized for common use cases.

Quick Start

Get started with Solana data - choose the dataset that fits your use case: For token transfers:

sources:
  solana_token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 312000000

For transaction + instruction analysis:

sources:
  tx_with_instructions:
    type: dataset
    dataset_name: solana.transactions_with_instructions
    version: 1.0.0
    start_block: 312000000

Available datasets:

solana.blocks - Block data with leader info
solana.transactions - Transaction data with accounts and balances
solana.transactions_with_instructions - Transactions with nested instruction arrays
solana.instructions - Individual instructions (one row per instruction)
solana.token_transfers - SPL token transfers
solana.native_balances - SOL balance changes
solana.token_balances - SPL token balance changes

Guide: Track Specific SPL Tokens

Monitor transfers for specific tokens like USDC:

name: solana-usdc-tracker
resource_size: s

sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 312000000

transforms:
  # Filter to USDC transfers only
  usdc_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        token_mint_address,
        from_token_account,
        to_token_account,
        amount,
        block_slot,
        block_timestamp,
        signature as transaction_signature,
        _gs_op
      FROM token_transfers
      WHERE token_mint_address = 'EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v'

sinks:
  postgres_usdc:
    type: postgres
    from: usdc_transfers
    schema: public
    table: solana_usdc_transfers
    secret_name: MY_POSTGRES
    primary_key: id

Guide: Track Large SOL Balance Changes

Monitor accounts with significant SOL movement:

name: sol-whale-tracker
resource_size: s

sources:
  balance_changes:
    type: dataset
    dataset_name: solana.native_balances
    version: 1.0.0
    start_block: 312000000

transforms:
  # Find large balance changes (>100 SOL)
  large_changes:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        account,
        amount_before,
        amount_after,
        amount_after - amount_before as net_change,
        block_slot,
        _gs_op
      FROM balance_changes
      WHERE ABS(amount_after - amount_before) > 100

sinks:
  postgres_whales:
    type: postgres
    from: large_changes
    schema: public
    table: sol_whale_activity
    secret_name: MY_POSTGRES
    primary_key: id

Guide: Decode Program Instructions

Decode Solana program instructions using IDL (Interface Definition Language):

name: raydium-swap-tracker
resource_size: m

sources:
  instructions:
    type: dataset
    dataset_name: solana.instructions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Decode Raydium instructions using IDL
  decoded_instructions:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        program_id,
        _gs_decode_instruction_data(
          _gs_fetch_abi('https://gist.githubusercontent.com/jeffling/a5fbae53f47570c0e66980f9229fc83d/raw/02f3bd30b742fb1b1af0fbb40897aeeb77c7b941/raydium-swap-idl.json', 'raw'),
          data
        ) as decoded,
        accounts,
        block_slot,
        block_timestamp,
        signature
      FROM instructions
      WHERE program_id = 'CPMMoo8L3F4NbTegBCKVNunggL7H1ZpdTHKxQB5qKP1C'

  # Extract decoded fields from the name and result JSON
  parsed_swaps:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        decoded.name as instruction_name,
        decoded.value as instruction_value,
        accounts[5] as token_in_account,
        accounts[6] as token_out_account
      FROM decoded_instructions
      where decoded.name = 'swap_base_input'
      or decoded.name = 'swap_base_output'

sinks:
  postgres_swaps:
    type: postgres
    from: parsed_swaps
    schema: public
    table: raydium_swaps
    secret_name: MY_POSTGRES
    primary_key: id

Decoding functions:

_gs_decode_instruction_data(idl, data) - Decode instruction data using an IDL
_gs_fetch_abi(url, 'raw') - Fetch IDL from a URL

The decoded result includes the instruction name and parameters.

Guide: Analyze Transaction Success Rates

Track transaction patterns and success rates using SQL:

name: transaction-analytics
resource_size: m

sources:
  transactions:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Categorize transactions by fee and success
  tx_analysis:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        signature,
        block_slot,
        block_timestamp,
        status = 1 as is_successful,
        fee,
        compute_units_consumed,
        CASE
          WHEN fee < 5000 THEN 'low'
          WHEN fee < 50000 THEN 'medium'
          ELSE 'high'
        END as fee_category,
        CAST(compute_units_consumed AS DOUBLE) / CAST(fee AS DOUBLE) as compute_efficiency,
        _gs_op
      FROM transactions

sinks:
  postgres_analytics:
    type: postgres
    from: tx_analysis
    schema: public
    table: solana_tx_analytics
    secret_name: MY_POSTGRES
    primary_key: id

Guide: Track Specific Programs

Monitor all instructions for a specific program using SQL:

name: jupiter-tracker
resource_size: m

sources:
  instructions:
    type: dataset
    dataset_name: solana.instructions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Filter to Jupiter program instructions
  jupiter_instructions:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        program_id,
        data,
        accounts,
        block_slot,
        block_timestamp,
        signature,
        _gs_op
      FROM instructions
      WHERE program_id = 'JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4'

sinks:
  postgres_jupiter:
    type: postgres
    from: jupiter_instructions
    schema: public
    table: jupiter_instructions
    secret_name: MY_POSTGRES
    primary_key: id

Guide: Multi-Account Monitoring with Dynamic Tables

Track transfers involving specific accounts:

name: account-tracker
resource_size: m

sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 312000000

transforms:
  # Dynamic table for tracked accounts
  tracked_accounts:
    type: dynamic_table
    backend_type: Postgres
    backend_entity_name: tracked_accounts
    secret_name: MY_POSTGRES

  # Filter to tracked accounts
  monitored_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        token_mint_address,
        from_token_account,
        to_token_account,
        amount,
        decimals,
        CASE
          WHEN dynamic_table_check('tracked_accounts', from_token_account) THEN 'outgoing'
          WHEN dynamic_table_check('tracked_accounts', to_token_account) THEN 'incoming'
        END as direction,
        block_slot,
        block_timestamp,
        _gs_op
      FROM token_transfers
      WHERE dynamic_table_check('tracked_accounts', from_token_account)
         OR dynamic_table_check('tracked_accounts', to_token_account)

sinks:
  postgres_monitored:
    type: postgres
    from: monitored_transfers
    schema: public
    table: monitored_transfers
    secret_name: MY_POSTGRES
    primary_key: id

Add accounts to track:

-- Add Solana account to monitor
INSERT INTO turbo.tracked_accounts VALUES
  ('7xKXtg2CW87d97TXJSDpbD5jBkheTqA83TZRuJosgAsU');

Guide: Working with Transactions and Instructions Together

The transactions_with_instructions dataset provides a transaction-centric view with all instructions nested in an array. This is ideal when you need both transaction-level data and instruction details without joining separate datasets.

When to Use transactions_with_instructions

Use transactions_with_instructions

Analyzing multi-instruction transactions - Counting instructions per transaction - Transaction-level aggregations with instruction filtering - Examining instruction sequences within transactions

Use separate datasets

Simple instruction filtering by program - Individual instruction analysis
Better SQL performance for instruction-only queries - Joining instructions with other data

Schema Overview

Each row represents one transaction with nested arrays: Transaction fields: id, index , block_slot, block_hash ,block_timestamp, accounts, balance_changes, pre_token_balances, post_token_balances, recent_block_hash, signature, err, status, compute_units_consumed ,fee, log_messages Nested instruction array: instructions - Array of instruction structs, each containing:

id, index, parent_index (null for top-level, set for inner instructions), block_slot, block_timestamp,block_hash , tx_fee, tx_index
program_id, data, accounts
status, err

You can see a sample of the transactions_with_instructions dataset here

Example: Analyze Multi-Step Swap Transactions

Find transactions with multiple Jupiter swap instructions:

name: jupiter-multi-hop-swaps
resource_size: m

sources:
  tx_with_instructions:
    type: dataset
    dataset_name: solana.transactions_with_instructions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Filter to Jupiter transactions and count instructions
  jupiter_swaps:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        signature,
        block_slot,
        block_timestamp,
        fee,
        status,
        compute_units_consumed,
        -- Count Jupiter instructions in this transaction
        array_length(
          array_filter(instructions, 'program_id', 'JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4')
        ) as jupiter_instruction_count,
        -- Get all instructions for analysis
        instructions,
        _gs_op
      FROM tx_with_instructions
      WHERE array_length(
        array_filter(instructions, 'program_id', 'JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4')
      ) > 0

  # Filter to multi-hop swaps (more than 1 Jupiter instruction)
  multi_hop_swaps:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        signature,
        block_timestamp,
        jupiter_instruction_count,
        fee,
        compute_units_consumed,
        CAST(compute_units_consumed AS DOUBLE) / CAST(fee AS DOUBLE) as compute_efficiency,
        _gs_op
      FROM jupiter_swaps
      WHERE jupiter_instruction_count > 1

sinks:
  postgres_swaps:
    type: postgres
    from: multi_hop_swaps
    schema: public
    table: jupiter_multi_hop_swaps
    secret_name: MY_POSTGRES
    primary_key: signature

Example: Examine Inner Instructions

Analyze transactions with inner instructions (Cross-Program Invocations):

name: inner-instruction-analyzer
resource_size: m

sources:
  tx_with_instructions:
    type: dataset
    dataset_name: solana.transactions_with_instructions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Identify transactions with inner instructions
  tx_with_cpi:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        signature,
        block_timestamp,
        -- Total instruction count
        array_length(instructions) as total_instructions,
        -- Count top-level instructions (parent_index is null)
        array_length(
          array_filter(instructions, 'parent_index', null)
        ) as top_level_count,
        -- Calculate inner instruction count
        array_length(instructions) -
        array_length(
          array_filter(instructions, 'parent_index', null)
        ) as inner_instruction_count,
        fee,
        status,
        _gs_op
      FROM tx_with_instructions
      WHERE array_length(instructions) >
            array_length(
              array_filter(instructions, 'parent_index', null)
            )

sinks:
  postgres_cpi:
    type: postgres
    from: tx_with_cpi
    schema: public
    table: transactions_with_cpi
    secret_name: MY_POSTGRES
    primary_key: id

Example: Transaction Success Analysis by Program

Analyze transaction success rates grouped by the programs involved:

name: program-success-analysis
resource_size: m

sources:
  tx_with_instructions:
    type: dataset
    dataset_name: solana.transactions_with_instructions
    version: 1.0.0
    start_block: 312000000

transforms:
  # Extract program involvement
  program_transactions:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        signature,
        block_timestamp,
        status = '1' as is_successful,
        -- Get first instruction's program (typically the main program)
        instructions[1]['program_id'] as primary_program,
        -- Count unique programs in transaction
        array_length(instructions) as instruction_count,
        fee,
        compute_units_consumed,
        _gs_op
      FROM tx_with_instructions
      WHERE array_length(instructions) > 0

sinks:
  postgres_analysis:
    type: postgres
    from: program_transactions
    schema: public
    table: program_transaction_analysis
    secret_name: MY_POSTGRES
    primary_key: id

Working with Nested Instructions

To access individual instructions from the instructions array: Array indexing (1-based):

instructions[1]           -- First instruction
instructions[1].program_id -- First instruction's program
instructions[1].data      -- First instruction's data

Filtering arrays:

-- Get only Jupiter instructions
array_filter(instructions, 'program_id', 'JUP6LkbZbjS1jKKwapdHNy74zcZ3tLUZoi5QNyVTaV4')

-- Get only top-level instructions
array_filter(instructions, 'parent_index', null)

-- Get first token program instruction
array_filter_first(instructions, 'program_id', 'TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA')

Counting:

-- Total instructions in transaction
array_length(instructions)

-- Instructions for specific program
array_length(array_filter(instructions, 'program_id', 'JUP6...'))

Use array_filter() and array_filter_first() SQL functions to work efficiently with the nested instruction arrays. See the SQL Functions Reference for more details.

Guide: Recreating Solana’s Transaction Counter

Stream every Solana transaction in real-time to power a live counter, similar to the Total transactions to date counter on Solana’s homepage. This example includes any and all transactions, including failed ones.

name: solana-transaction-counter
resource_size: m

sources:
  transactions:
    type: dataset
    dataset_name: solana.transactions
    version: 1.0.0

transforms:
  # Filter transactions to reduce the payload
  minimal_txns:
    type: sql
    primary_key: signature
    sql: |
      SELECT
        signature,
        block_slot,
        block_timestamp
      FROM transactions

sinks:
  postgres_transactions: 
    type: postgres
    from: minimal_txns
    schema: public
    table: solana_transactions_counter
    secret_name: MY_POSTGRES
    primary_key: signature

Source checkpoints and rewinds

Turbo pipelines use checkpoints to track processing progress. Understanding how checkpoints work helps you avoid unintended rewinds when updating pipeline configurations.

How checkpoints work

Checkpoints use a hash of your source configuration (start_block, end_block, and block_range) to identify the user’s intent:

Same hash: The pipeline resumes from the existing checkpoint
Different hash: The pipeline treats this as a “source change” and rewinds to the new start_block

Preventing unintended rewinds

A common issue occurs when re-applying a pipeline with a different start_block than the currently running configuration. For example:

# Original configuration (pipeline already running at slot 315000000)
sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 312000000  # Original start block

If you re-apply with an earlier start_block, the pipeline will rewind:

# Re-applied with earlier start block - triggers rewind!
sources:
  token_transfers:
    type: dataset
    dataset_name: solana.token_transfers
    version: 1.0.0
    start_block: 310000000  # Earlier block triggers source change

Changing start_block, end_block, or block_range triggers a deliberate rewind. The pipeline will restart from the new start_block, reprocessing all data.

Best practices

Fetch the current configuration before re-applying

Always check the current pipeline definition before making changes:Using the CLI:

goldsky turbo get <pipeline-name> --output yaml

Using the UI: Navigate to your pipeline in the dashboard to view the current configuration.

Keep start_block consistent

When updating other pipeline settings (transforms, sinks, resource size), keep the start_block the same as the running configuration to preserve your checkpoint.

If you intentionally want to reprocess data from a specific block, changing the start_block is the correct approach. The rewind behavior is by design to give you control over reprocessing.

Performance Tips

Choose the right dataset

Use the most specific dataset:

Token transfers? Use solana.token_transfers
Balance changes? Use solana.native_balances or solana.token_balances
Transaction-level analysis with instruction details? Use solana.transactions_with_instructions
Individual instruction analysis? Use solana.instructions
Transaction metadata only? Use solana.transactions

Filter early

Apply filters in SQL as early as possible:

transforms:
  # SQL filter first
  filtered:
    type: sql
    sql: SELECT * FROM instructions WHERE program_id = '...'

  # Then decode
  decoded:
    type: sql
    from: filtered
    sql: |
      SELECT
        *,
        _gs_decode_instruction_data(
          _gs_fetch_abi('https://api.example.com/idl.json', 'raw'),
          instruction_data
        ) as decoded
      FROM filtered

Use appropriate resource sizes

Solana has high throughput. Start with medium or large:

resource_size: m  # or l for high-volume

Next Steps

SQL Transforms

Filter and transform Solana data

Dynamic Tables

Track accounts and tokens dynamically

Sinks

Write processed data to destinations

EVM Guide

Learn about EVM blockchain sources

​Overview

​Quick Start

​Guide: Track Specific SPL Tokens

​Guide: Track Large SOL Balance Changes

​Guide: Decode Program Instructions

​Guide: Analyze Transaction Success Rates

​Guide: Track Specific Programs

​Guide: Multi-Account Monitoring with Dynamic Tables

​Guide: Working with Transactions and Instructions Together

​When to Use transactions_with_instructions

Use transactions_with_instructions

Use separate datasets

​Schema Overview

​Example: Analyze Multi-Step Swap Transactions

​Example: Examine Inner Instructions

​Example: Transaction Success Analysis by Program

​Working with Nested Instructions

​Guide: Recreating Solana’s Transaction Counter

​Source checkpoints and rewinds

​How checkpoints work

​Preventing unintended rewinds

​Best practices

​Performance Tips

​Next Steps

SQL Transforms

Dynamic Tables

Sinks

EVM Guide

Overview

Quick Start

Guide: Track Specific SPL Tokens

Guide: Track Large SOL Balance Changes

Guide: Decode Program Instructions

Guide: Analyze Transaction Success Rates

Guide: Track Specific Programs

Guide: Multi-Account Monitoring with Dynamic Tables

Guide: Working with Transactions and Instructions Together

When to Use transactions_with_instructions

Schema Overview

Example: Analyze Multi-Step Swap Transactions

Example: Examine Inner Instructions

Example: Transaction Success Analysis by Program

Working with Nested Instructions

Guide: Recreating Solana’s Transaction Counter

Source checkpoints and rewinds

How checkpoints work

Preventing unintended rewinds

Best practices

Performance Tips

Next Steps