Overview

In this quickstart, you’ll create a simple Turbo pipeline that:
  1. Reads ERC-20 transfer data from a Goldsky dataset
  2. Writes the results to a blackhole sink
  3. Inspects the data live
  4. [Optional] Writes the data into a PostgreSQL database
You have two options to create a Turbo pipeline:
  1. Goldsky Flow: A guided visual canvas editor in the dashboard
  2. CLI: Using YAML configuration files

Goldsky Flow

Flow allows you to deploy Turbo pipelines by dragging and dropping components onto a visual canvas. Open Flow by going to the Pipelines page and clicking New pipeline.
Turbo pipelines in Flow work similarly to Mirror pipelines but with some differences. See Turbo vs Mirror in Flow for details.

1. Select a data source

Drag a Data Source card onto the canvas. Select the chain and dataset you want to use. For blockchain data, Turbo supports:
  • EVM chains: Ethereum, Base, Polygon, and more
  • Solana: Transactions, instructions, and token transfers
  • Stellar: Ledgers, transactions, and operations
  • Bitcoin: Blocks and transactions
Select your chain and dataset type (e.g., ERC-20 Transfers for Base).

2. Add transforms (optional)

Click the + button on your source card to add transforms:
  • SQL Transform: Filter and project data using SQL queries
  • Script Transform: Execute custom TypeScript code for complex transformations
  • Dynamic Table: Create real-time lookup tables for filtering and enrichment
Dynamic Table and Script transforms are unique to Turbo pipelines and provide powerful capabilities not available in Mirror.

3. Select a sink

Click the + button to add a sink. Turbo supports:
  • PostgreSQL
  • ClickHouse
  • Kafka
  • Webhook
  • S3
  • Blackhole (for testing)

4. Deploy

Name your pipeline and click Deploy. Select a resource size and your pipeline will start processing data.

Switching between Flow and YAML

You can toggle between the visual canvas and YAML view using the switcher in the top-left corner. This lets you:
  • See the YAML configuration generated from your visual design
  • Copy the YAML for version control or CI/CD deployment
  • Make advanced edits directly in YAML

Turbo vs Mirror in Flow

When creating Turbo pipelines in Flow, note these differences from Mirror:
Feature | Turbo | Mirror
Script transforms | Yes (TypeScript) | No
Dynamic Table transforms | Yes | No
Kafka source | YAML only | Supported
ClickHouse source | YAML only | Supported
Hybrid source | YAML only | Supported
Live inspect | Yes | No
Some source types (Kafka, ClickHouse, and Hybrid) are only available via YAML configuration. Use the CLI workflow for these advanced sources.

Creating Turbo pipelines with the CLI

Prerequisites

Before you begin, make sure the Goldsky CLI is installed and that you're logged in with goldsky login.

Step 1: Create your pipeline

Create a file named erc20-pipeline.yaml:
erc20-pipeline.yaml
name: erc20-transfers
resource_size: s

sources:
  base_erc20_transfers:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.2.0
    start_at: latest

transforms: {}

sinks:
  blackhole_sink:
    type: blackhole
    from: base_erc20_transfers
Sources:
  • We’re using the base.erc20_transfers dataset (version 1.2.0)
  • start_at: latest means we’ll only process new transfers going forward
Sinks:
  • The blackhole sink discards the data but allows you to test the pipeline
  • Perfect for development and testing before adding real outputs

Step 2: Deploy Your Pipeline

Apply your pipeline configuration:
goldsky turbo apply erc20-pipeline.yaml
You should see output confirming the deployment:
✓ Pipeline 'erc20-transfers' applied successfully

Step 3: Inspect the Data Live

Now that the pipeline is running, you can pass the pipeline name erc20-transfers instead of erc20-pipeline.yaml to any of the commands below.
Open the live inspect TUI to watch data flow through your pipeline:
goldsky turbo inspect erc20-pipeline.yaml
Use the logs command to stream the pipeline’s runtime output:
goldsky turbo logs erc20-pipeline.yaml
You should see output showing the pipeline processing ERC-20 transfers in real time.

Step 4: Add a Filter for USDC Only

Now let’s add a SQL transform to filter only USDC transfers. Update your erc20-pipeline.yaml:
erc20-pipeline.yaml
name: erc20-transfers
resource_size: s

sources:
  base_erc20_transfers:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.2.0
    start_at: latest

transforms:
  # Filter to only USDC transfers
  usdc_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        sender,
        recipient,
        amount,
        to_timestamp(block_timestamp) as block_time
      FROM base_erc20_transfers
      WHERE address = lower('0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913')

sinks:
  blackhole_sink:
    type: blackhole
    from: usdc_transfers
Transforms:
  • Added a SQL transform named usdc_transfers that filters for the USDC contract on Base
  • The contract address 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913 is USDC on Base
  • We use lower() to ensure consistent case-insensitive matching
  • Selected only essential columns: sender, recipient, amount, and block_time
  • Converted the Unix timestamp to a human-readable format using to_timestamp()
Sinks:
  • Updated the sink to read from usdc_transfers instead of the raw source
  • Now only USDC transfers will flow through the pipeline
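For intuition, the to_timestamp() call in the SQL above performs the same conversion as this small TypeScript helper (illustrative only; toBlockTime is not part of the pipeline API):

```typescript
// Illustrative helper only: mirrors what to_timestamp() does in the
// SQL transform above, turning a Unix timestamp in seconds into a
// human-readable ISO-8601 string.
function toBlockTime(blockTimestamp: number): string {
  return new Date(blockTimestamp * 1000).toISOString();
}

// For example, the Unix epoch itself:
// toBlockTime(0) === "1970-01-01T00:00:00.000Z"
```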
You can achieve the same filtering using a TypeScript transform instead of SQL. Return null to filter out records:
name: erc20-transfers
resource_size: s

sources:
  base_erc20_transfers:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.2.0
    start_at: latest

transforms:
  usdc_transfers:
    type: script
    primary_key: id
    language: typescript
    from: base_erc20_transfers
    schema:
      id: string
      sender: string
      recipient: string
      amount: string
      block_time: string
    script: |
      function invoke(data) {
        const USDC_ADDRESS = '0x833589fcd6edb6e08f4c7c32d4f71b54bda02913';

        // Return null to filter out non-USDC transfers
        if (!data?.address || data.address.toLowerCase() !== USDC_ADDRESS) {
          return null;
        }

        // Return a custom schema with only the fields we need
        return {
          id: data.id,
          sender: data.sender,
          recipient: data.recipient,
          amount: data.amount,
          block_time: new Date(data.block_timestamp * 1000).toISOString()
        };
      }

sinks:
  blackhole_sink:
    type: blackhole
    from: usdc_transfers
Key features:
  • Return null to filter: Records that don’t match your criteria can be filtered out by returning null
  • Custom output schema: Use the schema field to define a different output schema than the input. This lets you reshape data, rename fields, or include only specific columns
  • Flexible transformations: Perform calculations, string manipulation, and conditional logic
TypeScript Benefits:
  • More flexible data transformations and complex logic
  • Familiar syntax for developers
  • Type safety and autocompletion support
  • Can perform calculations, string manipulation, and conditional logic
SQL Benefits:
  • Generally faster for simple filtering and aggregations
  • More concise for straightforward queries
  • Better for set-based operations
Choose TypeScript when you need complex transformations or custom logic. Choose SQL for simple filters and aggregations.
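To make the comparison concrete, here is the script transform's filter logic as a standalone TypeScript sketch you can run outside the pipeline. The TransferRecord interface is an assumption for illustration, modeled on the base.erc20_transfers fields used earlier; in a real pipeline, Turbo calls invoke for each incoming row.

```typescript
// Standalone sketch of the "return null to filter" pattern from the
// script transform above. Not wired to a pipeline; the record shape
// is assumed from the fields referenced earlier in this guide.
interface TransferRecord {
  id: string;
  address: string;
  sender: string;
  recipient: string;
  amount: string;
  block_timestamp: number; // Unix timestamp in seconds
}

const USDC_ADDRESS = '0x833589fcd6edb6e08f4c7c32d4f71b54bda02913';

function invoke(data: TransferRecord | null) {
  // Drop anything that is not a USDC transfer.
  if (!data?.address || data.address.toLowerCase() !== USDC_ADDRESS) {
    return null;
  }
  // Reshape the output: keep only the fields we need and convert the
  // Unix timestamp to an ISO-8601 string.
  return {
    id: data.id,
    sender: data.sender,
    recipient: data.recipient,
    amount: data.amount,
    block_time: new Date(data.block_timestamp * 1000).toISOString(),
  };
}
```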

Redeploy and Inspect

Apply the updated configuration:
goldsky turbo apply erc20-pipeline.yaml
Now inspect the output of your new usdc_transfers transform.
# the -n flag lets you target one or more sources or transforms
goldsky turbo inspect erc20-pipeline.yaml -n usdc_transfers
You should now see only USDC transfers! Compare this to the original source data:
# View the filtered USDC data
goldsky turbo inspect erc20-pipeline.yaml -n usdc_transfers

# View all ERC-20 transfers (before filtering)
goldsky turbo inspect erc20-pipeline.yaml -n base_erc20_transfers

# View multiple stages at once
goldsky turbo inspect erc20-pipeline.yaml -n base_erc20_transfers,usdc_transfers
The -n, --topology-node-keys parameter lets you inspect data at any point in your pipeline. Use transform names to see filtered/transformed data, or source names to see raw data. You can specify multiple keys separated by commas to view multiple stages simultaneously.

Optional: Write to PostgreSQL

To persist your data to a PostgreSQL database, update your pipeline configuration:

1. Create a Secret

Store your PostgreSQL credentials:
goldsky secret create MY_POSTGRES_SECRET
When prompted, enter your PostgreSQL connection string:
postgres://username:password@host:port/database

2. Update Your Pipeline

Modify erc20-pipeline.yaml to add a PostgreSQL sink:
erc20-pipeline.yaml
name: erc20-transfers
resource_size: s

sources:
  base_erc20_transfers:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.2.0
    start_at: latest

transforms:
  # Filter to only USDC transfers
  usdc_transfers:
    type: sql
    primary_key: id
    sql: |
      SELECT
        id,
        sender,
        recipient,
        amount,
        to_timestamp(block_timestamp) as block_time
      FROM base_erc20_transfers
      WHERE address = lower('0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913')

sinks:
  postgres_sink:
    type: postgres
    schema: public
    table: usdc_transfers
    secret_name: MY_POSTGRES_SECRET
    from: usdc_transfers
    primary_key: id

3. Redeploy

Apply the updated configuration:
goldsky turbo apply erc20-pipeline.yaml

4. Query Your Data

Connect to PostgreSQL and query the USDC transfers:
SELECT
  sender,
  recipient,
  amount,
  block_time
FROM public.usdc_transfers
ORDER BY block_time DESC
LIMIT 10;