Overview

Sinks are the final destination for data in your Turbo pipelines. They write processed data to external systems like databases, data warehouses, or HTTP endpoints.

Available Sinks

Turbo supports several sink types, including PostgreSQL, ClickHouse, Webhook, Kafka, and S2.

Common Parameters

All sinks share these common parameters:
  • type (string, required): The sink type (postgres, clickhouse, webhook, etc.)
  • from (string, required): The transform or source to read data from
  • secret_name (string): Name of the secret containing connection credentials (required for database sinks)
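
As a minimal sketch (the sink name, transform name, and connection details below are placeholders), a sink entry combines these common parameters with type-specific settings such as schema and table for postgres, url for webhook, or topic for kafka:
sinks:
  my_sink:
    type: postgres          # sink type
    from: my_transform      # transform or source to read from
    secret_name: MY_SECRET  # connection credentials (required for database sinks)
    schema: public          # postgres-specific
    table: my_table         # postgres-specific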

Multiple Sinks

You can write the same data to multiple destinations:
transforms:
  processed_data:
    type: sql
    primary_key: id
    sql: SELECT * FROM source

sinks:
  # Write to PostgreSQL
  postgres_archive:
    type: postgres
    from: processed_data
    schema: public
    table: archive
    secret_name: MY_POSTGRES

  # Send to webhook
  webhook_notification:
    type: webhook
    from: processed_data
    url: https://api.example.com/notify

  # Publish to Kafka
  kafka_downstream:
    type: kafka
    from: processed_data
    topic: processed.events

Each sink operates independently; failures in one don’t affect the others.

Sink Behavior

Checkpointing

All sinks participate in Turbo’s checkpointing system:
  • Data is buffered until a checkpoint completes
  • Only acknowledged data is committed to the sink
  • Together, these ensure exactly-once delivery semantics

Backpressure

Sinks apply backpressure to the pipeline:
  • If a sink can’t keep up, the entire pipeline slows down
  • Prevents data loss and memory overflow
  • Monitor sink performance to identify bottlenecks

Error Handling

Sink errors are handled with retries:
  • Transient errors (network issues) are retried with exponential backoff
  • Permanent errors (invalid data, schema mismatches) fail the pipeline
  • Check logs for detailed error messages
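
To inspect these errors, view the pipeline logs with the Turbo CLI (the pipeline name is a placeholder):
goldsky turbo logs my-pipeline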

Best Practices

Choose the right sink

  • PostgreSQL: Transactional data, updates, relational queries
  • ClickHouse: High-volume analytics, aggregations, time-series
  • Webhook: Real-time notifications, integrations with external systems
  • Kafka: Downstream processing, event sourcing, decoupling systems
  • S2: Decoupled processing, large number of readers, serverless architectures

Use stable primary keys

For upsert behavior in databases, choose a stable primary key:
sinks:
  postgres_sink:
    primary_key: id  # Use a unique, stable identifier
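
Putting it together, an upsert-style PostgreSQL sink might look like the sketch below; it reuses the parameters from the multiple-sinks example, and all names and values are placeholders:
sinks:
  postgres_archive:
    type: postgres
    from: processed_data
    schema: public
    table: archive
    secret_name: MY_POSTGRES
    primary_key: id  # unique, stable identifier used for upserts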

Monitor sink performance

Use logs and metrics to track:
  • Write throughput
  • Error rates
  • Latency
For example:
goldsky turbo logs my-pipeline

Secure credentials

Always use secrets for database credentials:
# Create secret
goldsky secret create MY_DB_SECRET

# Reference in pipeline
sinks:
  my_sink:
    secret_name: MY_DB_SECRET