Overview

Sinks are the final destination for data in your Turbo pipelines. They write processed data to external systems like databases, data warehouses, or HTTP endpoints.

Available Sinks

  • PostgreSQL: Write to PostgreSQL databases
  • PostgreSQL aggregation: Real-time aggregations in PostgreSQL
  • ClickHouse: Write to ClickHouse for analytics
  • MySQL: Write to MySQL databases
  • Webhook: Send data to HTTP endpoints
  • Kafka: Publish to Kafka topics
  • S3: Write to S3-compatible storage
  • SQS: Send to Amazon SQS queues
  • S2: Publish to S2.dev streams
  • Blackhole: Discard data for testing purposes

Common Parameters

Every sink accepts at least these parameters. See each sink’s page for type-specific fields.
type (string, required)
The sink type — for example postgres, clickhouse, kafka, webhook, mysql_sink, s3_sink, sqs_sink, s2_sink, blackhole.

from (string, required)
Name of the transform or source to read data from.

secret_name (string, optional)
Name of the Goldsky secret containing connection credentials. Recommended for any sink that connects to an external system (databases, Kafka, S3, SQS, webhooks, etc.). Some sinks also accept inline credentials; see each sink's page for details.

primary_key (string, optional)
Column (or comma-separated list of columns) used to identify unique rows. Required for upserts in database sinks, and always required on the ClickHouse sink.
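
As a minimal sketch, a PostgreSQL sink using these common parameters might look like the following (the sink, schema, table, and secret names are placeholders; schema and table are PostgreSQL-specific fields, not common ones):
sinks:
  my_postgres_sink:
    type: postgres            # sink type
    from: processed_data      # transform or source to read from
    secret_name: MY_POSTGRES  # Goldsky secret with connection credentials
    primary_key: id           # enables upserts
    schema: public            # PostgreSQL-specific field
    table: events             # PostgreSQL-specific field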

Multiple Sinks

You can write the same data to multiple destinations:
transforms:
  processed_data:
    type: sql
    primary_key: id
    sql: SELECT * FROM source

sinks:
  # Write to PostgreSQL
  postgres_archive:
    type: postgres
    from: processed_data
    schema: public
    table: archive
    secret_name: MY_POSTGRES

  # Send to webhook
  webhook_notification:
    type: webhook
    from: processed_data
    url: https://api.example.com/notify
    secret_name: NOTIFY_API_SECRET

  # Publish to Kafka
  kafka_downstream:
    type: kafka
    from: processed_data
    topic: processed.events
Each sink writes its own copy of the data. Because all sinks share the same checkpoint barrier, a sink that is slow or stalled will apply backpressure to the whole pipeline (see below).

Sink Behavior

Checkpointing

All sinks participate in Turbo’s checkpointing system:
  • Each sink buffers writes and acknowledges only after it has flushed them
  • Sources only commit their position after every sink has acknowledged
  • This gives at-least-once delivery — duplicates are possible after a restart, but no committed data is lost
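
In practice, at-least-once delivery means sinks should be idempotent. For database sinks, setting primary_key achieves this: a redelivered row upserts over the earlier copy rather than creating a duplicate. Conceptually (an illustrative sketch, not the literal statement the sink issues), the write behaves like a PostgreSQL upsert:
-- Illustrative only: a redelivered row with the same id
-- overwrites the earlier copy instead of duplicating it
INSERT INTO public.events (id, value)
VALUES ($1, $2)
ON CONFLICT (id) DO UPDATE SET value = EXCLUDED.value;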

Backpressure

Sinks apply backpressure to the pipeline:
  • If any sink can’t keep up, the entire pipeline slows down
  • This prevents data loss and unbounded buffering in memory
  • Monitor sink performance to identify bottlenecks

Error Handling

Sink errors are retried with exponential backoff. Transient errors (network blips, connection resets) recover automatically. Persistent errors (bad credentials, schema mismatches, unreachable destination) will keep retrying and show up as stalled checkpoints — check the pipeline logs for the underlying error message.
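
For example, to surface the underlying error (the grep filter is just one convenient way to narrow the output):
# Fetch the pipeline logs and filter for errors
goldsky turbo logs my-pipeline | grep -i error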

Best Practices

Choose the sink that matches your workload:
  • PostgreSQL: Transactional data, updates, relational queries
  • PostgreSQL aggregation: Real-time aggregations like balances, totals, counts
  • ClickHouse: High-volume analytics, aggregations, time-series
  • MySQL: Transactional data, updates, relational queries
  • Webhook: Real-time notifications, integrations with external systems
  • Kafka: Downstream processing, event sourcing, decoupling systems
  • SQS: Event-driven architectures, decoupled integrations, message queuing
  • S2: Decoupled processing, many concurrent readers, serverless architectures
For upsert behavior in databases, choose a stable primary key:
sinks:
  postgres_sink:
    type: postgres
    from: processed_data
    primary_key: id  # Use a unique, stable identifier
Use logs and metrics to track:
  • Write throughput
  • Error rates
  • Latency
goldsky turbo logs my-pipeline
Always use secrets for database credentials:
# Create secret
goldsky secret create MY_DB_SECRET

# Reference in pipeline
sinks:
  my_sink:
    type: postgres  # or any other sink type that connects to an external system
    from: processed_data
    secret_name: MY_DB_SECRET