Overview

The throttle transform caps the throughput of a stream by buffering records into batches and emitting each batch on a fixed minimum interval. Use it to:
  • Stay under rate limits of downstream sinks or external APIs
  • Smooth out bursty sources into a steady, predictable rate
  • Test sink behavior at a controlled records-per-second rate
  • Reduce pressure on small resource sizes during development
Throttle does not modify the data — every input record passes through unchanged. It only controls when records are emitted.

Configuration

transforms:
  my_throttle:
    type: throttle
    from: <source-or-transform>
    max_batch_size: 100
    min_batch_interval: 10s

Parameters

  • type (string, required): Must be throttle.
  • from (string, required): The source or transform to read data from.
  • max_batch_size (integer, required): Maximum number of records to emit per batch. The throttle accumulates up to this many records before flushing.
  • min_batch_interval (duration, required): Minimum time to wait between batches (e.g., 10s, 500ms, 1m). The next batch will not be emitted until this interval has elapsed since the previous batch.

How throttling works

Records are buffered as they arrive from the upstream source or transform. A batch is flushed downstream when both conditions are met:
  • max_batch_size records have accumulated, and
  • min_batch_interval has elapsed since the last batch was emitted
The effective maximum throughput is approximately:
max_batch_size / min_batch_interval (in seconds) = records per second
For example, max_batch_size: 100 with min_batch_interval: 10s caps throughput at roughly 10 records per second.
Throttle limits the maximum rate, not the minimum. If the upstream is slow, batches will be smaller and arrive less frequently.
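
As a worked illustration of the formula above, the fragment below applies it to three settings with different duration units. It is a sketch only: the transform names are hypothetical, erc20s is assumed to be a source defined elsewhere in the pipeline, and the rates in the comments are approximate ceilings rather than measured values.

```yaml
transforms:
  # 100 records / 10 s = ~10 records per second
  throttle_10s:
    type: throttle
    from: erc20s
    max_batch_size: 100
    min_batch_interval: 10s

  # 50 records / 0.5 s = ~100 records per second
  throttle_500ms:
    type: throttle
    from: erc20s
    max_batch_size: 50
    min_batch_interval: 500ms

  # 30 records / 60 s = ~0.5 records per second
  throttle_1m:
    type: throttle
    from: erc20s
    max_batch_size: 30
    min_batch_interval: 1m
```

Converting the interval to seconds before dividing keeps the result in records per second regardless of the unit used in the configuration.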

Example

Throttle a high-volume ERC-20 transfer stream down to ~10 records per second (rps) before sending it to a sink:
name: throttle_example
resource_size: s
use_dedicated_ip: false
job: false

sources:
  erc20s:
    type: dataset
    dataset_name: matic.erc20_transfers
    version: 1.2.0
    start_at: latest

transforms:
  throttled_erc20s:
    type: throttle
    from: erc20s
    max_batch_size: 100 # ~10 rps with a 10s interval
    min_batch_interval: 10s

sinks:
  sink_1:
    type: blackhole
    from: throttled_erc20s

When to use throttle

  • Rate-limited sinks — Stay under per-second write quotas on downstream APIs or databases.
  • External handler protection — Pace records into an HTTP handler so the receiving service is not overwhelmed.
  • Cost control during development — Slow down processing while iterating on a pipeline against a live source.
  • Testing — Reproduce sink behavior under a known, fixed input rate.
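
For the rate-limited sink case, one way to pick values is to start from the quota and work backwards. The sketch below assumes a hypothetical quota of 50 writes per second and leaves some headroom. The quota, the transform name, and the upstream name erc20s are illustrative assumptions, not values from any particular sink.

```yaml
transforms:
  quota_safe_erc20s:
    type: throttle
    from: erc20s
    # Assumed quota: 50 writes per second.
    # 40 records / 1 s = ~40 records per second, about 80% of the quota.
    max_batch_size: 40
    min_batch_interval: 1s
```

Each batch is emitted at once, so if a sink enforces its quota over windows shorter than the interval, a smaller batch with a proportionally shorter interval may be a better fit.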

Best Practices

1. Place throttle close to the bottleneck

Throttle the stream just before the rate-limited sink or handler so upstream transforms still process at full speed.

2. Tune batch size to your sink

Larger max_batch_size reduces per-batch overhead but increases latency per record. Pick a size that matches your sink’s preferred batch size.

3. Remove throttle in production where possible

Throttle caps throughput by design. Once rate-limit concerns are addressed, remove the transform to let the pipeline run at full speed.
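
The placement and batch-size practices above can be sketched together in one fragment. This is a minimal illustration, assuming a hypothetical upstream transform named filtered_erc20s that is defined elsewhere in the pipeline. The throttle sits directly in front of the sink, and the two commented options show the same approximate rate cap reached with different batch sizes.

```yaml
transforms:
  # Placed last, just before the sink, so upstream transforms run unthrottled.
  # `filtered_erc20s` is a hypothetical transform assumed to exist upstream.
  paced_erc20s:
    type: throttle
    from: filtered_erc20s
    # Option A: small batches about once per second (~10 records per second).
    max_batch_size: 10
    min_batch_interval: 1s
    # Option B: larger batches about every 10 seconds (same ~10 records per second).
    # max_batch_size: 100
    # min_batch_interval: 10s

sinks:
  sink_1:
    type: blackhole
    from: paced_erc20s
```

Option A keeps the interval between batches short, while option B trades a longer interval for fewer, larger batches. Which one suits a pipeline depends on the batch size the sink handles best.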