Kafka is a distributed streaming platform used to build real-time data pipelines and streaming applications. It is designed to be fast, scalable, and durable. You can use Kafka to integrate deeply with your existing data ecosystem. Goldsky supplies a message format that lets you handle blockchain forks and reorganizations in your downstream data pipelines. Kafka has a rich ecosystem of SDKs and connectors that you can use for advanced data processing. Full configuration details for the Kafka sink are available in the reference page.
Configuration options
Parallelism
The parallelism option controls the number of parallel Kafka producers used to write data. Increasing parallelism can significantly improve throughput for high-volume pipelines.
Batching
Configure batching behavior to optimize throughput and latency:
- batch_size: Number of messages to batch before sending (maps to Kafka’s batch.num.messages). Default: 10000
- batch_flush_interval: Maximum time in milliseconds to wait before flushing a batch (maps to Kafka’s linger.ms). Default: 100
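To make the interplay between the two batching options concrete, here is a minimal Python sketch of a size-or-time batcher. It is a toy model, not Goldsky's or Kafka's implementation: the MicroBatcher class, its send callback, and the injectable clock are hypothetical names introduced for illustration, and a real producer batches per partition and flushes on a background thread.

```python
import time
from typing import Callable, List, Optional


class MicroBatcher:
    """Toy model: flush when batch_size messages are buffered, or when
    batch_flush_interval milliseconds have passed since the first buffered message."""

    def __init__(
        self,
        send: Callable[[List[str]], None],
        batch_size: int = 10000,          # default taken from the option list above
        batch_flush_interval: int = 100,  # milliseconds, default from the option list
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.send = send
        self.batch_size = batch_size
        self.flush_after_s = batch_flush_interval / 1000.0
        self.clock = clock
        self.buffer: List[str] = []
        self.first_at: Optional[float] = None

    def add(self, message: str) -> None:
        # Remember when the current batch started so poll() can time it out.
        if not self.buffer:
            self.first_at = self.clock()
        self.buffer.append(message)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def poll(self) -> None:
        """Call periodically; flushes a partial batch once the interval has elapsed."""
        if self.buffer and self.clock() - self.first_at >= self.flush_after_s:
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            self.send(list(self.buffer))
            self.buffer.clear()
            self.first_at = None
```

In this model a larger batch_size favors throughput (fewer, bigger sends), while a smaller batch_flush_interval bounds how long a message can sit in a partially filled batch.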
Message size
- message_max_bytes: Maximum size in bytes for a Kafka request (maps to Kafka’s message.max.bytes). Default: 1000000 (1 MB)
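If you are used to tuning Kafka clients directly, it may help to see the documented option-to-property mapping in one place. The Python helper below is a hypothetical illustration (to_kafka_properties is not a Goldsky or Kafka API); it only restates the mappings and defaults listed above.

```python
from typing import Dict


def to_kafka_properties(
    batch_size: int = 10000,
    batch_flush_interval: int = 100,
    message_max_bytes: int = 1000000,
) -> Dict[str, int]:
    """Translate the sink options described above into the Kafka property
    names they are documented to map to. Hypothetical helper for illustration."""
    return {
        "batch.num.messages": batch_size,       # messages per batch
        "linger.ms": batch_flush_interval,      # max wait before flushing, in ms
        "message.max.bytes": message_max_bytes, # max request size, in bytes
    }
```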
Secrets
Partitioning
Goldsky uses Kafka’s default partitioning strategy based on message key hashes. The message key is constructed from the primary key column(s) of your data.
Key behavior:
- Key format: Primary key values joined with _ (e.g., enriched_transaction_v2_0x6a7b...789d_1)
- Partitioner: Kafka’s DefaultPartitioner (murmur2 hash)
- Partition assignment: murmur2(keyBytes) % numPartitions
- Records with the same key always go to the same partition, ensuring ordering per key
- Increasing the number of partitions causes key redistribution: existing keys may map to different partitions
- Global ordering is not guaranteed; only per-key ordering is maintained
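The partitioning behavior described above can be sketched in Python to predict which partition a record lands on, or to estimate how many keys move when the partition count changes. This is an illustrative reimplementation, not Goldsky's code. It assumes keys are UTF-8 encoded and that, as in Kafka's Java client, the sign bit of the hash is cleared before taking the modulus (a step the formula above leaves implicit). The helper names build_message_key, partition_for, and moved_keys are hypothetical, and the key values in the usage note are made up.

```python
from typing import Iterable, List


def murmur2(data: bytes) -> int:
    """32-bit murmur2 with the seed used by Kafka's Java client, returned unsigned."""
    m = 0x5BD1E995
    mask = 0xFFFFFFFF
    length = len(data)
    h = (0x9747B28C ^ length) & mask
    # Mix the input four bytes at a time, read as little-endian integers.
    for i in range(0, length - length % 4, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & mask
        k ^= k >> 24
        k = (k * m) & mask
        h = (h * m) & mask
        h ^= k
    # Fold in the trailing 1-3 bytes, if any.
    tail = data[length - length % 4:]
    if tail:
        h ^= int.from_bytes(tail, "little")
        h = (h * m) & mask
    h ^= h >> 13
    h = (h * m) & mask
    h ^= h >> 15
    return h


def build_message_key(primary_key_values: Iterable[object]) -> str:
    """Join primary key values with '_' as described in the key format above."""
    return "_".join(str(v) for v in primary_key_values)


def partition_for(key: str, num_partitions: int) -> int:
    """Clear the sign bit (assumption: as the Java client does), then take the modulus."""
    return (murmur2(key.encode("utf-8")) & 0x7FFFFFFF) % num_partitions


def moved_keys(keys: Iterable[str], old_count: int, new_count: int) -> List[str]:
    """Keys whose partition changes when the partition count changes."""
    return [
        k for k in keys
        if partition_for(k, old_count) != partition_for(k, new_count)
    ]
```

For example, partition_for(build_message_key(["0xabc", 1]), 6) always returns the same partition for that key, while moved_keys(keys, 4, 8) lists the keys that would be reassigned after growing a topic from 4 to 8 partitions. Records written before the change stay where they were, so per-key ordering across the resize is not preserved for those keys.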