Kafka is a distributed streaming platform used to build real-time data pipelines and streaming applications. It is designed to be fast, scalable, and durable, and it integrates deeply with your existing data ecosystem. Goldsky supplies a message format that lets your downstream data pipelines handle blockchain forks and reorganizations, and Kafka's rich ecosystem of SDKs and connectors is available for advanced data processing.
The Kafka integration is less end-to-end: while Goldsky handles topic partitioning, balancing, and other details, using Kafka is more involved than having data mirrored directly into a database.
Full configuration details for the Kafka sink are available on the reference page.

Secrets


Create a secret with your Kafka connection details. The bootstrapServers and securityProtocol fields are required; saslMechanism, saslJaasUsername, saslJaasPassword, and the schemaRegistry fields are optional. The values below are placeholders:

goldsky secret create --name A_KAFKA_SECRET --value '{
  "type": "kafka",
  "bootstrapServers": "broker-1.example.com:9092,broker-2.example.com:9092",
  "securityProtocol": "SASL_SSL",
  "saslMechanism": "PLAIN",
  "saslJaasUsername": "<username>",
  "saslJaasPassword": "<password>",
  "schemaRegistryUrl": "https://schema-registry.example.com",
  "schemaRegistryUsername": "<registry-username>",
  "schemaRegistryPassword": "<registry-password>"
}'
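
As a sketch of how these secret fields map to standard Kafka client configuration in a downstream application, the following Java snippet uses the kafka-clients consumer. The topic name, consumer group id, and credentials are placeholders, and it assumes string-encoded message values; if your pipeline writes Avro, use a schema-registry-aware deserializer with the schemaRegistry credentials instead.

import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GoldskyTopicConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // These properties mirror the fields of the Kafka secret above.
        props.put("bootstrap.servers", "broker-1.example.com:9092,broker-2.example.com:9092");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
            + "username=\"<username>\" password=\"<password>\";");
        props.put("group.id", "my-downstream-consumer"); // placeholder group id
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // "my.goldsky.topic" is a placeholder; subscribe to the topic your pipeline writes to.
            consumer.subscribe(List.of("my.goldsky.topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d key=%s value=%s%n",
                        record.partition(), record.key(), record.value());
                }
            }
        }
    }
}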

Partitioning

Goldsky uses Kafka’s default partitioning strategy based on message key hashes. The message key is constructed from the primary key column(s) of your data. Key behavior:
  • Key format: Primary key values joined with _ (e.g., enriched_transaction_v2_0x6a7b...789d_1)
  • Partitioner: Kafka’s DefaultPartitioner (murmur2 hash)
  • Partition assignment: murmur2(keyBytes) % numPartitions
Implications for increasing partitions:
  • Records with the same key always go to the same partition, ensuring ordering per key
  • Increasing partitions will cause key redistribution — existing keys may map to different partitions
  • Global ordering is not guaranteed; only per-key ordering is maintained
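
As a concrete illustration of this behavior, here is a minimal Java sketch that reproduces the default partitioner's computation using the murmur2 and toPositive helpers from kafka-clients. The key is the hypothetical example key from above, and the partition counts are arbitrary.

import java.nio.charset.StandardCharsets;

import org.apache.kafka.common.utils.Utils;

public class PartitionForKey {
    // Partition assignment used by Kafka's default partitioner for keyed records.
    // Kafka maps the murmur2 hash to a non-negative value before taking the modulo.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        // Hypothetical key built from the primary key columns, joined with "_".
        String key = "enriched_transaction_v2_0x6a7b...789d_1";

        // The same key always maps to the same partition for a fixed partition count...
        System.out.println(partitionFor(key, 12));
        // ...but changing the partition count generally changes the mapping.
        System.out.println(partitionFor(key, 24));
    }
}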