Kafka is a distributed streaming platform for building real-time data pipelines and streaming applications. It is designed to be fast, scalable, and durable.
You can use Kafka to integrate Goldsky data deeply into your existing data ecosystem. Goldsky supplies a message format that lets your downstream data pipelines handle blockchain forks and reorganizations.
Kafka has a rich ecosystem of SDKs and connectors that you can use for advanced data processing.
Less Magic Here
The Kafka integration is less end-to-end than mirroring data directly into a database. Goldsky handles much of the topic partitioning, balancing, and other details, but using Kafka still requires more work on your side.
Pipeline configuration
sources: {}
transforms: {}
sinks:
  my_kafka_sink:
    description: Type.Optional(Type.String())
    type: kafka
    from: Type.String()
    topic: Type.String({ max_length: 255 })
    secret_name: Type.String()
    topic_partitions: Type.Optional(Type.Integer())
    upsert_mode: Type.Optional(Type.Boolean())
    data_format: Type.Optional(Type.Union([Type.Literal("avro"), Type.Literal("json")]))
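The schema above lists types rather than values, so a filled-in sink may be easier to reason about. The following is a minimal sketch, not part of Goldsky's tooling: `validate_kafka_sink` is a hypothetical helper that checks a sink definition against the constraints shown in the schema, and the source name and topic in `example_sink` are invented for illustration.

```python
# Sketch only: checks a filled-in sink definition against the field
# constraints listed in the schema above. Not an official validator.

REQUIRED_STRINGS = ("from", "topic", "secret_name")
DATA_FORMATS = ("avro", "json")


def validate_kafka_sink(sink: dict) -> list:
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    if sink.get("type") != "kafka":
        problems.append("type must be 'kafka'")
    for key in REQUIRED_STRINGS:
        if not isinstance(sink.get(key), str):
            problems.append(f"{key} is required and must be a string")
    topic = sink.get("topic")
    if isinstance(topic, str) and len(topic) > 255:
        problems.append("topic must be at most 255 characters")
    partitions = sink.get("topic_partitions")
    # bool is a subclass of int in Python, so reject it explicitly.
    if partitions is not None and (
        isinstance(partitions, bool) or not isinstance(partitions, int)
    ):
        problems.append("topic_partitions must be an integer")
    upsert = sink.get("upsert_mode")
    if upsert is not None and not isinstance(upsert, bool):
        problems.append("upsert_mode must be a boolean")
    data_format = sink.get("data_format")
    if data_format is not None and data_format not in DATA_FORMATS:
        problems.append("data_format must be 'avro' or 'json'")
    return problems


# Hypothetical values: the source name and topic are made up for this example.
example_sink = {
    "type": "kafka",
    "from": "my_source",
    "topic": "ethereum.decoded_logs",
    "secret_name": "A_KAFKA_SECRET",
    "topic_partitions": 3,
    "upsert_mode": True,
    "data_format": "avro",
}

print(validate_kafka_sink(example_sink))
```

The optional fields (`description`, `topic_partitions`, `upsert_mode`, `data_format`) can be left out entirely; the sketch only checks them when they are present.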
Secrets
goldsky secret create --name A_KAFKA_SECRET --value '{
  "type": "kafka",
  "bootstrapServers": "Type.String()",
  "securityProtocol": "Type.Enum(SecurityProtocol)",
  "saslMechanism": "Type.Optional(Type.Enum(SaslMechanism))",
  "saslJaasUsername": "Type.Optional(Type.String())",
  "saslJaasPassword": "Type.Optional(Type.String())",
  "schemaRegistryUrl": "Type.Optional(Type.String())",
  "schemaRegistryUsername": "Type.Optional(Type.String())",
  "schemaRegistryPassword": "Type.Optional(Type.String())"
}'
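Quoting a JSON payload by hand on the command line is easy to get wrong, so it can help to generate the command. The sketch below assumes only the `goldsky secret create --name ... --value ...` form shown above. `build_secret_command` is a hypothetical helper, the broker address and credentials are placeholders, and `SASL_SSL` and `PLAIN` are standard Kafka client values that are assumed, not confirmed, to be accepted by the `SecurityProtocol` and `SaslMechanism` enums.

```python
import json
import shlex


def build_secret_command(name: str, secret: dict) -> str:
    """Compose a shell-safe `goldsky secret create` command string."""
    # Drop optional fields that were left unset instead of sending nulls.
    payload = {key: value for key, value in secret.items() if value is not None}
    return " ".join(
        [
            "goldsky",
            "secret",
            "create",
            "--name",
            shlex.quote(name),
            "--value",
            shlex.quote(json.dumps(payload)),
        ]
    )


# Placeholder values: replace with your own broker and credentials.
example_secret = {
    "type": "kafka",
    "bootstrapServers": "broker-1.example.com:9092",
    "securityProtocol": "SASL_SSL",  # assumed enum value
    "saslMechanism": "PLAIN",  # assumed enum value
    "saslJaasUsername": "my-user",
    "saslJaasPassword": "my-password",
    "schemaRegistryUrl": None,  # optional, omitted from the payload
    "schemaRegistryUsername": None,
    "schemaRegistryPassword": None,
}

print(build_secret_command("A_KAFKA_SECRET", example_secret))
```

The printed command can be pasted into a shell. Because it contains the password in plain text, avoid writing it to logs or to shell history.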