Kafka is a distributed streaming platform used to build real-time data pipelines and streaming applications. It is designed to be fast, scalable, and durable.

You can use Kafka to integrate Goldsky data deeply into your existing data ecosystem. Goldsky supplies a message format that lets your downstream data pipelines handle blockchain forks and reorganizations.

Kafka has a rich ecosystem of SDKs and connectors that you can use for advanced data processing.

Less Magic Here

The Kafka integration is less end-to-end: Goldsky handles much of the topic partitioning, balancing, and other details, but using Kafka is more involved than having data mirrored directly into a database.

Pipeline configuration

sources: {}
transforms: {}
sinks:
  my_kafka_sink:
    description: Type.Optional(Type.String())
    type: kafka
    from: Type.String()
    topic: Type.String({ max_length: 255 })
    secret_name: Type.String()
    topic_partitions: Type.Optional(Type.Integer())
    upsert_mode: Type.Optional(Type.Boolean())
    data_format: Type.Optional(Type.Union([Type.Literal("avro"), Type.Literal("json")]))
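
The field types above can be read as a checklist for a sink definition. As a rough illustration, the following Python sketch checks one sink (already parsed into a dictionary) against those rules before you deploy it. It is not Goldsky's own validation: the function name `validate_kafka_sink`, the error messages, and the example values such as `my_transform` and `my-topic` are hypothetical, and `max_length: 255` is assumed to mean a limit of 255 characters.

```python
from typing import Any

# Rules transcribed from the schema above. The messages are illustrative only.
ALLOWED_DATA_FORMATS = {"avro", "json"}
MAX_TOPIC_LENGTH = 255  # assumed to be a character count


def validate_kafka_sink(sink: dict[str, Any]) -> list[str]:
    """Return the problems found in one sink definition (empty list if none)."""
    errors: list[str] = []

    if sink.get("type") != "kafka":
        errors.append("type must be 'kafka'")

    # Required string fields.
    for key in ("from", "topic", "secret_name"):
        if not isinstance(sink.get(key), str):
            errors.append(f"{key} is required and must be a string")

    topic = sink.get("topic")
    if isinstance(topic, str) and len(topic) > MAX_TOPIC_LENGTH:
        errors.append(f"topic must be at most {MAX_TOPIC_LENGTH} characters")

    # Optional fields are only checked when present.
    if "description" in sink and not isinstance(sink["description"], str):
        errors.append("description must be a string")

    if "topic_partitions" in sink:
        value = sink["topic_partitions"]
        # bool is a subclass of int in Python, so rule it out explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append("topic_partitions must be an integer")

    if "upsert_mode" in sink and not isinstance(sink["upsert_mode"], bool):
        errors.append("upsert_mode must be a boolean")

    if "data_format" in sink and sink["data_format"] not in ALLOWED_DATA_FORMATS:
        errors.append("data_format must be 'avro' or 'json'")

    return errors


# Hypothetical sink definition with every optional field except description.
example_sink = {
    "type": "kafka",
    "from": "my_transform",
    "topic": "my-topic",
    "secret_name": "A_KAFKA_SECRET",
    "topic_partitions": 3,
    "upsert_mode": True,
    "data_format": "json",
}

print(validate_kafka_sink(example_sink))
```

Returning a list of messages rather than raising on the first problem lets you report every issue in a definition at once.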

Secrets


goldsky secret create --name A_KAFKA_SECRET --value '{
  "type": "kafka",
  "bootstrapServers": "Type.String()",
  "securityProtocol": "Type.Enum(SecurityProtocol)",
  "saslMechanism": "Type.Optional(Type.Enum(SaslMechanism))",
  "saslJaasUsername": "Type.Optional(Type.String())",
  "saslJaasPassword": "Type.Optional(Type.String())",
  "schemaRegistryUrl": "Type.Optional(Type.String())",
  "schemaRegistryUsername": "Type.Optional(Type.String())",
  "schemaRegistryPassword": "Type.Optional(Type.String())"
}'
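
Hand-writing the JSON value inside a shell string is easy to get wrong, so one option is to generate it. The sketch below builds the secret payload from the fields listed above, omits optional fields that are not set, and prints a shell-quoted `goldsky secret create` command. The helper names `build_secret_value` and `build_cli_command` are hypothetical. The values `SASL_SSL` and `PLAIN` are standard Kafka client settings used here as an assumption, since the members of `SecurityProtocol` and `SaslMechanism` are not listed in this section. The broker address and credentials are placeholders.

```python
import json
import shlex
from typing import Optional

# Optional keys taken from the secret definition above.
OPTIONAL_KEYS = {
    "saslMechanism",
    "saslJaasUsername",
    "saslJaasPassword",
    "schemaRegistryUrl",
    "schemaRegistryUsername",
    "schemaRegistryPassword",
}


def build_secret_value(
    bootstrap_servers: str,
    security_protocol: str,
    **optional: Optional[str],
) -> str:
    """Serialize the secret payload, leaving out optional fields set to None."""
    payload = {
        "type": "kafka",
        "bootstrapServers": bootstrap_servers,
        "securityProtocol": security_protocol,
    }
    for key, value in optional.items():
        if key not in OPTIONAL_KEYS:
            raise ValueError(f"unknown secret field: {key}")
        if value is not None:
            payload[key] = value
    return json.dumps(payload)


def build_cli_command(name: str, value: str) -> str:
    """Return the CLI invocation with the name and JSON value quoted for a POSIX shell."""
    parts = ["goldsky", "secret", "create", "--name", name, "--value", value]
    return " ".join(shlex.quote(part) for part in parts)


# Placeholder connection details; replace them with your own.
secret_value = build_secret_value(
    "broker.example.com:9092",
    "SASL_SSL",          # assumed enum member
    saslMechanism="PLAIN",  # assumed enum member
    saslJaasUsername="my-user",
    saslJaasPassword="my-password",
    schemaRegistryUrl=None,  # omitted from the payload
)

print(build_cli_command("A_KAFKA_SECRET", secret_value))
```

Because the printed command contains the password, avoid running this where the output is logged or stored in shell history.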