The sub-second, realtime and reorg-aware advantages of Mirror are greatly diminished when using our S3 connector, due to the constraints of file-based storage. If possible, we highly recommend using one of the other channels or sinks instead!

The files are created in Parquet format.

Files are emitted on an interval, essentially mimicking a mini-batch system.

Data is also append-only, so if there is a reorg, rows with the same id will be emitted more than once. It's up to downstream consumers to deduplicate this data (see the sketch below).
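
For example, a downstream consumer can deduplicate at query time by keeping a single row per id. The sketch below is illustrative rather than prescriptive: it uses DuckDB over the exported Parquet files, the S3 path is a placeholder, and it assumes each row carries an id plus a block_number column to order on; substitute whichever ordering column your dataset provides.

-- Hedged sketch: deduplicate reorged rows at read time with DuckDB.
-- The S3 path is a placeholder; reading from S3 requires DuckDB's httpfs
-- extension and S3 credentials to be configured.
SELECT * EXCLUDE (rn)
FROM (
  SELECT *,
         row_number() OVER (PARTITION BY id ORDER BY block_number DESC) AS rn
  FROM read_parquet('s3://<your-bucket>/<your-path>/**/*.parquet')
)
WHERE rn = 1;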

Full configuration details for this sink are available in the reference page.

Secrets

Create an AWS S3 secret with the following CLI command:

goldsky secret create --name AN_AWS_S3_SECRET --value '{
  "accessKeyId": "<aws-access-key-id>",
  "secretAccessKey": "<aws-secret-access-key>",
  "region": "<aws-region>",
  "type": "s3"
}'

Partitioning

This sink supports folder-based partitioning through the partition_columns option.

In this example, files are written to a different folder for each day, based on the block_timestamp of each Base transfer:

s3://test-bucket/base/transfers/erc20/<yyyy-MM-dd>

name: example-partition
apiVersion: 3
sources:
  base.transfers:
    dataset_name: base.erc20_transfers
    version: 1.2.0
    type: dataset
    start_at: latest
transforms:
  transform_transactions:
    type: sql
    primary_key: id
    sql: |-
      select *, from_unixtime(block_timestamp/1000, 'yyyy-MM-dd') as dt
      from base.transfers
sinks:
  filesink_transform_transactions:
    secret_name: AN_AWS_S3_SECRET
    path: s3://test-bucket/base/transfers/erc20/
    type: file
    format: parquet
    partition_columns: dt
    from: transform_transactions
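
Once the pipeline is running, downstream queries can prune partitions by reading only the folders they need. The sketch below uses DuckDB and assumes the folder layout shown above; the bucket name and date are illustrative, and reading from S3 requires DuckDB's httpfs extension plus configured credentials.

-- Hedged sketch: scan a single day's partition from the layout above.
-- The date folder is illustrative.
INSTALL httpfs;
LOAD httpfs;

SELECT count(*) AS transfer_count
FROM read_parquet('s3://test-bucket/base/transfers/erc20/2024-01-01/*.parquet');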