
# Object Storage (S3/GCS/R2)

<Warning>
  The sub-second, real-time, and reorg-aware advantages of Mirror are greatly
  diminished when using object storage, due to the constraints of file-based
  storage. If possible, use one of the other sinks instead.
</Warning>

The files are created in [Parquet](https://parquet.apache.org/) format.

Files are emitted on an interval, essentially mimicking a mini-batch system.

Data is also append-only, so if there is a reorg, rows with an already-emitted `id` will be emitted again. It is up to downstream consumers to deduplicate the data.
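As a rough illustration of consumer-side deduplication, the sketch below keeps only the most recently emitted row for each `id`. It assumes the rows have already been loaded from the Parquet files into plain dictionaries, in the order the files were emitted, and that a later row should replace an earlier one with the same `id`. The `dedupe_latest` helper and the sample rows are hypothetical and not part of Mirror.

```python
from typing import Any, Dict, Iterable, List


def dedupe_latest(rows: Iterable[Dict[str, Any]], key: str = "id") -> List[Dict[str, Any]]:
    """Keep only the last row seen for each key, in first-seen key order."""
    latest: Dict[Any, Dict[str, Any]] = {}
    for row in rows:
        # Assumption: a later row with the same key supersedes the earlier one.
        latest[row[key]] = row
    return list(latest.values())


# Hypothetical rows, listed in the order the files were emitted.
rows = [
    {"id": "a", "block_number": 100, "value": 1},
    {"id": "b", "block_number": 101, "value": 2},
    {"id": "b", "block_number": 101, "value": 3},  # same id emitted again after a reorg
]

deduped = dedupe_latest(rows)
print(deduped)
```

For large datasets the same idea is usually expressed in the query engine that reads the files, for example by selecting the latest row per `id`.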

Full configuration details for this sink are available on the [reference](/mirror/reference/config-file/pipeline#file) page.

## Secrets

Create an AWS S3 secret with the following CLI command, replacing each `Type.String()` placeholder with your own value:

```shell theme={null}
goldsky secret create --name AN_AWS_S3_SECRET --value '{
  "accessKeyId": "Type.String()",
  "secretAccessKey": "Type.String()",
  "region": "Type.String()",
  "type": "s3"
}'
```

## Partitioning

This sink supports folder-based partitioning through the `partition_columns` option.

In this example, the sink stores files in a separate folder for each day, based on the `block_timestamp` of each Base transfer:

`s3://test-bucket/base/transfers/erc20/<yyyy-MM-dd>`

```yaml theme={null}
name: example-partition
apiVersion: 3
sources:
  base.transfers:
    dataset_name: base.erc20_transfers
    version: 1.2.0
    type: dataset
    start_at: latest
transforms:
  transform_transactions:
    type: sql
    primary_key: id
    sql: |-
      select *, from_unixtime(block_timestamp/1000, 'yyyy-MM-dd') as dt
      from base.transfers
sinks:
  filesink_transform_transactions:
    secret_name: S3_SECRET
    path: s3://test-bucket/base/transfers/erc20/
    type: file
    format: parquet
    partition_columns: dt
    from: transform_transactions
```
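A downstream reader may want to work out which daily folder a given transfer lands in. The sketch below mirrors the `dt` expression from the transform, under two assumptions: `block_timestamp` is in milliseconds (as the division by 1000 suggests), and the date is computed in UTC. The `partition_value` and `partition_folder` helpers are hypothetical, and the folder layout simply follows the path shown above, so check it against the actual contents of your bucket.

```python
from datetime import datetime, timezone

# Base path taken from the sink configuration in the example above.
BASE_PATH = "s3://test-bucket/base/transfers/erc20/"


def partition_value(block_timestamp_ms: int) -> str:
    """Format a millisecond timestamp as yyyy-MM-dd (assumption: UTC)."""
    seconds = block_timestamp_ms // 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d")


def partition_folder(block_timestamp_ms: int, base_path: str = BASE_PATH) -> str:
    """Build the daily folder path, assuming the layout shown in this section."""
    return base_path + partition_value(block_timestamp_ms)


print(partition_folder(1700000000000))
# prints "s3://test-bucket/base/transfers/erc20/2023-11-14"
```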