Documentation Index
Fetch the complete documentation index at: https://docs.goldsky.com/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Goldsky provides curated datasets for EVM chains, making it easy to build real-time data pipelines without managing Kafka topics or schemas. All datasets follow consistent schemas across chains, so you can write multi-chain pipelines with minimal code changes. See the supported chains page for the full list. The Datasource explorer contains all the chain and dataset names. You can also list datasets with `goldsky dataset list` in the CLI.
All Mirror datasets work with Turbo. For complete field schemas, see the EVM Data Schemas and Curated Data Schemas references.
Quick Start
The simplest way to get started is with the curated token transfer datasets; a minimal pipeline sketch follows the lists below. Dataset names use each chain's network slug (e.g., matic for Polygon PoS, not polygon); see each chain's page in supported chains for the correct slug.
Example datasets:
- `ethereum.raw_logs` for Ethereum mainnet event logs
- `matic.erc20_transfers` for Polygon ERC-20 transfers
- `base.raw_logs` for Base event logs
Available dataset types:
- `erc20_transfers` - Fungible token transfers (curated)
- `erc721_transfers` - NFT transfers (curated)
- `erc1155_transfers` - Multi-token transfers (curated)
- `raw_blocks` - Block headers
- `raw_transactions` - Transactions including receipt fields
- `raw_logs` - Event logs
- `raw_traces` - Internal transaction traces
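As a rough sketch (the dataset version, secret name, and exact config keys are assumptions; check the Mirror pipeline reference for the schema your CLI version expects), a minimal pipeline that mirrors Base ERC-20 transfers into Postgres could look like:

```yaml
name: base-erc20-transfers
apiVersion: 3
sources:
  base_erc20_transfers:
    type: dataset
    dataset_name: base.erc20_transfers   # any chain/dataset pair from the Datasource explorer
    version: 1.0.0                       # assumption: check `goldsky dataset get` for the current version
    start_at: earliest                   # backfill history first, then stream new blocks
sinks:
  postgres_sink:
    type: postgres
    from: base_erc20_transfers
    schema: public
    table: base_erc20_transfers
    secret_name: MY_POSTGRES_SECRET      # a Goldsky secret holding the database credentials
```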
Fast scan
Processing full datasets (starting from `earliest`) requires the pipeline to process a significant amount of data, which affects how quickly it reaches the chain tip. This is especially true for larger chains.
When you only need a subset of historical data, you can enable fast scan by adding a filter to your source configuration. The filter is pre-applied at the source level, making initial ingestion of historical data much faster by skipping irrelevant blocks.
When defining a filter, use attributes that exist in the dataset; you can get a dataset's schema by running `goldsky dataset get <dataset_name>`. The filter is written like a SQL WHERE clause:
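For example, a source that only ingests logs for one contract past a given block might look like the sketch below (the address is a placeholder, and the surrounding keys follow the same assumed shape as the Quick Start example):

```yaml
sources:
  base_logs:
    type: dataset
    dataset_name: base.raw_logs
    version: 1.0.0
    start_at: earliest
    # Fast scan: the filter is pre-applied at the source, so irrelevant
    # historical blocks are skipped during the backfill
    filter: address = '0x0000000000000000000000000000000000000000' AND block_number > 18000000
```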
Fast scan speeds up backfills only, that is, when processing historical data with `start_at: earliest`. During real-time processing (once the pipeline has caught up to the chain tip), all blocks are processed regardless of the filter.
When you use `filter:` together with `start_at: earliest` (or omit `start_at`), the source becomes bounded and job mode can terminate it cleanly. To make the backfill finite, include an upper bound on `block_number` in the `filter:` expression:
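A sketch of a bounded backfill; where exactly the `job` flag sits in the config is an assumption here (shown at the top level of the pipeline definition), and transforms/sinks are omitted:

```yaml
name: eth-logs-backfill
apiVersion: 3
job: true                # assumption: job-mode flag; the run self-terminates once bounded sources are exhausted
sources:
  eth_logs:
    type: dataset
    dataset_name: ethereum.raw_logs
    version: 1.0.0
    start_at: earliest
    # The upper bound on block_number makes the backfill finite
    filter: block_number <= 20000000
# transforms and sinks omitted for brevity
```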
Set `job: true` and the pipeline will self-terminate once block 20,000,000 is processed. Without a `filter:`, an EVM dataset source is a streaming Kafka source and cannot be used with `job: true`.
`end_block` is not supported on EVM dataset sources; it is silently ignored. To bound an EVM job, put the block range in the `filter:` expression (e.g., `block_number <= 20000000` or `block_number BETWEEN 18000000 AND 20000000`). `end_block` is Solana-only.
Guide: Track Specific Tokens
Use dynamic tables to monitor transfers for tokens you care about, as in the sketch below. The result keeps a streamlined schema with a `value` column; see Dynamic Tables for how to customize this.
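A sketch of the transform, assuming an ERC-20 transfers source named `eth_transfers` and a dynamic table named `watched_tokens` with a `token_address` column (all three names are assumptions, and the transfer column names should be checked against the Curated Data Schemas reference):

```yaml
transforms:
  watched_token_transfers:
    primary_key: id
    sql: |
      -- Keep only transfers whose token contract appears in the dynamic table,
      -- so the watch list can change without redeploying the pipeline
      SELECT
        t.id,
        t.block_number,
        t.transaction_hash,
        t.address AS token_address,   -- token contract emitting the Transfer
        t.sender,
        t.recipient,
        t.value
      FROM eth_transfers t
      JOIN watched_tokens w
        ON lower(t.address) = lower(w.token_address)
```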
Guide: Decode Custom Contract Events
Use `_gs_log_decode` to decode events from any contract with raw logs:
- `_gs_log_decode(abi, topics, data)` - Decode event logs
- `_gs_fetch_abi(url, 'etherscan')` - Fetch ABI from Etherscan
- `_gs_fetch_abi(url, 'raw')` - Fetch raw JSON ABI from URL
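A sketch of a decoding transform; `_gs_log_decode` and `_gs_fetch_abi` are the documented functions, while the source name, the Etherscan URL, and how nested decoded fields are accessed are assumptions:

```yaml
transforms:
  decoded_logs:
    primary_key: id
    sql: |
      -- Decode every log against the contract's ABI; rows that do not match
      -- any event in the ABI decode to NULL and are dropped
      SELECT *
      FROM (
        SELECT
          *,
          _gs_log_decode(
            _gs_fetch_abi('https://api.etherscan.io/api?module=contract&action=getabi&address=<contract-address>', 'etherscan'),
            topics,
            data
          ) AS decoded   -- nested fields (event name, params) live on this struct; see the decoding reference
        FROM eth_logs
      )
      WHERE decoded IS NOT NULL
```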
Guide: Multi-Chain Monitoring
Track the same event across multiple chains:
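A sketch of the pattern: because curated datasets share a schema across chains, the same SQL can union sources from different chains (source names and version are illustrative):

```yaml
sources:
  eth_transfers:
    type: dataset
    dataset_name: ethereum.erc20_transfers
    version: 1.0.0
  base_transfers:
    type: dataset
    dataset_name: base.erc20_transfers
    version: 1.0.0
transforms:
  all_transfers:
    primary_key: id   # make sure the id stays unique across chains for your sink
    sql: |
      -- Identical schemas across chains make the union trivial; tag each row with its chain
      SELECT *, 'ethereum' AS chain FROM eth_transfers
      UNION ALL
      SELECT *, 'base' AS chain FROM base_transfers
```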
Guide: High-Value Transfer Alerts
Send real-time alerts for large transfers:
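A sketch; the threshold, the `value` column semantics (raw, non-decimal-adjusted amount), and the webhook-style sink are assumptions:

```yaml
transforms:
  large_transfers:
    primary_key: id
    sql: |
      -- Pass through only transfers above a fixed threshold
      -- (roughly 1,000 tokens assuming 18 decimals; value is the raw on-chain amount)
      SELECT *
      FROM eth_transfers
      WHERE CAST(value AS DECIMAL(38, 0)) >= 1000000000000000000000
sinks:
  alerts:
    type: webhook          # assumption: a push-style sink; swap in whatever alerting sink you use
    from: large_transfers
    url: https://example.com/alerts
```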
Guide: UniswapV3 Swap Detection with Log Decoding
Track UniswapV3-like swaps on Base for specific pools of interest (a sketch follows the list below):
- Uses the `base.raw_logs` dataset for full event coverage and custom decoding
- Swap event signature (`0xc4207...`) filters for UniswapV3 Swap events
- `_gs_log_decode` decodes the raw log using an inline ABI
- Dynamic table pattern allows adding/removing pools without redeploying
- Extracts all swap parameters: amounts, price, liquidity, and tick
- Factory pattern automatically discovers new pools as they’re created
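A compressed sketch of the pieces above; the dynamic table name, the inline ABI snippet, the topics encoding (assumed here to be a comma-separated string), and the decoded field layout are all assumptions, and the Swap topic hash is abbreviated exactly as in the text:

```yaml
transforms:
  uniswap_v3_swaps:
    primary_key: id
    sql: |
      -- Keep only logs emitted by watched pools whose first topic is the
      -- UniswapV3 Swap event signature (full hash abbreviated as 0xc4207... above),
      -- then decode amounts, price, liquidity, and tick with the inline ABI
      SELECT
        l.id,
        l.block_number,
        l.transaction_hash,
        l.address AS pool_address,
        _gs_log_decode(
          '[{"type":"event","name":"Swap", "...": "..."}]',   -- inline Swap ABI, truncated here
          l.topics,
          l.data
        ) AS decoded
      FROM base_logs l
      JOIN watched_pools p                        -- dynamic table of pool addresses (assumed name)
        ON lower(l.address) = lower(p.pool_address)
      WHERE SPLIT_INDEX(l.topics, ',', 0) = '0xc4207...'   -- assumes topics is a comma-separated string
```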