ERC-20 Transfers
Create a table containing ERC-20 Transfers for several or all token contracts
ERC-20 tokens provide a standardized format for fungible digital assets within EVM ecosystems. The process of transferring ERC-20 tokens into a database is fundamental, unlocking opportunities for data analysis, tracking, and the development of innovative solutions.
This guide is part of a series of tutorials on how you can stream transfer data into your datawarehouse using Mirror pipelines. Here we will be focusing on ERC-20 Transfers, visit the following guides for other types of Transfers:
What you’ll need
- A Goldky account and the CLI installed
- A basic understanding of the Mirror product
- A destination sink to write your data to. In this example, we will use the PostgreSQL Sink
Introduction
In order to stream all the ERC-20 Transfers of a chain there are two potential methods available:
- Use the readily available ERC-20 dataset for the chain you are interested in: this is the easiest and quickest method to get you streaming token transfers into your sink of choice with minimum code.
- Build the ERC-20 Transfers pipeline from scratch using raw or decoded logs: this method takes more code and time to implement but it’s a great way to learn about how you can use decoding functions in case you want to build more customized pipelines.
Let’s explore both method below with more detail:
Using the ERC-20 Transfers Source Dataset
Every EVM chain has its own ERC-20 dataset available for you to use as source in your pipelines. You can check this by running the goldsky dataset list
command and finding the EVM of your choice.
For this example, let’s use apex
chain and create a simple pipeline definition using its ERC-20 dataset that writes the data into a PostgreSQL instance:
If you copy and use this configuration file, make sure to update:
- Your
secretName
. If you already created a secret, you can find it via the CLI commandgoldsky secret list
. - The schema and table you want the data written to, by default it writes to
public.apex_erc20_transfers
.
You can start above pipeline by running:
Or
That’s it! You should soon start seeing ERC-20 token transfers in your database.
Building ERC-20 Transfers from scratch using logs
In the previous method we just explored, the ERC-20 datasets that we used as source to the pipeline encapsulates all the decoding logic that’s explained in this section. Read on if you are interested in learning how it’s implemented in case you want to consider extending or modifying this logic yourself.
There are two ways that we can go about building this token transfers pipeline from scratch:
- Use the
raw_logs
Direct Indexing dataset for that chain in combination with Decoding Transform Functions using the ABI of a specific ERC-20 Contract. - Use the
decoded_logs
Direct Indexing dataset for that chain in which the decoding process has already been done by Goldsky. This is only available for certain chains as you can check in this list.
We’ll primarily focus on the first decoding method using raw_logs
and decoding functions as it’s the default and most used way of decoding; we’ll also present an example using decoded_logs
and highlight the differences between the two.
Building ERC-20 Tranfers using Decoding Transform Functions
In this example, we will stream all the Transfer
events of all the ERC-20 Tokens for the Scroll chain. To that end, we will dinamically fetch the ABI of the USDT token from the Scrollscan API (available here)
and use it to identify all the same events for the tokens in the chain. We have decided to use the ABI of USDT token contract for this example but any other ERC-20 compliant token would also work.
We need to differentiate ERC-20 token transfers from ERC-721 (NFT) transfers since they have the same event signature in decoded data: Transfer(address,address,uint256)
.
However, if we look closely at their event definitions we can appreciate that the number of topics differ:
- ERC-20:
event Transfer(address indexed _from, address indexed _to, uint256 _value)
- ERC-721:
event Transfer(address indexed _from, address indexed _to, uint256 indexed _tokenId)
ERC-20 Transfer events have three topics (one topic for event signature + 2 topics for the indexed params). NFTs on the other hand have four topics as they have one more indexed param in the event signature. We will use this as a filter in our pipeline transform to only transfer ERC-20 Transfer events.
Let’s now see all these concepts applied in an example pipeline definition:
Pipeline Definition
If you copy and use this configuration file, make sure to update:
- Your
secretName
. If you already created a secret, you can find it via the CLI commandgoldsky secret list
. - The schema and table you want the data written to, by default it writes to
mirror.erc20_transfers
.
There are three transforms in this pipeline definition which we’ll explain how they work:
As explained in the Decoding Contract Events guide we first make use of the _gs_fetch_abi
function to get the ABI from Scrollscan and pass it as first argument
to the function _gs_log_decode
to decode its topics and data. We store the result in a decoded
ROW which we unnest on the next transform.
We also limit the decoding to the relevent events using the topic filter and SPLIT_INDEX
to only include ERC-20 transfers.
topics LIKE '0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef%'
:topics
is a comma separated string. Each value in the string is a hash. The first is the hash of the full event_signature (including arguments), in our caseTransfer(address,address,uint256)
for ERC-20, which is hashed to0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef
. We useLIKE
to only consider the first signature, with a%
at the end, which acts as a wildcard.SPLIT_INDEX(topics, ',', 3) IS NULL
: as mentioned in the introduction, ERC-20 transfers share the sameevent_signature
as ERC-721 transfers. The difference between them is the number of topics associated with the event. ERC-721 transfers have four topics, and ERC-20 transfers have three.
In this second transform, we take the event_params
and event_signature
from the result of the decoding. We then filter the query on:
decoded IS NOT NULL
: to leave out potential null results from the decoderdecoded.event_signature = 'Transfer'
: the decoder will output the event name as event_signature, excluding its arguments. We use it to filter only for Transfer events.
In this last transform we are essentially selecting all the Transfer information we are interested in having in our database.
We’ve included a number of columns that you may or may not need, the main columns needed for most purposes are: id
, address
(if you are syncing multiple contract addresses), sender
, recipient
, token_id
, and value
.
id
: This is the Goldsky providedid
, it is a string composed of the dataset name, block hash, and log index, which is unique per event, here’s an example:log_0x60eaf5a2ab37c73cf1f3bbd32fc17f2709953192b530d75aadc521111f476d6c_18
address AS token_id
: We use thelower
function here to lower-case the address to make using this data simpler downstream, we also rename the column totoken_id
to make it more explicit.lower(event_params[1]) AS sender
: Here we continue to lower-case values for consistency. In this case we’re using the first element of theevent_params
array (using a 1-based index), and renaming it tosender
. Each event parameter maps to an argument to theevent_signature
.lower(event_params[2]) AS recipient
: Like the previous column, we’re pulling the second element in theevent_params
array and renaming it torecipient
.lower(event_params[3]) AS value
: We’re pulling the third element in theevent_params
array and renaming it tovalue
to represent the amount of the token_id sent in the transfer.
Lastly, we are also adding more block metadata to the query to add context to each transaction:
It’s worth mentioning that in this example we are interested in all the ERC-20 Transfer events but if you would like to filter for specific contract addresses you could simply add a WHERE
filter to this query with address you are interested in, like: WHERE address IN ('0xBC4CA0EdA7647A8aB7C2061c2E118A18a936f13D', '0xdac17f958d2ee523a2206206994597c13d831ec7')
Deploying the pipeline
Our last step is to deploy this pipeline and start sinking ERC-20 transfer data into our database. Assuming we are using the same file name for the pipeline configuration as in this example, we can use the CLI pipeline create command like this:
goldsky pipeline create scroll-erc20-transfers --definition-path scroll-erc20-transfers.yaml
After some time, you should see the pipeline start streaming Transfer data into your sink.
Remember that you can always speed up the streaming process by updating the resourceSize of the pipeline
Here’s an example transfer record from our sink:
id | token_id | sender | recipient | value | event_name | block_number | block_hash | log_index | transaction_hash | transaction_index |
---|---|---|---|---|---|---|---|---|---|---|
log_0x666622ad5c04eb5a335364d9268e24c64d67d005949570061d6c150271b0da12_2 | 0x5300000000000000000000000000000000000004 | 0xefeb222f8046aaa032c56290416c3192111c0085 | 0x8c5c4595df2b398a16aa39105b07518466db1e5e | 22000000000000006 | Transfer | 5136 | 0x666622ad5c04eb5a335364d9268e24c64d67d005949570061d6c150271b0da12 | 2 | 0x63097d8bd16e34caacfa812d7b608c29eb9dd261f1b334aa4cfc31a2dab2f271 | 0 |
We can find this transaction in Scrollscan. We see that it corresponds to the second internal transfers of Wrapped ETH (WETH):
This concludes our successful deployment of a Mirror pipeline streaming ERC-20 Tokens from Scroll chain into our database using inline decoders. Congrats! 🎉
ERC-20 Transfers using decoded datasets
As explained in the Introduction, Goldsky provides decoded datasets for Raw Logs and Raw Traces for a number of different chains. You can check this list to see if the chain you are interested in has these decoded datasets. In these cases, there is no need for us to run Decoding Transform Functions as the dataset itself will already contain the event signature and event params decoded.
Click on the button below to see an example pipeline definition for streaming ERC-20 tokens on the Ethereum chain using the decoded_logs
dataset.
You can appreciate that it’s pretty similar to the inline decoding pipeline method but here we simply create a transform which does the filtering based on the raw_log.topics
just as we did on the previous method.
Assuming we are using the same filename for the pipeline configuration as in this example and that you have added your own secret we can deploy this pipeline with the CLI pipeline create command:
goldsky pipeline create ethereum-erc20-transfers --definition-path ethereum-decoded-logs-erc20-transfers.yaml
Conclusion
In this guide, we have learnt how Mirror simplifies streaming ERC-20 Transfer events into your database.
We have first looked into the easy way of achieving this, simply by making use of the readily available ERC-20 dataset of the EVM chaina and using its as the source to our pipeline.
Next, we have deep dived into the standard decoding method using Decoding Transform Functions, implementing an example on Scroll chain. We have also looked into an example implementation using the decoded_logs dataset for Ethereum. Both are great decoding methods and depending on your use case and dataset availability you might prefer one over the other.
With Mirror, developers gain flexibility and efficiency in integrating blockchain data, opening up new possibilities for applications and insights. Experience the transformative power of Mirror today and redefine your approach to blockchain data integration.
Can't find what you're looking for? Reach out to us at support@goldsky.com for help.
Was this page helpful?