Choosing a path

You have two options to create a Goldsky Mirror pipeline:

  1. Pipeline Builder: a guided web experience in the dashboard
  2. CLI: either interactively or by providing a pipeline configuration file

Creating Mirror pipelines with the Pipeline Builder

  1. Go to the Pipelines page on the dashboard
  2. Click on the New pipeline button
  3. Select your data source: you can choose between a subgraph source and a dataset source

Let’s look at both options with an example implementation.

Pipeline Builder: Subgraph source

When you select Subgraph as a source in the wizard, you are presented with a dropdown from which you can choose one of your existing subgraphs or a community subgraph. The dropdown has chain filters and a search bar to help you find your subgraph faster. In this example we filter on the Base chain, as we want to create a pipeline for the friendtech-base subgraph we deployed in the no-code subgraph tutorial.

In the next step you are presented with an overview of the selected subgraph, including example data and entities stored over time.

Click on the Create Pipeline button on the top right corner to proceed to the next step.

On the next screen there is a 3-step process to deploy this subgraph-powered pipeline:

  1. Select data source
  • Name: use a unique name for your pipeline. Pipeline names may contain only lowercase letters, numbers, and hyphens. In our example we’ll call it base-friendtech-pipeline.
  • Description: (optional) add some context about what this pipeline is used for.
  • Resource Size: in many situations, the default (S) resource size is sufficient. For more demanding tasks, larger resource sizes can significantly speed up job execution. In this case, as we are ingesting data from a regular subgraph, we are choosing S.
  • Sources: select the specific entities you are interested in ingesting, or all of them.

Once ready, click the Next button.

  2. Select data sink

In the next step, you configure the destination of your data as a sink. If you have already configured sinks, you can choose one from the list (for more information, see Mirror Secrets). Alternatively, you can configure a new sink as part of this process. In our example, we’ll use an existing Neon DB secret configured under the name JDBC_JAVI_NEON.

  3. Confirm and deploy

At this point, we are ready to deploy our pipeline! You have two options:

  • Click on the “Deploy Pipeline” button
  • Use the pipeline YAML definition on the right side of the screen and deploy it yourself using the CLI. This definition is built up step by step as you fill in each form.

To finish off this example, we’ll use the Deploy Pipeline button. As a result, we are redirected to the pipeline page, where we can see that it is live and started.

Congrats on deploying your Subgraph-powered pipeline!

Pipeline Builder: Direct Indexing source

When you select Direct Indexing as a source, the first step is to choose the dataset you want to ingest. Click here for a full reference on the different datasets available. In this example, we are going to select Decoded Logs.

After you click on your preferred dataset you are presented with an overview of the data within it. In this view you can select the chain you are interested in. For this example, we are choosing Base.

On the next screen there is a 3-step process to deploy your pipeline:

  1. Select data source
  • Name: use a unique name for your pipeline. Pipeline names may contain only lowercase letters, numbers, and hyphens. In our example we’ll call it base-decoded-logs.
  • Description: (optional) add some context about what this pipeline is used for.
  • Resource Size: in many situations, the default (S) resource size is sufficient. For more demanding tasks, larger resource sizes can significantly speed up job execution. In this case we are choosing S, but if you want to speed up the backfill process you may want to opt for a larger resource size.
  • Sources: here you can choose whether to backfill data starting from the earliest block or to ingest only edge data starting from the latest block. You can also filter by contract address, as we do in the ERC-Transfers guides.
  2. Select data sink

In the next step, you configure the destination of your data as a sink. If you have already configured sinks, you can choose one from the list (for more information, see Mirror Secrets). Alternatively, you can configure a new sink as part of this process. In our example, we’ll use an existing Neon DB secret configured under the name JDBC_JAVI_NEON.

  3. Confirm and deploy

At this point, we are ready to deploy our pipeline! You have two options:

  • Click on the “Deploy Pipeline” button
  • Use the pipeline YAML definition on the right side of the screen and deploy it yourself using the CLI. This definition is built up step by step as you fill in each form.
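
For orientation, the definition generated for this example might resemble the sketch below. It is an illustration under stated assumptions, not the exact output of the Pipeline Builder: the reference names (base_decoded_logs, postgres_sink), the dataset_name and version values, the start_at key, and the schema and table names are hypothetical and should be checked against the definition shown in your dashboard. Only the pipeline name and the JDBC_JAVI_NEON secret come from this walkthrough.

```yaml
# Hypothetical sketch of a dataset-powered pipeline definition.
name: base-decoded-logs
apiVersion: 3
sources:
  base_decoded_logs:                  # reference name, chosen for illustration
    type: dataset
    dataset_name: base.decoded_logs   # assumed dataset identifier
    version: 1.0.0                    # assumed dataset version
    start_at: earliest                # assumed key: backfill from the earliest block
transforms: {}                        # no transforms in this example
sinks:
  postgres_sink:                      # reference name, chosen for illustration
    type: postgres                    # Neon is a Postgres-compatible database
    from: base_decoded_logs           # read directly from the source above
    schema: public                    # hypothetical target schema
    table: base_decoded_logs          # hypothetical target table
    secret_name: JDBC_JAVI_NEON       # secret used in this example
```

Because no transform is defined here, the sink reads straight from the source; a larger backfill would be the situation in which a bigger resource size pays off.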

To finish off this example, we’ll use the Deploy Pipeline button. As a result, we are redirected to the pipeline page, where we can see that it is live and started.

Congrats on deploying your pipeline!

Creating Mirror pipelines with the CLI

The CLI allows for more customization than the Pipeline Builder, which uses defaults for most parameters. There are two ways in which you can create pipelines with the CLI:

  • Interactive
  • Non-Interactive

Interactive: Guided CLI experience

Just like with the Pipeline Builder, this is a simple and guided way to create pipelines via the CLI. Run goldsky pipeline create <your-pipeline-name> in your terminal and follow the prompts.

In short, the CLI guides you through the following process:

  1. Select one or more source(s)
  2. Depending on the selected source(s), define transforms
  3. Configure one or more sink(s)

Non-interactive: Pipeline configuration

This is an advanced way to create a new pipeline. Instead of using the guided CLI experience (see above), you create the pipeline configuration on your own.

Writing a configuration file

A pipeline configuration is a YAML structure with the following top-level properties:

name: <your-pipeline-name>
apiVersion: 3
sources: {}
transforms: {}
sinks: {}

Both sources and sinks are required with a minimum of one entry each. transforms is optional and an empty object ({}) can be used if no transforms are needed.

Click through here for the full reference documentation on Mirror pipeline configuration files, or to the source and sink-specific documentation. Note that to create a pipeline from a definition, you need to have a database secret already configured in your Goldsky project.

As an example, consider a pipeline configuration that uses the Ethereum Decoded Logs dataset as its source, applies a transform to select specific data fields, and sinks that data into a Postgres database whose connection details are stored in the A_POSTGRESQL_SECRET secret.
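
The configuration described above could look roughly like the following sketch. Treat it as an assumption-based illustration rather than a verified definition: the pipeline name, the reference names (ethereum_decoded_logs, selected_logs, postgres_sink), the dataset_name and version values, the selected columns, and the schema and table names are hypothetical. Check the exact property names against the Mirror configuration reference before applying it.

```yaml
# Hypothetical sketch: Ethereum Decoded Logs -> SQL transform -> Postgres.
name: ethereum-decoded-logs           # hypothetical pipeline name
apiVersion: 3
sources:
  ethereum_decoded_logs:              # reference name used by the transform below
    type: dataset
    dataset_name: ethereum.decoded_logs   # assumed dataset identifier
    version: 1.0.0                        # assumed dataset version
transforms:
  selected_logs:                      # reference name used by the sink below
    primary_key: id                   # assumed unique key of the output rows
    sql: |
      -- Keep only the fields of interest (column names are assumptions)
      SELECT
        id,
        address,
        event_signature,
        event_params
      FROM ethereum_decoded_logs
sinks:
  postgres_sink:
    type: postgres
    from: selected_logs               # write the transform output
    schema: public                    # hypothetical target schema
    table: eth_logs                   # hypothetical target table
    secret_name: A_POSTGRESQL_SECRET  # secret holding the connection details
```

Each stage refers to the previous one by its reference name: the transform selects from the source, and the sink reads from the transform.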

Run goldsky pipeline apply <your-pipeline-config-file-path> in your terminal to create a pipeline.

Once your pipeline is created, run goldsky pipeline start <your-pipeline-name> to start your pipeline.

Monitor a pipeline

When you create a new pipeline, the CLI automatically starts to monitor the status and outputs it in a table format.

If you want to monitor an existing pipeline at a later time, use the goldsky pipeline monitor <your-pipeline-name> CLI command. It refreshes every ten seconds and gives you insights into how your pipeline performs.