# Prometheus integration

> Scrape Turbo pipeline metrics into Datadog, Grafana Cloud, PagerDuty, New Relic, or any Prometheus-compatible system

<Note>
  **Enterprise only.** A dedicated Prometheus endpoint is available to enterprise customers who want to route Turbo pipeline metrics into their own observability stack. If you would like access, contact [support@goldsky.com](mailto:support@goldsky.com) or reach out to your account manager.
</Note>

## Overview

If you already run your own observability stack (Datadog, Grafana Cloud, New Relic, etc.), you don't need to build dashboards twice. Every enterprise [Turbo](/turbo-pipelines/introduction) project can be provisioned with a scrape-ready Prometheus endpoint that exposes the same `streamling_*` metrics that power the [health dashboard](/turbo-pipelines/health-dashboard) — letting your existing alerting, dashboarding, and incident-response tools see Turbo pipeline health.

```text theme={null}
Goldsky  ──HTTPS /metrics──▶  your Prometheus-compatible agent  ──▶  Datadog / Grafana / New Relic / PagerDuty / …
```

## Getting access

Contact [support@goldsky.com](mailto:support@goldsky.com) to request the integration. You will receive:

* A project-scoped scrape URL.
* Credentials for scraping (bearer token or basic auth).
* The list of metric names and label conventions available for your project.

All metrics are scoped to your project — you only see metrics for pipelines in the project associated with your credentials.

### Prerequisites

* A Goldsky API key (generated via the Goldsky CLI or in the Dashboard under **Settings → API Keys**).
* Your project must have Prometheus API access enabled — contact your Goldsky account manager.

### Endpoint

| Environment             | URL                                  |
| ----------------------- | ------------------------------------ |
| Production              | `https://prometheus.goldsky.com`     |
| Dev (internal use only) | `https://prometheus.dev.goldsky.com` |

All requests require an `Authorization` header with your Goldsky API key:

```http theme={null}
Authorization: Bearer <your-goldsky-api-key>
```
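As a minimal sketch of the same header in code, the Python snippet below builds (but does not send) an authenticated request using only the standard library. The helper name `build_request` and the `GOLDSKY_API_KEY` environment variable are illustrative assumptions, not part of any Goldsky SDK.

```python
import os
import urllib.parse
import urllib.request

BASE_URL = "https://prometheus.goldsky.com"  # production endpoint from the table above


def build_request(path: str, params: dict, api_key: str) -> urllib.request.Request:
    """Build a GET request for a Prometheus HTTP API path with a Bearer token."""
    query = urllib.parse.urlencode(params, doseq=True)
    url = f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


if __name__ == "__main__":
    # GOLDSKY_API_KEY is a hypothetical variable name; keep the key wherever suits you.
    req = build_request(
        "/api/v1/query",
        {"query": "streamling_input_rows_total"},
        os.environ.get("GOLDSKY_API_KEY", "<your-goldsky-api-key>"),
    )
    print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without credentials.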

## Testing with curl

### Instant query

Run a simple query to verify connectivity:

```bash theme={null}
curl -s 'https://prometheus.goldsky.com/api/v1/query' \
  --data-urlencode 'query=streamling_input_rows_total' \
  -H 'Authorization: Bearer <your-goldsky-api-key>' | jq .
```

Expected response:

```json theme={null}
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "streamling_input_rows_total",
          "namespace": "streamling-prod",
          "...": "..."
        },
        "value": [1714800000, "1234567"]
      }
    ]
  }
}
```
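If you consume this response from a script rather than `jq`, the sketch below shows one way to unpack it in Python. The function name `parse_instant_vector` is hypothetical, and the sample body is the example response above with the elided labels left out.

```python
import json


def parse_instant_vector(body: str) -> list[tuple[dict, float, float]]:
    """Return (labels, unix_timestamp, value) for each series in an instant-vector response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample is [timestamp, "value"]; the value arrives as a string.
    return [(s["metric"], float(s["value"][0]), float(s["value"][1])) for s in data["result"]]


sample = '''{"status": "success", "data": {"resultType": "vector", "result": [
  {"metric": {"__name__": "streamling_input_rows_total", "namespace": "streamling-prod"},
   "value": [1714800000, "1234567"]}]}}'''

for labels, ts, value in parse_instant_vector(sample):
    print(labels["__name__"], ts, value)
```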

### Range query

Query metrics over a time range (last 1 hour, 5-minute steps):

```bash theme={null}
curl -s 'https://prometheus.goldsky.com/api/v1/query_range' \
  --data-urlencode 'query=rate(streamling_input_rows_total[5m])' \
  --data-urlencode "start=$(( $(date +%s) - 3600 ))" \
  --data-urlencode "end=$(date +%s)" \
  --data-urlencode 'step=300' \
  -H 'Authorization: Bearer <your-goldsky-api-key>' | jq .
```
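The same `start`, `end`, and `step` parameters can be computed in code. The sketch below is one way to do it in Python; `range_params` is a hypothetical helper, and it emits RFC 3339 timestamps in UTC, one of the formats the Prometheus HTTP API accepts.

```python
from datetime import datetime, timedelta, timezone


def range_params(query: str, now: datetime, window: timedelta, step_seconds: int) -> dict:
    """Build query_range parameters covering `window` up to `now` (RFC 3339, UTC)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    now_utc = now.astimezone(timezone.utc)
    return {
        "query": query,
        "start": (now_utc - window).strftime(fmt),
        "end": now_utc.strftime(fmt),
        "step": str(step_seconds),
    }


params = range_params(
    "rate(streamling_input_rows_total[5m])",
    datetime.now(timezone.utc),
    timedelta(hours=1),
    300,
)
print(params)
```

With a one-hour window and a 300-second step, each series should come back with up to 13 points: one at `start` and one per step through `end`.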

### List available metrics

List the label names available to your project (to list metric names instead, request `/api/v1/label/__name__/values`):

```bash theme={null}
curl -s 'https://prometheus.goldsky.com/api/v1/labels' \
  -H 'Authorization: Bearer <your-goldsky-api-key>' | jq .
```

Get all series matching a selector:

```bash theme={null}
curl -s 'https://prometheus.goldsky.com/api/v1/series' \
  --data-urlencode 'match[]=streamling_input_rows_total' \
  -H 'Authorization: Bearer <your-goldsky-api-key>' | jq .
```

### Available API endpoints

| Endpoint                      | Description     |
| ----------------------------- | --------------- |
| `/api/v1/query`               | Instant queries |
| `/api/v1/query_range`         | Range queries   |
| `/api/v1/series`              | Series metadata |
| `/api/v1/labels`              | Label names     |
| `/api/v1/label/<name>/values` | Label values    |

## Setting up in Grafana

### Step 1: Add a Prometheus data source

1. In your Grafana instance, go to **Connections → Data sources → Add data source**.
2. Select **Prometheus**.

### Step 2: Configure the connection

Fill in the following fields:

| Field                     | Value                                         |
| ------------------------- | --------------------------------------------- |
| **Name**                  | `Goldsky Prometheus` (or any name you prefer) |
| **Prometheus server URL** | `https://prometheus.goldsky.com`              |

### Step 3: Add an authentication header

Scroll down to the **Custom HTTP Headers** section:

1. Click **Add header**.
2. Set **Header** to: `Authorization`.
3. Set **Value** to: `Bearer <your-goldsky-api-key>`.

<Tip>
  The value field is treated as a secret — it will be masked after saving.
</Tip>

### Step 4: Save & Test

Click **Save & Test**. You should see:

> ✅ Successfully queried the Prometheus API.

If you see a 401 or 403 error, verify your API key is correct and that your project has Prometheus access enabled.

### Step 5: Explore your metrics

1. Go to **Explore** in Grafana.
2. Select your **Goldsky Prometheus** data source.
3. Try a query like:

   ```promql theme={null}
   streamling_input_rows_total
   ```

You can use the metrics browser to discover all available metrics for your project.

## Example PromQL queries

The exact metrics available depend on your pipeline type (Turbo, Edge, Subgraph, Compose).

### Turbo pipelines

**Input row throughput** — rows ingested per second over the last 5 minutes:

```promql theme={null}
rate(streamling_input_rows_total[5m])
```

**Output row throughput** — rows written to sinks per second:

```promql theme={null}
rate(streamling_output_rows_total[5m])
```

**Block lag** — how far behind the chain head (seconds):

```promql theme={null}
streamling_block_lag_max_seconds
```

**Checkpoint epoch duration** — time to complete a checkpoint epoch:

```promql theme={null}
rate(streamling_checkpoint_epoch_duration_milliseconds_sum[5m]) / rate(streamling_checkpoint_epoch_duration_milliseconds_count[5m])
```
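The division works because a histogram's `_sum` and `_count` series are both cumulative: dividing the increase in total duration by the increase in observation count gives the mean duration over the window. The Python sketch below walks through that arithmetic with made-up scrape values; `mean_duration_ms` is a hypothetical helper, not something the endpoint provides.

```python
import math


def mean_duration_ms(sum_start: float, sum_end: float,
                     count_start: float, count_end: float) -> float:
    """Mean observation between two scrapes of a histogram's _sum and _count."""
    observations = count_end - count_start
    if observations <= 0:
        return math.nan  # no checkpoint epochs completed in the window
    return (sum_end - sum_start) / observations


# Made-up values: 10 epochs completed, adding 25,000 ms of total duration.
print(mean_duration_ms(120_000.0, 145_000.0, 480.0, 490.0))
```

When no epochs complete in the window, the PromQL expression divides zero by zero and yields `NaN`, which the sketch mirrors.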

**Failed checkpoints** — checkpoint failures (should be 0):

```promql theme={null}
rate(streamling_checkpoint_epochs_failed_total[5m])
```

### General

**CPU usage** per pod:

```promql theme={null}
rate(container_cpu_usage_seconds_total{namespace=~"turbo-prod|compose-prod"}[5m])
```

**Memory usage** per pod:

```promql theme={null}
container_memory_working_set_bytes{namespace=~"turbo-prod|compose-prod"}
```

## Alerting in Grafana

Once the data source is configured, you can set up alerts in Grafana:

1. Go to **Alerting → Alert rules → New alert rule**.
2. Select your **Goldsky Prometheus** data source.
3. Define your query and threshold condition.
4. Configure a **contact point** (Slack webhook, email, PagerDuty, etc.).
5. Save the alert rule.

Example: alert when input row throughput drops to zero. Set the rule's pending period to 5 minutes so that it fires only after the condition has held for that long:

```promql theme={null}
rate(streamling_input_rows_total[5m]) == 0
```
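To make the "zero for 5 minutes" condition concrete, the sketch below applies the same check in Python to the `[timestamp, "value"]` pairs a range query returns for one series. `throughput_stalled` and the sample values are hypothetical; in Grafana the alert rule itself performs this evaluation.

```python
def throughput_stalled(values: list, now: float, hold_seconds: int = 300) -> bool:
    """True when every sample in the trailing window is zero and at least one sample exists."""
    window = [float(v) for ts, v in values if now - hold_seconds <= float(ts) <= now]
    return len(window) > 0 and all(v == 0.0 for v in window)


# Hypothetical samples at 60-second spacing: throughput falls to zero and stays there.
samples = [[1714799640, "12.5"], [1714799700, "0"], [1714799760, "0"],
           [1714799820, "0"], [1714799880, "0"], [1714799940, "0"], [1714800000, "0"]]
print(throughput_stalled(samples, now=1714800000))
```

An empty series returns `False` here, which mirrors how `== 0` yields no result (rather than zero) when the series is missing altogether; catching that case needs a separate no-data condition on the rule.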

## Troubleshooting

| Issue                         | Solution                                                                                                                                   |
| ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| `401 Unauthorized`            | Check that your API key is correct and includes the `Bearer` prefix.                                                                       |
| `403 Forbidden`               | Your project may not have Prometheus access enabled — contact your account manager.                                                        |
| No metrics returned           | Your project may not have active pipelines emitting metrics. Try `curl .../api/v1/labels` to see what's available.                         |
| Grafana **Save & Test** fails | Ensure the URL has no trailing slash and the `Authorization` header is set correctly under **Custom HTTP Headers** (not under Basic Auth). |

## Where the metrics can go

The endpoint uses the standard Prometheus exposition format, so it works with anything that speaks Prometheus. Common destinations fall into three categories:

### Observability platforms (push-based)

An agent that you run scrapes the endpoint and pushes the metrics to the platform, which provides dashboards, monitors, and alerting on top.

* **Datadog** — use the [OpenMetrics / Prometheus check](https://docs.datadoghq.com/integrations/openmetrics/) on the Datadog Agent. Metrics appear as Datadog metrics and can be used in monitors, dashboards, and SLOs.
* **New Relic** — forward via the [Prometheus OpenMetrics integration](https://docs.newrelic.com/docs/infrastructure/prometheus-integrations/install-configure-openmetrics/).
* **Splunk Observability Cloud / SignalFx** — use the Splunk OpenTelemetry Collector with the Prometheus receiver.
* **AWS CloudWatch / Managed Prometheus** — use the [AWS Distro for OpenTelemetry](https://aws-otel.github.io/) to send metrics to Amazon Managed Service for Prometheus or CloudWatch.

### Prometheus-compatible storage

If you already run Prometheus or a long-term-storage backend, point it at the endpoint as another scrape target.

* **Grafana Cloud** — scrape via [Grafana Alloy](https://grafana.com/docs/alloy/latest/) or a Grafana Agent.
* **Self-hosted Prometheus** — add the endpoint as a [scrape target](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) in `prometheus.yml`.
* **VictoriaMetrics, Thanos, Cortex, Mimir** — any Prometheus-compatible store can scrape the endpoint directly.

### Alerting destinations (indirect)

Paging tools like **PagerDuty**, **Opsgenie**, or **VictorOps** don't scrape metrics directly. Instead, define alert rules in whatever system is scraping the endpoint (Datadog monitors, Prometheus Alertmanager, Grafana alerts) and route those alerts into your paging tool via its native integration.

## Example scrape config

A minimal Prometheus scrape configuration looks like this:

```yaml theme={null}
scrape_configs:
  - job_name: goldsky-turbo
    scrape_interval: 30s
    metrics_path: /metrics
    scheme: https
    static_configs:
      - targets:
          - metrics.goldsky.com # provided by support
    authorization:
      type: Bearer
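      # Prometheus does not expand environment variables in this file,
      # so read the token from a file instead (the path is an example).
      credentials_file: /etc/prometheus/goldsky_metrics_token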
```

Exact host, path, and auth method will be confirmed when the endpoint is provisioned for your project.
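For a quick look at what a scrape returns without standing up Prometheus, the sketch below parses text in the Prometheus exposition format with Python's standard library. It is a simplified parser under stated assumptions: it skips comment lines, ignores timestamps, and does not unescape label values. The payload is made up for illustration and is not real Goldsky output.

```python
import re

SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'  # metric name
    r'(?:\{(?P<labels>.*)\})?'              # optional {label="value",...}
    r'\s+(?P<value>\S+)'                    # sample value
    r'(?:\s+-?\d+)?$'                       # optional timestamp in milliseconds
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text: str) -> list[tuple[str, dict, float]]:
    """Return (metric_name, labels, value) for each sample line in the text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified pattern cannot read
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# Made-up payload; real label sets and values will differ.
payload = """\
# TYPE streamling_input_rows_total counter
streamling_input_rows_total{project_id="demo",topology_node_type="source"} 1234567
streamling_block_lag_max_seconds 12.5
"""

for name, labels, value in parse_exposition(payload):
    print(name, labels, value)
```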

## Available metrics

The endpoint exposes the same `streamling_*` metrics used in the [health dashboard](/turbo-pipelines/health-dashboard). The metrics most commonly used for alerting are:

| Metric                                                     | Type      | What it tells you                                                                                 |
| ---------------------------------------------------------- | --------- | ------------------------------------------------------------------------------------------------- |
| `streamling_block_lag_max_seconds`                         | Gauge     | End-to-end lag behind the chain tip, in seconds.                                                  |
| `streamling_kafka_consumer_messages_lag`                   | Gauge     | Kafka consumer lag per partition for Kafka-backed sources.                                        |
| `streamling_output_rows_total`                             | Counter   | Cumulative rows emitted by each node (filter by `topology_node_type="sink"` for sink throughput). |
| `streamling_input_rows_total`                              | Counter   | Cumulative rows consumed by each node.                                                            |
| `streamling_elapsed_compute_milliseconds_bucket`           | Histogram | Per-batch compute time, in milliseconds.                                                          |
| `streamling_checkpoint_epochs_failed_total`                | Counter   | Checkpoint epochs that failed to finalize. Any non-zero value is critical.                        |
| `streamling_checkpoint_epochs_succeeded_total`             | Counter   | Checkpoint epochs that finalized successfully.                                                    |
| `streamling_checkpoint_sink_flush_milliseconds_bucket`     | Histogram | Per-sink flush latency on checkpoint markers.                                                     |
| `streamling_checkpoint_epoch_duration_milliseconds_bucket` | Histogram | End-to-end duration between consecutive checkpoint epochs.                                        |

Every metric carries these labels:

* `service_instance_id` — unique pipeline identifier; use this to scope a query to a single pipeline.
* `project_id` — your Goldsky project identifier.
* `id` — the node's `reference_name` (source, transform, or sink name from your pipeline YAML).
* `topology_node_type` — `source`, `transform`, or `sink`.
* `operator_type` — component implementation (e.g. `kafka`, `postgres`, `clickhouse`, `sql`).
* `image_tag` — streamling engine version.
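These labels are what you filter on in PromQL. The sketch below shows one way to assemble a selector from them in Python; `selector` is a hypothetical helper and `my-pipeline` is a placeholder value, not a real pipeline identifier.

```python
def selector(metric: str, **labels: str) -> str:
    """Render metric{k="v",...}, escaping label values for PromQL string literals."""
    if not labels:
        return metric

    def esc(value: str) -> str:
        return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    pairs = ",".join(f'{key}="{esc(value)}"' for key, value in sorted(labels.items()))
    return f"{metric}{{{pairs}}}"


# Sink throughput for a single pipeline, using placeholder label values.
query = "rate(" + selector(
    "streamling_output_rows_total",
    service_instance_id="my-pipeline",
    topology_node_type="sink",
) + "[5m])"
print(query)
```

Sorting the label names keeps the generated query stable, which helps when queries are stored in version control or compared in tests.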

Solana sources additionally emit `streamling_solana_blocks_per_sec`, `streamling_solana_buffer_len`, `streamling_solana_next_slot_to_serve`, and `streamling_solana_fetch_duration_bucket`. Kafka nodes emit `streamling_kafka_consumer_msg_decode_latency_bucket`, `streamling_kafka_producer_batch_send_latency_bucket`, `streamling_kafka_producer_msg_encode_latency_bucket`, and `streamling_kafka_consumer_row_kind_count_total` (with a `kind` label of `Insert`, `Update`, or `Delete`).

For concrete query examples, see the [starter PromQL queries](/turbo-pipelines/custom-alerts#starter-promql-queries) on the custom alerts page.

## Next steps

* Browse the available metrics inside the Goldsky-hosted Grafana workspace first — see the [health dashboard guide](/turbo-pipelines/health-dashboard).
* If you don't need to bring your own stack, use [custom alerts](/turbo-pipelines/custom-alerts) in the Goldsky Grafana workspace for Slack or email notifications without additional setup.
