Overview

The TypeScript transform lets you execute custom TypeScript (or JavaScript) code on each record in your pipeline. This is useful for:
  • Complex data transformations not supported by SQL
  • Custom business logic with type safety
  • Data parsing and formatting
  • Conditional transformations based on complex rules
  • Expanding one input row into many output rows, or filtering rows out
Code runs in a sandboxed WebAssembly environment. TypeScript is transpiled to ES2020 JavaScript (via SWC) when the pipeline is built, then executed inside a QuickJS interpreter shipped as WebAssembly (Extism’s js-pdk).
TypeScript transforms are significantly slower than SQL transforms. Use SQL to filter and project first, and only pass the minimum data required into a TypeScript transform.
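For example, do the cheap narrowing in SQL and hand only the reduced rows to the script. The sketch below is illustrative: it assumes an SQL transform is declared with a type: sql and a sql field (check the SQL transform docs for the exact shape), and the table and column names are hypothetical.

```yaml
transforms:
  # SQL first: filter and project down to just the columns the script needs
  big_transfers:
    type: sql          # assumed SQL transform shape; see the SQL transform docs
    sql: SELECT id, value FROM transfers WHERE value IS NOT NULL
    primary_key: id
  # TypeScript second: runs only on the reduced stream
  label_transfers:
    type: script
    from: big_transfers
    language: typescript
    primary_key: id
    script: |
      function invoke(data) {
        return data;
      }
```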

Configuration

transforms:
  my_script:
    type: script
    from: <source-or-transform>
    language: typescript
    primary_key: <column-name>
    script: |
      function invoke(data) {
        // Your TypeScript code here
        return data;
      }

Parameters

  • type (string, required): Must be script
  • from (string, required): The source or transform to read data from
  • language (string, required): One of typescript, ts, javascript, or js. TypeScript values are transpiled to JavaScript at pipeline build time; plain JavaScript is passed through unchanged.
  • primary_key (string, required): The column that uniquely identifies each row
  • schema (object, optional): Mapping of output field name to Arrow type. Required whenever invoke returns a shape different from the input — any field you return that isn’t declared here will be dropped. If omitted, the output schema is inherited from the input. Supported types: string (alias for utf8), int8, int16, int32, int64, uint8, uint16, uint32, uint64, float16, float32, float64, boolean, binary, date32, date64, timestamp, time32, time64, duration, interval, null.
  • script (string, required): Your TypeScript code. Must define a top-level invoke(data) function that receives a single record and returns one of: the transformed record object, null to filter the record out, or an array of record objects to expand one input row into many output rows (empty array emits nothing).
  • parallelism (integer, default 4): Number of sandboxed script instances that process rows in parallel. Higher values improve throughput on CPU-bound scripts at the cost of more memory. Use 1 if your script relies on rows being processed in order.
  • batch_size (integer, default 0): Minimum number of rows to accumulate before invoking the script. Smaller upstream batches are combined until this threshold is reached, which reduces per-call overhead on high-volume streams with tiny batches. 0 disables accumulation and processes each batch immediately.

Script structure

Your script must define a top-level invoke function that:
  • Accepts a single data parameter (a plain JS object representing one row)
  • Returns one of:
    • A record object — the transformed row
    • null — filter this row out of the output
    • An array of record objects — expand this input row into many output rows (return [] to emit nothing, include null entries in the array to skip specific rows)
  • Can return a different shape than the input when using the schema configuration
The _gs_op column (insert/update/delete marker) is automatically copied from the input row to each output row, so your script does not need to set it.
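The three return shapes can be sketched in one function (a minimal illustration; the Row shape and field names are hypothetical):

```typescript
interface Row {
  id: string;
  tags: string[];
}

function invoke(data: Row): Row | Row[] | null {
  // Filter: drop rows with no tags
  if (data.tags.length === 0) {
    return null;
  }
  // Expand: emit one output row per tag
  if (data.tags.length > 1) {
    return data.tags.map((tag, i) => ({ id: `${data.id}-${i}`, tags: [tag] }));
  }
  // Transform: pass the single-tag row through unchanged
  return data;
}
```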

Basic example

Declare a schema whenever invoke adds new fields. Without one, the output schema is inherited from the input and any new keys you return are dropped.
transforms:
  add_timestamp:
    type: script
    from: source
    language: typescript
    primary_key: id
    schema:
      id: string
      processed_at: int64
      processed: boolean
    script: |
      function invoke(data) {
        data.processed_at = Date.now();
        data.processed = true;
        return data;
      }

Filtering records

Return null to filter out records that don’t match your criteria:
transforms:
  high_value_only:
    type: script
    from: token_balances
    language: typescript
    primary_key: id
    schema:
      id: string
      amount: float64
    script: |
      function invoke(data) {
        // Filter out records with amount <= 1
        if (data.amount <= 1) {
          return null;
        }
        return { id: data.id, amount: data.amount };
      }

Custom output schema

If you omit the schema field, the output schema is inherited from the input. Any new fields you add in invoke will be dropped because they aren’t in the schema. Declare a schema whenever your output differs from the input:
transforms:
  reshape_data:
    type: script
    from: transfers
    language: typescript
    primary_key: transfer_id
    schema:
      transfer_id: string
      sender: string
      receiver: string
      value_eth: float64
      timestamp: string
    script: |
      function invoke(data) {
        return {
          transfer_id: data.id,
          sender: data.from_address,
          receiver: data.to_address,
          value_eth: Number(data.value) / 1e18,
          timestamp: new Date(data.block_timestamp * 1000).toISOString()
        };
      }

Input/output format

The data parameter is a plain JavaScript object with your record’s fields. Values use native JS types — strings stay strings, integers/floats become numbers, booleans stay booleans, list columns become arrays, struct columns become nested objects.
// Input object example
{
  "id": "abc123",
  "address": "0x742d35cc6634c0532925a3b844bc9e7595f0beb",
  "value": "1000000000000000000",
  "block_number": 12345678
}
Return a modified object:
{
  "id": "abc123",
  "address": "0x742d35cc6634c0532925a3b844bc9e7595f0beb",
  "value": "1000000000000000000",
  "block_number": 12345678,
  "value_eth": "1.0",  // Added field
  "is_large": true      // Added field
}

Expanding one row into many

Return an array of objects from invoke to emit multiple output rows for a single input row. This is useful for unpacking nested arrays or cross-joining with a lookup list. Returning [] drops the row entirely; null entries inside the array are skipped.
transforms:
  expand_items:
    type: script
    from: orders
    language: typescript
    primary_key: item_id
    schema:
      item_id: string
      order_id: string
      sku: string
    script: |
      interface Order {
        id: string;
        items: Array<{ sku: string }>;
      }

      function invoke(data: Order) {
        return data.items.map((item, i) => ({
          item_id: `${data.id}-${i}`,
          order_id: data.id,
          sku: item.sku,
        }));
      }

Examples

Example: Type-safe value formatting

Convert wei to ETH with TypeScript type safety:
transforms:
  format_values:
    type: script
    from: ethereum_transfers
    language: typescript
    primary_key: id
    schema:
      id: string
      from_address: string
      to_address: string
      value: string
      block_number: int64
      value_eth: string
      size_label: string
    script: |
      interface Transfer {
        id: string;
        from_address: string;
        to_address: string;
        value: string;
        block_number: number;
      }

      type SizeLabel = "whale" | "large" | "normal";

      function invoke(data: Transfer): Transfer & {
        value_eth: string;
        size_label: SizeLabel;
      } {
        // Convert wei to ETH
        const valueWei = BigInt(data.value);
        const valueEth = Number(valueWei) / 1e18;

        // Determine size label
        let size_label: SizeLabel;
        if (valueEth > 1000) {
          size_label = "whale";
        } else if (valueEth > 10) {
          size_label = "large";
        } else {
          size_label = "normal";
        }

        return {
          ...data,
          value_eth: valueEth.toFixed(6),
          size_label
        };
      }

Example: Parse JSON fields

Extract data from JSON strings with type safety:
transforms:
  parse_metadata:
    type: script
    from: nft_transfers
    language: typescript
    primary_key: id
    schema:
      id: string
      token_id: string
      metadata: string
      nft_name: string
      nft_description: string
      image_url: string
      attributes_count: int64
      parse_error: boolean
    script: |
      interface NFTMetadata {
        name?: string;
        description?: string;
        image?: string;
        attributes?: Array<{ trait_type: string; value: string }>;
      }

      interface NFTTransfer {
        id: string;
        token_id: string;
        metadata?: string;
      }

      function invoke(data: NFTTransfer): NFTTransfer & {
        nft_name: string;
        nft_description: string;
        image_url: string;
        attributes_count: number;
        parse_error?: boolean;
      } {
        let nft_name = "Unknown";
        let nft_description = "";
        let image_url = "";
        let attributes_count = 0;
        let parse_error: boolean | undefined;

        if (data.metadata) {
          try {
            const meta: NFTMetadata = JSON.parse(data.metadata);
            nft_name = meta.name || "Unknown";
            nft_description = meta.description || "";
            image_url = meta.image || "";
            attributes_count = meta.attributes?.length || 0;
          } catch (e) {
            parse_error = true;
          }
        }

        return {
          ...data,
          nft_name,
          nft_description,
          image_url,
          attributes_count,
          ...(parse_error && { parse_error })
        };
      }

Example: Complex conditional logic

Apply different transformations based on conditions:
transforms:
  categorize_transfers:
    type: script
    from: transfers
    language: typescript
    primary_key: id
    schema:
      id: string
      from_address: string
      to_address: string
      value: string
      category: string
      exchange_from: boolean
      exchange_to: boolean
      risk_score: float64
    script: |
      interface Transfer {
        id: string;
        from_address: string;
        to_address: string;
        value: string;
      }

      type TransferCategory = "exchange_withdrawal" | "exchange_deposit" | "whale_transfer" | "normal_transfer";

      function invoke(data: Transfer): Transfer & {
        category: TransferCategory;
        exchange_from?: boolean;
        exchange_to?: boolean;
        risk_score: number;
      } {
        const value = BigInt(data.value);
        const from = data.from_address.toLowerCase();
        const to = data.to_address.toLowerCase();

        // Known exchange addresses
        const exchanges: string[] = [
          "0x3f5ce5fbfe3e9af3971dd833d26ba9b5c936f0be",
          "0xd551234ae421e3bcba99a0da6d736074f22192ff"
        ];

        let category: TransferCategory;
        let exchange_from: boolean | undefined;
        let exchange_to: boolean | undefined;
        let risk_score = 0;

        // Categorize transfer
        if (exchanges.includes(from)) {
          category = "exchange_withdrawal";
          exchange_from = true;
          risk_score += 0.3;
        } else if (exchanges.includes(to)) {
          category = "exchange_deposit";
          exchange_to = true;
          risk_score += 0.3;
        } else if (value > BigInt("1000000000000000000000")) {
          category = "whale_transfer";
          risk_score += 0.5;
        } else {
          category = "normal_transfer";
        }

        return {
          ...data,
          category,
          ...(exchange_from && { exchange_from }),
          ...(exchange_to && { exchange_to }),
          risk_score
        };
      }

Example: String manipulation

Clean and format text data:
transforms:
  clean_data:
    type: script
    from: source
    language: typescript
    primary_key: id
    schema:
      id: string
      address: string
      from_address: string
      to_address: string
      symbol: string
      short_address: string
    script: |
      interface TokenData {
        id: string;
        address?: string;
        from_address?: string;
        to_address?: string;
        symbol?: string;
      }

      function invoke(data: TokenData): TokenData & {
        short_address?: string;
      } {
        // Normalize addresses to lowercase
        if (data.address) {
          data.address = data.address.toLowerCase();
        }
        if (data.from_address) {
          data.from_address = data.from_address.toLowerCase();
        }
        if (data.to_address) {
          data.to_address = data.to_address.toLowerCase();
        }

        // Trim and clean strings
        if (data.symbol) {
          data.symbol = data.symbol.trim().toUpperCase();
        }

        // Extract short address for display
        const short_address = data.address
          ? data.address.substring(0, 10) + "..."
          : undefined;

        return {
          ...data,
          ...(short_address && { short_address })
        };
      }

Example: Array and object manipulation

Work with complex data structures:
transforms:
  process_array_data:
    type: script
    from: solana_blocks
    language: typescript
    primary_key: slot
    schema:
      slot: int64
      transaction_count: int64
      successful_txs: int64
      success_rate: string
    script: |
      interface Transaction {
        meta?: {
          err: any;
        };
      }

      interface SolanaBlock {
        slot: number;
        transactions?: Transaction[];
      }

      function invoke(data: SolanaBlock): SolanaBlock & {
        transaction_count: number;
        successful_txs: number;
        success_rate: string;
      } {
        let transaction_count = 0;
        let successful_txs = 0;
        let success_rate = "0.00";

        if (data.transactions && Array.isArray(data.transactions)) {
          transaction_count = data.transactions.length;

          successful_txs = data.transactions.filter(
            tx => tx.meta && tx.meta.err === null
          ).length;

          success_rate = transaction_count > 0
            ? (successful_txs / transaction_count * 100).toFixed(2)
            : "0.00";
        }

        return {
          ...data,
          transaction_count,
          successful_txs,
          success_rate
        };
      }

TypeScript features

Type safety benefits

TypeScript provides:
  • Compile-time type checking: Catch errors before deployment
  • IntelliSense: Better IDE autocomplete and suggestions
  • Refactoring support: Safer code changes
  • Self-documenting code: Types serve as inline documentation

Supported TypeScript features

  • Interface definitions
  • Type aliases
  • Union and intersection types
  • Generic types
  • Optional properties (?)
  • Readonly properties
  • All ES6+ features (arrow functions, destructuring, spread operator)
  • JSON.parse() and JSON.stringify()
  • Math object (Math.floor, Math.random, etc.)
  • Date object
  • String methods (split, substring, replace, etc.)
  • Array methods (map, filter, reduce, etc.)
  • Object methods (Object.keys, Object.values, etc.)
  • BigInt for large number handling
  • typeof checks
  • instanceof checks
  • Custom type predicates
  • Discriminated unions

NOT available

The following features are not available inside the sandbox:
  • require() or import statements (no external modules — bundle everything into the script field yourself)
  • File system access
  • Network requests (fetch, XMLHttpRequest) — outbound HTTP is blocked. Use an HTTP handler transform when you need to call external APIs.
  • Node.js and browser-only APIs (process, fs, http, window, document, etc.)
  • Timers (setTimeout, setInterval) — QuickJS has no event loop
  • Async/await and Promises (code must be synchronous)
Keep your scripts self-contained. The runtime supports ES2020 plus QuickJS built-ins — no Node, no browser DOM, no package imports.

What is available

The runtime is a QuickJS interpreter running inside WebAssembly. In addition to standard ES2020 syntax, you can use:
  • JSON.parse / JSON.stringify
  • Math, Date, BigInt, RegExp, Map, Set
  • All String, Array, and Object prototype methods
  • console.log / console.error — writes to the pipeline’s stderr log (visible via goldsky turbo logs, but not queryable from your sink)
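A quick sketch exercising a few of these built-ins together (illustrative only; the payload shape is hypothetical):

```typescript
function invoke(data: { id: string; payload: string }) {
  // JSON, BigInt, Math, and Map are all available in the sandbox
  const parsed = JSON.parse(data.payload) as { wei: string };
  const eth = Number(BigInt(parsed.wei)) / 1e18;
  const buckets = new Map<string, number>([["eth", Math.round(eth)]]);
  console.log(`row ${data.id}: ${eth} ETH`); // goes to the stderr log, not the sink
  return { id: data.id, eth_rounded: buckets.get("eth") };
}
```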

Error handling

Always include error handling in your scripts:
transforms:
  safe_transform:
    type: script
    from: source
    language: typescript
    primary_key: id
    schema:
      id: string
      value: string
      metadata: string
      value_eth: string
      processing_error: boolean
      error_message: string
    script: |
      interface Row {
        id: string;
        value: string;
        metadata?: string;
      }

      function invoke(data: Row): Row & {
        value_eth?: string;
        processing_error?: boolean;
        error_message?: string;
      } {
        try {
          const value = BigInt(data.value);
          const value_eth = (Number(value) / 1e18).toFixed(6);

          // Validate metadata is parseable (result discarded; field omitted from schema)
          if (data.metadata) {
            JSON.parse(data.metadata);
          }

          return { ...data, value_eth };
        } catch (error) {
          return {
            ...data,
            processing_error: true,
            error_message: error instanceof Error ? error.message : "Unknown error"
          };
        }
      }
If your script throws an unhandled error, the pipeline will retry processing that record. Use try/catch to handle errors gracefully and flag problematic records for later review.

Performance tuning

Each transform exposes two optional knobs for throughput: parallelism and batch_size.

Parallelism

Controls how many sandboxed script instances process rows in parallel. Each instance handles a slice of the incoming batch.
  • Default: 4
  • Higher values: More concurrency, proportionally more memory
  • 1: Sequential processing — use this when your script depends on row order

Batch size

Controls how many rows are accumulated before invoke is called. Smaller upstream batches are combined until the threshold is reached, which reduces per-call overhead.
  • Default: 0 (disabled — each upstream batch is processed immediately)
  • Higher values: Better throughput on high-volume streams with tiny batches
  • Trade-off: Higher values increase end-to-end latency as rows wait to accumulate

Example

transforms:
  high_throughput_transform:
    type: script
    from: source
    language: typescript
    primary_key: id
    parallelism: 8
    batch_size: 1000
    script: |
      function invoke(data) {
        // Expensive CPU-bound work benefits from parallelism: 8
        return data;
      }

When to tune these parameters

  • High-volume streams with small batches: increase batch_size to reduce WASM call overhead
  • CPU-bound transforms with large batches: increase parallelism to process slices concurrently
  • Memory-constrained environments: reduce parallelism to limit concurrent instances
  • Low-latency requirements: keep batch_size at 0 to process immediately
  • Order-sensitive processing: use parallelism: 1 to ensure sequential processing
Start with the defaults and adjust based on observed performance. Monitor memory usage when increasing parallelism, and monitor latency when increasing batch_size.

Performance considerations

  • TypeScript is transpiled to JavaScript once when the pipeline starts (no per-record transpile cost)
  • Each row is executed inside a QuickJS interpreter, which is significantly slower than native SQL transforms
  • Every record is evaluated individually, so keep scripts simple and avoid expensive per-row work (regex compilation, JSON.parse on huge blobs, allocating large temporary objects, etc.)
  • Scripts run in a sandboxed environment with limited memory: avoid creating large data structures, process records one at a time rather than accumulating state, and clean up temporary variables
  • Higher parallelism values increase memory usage proportionally
  • Pre-define types and interfaces outside the invoke function
  • Use built-in methods (Array.map, filter) instead of manual loops
  • Avoid nested loops and recursive functions
  • Cache frequently accessed values in variables
  • Use parallelism and batch_size to tune throughput for your workload

Debugging

Add debug fields

console.log output goes to the pipeline’s stderr log and is not visible in your sink. To inspect intermediate values in the data you actually ship, add debug fields to the returned record:
interface Row {
  id: string;
  value: string;
}

function invoke(
  data: Row
): Row & { debug_original_value: string; debug_new_value: string } {
  const debug_original_value = data.value;

  // Do transformation
  const value = (BigInt(data.value) / BigInt(1e18)).toString();

  return {
    ...data,
    value,
    debug_original_value,
    debug_new_value: value,
  };
}
Then query your sink to see the debug fields.

Test locally

Before deploying, test your logic in a TypeScript playground or Node.js:
interface TestInput {
  id: string;
  value: string;
}

function invoke(data: TestInput): TestInput & { value_eth: string } {
  return {
    ...data,
    value_eth: (Number(BigInt(data.value)) / 1e18).toFixed(6),
  };
}

// Test with sample data
const testData: TestInput = {
  id: "test",
  value: "1000000000000000000",
};

console.log(invoke(testData));
// Output: { id: "test", value: "1000000000000000000", value_eth: "1.000000" }

Best practices

1. Use SQL when possible

SQL transforms are faster and more efficient. Only use TypeScript for logic that SQL cannot express.
2. Define clear types

Define interfaces for your input and output types:
interface Input {
  id: string;
  value: string;
}

function invoke(data: Input): Input & { value_eth: string } {
  // TypeScript will enforce types
  return {
    ...data,
    value_eth: (Number(BigInt(data.value)) / 1e18).toFixed(6)
  };
}
3. Use null to filter records

Return null to filter out records that don’t match your criteria:
interface Row { id: string; value: number; }

function invoke(data: Row): Row | null {
  if (data.value <= 0) {
    return null;  // Filter out this record
  }
  return data;
}
4. Handle null and undefined

Always check for null/undefined values:
interface Row { id: string; value: string | null; }

function invoke(data: Row): Row & { value_eth?: string } {
  if (data.value != null) {
    const value_eth = (Number(BigInt(data.value)) / 1e18).toFixed(6);
    return { ...data, value_eth };
  }
  return data;
}
5. Use type guards

Validate data types at runtime:
function invoke(data: any): any {
  if (typeof data.value === 'string' && data.value.length > 0) {
    data.value_eth = (Number(BigInt(data.value)) / 1e18).toFixed(6);
  }
  return data;
}
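The typeof check above can also be written as a reusable type predicate, which keeps the function fully typed instead of falling back to any (a sketch; the Row shape is hypothetical):

```typescript
interface Row {
  id: string;
  value?: unknown;
}

// Type predicate: when it returns true, TypeScript narrows value to string
function hasStringValue(data: Row): data is Row & { value: string } {
  return typeof data.value === "string" && data.value.length > 0;
}

function invoke(data: Row): Row & { value_eth?: string } {
  if (hasStringValue(data)) {
    // data.value is typed as string inside this branch
    return { ...data, value_eth: (Number(BigInt(data.value)) / 1e18).toFixed(6) };
  }
  return data;
}
```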

When to use TypeScript vs SQL vs HTTP handler

  • Filtering, projections, simple math: SQL - fastest and most efficient
  • External API calls, enrichment: HTTP Handler - access external data
  • Complex parsing, custom logic: TypeScript - full programming flexibility with type safety
  • String manipulation within bounds of SQL functions: SQL - more efficient
  • Conditional logic based on multiple fields: TypeScript if complex, SQL if a simple CASE works
  • JSON parsing and manipulation: TypeScript - use JSON.parse() with type safety
  • Working with BigInt calculations: TypeScript - native BigInt support
  • Type-safe data transformations: TypeScript - compile-time type checking