# Pipes: New Indexer
Create new blockchain indexer projects using the Pipes CLI.
## When to Use This Skill

Activate when the user wants to:

- Create a new indexer from scratch
- Generate a project with templates (ERC20, Uniswap V3, etc.)
- Start indexing a new blockchain protocol
- Set up a fresh indexer with proper structure
## Overview

The Pipes CLI (`@iankressin/pipes-cli`) provides an interactive scaffolding tool that generates production-ready indexer projects with built-in templates for common use cases.
## Available Templates

Run `npx @iankressin/pipes-cli@latest init --schema` to see the full list of available templates and their parameter schemas.
### EVM Templates

**`erc20Transfers`** - Track ERC20 token transfers:

```json
{
  "templateId": "erc20Transfers",
  "params": { "contractAddresses": ["0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"] }
}
```

**`uniswapV3Swaps`** - Track Uniswap V3 swap events via factory pattern:

```json
{
  "templateId": "uniswapV3Swaps",
  "params": { "factoryAddress": "0x1F98431c8aD98523631AE4a59f267346ea31F984" }
}
```

**`custom`** - Custom contract events (requires full event ABI):

```json
{
  "templateId": "custom",
  "params": {
    "contracts": [
      {
        "contractAddress": "0x...",
        "contractName": "MyContract",
        "contractEvents": [
          {
            "name": "Transfer",
            "type": "event",
            "inputs": [
              { "name": "from", "type": "address" },
              { "name": "to", "type": "address" },
              { "name": "value", "type": "uint256" }
            ]
          }
        ]
      }
    ]
  }
}
```
### SVM (Solana) Templates

- `custom` - Start with a blank template for custom logic

**CRITICAL**: Template IDs must use camelCase format. Each template has specific required params - check the schema.
## Supported Sinks

- **ClickHouse** - High-performance analytics database (recommended)
- **PostgreSQL** - Relational database with Drizzle ORM

**Note**: The Memory sink is listed in the schema but not yet implemented in the CLI.
## How to Use the CLI

### Programmatic Mode (RECOMMENDED for Claude Code)

ALWAYS use programmatic mode with the published npm package:

```bash
npx @iankressin/pipes-cli@latest init --config '{
  "projectFolder": "/path/to/my-indexer",
  "packageManager": "bun",
  "networkType": "evm",
  "network": "ethereum-mainnet",
  "templates": [
    {
      "templateId": "erc20Transfers",
      "params": { "contractAddresses": ["0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"] }
    }
  ],
  "sink": "clickhouse"
}'
```

**CRITICAL**: Template IDs must use camelCase:

- Use `"uniswapV3Swaps"`, NOT `"uniswap-v3-swaps"`
- Use `"erc20Transfers"`, NOT `"erc20-transfers"`
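If a template ID arrives in kebab-case, it can be normalized before building the config. The helper below is an illustrative sketch, not part of `@iankressin/pipes-cli` - always verify the resulting ID against `init --schema`:

```typescript
// Normalize a kebab-case template ID to the camelCase form the CLI expects.
// Illustrative helper only - not part of @iankressin/pipes-cli.
function toCamelCase(id: string): string {
  return id.replace(/-([a-z0-9])/g, (_match, ch: string) => ch.toUpperCase())
}

console.log(toCamelCase('uniswap-v3-swaps')) // → uniswapV3Swaps
console.log(toCamelCase('erc20-transfers'))  // → erc20Transfers
```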
### Inspecting Available Templates

Before creating an indexer, inspect the supported templates and their configuration:

```bash
npx @iankressin/pipes-cli@latest init --schema
```

This displays:

- All available template IDs (camelCase format)
- Required and optional parameters for each template
- Sink-specific configurations
- Network options
## Critical Rule: NEVER MANUALLY CREATE INDEXER FILES

ALWAYS use the Pipes CLI programmatic mode. Manual file creation = YOLO mode = guaranteed problems.

If the CLI fails:

- Fix the CLI issue first
- Never work around it by creating files manually
- Manual creation bypasses all scaffolding, dependency setup, and configuration
## Workflow for Helping Users

### Step 0: Research Protocol Architecture (MANDATORY)

Before writing ANY code or generating the project:

**Understand the protocol structure:**

- Visit the protocol's documentation
- Identify contract relationships (vault vs. underlying protocol, factory vs. instances, etc.)
- Determine which contract emits the events you need

**Ask clarifying questions:**

- What blockchain do they want to index? (Ethereum, Polygon, Solana, etc.)
- What does "track X" mean in this context? (e.g., "allocations" could mean rebalancing events OR actual positions)
- Which contract emits the relevant events? (Don't assume - verify!)
- Is there a specific contract, pool, or address? (Important for customization)
- Time range needed? (Recent data only = faster, full history = slower)
- Where should the data be stored? (ClickHouse, PostgreSQL, CSV)
- What should the project be named?

**Verify your understanding:**

- Look at actual transactions on Etherscan to see which events are emitted
- Check if there are multiple contracts involved
- Understand the data flow between contracts
### Step 1: Inspect Available Templates

Before generating the project, inspect the available templates:

```bash
npx @iankressin/pipes-cli@latest init --schema
```

This ensures you use the correct `templateId` and understand the required configuration.
### Step 2: Run the CLI

```bash
npx @iankressin/pipes-cli@latest init --config '{
  "projectFolder": "/path/to/my-indexer",
  "packageManager": "bun",
  "networkType": "evm",
  "network": "ethereum-mainnet",
  "templates": [
    {
      "templateId": "erc20Transfers",
      "params": { "contractAddresses": ["0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"] }
    }
  ],
  "sink": "clickhouse"
}'
```

**IMPORTANT**: Use camelCase for `templateId` values. Every template requires a `params` object - check `--schema` for required fields.
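Rather than hand-writing the JSON string, the config can be built as a plain object and serialized. This is a sketch: the field names mirror the example above and should be verified against `init --schema` before use:

```typescript
// Build the --config payload programmatically, then serialize it for the CLI.
// Field names mirror the documented example; verify them with `init --schema`.
const config = {
  projectFolder: '/path/to/my-indexer',
  packageManager: 'bun',
  networkType: 'evm',
  network: 'ethereum-mainnet',
  templates: [
    {
      templateId: 'erc20Transfers', // camelCase, never kebab-case
      params: { contractAddresses: ['0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48'] },
    },
  ],
  sink: 'clickhouse',
}

// Single-quote the JSON so the shell passes it through untouched.
const command = `npx @iankressin/pipes-cli@latest init --config '${JSON.stringify(config)}'`
console.log(command)
```

Serializing from an object avoids quoting mistakes and makes it easy to validate the `templateId` before invoking the CLI.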
### Step 3: Post-generation Setup (AUTOMATED - Do this AFTER CLI succeeds)

**If using ClickHouse (Local Docker):**

1. Get the actual password from the existing container, OR use `default` if creating a new one
2. Create the database:

   ```bash
   docker exec <container-name> clickhouse-client --query "CREATE DATABASE IF NOT EXISTS pipes"
   ```

3. Update the `.env` file with the correct password:

   ```bash
   sed -i '' 's/CLICKHOUSE_PASSWORD=.*/CLICKHOUSE_PASSWORD=<actual-password>/' <project-folder>/.env
   ```

**CRITICAL - CLEAR SYNC TABLE IF REUSING DATABASE:**

If you're sharing a ClickHouse database between multiple indexers, ALWAYS clear the sync table:

```bash
docker exec <container-name> clickhouse-client --password <password> \
  --query "DROP TABLE IF EXISTS pipes.sync"
```

**Why this matters**: Shared sync tables cause indexers to resume from wrong blocks, skip data, or sync incorrect ranges. This is a common source of "missing data" errors.
**If using ClickHouse Cloud:**

1. Configure `.env` for Cloud:

   ```bash
   CLICKHOUSE_URL=https://[service-id].[region].aws.clickhouse.cloud:8443
   CLICKHOUSE_DATABASE=pipes
   CLICKHOUSE_USER=default
   CLICKHOUSE_PASSWORD=[your-actual-cloud-password]
   ```

2. Create the database manually (CLI migrations don't create databases):
   - Navigate to your service
   - Click "SQL Console"
   - Run: `CREATE DATABASE IF NOT EXISTS pipes;`

3. Verify the connection before running the indexer:

   ```bash
   curl -X POST "https://[your-service-id].[region].aws.clickhouse.cloud:8443/" \
     --user "default:[your-password]" \
     -d "SELECT 1"
   ```

For the complete deployment guide (local Docker or ClickHouse Cloud), see the pipes-deploy skill.
### Step 4: Customization

- For EVM contracts: Update contract addresses in the generated transformer
- For custom event handling: Modify the transformer logic
- For database schema: Edit the table definitions
- For ABI generation: See `references/ABI_GUIDE.md`
### Step 5: Start and Validate

```bash
cd <project-folder>
bun run dev
```

**VERIFY START BLOCK** - Check that the first log message shows your intended start block, not a resumed block.
## Complete Automation Script

Follow these steps IN ORDER for first-time setup:

```bash
# Step 1: Check/setup database (ClickHouse example)
CLICKHOUSE_CONTAINER=$(docker ps --filter "name=clickhouse" --format "{{.Names}}" | head -n 1)
if [ -z "$CLICKHOUSE_CONTAINER" ]; then
  echo "No ClickHouse found, starting new one..."
  docker run -d --name clickhouse \
    -p 8123:8123 -p 9000:9000 \
    -e CLICKHOUSE_PASSWORD=default \
    clickhouse/clickhouse-server
  CLICKHOUSE_CONTAINER="clickhouse"
  CLICKHOUSE_PASSWORD="default"
else
  echo "Using existing ClickHouse: $CLICKHOUSE_CONTAINER"
  CLICKHOUSE_PASSWORD=$(docker inspect "$CLICKHOUSE_CONTAINER" | grep CLICKHOUSE_PASSWORD | cut -d'"' -f4)
fi
docker exec "$CLICKHOUSE_CONTAINER" clickhouse-client --query "CREATE DATABASE IF NOT EXISTS pipes"

# Step 2: Generate the indexer project
npx @iankressin/pipes-cli@latest init --config '{
  "projectFolder": "/path/to/my-new-indexer",
  "packageManager": "bun",
  "networkType": "evm",
  "network": "ethereum-mainnet",
  "templates": [
    {
      "templateId": "erc20Transfers",
      "params": { "contractAddresses": ["0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48"] }
    }
  ],
  "sink": "clickhouse"
}'

# Step 3: Fix the .env file
cd /path/to/my-new-indexer
sed -i '' "s/CLICKHOUSE_PASSWORD=.*/CLICKHOUSE_PASSWORD=$CLICKHOUSE_PASSWORD/" .env

# Step 4: Run the indexer
bun run dev
```
## Performance Considerations

### Sync Speed Factors

**Start block range:**

- Smaller range = faster sync
- 1M blocks: 5-10 minutes
- 5M blocks: 30-60 minutes
- Full chain: 2-4 hours
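The figures above imply a throughput in the low hundreds of thousands of blocks per minute. A back-of-the-envelope estimator for planning - the default rate is an assumption derived from those figures, not a measured value:

```typescript
// Rough sync-time estimate based on the figures above (~100k-200k blocks/min).
// The default rate is an assumption for planning only, not a measured value.
function estimateSyncMinutes(blocks: number, blocksPerMinute = 150_000): number {
  return blocks / blocksPerMinute
}

console.log(estimateSyncMinutes(1_000_000).toFixed(1)) // 6.7 (within the 5-10 min range)
console.log(estimateSyncMinutes(5_000_000).toFixed(1)) // 33.3 (within the 30-60 min range)
```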
**Filtering type:**

- Contract events (fastest): Events from specific contracts
- Token pair filtering (medium): Factory pattern with filters
- Address filtering (slowest): Requires scanning all transfers

**Number of contracts tracked:**

- Fewer contracts = faster processing
- Start with 1-3 key tokens, expand later if needed
### Quick Testing Strategy

For fast iteration during development:

1. Start with recent blocks (last 1-2 weeks):

   ```typescript
   range: { from: '21,000,000' }
   ```

2. Test with limited contracts:

   ```typescript
   contracts: ['0xc02aaa39b223fe8d0a0e5c4f27ead9083c756cc2'] // Just WETH
   ```

3. Once working, expand the range and contracts
## Key SDK Patterns (Latest)

### Event Parameter Filtering

Filter events by indexed parameters at the source level for maximum performance:

```typescript
events: {
  transfers: {
    event: commonAbis.erc20.events.Transfer,
    params: {
      from: ['0x87482e84503639466fad82d1dce97f800a410945'],
      to: '0x10b32a54eeb05d2c9cd1423b4ad90c3671a2ed5f',
    },
  },
}
```
### Factory Pattern

Track dynamically created contracts (e.g., Uniswap pools). Use `factory()` inside the `contracts` field of `evmDecoder`:

```typescript
import { evmDecoder, factory, factorySqliteDatabase } from '@subsquid/pipes/evm'

const swaps = evmDecoder({
  range: { from: '12,369,621' },
  contracts: factory({
    address: ['0x1f98431c8ad98523631ae4a59f267346ea31f984'],
    event: factoryEvents.PoolCreated,
    parameter: 'pool',
    database: await factorySqliteDatabase({ path: './factory-pools.sqlite' }),
  }),
  events: {
    swaps: poolEvents.Swap,
  },
})
```
### Combining Multiple Decoders

Use `.pipeComposite()` to run multiple named decoders in a single pipeline:

```typescript
const stream = evmPortalSource({ portal: '...' }).pipeComposite({
  transfers: evmDecoder({ events: { transfers: commonAbis.erc20.events.Transfer } }),
  swaps: evmDecoder({ events: { swaps: uniswapV3Abi.events.Swap } }),
})
```
### Target Configuration (ClickHouse)

```typescript
import { clickhouseTarget } from '@subsquid/pipes/targets/clickhouse'

stream.pipeTo(clickhouseTarget({
  client: createClient({ url: process.env.CLICKHOUSE_URL }),
  onData: async (ctx, data) => {
    await ctx.insert('transfers', data.transfers)
  },
  onRollback: async (ctx, range) => {
    // Handle chain reorgs
  },
}))
```
### Target Configuration (PostgreSQL with Drizzle)

```typescript
import { drizzleTarget } from '@subsquid/pipes/targets/drizzle/node-postgres'

stream.pipeTo(drizzleTarget({
  db: drizzle(pool),
  tables: [transfersTable],
  onData: async (ctx, data) => {
    await ctx.db.insert(transfersTable).values(data.transfers)
  },
}))
```
## Pipes Best Practices

### Single Source of Truth (Database-Centric)

State should live in the database, not in process memory:

- Prevents data loss from crashes or restarts
- Enables recovery and replay from any point
- Use idempotent inserts/updates

**ClickHouse patterns:**

- Materialized views for derived metrics
- `SummingMergeTree` for additive aggregations
- `AggregatingMergeTree` for complex metrics
- `CollapsingMergeTree` for reorg handling
**PostgreSQL patterns:**

- `ON CONFLICT ... DO UPDATE` clauses for state reconciliation
- Ensures safe re-runs without duplicates
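As a sketch of the `ON CONFLICT` pattern: re-inserting the same batch leaves the table unchanged, so a crashed indexer can safely replay. The `transfers` table and its columns here are hypothetical - the generated schema will differ:

```typescript
// Idempotent upsert: re-running the same batch produces no duplicates.
// The `transfers` table and its columns are hypothetical examples.
const upsertTransfer = `
  INSERT INTO transfers (tx_hash, log_index, from_address, to_address, amount)
  VALUES ($1, $2, $3, $4, $5)
  ON CONFLICT (tx_hash, log_index)
  DO UPDATE SET amount = EXCLUDED.amount
`
console.log(upsertTransfer.trim())
```

The conflict target `(tx_hash, log_index)` uniquely identifies an EVM log, which is what makes the replay safe.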
### Block-Aware Pipelines

Always start collection at or before the contract deployment block:

- Critical for metrics requiring complete event history
- Ensures historical accuracy from the deployment block onward
- Check the deployment block on Etherscan before configuring
### Reorg Handling (ClickHouse)

For reorg-sensitive events, use `CollapsingMergeTree`:

```sql
-- Schema with sign field
CREATE TABLE events (
  ...
  sign Int8
) ENGINE = CollapsingMergeTree(sign)
ORDER BY (entity_id, block_number, tx_hash, event_type)
```

```typescript
// Rollback handler: insert sign=-1 records for rolled-back blocks
onRollback: (ctx, range) => {
  const rollbackRecords = events
    .filter(e => e.block >= range.from)
    .map(e => ({ ...e, sign: -1 }))

  return ctx.insert(rollbackRecords)
}
```
**Critical**: `ORDER BY` must include ALL distinguishing fields to prevent unwanted event deduplication.
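The sign column works because aggregates over a collapsing table are effectively `sum(value * sign)`: a rolled-back row re-inserted with `sign = -1` cancels its original `+1` copy. A plain TypeScript illustration of that arithmetic (not ClickHouse itself):

```typescript
// How CollapsingMergeTree arithmetic cancels rolled-back rows:
// a +1 row and its matching -1 row net out of any sum over value * sign.
type Row = { value: number; sign: 1 | -1 }

const rows: Row[] = [
  { value: 100, sign: 1 },  // event on the canonical chain
  { value: 250, sign: 1 },  // event from a block that was later reorged...
  { value: 250, sign: -1 }, // ...cancelled by the rollback handler above
]

const total = rows.reduce((sum, r) => sum + r.value * r.sign, 0)
console.log(total) // → 100
```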
### Event-Based vs. State Queries

Indexers track historical flow (event-based):

- Example: "User deposited 100, withdrew 110" = -10 net flow
- Good for: Transaction history, activity tracking, audit logs

RPC queries track current state:

- Example: "User currently holds 50 shares" = current balance
- Good for: Current positions, real-time snapshots

**Important**: Withdrawals including accrued interest can make event flows appear negative even when positions are positive. Use RPC for current balances, events for historical analysis.
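A concrete version of the deposit/withdraw arithmetic above (the event shape is illustrative, not an SDK type):

```typescript
// Event-based net flow vs. current state: withdrawing principal plus
// accrued interest makes the flow negative even though the position closed flat.
type FlowEvent = { kind: 'deposit' | 'withdraw'; amount: number }

const history: FlowEvent[] = [
  { kind: 'deposit', amount: 100 },
  { kind: 'withdraw', amount: 110 }, // 100 principal + 10 accrued interest
]

const netFlow = history.reduce(
  (sum, e) => sum + (e.kind === 'deposit' ? e.amount : -e.amount),
  0,
)
console.log(netFlow) // → -10 (event flow), while the on-chain balance is 0
```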
### Validation Requirements

Always validate indexed data before production use:

- Cross-reference sample transactions with a block explorer
- Verify event counts match expected ranges
- Check for missing blocks or gaps
- Reconcile aggregated metrics with known totals
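One of the checks above - finding missing blocks - can be sketched as a simple scan over the sorted block numbers your indexer has written (the numbers here are illustrative):

```typescript
// Find gaps in a sorted list of indexed block numbers.
// A quick sanity check before trusting aggregated metrics.
function findGaps(blocks: number[]): Array<[number, number]> {
  const gaps: Array<[number, number]> = []
  for (let i = 1; i < blocks.length; i++) {
    if (blocks[i] - blocks[i - 1] > 1) {
      gaps.push([blocks[i - 1] + 1, blocks[i] - 1])
    }
  }
  return gaps
}

const gaps = findGaps([100, 101, 102, 105, 106]) // gaps = [[103, 104]]
console.log(gaps)
```

Note that event-based indexers only write rows for blocks containing matching events, so treat reported "gaps" as candidates to verify against a block explorer rather than definite missing data.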
## Troubleshooting

### CLI Issues

**"Network timeout with npx"**

- Check your internet connection
- Try again or wait a moment
- Ensure the npm registry is accessible

**"Template 'uniswap-v3-swaps' not found"**

- Use camelCase: `uniswapV3Swaps`, not `uniswap-v3-swaps`
- Run `npx @iankressin/pipes-cli@latest init --schema` to see available templates

**"Template ID not recognized"**

- Run with the `--schema` flag to verify available templates and their exact IDs
- Ensure you're using the latest CLI version with `@latest`
### Database Issues

**"Authentication failed: password is incorrect"**

- Check the actual password: `docker inspect <container> | grep CLICKHOUSE_PASSWORD`
- Update the `.env` file with the correct password

**"Database pipes does not exist"**

- Create it: `docker exec <container> clickhouse-client --query "CREATE DATABASE IF NOT EXISTS pipes"`

**"port is already allocated"**

- Use the existing container instead of starting a new one

**"Indexer starts from wrong block / Missing data"**

- MOST COMMON ISSUE: Shared sync table between projects
- Clear the sync table: `docker exec <container> clickhouse-client --query "DROP TABLE IF EXISTS pipes.sync"`
- Restart the indexer - it will now start from the configured block
## Related Skills

- `ENVIRONMENT_SETUP.md` - Verify the environment first (setup verification)
- pipes-troubleshooting - Fix issues
- pipes-deploy - Local and cloud deployment
## Related Documentation

This skill includes comprehensive reference documentation in the `references/` directory:

- **ENVIRONMENT_SETUP.md** - Development environment setup guide, prerequisites check, platform-specific notes, and troubleshooting
- **ABI_GUIDE.md** - Fetching contract ABIs, `commonAbis` usage, proxy contract detection and handling, TypeScript type generation
- **SCHEMA_GUIDE.md** - ClickHouse engine selection, `ORDER BY` strategy, BigInt handling, partitioning patterns
- **RESEARCH_CHECKLIST.md** - Protocol research workflow, contract discovery, deployment block finding, common gotchas

### How to Access

```bash
cat pipes-sdk/pipes-new-indexer/references/ABI_GUIDE.md
cat pipes-sdk/pipes-new-indexer/references/SCHEMA_GUIDE.md
cat pipes-sdk/pipes-new-indexer/references/RESEARCH_CHECKLIST.md
```
## Additional Resources

For comprehensive patterns and workflows:

- PATTERNS.md - EVM patterns, troubleshooting, and performance optimization

### Official Subsquid Documentation

- llms.txt - Quick reference for Pipes SDK
- llms-full.txt - Complete Subsquid documentation
- skill.md - Comprehensive Pipes SDK guide
- Available Datasets - All supported networks and chains