Insight is a blockchain data processing tool designed to fetch, process, and store on-chain data. It indexes blockchain data and makes transactions and logs efficiently queryable through a simple API.
Insight's architecture consists of five main components:
- Poller: The Poller continuously fetches new blocks from the configured RPC and processes them. It uses multiple worker goroutines to retrieve block data concurrently, handles successful and failed results, and stores the processed block data and any failures.
- Worker: The Worker processes batches of block numbers, fetching block data, logs, and traces (if supported) from the configured RPC. It divides the work into chunks, processes them concurrently, and returns the results as a collection of WorkerResult structures, which contain the block data, transactions, logs, and traces for each processed block.
- Committer: The Committer periodically moves data from the staging storage to the main storage. It ensures that blocks are committed sequentially, handles any gaps in the data, and updates various metrics while performing concurrent inserts of blocks, logs, transactions, and traces into the main storage.
- Failure Recoverer: The FailureRecoverer recovers from block processing failures. It periodically checks for failed blocks, attempts to reprocess them using a worker, and either removes successfully processed blocks from the failure list or updates the failure count for blocks that continue to fail.
- Orchestrator: The Orchestrator coordinates and manages the poller, failure recoverer, and committer. It initializes these components based on configuration settings and starts them concurrently, ensuring they run independently while waiting for all of them to complete.
Insight's modular architecture and configuration options allow for adaptation to various EVM chains and use cases.
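In process terms, this maps to two entry points: the orchestrator runs the indexing components together, and the API serves the indexed data (both appear again in the run steps below):

```bash
# Indexing process: the orchestrator starts the poller, committer and failure recoverer
./main orchestrator

# Query API process: serves the indexed transactions and logs over HTTP
./main api
```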
Prerequisites:

- Golang v1.23
- A Clickhouse instance (the repo's docker-compose includes one for local development)
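If you don't already have Clickhouse running, the bundled docker-compose can start one for local development (a sketch; the exact services depend on the compose file in the repo):

```bash
# From the repo root: start the local development services, including Clickhouse
docker compose up -d
```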
To run Insight and the associated API, follow these steps:

- Clone the repo: `git clone https://github.com/thirdweb-dev/insight.git`
- Apply the database migrations (the migration scripts are included in the repo)
- Create `config.yml` from `config.example.yml` and set the values by following the config guide
- Create `secrets.yml` from `secrets.example.yml` and set the needed credentials
- Build the binary: `go build -o main -tags=production`
- Run Insight: `./main orchestrator`
- Run the Data API: `./main api`
- The API is available at `http://localhost:3000`
The Insight node exposes Prometheus metrics at `http://localhost:2112/metrics`. The exposed metrics are defined in `metrics.go`.
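For example, to spot-check the endpoint on a locally running node:

```bash
# Fetch the first lines of the Prometheus metrics exposed by insight
curl -s http://localhost:2112/metrics | head -n 20
```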
You can configure the application in 3 ways, in the following order of priority:
- Command line arguments
- Environment variables
- Configuration files
You can configure the application using command line arguments. For example, to configure the `rpc.url` setting, you can use the `--rpc-url` command line argument. Only select arguments are implemented; more on this below.
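For example, the `--rpc-url` flag can be passed when starting the orchestrator:

```bash
# Override rpc.url from the command line
./main orchestrator --rpc-url https://my-rpc.com
```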
You can also configure the application using environment variables. Any setting in the `config.yml` file can be configured through an environment variable by replacing each `.` in the configuration key with `_` and making the variable name uppercase. For example, to set the `rpc.url` configuration to `https://my-rpc.com`, you can set the `RPC_URL` environment variable to `https://my-rpc.com`.
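The same rule applies to nested keys, and per the priority order above a command line argument still overrides an environment variable. A short sketch using keys from the reference below:

```bash
# rpc.url -> RPC_URL
export RPC_URL=https://my-rpc.com

# rpc.blocks.blocksPerRequest -> RPC_BLOCKS_BLOCKSPERREQUEST
export RPC_BLOCKS_BLOCKSPERREQUEST=500

# The --rpc-url flag takes priority over the RPC_URL environment variable
./main orchestrator --rpc-url https://other-rpc.com
```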
The default configuration should live in `configs/config.yml`. Copy `configs/config.example.yml` to get started. Alternatively, you can use the `--config` flag to specify a different configuration file.
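For example (the path here is purely illustrative):

```bash
# Run with a configuration file outside the default configs/ directory
./main orchestrator --config /path/to/my-config.yml
```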
If you want to add secrets to the configuration, you can copy `configs/secrets.example.yml` to `configs/secrets.yml` and add the secrets there. They won't be committed to the repository or the built image.
URL of the RPC node to use.
cmd: `--rpc-url`
env: `RPC_URL`
yaml:
rpc:
  url: https://rpc.com
How many blocks at a time to fetch from the RPC. Default is `1000`.
cmd: `--rpc-blocks-blocksPerRequest`
env: `RPC_BLOCKS_BLOCKSPERREQUEST`
yaml:
rpc:
  blocks:
    blocksPerRequest: 1000
Milliseconds to wait between batches of blocks when fetching from the RPC. Default is `0`.
cmd: `--rpc-blocks-batchDelay`
env: `RPC_BLOCKS_BATCHDELAY`
yaml:
rpc:
  blocks:
    batchDelay: 100
How many blocks at a time to query logs for from the RPC. Default is `100`. Has no effect if it's larger than RPC blocks per request.
cmd: `--rpc-logs-blocksPerRequest`
env: `RPC_LOGS_BLOCKSPERREQUEST`
yaml:
rpc:
  logs:
    blocksPerRequest: 100
Milliseconds to wait between batches of logs when fetching from the RPC. Default is `0`.
cmd: `--rpc-logs-batchDelay`
env: `RPC_LOGS_BATCHDELAY`
yaml:
rpc:
  logs:
    batchDelay: 100
If this is `true`, insight will use `eth_getBlockReceipts` instead of `eth_getLogs` if the RPC supports it. This allows getting receipt data for transactions, but is not supported by every RPC. Default is `false`.
cmd: `--rpc-block-receipts-enabled`
env: `RPC_BLOCKRECEIPTS_ENABLED`
yaml:
rpc:
  blockReceipts:
    enabled: true
How many blocks at a time to fetch block receipts for from the RPC. Default is `250`. Has no effect if it's larger than RPC blocks per request.
cmd: `--rpc-block-receipts-blocksPerRequest`
env: `RPC_BLOCKRECEIPTS_BLOCKSPERREQUEST`
yaml:
rpc:
  blockReceipts:
    blocksPerRequest: 100
Milliseconds to wait between batches of block receipts when fetching from the RPC. Default is `0`.
cmd: `--rpc-block-receipts-batchDelay`
env: `RPC_BLOCKRECEIPTS_BATCHDELAY`
yaml:
rpc:
  blockReceipts:
    batchDelay: 100
Whether to enable fetching traces from the RPC. Default is `true`, but insight will try to detect automatically whether the RPC supports traces.
cmd: `--rpc-traces-enabled`
env: `RPC_TRACES_ENABLED`
yaml:
rpc:
  traces:
    enabled: true
How many blocks at a time to fetch traces for from the RPC. Default is `100`. Has no effect if it's larger than RPC blocks per request.
cmd: `--rpc-traces-blocksPerRequest`
env: `RPC_TRACES_BLOCKSPERREQUEST`
yaml:
rpc:
  traces:
    blocksPerRequest: 100
Milliseconds to wait between batches of traces when fetching from the RPC. Default is `0`.
cmd: `--rpc-traces-batchDelay`
env: `RPC_TRACES_BATCHDELAY`
yaml:
rpc:
  traces:
    batchDelay: 100
Log level for the logger. Default is `warn`.
cmd: `--log-level`
env: `LOG_LEVEL`
yaml:
log:
  level: debug
Whether to print logs in a prettified format. Affects performance. Default is `false`.
cmd: `--log-prettify`
env: `LOG_PRETTIFY`
yaml:
log:
  prettify: true
Whether to enable the poller. Default is `true`.
cmd: `--poller-enabled`
env: `POLLER_ENABLED`
yaml:
poller:
  enabled: true
Poller trigger interval in milliseconds. Default is `1000`.
cmd: `--poller-interval`
env: `POLLER_INTERVAL`
yaml:
poller:
  interval: 3000
How many blocks to poll each interval. Default is `10`.
cmd: `--poller-blocks-per-poll`
env: `POLLER_BLOCKSPERPOLL`
yaml:
poller:
  blocksPerPoll: 3
From which block to start polling. Default is `0`.
cmd: `--poller-from-block`
env: `POLLER_FROMBLOCK`
yaml:
poller:
  fromBlock: 20000000
Whether to force the poller to start from the block specified in `poller-from-block`. Default is `false`.
cmd: `--poller-force-from-block`
env: `POLLER_FORCEFROMBLOCK`
yaml:
poller:
  forceFromBlock: false
Until which block to poll. If not set, it will poll until the latest block.
cmd: `--poller-until-block`
env: `POLLER_UNTILBLOCK`
yaml:
poller:
  untilBlock: 20000010
Whether to enable the committer. Default is `true`.
cmd: `--committer-enabled`
env: `COMMITTER_ENABLED`
yaml:
committer:
  enabled: true
Committer trigger interval in milliseconds. Default is `250`.
cmd: `--committer-interval`
env: `COMMITTER_INTERVAL`
yaml:
committer:
  interval: 3000
How many blocks to commit each interval. Default is `10`.
cmd: `--committer-blocks-per-commit`
env: `COMMITTER_BLOCKSPERCOMMIT`
yaml:
committer:
  blocksPerCommit: 1000
From which block to start committing. Default is `0`.
cmd: `--committer-from-block`
env: `COMMITTER_FROMBLOCK`
yaml:
committer:
  fromBlock: 20000000
Whether to enable the reorg handler. Default is `true`.
cmd: `--reorgHandler-enabled`
env: `REORGHANDLER_ENABLED`
yaml:
reorgHandler:
  enabled: true
Reorg handler trigger interval in milliseconds. Default is `1000`.
cmd: `--reorgHandler-interval`
env: `REORGHANDLER_INTERVAL`
yaml:
reorgHandler:
  interval: 3000
How many blocks to scan for reorgs. Default is `100`.
cmd: `--reorgHandler-blocks-per-scan`
env: `REORGHANDLER_BLOCKSPERSCAN`
yaml:
reorgHandler:
  blocksPerScan: 1000
From which block to start scanning for reorgs. Default is `0`.
cmd: `--reorgHandler-from-block`
env: `REORGHANDLER_FROMBLOCK`
yaml:
reorgHandler:
  fromBlock: 20000000
Whether to force the reorg handler to start from the block specified in `reorgHandler-from-block`. Default is `false`.
cmd: `--reorgHandler-force-from-block`
env: `REORGHANDLER_FORCEFROMBLOCK`
yaml:
reorgHandler:
  forceFromBlock: true
Whether to enable the failure recoverer. Default is `true`.
cmd: `--failure-recoverer-enabled`
env: `FAILURERECOVERER_ENABLED`
yaml:
failureRecoverer:
  enabled: true
Failure recoverer trigger interval in milliseconds. Default is `1000`.
cmd: `--failure-recoverer-interval`
env: `FAILURERECOVERER_INTERVAL`
yaml:
failureRecoverer:
  interval: 3000
How many blocks to recover each interval. Default is `10`.
cmd: `--failure-recoverer-blocks-per-run`
env: `FAILURERECOVERER_BLOCKSPERRUN`
yaml:
failureRecoverer:
  blocksPerRun: 100
This application has 3 kinds of storage: `main`, `staging` and `orchestrator`. Each of them takes a similar configuration, varying slightly depending on the driver you want to use. There are no defaults, so storage must be configured explicitly. For example, this can be part of `config.yml`:
storage:
  main:
    clickhouse:
      port: 3000
      database: "base"
      disableTLS: true
  staging:
    clickhouse:
      port: 3000
      database: "staging"
  orchestrator:
    memory:
      maxItems: 10000
With the corresponding `secrets.yml`:

storage:
  main:
    clickhouse:
      host: localhost
      username: admin
      password: password
  staging:
    clickhouse:
      host: localhost
      username: admin
      password: password