This document describes all configuration options available for the Token API Scraper.
The application can be configured via environment variables. Copy `.env.example` to `.env` and adjust the settings:
```bash
cp .env.example .env
```
- `CLICKHOUSE_URL` - ClickHouse database URL
  - Default: `http://localhost:8123`
  - Example: `http://clickhouse.example.com:8123`
- `CLICKHOUSE_USERNAME` - ClickHouse username
  - Default: `default`
- `CLICKHOUSE_PASSWORD` - ClickHouse password
  - Default: (empty)
- `CLICKHOUSE_DATABASE` - ClickHouse database name
  - Default: `default`
- `NODE_URL` - EVM RPC node URL (required)
  - Example: `https://your-rpc-node.example.com`
The scraper can apply curated name and symbol values from an external tokens.json file (e.g. CoinGecko-sourced) to matching rows already stored in the metadata table. Overrides are applied once at service startup, before the normal metadata scrape loop begins.
- `TOKEN_OVERRIDES_URL` - URL of a `tokens.json` file to fetch overrides from
  - Default: (not set; overrides disabled)
  - The file must be a JSON array of objects with `network`, `contract`, and at least one of `name`, `symbol`, or `decimals`. `decimals` is optional; when present and different from the stored value, the startup override row updates it too.
  - For each matching `(network, contract)` already present in `metadata`, the scraper inserts a new replacement row with the curated override values
  - If an override token is not in `metadata` yet, the scraper inserts a startup row for it using the override values and defaults `decimals` to `18` when the override entry does not provide one
  - If the URL is unreachable at startup, the scraper leaves existing metadata unchanged and logs a warning
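The expected entry shape can be sketched as follows. This is an illustrative sketch based only on the fields described above; the sample networks, contract addresses, and the `TokenOverride`/`hasOverrideField` names are invented for the example.

```typescript
// Sketch of one tokens.json entry, per the fields documented above.
interface TokenOverride {
  network: string;
  contract: string;
  name?: string;     // at least one of name/symbol/decimals must be present
  symbol?: string;
  decimals?: number; // optional; brand-new rows default to 18 when omitted
}

// Purely illustrative sample data:
const overrides: TokenOverride[] = [
  {
    network: "mainnet",
    contract: "0x0000000000000000000000000000000000000001",
    name: "Example Token",
    symbol: "EXT",
    decimals: 18,
  },
  // decimals omitted: a startup row for a new token would use 18
  { network: "mainnet", contract: "0x0000000000000000000000000000000000000002", symbol: "EX2" },
];

// Minimal check mirroring the documented "at least one of" rule:
const hasOverrideField = (o: TokenOverride): boolean =>
  o.name !== undefined || o.symbol !== undefined || o.decimals !== undefined;

console.log(overrides.every(hasOverrideField)); // true
```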
- `CONCURRENCY` - Number of concurrent RPC requests
  - Default: `10`
  - Recommended range: `5-20`, depending on RPC node capacity and network conditions
  - Higher values: Faster processing but may hit rate limits
  - Lower values: Slower but more conservative on RPC resources

Example:

```bash
# Set concurrency to 5 for conservative processing
CONCURRENCY=5 npm run start
```

The retry mechanism uses exponential backoff with jitter to handle transient RPC failures gracefully:
- `MAX_RETRIES` - Maximum number of retry attempts for failed RPC requests
  - Default: `3`
  - Controls how many times to retry a failed request
- `BASE_DELAY_MS` - Base delay in milliseconds for exponential backoff between retries
  - Default: `400`
  - Starting delay between retries, which grows exponentially
- `JITTER_MIN` - Minimum jitter multiplier for backoff delay
  - Default: `0.7` (70% of backoff)
  - Adds randomness to retry delays to prevent thundering herd
- `JITTER_MAX` - Maximum jitter multiplier for backoff delay
  - Default: `1.3` (130% of backoff)
- `MAX_DELAY_MS` - Maximum delay in milliseconds between retry attempts
  - Default: `30000` (30 seconds)
  - Cap on the maximum delay between retries
- `TIMEOUT_MS` - Timeout in milliseconds for individual RPC requests
  - Default: `10000` (10 seconds)
  - How long to wait for a single RPC request before timing out
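How these settings combine into a per-attempt delay can be sketched as below. This is an illustrative sketch of the standard exponential-backoff-with-jitter pattern using the default values, not the scraper's actual implementation; `retryDelayMs` is an invented name.

```typescript
// Defaults from the settings above:
const MAX_RETRIES = 3;
const BASE_DELAY_MS = 400;
const JITTER_MIN = 0.7;
const JITTER_MAX = 1.3;
const MAX_DELAY_MS = 30000;

function retryDelayMs(attempt: number): number {
  // Exponential growth: 400ms, 800ms, 1600ms, ... capped at MAX_DELAY_MS.
  const backoff = Math.min(BASE_DELAY_MS * Math.pow(2, attempt), MAX_DELAY_MS);
  // Random jitter between 70% and 130% of the backoff, so concurrent
  // clients don't all retry at the same instant (thundering herd).
  const jitter = JITTER_MIN + Math.random() * (JITTER_MAX - JITTER_MIN);
  return backoff * jitter;
}

for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
  console.log(`retry ${attempt + 1}: waiting ~${Math.round(retryDelayMs(attempt))}ms`);
}
```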
Example:

```bash
# More aggressive retry settings for unreliable networks
MAX_RETRIES=5 BASE_DELAY_MS=1000 MAX_DELAY_MS=60000 npm run start
```

Prometheus metrics are always enabled and provide real-time monitoring of service performance.
- `PROMETHEUS_PORT` - Prometheus metrics HTTP port
  - Default: `9090`
  - Port where metrics will be exposed
The Prometheus server starts automatically when a service runs and stays alive across service restarts for continuous monitoring.
Example:

```bash
# Prometheus metrics are always enabled on port 9090 by default
npm run start

# Specify a custom port
PROMETHEUS_PORT=8080 npm run start
```

Available Metrics:

- `scraper_total_tasks` - Total number of tasks to process
- `scraper_completed_tasks_total` - Total number of completed tasks (labeled by status: success/error)
- `scraper_error_tasks_total` - Total number of failed tasks
- `scraper_requests_per_second` - Current requests per second
- `scraper_progress_percentage` - Current progress percentage
- `scraper_config_info` - Configuration information (ClickHouse URL, database, node URL)
Access metrics at `http://localhost:9090/metrics` (or your configured port).
The batch insert mechanism improves ClickHouse insert performance by accumulating rows and inserting them in batches instead of one-by-one. This significantly reduces database overhead and improves throughput.
Batch inserts are always enabled to ensure optimal performance. You can configure the batching behavior with the following settings:
- `BATCH_INSERT_INTERVAL_MS` - Flush interval in milliseconds
  - Default: `1000` (1 second)
  - How often to flush accumulated inserts to ClickHouse
  - Lower values: More frequent inserts, lower latency
  - Higher values: Larger batches, better throughput
- `BATCH_INSERT_MAX_SIZE` - Maximum batch size before forcing a flush
  - Default: `10000` rows
  - Flush immediately when this many rows are accumulated
  - Prevents memory issues with large queues
  - Adjust based on available memory and row size
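The dual-trigger behavior these two settings describe (flush when the batch is full, or on a timer otherwise) can be sketched as follows. This is a hypothetical sketch, not the scraper's actual code; `BatchInserter` and `flushFn` are invented names.

```typescript
const BATCH_INSERT_INTERVAL_MS = 1000;
const BATCH_INSERT_MAX_SIZE = 10000;

class BatchInserter<Row> {
  private rows: Row[] = [];
  private timer: ReturnType<typeof setInterval>;

  constructor(private flushFn: (rows: Row[]) => void) {
    // Interval trigger: flush whatever has accumulated once per interval.
    this.timer = setInterval(() => this.flush(), BATCH_INSERT_INTERVAL_MS);
  }

  add(row: Row): void {
    this.rows.push(row);
    // Size trigger: flush immediately once the batch is full.
    if (this.rows.length >= BATCH_INSERT_MAX_SIZE) this.flush();
  }

  flush(): void {
    if (this.rows.length === 0) return;
    const batch = this.rows;
    this.rows = [];
    this.flushFn(batch); // e.g. a single multi-row ClickHouse INSERT
  }

  stop(): void {
    clearInterval(this.timer); // cleanup + final flush on shutdown
    this.flush();
  }
}
```

The key point is that rows reach the database as one insert per batch rather than one insert per row.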
Example:

```bash
# Default batch settings (1 second, 10000 rows)
npm run start

# Custom batch settings for high-throughput scenarios
BATCH_INSERT_INTERVAL_MS=5000 BATCH_INSERT_MAX_SIZE=50000 npm run start

# Lower latency configuration (more frequent flushes)
BATCH_INSERT_INTERVAL_MS=500 BATCH_INSERT_MAX_SIZE=5000 npm run start
```

Batch insert benefits:
- Improved insert throughput for large volumes of data
- Reduced database overhead and network calls
- Better resource utilization in high-concurrency scenarios
Services automatically restart after successful completion in a continuous loop. The service runs in the same process without exiting, preserving Prometheus metrics across runs.
- `AUTO_RESTART_DELAY` - Delay in seconds before restarting the service
  - Default: `10`
  - Minimum: `1` second
  - Time to wait before restarting the service after completion
Example:

```bash
# Use default 10 second delay between restarts
npm run cli run metadata-transfers

# Custom 30 second delay between restarts
AUTO_RESTART_DELAY=30 npm run cli run metadata-swaps

# Combine with other options
AUTO_RESTART_DELAY=60 CONCURRENCY=20 npm run cli run metadata-transfers
```

Benefits of continuous auto-restart:
- Process stays alive, avoiding overhead of process restarts
- Prometheus metrics are preserved and accumulated across runs
- Better for long-running monitoring scenarios
- Simplified deployment (no need for external process managers)
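The continuous in-process loop can be sketched as below. This is a hypothetical sketch of the behavior described above, not the scraper's actual code; `runForever` and `runOnce` are invented names, and the delay parameter exists only to make the sketch testable.

```typescript
const AUTO_RESTART_DELAY = 10; // seconds

const sleep = (seconds: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, seconds * 1000));

async function runForever(
  runOnce: () => Promise<void>, // one full service run (scrape to completion)
  delaySeconds: number = AUTO_RESTART_DELAY,
): Promise<void> {
  // The process never exits between runs, so in-memory state such as
  // Prometheus counters keeps accumulating across restarts.
  while (true) {
    await runOnce();
    await sleep(delaySeconds);
  }
}
```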
- `VERBOSE` - Enable verbose logging output
  - Default: `false`
  - Set to `true` to enable detailed console output
  - When disabled, services use structured logging only
  - Prometheus metrics are still computed regardless of this setting
- `LOG_TYPE` - Type of log output format
  - Default: `pretty`
  - Options: `pretty`, `json`, `hidden`
  - Controls the format of structured logs from tslog
- `LOG_LEVEL` - Minimum log level to display
  - Default: `info`
  - Options: `debug`, `info`, `warn`, `error`
  - Controls which log messages are displayed
Example:

```bash
# Run with verbose logging
VERBOSE=true npm run cli run metadata-transfers

# Run silently with structured logs (default)
npm run cli run metadata-transfers

# Run with JSON logs for production
LOG_TYPE=json LOG_LEVEL=warn npm run cli run metadata-transfers
```

Configuration values are applied in the following order (later values override earlier ones):

1. Default values (hardcoded in the application)
2. Environment variables (from `.env` file or system environment)
3. Command-line flags (highest priority)
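The precedence chain amounts to a simple fallback, sketched here for a single setting. This is illustrative only; `resolveConcurrency` is an invented helper, not part of the scraper's API.

```typescript
const DEFAULT_CONCURRENCY = 10; // 1. hardcoded default

function resolveConcurrency(cliFlag?: string): number {
  const fromEnv = process.env.CONCURRENCY;           // 2. .env / environment
  return Number(cliFlag ?? fromEnv ?? DEFAULT_CONCURRENCY); // 3. CLI flag wins
}
```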
Example showing all three:

```bash
# .env file
CONCURRENCY=10

# Command-line flag overrides .env
npm run cli run metadata --concurrency 20
```

Create a `.env` file in the project root with your configuration:
```bash
# Database
CLICKHOUSE_URL=http://localhost:8123
CLICKHOUSE_USERNAME=default
CLICKHOUSE_PASSWORD=secret
CLICKHOUSE_DATABASE=evm_data

# RPC
NODE_URL=https://tron-evm-rpc.publicnode.com

# Performance
CONCURRENCY=15
MAX_RETRIES=5

# Monitoring
PROMETHEUS_PORT=9090
```

A minimal development configuration:

```bash
CLICKHOUSE_URL=http://localhost:8123
CLICKHOUSE_USERNAME=default
CLICKHOUSE_PASSWORD=
CLICKHOUSE_DATABASE=default
NODE_URL=https://tron-evm-rpc.publicnode.com
CONCURRENCY=5
```

An example production configuration:

```bash
CLICKHOUSE_URL=http://clickhouse-cluster:8123
CLICKHOUSE_USERNAME=scraper_user
CLICKHOUSE_PASSWORD=secure_password_here
CLICKHOUSE_DATABASE=production_evm
NODE_URL=https://your-tron-node.example.com
CONCURRENCY=20
MAX_RETRIES=5
BASE_DELAY_MS=500
MAX_DELAY_MS=60000
PROMETHEUS_PORT=9090
BATCH_INSERT_INTERVAL_MS=1000
BATCH_INSERT_MAX_SIZE=10000
```

For maximum throughput when you have a reliable RPC node:

```bash
CONCURRENCY=20
MAX_RETRIES=3
TIMEOUT_MS=5000
```

For unstable networks or rate-limited RPC nodes:

```bash
CONCURRENCY=5
MAX_RETRIES=10
BASE_DELAY_MS=1000
MAX_DELAY_MS=120000
TIMEOUT_MS=30000
```

A good middle ground for most use cases:

```bash
CONCURRENCY=10
MAX_RETRIES=5
BASE_DELAY_MS=400
MAX_DELAY_MS=30000
TIMEOUT_MS=10000
```

If you see many RPC errors:
- Reduce `CONCURRENCY` to be less aggressive
- Increase `MAX_RETRIES` for more retry attempts
- Increase `TIMEOUT_MS` if requests are timing out
- Check your `NODE_URL` is correct and accessible

If processing is slower than expected:

- Increase `CONCURRENCY` (if RPC node can handle it)
- Decrease `TIMEOUT_MS` to fail faster
- Verify network connectivity to RPC node
- Check database performance

If you can't connect to ClickHouse:

- Verify `CLICKHOUSE_URL` is correct
- Check firewall rules allow connection
- Verify credentials (`CLICKHOUSE_USERNAME`, `CLICKHOUSE_PASSWORD`)
- Test the connection: `curl http://localhost:8123/ping`