github ar-io/ar-io-node r49
Release 49

21 hours ago

This optional release significantly improves the ClickHouse ETL pipeline, with better performance and reliability plus Apache Iceberg metadata support. Most users can skip it, but it matters for anyone experimenting with Parquet exports and ClickHouse integration.

Added

  • Apache Iceberg Metadata Generation: Added generate-iceberg-metadata script to create Apache Iceberg table metadata for exported Parquet datasets, enabling compatibility with query engines like DuckDB and Spark. Controlled by new ENABLE_ICEBERG_GENERATION environment variable (default: false). Note: Iceberg metadata generation is still under active development and currently incomplete.

  • HyperBEAM Sidecar Support: Added optional HyperBEAM container configuration with .env.hb.example template for running AO processes alongside the gateway.

  • ETL Configuration Documentation: Documented existing ClickHouse auto-import environment variables in .env.example:

    • CLICKHOUSE_AUTO_IMPORT_SLEEP_INTERVAL - interval between import cycles (default: 3600 seconds)
    • CLICKHOUSE_AUTO_IMPORT_HEIGHT_INTERVAL - batch size in blocks (default: 10000)
    • CLICKHOUSE_AUTO_IMPORT_MAX_ROWS_PER_FILE - maximum number of rows per Parquet file (default: 1000000)
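
As a rough illustration of how the settings listed above fit together, the following Python sketch reads the four variables and falls back to the defaults stated in these notes. The EtlSettings class and load_etl_settings helper are hypothetical names used only for this example, and the way the real scripts parse the values (for instance, which strings count as "true") is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class EtlSettings:
    """Hypothetical container for the ETL-related variables in the release notes."""
    iceberg_enabled: bool
    sleep_interval_s: int
    height_interval: int
    max_rows_per_file: int


def load_etl_settings(env=os.environ) -> EtlSettings:
    """Read the variables from `env`, using the documented defaults when unset."""

    def as_int(name: str, default: int) -> int:
        return int(env.get(name, default))

    # Assumption: only the literal string "true" (case-insensitive) enables the flag.
    iceberg = env.get("ENABLE_ICEBERG_GENERATION", "false").strip().lower() == "true"

    return EtlSettings(
        iceberg_enabled=iceberg,
        sleep_interval_s=as_int("CLICKHOUSE_AUTO_IMPORT_SLEEP_INTERVAL", 3600),
        height_interval=as_int("CLICKHOUSE_AUTO_IMPORT_HEIGHT_INTERVAL", 10000),
        max_rows_per_file=as_int("CLICKHOUSE_AUTO_IMPORT_MAX_ROWS_PER_FILE", 1000000),
    )
```

With no variables set, load_etl_settings({}) yields the documented defaults: Iceberg generation off, a 3600-second sleep interval, 10000-block batches, and at most 1000000 rows per file.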

Changed

  • ETL Pipeline Architecture: Refactored the ClickHouse ETL pipeline for improved reliability and modularity:
    • Implemented staging-based workflow to prevent data corruption
    • Changed from API-based triggering to direct script execution
    • Made L1 transaction export the default behavior
    • Changed default export location from data/parquet to data/datasets/default
    • Performance: Greatly improved query performance through better index usage in the refactored pipeline
    • Stability: Fixed issue where the 'core' service would occasionally crash due to long-running SQLite queries
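
The staging-based workflow mentioned above is not described in detail in these notes. As a generic sketch of the pattern, and not the project's actual implementation, the Python example below writes a batch into a temporary staging directory and only renames it into place once every file has been written, so readers never observe a half-written export. The function name, the directory layout, and the one-directory-per-batch convention are all assumptions made for illustration.

```python
import os
import shutil
import tempfile
from pathlib import Path


def export_via_staging(final_dir: Path, files: dict[str, bytes]) -> None:
    """Write `files` into a staging directory, then promote it to `final_dir`.

    Hypothetical helper: `final_dir` must not exist yet (one directory per batch).
    """
    final_dir.parent.mkdir(parents=True, exist_ok=True)
    # Stage next to the destination so the final rename stays on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix=".staging-", dir=final_dir.parent))
    try:
        for name, payload in files.items():
            (staging / name).write_bytes(payload)
        if final_dir.exists():
            raise FileExistsError(f"{final_dir} already exists")
        # A same-filesystem rename is atomic on POSIX: the batch appears all at once.
        os.rename(staging, final_dir)
    except BaseException:
        # Any failure discards the partial batch and leaves the destination untouched.
        shutil.rmtree(staging, ignore_errors=True)
        raise
```

If any write fails, the staging directory is deleted and the destination is never created, which is the property that protects against partially written datasets.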

Docker Images

  • ar-io-core: ghcr.io/ar-io/ar-io-core:8c1f559a5d8cf8a0a9a4c577b56f7b989b467e62
  • ar-io-envoy: ghcr.io/ar-io/ar-io-envoy:da2abd14cdf3248db21673878c6f2c7b752a3850
  • ar-io-clickhouse-auto-import: ghcr.io/ar-io/ar-io-clickhouse-auto-import:5d06e824ce7d18764bed130025a3f493657cd39d
  • ar-io-observer: ghcr.io/ar-io/ar-io-observer:6cb911e4ac9fd04a1795144f86b77ad0174ee6d9
  • ar-io-litestream: ghcr.io/ar-io/ar-io-litestream:be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8
  • ao-cu: ghcr.io/permaweb/ao-cu:08436a88233f0247f3eb35979dd55163fd51a153
