github ar-io/ar-io-node r53
Release 53

This is an optional release that adds root transaction offset tracking for nested bundles and improves observer reliability. Offset tracking, integrated with Turbo and GraphQL, makes data retrieval more efficient, and the observer change raises the chunk validation success rate.

Added

  • Root Transaction and Offset Tracking: Comprehensive offset tracking system for nested ANS-104 bundles:
    • Turbo /offsets endpoint integration for accurate root transaction discovery and offset calculations
    • Handles multi-level nested bundles with cumulative offset tracking
    • Cycle detection and maximum nesting depth protection (10 levels)
    • Database persistence of root transaction IDs and absolute offset values
  • GraphQL Root TX Index: Dedicated GraphQL endpoint configuration for root transaction lookups:
    • GRAPHQL_ROOT_TX_GATEWAYS_URLS: JSON object mapping GraphQL endpoints to weights (default: {"https://arweave-search.goldsky.com/graphql": 1})
    • Parent chain traversal with metadata extraction (content type, size)
    • Fallback mechanism when Turbo is unavailable
    • Configurable lookup order via ROOT_TX_LOOKUP_ORDER (default: "db,turbo")
  • Database Migration: Added offset tracking columns to contiguous_data_ids table:
    • root_transaction_id: Top-level Arweave transaction containing the data
    • root_data_item_offset: Absolute position where data item headers begin in root bundle
    • root_data_offset: Absolute position where data payload begins in root bundle
  • HTTP Headers: New headers exposing absolute root offset information:
    • X-AR-IO-Root-Data-Item-Offset: Enables direct byte-range requests to data item headers
    • X-AR-IO-Root-Data-Offset: Enables direct byte-range requests to data payloads
  • Outbound Rate Limiting for External APIs: Token bucket rate limiting for outbound calls to Turbo and GraphQL services (separate from the Redis-based inbound rate limiter added in Release 52):
    • Turbo API: Configurable via TURBO_ROOT_TX_RATE_LIMIT_BURST_SIZE (default: 5), TURBO_ROOT_TX_RATE_LIMIT_TOKENS_PER_INTERVAL (default: 6), TURBO_ROOT_TX_RATE_LIMIT_INTERVAL (default: "minute")
    • GraphQL API: Configurable via GRAPHQL_ROOT_TX_RATE_LIMIT_BURST_SIZE (default: 5), GRAPHQL_ROOT_TX_RATE_LIMIT_TOKENS_PER_INTERVAL (default: 6), GRAPHQL_ROOT_TX_RATE_LIMIT_INTERVAL (default: "minute")
    • Prevents excessive API usage and respects external service limits (the defaults allow 6 requests per minute, or one every 10 seconds on average)
  • Configuration Options:
    • ENABLE_DATA_ITEM_ROOT_TX_SEARCH: Enable/disable root transaction search for data items in offset-aware sources (default: true)
    • ENABLE_PASSTHROUGH_WITHOUT_OFFSETS: Control whether offset-aware sources allow data retrieval without offset information (default: true)
    • Dedicated rate limiting configuration for Turbo and GraphQL root TX lookups
    • Separate GraphQL gateway configuration for root lookups vs data retrieval
  • Documentation and Testing:
    • Comprehensive bundle offsets documentation in docs/drafts/bundle-offsets.md
    • Rate limiting behavior tests validating token accumulation and request delays
    • Enhanced test coverage for offset tracking and nested bundle scenarios
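
The cumulative offset tracking described above, with cycle detection and a 10-level depth limit, can be sketched roughly as follows. This is an illustration under stated assumptions, not the node's actual implementation: `ParentLink`, `resolveRootOffsets`, `byteRange`, and the in-memory `links` map are hypothetical names, the map stands in for whatever Turbo, GraphQL, or the database would return, and how the real code counts nesting levels is not stated in these notes.

```typescript
// Hypothetical: where a data item sits inside its immediate parent bundle.
interface ParentLink {
  parentId: string;   // enclosing bundle (a data item or the root transaction)
  itemOffset: number; // start of the item's header, relative to the parent's payload
  dataOffset: number; // start of the item's payload, relative to the parent's payload
}

interface RootOffsets {
  rootTransactionId: string;
  rootDataItemOffset: number; // absolute header position in the root bundle
  rootDataOffset: number;     // absolute payload position in the root bundle
}

// Walk up the parent chain, adding each ancestor's payload offset, until an ID
// with no known parent is reached; that ID is treated as the root transaction.
function resolveRootOffsets(
  id: string,
  links: ReadonlyMap<string, ParentLink>,
  maxDepth: number = 10,
): RootOffsets {
  const first = links.get(id);
  if (first === undefined) throw new Error(`${id} has no known parent`);

  const visited = new Set<string>([id]);
  let itemOffset = first.itemOffset;
  let dataOffset = first.dataOffset;
  let current = first.parentId;
  let depth = 1;

  while (true) {
    if (visited.has(current)) throw new Error(`cycle detected at ${current}`);
    visited.add(current);
    const link = links.get(current);
    if (link === undefined) break; // no parent: current is the root transaction
    depth += 1;
    if (depth > maxDepth) throw new Error(`nesting deeper than ${maxDepth} levels`);
    // The child's offsets were relative to this ancestor's payload, so shift
    // both by where that payload starts inside the next parent up.
    itemOffset += link.dataOffset;
    dataOffset += link.dataOffset;
    current = link.parentId;
  }
  return { rootTransactionId: current, rootDataItemOffset: itemOffset, rootDataOffset: dataOffset };
}

// Build an HTTP Range header value for `size` bytes starting at an absolute offset.
function byteRange(offset: number, size: number): string {
  if (offset < 0 || size <= 0) throw new Error("offset must be >= 0 and size > 0");
  return `bytes=${offset}-${offset + size - 1}`;
}

// Item B is nested in bundle A, which is nested in root transaction ROOT.
const links = new Map<string, ParentLink>([
  ["B", { parentId: "A", itemOffset: 100, dataOffset: 1200 }],
  ["A", { parentId: "ROOT", itemOffset: 64, dataOffset: 1100 }],
]);
const resolved = resolveRootOffsets("B", links);
console.log(resolved.rootTransactionId);  // ROOT
console.log(resolved.rootDataItemOffset); // 1200 (100 + 1100)
console.log(resolved.rootDataOffset);     // 2300 (1200 + 1100)
console.log(byteRange(resolved.rootDataOffset, 500)); // bytes=2300-2799
```

A client that reads X-AR-IO-Root-Data-Offset from a response could, under the same assumptions, pass that value and the payload size to a helper like `byteRange` to request the payload directly from the root transaction's data.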

Changed

  • Observer: Increased OFFSET_SAMPLE_COUNT default from 3 to 4 to improve chunk validation success rate with early stopping
  • Increased rate limiter defaults to accommodate larger response payloads:
    • RATE_LIMITER_RESOURCE_TOKENS_PER_BUCKET: 10,000 → 1,000,000 tokens (~10 MB → ~976 MB bucket capacity)
    • RATE_LIMITER_IP_TOKENS_PER_BUCKET: 2,000 → 100,000 tokens (~2 MB → ~98 MB bucket capacity)
    • Resource refill rate remains 100 tokens/sec (~100 KB/sec)
    • IP refill rate remains 20 tokens/sec (~20 KB/sec)
    • Note: 1 token = 1 KB of response data, minimum 1 token per request
    • Rate limiter remains disabled by default (ENABLE_RATE_LIMITER=false)
  • Performance Optimization: RootParentDataSource now uses pre-computed root offsets when available:
    • Skip bundle traversal entirely when offsets are cached in database
    • Direct offset-based data retrieval without parent chain traversal
    • Use rootDataOffset to skip headers when fetching data payloads
    • Significantly reduces latency for nested bundle data retrieval
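
The token figures in this release, both the outbound limiter defaults listed under Added and the inbound bucket sizes above, follow ordinary token bucket arithmetic. The sketch below is a minimal illustration, not the node's limiter: `TokenBucket`, `tokensForResponse`, and `bucketCapacityMiB` are hypothetical names, the clock is passed in explicitly to keep the example deterministic, and it assumes that "1 KB" means 1,024 bytes, that partial kilobytes round up, and that buckets start full (the release notes state none of these).

```typescript
// Hypothetical token bucket with an injected clock in milliseconds.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly tokensPerInterval: number;
  private readonly intervalMs: number;

  constructor(capacity: number, tokensPerInterval: number, intervalMs: number, nowMs: number) {
    this.capacity = capacity;
    this.tokensPerInterval = tokensPerInterval;
    this.intervalMs = intervalMs;
    this.tokens = capacity; // assumption: buckets start full
    this.lastRefillMs = nowMs;
  }

  // Add tokens earned since the last call, capped at the bucket capacity.
  private refill(nowMs: number): void {
    const elapsedMs = nowMs - this.lastRefillMs;
    const earned = (elapsedMs * this.tokensPerInterval) / this.intervalMs;
    this.tokens = Math.min(this.capacity, this.tokens + earned);
    this.lastRefillMs = nowMs;
  }

  // Spend `cost` tokens if available; returns false when the caller must wait.
  tryRemove(nowMs: number, cost: number = 1): boolean {
    this.refill(nowMs);
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }

  // Milliseconds until `cost` tokens are available.
  msUntilAvailable(nowMs: number, cost: number = 1): number {
    this.refill(nowMs);
    if (this.tokens >= cost) return 0;
    return Math.ceil(((cost - this.tokens) * this.intervalMs) / this.tokensPerInterval);
  }
}

// Assumption: 1 token covers 1,024 bytes, with a minimum of 1 token per request.
const BYTES_PER_TOKEN = 1024;
function tokensForResponse(bytes: number): number {
  return Math.max(1, Math.ceil(bytes / BYTES_PER_TOKEN));
}

// Bucket capacity in MiB at 1,024 bytes per token.
function bucketCapacityMiB(tokensPerBucket: number): number {
  return (tokensPerBucket * BYTES_PER_TOKEN) / (1024 * 1024);
}

// Outbound defaults: burst of 5, then 6 tokens per minute.
const outbound = new TokenBucket(5, 6, 60_000, 0);
for (let i = 0; i < 5; i++) outbound.tryRemove(0); // the burst is spent immediately
console.log(outbound.tryRemove(0));        // false: the bucket is empty
console.log(outbound.msUntilAvailable(0)); // 10000: one token every 10 seconds

// Inbound resource bucket default: 1,000,000 tokens.
console.log(bucketCapacityMiB(1_000_000));       // 976.5625
console.log(tokensForResponse(5 * 1024 * 1024)); // 5120 tokens for a 5 MiB response
```

At the unchanged refill rate of 100 tokens/sec, an emptied 1,000,000-token resource bucket would take 10,000 seconds (about 2.8 hours) to refill completely, so the larger defaults mainly raise burst capacity rather than sustained throughput.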

Fixed

  • Security: Resolved transitive dependency vulnerabilities by adding yarn resolutions:
    • ws@7.5.10: Fixed DoS vulnerability when handling requests with many HTTP headers (CVE in ws <7.5.10)
    • semver@7.6.3: Fixed Regular Expression Denial of Service (ReDoS) vulnerability (CVE in semver <7.5.2)

Docker Images

  • ar-io-envoy: ghcr.io/ar-io/ar-io-envoy:159d6467108122a3413c5ab45150d334dc9fb78f
  • ar-io-core: ghcr.io/ar-io/ar-io-core:3a1db3ee7f73ae436dec2c11fa502efbdcaf4b9a
  • ar-io-observer: ghcr.io/ar-io/ar-io-observer:d21ea765b0dae92154439394a878e5d857d24dc3
  • ar-io-clickhouse-auto-import: ghcr.io/ar-io/ar-io-clickhouse-auto-import:4512361f3d6bdc0d8a44dd83eb796fd88804a384
  • ar-io-litestream: ghcr.io/ar-io/ar-io-litestream:be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8
