ar-io/ar-io-node r53 on GitHub

This is an optional release that introduces root transaction offset tracking for nested bundles and observer performance improvements. The release enables more efficient data retrieval through comprehensive offset tracking with Turbo and GraphQL integration, while improving observer reliability with increased chunk validation success rates.

Added

Root Transaction and Offset Tracking: Comprehensive offset tracking system for nested ANS-104 bundles:
- Turbo /offsets endpoint integration for accurate root transaction discovery and offset calculations
- Handles multi-level nested bundles with cumulative offset tracking
- Cycle detection and maximum nesting depth protection (10 levels)
- Database persistence of root transaction IDs and absolute offset values
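The cumulative offset calculation with cycle and depth protection can be sketched as follows. This is a minimal illustration, not the node's implementation: the `ParentLink` record, the `resolve_root_offsets` function, and the in-memory `links` mapping are hypothetical, and the sketch assumes each item records where its headers start relative to its parent's payload, plus its own header size.

```python
from typing import NamedTuple

MAX_NESTING_DEPTH = 10  # depth limit stated in the release notes


class ParentLink(NamedTuple):
    """Hypothetical record describing one level of nesting."""
    parent_id: str
    offset_in_parent: int  # where this item's headers start, relative to the parent's payload
    header_size: int       # bytes of headers preceding this item's payload


def resolve_root_offsets(item_id: str, links: dict[str, ParentLink]) -> tuple[str, int, int]:
    """Return (root_id, absolute item offset, absolute payload offset)."""
    seen = set()
    current = item_id
    item_offset = 0
    own_header = links[item_id].header_size if item_id in links else 0
    depth = 0
    while current in links:  # IDs without a link are treated as root transactions
        if current in seen:
            raise ValueError(f"cycle detected at {current}")
        if depth >= MAX_NESTING_DEPTH:
            raise ValueError("maximum nesting depth exceeded")
        seen.add(current)
        link = links[current]
        item_offset += link.offset_in_parent
        if current != item_id:
            # Skip the ancestor's own headers to reach the start of its payload.
            item_offset += link.header_size
        current = link.parent_id
        depth += 1
    return current, item_offset, item_offset + own_header
```

In this model, an item at offset 500 inside a bundle whose headers start at offset 1000 (with 120 bytes of headers) has its own headers at absolute offset 1620 in the root transaction.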
GraphQL Root TX Index: Dedicated GraphQL endpoint configuration for root transaction lookups:
- GRAPHQL_ROOT_TX_GATEWAYS_URLS: JSON object mapping GraphQL endpoints to weights (default: {"https://arweave-search.goldsky.com/graphql": 1})
- Parent chain traversal with metadata extraction (content type, size)
- Fallback mechanism when Turbo is unavailable
- Configurable lookup order via ROOT_TX_LOOKUP_ORDER (default: "db,turbo")
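An ordered lookup with fallback, as configured by ROOT_TX_LOOKUP_ORDER, might look roughly like the sketch below. The `find_root_tx` function and the `sources` mapping are hypothetical, and treating a failing source as "not found" is an assumption; the notes only name "db" and "turbo" as values in the default order.

```python
from typing import Callable, Optional

# Hypothetical lookup signature: returns a root transaction ID, or None if unknown.
Lookup = Callable[[str], Optional[str]]


def find_root_tx(
    data_item_id: str,
    lookup_order: str,
    sources: dict[str, Lookup],
) -> Optional[tuple[str, str]]:
    """Try each named source in order; return (source name, root TX ID) or None."""
    for name in (part.strip() for part in lookup_order.split(",")):
        lookup = sources.get(name)
        if lookup is None:
            continue  # skip empty or unrecognized names (assumed behavior)
        try:
            root_id = lookup(data_item_id)
        except Exception:
            root_id = None  # assumption: an unavailable source falls through to the next
        if root_id is not None:
            return name, root_id
    return None
```

With the default order "db,turbo", a cached database result is returned without any outbound call, and the next source is consulted only on a miss.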
Database Migration: Added offset tracking columns to contiguous_data_ids table:
- root_transaction_id: Top-level Arweave transaction containing the data
- root_data_item_offset: Absolute position where data item headers begin in root bundle
- root_data_offset: Absolute position where data payload begins in root bundle
HTTP Headers: New headers exposing absolute root offset information:
- X-AR-IO-Root-Data-Item-Offset: Enables direct byte-range requests to data item headers
- X-AR-IO-Root-Data-Offset: Enables direct byte-range requests to data payloads
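As a rough illustration of how a client could use these headers, the sketch below builds an HTTP Range header value for a data payload. The `range_header_for_payload` function is hypothetical, and it assumes the payload size is known separately (for example from the response's content length); the notes do not say which endpoint the ranged request should be sent to.

```python
def range_header_for_payload(response_headers: dict[str, str], payload_size: int) -> str:
    """Build a Range header value covering the payload inside the root transaction."""
    if payload_size <= 0:
        raise ValueError("payload_size must be positive")
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    start = int(lowered["x-ar-io-root-data-offset"])
    end = start + payload_size - 1  # HTTP byte ranges are inclusive on both ends
    return f"bytes={start}-{end}"
```

The same approach applies to X-AR-IO-Root-Data-Item-Offset when the headers of the data item are wanted rather than the payload.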
Outbound Rate Limiting for External APIs: Token bucket rate limiting for outbound calls to Turbo and GraphQL services (separate from the Redis-based inbound rate limiter added in Release 52):
- Turbo API: Configurable via TURBO_ROOT_TX_RATE_LIMIT_BURST_SIZE (default: 5), TURBO_ROOT_TX_RATE_LIMIT_TOKENS_PER_INTERVAL (default: 6), TURBO_ROOT_TX_RATE_LIMIT_INTERVAL (default: "minute")
- GraphQL API: Configurable via GRAPHQL_ROOT_TX_RATE_LIMIT_BURST_SIZE (default: 5), GRAPHQL_ROOT_TX_RATE_LIMIT_TOKENS_PER_INTERVAL (default: 6), GRAPHQL_ROOT_TX_RATE_LIMIT_INTERVAL (default: "minute")
- Prevents excessive API usage and respects external service limits: the defaults allow a sustained 6 requests per minute (one every 10 seconds) with bursts of up to 5 requests
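The token bucket behavior behind these defaults can be sketched as follows. This is a generic token bucket under stated assumptions, not the node's code: the `TokenBucket` class is hypothetical, the bucket is assumed to start full, and time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Generic token bucket using the release's default values."""

    def __init__(self, burst_size: int = 5, tokens_per_interval: int = 6,
                 interval_seconds: float = 60.0, now: float = 0.0) -> None:
        self.capacity = float(burst_size)
        self.rate = tokens_per_interval / interval_seconds  # tokens added per second
        self.tokens = float(burst_size)  # assumption: the bucket starts full
        self.updated = now

    def _refill(self, now: float) -> None:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self, now: float) -> bool:
        """Consume one token if available; return whether the call may proceed."""
        self._refill(now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def seconds_until_next(self, now: float) -> float:
        """How long a caller would need to wait for the next token."""
        self._refill(now)
        if self.tokens >= 1:
            return 0.0
        return (1 - self.tokens) / self.rate
```

With the defaults, five calls can go out back to back, after which a caller waits about 10 seconds for each further request.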
Configuration Options:
- ENABLE_DATA_ITEM_ROOT_TX_SEARCH: Enable/disable root transaction search for data items in offset-aware sources (default: true)
- ENABLE_PASSTHROUGH_WITHOUT_OFFSETS: Control whether offset-aware sources allow data retrieval without offset information (default: true)
- Dedicated rate limiting configuration for Turbo and GraphQL root TX lookups
- Separate GraphQL gateway configuration for root lookups vs data retrieval
Documentation and Testing:
- Comprehensive bundle offsets documentation in docs/drafts/bundle-offsets.md
- Rate limiting behavior tests validating token accumulation and request delays
- Enhanced test coverage for offset tracking and nested bundle scenarios

Changed

Observer: Increased OFFSET_SAMPLE_COUNT default from 3 to 4 to improve chunk validation success rate with early stopping
Rate Limiter: Increased default bucket sizes to accommodate larger response payloads:
- RATE_LIMITER_RESOURCE_TOKENS_PER_BUCKET: 10,000 → 1,000,000 tokens (~10 MB → ~976 MB bucket capacity)
- RATE_LIMITER_IP_TOKENS_PER_BUCKET: 2,000 → 100,000 tokens (~2 MB → ~98 MB bucket capacity)
- Resource refill rate remains 100 tokens/sec (~100 KB/sec)
- IP refill rate remains 20 tokens/sec (~20 KB/sec)
- Note: 1 token = 1 KB of response data, minimum 1 token per request
- Rate limiter remains disabled by default (ENABLE_RATE_LIMITER=false)
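The token arithmetic above can be checked with a short sketch. The function names are hypothetical, and two details are assumptions: that "1 KB" means 1024 bytes and that partial kilobytes round up.

```python
import math

BYTES_PER_TOKEN = 1024  # assumption: "1 KB" means 1024 bytes


def tokens_for_response(size_bytes: int) -> int:
    """Tokens charged for a response: 1 per KB, with a minimum of 1 per request."""
    # Rounding partial kilobytes up is an assumption, not stated in the notes.
    return max(1, math.ceil(size_bytes / BYTES_PER_TOKEN))


def bucket_capacity_mib(tokens: int) -> float:
    """Response data (in MiB) that a full bucket of the given size can cover."""
    return tokens * BYTES_PER_TOKEN / (1024 * 1024)
```

Under these assumptions, the new resource bucket of 1,000,000 tokens covers about 976 MiB and the new IP bucket of 100,000 tokens covers about 98 MiB, matching the figures listed above.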
Performance Optimization: RootParentDataSource now uses pre-computed root offsets when available:
- Skip bundle traversal entirely when offsets are cached in database
- Direct offset-based data retrieval without parent chain traversal
- Use rootDataOffset to skip headers when fetching data payloads
- Significantly reduces latency for nested bundle data retrieval

Fixed

Security: Resolved transitive dependency vulnerabilities by adding yarn resolutions:
- ws@7.5.10: Fixes a DoS vulnerability when handling requests with many HTTP headers (CVE-2024-37890, affecting ws <7.5.10)
- semver@7.6.3: Fixes a Regular Expression Denial of Service (ReDoS) vulnerability (CVE-2022-25883, affecting semver <7.5.2)

Docker Images

ar-io-envoy: ghcr.io/ar-io/ar-io-envoy:159d6467108122a3413c5ab45150d334dc9fb78f
ar-io-core: ghcr.io/ar-io/ar-io-core:3a1db3ee7f73ae436dec2c11fa502efbdcaf4b9a
ar-io-observer: ghcr.io/ar-io/ar-io-observer:d21ea765b0dae92154439394a878e5d857d24dc3
ar-io-clickhouse-auto-import: ghcr.io/ar-io/ar-io-clickhouse-auto-import:4512361f3d6bdc0d8a44dd83eb796fd88804a384
ar-io-litestream: ghcr.io/ar-io/ar-io-litestream:be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8
