This is an optional release that introduces root transaction offset tracking for nested bundles and observer performance improvements. The release enables more efficient data retrieval through comprehensive offset tracking with Turbo and GraphQL integration, while improving observer reliability with increased chunk validation success rates.
Added
- Root Transaction and Offset Tracking: Comprehensive offset tracking system for nested ANS-104 bundles:
  - Turbo `/offsets` endpoint integration for accurate root transaction discovery and offset calculations
  - Handles multi-level nested bundles with cumulative offset tracking
  - Cycle detection and maximum nesting depth protection (10 levels)
  - Database persistence of root transaction IDs and absolute offset values
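As an illustration of cumulative offset tracking with cycle detection and a depth limit, here is a minimal sketch. It is not the gateway's implementation: the `ParentLink` record, the `parents` mapping, and the function name are hypothetical, and it assumes each level reports its offset relative to its direct parent. Only the 10-level limit comes from these notes.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

MAX_NESTING_DEPTH = 10  # limit stated in the release notes


@dataclass(frozen=True)
class ParentLink:
    """Hypothetical record: where an item sits inside its direct parent."""
    parent_id: str
    offset_in_parent: int  # bytes from the start of the parent to this item


def resolve_root(item_id: str, parents: Dict[str, ParentLink]) -> Tuple[str, int]:
    """Return (root_id, absolute_offset) for item_id.

    An ID with no entry in `parents` is treated as a root transaction.
    """
    seen = {item_id}
    current = item_id
    offset = 0
    depth = 0
    while current in parents:
        if depth >= MAX_NESTING_DEPTH:
            raise ValueError("maximum nesting depth exceeded")
        link = parents[current]
        if link.parent_id in seen:
            raise ValueError("cycle detected in parent chain")
        seen.add(link.parent_id)
        offset += link.offset_in_parent  # accumulate across nesting levels
        current = link.parent_id
        depth += 1
    return current, offset
```

For example, an item 200 bytes into a bundle that itself sits 1,000 bytes into the root resolves to the root ID with an absolute offset of 1,200.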
- GraphQL Root TX Index: Dedicated GraphQL endpoint configuration for root transaction lookups:
  - `GRAPHQL_ROOT_TX_GATEWAYS_URLS`: JSON object mapping GraphQL endpoints to weights (default: `{"https://arweave-search.goldsky.com/graphql": 1}`)
  - Parent chain traversal with metadata extraction (content type, size)
  - Fallback mechanism when Turbo is unavailable
  - Configurable lookup order via `ROOT_TX_LOOKUP_ORDER` (default: "db,turbo")
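The two settings above could be read and applied roughly as follows. This is a sketch under assumptions, not the gateway's code: the helper names, the `sources` mapping of source name to lookup function, and the rule that a failing source simply falls through to the next one are all hypothetical. Only the variable names and defaults come from these notes, and how the weights are used is not shown here.

```python
import json
from typing import Callable, Dict, List, Mapping, Optional

DEFAULT_LOOKUP_ORDER = "db,turbo"
DEFAULT_GATEWAYS = '{"https://arweave-search.goldsky.com/graphql": 1}'

Lookup = Callable[[str], Optional[str]]  # hypothetical: item ID -> root TX ID


def parse_lookup_order(env: Mapping[str, str]) -> List[str]:
    """Split ROOT_TX_LOOKUP_ORDER into an ordered list of source names."""
    raw = env.get("ROOT_TX_LOOKUP_ORDER", DEFAULT_LOOKUP_ORDER)
    return [name.strip() for name in raw.split(",") if name.strip()]


def parse_gateways(env: Mapping[str, str]) -> Dict[str, int]:
    """Parse GRAPHQL_ROOT_TX_GATEWAYS_URLS into a URL -> weight mapping."""
    parsed = json.loads(env.get("GRAPHQL_ROOT_TX_GATEWAYS_URLS", DEFAULT_GATEWAYS))
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object of URL -> weight")
    return {str(url): int(weight) for url, weight in parsed.items()}


def find_root(item_id: str, order: List[str], sources: Mapping[str, Lookup]) -> Optional[str]:
    """Try each configured source in order; fall back when one fails or misses."""
    for name in order:
        lookup = sources.get(name)
        if lookup is None:
            continue  # name in the order list with no registered source
        try:
            result = lookup(item_id)
        except Exception:
            continue  # assumption: an unavailable source is skipped
        if result is not None:
            return result
    return None
```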
- Database Migration: Added offset tracking columns to the `contiguous_data_ids` table:
  - `root_transaction_id`: Top-level Arweave transaction containing the data
  - `root_data_item_offset`: Absolute position where data item headers begin in root bundle
  - `root_data_offset`: Absolute position where data payload begins in root bundle
- HTTP Headers: New headers exposing absolute root offset information:
  - `X-AR-IO-Root-Data-Item-Offset`: Enables direct byte-range requests to data item headers
  - `X-AR-IO-Root-Data-Offset`: Enables direct byte-range requests to data payloads
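To show how a client might turn these two headers into byte ranges, here is a hedged sketch. The helper names are hypothetical, the payload size is assumed to be known from elsewhere (for example a `Content-Length` value), and the header range assumes the item's headers end exactly where the payload begins. Where the range request is sent is outside the scope of the sketch.

```python
from typing import Mapping, Tuple

ITEM_OFFSET_HEADER = "x-ar-io-root-data-item-offset"
DATA_OFFSET_HEADER = "x-ar-io-root-data-offset"


def _offsets(headers: Mapping[str, str]) -> Tuple[int, int]:
    """Read both offsets; HTTP header names are case-insensitive."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return int(lowered[ITEM_OFFSET_HEADER]), int(lowered[DATA_OFFSET_HEADER])


def byte_range(start: int, length: int) -> str:
    """Build a Range header value; HTTP byte ranges are inclusive."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    return f"bytes={start}-{start + length - 1}"


def header_range(headers: Mapping[str, str]) -> str:
    """Range covering the data item headers (item offset up to payload start)."""
    item_offset, data_offset = _offsets(headers)
    return byte_range(item_offset, data_offset - item_offset)


def payload_range(headers: Mapping[str, str], payload_size: int) -> str:
    """Range covering the data payload only."""
    _, data_offset = _offsets(headers)
    return byte_range(data_offset, payload_size)
```

With an item offset of 1200, a data offset of 2300, and a 500-byte payload, the header range is `bytes=1200-2299` and the payload range is `bytes=2300-2799`.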
- Outbound Rate Limiting for External APIs: Token bucket rate limiting for outbound calls to Turbo and GraphQL services (separate from the Redis-based inbound rate limiter added in Release 52):
  - Turbo API: Configurable via `TURBO_ROOT_TX_RATE_LIMIT_BURST_SIZE` (default: 5), `TURBO_ROOT_TX_RATE_LIMIT_TOKENS_PER_INTERVAL` (default: 6), `TURBO_ROOT_TX_RATE_LIMIT_INTERVAL` (default: "minute")
  - GraphQL API: Configurable via `GRAPHQL_ROOT_TX_RATE_LIMIT_BURST_SIZE` (default: 5), `GRAPHQL_ROOT_TX_RATE_LIMIT_TOKENS_PER_INTERVAL` (default: 6), `GRAPHQL_ROOT_TX_RATE_LIMIT_INTERVAL` (default: "minute")
  - Prevents excessive API usage and respects external service limits (defaults to 6 requests per minute, i.e. 1 per 10 seconds)
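The default outbound limits (burst size 5, 6 tokens per minute) behave roughly like the generic token bucket below. This is a sketch, not the gateway's limiter: the class, its method names, the assumption that the bucket starts full, and the `"second"` and `"hour"` interval values are all illustrative. Only the three default values come from these notes.

```python
import time
from typing import Callable

# Only "minute" appears in the release notes; the other keys are assumptions.
INTERVAL_SECONDS = {"second": 1.0, "minute": 60.0, "hour": 3600.0}


class TokenBucket:
    def __init__(
        self,
        burst_size: int = 5,
        tokens_per_interval: int = 6,
        interval: str = "minute",
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = float(burst_size)
        self.rate = tokens_per_interval / INTERVAL_SECONDS[interval]  # tokens/sec
        self.clock = clock
        self.tokens = self.capacity  # assumption: the bucket starts full
        self.updated = clock()

    def _refill(self) -> None:
        """Add tokens for the time elapsed, capped at the burst size."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self) -> bool:
        """Spend one token for one outbound request, if one is available."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def seconds_until_ready(self) -> float:
        """How long a caller would wait before the next request is allowed."""
        self._refill()
        return max(0.0, (1.0 - self.tokens) / self.rate)
```

With the defaults, five requests can go out back to back; once the bucket is empty, a new token accumulates every 10 seconds.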
- Configuration Options:
  - `ENABLE_DATA_ITEM_ROOT_TX_SEARCH`: Enable/disable root transaction search for data items in offset-aware sources (default: true)
  - `ENABLE_PASSTHROUGH_WITHOUT_OFFSETS`: Control whether offset-aware sources allow data retrieval without offset information (default: true)
  - Dedicated rate limiting configuration for Turbo and GraphQL root TX lookups
  - Separate GraphQL gateway configuration for root lookups vs data retrieval
- Documentation and Testing:
  - Comprehensive bundle offsets documentation in `docs/drafts/bundle-offsets.md`
  - Rate limiting behavior tests validating token accumulation and request delays
  - Enhanced test coverage for offset tracking and nested bundle scenarios
Changed
- Observer: Increased `OFFSET_SAMPLE_COUNT` default from 3 to 4 to improve chunk validation success rate with early stopping
- Increased rate limiter defaults to accommodate larger response payloads:
  - `RATE_LIMITER_RESOURCE_TOKENS_PER_BUCKET`: 10,000 → 1,000,000 tokens (~10 MB → ~976 MB bucket capacity)
  - `RATE_LIMITER_IP_TOKENS_PER_BUCKET`: 2,000 → 100,000 tokens (~2 MB → ~98 MB bucket capacity)
  - Resource refill rate remains 100 tokens/sec (~100 KB/sec)
  - IP refill rate remains 20 tokens/sec (~20 KB/sec)
  - Note: 1 token = 1 KB of response data, minimum 1 token per request
  - Rate limiter remains disabled by default (`ENABLE_RATE_LIMITER=false`)
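The token arithmetic behind the rate limiter defaults can be checked with a few lines. This sketch assumes that "1 KB" means 1,024 bytes and that partial kilobytes round up; neither detail is stated in these notes, and the function names are illustrative.

```python
import math

BYTES_PER_TOKEN = 1024  # assumption: "1 KB" is read as 1,024 bytes


def tokens_for_response(size_bytes: int) -> int:
    """Tokens charged for a response: 1 per KB, minimum 1 per request."""
    return max(1, math.ceil(size_bytes / BYTES_PER_TOKEN))


def bucket_capacity_mib(tokens: int) -> float:
    """Response data, in MiB, that a full bucket of `tokens` can cover."""
    return tokens * BYTES_PER_TOKEN / (1024 * 1024)


def seconds_to_refill(tokens: int, tokens_per_second: int) -> float:
    """Time for an empty bucket to regain `tokens` at the given refill rate."""
    return tokens / tokens_per_second
```

Under these assumptions, `bucket_capacity_mib(1_000_000)` is 976.5625 and `bucket_capacity_mib(100_000)` is 97.65625, matching the approximate capacities listed above. An empty resource bucket would take `seconds_to_refill(1_000_000, 100)`, or 10,000 seconds, to refill completely.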
- Performance Optimization: RootParentDataSource now uses pre-computed root offsets when available:
  - Skip bundle traversal entirely when offsets are cached in database
  - Direct offset-based data retrieval without parent chain traversal
  - Use `rootDataOffset` to skip headers when fetching data payloads
  - Significantly reduces latency for nested bundle data retrieval
Fixed
- Security: Resolved transitive dependency vulnerabilities by adding yarn resolutions:
  - `ws@7.5.10`: Fixed DoS vulnerability when handling requests with many HTTP headers (CVE in ws <7.5.10)
  - `semver@7.6.3`: Fixed Regular Expression Denial of Service (ReDoS) vulnerability (CVE in semver <7.5.2)
Docker Images
- ar-io-envoy: `ghcr.io/ar-io/ar-io-envoy:159d6467108122a3413c5ab45150d334dc9fb78f`
- ar-io-core: `ghcr.io/ar-io/ar-io-core:3a1db3ee7f73ae436dec2c11fa502efbdcaf4b9a`
- ar-io-observer: `ghcr.io/ar-io/ar-io-observer:d21ea765b0dae92154439394a878e5d857d24dc3`
- ar-io-clickhouse-auto-import: `ghcr.io/ar-io/ar-io-clickhouse-auto-import:4512361f3d6bdc0d8a44dd83eb796fd88804a384`
- ar-io-litestream: `ghcr.io/ar-io/ar-io-litestream:be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8`