This is a recommended release focused on cross-gateway GraphQL fan-out, ClickHouse query-path hardening, and composite query resilience. Key highlights include `GatewaysGqlQueryable`, a new adapter that fans GraphQL queries out to configured upstream ar-io-node gateways and merges the results, letting a node compose its local index with broader upstream coverage, and a parallelized composite ClickHouse/SQLite GraphQL path protected by a SQLite circuit breaker that surfaces `PARTIAL_RESULT` warnings via `extensions.warnings` instead of silent partials. ClickHouse gets several query-path improvements: dropping `FINAL` in favor of `LIMIT 1 BY` dedupe to re-enable projection planning, a new `owner_address` bloom filter with projection skipping on tag filters, a `tag_names`/`tag_values` fix for `owner_projection`, a configurable query timeout (default 3s), and a `max_rows_to_read` guardrail that fails noisy full scans fast. The release also adds per-job status tracking to the Parquet export admin API and bundles an Observer update to `ddd3a9c` with reference-gateway chunk-header offset validation and continuous-observer reliability hardening, alongside a set of ClickHouse auto-import reliability fixes.
### Added
- **Fan-Out GraphQL Over Upstream Gateways (`GatewaysGqlQueryable`):** A new `GqlQueryable` adapter fans GraphQL queries out to configured upstream ar-io-node gateways and merges the results, letting a node act as a thin fan-out proxy or compose its local index with upstream sources for broader coverage. Single-record queries use first-non-null resolution; connection queries k-way merge by the ar-io-node cursor tuple and dedupe by id. Per-endpoint circuit breakers isolate slow or failing upstreams. Configured via `GATEWAYS_GQL_URLS`; disabled by default.
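The connection-merge semantics above can be sketched as follows. This is a minimal illustration under assumed names: `Edge`, its cursor-tuple fields, and `mergeConnections` are hypothetical, not the gateway's actual types.

```typescript
// Minimal sketch of the k-way merge + dedupe-by-id described above.
// Edge and its fields are illustrative stand-ins for the cursor tuple.
interface Edge {
  id: string;
  height: number;
  blockTransactionIndex: number;
}

// Sort all upstream result sets by the cursor tuple (descending height,
// then block position), then keep the first occurrence of each id.
function mergeConnections(sources: Edge[][], limit: number): Edge[] {
  const seen = new Set<string>();
  return sources
    .flat()
    .sort(
      (a, b) =>
        b.height - a.height ||
        b.blockTransactionIndex - a.blockTransactionIndex,
    )
    .filter((e) => !seen.has(e.id) && (seen.add(e.id), true))
    .slice(0, limit);
}
```

A production merge would stream cursors from each upstream rather than materialize and sort, but the ordering and dedupe rules are the same.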
- **Configurable ClickHouse GraphQL Query Timeout:** The ClickHouse GQL backend now applies a configurable timeout both server-side (as `max_execution_time`, so ClickHouse aborts runaway queries and frees resources) and client-side (as the HTTP `request_timeout`, with a 2s grace window so the server-side timeout error surfaces before the client aborts). Default 3s.
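The relationship between the two timeout layers can be sketched as below. The helper is illustrative; only the 3s default and the 2s grace window come from this entry.

```typescript
// Sketch of the two-layer timeout derivation described above.
const GRACE_MS = 2000;

function deriveTimeouts(queryTimeoutMs = 3000) {
  return {
    // Server-side: ClickHouse's max_execution_time setting is expressed
    // in seconds, so the server aborts runaway queries itself.
    maxExecutionTimeSec: Math.ceil(queryTimeoutMs / 1000),
    // Client-side: the HTTP request_timeout outlives the server timeout,
    // so the server's timeout error arrives before the client aborts.
    requestTimeoutMs: queryTimeoutMs + GRACE_MS,
  };
}
```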
- **`max_rows_to_read` Guardrail on ClickHouse GraphQL Queries:** Every GraphQL query against the ClickHouse `transactions` table now appends `SETTINGS max_rows_to_read = N`. Queries that would scan more than the configured threshold throw `Code: 158: Limit for rows ... exceeded` instead of silently scanning the whole table, which catches projection-shadowing bugs and planner regressions where a skip index is bypassed. Default 10M rows (~20% of the current table size); tunable via `CLICKHOUSE_GQL_MAX_ROWS_TO_READ`.
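Mechanically this is a settings clause appended to the generated SQL, roughly as sketched below (`appendMaxRowsGuard` is an illustrative helper, not the gateway's actual code; the default is configured via `CLICKHOUSE_GQL_MAX_ROWS_TO_READ`):

```typescript
// Sketch of the guardrail: cap the rows any generated query may scan.
// ClickHouse raises "Code: 158 ... Limit for rows ... exceeded" when a
// query would read more rows than this setting allows.
function appendMaxRowsGuard(sql: string, maxRows = 10_000_000): string {
  return `${sql} SETTINGS max_rows_to_read = ${maxRows}`;
}
```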
- **Per-Job Status Tracking for Parquet Export API:** `POST /ar-io/admin/export-parquet` now returns a `jobId`, and the exporter keeps a bounded per-job history (32 entries) so concurrent callers can each poll their own record at `GET /ar-io/admin/export-parquet/status/:jobId`. The legacy singleton status endpoint is retained for back-compat and still reflects the most recent update. `scripts/parquet-export` prefers the per-job endpoint when a `jobId` is returned and falls back to the singleton-with-drift-detection path for older gateways.
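The client-side endpoint selection can be sketched as below. The exact legacy singleton path is an assumption here; the per-job path comes from the entry above.

```typescript
// Sketch of the polling-endpoint preference described above: use the
// per-job record when the POST response includes a jobId, otherwise
// fall back to the legacy singleton status endpoint (path assumed).
function statusEndpoint(response: { jobId?: string }): string {
  return response.jobId !== undefined
    ? `/ar-io/admin/export-parquet/status/${response.jobId}`
    : '/ar-io/admin/export-parquet/status';
}
```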
### Changed
- **Observer Update to `ddd3a9c`:** Bundles two upstream PRs on top of the previous `21098d2` pin.
  - Reference-gateway chunk-header offset validation: The observer now HEADs the reference gateway's `/chunk/{offset}/data` and anchors the advertised `x-arweave-chunk-*` headers (tx id, boundaries, data root) to the chain via `/tx/{id}/offset` and `/tx/{id}`, replacing the block-and-tx binary search as the default offset-validation path. Typical cost drops from ~20–30 node lookups per offset to one HEAD plus two O(1) lookups per unique tx, with a per-tx LRU cache for repeated offsets. Any header/chain mismatch or missing header falls back to the legacy chain search, so older gateways keep working. A new metric `observer_chunk_metadata_anchor_total{result}` (hit / cache_hit / metadata_missing / mismatch / error / fallback) tracks the rollout. Gateways that return an HTTP error on the new probe are no longer blacklisted from the shared pool; only transport failures are.
  - Continuous observer reliability hardening: The per-gateway schedule map is replaced with a flat list of `ScheduledObservation` events so duplicates, restart catch-up, and overdue retries are deterministic (legacy state auto-migrates on load). An explicit submission deadline (`windowEnd + submissionBufferMs`) now bounds the epoch: once exceeded, the scheduler clears pending work, marks the epoch `expired`, and stops issuing observations instead of spinning on stale state. Finalization is gated on both the window being complete and the pending queue being empty, and only flips `reportSubmitted` on a successful submit so transient submit failures retry. Unsubmitted prior epochs are discarded on epoch transition rather than force-finalized into the wrong epoch.
  - Report telemetry: Reports now record each gateway's `release` field from `/ar-io/info`, a `yarn summarize` script prints pass/fail counts grouped by release, and offset rendering now shows `<failures>/<observed> (<pct>)` so the denominator reflects the sampled subset.
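The core of the anchoring check above is verifying that the probed offset falls inside the tx's weave range. A minimal sketch, assuming the field naming of Arweave's `/tx/{id}/offset` response (`offset` is the last byte of the tx data in the weave, `size` its length); the function itself is illustrative, not the observer's actual code:

```typescript
// Sketch: does the probed chunk offset fall inside the tx's weave range?
// A miss is a mismatch and triggers fallback to the legacy chain search.
function chunkOffsetAnchored(
  chunkAbsoluteOffset: number,
  txEndOffset: number, // "offset" from /tx/{id}/offset: last byte of tx data
  txSize: number, // "size" from /tx/{id}/offset: tx data length in bytes
): boolean {
  const txStartOffset = txEndOffset - txSize + 1;
  return (
    chunkAbsoluteOffset >= txStartOffset && chunkAbsoluteOffset <= txEndOffset
  );
}
```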
- **ClickHouse GraphQL Query No Longer Uses `FINAL`:** The composite ClickHouse backend previously issued `FROM transactions AS t FINAL` to deduplicate unmerged `ReplacingMergeTree` versions at read time. `FINAL` prevented `owner_projection` from being selected and forced a `PrimaryKeyExpand` that widened the skip-index-pruned granule set by ~4×. It is replaced with a `LIMIT 1 BY height, block_transaction_index, is_data_item, id` clause that dedupes in-engine as a post-sort filter without disabling projection planning or PREWHERE push-down. This is safe because Arweave transaction data is immutable: all versions of a given primary key are byte-identical by construction.
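The shape of the change, sketched as ClickHouse SQL with a simplified column list (not the gateway's exact generated query):

```sql
-- Before: read-time dedupe via FINAL, which disabled projection planning.
SELECT id, owner_address, height
FROM transactions AS t FINAL
WHERE owner_address = {owner:String}
ORDER BY height DESC, block_transaction_index DESC
LIMIT 100;

-- After: post-sort dedupe via LIMIT BY, keeping projection planning and
-- PREWHERE push-down intact.
SELECT id, owner_address, height
FROM transactions AS t
WHERE owner_address = {owner:String}
ORDER BY height DESC, block_transaction_index DESC
LIMIT 1 BY height, block_transaction_index, is_data_item, id
LIMIT 100;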
- **Composite ClickHouse GraphQL Parallelized With SQLite Circuit Breaker:** The `CompositeClickHouseDatabase` now runs its ClickHouse and SQLite legs concurrently instead of serially, and wraps the SQLite leg in an opossum circuit breaker. ClickHouse errors (timeout, `max_rows_to_read`) still propagate to the caller, while SQLite failures degrade the response to ClickHouse-only results with a `PARTIAL_RESULT` warning attached via GraphQL `extensions.warnings`, ending silent partials for tip-of-chain rows and for the single-record `transaction(id)` lookup, which previously returned a bare `null` when SQLite was unavailable. The ClickHouse max-height boundary-optimization cache is now read non-blocking from the request path, with a background refresh keeping it warm. Fan-out preserves warnings end-to-end: `RemoteGqlQueryable` pulls upstream `extensions.warnings` off each response, and `GatewaysGqlQueryable` merges them across sources and synthesizes `UPSTREAM_UNAVAILABLE`/`UPSTREAM_CIRCUIT_OPEN` warnings for partially-failed aggregates that were previously logged and dropped. New env vars under `CLICKHOUSE_SQLITE_CIRCUIT_BREAKER_*` (defaults: timeout 5000ms, error threshold 80%, reset timeout 60000ms, rolling window 30000ms).
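The degrade-with-warning flow (and the fail-fast ordering described under Fixed) can be sketched as below. Names are illustrative, and the real implementation routes the SQLite leg through an opossum circuit breaker rather than a bare promise; the key points are that the SQLite rejection is absorbed eagerly and ClickHouse errors rethrow immediately.

```typescript
// Sketch of running both legs concurrently, failing fast on ClickHouse
// and degrading to ClickHouse-only results when SQLite fails.
interface Warning {
  code: string;
  message: string;
}

async function queryBoth<T>(
  clickhouseLeg: Promise<T[]>,
  sqliteLeg: Promise<T[]>,
): Promise<{ rows: T[]; warnings: Warning[] }> {
  // Attach the catch up front so an early ClickHouse throw cannot leave
  // the SQLite leg as an unhandled rejection.
  const sqliteSettled = sqliteLeg.then(
    (rows) => ({ ok: true as const, rows }),
    () => ({ ok: false as const, rows: [] as T[] }),
  );
  // ClickHouse errors (timeout, max_rows_to_read) propagate to the caller.
  const chRows = await clickhouseLeg;
  const sqlite = await sqliteSettled;
  return sqlite.ok
    ? { rows: [...chRows, ...sqlite.rows], warnings: [] }
    : {
        rows: chRows,
        warnings: [
          { code: 'PARTIAL_RESULT', message: 'SQLite results unavailable' },
        ],
      };
}
```

(Result concatenation here is a placeholder; the actual composite backend merges by cursor order.)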
- **ClickHouse `owner_address` Bloom + Skip Projection on Tag Filters:** ClickHouse projections cannot carry inline skip indexes, so owner+tag GraphQL queries that routed through `owner_projection` scanned every granule within the owner range. An `owner_address` bloom filter is now defined on the main `transactions` table, and the per-query `optimize_use_projections = 0` guard is extended to tag filters. Owner-only queries still benefit from `owner_projection`'s sort order; owner+tag queries now fall back to the main table, where `id_bloom`/`tag_names_bloom`/`tag_values_bloom`/`owner_address_bloom` can prune granules across all three dimensions. Existing deployments get the index registered via an idempotent `ALTER TABLE ... ADD INDEX IF NOT EXISTS` on the next `clickhouse-import` cycle; a manual `MATERIALIZE INDEX owner_address_bloom` is required to populate the index on existing parts.
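For existing deployments, the register/backfill pair described above looks roughly like this sketch; the bloom parameters and granularity are assumptions, so treat `src/database/clickhouse/schema.sql` as authoritative:

```sql
-- Registered automatically (idempotent) on the next clickhouse-import cycle.
ALTER TABLE transactions
  ADD INDEX IF NOT EXISTS owner_address_bloom owner_address
  TYPE bloom_filter GRANULARITY 1;

-- Manual, one-time: build the index for parts written before the ALTER.
ALTER TABLE transactions MATERIALIZE INDEX owner_address_bloom;
```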
- **Parquet Export Defaults to Include L1 Transactions and Tags:** `ParquetExporter.export()` defaults now align with the `scripts/parquet-export` CLI wrapper and the auto-verify harness, both of which already included L1 by default. Callers that want L2-only output must now pass `skipL1Transactions`/`skipL1Tags` explicitly.
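The new defaulting behavior, sketched with a simplified options shape (the flag names come from the entry above; the helper is illustrative):

```typescript
// Sketch: L1 transactions and tags are included unless a caller opts out.
interface ExportOptions {
  skipL1Transactions?: boolean;
  skipL1Tags?: boolean;
}

function resolveExportDefaults(opts: ExportOptions = {}) {
  return {
    skipL1Transactions: opts.skipL1Transactions ?? false,
    skipL1Tags: opts.skipL1Tags ?? false,
  };
}
```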
### Fixed
- **ClickHouse `owner_projection` Now Usable for Tag-Filtered Owner Queries:** The projection was previously defined with `SELECT *`, which in ClickHouse excludes `MATERIALIZED` columns, so `tag_names` and `tag_values` were absent from the projection and the optimizer rejected it for any query with predicates on those columns (which includes all tag-filtered GraphQL queries). The projection body is now `SELECT *, tag_names, tag_values`, so the optimizer picks `owner_projection` for owner-scoped queries and reads orders of magnitude fewer granules. Existing deployments need a one-time manual migration (`DROP PROJECTION`/`ADD PROJECTION`/`MATERIALIZE PROJECTION`); see the inline comment in `src/database/clickhouse/schema.sql`. Fresh deployments get the corrected projection from the `CREATE TABLE` body with no operator action required.
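A sketch of that one-time migration; the projection's `ORDER BY` is assumed here, so follow the inline comment in `src/database/clickhouse/schema.sql` rather than this fragment:

```sql
ALTER TABLE transactions DROP PROJECTION IF EXISTS owner_projection;

-- SELECT * alone omits MATERIALIZED columns, so tag_names and tag_values
-- must be listed explicitly for the optimizer to accept the projection
-- on tag-filtered queries.
ALTER TABLE transactions ADD PROJECTION owner_projection (
  SELECT *, tag_names, tag_values
  ORDER BY (owner_address, height, block_transaction_index)
);

-- One-time backfill for parts written before the projection was redefined.
ALTER TABLE transactions MATERIALIZE PROJECTION owner_projection;
```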
- **GraphQL `Block.timestamp` Non-Nullable Field Error:** Addresses a "Cannot return null for non-nullable field Block.timestamp" error that could surface when resolving blocks with incomplete data.
- **GraphQL Data Item Signature Fetch Falls Back to `NOT_FOUND`:** The data-item path in `resolveTxSignature` returned the fetcher result directly, so an `undefined` from `SignatureFetcher.getDataItemSignature` (e.g., missing attributes or a stream failure reading from the parent bundle) would trigger a "Cannot return null for non-nullable field" error on the `String!` signature field. The data-item path now mirrors the transaction path and falls back to `NOT_FOUND`.
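The essence of the fix, sketched with a hypothetical fetcher type standing in for `SignatureFetcher.getDataItemSignature`:

```typescript
// Sketch: never let an undefined fetch result reach a String! field;
// substitute the NOT_FOUND sentinel instead, as the transaction path does.
type SignatureFetch = () => Promise<string | undefined>;

async function resolveSignature(fetch: SignatureFetch): Promise<string> {
  return (await fetch()) ?? 'NOT_FOUND';
}
```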
- **`clickhouse-auto-import` Honors `SQLITE_DATA_PATH`:** The `clickhouse-auto-import` container had its SQLite bind mount hardcoded to `./data/sqlite`, while `core` used `${SQLITE_DATA_PATH:-./data/sqlite}`. When `SQLITE_DATA_PATH` was set, the two containers diverged: the daemon's `batch_has_data` pre-check resolved to a missing path and silently failed open, so empty height ranges were still sent through the full export/import pipeline. The mount is now consistent with `core`.
- **Fail-Fast on ClickHouse GraphQL Rejection:** Awaiting `Promise.allSettled` gated ClickHouse errors on the SQLite leg's breaker timeout. The composite flow now awaits ClickHouse first and rethrows immediately, absorbing SQLite rejections eagerly so bailing out early does not emit an unhandled rejection.
- **Reject Concurrent Parquet Exports + Skip Empty ETL Ranges:** The auto-import loop previously wasted cycles (and logged spurious "Input directory does not exist" / "Parquet file too short" errors) on batches that either collided with a still-running export or spanned empty height ranges. The admin endpoint now returns `409` instead of swallowing the rejection, the exporter script surfaces a clear error when the singleton status is stale, and batches with no source rows short-circuit via a `sqlite3` pre-check.
- **Hive-Layout ClickHouse Importer Requires `blocks` and `transactions` Files:** The Hive-layout importer iterated a per-table glob; when no files matched, bash left the literal pattern string in the loop variable and the `-f` check silently short-circuited, so the partition reported success even though zero required files were imported. Combined with export races that produced empty staging dirs, this was silently dropping data. The `matched_count` validation from the flat-dir path is now ported so `blocks` and `transactions` must each contribute at least one file; `tags` may still be empty.
- **GraphQL Boundary Skips `minHeight` on SQLite "New" Tables:** The ClickHouse/SQLite GraphQL boundary raises `minHeight` to route historical queries away from SQLite. Applied to `new_transactions`/`new_data_items`, the resulting `height >= :minHeight` silently dropped pending rows whose height is `NULL`. Because the "new" tables only hold unstable/recent data that ClickHouse never covers, the predicate is now skipped entirely for those sources.
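The predicate decision above can be sketched as follows; the table names come from the entry, the helper itself is illustrative:

```typescript
// Sketch: skip the height boundary predicate for the "new" tables.
// In SQL, NULL >= :minHeight evaluates to NULL (not true), so pending
// rows with a NULL height would be silently filtered out.
const NEW_TABLES = new Set(['new_transactions', 'new_data_items']);

function heightPredicate(
  table: string,
  minHeight: number,
): string | undefined {
  return NEW_TABLES.has(table) ? undefined : `height >= ${minHeight}`;
}
```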
### Image SHAs

- `ENVOY_IMAGE_TAG`: `6934e519fb98a46da4c17bdfa51d66225428b7c0`
- `CORE_IMAGE_TAG`: `cb16b168ca36a45d762d8391676078f98ce67da7`
- `CLICKHOUSE_AUTO_IMPORT_IMAGE_TAG`: `8a1c0c55ed712e283b55b87f2bc8c7111bbc0482`
- `LITESTREAM_IMAGE_TAG`: `be121fc0ae24a9eb7cdb2b92d01f047039b5f5e8`
- `OBSERVER_IMAGE_TAG`: `ddd3a9c15e426c84da24c9fb7a1107620ccc27c1`