ArcadeDB 26.5.1 Release Notes
Overview
ArcadeDB 26.5.1 is a major release with over 270 commits and 128 resolved issues. The headline news is the new sparse vector index with server-side hybrid retrieval and INT8 quantization end-to-end, a huge wave of OpenCypher correctness fixes, query partitioning, a new EXTERNAL property storage layout for heavy values, plus a long list of HA, wire-protocol, and Studio improvements.
Major Highlights
Sparse Vector Index + Hybrid Retrieval
A brand-new LSM_SPARSE_VECTOR index brings sparse-embedding retrieval (BM25/SPLADE-style) directly into the engine, with server-side fusion of dense and sparse results and diversified top-K. (#4065, #4066, #4067, #4068, #4070, #4078, #4119, #4130)
- New index type LSM_SPARSE_VECTOR for sparse-embedding retrieval.
- vector.fuse(...) server-side hybrid fusion with RRF, DBSF and LINEAR strategies.
- vector.neighbors(...) gains groupBy / groupSize options for diversified retrieval, including dotted nested-field grouping (#4072) and traversal-integrated grouping (#4071).
- WAND / BlockMax-WAND dynamic pruning to scale sparse retrieval to 100M+ documents.
- Sparse-vector partitioning so a single index can be sharded by tenant / domain.
- Reranker SQL functions for two-stage retrieval pipelines.
- Bolt and HTTP wire support for the new sparse vector type, including $bytes / $int8 markers for INT8 query vectors.
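For readers unfamiliar with the RRF strategy named above, the following is a minimal client-side sketch of generic Reciprocal Rank Fusion over two ranked ID lists. It is not ArcadeDB's implementation: the function name `rrf_fuse`, the constant `k=60` (a commonly used default) and the sample IDs are assumptions made for illustration only.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=10):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) to an ID's score, with ranks
    starting at 1. `k` is a hypothetical default, not an ArcadeDB setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:top_k]]


# Hypothetical result lists from a dense and a sparse retriever.
dense_hits = ["a", "b", "c"]
sparse_hits = ["c", "a", "d"]
fused = rrf_fuse([dense_hits, sparse_hits])
```

IDs that rank well in both lists ("a" and "c" here) rise above IDs that appear in only one, which is the property that makes rank-based fusion useful when dense and sparse scores are not directly comparable.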
INT8 Quantization for Dense Vectors
End-to-end INT8 support across the dense vector pipeline: ingest, storage and query path now share the same 8-bit representation, dramatically reducing disk and RSS without going through the FP32 path. (#3143, #4132, #4133, #4135, #4136)
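To illustrate where the savings come from, here is a minimal sketch of symmetric scalar quantization from floats to signed 8-bit codes. The release notes do not describe ArcadeDB's actual quantization scheme, so the per-vector scale, the [-127, 127] code range and the function names below are assumptions.

```python
def quantize_int8(vector):
    """Map floats to integer codes in [-127, 127] using one per-vector scale."""
    max_abs = max((abs(x) for x in vector), default=0.0)
    # Avoid division by zero for empty or all-zero vectors.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(x / scale))) for x in vector]
    return codes, scale


def dequantize_int8(codes, scale):
    """Approximate reconstruction of the original floats."""
    return [c * scale for c in codes]


# Hypothetical embedding values.
original = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(original)
restored = dequantize_int8(codes, scale)
```

Each component shrinks from 4 bytes (FP32) to 1 byte, at the cost of a reconstruction error of roughly half a quantization step per component.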
EXTERNAL Property Storage
New paired-bucket layout for heavy property values (vectors, large strings, JSON). Hot row data stays compact in the main bucket while bulky payloads live in a sibling external bucket, sharply reducing scan cost on wide records. (#4027, #4028)
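The idea behind a paired-bucket layout can be sketched with a toy in-memory model. Everything below (the class `PairedBucketStore`, its methods and the field names) is hypothetical and only illustrates the general technique of keeping bulky values out of the scanned row; it does not reflect ArcadeDB's on-disk format or API.

```python
class PairedBucketStore:
    """Toy model: light fields in a main bucket, heavy fields in a sibling one."""

    def __init__(self, external_fields):
        self.external_fields = set(external_fields)
        self.main = {}      # rid -> compact row (light fields only)
        self.external = {}  # rid -> heavy field values

    def put(self, rid, record):
        # Split the record so the main row stays small.
        self.main[rid] = {
            k: v for k, v in record.items() if k not in self.external_fields
        }
        heavy = {k: v for k, v in record.items() if k in self.external_fields}
        if heavy:
            self.external[rid] = heavy

    def scan(self, field):
        """Scan a light field without touching the external bucket."""
        return [(rid, row.get(field)) for rid, row in self.main.items()]

    def get(self, rid):
        """Materialize the full record by joining both buckets."""
        return {**self.main[rid], **self.external.get(rid, {})}


store = PairedBucketStore(external_fields={"embedding"})
store.put("#1:0", {"name": "doc", "embedding": [0.1, 0.2, 0.3, 0.4]})
```

A scan over `name` reads only the compact rows, while `get` pays the extra lookup only when the heavy payload is actually needed.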
Query Partitioning
Query-level partitioning lands together with a partition-aware planner that prunes irrelevant partitions from SQL and Cypher plans, plus integrity guardrails for partitioned types. (#4087, #4088)
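As a rough illustration of partition pruning in general, the sketch below selects only the partitions whose key can satisfy an equality predicate. The function `prune_partitions`, the partition names and the tenant keys are all hypothetical; ArcadeDB's planner and its partitioning syntax are not shown in these notes.

```python
def prune_partitions(partitions, predicate_value=None):
    """Return the partition names a query must scan.

    `partitions` maps a partition name to the single key value it holds.
    With no predicate, every partition must be scanned.
    """
    if predicate_value is None:
        return sorted(partitions)
    return sorted(
        name for name, key in partitions.items() if key == predicate_value
    )


# Hypothetical layout: one partition per tenant.
layout = {"docs_p0": "tenant-a", "docs_p1": "tenant-b", "docs_p2": "tenant-a"}
to_scan = prune_partitions(layout, predicate_value="tenant-a")
```

A query filtered on `tenant-a` then skips `docs_p1` entirely instead of scanning it and discarding every row.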
HA: Offline Cluster Bootstrap
You can now bootstrap a fresh HA cluster from a pre-seeded database (snapshot-and-restore), avoiding a full re-replication of large datasets when expanding or rebuilding a cluster. Includes regression integration tests for the cluster-formation edge cases. (#4147, #4205)
Production-Ready Helm Chart
The Helm chart has been reworked to align with the Raft-based HA subsystem introduced in 26.4.2 and is now suitable for production rollouts. (#4035)
Cypher: SHOW INDEXES / SHOW CONSTRAINTS
Standard Cypher administrative commands SHOW INDEXES and SHOW CONSTRAINTS are now supported. (#3972)
SQL: FIND REFERENCES
Restores the OrientDB-compatible FIND REFERENCES command for locating all records pointing to a given RID. (#4146)
C# End-to-End Tests
A new e2e suite exercises ArcadeDB over the Postgres wire via Npgsql and Testcontainers, validating the C# client path on every build. (#4036, #4038)
HA Operational Improvements
- Optional human-readable peer names in HA_SERVER_LIST for friendlier cluster topology. (#3974)
- Studio gains peer add/remove controls in the HA cluster panel. (#4145)
Studio Improvements
- Full-screen mode for graph view. (#4032)
- Clear query button / textbox. (#4121)
- Session reset on token expiry instead of silent failure. (#4082)
- Error messages now persist instead of disappearing after a few seconds. (#4124)
- Query history no longer auto-submits on selection. (#4022)
- Inherited indexes are now visible. (#4140)
Major Fixes
OpenCypher Correctness
A large batch of correctness fixes landed across pattern matching, write clauses, subqueries, list/temporal expressions and the optimizer fast-path. Highlights:
- valueType(...) now reports the NOT NULL suffix for non-null values. (#3991)
- point(...) WGS-84-3D exposes .height as an alias for .z. (#3992)
- CALL ... YIELD no longer nullifies variables carried in from WITH. (#3996, #4094)
- collect(r) followed by a variable-length match no longer drops all rows. (#3997)
- Variable-length pattern segments no longer re-traverse a relationship bound in a prior MATCH. (#4006)
- MERGE with an unbound label-only endpoint now creates a fresh node instead of reusing an existing one. (#3998)
- SET now propagates to all aliases bound to the same node within the same query. (#4000)
- Self-referential property updates and SET :Label are now idempotent across row fanout. (#4016, #4017)
- Temporal component access on date/datetime values no longer returns null. (#4018)
- WITH-carried node variables are no longer nulled out by a later CREATE/MERGE. (#4019)
- allReduce(...) no longer evaluates true cases as false. (#4043)
- Anonymous middle nodes in multi-hop chains now match rows. (#4092)
- Backslashes in string literals and property values are preserved. (#4093)
- Consecutive directed and undirected relationship patterns must not reuse the same edge. (#4095, #4096)
- EXISTS { ... } subqueries returning an outer-variable expression no longer evaluate as false. (#4097)
- List literals containing duration(...) no longer drop rows. (#4099)
- List subscript with an inline aggregate index no longer returns null. (#4100)
- MATCH immediately after CREATE now sees newly created labeled nodes. (#4101)
- MATCH on a null carried variable now correctly filters out the row. (#4102)
- MERGE ... ON MATCH SET no longer returns the pre-update property value. (#4103)
- MERGE patterns no longer reuse a newly created endpoint across input rows. (#4104)
- Node label-union patterns now match when either label exists. (#4105)
- Pattern comprehensions over existing relationships are no longer empty. (#4106)
- reduce(...) over an inline aggregate expression is now evaluated correctly. (#4107)
- Relationship type predicates on bound relationship variables no longer evaluate as false. (#4108)
- Repeated relationship variables in WHERE patterns now match no rows (as expected). (#4109)
- Uncorrelated pattern predicates now correctly reflect existing relationships. (#4110)
- Variable-length pattern comprehensions no longer duplicate projected elements. (#4111)
- WHERE false literal predicates are no longer ignored. (#4112)
- datetime() values are now persisted on DATETIME-typed properties. (#4125)
- OR EXISTS + AND NOT (EXISTS ... OR EXISTS ...) returns the correct rows under two-outer-MATCH binding. (#4126)
- Optimizer fast path is now skipped when a write clause precedes MATCH. (#4131)
- CALL subquery SET no longer leaves the carried outer variable stale. (#4182)
- id(...) is now numeric and no longer breaks numeric predicates. (#4183)
- shortestPath / allShortestPaths with variable-length relationship type alternation now match. (#4190)
- Write-only CALL subqueries no longer return an extra empty row. (#4191)
- MATCH on a parent edge type now matches sub-typed edges (polymorphic edge traversal). (#4192)
- Batch fixes for #4184, #4185, #4186, #4188, #4189. (#4196)
SQL
- CONTAINSALL now works when comparing a list of Identifiables against a list of RID strings. (#4002)
- Correlated COLLECT { ... } / COUNT { ... } subqueries with outer-variable access now evaluate correctly instead of always returning empty/zero. (#4014, #4015)
- SEARCH_INDEX and SEARCH_FIELDS now propagate return values in filters and correctly handle wildcards. (#4023, #4030)
- SELECT with a non-unique LSM index no longer returns zero rows after partial deletes (per-RID tombstones no longer suppress the whole key). (#4024)
- Edge creation with CONTENT no longer silently ignores properties. (#4033)
- algo.dijkstra no longer yields a weight of zero. (#4042)
- LIST of STRING in GraphBatch works again. (#4069)
- UPDATE EDGE SET @in/@out correctly rewires the vertex edge lists. (#4074)
- = combined with LIKE on time-series types no longer returns zero results. (#4128)
- Range queries no longer raise a spurious "Non-existent edge type" error. (#4199)
- point.withinBBox(...) now supports cross-meridian bounding boxes. (#3994)
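The bounding-box fix above is easier to picture with a small example. The sketch below assumes that "cross-meridian" refers to boxes spanning the 180° meridian, where the western longitude is numerically greater than the eastern one. The function `within_bbox` and its argument order are hypothetical and do not mirror the signature of point.withinBBox(...).

```python
def within_bbox(lat, lon, south, west, north, east):
    """Check whether (lat, lon) lies inside a latitude/longitude box.

    When west > east the box is treated as wrapping across the
    180-degree meridian (an assumption about the intended semantics).
    """
    if not (south <= lat <= north):
        return False
    if west <= east:
        # Ordinary box: a single contiguous longitude interval.
        return west <= lon <= east
    # Wrapping box: longitude is either east of `west` or west of `east`.
    return lon >= west or lon <= east


# Hypothetical box from 170°E to 170°W around the equator.
inside_east = within_bbox(0.0, 179.0, -10.0, 170.0, 10.0, -170.0)
inside_west = within_bbox(0.0, -175.0, -10.0, 170.0, 10.0, -170.0)
outside = within_bbox(0.0, 0.0, -10.0, 170.0, 10.0, -170.0)
```

A naive `west <= lon <= east` test would reject every point for such a box, because no longitude is both at least 170 and at most -170.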
Storage, Indexing and Schema
- HASH index lookups now return rows when data encryption is enabled (keys are kept deterministic). (#4137)
- Orphan TypeIndex wrapper is now dropped when its last bucket child is removed. (#4179)
- Indexes on a subclass are no longer incorrectly related to superclass indexes. (#4120)
- Manual index names are now respected on creation. (#4139)
- Inherited indexes are now shown in Studio. (#4140)
High Availability
- Schema changes now ship to followers, closing a WAL-gap source. (#4077)
- Cluster inconsistency reports after node shutdowns resolved. (#4081)
- Massive inserts over gRPC now replicate correctly. (#4076)
- Correct leader is now reported in the resume table. (#4075)
- ClassCastException (RaftReplicatedDatabase to LocalDatabase) on the leader during import / read-only property writes fixed. (#4144)
- /api/v1/batch no longer fails with "Error on updating dictionary" on follower nodes. (#4039, #4122)
- /batch endpoint no longer returns HTTP 500 NPE after a successful commit. (#4123)
- Spurious index warnings from cluster followers removed. (#4063)
- e2e-ha integration tests stabilized, with on-demand Toxiproxy support. (#4013, #4020)
Wire Protocols
- PostgreSQL: empty SELECT results now include the RowDescription schema (#3971); SHOW server_version returns a proper value for SQLAlchemy (#4116); Cypher WHERE id(n) IN $array round-trips correctly after id() became numeric (#4200); binary array deserialization implemented to unblock JDBC setArray (#4203); named and positional parameters work via Npgsql (C#) (#4036).
- Bolt: EXPLAIN / PROFILE plans are now included in PULL SUCCESS metadata, fixing Neo4j drivers' summary.Plan() (#4129); Bolt executor recognises the new sparse vector type (#4079).
- gRPC: InsertStream throughput no longer collapses 20-30x after a few hundred unary executeQuery calls (leaked ResultSets closed) (#4197); commit-time constraint violations are surfaced as a stream-level INTERNAL error instead of being silently absorbed (#4198); DATE columns are no longer corrupted via parameter binding (#4181); ARRAY_OF_LONGS and DATETIME parameter binding preserve int64 / fractional-second precision (#4148, #4149); cluster replication on massive inserts via gRPC fixed (#4076).
- HTTP: INT8 query vectors are routed to byte[] via $bytes / $int8 markers for end-to-end INT8 payload savings (#4135); RemoteGraphBatch now honors unique edge constraints (#4113); edge DATETIME parser accepts ISO suffixes (#4142).
Database Lifecycle
- GraphAnalyticalView async restore no longer fails with "Transaction not started on current thread" when reopening a database. (#4180)
- Database restore process and error logging improved. (#4026)
- GraphBatch no longer errors on transaction commit in mixed-batch scenarios. (#4080)
Python Bindings
Python bindings refreshed with Codacy/Bandit cleanup, formatting fixes and updated workflow triggers. (#4011, #4041, #4084)
Dependencies
Notable upgrades in this release include:
- Netty 4.2.13.Final
- Undertow 2.4.0.Final
- PostgreSQL JDBC 42.7.11
- Neo4j Java Driver 6.1.0
- Jackson Databind 2.21.3
- Gson 2.14.0
- Swagger 2.2.49
- JLine 4.1.0
- GraalVM 25.0.3
- TestContainers 2.0.5
- Apache TinkerPop / Gremlin compatibility maintained
Plus the usual round of Studio frontend updates (Cytoscape, ApexCharts, SwaggerUI, Marked, PostCSS, Terser, pdfmake and webpack toolchain), CI / GitHub Actions bumps, Docker base image refresh (Eclipse Temurin) and several security-critical Studio dependency updates.