Changelog
56.1.0 (2025-08-21)
Implemented enhancements:
- Implement cast and other operations on decimal32 and decimal64 #7815 #8204 [arrow]
- Speed up Parquet filter pushdown with predicate cache #8203 [parquet]
- Optionally read parquet page indexes #8070 [parquet]
- Parquet reader: add method for sync reader read bloom filter #8023 [parquet]
- [parquet] Support writing logically equivalent types to ArrowWriter #8012 [parquet]
- Improve StringArray(Utf8) sort performance #7847 [arrow]
- feat: arrow-ipc delta dictionary support #8001 [arrow] (JakeDern)
Fixed bugs:
- The Rustdocs are clean CI job is failing #8175
- [avro] Bug in resolving avro schema with named type #8045 [arrow]
- Doc test failure (test arrow-avro/src/lib.rs - reader) when verifying avro 56.0.0 RC1 release #8018 [arrow]
Documentation updates:
- arrow-row: Document dictionary handling #8168 [arrow] (alamb)
- Docs: Clarify that Array::value does not check for nulls #8065 [arrow] (alamb)
- docs: Fix a typo in README #8036 (EricccTaiwan)
- Add more comments to the internal parquet reader #7932 [parquet] (alamb)
Performance improvements:
- perf(arrow-ipc): avoid counting nulls in RecordBatchDecoder #8127 [arrow] (rluvaton)
- Use Vec directly in builders #7984 [arrow] (liamzwbao)
- Improve StringArray(Utf8) sort performance (~2-4x faster) #7860 [arrow] (zhuqi-lucas)
Closed issues:
- [Variant] Improve fuzz test for Variant #8199
- [Variant] Improve fuzz test for Variant #8198
- VariantArrayBuilder tracks starting offsets instead of (offset, len) pairs #8192
- Rework ValueBuilder API to work with ParentState for reliable nested rollbacks #8188
- [Variant] Rename ValueBuffer as ValueBuilder #8186
- [Variant] Refactor ParentState to track and rollback state on behalf of its owning builder #8182
- [Variant] ObjectBuilder should detect duplicates at insertion time, not at finish #8180
- [Variant] ObjectBuilder does not reliably check for duplicates #8170
- [Variant] Support StringView and LargeString in batch_json_string_to_variant #8145 [parquet]
- [Variant] Rename batch_json_string_to_variant and batch_variant_to_json_string json_to_variant #8144 [parquet]
- [avro] Use tempfile crate rather than custom temporary file generator in tests #8143 [arrow]
- [Avro] Use Write rather than dyn Write in Decoder #8142 [arrow]
- [Variant] Nested builder rollback is broken #8136
- [Variant] Add support for the remaining primitive types (timestamp_nanos/timestampntz_nanos/uuid) for parquet variant #8126
- Meta: Implement missing Arrow 56.0 lint rules - Sequential workflow #8121
- ARROW-012-015: Add linter rules for remaining Arrow 56.0 breaking changes #8120
- ARROW-010 & ARROW-011: Add linter rules for Parquet Statistics and Metadata API removals #8119
- ARROW-009: Add linter rules for IPC Dictionary API removals in Arrow 56.0 #8118
- ARROW-008: Add linter rule for SerializedPageReaderState usize→u64 breaking change #8117
- ARROW-007: Add linter rule for Schema.all_fields() removal in Arrow 56.0 #8116
- [Variant] Implement ShreddingState::AllNull variant #8088 [parquet]
- [Variant] Support Shredded Objects in variant_get #8083 [parquet]
- [Variant]: Implement DataType::RunEndEncoded support for cast_to_variant kernel #8064 [parquet]
- [Variant]: Implement DataType::Dictionary support for cast_to_variant kernel #8062 [parquet]
- [Variant]: Implement DataType::Struct support for cast_to_variant kernel #8061 [parquet]
- [Variant]: Implement DataType::Decimal32/Decimal64/Decimal128/Decimal256 support for cast_to_variant kernel #8059 [parquet]
- [Variant]: Implement DataType::Timestamp(..) support for cast_to_variant kernel #8058 [parquet]
- [Variant]: Implement DataType::Float16 support for cast_to_variant kernel #8057 [parquet]
- [Variant]: Implement DataType::Interval support for cast_to_variant kernel #8056 [parquet]
- [Variant]: Implement DataType::Time32/Time64 support for cast_to_variant kernel #8055 [parquet]
- [Variant]: Implement DataType::Date32 / DataType::Date64 support for cast_to_variant kernel #8054 [parquet]
- [Variant]: Implement DataType::Null support for cast_to_variant kernel #8053 [parquet]
- [Variant]: Implement DataType::Boolean support for cast_to_variant kernel #8052 [parquet]
- [Variant]: Implement DataType::FixedSizeBinary support for cast_to_variant kernel #8051 [parquet]
- [Variant]: Implement DataType::Binary/LargeBinary/BinaryView support for cast_to_variant kernel #8050 [parquet]
- [Variant]: Implement DataType::Utf8/LargeUtf8/Utf8View support for cast_to_variant kernel #8049 [parquet]
- [Variant] Implement cast_to_variant kernel #8043 [parquet]
- [Variant] Support variant_get kernel for shredded variants #7941 [parquet]
- Add test for casting Decimal128 (i128::MIN and i128::MAX) to f64 with overflow handling #7939 [arrow]
Merged pull requests:
- [Variant] Enhance the variant fuzz test to cover time/timestamp/uuid primitive types #8200 (klion26)
- [Variant] VariantArrayBuilder tracks only offsets #8193 (scovich)
- [Variant] Caller provides ParentState to ValueBuilder methods #8189 (scovich)
- [Variant] Rename ValueBuffer as ValueBuilder #8187 (scovich)
- [Variant] ParentState handles finish/rollback for builders #8185 (scovich)
- [Variant]: Implement DataType::RunEndEncoded support for cast_to_variant kernel #8174 (liamzwbao)
- [Variant]: Implement DataType::Dictionary support for cast_to_variant kernel #8173 (liamzwbao)
- Implement ArrayBuilder for UnionBuilder #8169 [arrow] (grtlr)
- [Variant] Support LargeString and StringView in batch_json_string_to_variant #8163 (liamzwbao)
- [Variant] Rename batch_json_string_to_variant and batch_variant_to_json_string #8161 (liamzwbao)
- [Variant] Add primitive type timestamp_nanos (with & without timezone) and uuid #8149 (klion26)
- refactor(avro): Use impl Write instead of dyn Write in encoder #8148 [arrow] (Xuanwo)
- chore: Use tempfile to replace hand-written utils functions #8147 [arrow] (Xuanwo)
- feat: support push batch direct to completed and add biggest coalesce batch support #8146 [arrow] (zhuqi-lucas)
- [Variant] Add human-readable impl Debug for Variant #8140 (scovich)
- [Variant] Fix broken metadata builder rollback #8135 (scovich)
- [Variant]: Implement DataType::Interval support for cast_to_variant kernel #8125 (codephage2020)
- Add schema resolution and type promotion support to arrow-avro Decoder #8124 [arrow] (jecsand838)
- Add Initial arrow-avro writer implementation with basic type support #8123 [arrow] (jecsand838)
- [Variant] Add Variant::Time primitive and cast logic #8114 (klion26)
- [Variant] Support Timestamp to variant for cast_to_variant kernel #8113 (abacef)
- Bump actions/checkout from 4 to 5 #8110 (dependabot[bot])
- [Variant]: add DataType::Null support to cast_to_variant #8107 (feniljain)
- [Variant] Adding fixed size byte array to variant and test #8106 (abacef)
- [VARIANT] Initial integration tests for variant reads #8104 [parquet] (carpecodeum)
- [Variant]: Implement DataType::Decimal32/Decimal64/Decimal128/Decimal256 support for cast_to_variant kernel #8101 (liamzwbao)
- Refactor arrow-avro Decoder to support partial decoding #8100 [arrow] (jecsand838)
- fix: Validate metadata len in IPC reader #8097 [arrow] (JakeDern)
- [parquet] further improve logical type compatibility in ArrowWriter #8095 [parquet] (albertlockett)
- [Variant] Implement ShreddingState::AllNull variant #8093 (codephage2020)
- [Variant] Minor: Add comments to tickets for follow on items #8092 (alamb)
- [VARIANT] Add support for DataType::Struct for cast_to_variant #8090 (carpecodeum)
- [VARIANT] Add support for DataType::Utf8/LargeUtf8/Utf8View for cast_to_variant #8089 (carpecodeum)
- [Variant] Implement DataType::Boolean support for cast_to_variant kernel #8085 (sdf-jkl)
- [Variant] Implement DataType::{Date32,Date64} => Variant::Date #8081 (superserious-dev)
- Fix new clippy lints from Rust 1.89 #8078 [parquet] [arrow] [arrow-flight] (alamb)
- Implement ArrowSchema to AvroSchema conversion logic in arrow-avro #8075 [arrow] (jecsand838)
- Implement DataType::{Binary, LargeBinary, BinaryView} => Variant::Binary #8074 (superserious-dev)
- [Variant] Implement DataType::Float16 => Variant::Float #8073 (superserious-dev)
- create PageIndexPolicy to allow optional indexes #8071 [parquet] (kczimm)
- [Variant] Minor: use From impl to make conversion infallible #8068 [parquet] (alamb)
- Bump actions/download-artifact from 4 to 5 #8066 (dependabot[bot])
- Added arrow-avro schema resolution foundations and type promotion #8047 [arrow] (jecsand838)
- Fix arrow-avro type resolver register bug #8046 [arrow] (yongkyunlee)
- implement cast_to_variant kernel to cast native types to VariantArray #8044 [parquet] (alamb)
- Add arrow-avro SchemaStore and fingerprinting #8039 [arrow] (jecsand838)
- Add more benchmarks for Parquet thrift decoding #8037 [parquet] (etseidl)
- Support multi-threaded writing of Parquet files with modular encryption #8029 [parquet] (rok)
- Add arrow-avro Decoder Benchmarks #8025 [arrow] (jecsand838)
- feat: add method for sync Parquet reader read bloom filter #8024 [parquet] (mapleFU)
- [Variant] Add variant_get and ShreddedVariantArray #8021 [parquet] (alamb)
- Implement arrow-avro SchemaStore and Fingerprinting To Enable Schema Resolution #8006 [arrow] (jecsand838)
- [Parquet] Add tests for IO/CPU access in parquet reader #7971 [parquet] (alamb)
- Speed up Parquet filter pushdown v4 (Predicate evaluation cache for async_reader) #7850 [parquet] (XiangpengHao)
- Implement cast and other operations on decimal32 and decimal64 #7815 [arrow] (CurtHagenlocher)
* This Changelog was automatically generated by github_changelog_generator