Changelog
58.1.0 (2026-03-20)
Implemented enhancements:
- Reuse compression dict lz4_block #9566
- [Variant] Add variant_to_arrow Struct type support #9529
- [Variant] Add unshred_variant support for Binary and LargeBinary types #9526
- [Variant] Add shred_variant support for LargeUtf8 and LargeBinary types #9525
- [Variant] variant_get tests clean up #9517
- parquet_variant: Support LargeUtf8 typed value in unshred_variant #9513
- parquet-variant: Support string view typed value in unshred_variant #9512
- Deprecate ArrowTimestampType::make_value in favor of from_naive_datetime #9490 [arrow]
- Followup for support ['fieldName'] in VariantPath #9478
- Speedup DELTA_BINARY_PACKED decoding when bitwidth is 0 #9476 [parquet]
- Support CSV files encoded with charsets other than UTF-8 #9465 [arrow]
- Expose Avro writer schema when building the reader #9460 [arrow]
- Python: avoid importing pyarrow classes every time #9438
- Add append_nulls to MapBuilder #9431 [arrow]
- Add append_non_nulls to StructBuilder #9429 [arrow]
- Add append_value_n to GenericByteBuilder #9425 [arrow]
- Optimize from_bitwise_binary_op #9378 [arrow]
- Configurable Arrow representation of UTC timestamps for Avro reader #9279 [arrow]
Fixed bugs:
- MutableArrayData::extend does not copy child values for ListView arrays #9561 [arrow]
- ListView interleave bug #9559 [arrow]
- Flight encoding panics with "no dict id for field" with nested dict arrays #9555 [arrow] [arrow-flight]
- "DeltaBitPackDecoder only supports Int32Type and Int64Type" but unsigned types are supported too #9551 [parquet]
- Potential overflow when calling util::bit_mask::set_bits (soundness issue) #9543 [arrow]
- handle Null type in try_merge for Struct, List, LargeList, and Union #9523 [arrow]
- Invalid offset in sparse column chunk data for multiple predicates #9516 [parquet]
- debug_assert_eq! in BatchCoalescer panics in debug mode when batch_size < 4 #9506 [arrow]
- Parquet Statistics::null_count_opt wrongly returns Some(0) when stats are missing #9451 [parquet]
- Error "Not all children array length are the same!" when decoding rows spanning across page boundaries in parquet file when using RowSelection #9370 [parquet]
- Avro schema resolution not properly supported for complex types #9336 [arrow]
Documentation updates:
Performance improvements:
- Introduce NullBuffer::try_from_unsliced to simplify array construction #9385 [parquet] [arrow]
- perf: Coalesce page fetches when RowSelection selects all rows #9578 [parquet] (Dandandan)
- Use chunks_exact for has_true/has_false to enable compiler unrolling #9570 [arrow] (adriangb)
- pyarrow: Cache the imported classes to avoid importing them each time #9439 (Tpt)
Closed issues:
- Duplicate macro definition: partially_shredded_variant_array_gen #9492
- Enable LargeList/ListView/LargeListView for VariantArray::try_new #9455
- Support variables/expressions in record_batch! macro #9245 [arrow]
Merged pull requests:
- [Variant] Add unshred_variant support for Binary and LargeBinary types #9576 (kunalsinghdadhwal)
- [Variant] Add variant_to_arrow Struct type support #9572 (sdf-jkl)
- Make Sbbf Constructors Public #9569 [parquet] (cetra3)
- fix: Used checked_add for bounds checks to avoid UB #9568 [arrow] (etseidl)
- Add mutable operations to BooleanBuffer (Bit*Assign) #9567 [arrow] (Dandandan)
- chore(deps): update lz4_flex requirement from 0.12 to 0.13 #9565 [parquet] [arrow] (dependabot[bot])
- arrow-select: fix MutableArrayData interleave for ListView #9560 [arrow] (asubiotto)
- Move ValueIter into own module, and add public record_count function #9557 [arrow] (Rafferty97)
- arrow-flight: generate dict_ids for dicts nested inside complex types #9556 [arrow] [arrow-flight] (asubiotto)
- add shred_variant support for LargeUtf8 and LargeBinary #9554 (sdf-jkl)
- [minor] Download clickbench file when missing #9553 [parquet] (Dandandan)
- DeltaBitPackEncoderConversion: Fix panic message on invalid type #9552 [parquet] (progval)
- Replace interleave overflow panic with error #9549 [arrow] (xudong963)
- feat(arrow-avro): HeaderInfo to expose OCF header #9548 [arrow] (mzabaluev)
- chore: Protect main branch with required reviews #9547 (comphead)
- Add benchmark for infer_json_schema #9546 [arrow] (Rafferty97)
- chore(deps): bump black from 24.3.0 to 26.3.1 in /parquet/pytest #9545 [parquet] (dependabot[bot])
- Unroll interleave -25-30% #9542 [arrow] (Dandandan)
- Optimize take_fixed_size_binary For Predefined Value Lengths #9535 [arrow] (tobixdev)
- feat: expose arrow schema on async avro reader #9534 [arrow] (mzabaluev)
- Make with_file_decryption_properties pub instead of pub(crate) #9532 [parquet] (Dandandan)
- fix: handle Null type in try_merge for Struct, List, LargeList, and Union #9524 [arrow] (zhuqi-lucas)
- chore: extend record_batch macro to support variables and expressions #9522 [arrow] (buraksenn)
- [Variant] clean up variant_get tests #9518 (sdf-jkl)
- support large string for unshred variant #9515 (friendlymatthew)
- support string view unshred variant #9514 (friendlymatthew)
- Add has_true() and has_false() to BooleanArray #9511 [arrow] (adriangb)
- Fix Invalid offset in sparse column chunk data error for multiple predicates #9509 [parquet] (cetra3)
- fix: remove incorrect debug assertion in BatchCoalescer #9508 [arrow] (Tim-53)
- [Json] Add benchmarks for list json reader #9507 [arrow] (liamzwbao)
- fix: first next_back() on new RowsIter panics #9505 [arrow] (rluvaton)
- Add some benchmarks for decoding delta encoded Parquet #9500 [parquet] (etseidl)
- chore: remove duplicate macro partially_shredded_variant_array_gen #9498 (codephage2020)
- Deprecate ArrowTimestampType::make_value in favor of from_naive_datetime #9491 [arrow] (codephage2020)
- fix: Do not assume missing nullcount stat means zero nullcount #9481 [parquet] (scovich)
- [Variant] Enhance bracket access for VariantPath #9479 (klion26)
- Optimize delta binary decoder in the case where bitwidth=0 #9477 [parquet] (etseidl)
- Add PrimitiveRunBuilder::with_data_type() to customize the values' DataType #9473 [arrow] (brunal)
- Convert prettyprint tests in arrow-cast to insta inline snapshots #9472 [parquet] [arrow] (grtlr)
- Update strum_macros requirement from 0.27 to 0.28 #9471 [arrow] (dependabot[bot])
- docs(parquet): Fix broken links in README #9467 [parquet] (SYaoJun)
- Add list-like types support to VariantArray::try_new #9457 (sdf-jkl)
- Simplify downcast_...!() macro definitions #9454 [arrow] (brunal)
- feat(parquet): add content defined chunking for arrow writer #9450 [parquet] (kszucs)
- refactor: simplify iterator using cloned().map(Some) #9449 [parquet] (SYaoJun)
- feat: Optimize from_bitwise_binary_op with 64-bit alignment #9441 [arrow] (kunalsinghdadhwal)
- docs: fix markdown link syntax in README #9440 (SYaoJun)
- Move ListLikeArray to arrow-array to be shared with json writer and parquet unshredding #9437 [arrow] (liamzwbao)
- Add claim method to recordbatch for memory accounting #9433 [arrow] (cetra3)
- Add append_nulls to MapBuilder #9432 [arrow] (Fokko)
- Add append_non_nulls to StructBuilder #9430 [arrow] (Fokko)
- Add append_value_n to GenericByteBuilder #9426 [arrow] (Fokko)
- refactor: simplify dynamic state for Avro record projection #9419 [arrow] (mzabaluev)
- Add NullBuffer::from_unsliced_buffer helper and refactor call sites #9411 [parquet] [arrow] (Eyad3skr)
- Implement min, max, sum for run-end-encoded arrays. #9409 [arrow] (brunal)
- feat: add RunArray::new_unchecked and RunArray::into_parts #9376 [arrow] (rluvaton)
- Fix skip_records over-counting when partial record precedes num_rows page skip #9374 [parquet] (jonded94)
- fix: resolution of complex type variants in Avro unions #9328 [arrow] (mzabaluev)
- feat(arrow-avro): Configurable Arrow timezone ID for Avro timestamps #9280 [arrow] (mzabaluev)
* This Changelog was automatically generated by github_changelog_generator