Changelog

58.2.0 (2026-04-28)

Full Changelog

Implemented enhancements:

Expose ColumnCloseResult on ArrowColumnChunk #9774 [parquet]
Expose FFI data structures fields #9771 [arrow]
short-circuit last predicate in RowFilter when with_limit(N) is set #9765 [parquet]
vectorise dict-index bounds check #9747 [parquet]
Refactor RleEncoder::flush_bit_packed_run #9734 [parquet]
Add benchmark for cast from/to decimals #9728 [arrow]
Add a security policy for arrow-rs #9727 [parquet] [arrow] [arrow-flight]
Support FixedSizeList in arrow-json reader #9714 [arrow]
[Variant] Add VariantArrayBuilder::append_nulls API #9684
[Json] RunEndEncoded decoder optimization #9645 [arrow]
[Variant] variant_get(..., List<_>) non-Struct types support #9615
[Variant] Add unshredded Struct fast-path for variant_get(..., Struct) #9596
Allow setting custom line terminator for CSV writer #9571 [arrow]
[Variant] Align cast logic for variant_get to cast kernel for numeric/bool types #9564 [arrow]
ci: use ubuntu-slim where applicable #9536
Publicly export arrow_string::Predicate and its methods? #9480
Don't create CompressionContext when no compression is selected [IPC] #9463 [arrow]
Parquet: Raw level buffering causes unbounded memory growth for sparse columns #9446 [parquet]
Parallel Parquet Reading #9381 [parquet]

Fixed bugs:

[Variant] unshred_variant panics on malformed bytes despite returning Result #9740
RecordBatch::normalize() does not propagate top level null bitmap into the results #9732 [arrow]
Incorrect accounting in DictEncoder::estimated_memory_size #9719 [parquet]
arrow-ipc writer does not comply with spec for empty variable-size arrays #9716 [arrow]
Panic when reading corrupt parquet file with truncated data instead of ParquetError #9705 [parquet]
NOTICE.txt is inaccurate #9703 [arrow]
Unnecessary dependency on regex crate #9672
[arrow-avro] Avro reader produces incorrect results when reader schema and writer schema differ #9655 [arrow]
parquet docs are broken on docs.rs #9649
[Parquet] ArrowWriter with CDC panics on nested ListArrays #9637 [parquet] [arrow] [arrow-flight]
Use release KEYS file for verification instead of dev KEYS #9603
IPC reader: handling of dictionaries with only null values #9595 [arrow]
Parquet RleDecoder::get_batch_with_dict panics on oob dictionary indices #9434 [parquet]

Documentation updates:

docs(variant): link VariantArray doc to official Parquet Variant extension type #9779 (mcharrel)
Document Security Policy #9730 [parquet] [arrow] [arrow-flight] (alamb)
Docs: add example of how to read parquet row groups in parallel #9396 [parquet] (alamb)

Performance improvements:

parquet: avoid decode and heap allocation on terminal skip in DeltaBitPackDecoder #9784 [parquet]
parquet: O(1) skip for bw=0 miniblocks in DeltaBitPackDecoder #9783 [parquet]
Remove per-message flush overhead in Arrow IPC writer #9762 [arrow]
Support GenericListViewArray::new_unchecked and refactor ListView json decoder #9646 [arrow]
Support nested REE in arrow-ord partition function #9640 [arrow]
[Parquet] Remove the BIT_PACKED encoder #9635 [parquet]
Pre-reserve output capacity in ByteView/ByteArray dictionary decoding #9587 [parquet]
Fuse RLE decoding and view gathering for StringView dictionary decoding #9582 [parquet]
Use branchless index clamping and add get_batch_direct to RleDecoder #9581 [parquet]
Reduce per-byte overhead in VLQ integer decoding #9580 [parquet]
feat(parquet): batch RLE runs in level encoder via scan-ahead #9830 [parquet] (HippoBaro)
fix: lazy-init zstd compression contexts to avoid unnecessary FFI calls #9808 [arrow] (mbutrovich)
parquet: O(1) skip for bw=0 miniblocks in DeltaBitPackDecoder #9786 [parquet] (sahuagin)
chore: add benchmark for row filters with LIMIT short-circuit #9767 [parquet] (haohuaijin)
Push LIMIT / OFFSET into the last RowFilter predicate and skip unused row groups #9766 [parquet] (haohuaijin)
feat(ipc): Remove per-message flush in IPC writer hot path #9763 [arrow] (pchintar)
perf(parquet): Defer fixed length byte array buffer alloc and skip zero-batch init #9756 [parquet] (lyang24)
feat(parquet): batch consecutive null/empty rows in write_list #9752 [parquet] (HippoBaro)
Remove len field from buffer builder #9750 [arrow] (cetra3)
perf(parquet): Vectorize dict-index bounds check in RleDecoder::get_batch_with_dict (up to -7.9%) #9746 [parquet] (Dandandan)
feat(parquet): precompute offset_index_disabled at build-time #9724 [parquet] (HippoBaro)
[Parquet] Improve dictionary decoder by unrolling loops #9662 [parquet] (Dandandan)
[Json] Use partition and take in RunEndEncoded decoder #9658 [arrow] (liamzwbao)
Improve take performance on List arrays #9643 [arrow] (AdamGS)
[Json] Replace ArrayData with typed Array construction in json-reader #9497 [arrow] (liamzwbao)
feat(parquet): stream-encode definition/repetition levels incrementally #9447 [parquet] (HippoBaro)

Closed issues:

Incorrect buffer skipping for V4 Union types in IPC skip_field #9828 [arrow]
Replace wildcard match in skip_field with explicit DataType handling #9821 [arrow]
Column projection misalignment for ListView / LargeListView in IPC reader #9805 [arrow]
Avoid panic on malformed compressed buffer prefix in IPC #9801 [arrow]
DeltaByteArrayDecoder panics on invalid prefix lengths #9796 [parquet]
Use NullBufferBuilder when reading json #9781 [arrow]
Perfectly shredded arrays with top-level null values lose nullability when typed_value is extracted #9701
[Parquet Metadata] API to determine page-index presence separately from page-index load #9693
Union cast is incorrect for duplicate field names #9664 [arrow]
List and ListView are missing take benchmarks #9627 [arrow]
Support RunEndEncoded arrays in comparison kernels (eq, lt, etc.) #9620 [arrow]
variant_get should follow JSONpath semantics #9606
GenericByteViewArray: support finding total length of all strings #9435 [arrow]

Merged pull requests:

support length() on Run-end encoding arrays #9838 [arrow] (Rich-T-kid)
fix(ipc): correct skip_field handling for V4 Union #9829 [arrow] (pchintar)
fix(ipc): replace wildcard in skip_field with explicit DataType handling #9822 [arrow] (pchintar)
Prevent buffer builder length overflow in MutableBuffer::extend_zeros #9820 [arrow] (alamb)
Prevent repeat slice length overflow #9819 [arrow] (alamb)
Prevent BitChunks length overflow #9818 [arrow] (alamb)
Prevent Rows row index overflow #9817 [arrow] (alamb)
Prevent ArrayData validation length overflow #9816 [arrow] (alamb)
[Json] Remove arrow-data dependency from arrow-json #9812 [arrow] (liamzwbao)
Replace BooleanBufferBuilder with NullBufferBuilder in arrow-json if applicable #9811 [arrow] (liamzwbao)
refactor(ipc): derive Default for CompressionContext #9809 [arrow] (mbutrovich)
fix(ipc): reader misalignment when skipping ListView / LargeListView columns #9806 [arrow] (pchintar)
fix(ipc): Avoid panic on malformed compressed buffer prefix #9802 [arrow] (pchintar)
parquet: fix panic in DeltaByteArrayDecoder on invalid prefix lengths #9797 [parquet] (pchintar)
feat(parquet): fuse level encoding with counting and histogram updates #9795 [parquet] (HippoBaro)
Expose ColumnCloseResult on ArrowColumnChunk #9773 [parquet] (leoyvens)
feat: make FFI structs fields pub #9772 [arrow] (ashdnazg)
chore: Refine the error message for List to non List cast #9757 [arrow] (comphead)
refactor(parquet): replace magic 8 literals with named constants #9751 [parquet] (HippoBaro)
feat(ipc): add with_skip_validation to StreamDecoder #9749 [arrow] (pantShrey)
remove panics in unshred variant #9741 (friendlymatthew)
Add benchmark for ListView interleave #9738 [arrow] (vegarsti)
arrow-arith: fix 'occured' -> 'occurred' in arity.rs comments #9736 [arrow] (SAY-5)
Refactor RleEncoder::flush_bit_packed_run to make flow clearer #9735 [parquet] (etseidl)
Fix RecordBatch::normalize() null bitmap bug and add StructArray::flatten() #9733 [arrow] (sqd)
Add benchmark for cast from/to decimals #9729 [arrow] (klion26)
refactor(arrow-avro): use Decoder::flush_block in async reader #9726 [arrow] (mzabaluev)
fix: ParquetError when reading corrupt parquet file with truncated data instead of Panic #9725 [parquet] (xuzifu666)
feat(parquet): add wide-schema writer overhead benchmark #9723 [parquet] (HippoBaro)
fix: correct accounting in DictEncoder::estimated_memory_size, Interner::estimated_memory_size #9720 [parquet] (mzabaluev)
arrow-ipc: Write 0 offset buffer for length-0 variable-size arrays #9717 [arrow] (atwam)
[Json] Support FixedSizeList in json decoder #9715 [arrow] (liamzwbao)
chore(deps): bump actions/upload-pages-artifact from 4 to 5 #9713 (dependabot[bot])
Fix clippy warning in fixed_size_binary_array.rs #9712 [arrow] (AdamGS)
feat: add has_non_empty_nulls helper function in OffsetBuffer #9711 [arrow] (rluvaton)
chore(deps): bump pytest from 7.2.0 to 9.0.3 in /parquet/pytest #9706 [parquet] (dependabot[bot])
Fedora license audit #9704 [arrow] (michel-slm)
[Variant] Take top-level nulls into consideration when extracting perfectly shredded children #9702 (AdamGS)
feat(parquet): add push_decoder benchmark for PushBuffers overhead #9696 [parquet] (HippoBaro)
Add mutable bitwise operations to BooleanArray and NullBuffer::union_many #9692 [arrow] (mbutrovich)
chore(deps): update hashbrown requirement from 0.16.0 to 0.17.0 #9691 [parquet] [arrow] (dependabot[bot])
chore(deps): bump actions/github-script from 8 to 9 #9690 (dependabot[bot])
minor: Re-enable CDC bench #9686 [parquet] (etseidl)
[Variant] Add VariantArrayBuilder::append_nulls API #9685 (sdf-jkl)
feat(parquet): add struct-column writer benchmarks #9679 [parquet] (HippoBaro)
[Arrow] Add API to check if Field has a valid ExtensionType #9677 [parquet] [arrow] (sdf-jkl)
[Variant] variant_get should follow JSONPath semantics for Field path element #9676 (sdf-jkl)
ParquetMetaDataPushDecoder API to clear all buffered ranges #9673 [parquet] (nathanb9)
Fix union cast incorrectness for duplicate field names #9666 [arrow] (friendlymatthew)
chore: re-export MAX_INLINE_VIEW_LEN from arrow_data #9665 [arrow] (rluvaton)
No longer allow BIT_PACKED level encoding in Parquet writer #9656 [parquet] (etseidl)
feat(parquet): add sparse-column writer benchmarks #9654 [parquet] (HippoBaro)
Support GenericListViewArray::new_unchecked and refactor ListView json decoder #9648 [arrow] (liamzwbao)
[Json] Add json reader benchmarks for ListView #9647 [arrow] (liamzwbao)
fix(parquet): fix CDC panic on nested ListArrays with null entries #9644 [parquet] (kszucs)
Add a test for reading nested REE data in json #9634 [arrow] (alamb)
[Variant] Fix variant_get to return List<T> instead of List<Struct> #9631 (liamzwbao)
ci: use ubuntu-slim runner for lightweight CI jobs #9630 (CuteChuanChuan)
Add bloom filter folding to automatically size SBBF filters #9628 [parquet] (adriangb)
Add List and ListView take benchmarks #9626 [arrow] (AdamGS)
ParquetPushDecoder API to clear all buffered ranges #9624 [parquet] (nathanb9)
fix: handle missing dictionary batch for null-only columns in IPC reader #9623 [arrow] (joaquinhuigomez)
Fix MutableBuffer::clear #9622 [parquet] [arrow] (Rafferty97)
feat[arrow-ord]: support REE comparisons #9621 [arrow] (asubiotto)
chore(deps): update sha2 requirement from 0.10 to 0.11 #9618 [arrow] (dependabot[bot])
Expose option to set line terminator for CSV writer #9617 [arrow] (svranesevic)
[Json] Add json reader benchmarks for Map and REE #9616 [arrow] (liamzwbao)
deps: fix object_store breakage for 0.13.2 #9612 (mzabaluev-flarion)
[Variant] Support Binary/LargeBinary children #9610 (AdamGS)
fix: use writer types in Skipper for resolved named record types #9605 [arrow] (ariel-miculas)
feat(parquet): derive PartialEq and Eq for CdcOptions #9602 [parquet] (kszucs)
Add finish_preserve_values to ArrayBuilder trait #9601 [arrow] (adamreichold)
[Variant] extend shredded null handling for arrays #9599 (sdf-jkl)
[Variant] Add unshredded Struct fast-path for variant_get(..., Struct) #9597 (sdf-jkl)
Pre-reserve output capacity in ByteView/ByteArray dictionary decoding #9590 [parquet] (Dandandan)
[Variant] Align cast logic for variant_get to cast kernel for numeric/bool types #9563 [arrow] (klion26)
Add support to cast from UnionArray #9544 [arrow] (friendlymatthew)
Support ListView codec in arrow-json #9503 [arrow] (liamzwbao)

* This Changelog was automatically generated by github_changelog_generator
