Changelog
56.0.0 (2025-07-29)
Breaking changes:
- arrow-schema: Remove dict_id from being required equal for merging #7968 [arrow] (brancz)
- [Parquet] Use u64 for SerializedPageReaderState.offset & remaining_bytes, instead of usize #7918 [parquet] (JigaoLuo)
- Upgrade tonic dependencies to 0.13.0 version (try 2) #7839 [arrow] [arrow-flight] (alamb)
- Remove deprecated Arrow functions #7830 [arrow] [arrow-flight] (etseidl)
- Remove deprecated temporal functions #7813 [arrow] (etseidl)
- Remove functions from parquet crate deprecated in or before 54.0.0 #7811 [parquet] (etseidl)
- GH-7686: [Parquet] Fix int96 min/max stats #7687 [parquet] (rahulketch)
Implemented enhancements:
- [parquet] Relax type restriction to allow writing dictionary/native batches for same column #8004
- Support casting int64 to interval #7988 [arrow]
- [Variant] Add ListBuilder::with_value for convenience #7951 [parquet]
- [Variant] Add ObjectBuilder::with_field for convenience #7949 [parquet]
- [Variant] Impl PartialEq for VariantObject #7943 #7948
- [Variant] Offer simdutf8 as an optional dependency when validating metadata #7902 [parquet] [arrow]
- [Variant] Avoid collecting offset iterator #7901 [parquet]
- [Variant] Remove superfluous check when validating monotonic offsets #7900 [parquet]
- [Variant] Avoid extra allocation in ObjectBuilder #7899 [parquet]
- [Variant][Compute] variant_get kernel #7893 [parquet]
- [Variant][Compute] Add batch processing for Variant-JSON String conversion #7883 [parquet]
- Support MapArray in lexsort #7881 [arrow]
- [Variant] Add testing for invalid variants (fuzz testing??) #7842 [parquet]
- [Variant] VariantMetadata, VariantList and VariantObject are too big for Copy #7831 [parquet]
- Allow choosing flate2 backend #7826 [parquet]
- [Variant] Tests for creating "large" VariantObjects #7821 [parquet]
- [Variant] Tests for creating "large" VariantLists #7820 [parquet]
- [Variant] Support VariantBuilder to write to buffers owned by the caller #7805 [parquet]
- [Variant] Move JSON related functionality to different crate. #7800 [parquet]
- [Variant] Add flag in ObjectBuilder to control validation behavior on duplicate field write #7777 [parquet]
- [Variant] make serde_json an optional dependency of parquet-variant #7775 [parquet]
- [coalesce] Implement specialized BatchCoalescer::push_batch for PrimitiveArray #7763 [arrow]
- Add sort_kernel benchmark for StringViewArray case #7758 [arrow]
- [Variant] Improved API for accessing Variant Objects and lists #7756 [parquet]
- Buildable reproducible release builds #7751
- Allow per-column parquet dictionary page size limit #7723 [parquet]
- [Variant] Test and implement efficient building for "large" Arrays #7699 [parquet]
- [Variant] Improve VariantBuilder when creating field name dictionaries / sorted dictionaries #7698 [parquet]
- [Variant] Add input validation in VariantBuilder #7697 [parquet]
- [Variant] Support Nested Data in VariantBuilder #7696 [parquet]
- Parquet: Incorrect min/max stats for int96 columns #7686 [parquet]
- Add DictionaryArray::gc method #7683 [arrow]
- [Variant] Add negative tests for reading invalid primitive variant values #7645 [parquet]
Fixed bugs:
- [Variant] Panic when appending nested objects to VariantBuilder #7907 [parquet]
- Panic when casting large Decimal256 to f64 due to unchecked unwrap() #7886 [arrow]
- Incorrect inlined string view comparison after "Add prefix compare for inlined" #7874 [parquet] [arrow]
- [Variant] test_json_to_variant_object_very_large takes over 20s #7872 [parquet]
- [Variant] If ObjectBuilder::finalize is not called, the resulting Variant object is malformed. #7863 [parquet]
- CSV error message has values transposed #7848 [arrow]
- Concating struct arrays with no fields unnecessarily errors #7828 [arrow]
- Clippy CI is failing on main after Rust 1.88 upgrade #7796 [parquet] [arrow] [arrow-flight]
- [Variant] Field lookup with out of bounds index causes unwanted behavior #7784 [parquet]
- Error verifying parquet-variant crate on 55.2.0 with verify-release-candidate.sh #7746
- test_to_pyarrow tests fail during release verification #7736 [arrow]
- [parquet_derive] Example for ParquetRecordWriter is broken. #7732
- [Variant] Variant::Object can contain two fields with the same field name #7730 [parquet]
- [Variant] Panic when appending Object or List to VariantBuilder #7701 [parquet]
- Slicing a single-field dense union array creates an array with incorrect logical_nulls length #7647 [arrow]
- Ensure page encoding statistics are written to Parquet file #7643 [parquet] (etseidl)
Documentation updates:
- Minor: Update cast_with_options docs about casting integers --> intervals #8002 [arrow] (alamb)
- docs: More docs to BatchCoalescer #7891 [arrow] (2010YOUY01)
- chore: fix a typo in ExtensionType::supports_data_type docs #7682 [arrow] (mbrobbel)
- [Variant] Add variant docs and examples #7661 [parquet] (alamb)
- Minor: Add version to deprecation notice for ParquetMetaDataReader::decode_footer #7639 [parquet] (etseidl)
Performance improvements:
- RowConverter on list should only encode the sliced list values and not the entire data #7993 [arrow]
- [Variant] Avoid extra allocation in list builder #7977 [parquet]
- [Variant] Convert JSON to Variant with fewer copies #7964 [parquet]
- Optimize sort kernels partition_validity method #7936 [arrow]
- Speedup sorting for inline views #7857 [arrow]
- Perf: Investigate and improve parquet writing performance #7822 [parquet] [arrow]
- Perf: optimize sort string_view performance #7790 [arrow]
- Clickbench microbenchmark spends significant time in memcmp for not_empty predicate #7766 [arrow]
- Use prefix first for comparisons, resort to data buffer for remaining data on equal values #7744 [arrow]
- Change use of inline_value to inline it to a u128 #7743 [arrow]
- Add efficient way to upgrade keys for additional dictionary builders #7654 [arrow]
- Perf: Make sort string view fast(1.5X ~ 3X faster) #7792 [arrow] (zhuqi-lucas)
- Add specialized coalesce path for PrimitiveArrays #7772 [arrow] (alamb)
Closed issues:
- Implement full-range i256::to_f64 to replace current ±∞ saturation for Decimal256 → Float64 #7985
- [Variant] impl FromIterator for VariantPath #7955
- validated and is_fully_validated flags don't need to be part of PartialEq #7952 [parquet]
- [Variant] remove VariantMetadata::dictionary_size #7947 [parquet]
- [Variant] Improve VariantArray performance by storing the index of the metadata and value arrays #7920
- [Variant] Converting variant to JSON string seems slow #7869 [parquet]
- [Variant] Present Variant at Iceberg Summit NYC July 10, 2025 #7858
- [Variant] Avoid second copy of field name in MetadataBuilder #7814 [parquet]
- Remove APIs deprecated in or before 54.0.0 #7810 [parquet] [arrow] [arrow-flight]
- [Variant] Make it harder to forget to finish a pending parent in ObjectBuilder #7798 [parquet]
- [Variant] Remove explicit ObjectBuilder::finish() and ListBuilder::finish and move to Drop impl #7780 [parquet]
- Reduce repetition in tests for arrow-row/src/run.rs #7692 [arrow]
- [Variant] Add tests for invalid variant values (aka verify invalid inputs) #7681 [parquet]
- [Variant] Introduce structs for Variant::Decimal types #7660 [parquet]
Merged pull requests:
- Add benchmark for converting StringViewArray with mixed short and long strings #8015 [arrow] (ding-young)
- [Variant] impl FromIterator for VariantPath #8011 [parquet] (sdf-jkl)
- Create empty buffer for a buffer specified in the C Data Interface with length zero #8009 [arrow] (viirya)
- bench: add benchmark for converting list and sliced list to row format #8008 [arrow] (rluvaton)
- bench: benchmark interleave structs #8007 [arrow] (rluvaton)
- [Parquet] Allow writing compatible DictionaryArrays to parquet writer #8005 [parquet] (albertlockett)
- doc: remove outdated info from CONTRIBUTING doc in project root dir. #7998 (sonhmai)
- perf: only encode actual list values in RowConverter (16-26 times faster for small sliced list) #7996 [arrow] (rluvaton)
- test: add tests for converting sliced list to row based #7994 [arrow] (rluvaton)
- perf: Improve interleave performance for struct (3-6 times faster) #7991 [arrow] (rluvaton)
- [Variant] Avoid extra buffer allocation in ListBuilder #7987 [parquet] (klion26)
- Implement full-range i256::to_f64 to eliminate ±∞ saturation for Decimal256 → Float64 casts #7986 [arrow] (kosiew)
- Minor: Restore warning comment on Int96 statistics read #7975 [parquet] (alamb)
- Add additional integration tests to arrow-avro #7974 [arrow] (nathaniel-d-ef)
- Perf: optimize actual_buffer_size to use only data buffer capacity for coalesce #7967 [arrow] (zhuqi-lucas)
- Implement Improved arrow-avro Reader Zero-Byte Record Handling #7966 [arrow] (jecsand838)
- Perf: improve sort via partition_validity to use fast path for bit map scan (up to 30% faster) #7962 [arrow] (zhuqi-lucas)
- [Variant] Revisit VariantMetadata and Object equality #7961 [parquet] (friendlymatthew)
- [Variant] Add ListBuilder::with_value for convenience #7959 [parquet] (codephage2020)
- [Variant] remove VariantMetadata::dictionary_size #7958 [parquet] (codephage2020)
- [Variant] VariantMetadata is allowed to contain the empty string #7956 [parquet] (scovich)
- Add arrow-avro support for Impala Nullability #7954 [arrow] (veronica-m-ef)
- [Test] Add tests for VariantList equality #7953 [parquet] (alamb)
- [Variant] Add ObjectBuilder::with_field for convenience #7950 [parquet] (alamb)
- [Variant] Adding code to store metadata and value references in VariantArray #7945 (abacef)
- [Variant] Add variant_kernels benchmark #7944 (alamb)
- [Variant] Impl PartialEq for VariantObject #7943 [parquet] (friendlymatthew)
- [Variant] Add documentation, tests and cleaner api for Variant::get_path #7942 [parquet] (alamb)
- arrow-ipc: Remove all abilities to preserve dict IDs #7940 [parquet] [arrow] [arrow-flight] (brancz)
- Optimize partition_validity function used in sort kernels #7937 [arrow] (jhorstmann)
- [Variant] Avoid extra allocation in object builder #7935 [parquet] (klion26)
- [Variant] Avoid collecting offset iterator #7934 [parquet] (codephage2020)
- Minor: Support BinaryView and StringView builders in make_builder #7931 [arrow] (kylebarron)
- chore: bump MSRV to 1.84 #7926 [parquet] [arrow] [arrow-flight] (mbrobbel)
- Update bzip2 requirement from 0.4.4 to 0.6.0 #7924 [arrow] (mbrobbel)
- [Variant] Reserve capacity beforehand during large object building #7922 [parquet] (friendlymatthew)
- [Variant] Add variant_get compute kernel #7919 [parquet] (Samyak2)
- Improve memory usage for arrow-row -> String/BinaryView when utf8 validation disabled #7917 [arrow] (ding-young)
- Restructure compare_greater function used in parquet statistics for better performance #7916 [parquet] (jhorstmann)
- [Variant] Support appending complex variants in VariantBuilder #7914 [parquet] (friendlymatthew)
- [Variant] Add VariantBuilder::new_with_buffers to write to existing buffers #7912 [parquet] (alamb)
- Convert JSON to VariantArray without copying (8 - 32% faster) #7911 [parquet] (alamb)
- [Variant] Use simdutf8 for UTF-8 validation #7908 [parquet] [arrow] (codephage2020)
- [Variant] Avoid superfluous validation checks #7906 [parquet] (friendlymatthew)
- Add VariantArray and VariantArrayBuilder for constructing Arrow Arrays of Variants #7905 (alamb)
- Update sysinfo requirement from 0.35.0 to 0.36.0 #7904 [parquet] (dependabot[bot])
- Fix current CI failure #7898 [arrow] (viirya)
- Remove redundant is_err checks in Variant tests #7897 [parquet] (viirya)
- [Variant] test: add variant object tests with different sizes #7896 [parquet] (odysa)
- [Variant] Define basic convenience methods for variant pathing #7894 [parquet] (scovich)
- fix: view_types benchmark slice should follow by correct len array #7892 [arrow] (zhuqi-lucas)
- Add arrow-avro support for bzip2 and xz compression #7890 [arrow] (jecsand838)
- Add arrow-avro support for Duration type and minor fixes for UUID decoding #7889 [arrow] (jecsand838)
- [Variant] Reduce variant-related struct sizes #7888 [parquet] (scovich)
- Fix panic on lossy decimal to float casting: round to saturation for overflows #7887 [arrow] (kosiew)
- Add tests for invalid variant metadata and value #7885 [parquet] (viirya)
- [Variant] Introduce parquet-variant-compute crate to transform batches of JSON strings to and from Variants #7884 (harshmotw-db)
- feat: support MapArray in lexsort #7882 [arrow] (rluvaton)
- fix: mark DataType::Map as unsupported in RowConverter #7880 [arrow] (rluvaton)
- [Variant] Speedup validation #7878 [parquet] (friendlymatthew)
- benchmark: Add StringViewArray gc benchmark with not null cases #7877 [arrow] (zhuqi-lucas)
- [ARROW-RS-7820][Variant] Add tests for large variant lists #7876 [parquet] (klion26)
- fix: Incorrect inlined string view comparison after Add prefix compar… #7875 [arrow] (zhuqi-lucas)
- perf: speed up StringViewArray gc 1.4 ~5.x faster #7873 [arrow] (zhuqi-lucas)
- [Variant] Remove superfluous validate call and rename methods #7871 [parquet] (friendlymatthew)
- Benchmark: Add rich testing cases for sort string(utf8) #7867 [arrow] (zhuqi-lucas)
- chore: update link for row_filter.rs #7866 [parquet] (haohuaijin)
- [Variant] List and object builders have no effect until finalized #7865 [parquet] (scovich)
- Added number to string benches for json_writer #7864 [arrow] (abacef)
- [Variant] Introduce parquet-variant-json crate #7862 [parquet] (alamb)
- [Variant] Remove dead code, add comments #7861 [parquet] (alamb)
- Speedup sorting for inline views: 1.4x - 1.7x improvement #7856 [arrow] (Dandandan)
- Fix union slice logical_nulls length #7855 [arrow] (codephage2020)
- Add get_ref/get_mut to JSON Writer #7854 [arrow] (cetra3)
- [Minor] Add Benchmark for RowConverter::append #7853 [arrow] (Dandandan)
- Add Enum type support to arrow-avro and Minor Decimal type fix #7852 [arrow] (jecsand838)
- CSV error message has values transposed #7851 [arrow] (Omega359)
- [Variant] Fuzz testing and benchmarks for validation #7849 [parquet] (carpecodeum)
- [Variant] Follow up nits and uncomment test cases #7846 [parquet] (friendlymatthew)
- [Variant] Make sure ObjectBuilder and ListBuilder to be finalized before its parent builder #7843 [parquet] (viirya)
- Add decimal32 and decimal64 support to Parquet, JSON and CSV readers and writers #7841 [parquet] [arrow] (CurtHagenlocher)
- Implement arrow-avro Reader and ReaderBuilder #7834 [arrow] (jecsand838)
- [Variant] Support creating sorted dictionaries #7833 [parquet] (friendlymatthew)
- Add Decimal type support to arrow-avro #7832 [arrow] (jecsand838)
- Allow concating struct arrays with no fields #7829 [arrow] (AdamGS)
- Add features to configure flate2 #7827 [parquet] (zeevm)
- make builder public under experimental #7825 [parquet] (XiangpengHao)
- Improvements for parquet writing performance (25%-44%) #7824 [parquet] [arrow] (jhorstmann)
- Use in-memory buffer for arrow_writer benchmark #7823 [parquet] (jhorstmann)
- [Variant] impl [Try]From for VariantDecimalXX types #7809 [parquet] (scovich)
- [Variant] Speedup ObjectBuilder (62x faster) #7808 [parquet] (friendlymatthew)
- [VARIANT] Support both fallible and infallible access to variants #7807 [parquet] (scovich)
- Minor: fix clippy in parquet-variant after logical conflict #7803 [parquet] (alamb)
- [Variant] Add flag in ObjectBuilder to control validation behavior on duplicate field write #7801 [parquet] (micoo227)
- Fix clippy for Rust 1.88 release #7797 [parquet] [arrow] [arrow-flight] (alamb)
- [Variant] Simplify Builder buffer operations #7795 [parquet] (friendlymatthew)
- fix: Change panic to error in take kernel for StringArray/BinaryArray on overflow #7793 [arrow] (chenkovsky)
- Update base64 requirement from 0.21 to 0.22 #7791 [parquet] (dependabot[bot])
- Fix RowConverter when FixedSizeList is not the last #7789 [arrow] (findepi)
- Add schema with only primitive arrays to coalesce_kernel benchmark #7788 [arrow] (alamb)
- Add sort_kernel benchmark for StringViewArray case #7787 [arrow] (zhuqi-lucas)
- [Variant] Check pending before VariantObject::insert #7786 [parquet] (friendlymatthew)
- [VARIANT] impl Display for VariantDecimalXX #7785 [parquet] [arrow] (scovich)
- [VARIANT] Add support for the json_to_variant API #7783 [parquet] (harshmotw-db)
- [Variant] Consolidate examples for json writing #7782 [parquet] (alamb)
- Add benchmark for about view array slice #7781 [arrow] (ctsk)
- [Variant] Add negative tests for reading invalid primitive variant values #7779 [parquet] (superserious-dev)
- [Variant] Support creating nested objects and object with lists #7778 [parquet] (friendlymatthew)
- [VARIANT] Validate precision in VariantDecimalXX structs and add missing tests #7776 [parquet] (scovich)
- Add tests for BatchCoalescer::push_batch_with_filter, fix bug #7774 [arrow] (alamb)
- [Variant] Minor: make fields in VariantDecimal* private, add examples #7770 [parquet] (alamb)
- Extend the fast path in GenericByteViewArray::is_eq for comparing against empty strings #7767 [arrow] (jhorstmann)
- [Variant] Improve getter API for VariantList and VariantObject #7757 [parquet] (friendlymatthew)
- [Variant] Add Variant::as_object and Variant::as_list #7755 [parquet] (alamb)
- [Variant] Fix several overflow panic risks for 32-bit arch #7752 [parquet] (scovich)
- Add testing section to pull request template #7749 (alamb)
- Perf: Add prefix compare for inlined compare and change use of inline_value to inline it to a u128 #7748 [arrow] (zhuqi-lucas)
- Move arrow-pyarrow tests that require pyarrow to be installed into arrow-pyarrow-testing crate #7742 (alamb)
- [Variant] Improve write API in Variant::Object #7741 [parquet] (friendlymatthew)
- [Variant] Support nested lists and object lists #7740 [parquet] (friendlymatthew)
- feat: [Variant] Add Validation for Variant Decimal #7738 [parquet] (Weijun-H)
- Add fallible versions of temporal functions that may panic #7737 [arrow] (adriangb)
- fix: Implement support for appending Object and List variants in VariantBuilder #7735 [parquet] (Weijun-H)
- parquet_derive: update in working example for ParquetRecordWriter #7733 (LanHikari22)
- Perf: Optimize comparison kernels for inlined views #7731 [arrow] (zhuqi-lucas)
- arrow-row: Refactor arrow-row REE roundtrip tests #7729 [arrow] (brancz)
- arrow-array: Implement PartialEq for RunArray #7727 [arrow] (brancz)
- fix: Do not add null buffer for NullArray in MutableArrayData #7726 [arrow] (comphead)
- Allow per-column parquet dictionary page size limit #7724 [parquet] (XiangpengHao)
- fix JSON decoder error checking for UTF16 / surrogate parsing panic #7721 [arrow] (nicklan)
- [Variant] Use BTreeMap for VariantBuilder.dict and ObjectBuilder.fields to maintain invariants upon entry writes #7720 [parquet] (friendlymatthew)
- Introduce MAX_INLINE_VIEW_LEN constant for string/byte views #7719 [arrow] (alamb)
- [Variant] Introduce new type over &str for ShortString #7718 [parquet] (friendlymatthew)
- Split out variant code into several new sub-modules #7717 [parquet] (scovich)
- add garbage_collect_dictionary to arrow-select #7716 [arrow] (davidhewitt)
- Support write to buffer api for SerializedFileWriter #7714 [parquet] (zhuqi-lucas)
- Support FixedSizeList RowConverter #7705 [arrow] (findepi)
- Make variant iterators safely infallible #7704 [parquet] (scovich)
- Speedup interleave_views (4-7x faster) #7695 [arrow] (Dandandan)
- Define an "arrow-pyarrow" crate to implement the "pyarrow" feature. #7694 [arrow] (brunal)
- feat: add constructor to efficiently upgrade dict key type to remaining builders #7689 [arrow] (albertlockett)
- Document REE row format and add some more tests #7680 [arrow] (alamb)
- feat: add min max aggregate support for FixedSizeBinary #7675 [arrow] (alexwilcoxson-rel)
- arrow-data: Add REE support for build_extend and build_extend_nulls #7671 [arrow] (brancz)
- Variant: Write Variant Values as JSON #7670 [parquet] (carpecodeum)
- Remove lazy_static dependency #7669 [arrow] (Expyron)
- Finish implementing Variant::Object and Variant::List #7666 [parquet] (scovich)
- Add RecordBatch::schema_metadata_mut and Field::metadata_mut #7664 [arrow] (emilk)
- [Variant] Simplify creation of Variants from metadata and value #7663 [parquet] (alamb)
- chore: group prost dependabot updates #7659 (mbrobbel)
- Initial Builder API for Creating Variant Values #7653 [parquet] (PinkCrow007)
- Add BatchCoalescer::push_filtered_batch and docs #7652 [arrow] (alamb)
- Optimize coalesce kernel for StringView (10-50% faster) #7650 [arrow] (alamb)
- arrow-row: Add support for REE #7649 [arrow] (brancz)
- Use approximate comparisons for pow tests #7646 [arrow] (adamreeve)
- [Variant] Implement read support for remaining primitive types #7644 [parquet] (superserious-dev)
- Add pretty_format_batches_with_schema function #7642 [arrow] (lewiszlw)
- Deprecate old Parquet page index parsing functions #7640 [parquet] (etseidl)
- Update FlightSQL GetDbSchemas and GetTables schemas to fully match the protocol #7638 [arrow] [arrow-flight] (sgrebnov)
- Minor: Remove outdated FIXME from ParquetMetaDataReader #7635 [parquet] (etseidl)
- Fix the error info of StructArray::try_new #7634 [arrow] (xudong963)
- Fix reading encrypted Parquet pages when using the page index #7633 [parquet] (adamreeve)
- [Variant] Add commented out primitive test cases #7631 [parquet] (alamb)
* This Changelog was automatically generated by github_changelog_generator