Changelog
57.0.0 (2025-10-19)
Breaking changes:
- Use `Arc<FileEncryptionProperties>` everywhere to be consistent with `FileDecryptionProperties` #8626 [parquet] (alamb)
- feat: Improve DataType display for `RunEndEncoded` #8596 [arrow] (Weijun-H)
- Add `ArrowError::AvroError`, remaining types and roundtrip tests to `arrow-avro` #8595 [arrow] (jecsand838)
- [thrift-remodel] Refactor Thrift encryption and store encodings as bitmask #8587 [parquet] (etseidl)
- feat: Enhance `Map` display formatting in DataType #8570 [arrow] (Weijun-H)
- feat: Enhance DataType display formatting for `ListView` and `LargeListView` variants #8569 [arrow] (Weijun-H)
- Use custom thrift parser for parquet metadata (phase 1 of Thrift remodel) #8530 [parquet] (etseidl)
- refactor: improve display formatting for Union #8529 [arrow] (Weijun-H)
- Use `Arc<FileDecryptionProperties>` to reduce size of ParquetMetadata and avoid copying when `encryption` is enabled #8470 [parquet] (alamb)
- Fix for column name based projection mask creation #8447 [parquet] (etseidl)
- Improve Display formatting of DataType::Timestamp #8425 [parquet] [arrow] (emilk)
- Use more compact Debug formatting of Field #8424 [arrow] (emilk)
- Reuse zstd compression context when writing IPC #8405 [arrow] [arrow-flight] (albertlockett)
- [Decimal] Add scale argument to validation functions to ensure accurate error logging #8396 [arrow] (Weijun-H)
- Quote `DataType::Struct` field names in `Display` formatting #8291 [parquet] [arrow] (emilk)
- Improve `Display` for `DataType` and `Field` #8290 [parquet] [arrow] (emilk)
- Bump pyo3 to 0.26.0 #8286 (mbrobbel)
Implemented enhancements:
- Added Avro support (new `arrow-avro` crate) #4886
- parquet-rewrite: supports compression level and write batch size #8639
- Error not panic when int96 statistics aren't size 12 #8614 [parquet]
- [Variant] Make `VariantArray` iterable #8612
- [Variant] impl `PartialEq` for `VariantArray` #8610
- [Variant] Remove potential panics when probing `VariantArray` #8609
- [Variant] Remove ceremony of going from list of `Variant` to `VariantArray` #8606
- Eliminate redundant validation in `RecordBatch::project` #8591 [arrow]
- [PARQUET][BENCH] Arrow writer bench with compression and/or page v2 #8559 [parquet]
- [Variant] casting functions are confusingly named #8531 [parquet]
- Support writing GeospatialStatistics in Parquet writer #8523 [parquet]
- [thrift-remodel] Optimize `convert_row_groups` #8517 [parquet]
- [Variant] Add variant to arrow primitive support for boolean/timestamp/time #8515
- Test `thrift-remodel` branch with DataFusion #8513 [parquet]
- Make `UnionArray::is_dense` Method Public #8503 [arrow]
- Add `append_n` method to `FixedSizeBinaryDictionaryBuilder` #8497 [arrow]
- [Parquet] Reduce size of ParquetMetadata when encryption feature is enabled #8469 [parquet]
- [Parquet] Remove useless mut requirements in getting bloom filter function #8461 [parquet]
- Change `serde` dependency to `serde_core` where applicable #8451 [arrow]
- [Parquet] Split `ParquetMetadataReader` into IO/decoder state machine and thrift parsing #8439 [parquet]
- Remove compiler warning for redundant config enablement #8412 [arrow]
- Add geospatial statistics creation support for GEOMETRY/GEOGRAPHY Parquet logical types #8411 [arrow]
- `arrow_json` lacks `with_timestamp_format` functions like `arrow_csv` had offered #8398 [arrow]
- Unify API for writing column chunks / row groups in parallel #8389 [parquet]
- Reuse zstd context in arrow IPC writer #8386 [arrow] [arrow-flight]
- [Variant] Support reading/writing Parquet Variant LogicalType #8370 [parquet]
- [Variant] Implement a `shred_variant` function #8361
- [Parquet] Expose ReadPlan and ReadPlanBuilder #8347 [parquet]
- [Variant] [Shredding] Support typed_access for `List` #8337 [parquet]
- [Variant] [Shredding] Support typed_access for `Struct` #8336 [parquet]
- [Variant] [Shredding] Support typed_access for `Time64(Microsecond)` #8334 [parquet]
- [Variant] [Shredding] Support typed_access for `Decimal128` #8332 [parquet]
- [Variant] [Shredding] Support typed_access for `Timestamp(Microsecond, _)` and `Timestamp(Nanosecond, _)` #8331 [parquet]
- [Variant] [Shredding] Support typed_access for `Date32` #8330 [parquet]
- [Variant] Support strict casting for all data types #8303
- [Variant] Support typed access for string types in variant_get #8285
- [Variant]: Implement `DataType::FixedSizeList` support for `cast_to_variant` kernel #8281
Fixed bugs:
- Fix arrow-avro Writer Documentation related to AvroBinaryFormat #8631 [arrow]
- Decimal -> Decimal cast wrongly fails for large scale reduction #8579 [arrow]
- [Parquet] Avoid fetching multiple pages when `max_predicate_cache_size` is 0 #8542 [parquet]
- DataType parsing no longer works correctly for old formatted timestamps #8539 [parquet] [arrow]
- [Parquet] ArrowWriter flush does not work #8534 [parquet]
- `arrow::compute::interleave` fails with struct arrays with no fields #8533 [arrow]
- [Parquet] Over memory consumption for writer page v1 compressed #8526 [parquet]
- Incorrect Behavior of Collecting a filtered iterator to a BooleanArray #8505 [arrow]
- [Parquet] ProjectionMask::columns name handling is bug prone #8443 [parquet]
- [Variant] Shredded typed_value columns must have valid variant types #8435 [parquet]
- cargo test -p parquet fails with default `ulimit` #8406 [parquet]
- Column with List(Struct) causes failed to decode level data for struct array #8404 [parquet]
- Binaryview Utf8 Cast Issue #8403 [arrow]
- Decimal precision validation displays value without accounting for scale #8382 [arrow]
- [Variant] `VariantArray::data_type` returns `StructType`, causing `Array::as_struct` to panic #8319 [parquet]
- [Variant] writing a VariantArray to parquet panics #8296 [parquet]
Documentation updates:
Performance improvements:
- [parquet] Improve encoding mask API (wrap bare i32 in a struct w/ docs) #8588 [parquet]
- bench: create `zip` kernel benchmarks #8654 [arrow] (rluvaton)
- Skip redundant validation checks in RecordBatch#project #8583 [arrow] (pepijnve)
- [thrift-remodel] Remove conversion functions for row group and column metadata #8574 [parquet] (etseidl)
- [PARQUET] Improve memory efficiency for compressed writer parquet 1.0 #8527 [parquet] (lilianm)
- perf: improve `GenericByteBuilder::append_array` to use SIMD for extending the offsets #8388 [arrow] (rluvaton)
Closed issues:
- Utf-8, LargeUtf8, Utf8View #8601
- [Variant] Improve the get type logic for DataType in variant to arrow row builder #8538
- Add a README.md for arrow-avro #8504 [arrow]
- Fix UnionArray references to "positive" values #8418 [arrow]
- [Variant] `metadata` field should be marked as non-nullable #8410 [parquet]
- [Avro] Example read_with_utf8view.rs fails to run with error "Error: ParseError("Unexpected EOF while reading Avro header")" #8380 [arrow]
- [Geospatial]: Add CI checks for `parquet-geospatial` crate #8377
- [Geospatial] Create new `parquet-geometry` crate #8374
Merged pull requests:
- parquet-rewrite: add write_batch_size and compression_level config #8642 [parquet] (mapleFU)
- Introduce a ThriftProtocolError to avoid allocating and formatting strings for error messages #8636 [parquet] (jhorstmann)
- [thrift-remodel] Add macro to reduce boilerplate necessary to implement Thrift serialization #8634 [parquet] (etseidl)
- Fix Writer docs and rename `AvroBinaryFormat` to `AvroSoeFormat` #8633 [arrow] (jecsand838)
- [Variant] Bulk insert elements into List and Object Builders #8629 (friendlymatthew)
- [Variant] impl `PartialEq` and `FromIterator<Option<..>>` for `VariantArray` #8627 (friendlymatthew)
- [Variant] Remove ceremony from iterator of variants into VariantArray #8625 (friendlymatthew)
- Undeprecate `ArrowWriter::into_serialized_writer` and add docs #8621 [parquet] (alamb)
- fix: incorrect assertion in `BitChunks::new` #8620 [arrow] (rluvaton)
- [Variant] Clean up redundant `get_type_name` #8617 (liamzwbao)
- [Minor] Hide thrift macros #8616 [parquet] (etseidl)
- Deprecate `parquet::format` module #8615 [parquet] (etseidl)
- [Variant] Make `VariantArray` iterable #8613 (friendlymatthew)
- [Variant] Impl `Extend` for `VariantArrayBuilder` #8611 (friendlymatthew)
- build(deps): bump actions/setup-node from 5 to 6 #8604 (dependabot[bot])
- Check int96 min/max instead of panicking #8603 [parquet] (rambleraptor)
- [thrift-remodel] Refactor Parquet Thrift code into new `thrift` module #8599 [parquet] (etseidl)
- [Parquet] Remove use of `parquet::format` in metadata bench code #8598 [parquet] (lichuang)
- Remove experimental warning from `extension` module #8597 [arrow] (mbrobbel)
- Adding `try_append_value` implementation to `ByteViewBuilder` #8594 [arrow] (samueleresca)
- Add RecordBatch::project microbenchmark #8592 [arrow] (pepijnve)
- [parquet] Add a sync fn to ArrowWriter that flushes Writer #8586 [parquet] (PiotrSrebrny)
- chore: use magic number `FOOTER_SIZE` instead of hard code number #8585 [parquet] (lichuang)
- Add support for run-end encoded (REE) arrays in arrow-avro #8584 [arrow] (jecsand838)
- Unify API for writing column chunks / row groups in parallel #8582 [parquet] (adamreeve)
- Fix linting issues missed by #8506 #8581 [parquet] (etseidl)
- Fix broken decimal->decimal casting with large scale reduction #8580 [arrow] (scovich)
- Migrate `arrow` and workspace to Rust 2024 #8578 [parquet] [arrow] [arrow-flight] (mbrobbel)
- Fix doctests of parquet push decoded without default features #8577 [parquet] (mbrobbel)
- Avoid panics and warnings when building avro without default features #8576 [arrow] (mbrobbel)
- Add support for 64-bit Schema Registry IDs (Id64) in arrow-avro #8575 [arrow] (jecsand838)
- fix: bug when struct nullability determined from `Dict<_, ByteArray>>` column #8573 [parquet] (albertlockett)
- fix: Support `interleave_struct` to handle empty fields #8563 [arrow] (Weijun-H)
- [Variant] Define and use VariantDecimalType trait #8562 (scovich)
- [PARQUET] Update parquet writer bench with compression and pagev2 #8560 [parquet] (lilianm)
- Replace serde with `serde_core` when possible #8558 [arrow] (AdamGS)
- fix: use default field name when name is None in Field conversion #8557 [arrow] (Weijun-H)
- Add arrow-avro README.md file #8556 [arrow] (jecsand838)
- minor(parquet): Fix test_not_found on Windows #8555 [parquet] (nuno-faria)
- [Parquet] Avoid fetching multiple pages when the predicate cache is disabled #8554 [parquet] (nuno-faria)
- [Variant] Support variant to `Decimal32/64/128/256` #8552 [arrow] (liamzwbao)
- Arrow-avro Writer Dense Union support #8550 [arrow] (nathaniel-d-ef)
- Arrow-Avro: Resolve named field discrepancies #8546 [arrow] (nathaniel-d-ef)
- Migrate `arrow-avro` to Rust 2024 #8545 [arrow] (mbrobbel)
- feat: Export `is_dense` public #8544 [arrow] (Weijun-H)
- Fix "Incorrect Behavior of Collecting a filtered iterator to a BooleanArray" #8543 [arrow] (tobixdev)
- Support old syntax for DataType parsing #8541 [arrow] (alamb)
- [Variant] Decimal unshredding support #8540 [parquet] (scovich)
- [Variant] Improve documentation and make kernels consistent #8536 [parquet] (alamb)
- feat: support casting from null to float16 #8535 [arrow] (chenkovsky)
- Add benchmarks for FromIter (PrimitiveArray and BooleanArray) #8525 [arrow] (tobixdev)
- Support writing GeospatialStatistics in Parquet writer #8524 [parquet] (paleolimbot)
- Fix some new rustdoc warnings #8522 [parquet] (etseidl)
- [Variant] Reverse VariantAsPrimitive trait to PrimitiveFromVariant #8519 (scovich)
- [Variant] Add variant to arrow primitive support for boolean/timestamp/time #8516 (klion26)
- [Variant] Add list support to unshred_variant #8514 [parquet] (scovich)
- Migrate `parquet-variant-json` to Rust 2024 #8512 (mbrobbel)
- Migrate `parquet-variant-compute` to Rust 2024 #8511 (mbrobbel)
- Migrate `parquet-variant` to Rust 2024 #8510 (mbrobbel)
- Migrate `parquet-geospatial` to Rust 2024 #8509 (mbrobbel)
- Migrate `parquet_derive_test` to Rust 2024 #8508 (mbrobbel)
- Migrate `parquet_derive` to Rust 2024 #8507 (mbrobbel)
- Migrate `parquet` to Rust 2024 #8506 [parquet] (mbrobbel)
- [Variant] ReadOnlyMetadataBuilder borrows its underlying VariantMetadata #8502 (scovich)
- [Variant] Add a VariantBuilderExt impl for VariantValueArrayBuilder #8501 (scovich)
- build(deps): update sysinfo requirement from 0.36.0 to 0.37.1 #8500 [parquet] (dependabot[bot])
- [Variant] Introduce new BorrowedShreddingState concept #8499 (scovich)
- Add `append_n` method to `FixedSizeBinaryDictionaryBuilder` #8498 [arrow] (albertlockett)
- Fix docs.rs build: Use `doc_cfg` instead of removed `doc_auto_cfg` #8494 [parquet] [arrow] [arrow-flight] (mbrobbel)
- Remove allow unused from arrow-avro lib.rs file #8493 [arrow] (jecsand838)
- Regression Testing, Bug Fixes, and Public API Tightening for arrow-avro #8492 [arrow] (jecsand838)
- Migrate `arrow-string` to Rust 2024 #8491 [arrow] (mbrobbel)
- Migrate `arrow-select` to Rust 2024 #8490 [arrow] (mbrobbel)
- Migrate `arrow-schema` to Rust 2024 #8489 [arrow] (mbrobbel)
- Migrate `arrow-row` to Rust 2024 #8488 [arrow] (mbrobbel)
- Migrate `arrow-pyarrow-testing` to Rust 2024 #8487 (mbrobbel)
- Migrate `arrow-pyarrow-integration-testing` to Rust 2024 #8486 (mbrobbel)
- Migrate `arrow-pyarrow` to Rust 2024 #8485 (mbrobbel)
- Migrate `arrow-ord` to Rust 2024 #8484 [arrow] (mbrobbel)
- [Variant] Support strict casting for Decimals #8483 (liamzwbao)
- feat(json): Add temporal formatting options when write to JSON #8482 [arrow] (linyihai)
- [Variant] Define and use unshred_variant function #8481 [parquet] (scovich)
- [Minor] Remove private APIs from Parquet metadata benchmark #8478 [parquet] (etseidl)
- Add examples of using `Field::try_extension_type` #8475 [arrow] (alamb)
- Fix Rustfmt in arrow-cast #8473 [arrow] (mbrobbel)
- Disable incremental builds in CI #8471 (mbrobbel)
- Update Rust toolchain to 1.90 #8468 [arrow] (mbrobbel)
- [Parquet] Minor: Remove mut ref for getting row-group bloom filter #8462 [parquet] (mapleFU)
- refactor: split `num` dependency #8459 [parquet] [arrow] (crepererum)
- Migrate `arrow-json` to Rust 2024 #8458 [arrow] (mbrobbel)
- Migrate `arrow-ipc` to Rust 2024 #8457 [arrow] (mbrobbel)
- Migrate `arrow-flight` to Rust 2024 #8456 [arrow] [arrow-flight] (mbrobbel)
- Migrate `arrow-data` to Rust 2024 #8455 [arrow] (mbrobbel)
- Migrate `arrow-csv` to Rust 2024 #8454 [arrow] (mbrobbel)
- Migrate `arrow-cast` to Rust 2024 #8453 [arrow] (mbrobbel)
- Migrate `arrow-buffer` to Rust 2024 #8452 [arrow] (mbrobbel)
- Migrate `arrow-array` to Rust 2024 #8450 [arrow] (mbrobbel)
- Migrate `arrow-arith` to Rust 2024 #8449 [arrow] (mbrobbel)
- Expose `fields` in `StructBuilder` #8448 [arrow] (lewiszlw)
- [Variant] Simpler shredding state #8444 [parquet] (scovich)
- Unpin comfytable #8440 [arrow] (alamb)
- Variant integration fixes #8438 [parquet] (scovich)
- Refactor: extract FooterTail from ParquetMetadataReader #8437 [parquet] (alamb)
- Refactor: Move parquet metadata parsing code into its own module #8436 [parquet] (alamb)
- Update `UnionArray` wording to 'non-negative' #8434 [arrow] (jdockerty)
- Adds Duration(TimeUnit) support to arrow-avro reader and writer #8433 [arrow] (nathaniel-d-ef)
- Update release schedule #8432 (mbrobbel)
- expose read plan and plan builder via mod #8431 [parquet] (yeya24)
- Bump MSRV to 1.85 #8429 [arrow] (mbrobbel)
- Fix clippy #8426 (alamb)
- Fix red main by updating test #8421 [parquet] (emilk)
- Implement AsRef for Schema and Field #8417 [arrow] (findepi)
- [Variant] mark metadata field as non-nullable #8416 (ding-young)
- Respect `CastOptions.safe` when casting `BinaryView` → `Utf8View` (return `null` for invalid UTF‑8) #8415 [arrow] (kosiew)
- Add Parquet geospatial statistics utility #8414 [arrow] (paleolimbot)
- Remove explicit default cfg option #8413 [arrow] (abacef)
- Support parquet canonical extension type roundtrip #8409 [parquet] (alamb)
- Support reading/writing `VariantArray` to parquet with Variant LogicalType #8408 [parquet] (alamb)
- Follow-up on arrow-avro Documentation #8402 [arrow] (jecsand838)
- [Variant][Shredding] Support typed_access for timestamp_micro/timestamp_nano #8401 [parquet] (klion26)
- Expose ReadPlan and ReadPlanBuilder #8399 [parquet] (yeya24)
- Propagate errors instead of panics: Replace usages of `new` with `try_new` for Array types #8397 [arrow] (Jefffrey)
- [Variant] Fix NULL handling for shredded object fields #8395 (scovich)
- Add Arrow Variant Extension Type, remove `Array` impl for `VariantArray` and `ShreddedVariantFieldArray` #8392 [parquet] (alamb)
- Minor cleanup creating Schema #8391 [parquet] (alamb)
- [Geospatial]: Add CI checks for `parquet-geospatial` crate #8390 (kylebarron)
- Follow-up Improvements to Avro union handling #8385 [arrow] (jecsand838)
- fix: reset the offset of 'file_for_view' #8381 [arrow] (TrevorADHD)
- [Variant] [Shredding] feat: Support typed_access for Date32 #8379 [parquet] (PinkCrow007)
- [Geospatial]: Scaffolding for new `parquet-geospatial` crate #8375 (kylebarron)
- Avro writer prefix support #8371 [arrow] (nathaniel-d-ef)
- [Variant] Define new shred_variant function #8366 (scovich)
- Add arrow-avro Reader support for Dense Union and Union resolution (Part 2) #8349 [arrow] (jecsand838)
- Move ParquetMetadata decoder state machine into ParquetMetadataPushDecoder #8340 [parquet] (alamb)
- [Variant]: Implement `DataType::FixedSizeList` support for `cast_to_variant` kernel #8282 (liamzwbao)
* This Changelog was automatically generated by github_changelog_generator