github jorgecarleitao/arrow2 v0.11.0

2 years ago

Arrow2 v0.11.0 is out!! 🎉🎉🎉

This release mainly focuses on improving Parquet support, building on the previous release. In particular, we now have the main ingredients to read indexed Parquet pages, which allows skipping the deserialization of individual pages, and as of this version Parquet files are written with page indexes. There is still some work to do on the frontend API to skip pages via statistics, which is left for the next version.
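To illustrate why page indexes allow skipping pages, here is a minimal, std-only sketch of the idea. It does not use the arrow2 API: `PageStats` and `pages_to_read` are hypothetical names invented for this example, and the only assumption taken from the Parquet format is that a column index records a minimum and maximum value per data page.

```rust
/// Hypothetical per-page summary (not an arrow2 type), loosely modeled on
/// what a Parquet column index stores: the min and max value of one page.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PageStats {
    pub min: i64,
    pub max: i64,
}

/// Returns the positions of the pages whose [min, max] range overlaps the
/// inclusive query range [lo, hi]. Every other page cannot contain a
/// matching value, so it can be skipped without being deserialized.
pub fn pages_to_read(index: &[PageStats], lo: i64, hi: i64) -> Vec<usize> {
    index
        .iter()
        .enumerate()
        .filter(|(_, page)| page.max >= lo && page.min <= hi)
        .map(|(position, _)| position)
        .collect()
}
```

Given four pages covering 0..=9, 10..=19, 20..=29 and 30..=39, a query for values between 15 and 25 selects only the second and third pages. A real reader would then use the page locations from the offset index to seek directly to those pages.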

This version also contains multiple bug fixes.

Thanks to everyone who contributed to this release (individual PRs below)! 🙇

Changelog

Full Changelog

Breaking changes:

New features:

Fixed bugs:

  • Parquet regression: exceptions.ArrowErrorException: NotYetImplemented("Can't read Dictionary(UInt32, LargeUtf8, false) from parquet") #955
  • Reading Parquet binary column panics during deserialization: 'attempt to subtract with overflow' #944
  • Reading Parquet file written by pyarrow with lz4 compression fails with OutOfSpec("Thrift out of range") #940
  • Issues when trying to create a parquet file with FixedSizeListArray #691
  • Fixed bug in writing csv with buffer resizing #965 (ritchie46)
  • Fixed bug in reading binary parquet #945 (jorgecarleitao)
  • Fixed error in writing FixedSizeListArray to parquet #941 (jorgecarleitao)
  • Fixed support to read dict nested binary parquet #924 (jorgecarleitao)

Enhancements:

Documentation updates:

Testing updates:
