jorgecarleitao/arrow2 v0.15.0 on GitHub

A new release is here, adding a number of new features and improvements to arrow2. Thank you to everyone who contributed to it!

This release adds support for a new format, the "record" JSON format, contributed by @AnIrishDuck; a new trait, TryExtendFromSelf, to efficiently concatenate an array into an existing mutable array; and multiple performance improvements by @sundy-li and @ritchie46. Finally, a new API, Offsets and OffsetsBuffer, proposed by @ritchie46, allows creating variable-sized arrays without having to re-validate their offsets.
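To illustrate the idea behind a validated offsets container, here is a minimal, std-only sketch. It is not arrow2's Offsets or OffsetsBuffer: the type ValidatedOffsets, its methods, and the helper build_utf8 are hypothetical names chosen for this example. The assumption is that offsets are checked once, when the container is built or grown, so code that later consumes them can rely on the invariant (non-empty, starting at a non-negative value, never decreasing) instead of checking again.

```rust
use std::convert::TryFrom;
use std::ops::Range;

/// Hypothetical, simplified stand-in for a validated offsets container.
/// This is NOT arrow2's `Offsets`; names and methods are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct ValidatedOffsets(Vec<i32>);

impl ValidatedOffsets {
    /// Starts with the single offset `0`, which describes zero elements.
    pub fn new() -> Self {
        Self(vec![0])
    }

    /// Validates once that the offsets are non-empty, start at a
    /// non-negative value, and never decrease.
    pub fn try_from_vec(offsets: Vec<i32>) -> Result<Self, String> {
        match offsets.first() {
            None => return Err("offsets must have at least one element".to_string()),
            Some(first) if *first < 0 => {
                return Err("offsets must start at a non-negative value".to_string())
            }
            _ => {}
        }
        if offsets.windows(2).any(|pair| pair[0] > pair[1]) {
            return Err("offsets must not decrease".to_string());
        }
        Ok(Self(offsets))
    }

    /// Appends one element of `length` items. The invariant holds by
    /// construction, so only overflow needs to be checked.
    pub fn try_push_length(&mut self, length: usize) -> Result<(), String> {
        let length = i32::try_from(length).map_err(|_| "length overflows i32".to_string())?;
        let last = *self.0.last().expect("offsets are never empty");
        let next = last
            .checked_add(length)
            .ok_or_else(|| "offset overflows i32".to_string())?;
        self.0.push(next);
        Ok(())
    }

    /// Number of elements described by the offsets.
    pub fn len(&self) -> usize {
        self.0.len() - 1
    }

    /// Range of the values buffer covered by element `index`.
    pub fn range(&self, index: usize) -> Range<usize> {
        self.0[index] as usize..self.0[index + 1] as usize
    }
}

/// Builds offsets and a concatenated values buffer from string slices.
pub fn build_utf8(items: &[&str]) -> Result<(ValidatedOffsets, String), String> {
    let mut offsets = ValidatedOffsets::new();
    let mut values = String::new();
    for item in items {
        offsets.try_push_length(item.len())?;
        values.push_str(item);
    }
    Ok((offsets, values))
}
```

With this shape, a consumer that receives a ValidatedOffsets can slice the values buffer through range without repeating the monotonicity check, which is the kind of saving the new API is described as enabling.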

This release also features a number of contributions from first-time contributors:

@benesch made their first contribution in #1271
@RinChanNOWWW made their first contribution in #1287
@datapythonista made their first contribution in #1290
@sandflee made their first contribution in #1286
@Samrose-Ahmed made their first contribution in #1279
@jondo2010 made their first contribution in #1300
@cyr made their first contribution in #1318
@universalmind303 made their first contribution in #1321

Thank you all for the great work this year, and happy festivities!

Full Changelog

Breaking changes:

Added values' capacity to MutableBinaryArray::reserve #1277
Removed from_data from all arrays #1328 (jorgecarleitao)
Added Offsets and OffsetsBuffer #1316 (jorgecarleitao)
Bumped parquet2 dependency #1304 (ritchie46)
Added data_pagesize_limit to write parquet pages #1303 (sundy-li)
Bumped arrow-format to 0.8 #1298 (Xuanwo)
Improved iterators #1270 (jorgecarleitao)

New features:

Added TryExtendFromSelf #1278 (jorgecarleitao)
Added support for JSON ser/de records layout #1275 (AnIrishDuck)
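As a rough illustration of what extending a mutable array from another one of the same type involves, here is a std-only sketch. The trait below is modeled on the name TryExtendFromSelf, but its signature is an assumption and may differ from arrow2's; ToyMutableUtf8 is a hypothetical toy type, not an arrow2 array. The point it shows is that values and validity can be bulk-copied, while the incoming offsets only need to be shifted by the last offset of the destination.

```rust
/// Illustrative trait modeled on the name `TryExtendFromSelf`.
/// The signature is an assumption and may differ from arrow2's.
pub trait TryExtendFromSelf {
    fn try_extend_from_self(&mut self, other: &Self) -> Result<(), String>;
}

/// Hypothetical minimal mutable UTF-8 array: concatenated bytes,
/// offsets delimiting each item, and one validity flag per item.
#[derive(Debug)]
pub struct ToyMutableUtf8 {
    offsets: Vec<i64>, // always starts with 0
    values: Vec<u8>,
    validity: Vec<bool>,
}

impl ToyMutableUtf8 {
    pub fn new() -> Self {
        Self { offsets: vec![0], values: Vec::new(), validity: Vec::new() }
    }

    /// Appends one optional string; a null occupies zero bytes.
    pub fn push(&mut self, item: Option<&str>) {
        self.values.extend_from_slice(item.unwrap_or("").as_bytes());
        self.offsets.push(self.values.len() as i64);
        self.validity.push(item.is_some());
    }

    pub fn len(&self) -> usize {
        self.validity.len()
    }

    /// Returns the item at `index`, or `None` if it is null.
    pub fn get(&self, index: usize) -> Option<&str> {
        if !self.validity[index] {
            return None;
        }
        let start = self.offsets[index] as usize;
        let end = self.offsets[index + 1] as usize;
        std::str::from_utf8(&self.values[start..end]).ok()
    }
}

impl TryExtendFromSelf for ToyMutableUtf8 {
    fn try_extend_from_self(&mut self, other: &Self) -> Result<(), String> {
        let shift = *self.offsets.last().expect("offsets are never empty");
        let other_last = *other.offsets.last().expect("offsets are never empty");
        // Fail before mutating anything if the shifted offsets would overflow.
        shift
            .checked_add(other_last)
            .ok_or_else(|| "offset overflow".to_string())?;
        // Bulk-copy the bytes and validity flags instead of pushing item by item.
        self.values.extend_from_slice(&other.values);
        self.validity.extend_from_slice(&other.validity);
        // Skip the leading 0 of `other` and shift the remaining offsets.
        self.offsets.extend(other.offsets[1..].iter().map(|offset| offset + shift));
        Ok(())
    }
}
```

Compared with iterating over the other array and pushing each item, this variant avoids per-item bookkeeping, which is the sort of efficiency gain the trait is described as targeting.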

Fixed bugs:

Parquet writes all values of sliced arrays? #1323
Avro schema: Invalid record names #1269
Fixed writing nested/sliced arrays to parquet #1326 (ritchie46)
Fixed failing to accept dictionary full of nulls #1312 (ritchie46)
Added support for Extension types in ffi #1300 (jondo2010)
Fixed error in memory usage of sliced binary/list/utf8 arrays #1293 (ritchie46)
Fixed descending ordering when specifying nulls first #1286 (sandflee)
Added avro record names when converting arrow schema to avro #1279 (Samrose-Ahmed)

Enhancements:

Fixed clippy #1336 (jorgecarleitao)
Improved UnionArray #1331 (jorgecarleitao)
Bumped json-deserializer version #1321 (universalmind303)
Removed flushing during arrow IPC writing to improve performance when using a buffered writer #1318 (cyr)
Improved performance of check_indexes #1313 (ritchie46)
Improved performance of checking offsets by ~64-73% #1305 (ritchie46)
Added reserve to pushable containers in parquet extend_from_decoder #1301 (ritchie46)
Optimized slicing #1285 (jorgecarleitao)
Improved ZipValidity iterators #1284 (ritchie46)
Added MutableBinaryValuesArray #1276 (jorgecarleitao)

Documentation updates:

Fixed link from the API to the guide #1290 (datapythonista)