A new release is here! 🎉🎉🎉
This one is marked by further alignment with the Arrow specification. Of special mention:
- ✅ Added full support for `async` parquet write (by @GrandChaman)
- ✅ Added fast `extend_*values` to `MutablePrimitiveArray` (by @ritchie46)
- ✅ Added support for compute to `BinaryArray` (by @zhyass)
- ✅ Added support to extension types (IPC, FFI, etc.) (by @jorgecarleitao)
- ✅ Added support for the brand new `MONTH_DAY_NANO` interval type (by @jorgecarleitao)
- 🚀 Improved performance of the calculation of null counts by 5x (by @jorgecarleitao)
- 🔧 Made `cargo` features not default (by @jorgecarleitao)
As usual, there is a small number of backward-incompatible changes. See the associated issues below, which include the migration path for each of them.
Breaking changes:
- Added `Extension` to `DataType` #361
- `MonthDayNano` added to enum `IntervalUnit` #360
- Make `io::parquet::write::write_*` return size of file in bytes #354
- Renamed `bitmap::utils::null_count` to `bitmap::utils::count_zeros` #342
- Made `GroupFilter` optional in parquet's `RecordReader` and added method to set it. #386 (jorgecarleitao)
- Removed `PartialOrd` and `Ord` of all enums in `datatypes` #379 (jorgecarleitao)
- Made `cargo` features not default #369 (jorgecarleitao)
- Prepare APIs for extension types #357 (jorgecarleitao)
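
To illustrate the rename of `bitmap::utils::null_count` to `bitmap::utils::count_zeros` listed above: in an Arrow validity bitmap a set bit marks a valid slot and an unset bit marks a null, so the null count is simply the number of zero bits in the range. The function below is a minimal standalone sketch of that idea, assuming the least-significant-bit-first ordering of Arrow bitmaps. Its name and signature are chosen for illustration and may not match the crate's actual API; see #342 for the real migration path.

```rust
/// Counts the unset (zero) bits among `len` bits of `bytes`, starting at bit `offset`.
/// Bits are read least-significant-bit first within each byte.
/// Illustrative only: a naive bit-by-bit loop, not the crate's optimized implementation.
fn count_zeros(bytes: &[u8], offset: usize, len: usize) -> usize {
    (offset..offset + len)
        .filter(|&i| (bytes[i / 8] & (1u8 << (i % 8))) == 0)
        .count()
}
```

For example, with the validity bytes `[0b1011_0101, 0b0000_0001]` and 10 slots, `count_zeros(&bytes, 0, 10)` returns 4, which is the null count of that array.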
New features:
- Added support for `async` parquet write #372 (GrandChaman)
- Add support to extension types in FFI #363 (jorgecarleitao)
- Added support for field's metadata via FFI #362 (jorgecarleitao)
- Added support for `Extension` (logical) type #359 (jorgecarleitao)
- Added support for compute to `BinaryArray` #346 (zhyass)
- Added support for reading binary from CSV #337 (jorgecarleitao)
- Added support for `MONTH_DAY_NANO` interval type #268 (jorgecarleitao)
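
For readers unfamiliar with the `MONTH_DAY_NANO` interval type mentioned above: the Arrow specification describes each value as a 16-byte block holding months and days as 32-bit signed integers, followed by nanoseconds as a 64-bit signed integer. The sketch below models that layout with a hypothetical struct. The type name, field names, and helper methods are assumptions for illustration and are not the crate's own type; little-endian encoding is assumed.

```rust
/// Hypothetical stand-in for one MONTH_DAY_NANO value (not the crate's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
struct MonthDayNano {
    months: i32,
    days: i32,
    nanoseconds: i64,
}

impl MonthDayNano {
    /// Serializes the value into a 16-byte block: months, days, then nanoseconds.
    fn to_le_bytes(self) -> [u8; 16] {
        let mut out = [0u8; 16];
        out[0..4].copy_from_slice(&self.months.to_le_bytes());
        out[4..8].copy_from_slice(&self.days.to_le_bytes());
        out[8..16].copy_from_slice(&self.nanoseconds.to_le_bytes());
        out
    }

    /// Reads a value back from a 16-byte block written by `to_le_bytes`.
    fn from_le_bytes(bytes: [u8; 16]) -> Self {
        let mut months = [0u8; 4];
        let mut days = [0u8; 4];
        let mut nanoseconds = [0u8; 8];
        months.copy_from_slice(&bytes[0..4]);
        days.copy_from_slice(&bytes[4..8]);
        nanoseconds.copy_from_slice(&bytes[8..16]);
        MonthDayNano {
            months: i32::from_le_bytes(months),
            days: i32::from_le_bytes(days),
            nanoseconds: i64::from_le_bytes(nanoseconds),
        }
    }
}
```

Keeping the three components separate, rather than collapsing them into a single duration, matters because months and days do not have a fixed length in nanoseconds.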
Fixed bugs:
- Parquet read skips a few rows at the end of the page #373
- `parquet_read` fails when a column has too many rows with string values #366
- `parquet_read` panics with `index_out_of_bounds` #351
- Fixed error in `MutableBitmap::push_unchecked` #384 (jorgecarleitao)
- Fixed display of timestamp with tz. #375 (jorgecarleitao)
Enhancements:
- Added `extend_*values` to `MutablePrimitiveArray` #383 (ritchie46)
- Improved performance of writing to CSV (20-25%) #382 (jorgecarleitao)
- Bumped `lexical-core` #378 (jorgecarleitao)
- Fixed casting of utf8 <> Timestamp with and without timezone #376 (jorgecarleitao)
- Added `Send+Sync` to `MutableBuffer` #368 (jorgecarleitao)
- Improved performance of unary _not_ for aligned bitmaps (3x) #365 (jorgecarleitao)
- Reduced dependencies within `num` #353 (jorgecarleitao)
- Bumped to parquet2 v0.4 #352 (jorgecarleitao)
- Bumped tonic and prost in flight #344 (PsiACE)
- Improved null count calculation (5x) #343 (jorgecarleitao)
- Improved perf of deserializing integers from json (30%) #340 (jorgecarleitao)
- Simplified code of json schema inference #339 (jorgecarleitao)
Documentation updates:
- Moved guide examples to examples/ #387 (jorgecarleitao)
- Added more docs. #358 (jorgecarleitao)
- Improved API docs. #355 (jorgecarleitao)
Testing updates:
- Moved tests to `tests/` #389 (jorgecarleitao)
- Moved compute tests to tests/ #388 (jorgecarleitao)
- Added more tests. #380 (jorgecarleitao)
- Pinned nightly in SIMD tests #364 (jorgecarleitao)
- Improved benches for take #348 (jorgecarleitao)
- Made IPC integration tests run tests that are not run by arrow-rs #278 (jorgecarleitao)