⚠️ Deprecations
- Rename `series_equal`/`frame_equal` to `equals` (#12618)
- Rename `map_dict` to `replace` and change default behavior (#12599)
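
The entry for #12599 does not spell out what the default behavior change is. The sketch below is a plain-Python model, not library code, and it assumes the change is that values missing from the mapping are now kept as-is instead of becoming null. The helper names `map_dict_old` and `replace_new` are hypothetical and exist only for this illustration.

```python
def map_dict_old(values, mapping, default=None):
    """Model of the assumed old default: unmapped values become `default` (null)."""
    return [mapping.get(v, default) for v in values]


def replace_new(values, mapping):
    """Model of the assumed new default: unmapped values are left unchanged."""
    return [mapping.get(v, v) for v in values]


values = [1, 2, 3]
mapping = {1: 10, 2: 20}

old_result = map_dict_old(values, mapping)  # 3 has no mapping, so it becomes None
new_result = replace_new(values, mapping)   # 3 has no mapping, so it stays 3
```

If this assumption holds, code that relied on unmapped values turning into nulls needs to request that behavior explicitly after migrating.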
🚀 Performance improvements
- order(s) of magnitude speedup when initialising `List` dtype `Series` from 2D numpy array (#12672)
- improve `merge_local_rhs_categorical` traversal (#12660)
- make values_size estimate correct for sliced arrays (#12658)
- improve parquet utf8 validation (#12655)
- parquet pre-allocate buffer in binary plain encode (#12652)
- optimize dict binary decoding in parquet (#12648)
- ensure we only check the values within bounds (#12633)
- parquet; elide recursion in hot path (#12625)
- improve cov/corr algorithm (#12590)
✨ Enhancements
- Join operations on local categoricals (#12657)
- Implement `PySeries.from_buffer` for boolean buffers (#12654)
- Implement `PySeries.from_buffer` for numeric types (#12646)
- use RLE_DICTIONARY for integers in parquet (#12647)
- extend recent `filter` syntax upgrades to `when/then` construct (#12603)
- implement RLE_DICT encoding for utf8/binary columns (reduced parquet file size) (#12623)
- implement 'DeltaByteArray' decoding for parquet (#12602)
🐞 Bug fixes
- json null inference (#12677)
- cov/corr respect f32 type (#12676)
- fix ternary zip_with null broadcast (#12668)
- support negative slice on eager frame (#12644)
- fix concurrency budget assertion (#12641)
- fix oob in set operations (#12640)
- panic reading parquet nested struct column (#12614)
- Fix deprecation message for `DataFrame.sum` (#12619)
- features: `performant,lazy,random` (#12600)
🛠️ Other improvements
- Use `range` instead of `np.arange` in constructors (#12621)
- update custom allocator instructions to include macOS (#12593)
Thank you to all our contributors for making this release possible!
@alexander-beedie, @c-peters, @cardoso, @dmitrybugakov, @nameexhaustion, @orlp, @ritchie46 and @stinodego