🏆 Highlights
- extend `filter` capabilities with new support for `*args` predicates, `**kwargs` constraints, and chained boolean masks (#11740)
⚠️ Deprecations
- Deprecate non-keyword args for `ewm` methods (#11804)
- Deprecate `use_pyarrow` param for `Series.to_list` (#11784)
- Rename `group_by_rolling` to `rolling` (#11761)
🚀 Performance improvements
- Improve `DataFrame.get_column` performance by ~35% (#11783)
- rechunk before grouping on multiple keys (#11711)
- process parquet statistics before downloading row-group (#11709)
- push down predicates that refer to group_by keys (#11687)
- slightly faster float equality (#11652)
✨ Enhancements
- Expressify pct_change and move to ops (#11786)
- primitive kwargs in plugins (#11268)
- add `DATE` function for SQL (#11541)
- Add config setting to control how many List items are printed (#11409)
- Use `OrderedDict` for schemas (#11742)
- allow specifying schema in `pl.scan_ndjson` (#10963)
- add support for "outer" mode to frame `update` method (#11688)
- transparently support "qmark" parameterisation of SQLAlchemy queries in `read_database` (#11700)
- support multiple sources in scan_file (#11661)
- support batched frame iteration over `read_database` queries (#11664)
- column selector support for `DataFrame.melt` and `LazyFrame.unnest` (#11662)
🐞 Bug fixes
- ensure projections containing only hive columns are projected (#11803)
- patch broken aHash AES intrinsics on ARM (#11801)
- fix key in object-store cache (#11790)
- handle logical types in plugins (#11788)
- Fix values printed by `assert_*_equal` AssertionError when `exact=False` (#11781)
- make `PyLazyGroupby` reusable (#11769)
- only exclude final output names of group_by key expressions (#11768)
- Fix subsecond parsing in timedelta conversions (#11759)
- fix ambiguity wrt list aggregation states (#11758)
- Correctly process subseconds in `pl.duration` (#11748)
- use actual number of read rows for hive materialization (#11690)
- return float dtype in interpolate (for method="linear") for numeric dtypes (#11624)
- fix seg fault in concat_str of empty series (#11704)
- fix sort_by regression (#11679)
- Fix match on last item for `join_asof` with `strategy="nearest"` (#11673)
🛠️ Other improvements
- Bump lint dependencies (#11802)
- Minor updates to assertion utils and docstrings (#11798)
- Remove unused `_to_rust_syntax` util (#11795)
- Minor tweak in code example in section Coming from Pandas (#11764)
- Fix Exception module paths (#11785)
- Rename `IntegralType` to `IntegerType` (#11773)
- more granular polars-ops imports (#11760)
- Link to `expand_selector` in user guide (#11722)
- Add parametric test for `df.to_dict` / `series.to_list` (#11757)
- Minor fix in code example in section Coming from Pandas (#11745)
- Move tests for `group_by_dynamic` into one module (#11741)
- Update group_by_dynamic example (#11737)
- reorder pl.duration arguments (#11641)
- remove default features from some crates (#11680)
- *_horizontal dependent on reduce_expr to expression architecture (#11685)
- clarify that median is equivalent to the 50% percentile shown in `describe` metrics (#11694)
- update rustc and fix future (#11696)
- Publish release after uploading assets (#11686)
- upgrade pyo3 to 0.20.0 (#11683)
- better align `help` command output following addition of some longer options (#11681)
- sum_horizontal to expression architecture (#11659)
- add note about use of `polars-lts-cpu` for macOS x86-64/rosetta (#11660)
- improve rank implementation, especially around nulls (#11651)
Thank you to all our contributors for making this release possible!
@JulianCologne, @MarcoGorelli, @Walnut356, @aberres, @alexander-beedie, @alicja-januszkiewicz, @cmdlineluser, @jrycw, @mcrumiller, @messense, @nameexhaustion, @orlp, @petrosbar, @rancomp, @reswqa, @ritchie46, @romanovacca, @sd2k, @stinodego, @svaningelgem and @thomasjpfan