An upgrade guide is available on our website.
🏆 Highlights
- implementing sink_csv for LazyFrame (#10682)
- Support `DataFrame` init from queries against users' existing database connections (#10649)
- Rename `groupby` to `group_by` (#10656)
💥 Breaking changes
- return `f64` for `rank` when `method="average"` (#10734)
- Update a lot of error types (#10637)
- Remove deprecated behavior from vertical aggregations (#10602)
- Read/write support for IPC streams in DataFrames (#10606)
- Change behavior of `all` - fix Kleene logic implementation for `all`/`any` (#10564)
- Improve consistency of parsing expression input (#9512)
- allow `from_arrow` to take a generator of RecordBatches, change error type to `TypeError` (#10529)
- remove fixed_seed and add pl.set_random_seed (#10388)
- Make `arange` an alias for `int_range` (#9983)
- `date_range`/`time_range` no longer return a `List` type (#10526)
- Remove various functionalities deprecated before `0.18` (#10527)
- Improve some error types and messages (#10470)
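
The `all`/`any` change above (#10564) refers to Kleene (three-valued) logic, where a null is treated as "unknown". The sketch below illustrates the general rule in plain Python; it is not the Polars implementation, and the helper names `kleene_all` and `kleene_any` are hypothetical. How Polars exposes this behavior is not stated in these notes.

```python
from typing import Iterable, Optional


def kleene_all(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """Three-valued AND: any False wins; otherwise an unknown (None) makes the result unknown."""
    saw_null = False
    for v in values:
        if v is False:
            return False  # a single False decides the result, nulls or not
        if v is None:
            saw_null = True
    return None if saw_null else True


def kleene_any(values: Iterable[Optional[bool]]) -> Optional[bool]:
    """Three-valued OR: any True wins; otherwise an unknown (None) makes the result unknown."""
    saw_null = False
    for v in values:
        if v is True:
            return True  # a single True decides the result, nulls or not
        if v is None:
            saw_null = True
    return None if saw_null else False


print(kleene_all([True, None]))   # None: the null could still be False
print(kleene_all([False, None]))  # False
print(kleene_any([True, None]))   # True
print(kleene_any([False, None]))  # None: the null could still be True
```

Under this rule a null only matters when the known values do not already decide the answer.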
⚠️ Deprecations
- Rename `map` to `map_batches` (#10801)
- Rename `GroupBy.apply` to `map_groups` (#10799)
- Rename `DataFrame.apply` to `map_rows` (#10797)
- Rename `Series/Expr.rolling_apply` to `rolling_map` (#10750)
- Rename `Series/Expr.apply` to `map_elements` (#10678)
- Rename `groupby` to `group_by` (#10656)
- Deprecate some parameters of `cut`/`qcut` (#10484)
🚀 Performance improvements
- parse time zones outside of downcast_iter() in replace_time_zone (#10713)
- use binary abstraction for atan2 (#10588)
- use binary abstraction in pow (#10562)
✨ Enhancements
- activate cse for group_by (again) (#10749)
- implementing sink_csv for LazyFrame (#10682)
- Supports series unique & arg_unique & n_unique for list (#10743)
- repeat_by should also support broadcasting of LHS (#10735)
- deprecate 'use_earliest' argument in favour of 'ambiguous', which can take expressions (#10719)
- is_first also supports numeric list type. (#10727)
- improve slice pushdown in unions (#10723)
- Explicitly implement `Protocol` for interchange classes (#10688)
- Support min and max strategy for binary & str columns fill null (#10673)
- support broadcasting in list set operations (#10668)
- csv: add schema argument (#10665)
- Support `DataFrame` init from queries against users' existing database connections (#10649)
- add `truncate_ragged_lines` (#10660)
- supports cast to list (#10623)
- Update a lot of error types (#10637)
- preserve whitespace in notebook output (#10644)
- Remove deprecated behavior from vertical aggregations (#10602)
- support selector usage in `write_excel` arguments (#10589)
- Add `LazyFrame.collect_async` and `pl.collect_all_async` (#10616)
- Read/write support for IPC streams in DataFrames (#10606)
- propagate null is in `is_in` and more generic array construction (#10614)
- Change behavior of `all` - fix Kleene logic implementation for `all`/`any` (#10564)
- frame-level `cast` support (#10504)
- Improve consistency of parsing expression input (#9512)
- Add failed column to cast exception (#10507)
- allow `from_arrow` to take a generator of RecordBatches, change error type to `TypeError` (#10529)
- Remove deprecated `get_idx_type` - use `get_index_type` instead (#10556)
- Make `arange` an alias for `int_range` (#9983)
- `date_range`/`time_range` no longer return a `List` type (#10526)
- Remove various functionalities deprecated before `0.18` (#10527)
- Improve some error types and messages (#10470)
- suggest str.to_datetime instead of apply and stdlib strptime (#10266)
🐞 Bug fixes
- get_single_leaf can't handle Expr::Count (#10790)
- support groupby literal in streaming (#10771)
- `ORDER BY` on unselected columns (#10752)
- Fix is_in cannot cast list type for float (#10769)
- whitespace CSS in Notebook HTML updated to use `pre-wrap` instead of `pre` (#10739)
- only preserve sortedness flag in replace_time_zone when safe (#10738)
- Error on `value_counts` on column named `"counts"` (#10737)
- return `f64` for `rank` when `method="average"` (#10734)
- Keep min/max and arg_min/arg_max consistent. (#10716)
- use time zone from dtype to overwrite output time zone when initialising Series (#10689)
- Cast small int type when scan csv in streaming mode. (#10679)
- raise exception with invalid `on` arg type for join_asof (#10690)
- Reused input series in rolling_apply should not be orderly (#10694)
- re-sort buffer when update window swap the whole buffer (#10696)
- Set the correct fast_explode flag for ListUtf8ChunkedBuilder (#10684)
- Sorted Utf8Chunked max_str and min_str should consider null value (#10675)
- Correctly handle time zones in `write_delta` (#10633)
- fix apply for empty series in threading mode (#10651)
- respect 'ignore_errors=False' in csv parser (#10641)
- fix rename + projection pushdown (#10624)
- fix int/float downcast in `is_in` (#10620)
- Change behavior of `all` - fix Kleene logic implementation for `all`/`any` (#10564)
- Fix serialization for categorical chunked. (#10609)
- Take input_schema to create physical expr for Selection (#10571)
- Clear window cache after evaluate predication expr (#10505)
- Parsing regex col in Expr::Columns (#10551)
- sanitize column naming in boolean ops (#10531)
- Fix `write_delta` with schema in `delta_write_options` (#10541)
- remove fixed_seed and add pl.set_random_seed (#10388)
- respect `pl.Config` options relating to shape, column names, and types when rendering HTML (#10449)
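
One of the fixes listed above makes `rank` return `f64` when `method="average"` (#10734). A float result is the natural fit because tied values share the mean of the positions they occupy, which can be fractional. The plain-Python sketch below shows the general idea only; `average_rank` is a hypothetical helper, not the Polars implementation.

```python
from typing import List, Sequence


def average_rank(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks; tied values all receive the mean of their positions."""
    # Indices of the input, ordered by the value they point to.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values equal to the one at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # The run occupies 1-based positions pos+1 .. end+1; use their mean.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


print(average_rank([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

The two tied values share positions 2 and 3, so each gets 2.5, a rank that an integer dtype could not represent.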
🛠️ Other improvements
- update cargo.lock (#10800)
- Create `.venv` in repo root (#10789)
- refactored `write_database` unit tests to properly separate concerns (#10773)
- Fix some broken links / formatting (#10772)
- Document chained when-then behaviour more prominently (#10759)
- Fix test failing due to new `adbc` release (#10763)
- Unpin `connectorx` and bump other Python dependencies (#10753)
- add note to `testing` docs about module import (#10741)
- Clear GitHub Actions caches weekly (#10715)
- Update for new pyarrow `13.0.0` behavior (#10691)
- Fix minor issue with `sink_parquet` docs (#10669)
- Remove `deprecate_renamed_methods` util (#10537)
- add "see also" entries to ne/eq_missing and update related examples (#10667)
- fix potential memory leak from usage of `inspect.currentframe` (#10630)
- give more relevant example for polars.apply (#10631)
- Bump ruff and enable new setting (#10626)
- Add docstrings for `Expr.meta` namespace (#10617)
- Enforce up-to-date `Cargo.lock` (#10555)
- deprecate DataFrame.replace (#10600)
- ensure that `make requirements` fully refreshes unpinned packages/deps (#10591)
- fix out-of-date explain default parameter (#10566)
- Fix `expr_dispatch` decorator to work on methods with decorators (#10549)
- Fix link to source code (#10542)
- Add title to index page (#10539)
- Disable SIM108 lint (#10519)
- Keep versioned docs (#10500)
- switch to `pyo3/maturin-action` (#10503)
- Update URLs for dev documentation (#10495)
- Skip failing test (#10496)
- Add version switcher to API reference (#10488)
Thank you to all our contributors for making this release possible!
@JulianCologne, @MarcoGorelli, @Object905, @OndrejSlamecka, @SeanTroyUWO, @VasanthakumarV, @alexander-beedie, @aminalaee, @braaannigan, @c-peters, @ion-elgreco, @lorepozo, @marki259, @mcrumiller, @messense, @orlp, @owrior, @rben01, @reswqa, @ritchie46, @sdamashek, @stinodego, @svaningelgem, @titoeb, @trueb2, @washcycle and @zundertj