pola-rs/polars py-1.33.0 on GitHub

💥 Breaking changes

Remove, deprecate or change eager Exprs to be lazy compatible (#24027)

🚀 Performance improvements

Native streaming int_range with len or count (#24280)
Lower arg_unique natively to the streaming engine (#24279)
Move unordering optimization to end (#24286)
Do ordering simplification step after common sub-plan elimination (#24269)
Always simplify order requirements in IR (#24192)
Basic de-duplication of filter expressions (#24220)
Cache the IR in pipe_with_schema (#24213)
Lower arg_where natively to streaming engine (#24088)
Lower Expr.shift to streaming engine (#24106)
Lower order-preserving groupby to streaming engine (#24053)

✨ Enhancements

Add CSE for custom io sources using pointer for hashing (#24297)
Allow pl.Expr.log to take in an expression (#24226)
Add caching to user credential providers (#23789)
Expose mkdir parameter on write_parquet (#24239)
Implement diff() in streaming engine (#24189)
Enable Expr.diff(n) for negative n (#24200)
Allow upcasting null-typed columns to nested column types in scans (#24185)
Log pyarrow predicate conversion result in sensitive verbose logs (#24186)
Drop PyArrow requirement for write_database with the ADBC engine (#24136)
Add a deprecation warning for pl.Series.shift(Null) (#24114)
Improve Debug formatting of DataType (#24056)
Add LazyFrame.pipe_with_schema (#24075)
Catch additional temporal attributes in BytecodeParser function analysis (#24076)
Add cum_* as native streaming nodes (#23977)
Add peak_{min,max} support for booleans (#24068)
Add DataFrame.map_columns for eager evaluation (#23821)
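
For readers wondering what a negative n means for Expr.diff (#24200), here is a minimal pure-Python sketch. It assumes that diff(n) behaves like "value minus the value shifted by n", so a negative n compares each element with a later one. The helper name diff_reference is hypothetical and not part of Polars; nulls are modelled as None.

```python
from typing import Optional, Sequence


def diff_reference(values: Sequence[Optional[int]], n: int = 1) -> list[Optional[int]]:
    """Hypothetical reference model (not a Polars API).

    Assumed semantics: out[i] = values[i] - values[i - n],
    and None when i - n is out of range or either operand is None.
    """
    out: list[Optional[int]] = []
    for i, current in enumerate(values):
        j = i - n  # for negative n this points at a later element
        if 0 <= j < len(values) and current is not None and values[j] is not None:
            out.append(current - values[j])
        else:
            out.append(None)
    return out


forward = diff_reference([10, 13, 19, 20], n=1)    # compare with the previous element
backward = diff_reference([10, 13, 19, 20], n=-1)  # compare with the next element
print(forward)
print(backward)
```

Under this assumption the null lands at the start of the output for positive n and at the end for negative n.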

🐞 Bug fixes

Invalid conversion from non-bit numpy bools (#24312)
Make dt.epoch('s') serializable (#24302)
Make Expr.rechunk serializable (#24303)
Schema mismatch for 'log' operation (#24300)
Incorrect first/last aggregate in streaming engine (#24289)
Fix group offsets in sliced groups (#24274)
Panic in inexact date(time) conversion (#24268)
Keep DSL cache after serialization and deserialization (#24265)
Sanitize and warn about eval usage (#24262)
Correct incorrect default in from_pandas overload for include_index (#24258)
Unique with keep="none" in new optimization pass (#24261)
Correct size limits for Decimal cast (#24252)
Unordered unions in check order observing pass (#24253)
Fix dtype for slice on Literal in agg context (#24137)
Fix incorrect filter(lit(True)) when scanning hive (#24237)
In-memory group_by on 128-bit integers (#24242)
Fix panic in gather inside groupby with invalid indices (#24182)
Release the GIL in map_groups (#24225)
Remove extra explode in LazyGroupBy.{head,tail} (#24221)
Fix panic in polars cloud CSV scan (#24197)
Fix panic when loading categorical columns from IO plugin (#24205)
Fix credential provider not auto-initializing on partition sinks (#24188)
Fix engine type for concat_list on AggScalar implode (#24160)
Handle centered weights in rolling_mean when len(values) < window_size (#24158)
Reading is_in predicate for Parquet plain strings (#24184)
Support native DuckDB connection in read_database (#24177)
Make PyCategories pickleable (#24170)
Remove unused unsound function to_mutable_slice (#24173)
PyO3 extension types giving compat_level errors (#24166)
Allow non-elementwise by in top_k (#24164)
Fix sort_by for group_by_dynamic context (#24152)
Input-independent length aggregations in streaming (#24153)
Release GIL when iterating df in to_arrow (#24151)
Respect non-elementwise join_where conditions (#24135)
Fix mismatched pytest test collection error (#24133)
Resolve schema mismatch for div on Boolean (#24111)
Fix from_repr parsing of negative durations (#24115)
Make group_by/partition_by iterator keys tuple[Any, ...] to enable tuple-unpacking (#24113)
Keep name when doing empty group-aware aggregation (#24098)
Implode instead of reshape_list (#24078)
Rolling mean with weights incorrect when min_samples < window_size (#23485)
Allow merge_sorted for all types (#24077)
Include datatypes in row_encode expression (#24074)
Include UDF materialized type in serialization (#24073)
Correct .rolling() output type for non-aggregations (#24072)
Correct planner output schema for join_asof (#24071)
Correct output for fold and reduce (#24069)
Expr.meta.output_name for struct fields (#24064)
Ensure upcast operations on pl.Date default to microsecond precision (#23981)
Add peak_{min,max} support for booleans (#24068)
Planner output type for mean with strange input type (#24052)
Remove, deprecate or change eager Exprs to be lazy compatible (#24027)

📖 Documentation

Fix a few typos (#24305)
Add missing reference to LazyFrame.pipe_with_schema() on the website (#24285)
Automatically register doctest.ELLIPSIS so we don't have to add the inline directive each time (#24146)
Update categorical comparison documentation in user guide (#24249)
Add missing references for Series.rolling_*_by methods (#24254)
Fix formatting of Series.value_counts examples (#24245)
Add hint to use DataFrame/Series constructors in from_arrow docstring (#22942)
Update GPU un/supported features (#24195)
Add DataFrame.map_columns to API (#24128)
Update multiple pages in the Polars Cloud user guide (#23661)
Fix str.find_many() docstring example (#24092)

📦 Build system

Re-enable macos-x86-64 (#24266)
Drop binary support for macos_x86-64 (#24257)

🛠️ Other improvements

Remove PDS-H code (#24301)
Get ready for even more cloud tests (#24292)
Add tests for slices with caches (#24288)
Re-add ordering tests (#24284)
Fix Makefile venv path (#24251)
Remove unnecessary parentheses (#24244)
Make non-nested shift{,_and_fill} ops generic (#24224)
Remove unused Wrap (#24214)
Allow upcasting null-typed columns to nested column types in scans (#24185)
Automatically label a few more types of PR (#24147)
Update toolchain (#24156)
Add order_sensitive property for AExpr (#24116)
Mark more tests as not possible on cloud (#24103)
Turn AggExpr::Count from tuple to struct (#24096)
Mark tests that may fail in cloud (#24067)
Extend read database tests to capture more ADBC functionality (#24002)
Make CI perf failures more lenient (#24066)
Fix hive partition string encoding in CI by upgrading deltalake (#24018)
Make tests with sinks run on cloud again (#24048)

Thank you to all our contributors for making this release possible!
@Kevin-Patyk, @MarcoGorelli, @NeejWeej, @agossard, @alexander-beedie, @aparna2198, @borchero, @coastalwhite, @deanm0000, @dsprenkels, @eitsupi, @etiennebacher, @gab23r, @henryharbeck, @jjurm, @kdn36, @math-hiyoko, @mcrumiller, @mroeschke, @nameexhaustion, @orlp, @r-brink, @ritchie46, @stijnherfst, @vdrn and @wence-
