github pola-rs/polars py-1.33.0-beta.1
Python Polars 1.33.0-beta.1

latest release: py-1.33.0
pre-release, 5 days ago

💥 Breaking changes

  • Remove, deprecate or change eager Exprs to be lazy compatible (#24027)

🚀 Performance improvements

  • Always simplify order requirements in IR (#24192)
  • Basic de-duplication of filter expressions (#24220)
  • Cache the IR in pipe_with_schema (#24213)
  • Lower arg_where natively to streaming engine (#24088)
  • Lower Expr.shift to streaming engine (#24106)
  • Lower order-preserving groupby to streaming engine (#24053)

✨ Enhancements

  • Allow pl.Expr.log to take in an expression (#24226)
  • Add caching to user credential providers (#23789)
  • Expose mkdir parameter on write_parquet (#24239)
  • Implement diff() in streaming engine (#24189)
  • Enable Expr.diff(n) for negative n (#24200)
  • Allow upcasting null-typed columns to nested column types in scans (#24185)
  • Log pyarrow predicate conversion result in sensitive verbose logs (#24186)
  • Drop PyArrow requirement for write_database with the ADBC engine (#24136)
  • Add a deprecation warning for pl.Series.shift(Null) (#24114)
  • Improve Debug formatting of DataType (#24056)
  • Add LazyFrame.pipe_with_schema (#24075)
  • Catch additional temporal attributes in BytecodeParser function analysis (#24076)
  • Add cum_* as native streaming nodes (#23977)
  • Add peak_{min,max} support for booleans (#24068)
  • Add DataFrame.map_columns for eager evaluation (#23821)

🐞 Bug fixes

  • Correct size limits for Decimal cast (#24252)
  • Unordered unions in check order observing pass (#24253)
  • Fix dtype for slice on Literal in agg context (#24137)
  • Fix incorrect filter(lit(True)) when scanning hive (#24237)
  • In-memory group_by on 128-bit integers (#24242)
  • Fix panic in gather inside groupby with invalid indices (#24182)
  • Release the GIL in map_groups (#24225)
  • Remove extra explode in LazyGroupBy.{head,tail} (#24221)
  • Fix panic in polars cloud CSV scan (#24197)
  • Fix panic when loading categorical columns from IO plugin (#24205)
  • Fix credential provider not auto-initializing on partition sinks (#24188)
  • Fix engine type for concat_list on AggScalar implode (#24160)
  • Handle centered weights in rolling_mean when len(values) < window_size (#24158)
  • Reading is_in predicate for Parquet plain strings (#24184)
  • Support native DuckDB connection in read_database (#24177)
  • Make PyCategories pickleable (#24170)
  • Remove unused unsound function to_mutable_slice (#24173)
  • PyO3 extension types giving compat_level errors (#24166)
  • Allow non-elementwise by in top_k (#24164)
  • Fix sort_by for group_by_dynamic context (#24152)
  • Input-independent length aggregations in streaming (#24153)
  • Release GIL when iterating df in to_arrow (#24151)
  • Respect non-elementwise join_where conditions (#24135)
  • Fix mismatched pytest test collection error (#24133)
  • Resolve schema mismatch for div on Boolean (#24111)
  • Fix from_repr parsing of negative durations (#24115)
  • Make group_by/partition_by iterator keys tuple[Any, ...] to enable tuple-unpacking (#24113)
  • Keep name when doing empty group-aware aggregation (#24098)
  • Implode instead of reshape_list (#24078)
  • Rolling mean with weights incorrect when min_samples < window_size (#23485)
  • Allow merge_sorted for all types (#24077)
  • Include datatypes in row_encode expression (#24074)
  • Include UDF materialized type in serialization (#24073)
  • Correct .rolling() output type for non-aggregations (#24072)
  • Correct planner output schema for join_asof (#24071)
  • Correct output for fold and reduce (#24069)
  • Expr.meta.output_name for struct fields (#24064)
  • Ensure upcast operations on pl.Date default to microsecond precision (#23981)
  • Add peak_{min,max} support for booleans (#24068)
  • Planner output type for mean with strange input type (#24052)
  • Remove, deprecate or change eager Exprs to be lazy compatible (#24027)

📖 Documentation

  • Fix formatting of Series.value_counts examples (#24245)
  • Add hint to use DataFrame/Series constructors in from_arrow docstring (#22942)
  • Update GPU un/supported features (#24195)
  • Add DataFrame.map_columns to API (#24128)
  • Update multiple pages in the Polars Cloud user guide (#23661)
  • Fix str.find_many() docstring example (#24092)

📦 Build system

  • Drop binary support for macos_x86-64 (#24257)

🛠️ Other improvements

  • Remove unnecessary parentheses (#24244)
  • Make non-nested shift{,_and_fill} ops generic (#24224)
  • Remove unused Wrap (#24214)
  • Allow upcasting null-typed columns to nested column types in scans (#24185)
  • Automatically label a few more types of PR (#24147)
  • Update toolchain (#24156)
  • Add order_sensitive property for AExpr (#24116)
  • Mark more tests as not possible on cloud (#24103)
  • Turn AggExpr::Count from tuple to struct (#24096)
  • Mark tests that may fail in cloud (#24067)
  • Extend read database tests to capture more ADBC functionality (#24002)
  • Make CI perf failures more lenient (#24066)
  • Fix hive partition string encoding in CI by upgrading deltalake (#24018)
  • Make tests with sinks run on cloud again (#24048)

Thank you to all our contributors for making this release possible!
@Kevin-Patyk, @agossard, @alexander-beedie, @aparna2198, @borchero, @coastalwhite, @deanm0000, @dsprenkels, @henryharbeck, @jjurm, @kdn36, @math-hiyoko, @mcrumiller, @mroeschke, @nameexhaustion, @orlp, @r-brink, @ritchie46, @stijnherfst, @vdrn and @wence-
