github pola-rs/polars py-1.24.0
Python Polars 1.24.0


🚀 Performance improvements

  • Provide a fallback skip batch predicate for constant batches (#21477)
  • Parallelize the passing in new streaming multiscan (#21430)

✨ Enhancements

  • Add lossy decoding to read_csv for non-utf8 encodings (#21433)
  • Add DataFrame.write_iceberg (#15018)
  • Add 'nulls_equal' parameter to is_in (#21426)
  • Improve numeric stability rolling_{std, var, cov, corr} (#21528)
  • IR Serde cross-filter (#21488)
  • Give priority to pycapsule interface in from_dataframe (#21377)
  • Support writing Time type in json (#21454)
  • Activate all optimizations in sinks (#21462)
  • Add AssertionError variant to PolarsError in polars-error (#21460)
  • Pass filter to inner readers in multiscan new streaming (#21436)
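
The numeric-stability item for rolling_{std, var, cov, corr} can be motivated with a small pure-Python sketch. It is not Polars' implementation; the two helper functions below are hypothetical names chosen for illustration. It contrasts the textbook sum-of-squares formula, which suffers from cancellation when values share a large offset, with Welford's online update, which is one well-known stable alternative.

```python
def naive_variance(values):
    """Sample variance via sum(x^2) - sum(x)^2 / n; prone to cancellation."""
    n = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return (total_sq - total * total / n) / (n - 1)


def welford_variance(values):
    """Sample variance via Welford's online update of the mean and M2."""
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for count, v in enumerate(values, start=1):
        delta = v - mean
        mean += delta / count
        m2 += delta * (v - mean)
    return m2 / (len(values) - 1)


# A small spread on top of a large offset; the exact sample variance is 1.0.
data = [1e9 + x for x in (0.0, 1.0, 2.0)]

print("naive:  ", naive_variance(data))
print("welford:", welford_variance(data))
```

With 64-bit floats the squares near 1e18 cannot hold the low-order digits that carry the spread, so the naive result drifts far from 1.0 while the Welford result stays accurate.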

🐞 Bug fixes

  • Categorical min/max panicking when string cache is enabled (#21552)
  • Don't encode IPC record batch twice (#21525)
  • Respect rewriting flag in Node rewriter (#21516)
  • Correct skip batch predicate for partial statistics (#21502)
  • Make the Parquet Sink properly phase aware (#21499)
  • Don't divide by zero in partitioned group-by (#21498)
  • Create new linearizer between rowwise new streaming sink phases (#21490)
  • Don't drop rows in sinks between new streaming phases (#21489)
  • Incorrect lazy schema for Expr.list.diff (#21484)
  • Give priority to pycapsule interface in from_dataframe (#21377)
  • Duration Series arithmetic operations (#21425)
  • Fix unwrap None panic when filtering delta with missing columns (#21453)
  • Use stable sort for rolling-groupby (#21444)
  • Throw exception if dataframe is too large to be compatible with Excel (#20900)
  • Address regression with read_excel not handling URL paths correctly (#21428)
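
As an illustration of the Excel size check listed above, here is a minimal sketch of the kind of guard involved. The worksheet limits (1,048,576 rows and 16,384 columns) are those of the .xlsx format; the function name, the header handling, and the error message are assumptions made for this example and do not reflect how Polars implements the check.

```python
EXCEL_MAX_ROWS = 1_048_576  # worksheet row limit in the .xlsx format
EXCEL_MAX_COLS = 16_384     # worksheet column limit (last column is XFD)


def check_fits_in_worksheet(n_rows, n_cols, include_header=True):
    """Raise ValueError if a table of this shape cannot fit in one worksheet.

    Hypothetical helper: assumes the header, if written, occupies one row.
    """
    needed_rows = n_rows + (1 if include_header else 0)
    if needed_rows > EXCEL_MAX_ROWS or n_cols > EXCEL_MAX_COLS:
        raise ValueError(
            f"table of {needed_rows} rows x {n_cols} columns exceeds the "
            f"worksheet limit of {EXCEL_MAX_ROWS} x {EXCEL_MAX_COLS}"
        )


# Fits: the data rows plus one header row exactly reach the row limit.
check_fits_in_worksheet(1_048_575, 10)

# Does not fit: one data row too many once the header is counted.
try:
    check_fits_in_worksheet(1_048_576, 10)
except ValueError as exc:
    print("rejected:", exc)
```

Failing early with a clear error is preferable to producing a file that a spreadsheet application cannot open.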

📖 Documentation

  • Fix typo (#21554)
  • Correct typos and grammar in Python docstrings (#21524)
  • Move llm page under misc (#21550)
  • Polars Cloud docs (#21548)
  • Add LazyFrame.remote docs entry (#21529)
  • Specify that the key column must be sorted in ascending order in merge_sorted (#21501)
  • Add Polars & LLMs page to the user guide (#21218)
  • Mention that statistics=True doesn't enable all statistics in sink_parquet() (#21434)
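
The merge_sorted note above (the key column must be sorted in ascending order) reflects a general property of merging pre-sorted inputs. The sketch below uses `heapq.merge` from the Python standard library as a stand-in, not Polars itself, to show what happens when one input violates that precondition.

```python
import heapq

ascending_a = [1, 3, 5]
ascending_b = [2, 4, 6]

# Both inputs ascending: the merge yields a fully sorted sequence.
merged_ok = list(heapq.merge(ascending_a, ascending_b))

# One input descending: the merge still runs, but the output is not sorted,
# because the algorithm only ever compares the current heads of each input.
descending_a = [5, 3, 1]
merged_bad = list(heapq.merge(descending_a, ascending_b))

print(merged_ok)
print(merged_bad == sorted(merged_bad))
```

No error is raised in the second case, which is why the precondition is worth stating explicitly in documentation.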

🛠️ Other improvements

  • Don't take ownership of IRplan in new streaming engine (#21551)
  • Refactor code for re-use by streaming NDJSON source (#21520)
  • Simplify the phase handling of new streaming sinks (#21530)
  • Improve IPC sink node parallelism (#21505)
  • Use tikv-jemallocator (#21486)
  • Rename 'join_nulls' parameter to 'nulls_equal' in join functions (#21507)
  • Move rolling to polars-compute (#21503)
  • Remove Growable in favor of ArrayBuilder (#21500)
  • Introduce a Sink Node trait in the new streaming engine (#21458)
  • Add test for rolling stability sort (#21456)
  • Add test for empty .is_in predicate filter (#21455)
  • Test for unique length on multiple columns (#21418)
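
The rename of 'join_nulls' to 'nulls_equal' in join functions matches the parameter name added to is_in in this release. As a rough model of what such a flag controls in a join, the pure-Python sketch below treats null keys as never matching by default and as matching each other when the flag is set. The `inner_join` helper is hypothetical and only models the semantics; it is not Polars code.

```python
def inner_join(left, right, nulls_equal=False):
    """Inner-join two lists of (key, value) pairs on key.

    None stands in for a null key. By default a null key matches nothing;
    with nulls_equal=True a null key matches other null keys.
    """
    out = []
    for lkey, lval in left:
        for rkey, rval in right:
            if lkey is None or rkey is None:
                matched = nulls_equal and lkey is None and rkey is None
            else:
                matched = lkey == rkey
            if matched:
                out.append((lkey, lval, rval))
    return out


left = [(1, "a"), (None, "b")]
right = [(1, "x"), (None, "y")]

print(inner_join(left, right))                    # null keys are dropped
print(inner_join(left, right, nulls_equal=True))  # null keys pair up
```

Code that still passes the old keyword name may need updating when upgrading; the release note does not say whether the old name remains accepted as a deprecated alias.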

Thank you to all our contributors for making this release possible!
@Kevin-Patyk, @MarcoGorelli, @Matt711, @alexander-beedie, @banflam, @braaannigan, @coastalwhite, @dependabot[bot], @etiennebacher, @ghuls, @kevinjqliu, @lukemanley, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @stijnherfst and @thomasjpfan
