github pola-rs/polars py-1.27.0
Python Polars 1.27.0

5 months ago

💥 Breaking changes

  • Make bottom interval closed in hist (#22090)
  • Change Partition API to base_path and file_path (#21888)
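
The hist change can be pictured with a small plain-Python model of the binning rule. This is only a sketch, not Polars' implementation: `hist_counts` and its `closed_bottom` flag are hypothetical names, and the model assumes bins are otherwise right-closed, `(a, b]`, so that the only difference is whether a value equal to the lowest edge is counted.

```python
from bisect import bisect_left


def hist_counts(values, edges, closed_bottom=True):
    """Count values into right-closed bins (edges[i], edges[i+1]].

    With closed_bottom=True the first bin becomes [edges[0], edges[1]],
    which is the behaviour this sketch assumes for 1.27.0.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v == edges[0]:
            # A value sitting exactly on the lowest edge is only counted
            # when the bottom interval is closed.
            if closed_bottom:
                counts[0] += 1
            continue
        if v < edges[0] or v > edges[-1]:
            continue  # outside the histogram range
        # bisect_left gives the first i with edges[i] >= v,
        # so v falls in (edges[i - 1], edges[i]].
        counts[bisect_left(edges, v) - 1] += 1
    return counts


values = [0, 1, 1, 2, 3]
edges = [0, 1, 2, 3]
old_counts = hist_counts(values, edges, closed_bottom=False)
new_counts = hist_counts(values, edges, closed_bottom=True)
```

In this model the value `0` is dropped under the old rule and lands in the first bin under the new one, so code that relied on the previous counts for the lowest bin may need adjusting.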

🚀 Performance improvements

  • Add CSE to streaming groupby (#22196)
  • Speed-up new streaming predicate filtering (#22179)
  • Speedup new-streaming file row count (#22169)
  • Fix quadratic behavior when casting Enums (#22008)
  • Lower is_in to bitmap-output semi-join in new streaming engine (#21948)
  • Fast path for empty inner join (#21965)
  • Add native semi/anti join in new streaming engine (#21937)
  • Cache regex compilation globally (#21929)
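
To make the semi/anti join entries above concrete, here is a plain-Python sketch of how an `is_in` check relates to a semi join that outputs a boolean mask. The function names are hypothetical and the code says nothing about how the streaming engine is actually implemented; it ignores null handling and only shows the relationship between the three operations.

```python
def semi_join_mask(left_keys, right_keys):
    """Return one boolean per left key: True if the key occurs on the right.

    This mask is what an is_in check produces, which is why it can be
    expressed as a semi join that outputs a bitmap instead of rows.
    """
    lookup = set(right_keys)  # build side: deduplicated right keys
    return [key in lookup for key in left_keys]


def semi_join(left_rows, right_keys, key):
    """Keep left rows whose key has at least one match on the right."""
    mask = semi_join_mask([row[key] for row in left_rows], right_keys)
    return [row for row, keep in zip(left_rows, mask) if keep]


def anti_join(left_rows, right_keys, key):
    """Keep left rows whose key has no match on the right."""
    mask = semi_join_mask([row[key] for row in left_rows], right_keys)
    return [row for row, keep in zip(left_rows, mask) if not keep]


left_rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
right_keys = [2, 3, 3, 5]

mask = semi_join_mask([row["id"] for row in left_rows], right_keys)
kept = semi_join(left_rows, right_keys, "id")
dropped = anti_join(left_rows, right_keys, "id")
```

Unlike an inner join, duplicate keys on the right never duplicate left rows here: each left row is either kept once or not at all.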

✨ Enhancements

  • Add SPLIT_PART string function to the SQL interface (#22158)
  • Allow scalar expr in Expr.diff (#22142)
  • Support additional unsigned int aliases in the SQL interface (#22127)
  • Add STRING_TO_ARRAY function to the SQL interface (#22129)
  • Add dt.is_business_day (#21776)
  • Add an eager parameter to pl.cov (#22098)
  • Add support for Int128 parsing/recognition to the SQL interface (#22104)
  • Add an eager parameter to pl.coalesce (#22092)
  • Add an eager parameter to pl.corr (#22097)
  • Allow sinking to abstract python io and fs classes (#21987)
  • Add add_alp_optimize_exprs to IRBuilder (#22061)
  • Add cat.slice (#21971)
  • Support growing schema if line length increases during csv schema inference (#21979)
  • Replace thread unsafe GilOnceCell with Mutex (#21927)
  • Support modified dsl in file cache (#21907)
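
As an illustration of what a business-day check such as dt.is_business_day computes, the following plain-Python model may help. It is a sketch under assumptions: the `week_mask` and `holidays` parameter names and their defaults are borrowed from `pl.business_day_count` and are not confirmed by these notes, and the function below is a stand-in rather than the Polars expression itself.

```python
from datetime import date

# Assumed default: Monday to Friday are business days.
DEFAULT_WEEK_MASK = (True, True, True, True, True, False, False)


def is_business_day(day, week_mask=DEFAULT_WEEK_MASK, holidays=()):
    """Return True if `day` is an enabled weekday and not a listed holiday."""
    # date.weekday() is 0 for Monday through 6 for Sunday,
    # matching the order of week_mask.
    return week_mask[day.weekday()] and day not in set(holidays)


friday = date(2025, 4, 18)
saturday = date(2025, 4, 19)
monday = date(2025, 4, 21)

plain = [is_business_day(d) for d in (friday, saturday, monday)]
with_holiday = [is_business_day(d, holidays=[friday]) for d in (friday, saturday, monday)]
```

A custom `week_mask` covers calendars with a different weekend, for example a Sunday-to-Thursday working week.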

🐞 Bug fixes

  • Implode in agg (#22197)
  • Reduce GIL hold time for IO plugins in new-streaming (#22186)
  • Enhance predicate validation and cast safety in join_where (#22112)
  • Handle Parquet with compressed empty DataPage v2 (#22172)
  • Schema error during lowering (#22175)
  • Rewrite unroll of overlapping groups to mitigate out of range index panic (#22072)
  • Incorrect rounding for very large/small numbers (#22173)
  • Allow set input to list.set_* operations (#22163)
  • Deadlock in join due to rayon nested task-stealing (#22159)
  • Mark Expr.repeat_by as elementwise (#22068)
  • Fix csv serializer panic by supporting ScalarColumn in as_single_chunk (#22146)
  • Raise an error if a number doesn't have associated unit in duration strings (#22035)
  • Add i128 as supertype to boolean (#22138)
  • Fix panic when constructing DF from pyarrow due to duplicate field names (#22114)
  • Add broadcasts and error messages for many elementwise operations (#22130)
  • Throw error for n=0 on list.gather_every (#22122)
  • Throw error for unsupported rolling operations (#22121)
  • Error on unequal length str.to_integer arguments (#22100)
  • Make bottom interval closed in hist (#22090)
  • Relative path resolution for plugin libraries (#21911)
  • Avoid panic with strptime for out-of-bounds dates (#21208)
  • Join revmaps for categoricals in merge_sorted (#21976)
  • Fix glob expansion matching extra files (#21991)
  • Ensure SQL dot-notation for nested column fields resolves correctly (#22109)
  • Parquet filter performance regression from multiscan dispatch (#22116)
  • Panic for unequal length ewm_mean_by args (#22093)
  • Add scalarity checks to pl.repeat (#22088)
  • Type check n parameter of pl.repeat (#22071)
  • Mark bitwise_{count,leading,trailing}_{ones,zeros} as elementwise (#22044)
  • Mark pl.*_ranges functions correctly as element-wise (#22059)
  • Correctly type check pl.arctan2 (#22060)
  • Mark pl.business_day_count as elementwise (#22055)
  • Check input python type for str.extract_groups (#22032)
  • Check types for fill_char in str.pad_{start,end} (#22036)
  • Mark str.to_decimal properly as non-elementwise (#22040)
  • Documented return type for bin.encode and bin.decode (#22022)
  • Revert #22017 and improve block(_in_place)_on doc comment (#22031)
  • Remove outdated depth warning (#22030)
  • Expression pl.concat was incorrectly marked as elementwise (#22019)
  • Use block_in_place_on to start streaming (#22017)
  • Panic on empty aggregation in streaming (#22016)
  • Error instead of panic for invalid durations in dt.offset_by() and dt.round() (#21982)
  • Raise error instead of silently appending NULL in NDJSON parsing (#21953)
  • Ensure AV is static before pushing to row buffer (#21967)
  • Deadlock in new-streaming multiplexer (#21963)
  • Release GIL in collect_with_callback (#21941)
  • Panic in new RegexCache (#21935)
  • Type hint of cs.exclude() is SelectorType instead of Expr (#21892)
  • Add correct deprecation warning for .str.concat (#21666)
  • Use absolute paths by default for plugins (#21904)
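
The duration-string fix in this list (#22035) can be pictured with a small validator. This is a hedged sketch in plain Python, not the parser Polars uses: `parse_duration` is a hypothetical helper, and the unit list is an assumption based on the duration strings documented for Polars (such as "1d", "3mo", "1h30m").

```python
import re

# Assumed unit suffixes; longer suffixes come first so "ms" and "mo"
# are tried before "m".
_UNITS = ("ns", "us", "ms", "mo", "s", "m", "h", "d", "w", "q", "y", "i")
_TOKEN = re.compile(r"(\d+)(" + "|".join(_UNITS) + r")")


def parse_duration(text):
    """Split a duration string into (number, unit) pairs.

    Raises ValueError when a number has no unit attached, instead of
    silently ignoring or misreading it.
    """
    body = text[1:] if text.startswith("-") else text
    parts = []
    pos = 0
    while pos < len(body):
        match = _TOKEN.match(body, pos)
        if match is None:
            raise ValueError(f"expected '<number><unit>' in {text!r}")
        parts.append((int(match.group(1)), match.group(2)))
        pos = match.end()
    if not parts:
        raise ValueError(f"empty duration string: {text!r}")
    return parts


ok = parse_duration("3d12h")
months = parse_duration("1mo")
```

With this model, a string such as "1d12" is rejected because the trailing "12" carries no unit.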

📖 Documentation

  • Add user guide section on working with Sheets in Colab (#22161)
  • Update distributed engine docs (#22128)
  • Add Polars Cloud release notes (#22021)
  • Remove trailing space in settings POLARS_CLOUD_CLIENT_ID (#21995)
  • Fix typo (#21954)
  • Fix 'pickleable' typo in docs (#21938)
  • Change ctx to compute=ctx for all remote query examples (#21930)

🛠️ Other improvements

  • Remove old MultiScanExec for in-memory (#22184)
  • Separate FunctionOptions from DSL calls (#22133)
  • Undeprecate backward_fill and forward_fill (#22156)
  • Handle conversion of Duration specially in pyir (#22101)
  • Deprecate duplicate backward_fill and forward_fill interface (#22083)
  • Solve clippy lints for 1.86 (#22102)
  • Remove rust exclusive MaxBound and MinBound fill strategies (#22063)
  • Change Partition API to base_path and file_path (#21888)
  • Fix pydantic model_fields deprecation (#21958)

Thank you to all our contributors for making this release possible!
@DeflateAwning, @EnricoMi, @Jacob640, @JakubValtar, @MarcoGorelli, @MaxJackson, @alexander-beedie, @amotzop, @anath2, @bschoenmaeckers, @cnpryer, @coastalwhite, @dependabot[bot], @eitsupi, @etiennebacher, @hemanth94, @kdn36, @mcrumiller, @nameexhaustion, @orlp, @r-brink, @rgertenbach, @ritchie46, @sebasv, @silannisik, @stijnherfst, @wence-, @zachlefevre
