github pola-rs/polars py-1.23.0
Python Polars 1.23.0

🚀 Performance improvements

  • Toggle projection pushdown for eager rolling (#21405)
  • Fix pathological rolling + group-by performance and memory explosion (#21403)
  • Add sampling to new-streaming equi join to decide between build/probe side (#21197)

✨ Enhancements

  • Implement i128 -> str cast (#21411)
  • Connect polars-cloud (#21387)
  • Version DSL (#21383)
  • Make user facing binary formats mostly self describing (#21380)
  • Filter hive files using predicates in new streaming (#21372)
  • Add negative slicing to new streaming multiscan (#21219)
  • Allow iterable of frames as input to align_frames (#21209)
  • Implement sorted flags for struct series (#21290)
  • Support reading arrow Map type from Delta (#21330)
  • Add a dedicated remove method for DataFrame and LazyFrame (#21259)
  • Rename credentials parameter to credential in CredentialProviderAzure (#21295)
  • Implement merge_sorted for struct (#21205)
  • Add positive slice for new streaming MultiScan (#21191)
  • Don't take in rewriting visitor (#21212)
  • Add SQL support for the DELETE statement (#21190)
  • Add row index to new streaming multiscan (#21169)
  • Improve DataFrame fmt in explain (#21158)

🐞 Bug fixes

  • Method dt.ordinal_day was returning UTC results as opposed to those on the local timestamp (#21410)
  • Use Kahan summation for rolling sum kernels. Fix numerical stability issues (#21413)
  • Add scalar checks for n and fill_value parameters in shift (#21292)
  • Upcast small integer dtypes for rolling sum operations (#21397)
  • Don't silently produce null values from invalid input to pl.datetime and pl.date (#21013)
  • Allow duration multiplied with primitive to propagate in IR schema (#21394)
  • Struct arithmetic broadcasting behavior (#21382)
  • Prefiltered optional plain primitive kernel (#21381)
  • Panic when projecting only row index from IPC file (#21361)
  • Properly update groups after gather in aggregation context (#21369)
  • Mark test as may_fail_auto_streaming (#21373)
  • Properly set fast_unique in EnumBuilder (#21366)
  • Rust test race condition (#21368)
  • Fix unequal DataFrame column heights from parquet hive scan with filter (#21340)
  • Fix ColumnNotFound error selecting len() after semi/anti join (#21355)
  • Merge Parquet nested and flat decoders (#21342)
  • Incorrect atomic ordering in Connector (#21341)
  • Method dt.offset_by was discarding month and year info if day was included in offset for timezone-aware columns (#21291)
  • Fix pickling polars.col on Python versions <3.11 (#21333)
  • Fix duplicate column names after join if suffix already present (#21315)
  • Skip Batches Expression for boolean literals (#21310)
  • Fix performance regression for eager join_where (#21308)
  • Fix incorrect predicate pushdown for predicates referring to right-join key columns (#21293)
  • Panic in to_physical for series of arrays and lists (#21289)
  • Resolve deadlock due to leaking in Connector recv drop (#21296)
  • Incorrect result for merge_sorted with lexical categorical (#21278)
  • Add Int128 path for join_asof (#21282)
  • Categorical min/max returning String dtype rather than Categorical (#21232)
  • Checking overflow in Sliced function (#21207)
  • Adding a struct field using a literal raises InvalidOperationError (#21254)
  • Return nulls for is_finite, is_infinite, and is_nan when dtype is pl.Null (#21253)
  • Account for minor change in new connectorx release (#21277)
  • Properly implement and test Skip Batch Predicate (#21269)
  • Infinite recursion when broadcasting into struct zip_outer_validity (#21268)
  • Deadlock due to bad logic in new-streaming join sampling (#21265)
  • Incorrect result for top_k/bottom_k when input is sorted (#21264)
  • UTF-8 validation of nested string slice in Parquet (#21262)
  • Raise instead of panicking when casting a Series to a Struct with the wrong number of fields (#21213)
  • Defer credential provider resolution to take place at query collection instead of construction (#21225)
  • Do not panic in strptime() if format ends with '%' (#21176)
  • Raise error instead of panicking for unsupported SQL operations (#20789)
  • Projection of only row index in new streaming IPC (#21167)
  • Fix projection count query optimization (#21162)
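
To show why Kahan summation (#21413) helps numerical stability, here is a minimal pure-Python sketch of the general technique. It is not the Polars rolling kernel, which is written in Rust and operates over windows; the function names and the test data are assumptions made for illustration.

```python
def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Compensated (Kahan) summation.

    A running compensation term captures the low-order bits that are
    lost each time a small value is added to a large running total.
    """
    total = 0.0
    compensation = 0.0
    for v in values:
        y = v - compensation            # re-apply the error lost so far
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total


# One large value followed by many values too small to register individually.
values = [1.0] + [1e-16] * 100_000

naive = naive_sum(values)
kahan = kahan_sum(values)

print(naive)
print(kahan)
```

In this input every 1e-16 term is below half the spacing between adjacent doubles near 1.0, so the naive loop never moves off 1.0, while the compensated loop ends very close to the exact total of 1 + 1e-11.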

📖 Documentation

  • Fix doc for SQL Functions navigation (#21412)
  • Fix initial selector example (#21321)
  • Add pandas strictness API difference (#21312)
  • Improve Expr.name.map docstring example (#21309)
  • Add logo to Ask AI (#21261)
  • Fix docs for Catalog (#21252)
  • AI widget again (#21257)
  • Revert plugin (#21250)
  • Add kappa ask ai widget (#21243)
  • Update social icons in API reference docs (#21214)
  • Improve Arrow key feature description (#21171)
  • Improve example in IO plugins user guide (#21146)

🛠️ Other improvements

  • Move storage of hive partitions to DataFrame (#21364)
  • Feature gate merge sorted in new streaming engine (#21338)
  • Remove new streaming old multiscan (#21300)
  • Add tests for fixed open issues (#21185)
  • Try to mimic all steps (#21249)
  • Require version for POLARS_VERSION (#21248)
  • Fix docs (#21246)
  • Avoid unnecessary packaging dependency (#21223)
  • Remove unused file (#21240)
  • Add use_field_init_shorthand = true to rustfmt (#21237)
  • Don't mutate arena by default in Rewriting Visitor (#21234)
  • Disable the TraceMalloc allocator (#21231)
  • Add feature gate to old streaming deprecation warning (#21179)
  • Install seaborn when running remote benchmark (#21168)

Thank you to all our contributors for making this release possible!
@GiovanniGiacometti, @JakubValtar, @MarcoGorelli, @Matt711, @Shoeboxam, @YichiZhang0613, @alexander-beedie, @bschoenmaeckers, @coastalwhite, @edwinvehmaanpera, @erikbrinkman, @etiennebacher, @hemanth94, @henryharbeck, @jqnatividad, @lukemanley, @mcrumiller, @nameexhaustion, @orlp, @r-brink, @ritchie46 and @ydagosto