pola-rs/polars py-1.25.2 on GitHub

🏆 Highlights

Enable common subplan elimination across plans in collect_all (#21747)
Add lazy sinks (#21733)
Add PartitionByKey for new streaming sinks (#21689)
Enable new streaming memory sinks by default (#21589)

🚀 Performance improvements

Implement linear-time rolling_min/max (#21770)
Improve InputIndependentSelect by delegating to InMemorySourceNode (#21767)
Enable common subplan elimination across plans in collect_all (#21747)
Allow elementwise functions in recursive lowering (#21653)
Add primitive single-key hashtable to new-streaming join (#21712)
Remove unnecessary black_boxes in Kahan summation (#21679)
Box large enum variants (#21657)
Improve join performance for new-streaming engine (#21620)
Pre-fill caches (#21646)
Optimize only a single cache input (#21644)
Collect parquet statistics in one contiguous buffer (#21632)
Update Cargo.lock (mainly for zstd 1.5.7) (#21612)
Don't maintain order when maintain_order=False in new streaming sinks (#21586)
Pre-sort groups in group-by-dynamic (#21569)
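
For readers curious what "linear-time" means for the rolling_min/max entry above (#21770), here is a minimal sketch of the classic monotonic-deque technique for a rolling minimum. It is illustrative only: the release notes do not say which algorithm Polars uses internally. The function name `rolling_min` and its list-based interface are hypothetical, and unlike Polars this sketch emits values only for full windows instead of nulls for the leading positions.

```python
from collections import deque


def rolling_min(values, window):
    """Return the minimum of every full window of `window` consecutive values.

    Hypothetical helper for illustration; not the Polars implementation.
    `values` must be an indexable sequence such as a list.
    """
    if window <= 0:
        raise ValueError("window must be positive")

    # Indices of candidate minima; their values increase from front to back.
    candidates = deque()
    out = []
    for i, v in enumerate(values):
        # Older values that are >= v can never be a window minimum again.
        while candidates and values[candidates[-1]] >= v:
            candidates.pop()
        candidates.append(i)
        # Drop the front candidate once it has slid out of the window.
        if candidates[0] <= i - window:
            candidates.popleft()
        # The front of the deque is the minimum of the current window.
        if i >= window - 1:
            out.append(values[candidates[0]])
    return out


print(rolling_min([4, 2, 12, 3, 5, 1], 3))  # [2, 2, 3, 1]
```

Each index is appended and removed at most once, so the total work is proportional to the input length and does not depend on the window size. A rolling maximum follows by flipping the comparison to `<=`.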

✨ Enhancements

Add support for rolling_(sum/min/max) for booleans through casting (#21748)
Support multi-column sort for all nested types and nested search-sorted (#21743)
Add lazy sinks (#21733)
Add PartitionByKey for new streaming sinks (#21689)
Fix replace flags (#21731)
Add mkdir flag to sinks (#21717)
Enable joins on list/array dtypes (#21687)
Add a config option to specify the default engine to attempt to use during lazyframe calls (#20717)
Support all elementwise functions in IO plugin predicates (#21705)
Stabilize Enum datatype (#21686)
Support Polars int128 in from arrow (#21688)
Use FFI to read dataframe instead of transmute (#21673)
Enable new streaming memory sinks by default (#21589)
Cloud support for new-streaming scans and sinks (#21621)
Add len method to arr (#21618)
Closeable files on unix (#21588)
Add new PartitionMaxSize sink (#21573)
Support engine callback for LazyFrame.profile (#21534)
Dispatch new-streaming CSV negative slice to separate node (#21579)
Add NDJSON source to new streaming engine (#21562)
Support passing token in storage_options for GCP cloud (#21560)

🐞 Bug fixes

Expose and document partitions (#21765)
Fix lazy schema for truediv ops involving List/Array dtypes (#21764)
Fix error due to race condition in file cache (#21753)
Clear NaNs due to zero-weight division in rolling var/std (#21761)
Allow init from BigQuery Arrow data containing ExtensionType cols with irrelevant metadata (#21492)
Disallow cast from boolean to categorical/enum (#21714)
Don't check sortedness in join_asof when 'by' groups supplied, but issue warning (#21724)
Incorrect multithread path taken for aggregations (#21727)
Disallow cast to empty Enum (#21715)
Fix list.mean and list.median returning Float64 for temporal types (#21144)
Incorrect (FixedSize)ListArrayBuilder gather implementation (#21716)
Always fallback in SkipBatchPredicate (#21711)
New streaming multiscan deadlock (#21694)
Ensure new-streaming join BuildState is correct even if never fed morsels (#21708)
IO plugin; support empty iterator (#21704)
Support nulls in multi-column sort (#21702)
Window function check length of groups state (#21697)
Support 128 sum reduction on new streaming (#21691)
IPC round-trip of list of empty view with non-empty bufferset (#21671)
Variance can never be negative (#21678)
Incorrect loop length in new-streaming group by (#21670)
Right join on multiple columns not coalescing left_on columns (#21669)
Casting Struct to String panics if n_chunks > 1 (#21656)
Fix Future attached to different loop error on read_database_uri (#21641)
Fix deadlock in cache + hconcat (#21640)
Properly handle phase transitions in row-wise sinks (#21600)
Enable new streaming memory sinks by default (#21589)
Always use global registry for object (#21622)
Check enum categories when reading csv (#21619)
Unspecialized prefiltering on nullable arrays (#21611)
Release the gil on explain (#21607)
Take into account scalar/partitioned columns in DataFrame::split_chunks (#21606)
Bad null handling in unordered row encoding (#21603)
Fix deadlock in new streaming CSV / NDJSON sinks (#21598)
Bad view index in BinaryViewBuilder (#21590)
Fix CSV count with comment prefix skipped empty lines (#21577)
New streaming IPC enum scan (#21570)
Several aspects related to ParquetColumnExpr (#21563)
Don't hit parquet::pre-filtered in case of pre-slice (#21565)

📖 Documentation

Add skrub to ecosystem.md (#21760)
Add example for percentile rank (#21746)
Make python/rust getting-started consistent and clarify performance risk of infer_schema_length=None (#21734)
Add expression composability to PySpark comparison (#21473)
Document read_().lazy() antipattern (#21623)
Update Polars Cloud interactive workflow examples (#21609)
Add a Plotnine example to the visualization docs (#21597)
Add cloud api reference to Ref guide (#21566)

🛠️ Other improvements

Remove variance numerical stability hack (#21749)
Only use chrono_tz timezones in hypothesis testing (#21721)
Remove order check from flaky test (#21730)
Add sinks into the DSL before optimization (#21713)
Add missing test case for #21701 (#21709)
Remove old-streaming from engine argument (#21667)
Add as_phys_any to PrivateSeries for downcasting (#21696)
Use FFI to read dataframe instead of transmute (#21673)
Work around typos ignore bug (#21672)
Added Test For datetime_range Nanosecond Overflow (#21354)
Update to edition 2024 (#21662)
Update rustc (#21647)
Support object from chunks (#21636)
Push versioned docs on workflow dispatch (#21630)
Fail docs early (#21629)
Check major/minor in docs (#21626)
Add docs workflow (#21624)
Add test for 21581 (#21617)
Remove even more parquet multiscan handling (#21601)
Remove multiscan handling from new streaming parquet source (#21584)
Prepare skeleton for partitioning sinks (#21536)

Thank you to all our contributors for making this release possible!
@GaelVaroquaux, @Kevin-Patyk, @MarcoGorelli, @Matt711, @NathanHu725, @alexander-beedie, @coastalwhite, @dependabot[bot], @jrycw, @kdn36, @lukemanley, @mcrumiller, @nameexhaustion, @orlp, @r-brink, @ritchie46 and @wence-
