pola-rs/polars py-1.35.0 on GitHub

🏆 Highlights

Stabilize decimal (#25020)

🚀 Performance improvements

Bump foldhash to 0.2.0 and hashbrown to 0.16.0 (#25014)
Lower unique to native group-by and speed up n_unique in group-by context (#24976)
Better parallelize take{_slice,}_unchecked (#24980)
Implement native skew and kurtosis in group-by context (#24961)
Use native group-by aggregations for bitwise_* operations (#24935)
Address group_by_dynamic slowness in sparse data (#24916)
Push filters to PyIceberg (#24910)
Native filter/drop_nulls/drop_nans in group-by context (#24897)
Implement cumulative_eval using the group-by engine (#24889)
Prevent generation of copies of DataFrames in DslPlan serialization (#24852)
Implement native null_count, any and all group-by aggregations (#24859)
Speed up reverse in group-by context (#24855)
Prune unused categorical values when exporting to arrow/parquet/IPC/pickle (#24829)
Don't check duplicates on streaming simple projection in release mode (#24830)
Lower approx_n_unique to the streaming engine (#24821)
Duration/interval string parsing optimisation (2-5x faster) (#24771)
Use native reducer for first/last on Decimals, Categoricals and Enums (#24786)
Implement indexed method for BitMapIter::nth (#24766)
Pushdown slices on plans within unions (#24735)

✨ Enhancements

Stabilize decimal (#25020)
Support ewm_mean() in streaming engine (#25003)
Improve row-count estimates (#24996)
Remove filtered scan paths in IR when possible (#24974)
Introduce remote Polars MCP server (#24977)
Allow local scans on polars cloud (configurable) (#24962)
Add Expr.item to strictly extract a single value from an expression (#24888)
Add environment variable to roundtrip empty struct in Parquet (#24914)
Fast-count for scan_iceberg().select(len()) (#24602)
Add glob parameter to scan_ipc (#24898)
Prevent generation of copies of DataFrames in DslPlan serialization (#24852)
Add list.agg and arr.agg (#24790)
Implement {Expr,Series}.rolling_rank() (#24776)
Don't require PyArrow for read_database_uri if ADBC engine version supports PyCapsule interface (#24029)
Make Series init consistent with DataFrame init for string values declared with temporal dtype (#24785)
Support MergeSorted in CSPE (#24805)
Duration/interval string parsing optimisation (2-5x faster) (#24771)
Recursively apply CSPE (#24798)
Add streaming engine per-node metrics (#24788)
Add arr.eval (#24472)
Drop PyArrow requirement for non-batched usage of read_database with the ADBC engine and support iter_batches with the ADBC engine (#24180)
Improve rolling_(sum|mean) accuracy (#24743)
Add separator to {Data,Lazy}Frame.unnest (#24716)
Add union() function for unordered concatenation (#24298)
Add name.replace to the set of column rename options (#17942)
Support np.ndarray -> AnyValue conversion (#24748)
Allow duration strings with leading "+" (#24737)
Drop now-unnecessary post-init "schema_overrides" cast on DataFrame load from list of dicts (#24739)
Add support for UInt128 to pyo3-polars (#24731)
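
Two entries in this release concern the floating-point accuracy of sums and means: "Improve rolling_(sum|mean) accuracy (#24743)" in this section and "Use Kahan summation for in-memory groupby sum/mean (#24774)" under bug fixes. As a rough illustration of why compensated summation helps, here is a minimal pure-Python sketch of Kahan summation. It is not the Polars implementation (which is written in Rust), the function names are made up for this example, and whether #24743 uses the same technique is not stated in these notes.

```python
import math
from typing import Iterable


def naive_sum(values: Iterable[float]) -> float:
    """Plain left-to-right accumulation; rounding errors pile up."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values: Iterable[float]) -> float:
    """Kahan (compensated) summation.

    `compensation` tracks the low-order bits lost when a small value
    is added to a large running total, and feeds them back in on the
    next step.
    """
    total = 0.0
    compensation = 0.0
    for x in values:
        y = x - compensation            # apply the correction carried over
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total


if __name__ == "__main__":
    data = [0.1] * 1_000_000
    exact = math.fsum(data)  # correctly rounded reference sum
    print("naive error:", abs(naive_sum(data) - exact))
    print("kahan error:", abs(kahan_sum(data) - exact))
```

The explicit loop in naive_sum is deliberate: recent CPython versions already apply compensated summation inside the built-in sum() for floats, which would hide the effect being shown.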

🐞 Bug fixes

Re-enable CPU feature check before import (#25010)
Implement read_excel workaround for fastexcel/calamine issue loading a column subset from a named table (#25012)
Fix correctness of any(ignore_nulls) and out-of-bounds access in all (#25005)
Streaming any/all with ignore_nulls=False (#25008)
Fix incorrect join_asof on a casted expression (#25006)
Optimize memory on rolling groups in ApplyExpr (#24709)
Fall back to in-memory engine for PyArrow scan (#24991)
Make Operator::swap_operands return correct operators for Plus, Minus, Multiply and Divide (#24997)
Capitalize letters after numbers in to_titlecase (#24993)
Preserve null values in pct_change (#24952)
Raise length mismatch on over with sliced groups (#24887)
Check duplicate name in transpose (#24956)
Follow Kleene logic in any / all for group-by (#24940)
Do not optimize cross join to iejoin if order maintaining (#24950)
Fix typing of scan_parquet partially unknown (#24928)
Properly release the GIL for read_parquet_metadata (#24922)
Broadcast partition_by columns in over expression (#24874)
Clear index cache on stacked df.filter expressions (#24870)
Fix 'explode' mapping strategy on scalar value (#24861)
Fix repeated with_row_index() after scan() silently ignored (#24866)
Correctly return min and max for enums in groupby aggregation (#24808)
Refactor BinaryExpr in group_by dispatch logic (#24548)
Fix aggstate for gather (#24857)
Keep scalars for length preserving functions in group_by (#24819)
Have range feature depend on dtype-array feature (#24853)
Fix duplicate select panic (#24836)
Inconsistency of list.sum() result type with None values (#24476)
Division by zero in Expr.dt.truncate (#24832)
Potential deadlock in __arrow_c_stream__ (#24831)
Allow double aggregations in group-by contexts (#24823)
Series.shrink_dtype for i128/u128 (#24833)
Fix dtype in EvalExpr (#24650)
Allow aggregations on AggState::LiteralScalar (#24820)
Dispatch to group_aware for fallible expressions with masked out elements (#24815)
Fix error for arr.sum() on small integer Array dtypes containing nulls (#24478)
Fix regression on write_database() to Snowflake due to unsupported string view type (#24622)
Fix XOR not following Kleene logic when one side is unit-length (#24810)
Make Series init consistent with DataFrame init for string values declared with temporal dtype (#24785)
Incorrect precision in Series.str.to_decimal (#24804)
Use overlapping instead of rolling (#24787)
Fix iterable on dynamic_group_by and rolling object (#24740)
Use Kahan summation for in-memory groupby sum/mean (#24774)
Release GIL in PythonScan predicate evaluation (#24779)
Type error in bitmask::nth_set_bit_u64 (#24775)
Add Expr.sign for Decimal datatype (#24717)
Correct str.replace with missing pattern (#24768)
Ensure schema_overrides is respected when loading iterable row data (#24721)
Support decimal_comma on Decimal type in write_csv (#24718)
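
Several of the fixes above (#25005, #25008, #24940, #24810) refer to Kleene logic, the three-valued logic in which a null means "unknown". The following pure-Python sketch shows the expected truth tables for any, all and XOR, with None standing in for null. The function names and the ignore_nulls parameter are illustrative only; this is not Polars code and does not show how the fixes were implemented.

```python
from typing import Iterable, Optional


def kleene_any(values: Iterable[Optional[bool]],
               ignore_nulls: bool = False) -> Optional[bool]:
    """any() where None means 'unknown'.

    A single True decides the result. If no True is present but a None
    was seen, the answer is unknown (None) unless nulls are ignored.
    """
    saw_null = False
    for v in values:
        if v is True:
            return True
        if v is None:
            saw_null = True
    return None if (saw_null and not ignore_nulls) else False


def kleene_all(values: Iterable[Optional[bool]],
               ignore_nulls: bool = False) -> Optional[bool]:
    """all() where None means 'unknown'.

    A single False decides the result. If no False is present but a
    None was seen, the answer is unknown (None) unless nulls are ignored.
    """
    saw_null = False
    for v in values:
        if v is False:
            return False
        if v is None:
            saw_null = True
    return None if (saw_null and not ignore_nulls) else True


def kleene_xor(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """XOR is unknown whenever either operand is unknown."""
    if a is None or b is None:
        return None
    return a != b


if __name__ == "__main__":
    print(kleene_any([False, None]))                     # unknown
    print(kleene_any([False, None], ignore_nulls=True))  # nulls skipped
    print(kleene_all([True, None]))                      # unknown
    print(kleene_xor(True, None))                        # unknown
```

Note the asymmetry: any can short-circuit to True and all to False even when nulls are present, because the unknown values cannot change those outcomes.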

📖 Documentation

Introduce remote Polars MCP server (#24977)
Add {arr,list}.agg API references (#24970)
Support LLM in docs (#24958)
Update Cloud docs with correct fn argument order (#24939)
Update name.replace examples (#24941)
Add i128 and u128 features to user guide (#24938)
Add partitioning examples for sink_* methods (#24918)
Add more {unique,value}_counts examples (#24927)
Indent the versionchanged (#24783)
Relax fsspec wording (#24881)
Add pl.field into the api docs (#24846)
Fix duplicated article in SECURITY.md (#24762)
Document output name determination in when/then/otherwise (#24746)
Specify that precision=None becomes 38 for Decimal (#24742)
Mention polars[rt64] and polars[rtcompat] instead of u64-idx and lts-cpu (#24749)
Fix source mapping (#24736)

📦 Build system

Ensure build_feature_flags.py is included in artifact (#25024)
Update pyo3 and numpy crates to version 0.26 (#24760)

🛠️ Other improvements

Fix benchmark ci (#25019)
Fix non-deterministic test (#25009)
Fix makefile arch detection (#25011)
Make LazyFrame.set_sorted into a FunctionIR::Hint (#24981)
Remove symbolic links (#24982)
Deprecate Expr.agg_groups() and pl.groups() (#24919)
Dispatch to no-op rayon thread-pool from streaming (#24957)
Unpin pydantic (#24955)
Ensure safety of scan fast-count IR lowering in streaming (#24953)
Re-use iterators in set_ operations (#24850)
Remove GroupByPartitioned and dispatch to streaming engine (#24903)
Turn element() into {A,}Expr::Element (#24885)
Pass ScanOptions to new_from_ipc (#24893)
Update tests to be index type agnostic (#24891)
Unset Context in Window expression (#24875)
Fix failing delta test (#24867)
Move FunctionExpr dispatch from plan to expr (#24839)
Fix SQL test giving wrong error message (#24835)
Consolidate dtype paths in ApplyExpr (#24825)
Add days_in_month to documentation (#24822)
Enable ruff D417 lint (#24814)
Turn pl.format into proper elementwise expression (#24811)
Fix remote benchmark by no-longer saving builds (#24812)
Refactor ApplyExpr in group_by context on multiple inputs (#24520)
IR text plan graph generator (#24733)
Temporarily pin pydantic to fix CI (#24797)
Extend and rename rolling groups to overlapping (#24577)
Refactor DataType proptest strategies (#24763)
Add union to documentation (#24769)

Thank you to all our contributors for making this release possible!
@EndPositive, @EnricoMi, @JakubValtar, @Kevin-Patyk, @MarcoGorelli, @Object905, @alexander-beedie, @borchero, @carnarez, @cmdlineluser, @coastalwhite, @craigalodon, @dsprenkels, @eitsupi, @etrotta, @henryharbeck, @jordanosborn, @kdn36, @math-hiyoko, @mjanssen, @nameexhaustion, @orlp, @pavelzw, @r-brink, @ritchie46, @thomasjpfan and @williambdean
