What's Changed
- partial support for list arithmetic by @ritchie46 in #3307
- shuffle sample option by @ritchie46 in #3308
- improve predicate pushdown by @ritchie46 in #3313
- Improve partitioned agg by @ritchie46 in #3314
- list to struct by @ritchie46 in #3317
- oncecell in favor of lazy_static by @ritchie46 in #3319
- Update cummax documentation by @briandk in #3323
- scan pyarrow dataset by @ritchie46 in #3327
- fix panic in csv parser by @ritchie46 in #3339
- implement anyvalue -> datatype for all variants by @ritchie46 in #3340
- remove badge by @ritchie46 in #3341
- Added PartitionedWriter for disk partitioning. by @illumination-k in #3331
- Fast json by @universalmind303 in #3324
- add hash to rust expressions by @ritchie46 in #3350
- serde for group options by @elferherrera in #3349
- Check if length of index in pivot operation is non-zero. Fixes: #3343. by @ghuls in #3346
- improve agg_list performance of chunked numerical data by @ritchie46 in #3351
- Fix init of DataFrame with empty dataset (eg:"[]") and column/schema typedefs by @alexander-beedie in #3353
- rechunk on default sort and groupby by @ritchie46 in #3354
- more partitioned groupby by @ritchie46 in #3355
- Add extension_module in python example by @Maxyme in #3358
- allow join on same cat source by @ritchie46 in #3363
- fix rename same name by @ritchie46 in #3364
- initial timezone support by @ritchie46 in #3357
- pivot index maintain logical type by @ritchie46 in #3367
- use array_ref in favor of chunks by @ritchie46 in #3368
- entropy normalization arg by @ritchie46 in #3369
- categorical keep type in comparison by @ritchie46 in #3370
- rechunk in asof and allow concat to empty df by @ritchie46 in #3376
- improve overflow of numeric mean by @ritchie46 in #3377
- fix parquet stats by @ritchie46 in #3378
- delay rechunk optimization by @ritchie46 in #3381
- Allow Z in native strptime by @ritchie46 in #3382
- more partitioned aggregators by @ritchie46 in #3385
- improve partition_by by @ritchie46 in #3386
- Add overload support to partition_by. by @ghuls in #3388
- Check if some arguments for read_csv and scan_csv got a 1 byte input. by @ghuls in #3389
- fix rayon SO in partition_by by @ritchie46 in #3391
- fix bug in predicate pushdown on dependent predicates by @ritchie46 in #3394
- fix predicate pushdown for predicates that do aggregations by @ritchie46 in #3396
- cumulative_eval by @ritchie46 in #3400
- ensure that Cast expressions first updates groups before it flattens by @ritchie46 in #3401
- improve and simplify ternary aggregation by @ritchie46 in #3403
- fix explode empty df by @ritchie46 in #3405
- Improve list builders, iteration and construction by @ritchie46 in #3419
- feature gate timezones by @ritchie46 in #3422
- fix cumulative_eval on window expressions by @ritchie46 in #3421
- csv allow only header and fix lazy rename by @ritchie46 in #3423
- upgrade arrow by @ritchie46 in #3425
- infer dtype of empty list in recursive list construction & fix struct.arr take by @ritchie46 in #3433
- fix struct list concat by @ritchie46 in #3435
- csv parser fallback on chrono if datetime pattern fails by @ritchie46 in #3436
- improve rolling_quantile kernel (no nulls) ~28x by @ritchie46 in #3437
- improve rolling_{min/max/sum/mean} performance ~3.4x by @ritchie46 in #3444
- struct add chunk and impl reverse by @ritchie46 in #3445
- fix struct equality by @ritchie46 in #3446
- Struct error on different dict orders by @ritchie46 in #3447
- Inherit Exception in fallback exception classes by @adamgreg in #3450
- Struct creations/append/extend stricter schema by @ritchie46 in #3454
- don't allow predicate pushdown if compared column is being coerced by @ritchie46 in #3457
- improve rolling_min/max for columns with null values by @ritchie46 in #3458
- Improve rolling_sum/rolling_mean for windows with null values. by @ritchie46 in #3466
- explode series after slide fast path by @ritchie46 in #3467
- Improve struct by @ritchie46 in #3468
- improve rolling_var performance by @ritchie46 in #3470
- power by expression and improve rust lazy ergonomics by @ritchie46 in #3475
- add specialized rolling_std kernel by @ritchie46 in #3476
- fix null commutativity by @ritchie46 in #3479
- use anyvalue if first apply list result is empty by @ritchie46 in #3480
- Added describe method to rust library by @glennpierce in #3320
- Groupby Optimization for sorted keys: ~15x perf gain. by @ritchie46 in #3489
- make cat merge fallible and loosen restrictions on categorical appends by @ritchie46 in #3491
- Fix LazyFrame.join_asof documentation reference by @adamgreg in #3493
- feat: support pl.Time in Series.str.strptime by @fsimkovic in #3496
- str().extract_all / str().count_match by @ritchie46 in #3507
- add apply to cookbooks by @ritchie46 in #3504
- support all arrow dictionary keys < 64 bit by @ritchie46 in #3508
- fix accidental quadratic behavior in rolling_groupby by @ritchie46 in #3510
- Fix some unit test deprecation warnings by @adamgreg in #3503
- Experimental: Allow rolling_<agg> expressions to determine window size by another {Date, Datetime} series. by @ritchie46 in #3514
- use specialized kernels in rolling_groupby aggregation ~10x perf gain (window of 100 elements) by @ritchie46 in #3515
- reduce probability of quadratic behavior in min/max rolling by @ritchie46 in #3516
- adjust for kleene logic in drop_na by @ritchie46 in #3529
- fix aggregation of empty list by @ritchie46 in #3527
- fix sorting of chunked numeric arrays by @ritchie46 in #3528
- adjust for kleene logic in drop_na by @ritchie46 in #3530
- Improve rolling min max by @ritchie46 in #3531
- fix null aggregation edge case by @ritchie46 in #3536
- allow concat/append expressions by @ritchie46 in #3541
- make sort by multiple columns parallel by @ritchie46 in #3549
- allow more aggregations on dtype duration by @ritchie46 in #3550
- use first series to validate length by @ritchie46 in #3551
- Raise a more helpful TypeError when trying to subscript a LazyFrame. by @ghuls in #3554
- Readability Fixes r2 by @ryanrussell in #3556
- add count_match, extract_all to python ref guide by @ritchie46 in #3558
- fill_null limits by @ritchie46 in #3559
- test sortedness propagation by @ritchie46 in #3560
- update boolean aggregates and ensure they return IdxSize by @ritchie46 in #3563
- Improve parse_lines error message. by @ghuls in #3569
- sorted_merge_join by @ritchie46 in #3505
- Rust Readability Improvements by @ryanrussell in #3573
- fix invalid fast path of sorted joins and improve sortedness propagation by @ritchie46 in #3577
- prevent expensive type coercion in expression and fix when->then->oth… by @ritchie46 in #3579
- Updated the fmt feature flag error message by @TheDan64 in #3586
- Fix u16 Series formatting. by @ghuls in #3584
- update arrow to crates.io: ~2x json parsing improvement by @ritchie46 in #3588
New Contributors
- @kianmeng made their first contribution in #3311
- @briandk made their first contribution in #3323
- @EwoutH made their first contribution in #3352
- @adamgreg made their first contribution in #3450
- @ryanrussell made their first contribution in #3488
- @fsimkovic made their first contribution in #3496
- @chitralverma made their first contribution in #3578
- @TheDan64 made their first contribution in #3586
Full Changelog: rust-polars-v0.21.1...rust-polars-v0.22.1