🚀 Performance improvements
- optimize join inner materialization of single keys (#8405)
- parallelize sorted group tuple materialization (#8387)
- improve materialization of huge cardinality group tuples (#8382)
- improve group_tuples materialization (#8375)
- conversion speedups from polars int64 timestamps to python temporal types:
- use online variance kernel for aggregation (#8306)
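
For readers unfamiliar with the term in the last entry, an "online" variance kernel computes variance in a single pass, updating a running state one value at a time instead of first computing the mean and then revisiting the data. The sketch below uses Welford's algorithm, a common choice for this. It is an illustration only: the class and function names are hypothetical, and it is not claimed to be the kernel introduced in #8306.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class OnlineVariance:
    """Running mean/variance state (Welford's single-pass update).

    Hypothetical helper for illustration; not the Polars kernel.
    """

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Fold one new observation into the running state.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the *updated* mean, which keeps the recurrence stable.
        self.m2 += delta * (x - self.mean)

    def variance(self, ddof: int = 1) -> float:
        # ddof=1 gives the sample variance, ddof=0 the population variance.
        if self.count <= ddof:
            return float("nan")
        return self.m2 / (self.count - ddof)


def online_variance(values: Iterable[float], ddof: int = 1) -> float:
    """Compute the variance of `values` in one pass."""
    acc = OnlineVariance()
    for v in values:
        acc.update(v)
    return acc.variance(ddof)
```

Because the state is just three numbers, this style of kernel suits aggregations where values arrive in chunks or groups, and it avoids the cancellation problems of the naive "sum of squares minus square of sum" formula.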
✨ Enhancements
- allow existing `item` method to optionally take row/col indices (#8412)
- allow negative 'arange' expression (#8413)
- warn if argument is not explicitly sorted (#8409)
- .to_numpy(use_pyarrow=False) for Object and Boolean (#8397)
- new hypothesis strategy that can generate data for `List` dtypes (#8400)
- offer cleaner usage pattern for `Config` object in context-manager context (#8394)
- add support for SQL "IN" expr (#8396)
- add a "signed" param to `Series.is_integer` (#8383)
- add is_integer (#8373)
- raise error on invalid dict aggregation (#8371)
- cli output mode & sql read_json (#8336)
- more informative keyerror on invalid getitem (#8320)
🐞 Bug fixes
- infer supertype in json serde (#8411)
- duration on empty df (#8403)
- don't inadvertently set `Series` initialised with nested tuple data as `Object` dtype (#8401)
- use physical in streaming unique global table (#8390)
- recursively bubble up all dtypes in list cast (#8386)
- is_in struct logical types (#8378)
- fix nested null parquet read (#8372)
- fix logical type in ListChunked::new_from_index (#8367)
- fix unintentional loading of hypothesis profile (#8362)
- bubble up logical type in recursive list cast (#8356)
- ensure that `iter_rows` doesn't return nested `Timestamp` values (#8359)
- implement clone_inner for all series (#8357)
- add missing `__hash__` support to `Field`, include "time_zone" in `Datetime` hash, fix `Struct` hash (#8354)
- fix fill_null for categorical (#8353)
- time.cast(str) as strftime (#8351)
- fix logical dtypes in parallel list collection (#8349)
- improve logical types of explode operation (#8348)
- logical type in anonymous list builders (#8346)
- address potential error caused by float division on time_unit scaling (#8337)
- escape csv header names if they contain special chars (#8331)
- nested struct/list/categorical logical/physical (#8334)
- fix struct schema argument (#8327)
- fix precision issue when converting pl.Datetime("ms") to Python datetime (#8332)
- fix deserialize empty list (#8326)
- List<Null> consistency (#8325)
- fix coalesce schema (#8324)
- don't do null propagation (#8322)
- validate `window_size` user input in rolling_expr (#8318)
- ensure invalid list eval raises (#8317)
- fix typing overloads of `read_excel` (#8300)
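
Two of the fixes above (#8337 and #8332) concern precision when integer timestamps are scaled between time units or converted to Python `datetime` objects. As a rough illustration of the general idea, and not the actual change made in those PRs, the sketch below converts an integer epoch timestamp using integer arithmetic only, so no float division is involved. The function name and the `time_unit` strings are assumptions chosen for this example.

```python
from datetime import datetime, timedelta, timezone

# Reference point for epoch-based integer timestamps.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timestamp_to_datetime(value: int, time_unit: str = "ms") -> datetime:
    """Convert an integer epoch timestamp to a timezone-aware datetime.

    Hypothetical helper: scales to whole microseconds with integer
    arithmetic, then lets `timedelta` do exact integer normalisation.
    """
    if time_unit == "ms":
        micros = value * 1_000
    elif time_unit == "us":
        micros = value
    elif time_unit == "ns":
        # Python datetimes stop at microseconds; floor the remainder.
        micros = value // 1_000
    else:
        raise ValueError(f"unsupported time_unit: {time_unit!r}")
    return EPOCH + timedelta(microseconds=micros)
```

Dividing by a float factor (for example `value / 1000`) and passing the result to `datetime.fromtimestamp` can round differently from the stored integer for large or high-resolution values; keeping the whole computation in integers sidesteps that.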
🛠️ Other improvements
- new hypothesis strategy that can generate data for `List` dtypes (#8400)
- update `duration` docstring/example (#8392)
- Upgrade ruff (#8380)
- enhanced parametric testing for temporal dtypes (#8347)
- Minor update to `strptime` (#8345)
- adjust pytest config so as not to inadvertently prevent test debugging in IPython consoles (#8308)
- add newline in pl.DataFrame.pivot docs (#8307)
Thank you to all our contributors for making this release possible!
@JoonHong-Kim, @MarcoGorelli, @StefanBRas, @alexander-beedie, @avimallu, @grantmcdermott, @jonashaag, @rben01, @ritchie46, @stinodego and @universalmind303