🏆 Highlights
- new implementation for String/Binary type. (#13748)
⚠️ Deprecations
- Deprecate dtype_if_empty parameter for Series constructor (#13976)
🚀 Performance improvements
- improve string/binary reverse performance (#14016)
- add "calamine" support to read_excel, using fastexcel (~8-10x speedup) (#14000)
- optimize DataFrame.describe by presorting columns (#13822)
- elide redundant bound checks. (#13909)
- speedup boolean filter (#13905)
- speedup binview filter (#13902)
- allow python threads in read_ functions (#13886)
- improve binview filter (#13878)
- apply string view GC more conservatively (#13850)
- add optimized BinaryViewArray comparison kernels (#13839)
- lazy cache binview bytes len (#13830)
- fast-path for eager int_range (#13811)
- Optimize arr.sum for inner non-null bool (#13800)
✨ Enhancements
- Add UnstableWarning for unstable functionality (#13948)
- DataFrame supports explode by array column (#13958)
- add "calamine" support to read_excel, using fastexcel (~8-10x speedup) (#14000)
- improve binary formatting (#13981)
- preserve Enum information when going to IPC (#13943)
- support calling describe on a LazyFrame (#13982)
- support kwargs in plugin 'field' functions and raise error on unsupported binview layout (#13944)
- support cast decimal to utf8 (#13829)
- add SQL support for timestamp precision modifier (#13936)
- support negative indexing and expressions for LEFT, RIGHT and SUBSTR SQL string funcs (#13888)
- Introduce explode for ArrayNameSpace (#13923)
- unify Series/DataFrame describe code (#13720)
- raise better error message for .dt.time on Date column (#13932)
- List set_operations supports float (#13920)
- Add ignore_nulls for arr.join (#13919)
- register 'set_sorted' as batch/elementwise (#13896)
- move Enum/Categorical categories to binview (#13882)
- Add ignore_nulls for list.join (#13701)
- Add ignore_nulls for pl.concat_str (#13877)
- Align int_range and int_ranges signatures (#13867)
- fix parquet for binview (#13873)
- support mmap for binview in OOC (#13872)
- implement ffi for binview (#13871)
- Support zero fill null strategy for binary and string columns (#13869)
- allow df.rename and lf.rename to take a renaming function (#13708)
- Implement/fix unary minus operator -pl.col(...) (#13776)
- extend SQL EXTRACT with "century", "millennium", and "timezone" parts (#13634)
- fix binview ipc format (#13842)
- add SQL support for numeric and/or decimal types (#13739)
- improve panic message (#13836)
- Expressify str.zfill (#13790)
- new implementation for String/Binary type. (#13748)
- Add typing to hvplot plot namespace (#13813)
- Add nulls_last for Series.sort (#13794)
- allow ftp URLs, improve URL check (#13781)
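Several entries above add an ignore_nulls option to string-joining operations (arr.join, list.join, pl.concat_str). As a rough illustration of what such a flag usually means, here is a minimal pure-Python sketch. The helper join_strings is hypothetical and is not part of Polars; the default value chosen for ignore_nulls is an assumption made for this sketch, and the library's own defaults may differ per function.

```python
from typing import Optional, Sequence


def join_strings(
    parts: Sequence[Optional[str]],
    separator: str = "",
    ignore_nulls: bool = False,  # assumed default, for illustration only
) -> Optional[str]:
    """Pure-Python model of a null-aware string join (hypothetical helper)."""
    if ignore_nulls:
        # Skip missing values and join whatever is left.
        return separator.join(p for p in parts if p is not None)
    # Propagate nulls: one missing value makes the whole result missing.
    if any(p is None for p in parts):
        return None
    return separator.join(parts)


row = ["a", None, "c"]
print(join_strings(row, "-", ignore_nulls=True))   # a-c
print(join_strings(row, "-", ignore_nulls=False))  # None
```

In Polars itself, the corresponding option is passed as a keyword argument, for example ignore_nulls=True on pl.concat_str; check the API reference for the exact signature in your version.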
🐞 Bug fixes
- count matches on list categorical (#14021)
- list.min/max with empty and/or None elements (#14018)
- Make to_pandas() work for DataFrame and Series with dtype Object (#13910)
- raise for pl.concat(how="align") when no columns are shared between frames (#13941)
- Fix casting from categorical to numeric (#13957)
- read_csv preserve whitespace and newlines (#13934)
- omit implicit 'site' from import-timing test (#14009)
- append decimal with different scale (#13977)
- Use date_as_object=False as default for Series.to_pandas (just like DataFrame.to_pandas) (#13984)
- serialize decimal type (#13997)
- check input type for arr/list.contains (#13959)
- Fix max_colname_length formatting in glimpse() (#13969)
- Allow dtype merge when inner dtype is enum (#13938)
- recurse less in streaming shared sinks (#13930)
- ensure order is preserved if streaming from different sources (#13922)
- Fix is_not_null for Struct columns (#13921)
- convert object-dtyped NumPy str/bytes arrays to pl.String/pl.Binary instead of pl.Object (#13712)
- allow extract of numeric from str AnyValue (#13865)
- single-element .dt.time() and .dt.date() should always preserve sortedness (#13808)
- prune empty chunks before set operations (#13898)
- treat null columns as zero in sum_horizontal (#13880)
- include null count in rolling window validity with min_periods (#13863)
- Fix interchange protocol for new String type (#13881)
- parquet hybrid RLE encoding did not always align to bit width (#13883)
- Add ignore_nulls for list.join (#13701)
- .dt.time() was panicking for datetimes prior to unix epoch (#13812)
- allow list creation of decimals (#13851)
- ensure kwargs filter behaviour matches docstring (expect equivalence with eq) (#13864)
- Implement abs for Decimal, error on Date/Time/Datetime (#13821)
- rolling nested groups deadlock (#13835)
- gather_every should work on agg context (#13810)
- Fix segfault of is_in (#13814)
- don't panic on full null qcut (#13815)
- validate operator arithmetic with None, fix Series edge-case (#13780)
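One of the fixes above makes sum_horizontal treat null columns as zero. The following pure-Python sketch models that behaviour on plain lists. The function name is reused for readability only; it is a hypothetical stand-in and not the Polars implementation, and it assumes integer columns of equal length.

```python
from typing import List, Optional


def sum_horizontal(columns: List[List[Optional[int]]]) -> List[int]:
    """Row-wise sum across equally long columns, counting None as 0.

    Hypothetical model for illustration; not the Polars implementation.
    """
    return [
        sum(0 if value is None else value for value in row)
        for row in zip(*columns)  # zip(*columns) yields one tuple per row
    ]


a = [1, 2, None]
all_null = [None, None, None]  # a column that contains only nulls
b = [10, None, 30]
print(sum_horizontal([a, all_null, b]))  # [11, 2, 30]
```

The all-null column contributes nothing to any row, so the result is the same as if that column had been left out.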
📖 Documentation
- Add missing doc entries (#14006)
- add missing len to rst file (#13999)
- Improve structure of user guide (#13951)
- Improve structure of user guide (#13639)
- Introduce ecosystem page in user guide (#13903)
- Mention deltalake write support in README (#13890)
- use proper argument names in the code blocks of api.rst (#13866)
🛠️ Other improvements
- make Enums an actual datatype (#14011)
- omit implicit 'site' from import-timing test (#14009)
- Constructor improvements - part 1 (#14001)
- Add glimpse test (#13979)
- Move PyO3 ChunkedArray conversion logic into its own module (#13973)
- Fix xdist streaming group (#13974)
- Fix spurious test failures (#13961)
- minor describe tidy-up, and slight rewording of some Exception docstrings (#13942)
- Fix pip warning filter return code (#13935)
- Minor refactor of PyO3 conversions module (#13929)
- move filter to polars-compute (#13897)
- Revert pandas warning filter (#13893)
- Make functions in expr/general non-anonymous (#13832)
- Fix doctests (#13831)
- Refactor Python release workflow (#13807)
Thank you to all our contributors for making this release possible!
@ByteNybbler, @JulianCologne, @MarcoGorelli, @Wainberg, @alexander-beedie, @c-peters, @dependabot, @dependabot[bot], @edavisau, @flisky, @ion-elgreco, @itamarst, @jacksonthall22, @kstoneriv3, @mcrumiller, @mkucijan, @nameexhaustion, @orlp, @petrosbar, @r-brink, @reswqa, @ritchie46, @stinodego, @taki-mekhalfa, @thomasaarholt and @valorien