⚠️ Deprecations
- Fix group keys in `partition_by(as_dict=True)`/`GroupBy.__iter__` in some cases (#13646)
- Rename `row_count_name`/`row_count_offset` parameters in IO functions to `row_index_*` (#13563)
- Deprecate `dt.datetime` in favor of `dt.replace_time_zone(None)` (#13520)
- Rename `with_row_count` to `with_row_index` (#13494)
- Deprecate `Expr.where` in favor of `filter` (#13440)
🚀 Performance improvements
- elide parallelism restriction on generic rolling expressions (#13662)
- ensure time groups are parallelized (#13660)
- do not eagerly compute bitcount (#13562)
- optimise SQL engine string concat (#13499)
- Refactor expression parsing logic of predicates/constraints (#13468)
- Represent `Enum` categories as Series (#13434)
- remove lifetime requirement from CategoricalChunkedBuilder (#13319)
✨ Enhancements
- write parquet ColumnOrder (#13672)
- Impl `contains` for ArrayNameSpace (#13638)
- improve `rolling()` expression formatting (#13657)
- Implement `is_between` in Rust (#11945)
- Add base `PolarsError` and `PolarsWarning` class (#13615)
- typing overloads for Series operator methods `ge, gt, ...` (#13167)
- Expressify `pattern` of `str.extract` (#13607)
- Impl `join` for ArrayNameSpace (#13586)
- add SQL engine support for string cast to `json` (#13624)
- add SQL engine support for `EXTRACT` and `DATE_PART` (#13603)
- Allow drop with no inputs as a no-op (#13460)
- add SQL engine support for `POSITION` and `STRPOS` (#13585)
- additional multi-column support for `pl.<function>` entries (#13336)
- `is_in` support for array dtype (#13559)
- add new `str.find` expression, returning the index of a regex pattern or literal substring (#13561)
- Impl and dispatch arr.first/last to get (#13536)
- Implement `from_dataframe` natively (interchange protocol) (#10701)
- add SQL engine support for `LIKE` and `ILIKE` pattern matching (#13522)
- improve hive partition pruning (#13358) (#13426)
- Add compact syntax for `int_range` starting from 0 (#13530)
- don't rechunk by default in lazy scans (#13518)
- Add `cum_count` expression function (#13478)
- add SQL engine support for `IF` control flow function (#13491)
- add SQL engine support for `MOD` function (#13502)
- return datetime for datetime mean & median (#13417)
- add SQL engine support for `CONCAT_WS` string function (#13483)
- Allow map_batches to auto-convert output NumPy arrays to Series (#13277)
- add SQL engine support for `RIGHT` and `REVERSE` string functions (#13461)
- implement `BinaryView` and `Utf8View` in `polars-arrow` (#13243)
- add SQL engine support for variadic string `CONCAT` function (#13428)
- add support for AND in SQL join-clause context (#13242)
- Impl ordering ops for array namespace (#13414)
- add SQL engine support for `REPLACE` string function (#13431)
- add SQL engine support for `SIGN` function (#13429)
- add SQL engine support for `IFNULL` function (#13432)
- additional SQL support for `bytes`, `bit`, and `hex` literals (#13389)
🐞 Bug fixes
- gather.get schema (#13679)
- Fix group keys in `partition_by(as_dict=True)`/`GroupBy.__iter__` in some cases (#13646)
- ensure we hit proper cache in nested `rolling` expressions (#13666)
- Allow `av_buffer` cast numeric record to temporal type (#13661)
- streaming cross join if swapped is hit (#13656)
- Make sure rolling key is projected when process projection (#13622)
- fix schema inference for json (#13637)
- Improve parsing of inputs for Expr dunders (#13635)
- Empty series of AggregatedList should also have list dtype (#13620)
- `Series.eq_missing` should return an Expr when the input is an Expr (#13628)
- fallback to cast kernel if `inline_cast` AnyValue raise (#13595)
- Fix formatting in `describe` for precise quantiles (#13593)
- fix reverse variable row decoding (#13587)
- Fix `scatter` for null values (#13578)
- Fix `cum_count` with regards to start value / null values (#13535)
- Fix precision/scale handling and invalid numbers in string-to-decimal conversions. (#13548)
- Treat Python `None` as null value for `Object` dtype (#13564)
- Fix `scatter` to allow single temporal inputs (#13577)
- Fix interchange protocol data buffer dtype (#10787)
- `Expr.replace` to single value did not replace NULLs (#13551)
- improve hive partition pruning (#13358) (#13426)
- fix projection pushdown for new outer join schema (#13527)
- don't raise when partial function is passed to map_elements (#13524)
- improve reading of mixed string/other dtype column data from spreadsheets with `openpyxl` and `pyxlsb` engines (#13495)
- ensure size-hint of TrueIdxIter is correct (#13508)
- correct 'outer_coalesce' logic in case of duplicate names (#13501)
- raise for out-of-range datetimes in to_datetime/strptime (#13403)
- Fix Series equality for List/Array types (#13477)
- Keep logical type when getting values from list (#13456)
- Handle duplicate/ambiguous inputs for `replace` (#13217)
- Handle empty inputs to Enum constructor (#13446)
- Fix `group_by` iteration when grouping by certain selectors (#13437)
- Fix `to_pandas` for 0x0 dataframe (#13420)
- Fix offsets for numeric types in `from_buffer` (#13398)
📖 Documentation
- Clarify documentation for the `agg_list` argument in `Expr.map_batches` (#13625)
- fix linking to feature flags in user guide (#13644)
- bring sink_ndjson docstring in line with other sink docstrings (#13636)
- Update `then` and `otherwise` docstrings with "strings are parsed as column names" (#13630)
- Add `sink_ndjson` to API reference. (#13627)
- Improve documentation on broadcasting (#13394)
- Add note about toolchain issue under native Windows (#13590)
- Hint about ruff setting in VSCode (#13421)
- Clarify examples for .transpose() (#13581)
- Add additional `Series` docstring examples (#13558)
- Doc example for `read_csv` (#13161) (#13545)
- Add more doc examples on how to create an index column (#13532)
- update SQL section of the README (#13529)
- Add note to `int_range` docs for creating an index column (#13516)
- add a note to the `read_database_uri` docstring about escaping special characters in the connection string (#13514)
- update polars-business > polars-xdt link (#13509)
- Fix various typos, grammar and formatting in docstrings and user guide (#13506)
- Doc examples for `threadpool_size` and `get_index_type` (#13496)
- Add missing datetime examples to docs (#13487)
- add polars-distance to plugins page (#13454)
- define file-like object in read_parquet docstring (#13463)
- Move `Series.struct.json_encode` to methods in Sphinx autosummary (#13443)
- Add missing examples of `series/list.py` (#13423)
- show `datetime.date` import in code block (#13419)
- clarify documentation for rle and rle_id (#13397)
- use named series in Series.plot example (#13407)
- fix alphabetical order of documentation entries (#13396)
🛠️ Other improvements
- Auto-add 'needs triage' label to bugs (#13671)
- make rolling index column visible to optimizer (#13658)
- Enable new error message lint to improve stack trace display (#13596)
- Add `Documentation`/`Build system` sections to the changelog (#13594)
- Filter unhelpful messages in `make build` (#13579)
- Remove extra line break between checkboxes in GitHub bug report issues (#13576)
- Narrow type hint for `get_index_type` util (#13556)
- Fix some test failures/slowdowns (#13504)
- pandas 2.2 compat (#13467)
- Increase timeout for gevent async test (#13448)
- Do not end docstrings with a blank line (#13193)
Thank you to all our contributors for making this release possible!
@Bromeon, @MarcNuebel, @MarcoGorelli, @ShivMunagala, @Wainberg, @aaarrti, @alexander-beedie, @bchalk101, @c-peters, @cgevans, @cmdlineluser, @collinprince, @deanm0000, @hamishs, @henryharbeck, @ion-elgreco, @jcrozum, @mcrumiller, @nameexhaustion, @orlp, @petrosbar, @r-brink, @reswqa, @ritchie46, @s-banach, @shritesh, @stinodego, @tim-stephenson and @wjandrea