🚀 Performance improvements
- Improved type-inference for `read_excel` and `read_ods`, use calamine engine for `read_ods` (#15808)
- Fix quadratic in binview growable same source (#15734)
- use two binary searches for equality mask when data is sorted (#15702)
- improve filter parallelism (#15686)
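
The sorted-data equality mask item (#15702) can be illustrated with a small sketch. When the values are already sorted, every element equal to the target sits in one contiguous run, so two binary searches are enough to find the run's boundaries without comparing each element. This is a plain-Python illustration of the idea only, not the Polars implementation; the function name `equality_mask_sorted` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def equality_mask_sorted(values, target):
    """Return a boolean mask of `values == target`.

    Assumes `values` is sorted in ascending order (hypothetical helper,
    not part of any library).
    """
    # First binary search: index of the first element >= target.
    lo = bisect_left(values, target)
    # Second binary search: index of the first element > target.
    hi = bisect_right(values, target)
    # Everything in [lo, hi) equals target; everything else does not.
    return [False] * lo + [True] * (hi - lo) + [False] * (len(values) - hi)


data = [1, 2, 2, 2, 5, 7]
mask = equality_mask_sorted(data, 2)
```

The two searches cost O(log n) comparisons; only writing out the mask itself remains linear in the length of the data.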
✨ Enhancements
- Minor type-inference update for `read_database` (#15809)
- Improved type-inference for `read_excel` and `read_ods`, use calamine engine for `read_ods` (#15808)
- `dt.truncate` supports broadcasting lhs (#15768)
- Expressify `str.json_path_match` (#15764)
- raise if `storage_options` is passed to read_csv but `fsspec` isn't available (#15778)
- Support decimal float parsing in CSV (#15774)
- Add context trace to `LazyFrame` conversion errors (#15761)
- Improve error message when passing invalid input to `lit` (#15718)
- Remove outdated join validation checks (#15701)
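
As a rough illustration of what decimal float parsing in CSV (#15774) refers to: some locales write floats with a comma as the decimal separator, which usually means the field delimiter is something other than a comma. The sketch below uses only the Python standard library to show the concept. It is not how Polars parses CSV, and the helper name `parse_decimal_comma` and the sample data are made up for the example.

```python
import csv
import io


def parse_decimal_comma(text: str) -> float:
    """Parse a float written with a comma as the decimal separator.

    Hypothetical helper for illustration; assumes no thousands separators.
    """
    return float(text.replace(",", "."))


# Sample data (invented): semicolon-delimited, comma as decimal separator.
raw = "id;price\n1;3,14\n2;0,5\n"

rows = list(csv.DictReader(io.StringIO(raw), delimiter=";"))
prices = [parse_decimal_comma(row["price"]) for row in rows]
```

Note that a later item in these notes (#15817) renames `decimal_float` to `decimal_comma`.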
🐞 Bug fixes
- drop-nulls edge case; remove drop-nulls special case (#15815)
- ewm_mean_by was skipping initial nulls when it was already sorted by "by" column (#15812)
- Consult cgroups to determine free memory (#15798)
- raise if index count like 2i is used when performing rolling, group_by_dynamic, upsample, or other temporal operations (#15751)
- Don't deduplicate sort that has slice pushdown (#15784)
- Allow passing files opened by fsspec in `read_parquet` (#15770)
- Fix incorrect `is_between` pushdown to `scan_pyarrow_dataset` (#15769)
- Handle null index correctly for list take (#15737)
- Preserve lexical ordering on concat (#15753)
- Remove incorrect unsafe pointer cast for int -> enum (#15740)
- pass series name to apply for cut/qcut (#15715)
- count of null column shouldn't panic in agg context (#15710)
- manual cache (#15711)
- Ensure we don't hold onto Mutex when grabbing join tuples (#15704)
- allow null dtypes in UDFs if they match the schema (#15699)
- Respect join_null argument for semi/anti joins (#15696)
- Ensure we don't hold RwLock when spawning group parallelism in w… (#15697)
- Ensure empty with_columns is a no-op (#15694)
- Include predicate in cache state union (#15693)
- Add the missing feature flag for `ewm_mean_by` (#15687)
- 8/16-bits int could also apply in place for log expr (#15680)
- `prepare_expression_for_context` shouldn't panic if exceptions raised from optimizer (#15681)
📖 Documentation
- Add docstring examples for datetimes (#13161) (#15804)
- Fix a typo in categorical section of the user guide (#15777)
- Fix a docstring mistake for DataType.is_float (#15773)
- Remove incorrect "1i (1 index count)" from some docs methods (#15750)
- Add example for `Config.set_tbl_width_chars` (#15566)
- Align docstring phrasing in `Series/Expr.dt.truncate/round` (#15698)
- Various deprecation docstring improvements (#15648)
🛠️ Other improvements
- Always expand horizontal_any/all (#15816)
- Rename decimal_float to decimal_comma (#15817)
- Split coverage calculation (#15780)
- Update readme (#15787)
- Start using the new Bound<> API from PyO3 (#15752)
- Make `json_path_match` expr non-anonymous (#15682)
Thank you to all our contributors for making this release possible!
@MarcoGorelli, @NedJWestern, @Robinsane, @TobiasDummschat, @alexander-beedie, @c-peters, @dependabot, @dependabot[bot], @gasmith, @henryharbeck, @itamarst, @kszlim, @mbuhidar, @nameexhaustion, @orlp, @reswqa, @ritchie46, @stinodego and @wsyxbcl