🚀 Performance improvements
- add new when-then-otherwise kernels (#15089)
- Coerce sorted flag of unit arrays during concat (#15104)
- Use sorted flag for `(first|last)_non_null` (#15050)
- OOC sort improvements (#14994)
✨ Enhancements
- raise if both `closed` and `by` are passed to `rolling_*` aggregations (#15108)
- raise informative error for rolling_* aggs with `by` of invalid dtype (#15088)
- add `non_existent` arg to `replace_time_zone` (#15062)
- Support single nested row encodings (#15105)
- make ooc sort configurable (#15084)
- Make `register_plugin` a standalone function and include shared lib discovery (#14804)
- Async parquet: Decode parquet on a blocking thread pool (#15083)
- let "ambiguous" take "null" value (#14961)
- Raise informative error message when join would introduce duplicate column name (#15042)
- Allow cast of decimal to boolean (#15015)
- Return error when no supertype can be determined in AnyValue constructor when `strict=false` (#15025)
- Implement IpcReaderAsync (#14984)
- Support Array statistics in parquet (#15031)
- Support decimal groupby (#15000)
- Add thread names to rayon thread pool (#15024)
- Support decimal uniq (#15001)
- expose timings in verbose state of OOC sort (#14979)
🐞 Bug fixes
- Fix Series construction from nested list with mixed data types (#15046)
- Support BinaryView in row decoder to prevent a panic in streaming group by (#15117)
- Binview chunked gather; don't modify inlined view (#15124)
- Fix chunked_id gather for binview buffers (#15123)
- Don't cache HTTP object stores as they maintain URL state (#15121)
- use wrapping_add in csv line snooping (#15109)
- Output `u32` when `sum_horizontal` provided with single boolean column (#15114)
- Ensure `eprintln!` is only called within debug/verbose context (#15100)
- Propagate error instead of panicking when calling `product` on an invalid type (#15093)
- Raise error when casting Array to different width (#14995)
- Fix file scan bugs for ipc, csv and parquet that occur with combinations of glob paths, row indices and predicates (#15065)
- Incorrectly preserved sorted flag when concatenating sorted series containing nulls (#15082)
- Return largest non-NaN value for `max()` on sorted float arrays if it exists instead of NaN (#15060)
- return NaN for all-NaN min/max (#15066)
- Prevent "index out of range for slice" error in parquet reader (#15021)
- Respect `nulls_last` in streaming sort (#15061)
- Don't count nulls in streaming `count` agg (#15051)
- agg_list on decimal lost scale (#15054)
- Block predicate pushdown on equalities that are used in a join (#15055)
- Enum equality based on categories (#15053)
- Strict cast in when/then/otherwise operation (#15052)
- Don't panic in `string_addition_to_linear_concat` (#15006)
- CSV do utf8-validation after escaping fields (#15004)
- Use primitive constructors to create a Series of lists when dtype is provided (#15002)
- replace_time_zone with single-null-element "ambiguous" was panicking (#14971)
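
The two NaN-related fixes above (#15060 and #15066) describe a rule that is easy to misread: `max()` should return the largest non-NaN value when one exists, and NaN only when every value is NaN. The snippet below is a minimal plain-Python sketch of that rule, not the library's implementation. The function name `max_ignoring_nan` is hypothetical, and returning `None` for an empty input is an assumption made only for this illustration.

```python
import math


def max_ignoring_nan(values):
    """Return the largest non-NaN value, or NaN if every value is NaN.

    Hypothetical helper for illustration only. Returning None for an
    empty input is an assumption of this sketch.
    """
    best = None
    saw_nan = False
    for v in values:
        if math.isnan(v):
            # Remember that a NaN was present, but do not let it win.
            saw_nan = True
            continue
        if best is None or v > best:
            best = v
    if best is not None:
        return best
    # No non-NaN value was found: NaN if there was any input, else None.
    return math.nan if saw_nan else None


result_mixed = max_ignoring_nan([1.0, 3.0, math.nan])
result_all_nan = max_ignoring_nan([math.nan, math.nan])
```

Here `result_mixed` is the largest finite entry of the input, while `result_all_nan` is NaN because no other value is available to return.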
📖 Documentation
- Fix typo in comment (#14997)
🛠️ Other improvements
- Extend and speed up scan tests (#15127)
- always assert on ChunkedArray::get (#15120)
- Use ObjectStore instead of AsyncRead in parquet get metadata (#15069)
- Minor refactor of Rust any value constructors (#15077)
- Simplify streaming execution (#15039)
- Ensure we hit the spilled source path in ooc sort test (#15010)
- Refactor constructor code (#15009)
- Apply `clippy::assigning_clones` lint (#14999)
- fix features (#14977)
Thank you to all our contributors for making this release possible!
@JackRolfe, @MKisilyov, @MarcoGorelli, @alexander-beedie, @c-peters, @flisky, @jqnatividad, @mcrumiller, @mickvangelderen, @nameexhaustion, @orlp, @petrosbar, @ritchie46, @stinodego and @trueb2