github pola-rs/polars py-1.5.0
Python Polars 1.5.0

3 months ago

🚀 Performance improvements

  • Improve binview extend/ifthenelse (#18164)
  • Start on better Parquet delta decoding (#18049)
  • Rechunk group-by __iter__ (#18162)
  • Tune jemalloc to not create muzzy pages (#18148)
  • Reduce default async thread count (#18142)
  • Make expensive selector expansion lazy (#18118)
  • Use single threaded algorithms if only 1 core given (#18101)
  • Use Arc<Vec<_>> instead of Arc<[_]> for paths and hive partitions (#18066)
  • SIMD View from FixedSizeBinary (#18059)
  • Use bitmask to filter Parquet predicate-pushdown items (#17993)
  • Zerocopy buffers for FixedSizeBinary to BinaryView cast (#18043)

✨ Enhancements

  • Create literals for datetime/date expressions (#18184)
  • Create literals in 'datetime' expression (#18182)
  • Expose top-level "has_header" param for read_excel and read_ods (#18078)
  • Raise on invalid 'is_between' and improve error message quality (#18147)

🐞 Bug fixes

  • Fix struct shift and list builder (#18189)
  • Don't load Parquet nested metadata (#18183)
  • Throw bigidx error for Parquet row-count (#18154)
  • Fix unpivot on empty df (#18179)
  • Don't vertically parallelize cse contexts (#18177)
  • Ensure default values are included when saving/restoring the current Config state (#18151)
  • Properly handle empty Parquet row groups with no dictionary (#18161)
  • Struct outer nullability (#18156)
  • Fix pyarrow predicate pushdown regression (#18145)
  • Prevent unwanted supertype cast in 'search_sorted' (#18143)
  • Parquet with filter=None (#18139)
  • Don't raise when converting from pandas if index contains duplicate names when include_index=False (the default) (#18133)
  • Fix Float-to-String cast where the Float is not turned into an Integer before being turned into a String (#18123)
  • Don't remove leading whitespace in read_csv (#18131)
  • Py-polars compilation with no features (#18129)
  • String transform to_titlecase was too narrowly defined (#18122)
  • Reading Parquet with Null dictionary page (#18112)
  • When setting write_excel column totals, don't forget to include any row-total cols (#18042)
  • Incorrect lazy CSV select(len()) for compressed files (#18067)
  • Fix sink_ipc_cloud panicking with runtime error (#18091)
  • Properly write Parquet for sliced lists (#18073)
  • Panic reading multiple CSV files from cloud (#18056)
  • Fix CloudWriter to use buffer before making requests (#18027)
  • Fix typos and remove trailing whitespace (#18024)
  • Handle cfg(feature) for shrink_dtype (#18038)

📖 Documentation

  • Fix references to old methods in lazy docstring (#18178)
  • Include PyCapsule Interface in DataFrame and Series API docs (#18174)
  • Corrected example result in group_by docs (#18169)
  • Mention 'Array' in data types overview (#18060)
  • Correct concat rechunk in user guide (#18080)
  • Fix typo in title of Hugging Face docs page (#18097)
  • Update pivot docstring for clarity (#18000)

🛠️ Other improvements

  • Remove unneeded growable (#18165)
  • Update Cargo.lock to fix build error on Linux (#18153)
  • Remove Nth, Wildcard from ExprIR and make conversion fallible (#18115)

Thank you to all our contributors for making this release possible!
@EricTulowetzke, @KDruzhkin, @MarcoGorelli, @Vincenthays, @alexander-beedie, @coastalwhite, @davanstrien, @deanm0000, @ember91, @kylebarron, @mcrumiller, @nameexhaustion, @orlp, @philss, @ritchie46 and @rosstitmarsh
