github pola-rs/polars py-1.6.0
Python Polars 1.6.0

💥 Unstable Breaking changes

These APIs were marked unstable and are allowed to change.

  • Use Altair in DataFrame.plot (#17995)

🚀 Performance improvements

  • Parquet do not copy uncompressed pages (#18441)
  • Several large parquet optimizations (#18437)
  • Batch Plain Parquet UTF-8 verification (#18397)
  • Partition metadata for parquet statistic loading (#18343)
  • Fix accidental quadratic parquet metadata (#18327)
  • Lazy decompress Parquet pages (#18326)
  • Don't rechunk aligned chunks in owned_binary_chunk_align (#18314)
  • Batch DELTA_LENGTH_BYTE_ARRAY decoding (#18299)
  • Slice pushdown for SimpleProjection (#18296)
  • Use direct path for time/timedelta literals (#18223)
  • Speedup ndjson reader ~40% (#18197)
  • Skip parquet page when unneeded (#18192)

✨ Enhancements

  • Use Altair in DataFrame.plot (#17995)
  • Allow mapping as syntactic sugar in str.replace_many (#18214)
  • Respect input time zone if input is pandas Timestamp (#18346)
  • Improve Schema and DataType interop with Python types (#18308)
  • Add POLARS_BACKTRACE_IN_ERR for debugging (#18333)
  • IR serde (#18298)
  • Improve decimal_comma error message (#18269)
  • Support pre-signed URLs for cloud scan (#18274)
  • Support the most recent version of "duckdb_engine" connections via read_database (#18277)
  • Support empty structs (#18249)
  • Allow float in interpolate_by by column (#18015)
  • Make show_versions more responsive (#18208)

🐞 Bug fixes

  • Enable CSE in eager if struct are expanded (#18426)
  • Treat explode as gather (#18431)
  • Parquet nested values that span several pages (#18407)
  • Support reading empty parquet files (#18392)
  • Recurse on map field during type conversion (#15075)
  • Allow search_sorted on boolean series (#18387)
  • Mark Expr.(lower|upper)_bound as returning scalar (#18383)
  • Fix compressed ndjson row count (#18371)
  • Use correct column names when there are no value columns in unpivot (#18340)
  • Parquet several smaller issues (#18325)
  • Fix group-by slice on all keys (#18324)
  • Compute joint null mask before calling rolling corr/cov stats (#18246)
  • Several scan_parquet(parallel='prefiltered') problems (#18278)
  • Json feature flag missing imports (#18305)
  • Check groups in group-by filter (#18300)
  • Parquet delta encoding for 0-bitwidth miniblocks (#18289)
  • Arguments for upsample only have to be sorted within groups (#18264)
  • Use appropriate bins in hist when bin_count specified (#16942)
  • Raise suitable error on unsupported SQL set op syntax (#18205)
  • Fix invalid state due to cached IR (#18262)
  • Fix failed AWS credential load from '~/.aws/credentials' due to formatting (#18259)
  • Fix panic streaming parquet scan from cloud with slice (#18202)
  • Consistently round half-way points down in dt.round (#18245)
  • Fix duplicate column output and panic for include_file_paths (#18255)
  • Fix unit null rank (#18252)
  • Use physical for row-encoding (#18251)
  • Convert date and datetime in literal construction (#16018)
  • Fix gather str as lit (#18207)

📖 Documentation

  • Add date_range and datetime_ranges examples without eager=True (#18379)
  • Fix incorrect comments in group_by_dynamic (#18415)
  • Alphabetise methods in Python API reference (#18380)
  • Document POLARS_BACKTRACE_IN_ERR env var (#18354)
  • Add missing aggregation entries (#18334) (#18341)
  • Add missing Series methods to API reference (#18312)
  • Document DataFrame.__getitem__ and Series.__getitem__ (#18309)
  • Fix typos and add see also links to struct name expressions (#18282)
  • Improve decimal_comma error message (#18269)
  • Clarify coalesce behaviour in join_asof (#18273)
  • Add note to Expr.shuffle differentiating from df method (#18266)
  • Improve formatting and consistency of various docstrings (#18237)
  • Add missing "Parameters" section to bin.size expr docstring (#18222)
  • Fix column name output in example of DataFrame.map_rows (#18227)

📦 Build system

  • Bump Rust toolchain to nightly-2024-08-26 (#18370)

🛠️ Other improvements

  • Address spurious hypothesis test failure (#18434)
  • Turn all Binary/Utf8 into BinaryView/Utf8View in Parquet (#18331)
  • Fix the required version of rust in README.md (#18357)
  • Remove unused Parquet indexes (#18329)
  • Deprecate serialize json for LazyFrame (#18283)
  • Don't add sink node to cloud query (#18280)
  • Split py-polars crate (#18204)
  • Fix test for new deltalake release (#18211)
  • Update the required version of rust in README.md (#18203)
  • Fix version bifurcation for test_read_database_cx_credentials (#18220)
  • Use or_else for raising (#18206)
  • Remove unused Parquet source files (#18193)

Thank you to all our contributors for making this release possible!
@BartSchuurmans, @ChayimFriedman2, @MarcoGorelli, @StepfenShawn, @agossard, @alexander-beedie, @cgbur, @coastalwhite, @corwinjoy, @deanm0000, @henryharbeck, @ion-elgreco, @jqnatividad, @krasnobaev, @liufeimath, @markxwang, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @stinodego, @sunadase, @thomascamminady and @wence-
