github pola-rs/polars py-0.19.4
Python Polars 0.19.4

13 months ago

🏆 Highlights

⚠️ Deprecations

  • Add disable_string_cache (#11020)

🚀 Performance improvements

  • improve dynamic_groupby_iter (#11341)
  • improve and fix rolling windows by linear scanning (#11326)
  • faster init from pydantic models that have a small number of fields, and support direct init from SQLModel data (often used with FastAPI) (#11263)
  • improve outer join materialization (#11241)
  • use ryu and itoa for primitive serialization (#11193)
  • use try-binary-elementwise instead of try-binary-elementwise-values in dt_truncate (#11189)
  • Using cache for str.contains regex compilation (#11183)
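
The regex-compilation cache in #11183 lives in Polars' Rust code. The general idea can be sketched in plain Python: compile each distinct pattern once and reuse the compiled object for later rows. The helper names `compiled` and `contains` below are hypothetical and only illustrate the technique; they are not Polars APIs.

```python
from __future__ import annotations

import re
from functools import lru_cache


@lru_cache(maxsize=128)
def compiled(pattern: str) -> re.Pattern:
    # Compile once per distinct pattern; repeated calls reuse the cached object.
    return re.compile(pattern)


def contains(values: list[str | None], patterns: list[str]) -> list[bool | None]:
    # Row-wise "contains": each value is matched against its own pattern.
    # A None value is passed through as None without touching the cache.
    return [
        None if v is None else compiled(p).search(v) is not None
        for v, p in zip(values, patterns)
    ]


values = ["apple", "banana", None, "cherry"]
patterns = ["^a", "an+", "x", "^a"]
result = contains(values, patterns)
print(result)
print(compiled.cache_info())
```

The pattern `"^a"` appears twice, so the second use is served from the cache instead of being recompiled.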

✨ Enhancements

  • introduce 'label' instead of 'truncate' in group_by_dynamic, which can take label='right' (#11337)
  • Expressify list.shift (#11320)
  • top_k and bottom_k support passing an expr (#11344)
  • add "pyxlsb" engine support to read_excel (for excel binary workbook files) (#11248)
  • support 'hive partitioning' aware readers (#11284)
  • str.strip_chars supports taking an expr argument (#11313)
  • sample n can take an expr (#11257)
  • Add disable_string_cache (#11020)
  • clip supports expr arguments and physical numeric dtype (#11288)
  • Introduce list.drop_nulls (#11272)
  • str.splitn and split_exact can take an expr argument (#11275)
  • introduce ambiguous option for dt.round (#11269)
  • Adds NULLIF and COALESCE SQL functions (#11124)
  • better tree-formatting representation (#11176)
  • natively support reading parquet for aws, gcp and azure (#11210)
  • Expressify str.strip_prefix & suffix (#11197)
  • Add support for Iceberg (#10375)
  • list.join's separator can be expression (#11167)
  • argument every of datetime.truncate can be expression (#11155)
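
For readers unfamiliar with the "hive partitioning" mentioned in #11284: in a hive-style layout, column values are encoded in directory names as `key=value` segments. The sketch below is plain Python and not Polars' implementation. The function name `hive_partition_values` and the example path are hypothetical and only show how such a path maps to column values.

```python
from pathlib import PurePosixPath


def hive_partition_values(path: str) -> dict:
    # Collect key=value pairs from the directory components of the path.
    # The file name itself (last component) is not treated as a partition.
    parts = PurePosixPath(path).parts[:-1]
    values = {}
    for part in parts:
        key, sep, value = part.partition("=")
        if sep and key:
            values[key] = value
    return values


# Hypothetical dataset layout, for illustration only.
path = "dataset/year=2023/month=09/part-0.parquet"
print(hive_partition_values(path))
```

A partition-aware reader can use these values to skip whole directories that cannot match a filter, without opening the files inside them.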

🐞 Bug fixes

  • Fix Series.__contains__ for None values and implement is_in for null Series (#11345)
  • don't panic on multi-nodes in streaming conversion (#11343)
  • ensure trailing quote is written for temporal data when CSV quote_style is non-numeric (#11328)
  • clarify has_validity docstring and fix several cases where the presence of a bitmask was used to incorrectly infer the existence of null values (#11319)
  • fix empty Series construction edge-case with Struct dtype (#11301)
  • DataFrame init from collections.namedtuple values (#11314)
  • Exclude functools wrapper frames in find_stacklevel (#11292)
  • set partitions independent of thread pool (#11304)
  • address VSCode issue with autocomplete on selector expressions in editor/console (#11235)
  • consume duplicates in rolling_by window (#11261)
  • handle url encoded paths in objectpath creation (#11240)
  • use POOL when writing csv (#11222)
  • don't conflate saved Config JSON string with file path (#11098)
  • fix is_in for bool evaluating has_false incorrectly (#11217)
  • improve handling of database drivers that can return arrow data (#11201)
  • fix nullable filter mask in group_by (#11207)
  • replace n-th in filter (#11206)
  • fix translation of Series-nested datetime/date values for scan_pyarrow predicates (#11195)
  • address unexpected expression name from use of unary - or + operators (#11158)
  • impl hash for more function expr (#11182)
  • list.join's separator can be expression (#11167)
  • Add some missing expr type hint for series (#11171)
  • consistently use negative every as the default for offset in group_by_dynamic (#11164)
  • Make pl.struct serializable (#11169)
  • only raise on actual parameter collision when "dtypes" specified in read_excel "read_csv_options" (#11162)
  • propagate null value for str/binary starts/ends_with and contains (#11141)
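
The null-propagation fix in #11141 can be pictured with a small plain-Python sketch. It assumes that a null on either side of the comparison yields null rather than `False`. The helper `starts_with` is hypothetical and stands in for the behaviour of the string predicates; it is not Polars code.

```python
from typing import List, Optional


def starts_with(value: Optional[str], prefix: Optional[str]) -> Optional[bool]:
    # Assumption: a null (None) on either side yields null, not False.
    if value is None or prefix is None:
        return None
    return value.startswith(prefix)


values: List[Optional[str]] = ["polars", None, "pandas"]
flags = [starts_with(v, "po") for v in values]
print(flags)
```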

🛠️ Other improvements

  • simplify/clarify group_by_dynamic examples (#11335)
  • tighten assert_frame_equal for LazyFrames (don't collect until after the schema has been checked) (#11331)
  • unify display for namespaced function expr (#11342)
  • add lazy pivot example (#11325)
  • Use GITHUB_TOKEN to get contributor information for docs (#11321)
  • Enable version warning banner (#11322)
  • cross-reference null_count from has_validity (clarifies the correct way to check for nulls) (#11323)
  • Pin pydantic in dev requirements <2.4.0 (#11312)
  • remove default auto-explode for map_many_private (#11270)
  • Add type alias IntoExprColumn (#11296)
  • update a few dependencies (#11283)
  • Properly skip ADBC test (#11282)
  • Fix some minor Makefile issues (#11276)
  • update sponsors (#11271)
  • parametric tests for group_by_rolling (#11262)
  • Make some list function expr non-anonymous (#11230)
  • Mention the performant feature only once (#11223)
  • remove unneeded indirection (#11233)
  • remove unneeded mutex around object-store (#11224)
  • clarify every/period/offset in group_by_dynamic (#11175)
  • Fix read_database batch_size docstring (#11132)

Thank you to all our contributors for making this release possible!
@ByteNybbler, @Cheukting, @Fokko, @Hofer-Julian, @MarcoGorelli, @SeanTroyUWO, @alexander-beedie, @billylanchantin, @jonashaag, @mcrumiller, @orlp, @ptiza, @reswqa, @ritchie46, @stinodego and @universalmind303
