github snowflakedb/snowpark-python v1.30.0
Release

1.30.0 (2024-03-27)

Snowpark Python API Updates

New Features

  • Added support for relaxed consistency and ordering guarantees in DataFrame.to_snowpark_pandas by introducing the new parameter relaxed_ordering.
  • DataFrameReader.dbapi (PrPr) now accepts a list of strings for the session_init_statement parameter, allowing multiple SQL statements to be executed during session initialization.
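As a rough illustration of what accepting either one string or a list of strings for session_init_statement can look like, here is a minimal sketch. It is not Snowpark's implementation: normalize_init_statements and open_connection are hypothetical helper names, and the standard-library sqlite3 module stands in for whatever DB-API driver would actually be used.

```python
import sqlite3
from typing import List, Optional, Union


def normalize_init_statements(
    statements: Optional[Union[str, List[str]]]
) -> List[str]:
    """Hypothetical helper: accept None, one SQL string, or a list of SQL strings."""
    if statements is None:
        return []
    if isinstance(statements, str):
        return [statements]
    return list(statements)


def open_connection(
    statements: Optional[Union[str, List[str]]] = None
) -> sqlite3.Connection:
    """Open an in-memory DB-API connection and run each init statement in order."""
    conn = sqlite3.connect(":memory:")
    cursor = conn.cursor()
    for statement in normalize_init_statements(statements):
        cursor.execute(statement)
    return conn


# Two statements executed during "session initialization" (sqlite3 stand-in).
conn = open_connection([
    "CREATE TEMP TABLE session_settings (name TEXT, value TEXT)",
    "INSERT INTO session_settings VALUES ('time_zone', 'UTC')",
])
settings = conn.execute("SELECT name, value FROM session_settings").fetchall()
```

Normalizing to a list up front keeps the execution loop identical for the single-statement and multi-statement cases.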

Improvements

  • Improved query generation for DataFrame.stat.sample_by to generate a single flat query that scales well with a large fractions dictionary, compared to the older method of creating a UNION ALL subquery for each key in fractions. To enable this feature, call session.conf.set("use_simplified_query_generation", True).
  • Improved performance of DataFrameReader.dbapi by enabling the vectorized option when copying Parquet files into a table.
  • Improved query generation for DataFrame.random_split in the following ways. These can be enabled by calling session.conf.set("use_simplified_query_generation", True):
    • Removed the need to call cache_result on the input dataframe in the internal implementation, resulting in a pure lazy dataframe operation.
    • The seed argument now behaves as expected with repeatable results across multiple calls and sessions.
  • DataFrame.fillna and DataFrame.replace now both support fitting int and float into Decimal columns if include_decimal is set to True.
  • Added documentation for the following UDF and stored procedure functions in files.py as a result of their General Availability:
    • SnowflakeFile.write
    • SnowflakeFile.writelines
    • SnowflakeFile.writeable
  • Minor documentation changes for SnowflakeFile and SnowflakeFile.open().
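To make the sample_by and random_split notes above more concrete, the following plain-Python analogy shows stratified sampling done in a single pass driven by a fractions dictionary, with a seed that gives repeatable results. It is only a sketch of the idea: sample_by_single_pass is a hypothetical name, the rows are made-up illustrative data, and this is not the SQL that Snowpark generates.

```python
import random
from typing import Any, Dict, List, Optional


def sample_by_single_pass(
    rows: List[Dict[str, Any]],
    column: str,
    fractions: Dict[Any, float],
    seed: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Keep each row with the probability assigned to its stratum.

    One pass over the data with a dictionary lookup per row, instead of
    filtering once per key and concatenating the results. Rows whose key
    is missing from `fractions` are dropped.
    """
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return [
        row for row in rows
        if rng.random() < fractions.get(row[column], 0.0)
    ]


# Illustrative data: 30 rows spread evenly over three strata.
rows = [{"id": i, "grp": i % 3} for i in range(30)]
fractions = {0: 1.0, 1: 0.5, 2: 0.0}

first = sample_by_single_pass(rows, "grp", fractions, seed=42)
second = sample_by_single_pass(rows, "grp", fractions, seed=42)
```

With the same seed, first and second contain identical rows; stratum 0 is kept in full and stratum 2 is dropped entirely, while the work done stays proportional to the number of rows rather than the number of keys in fractions.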

Bug Fixes

  • Fixed a bug that caused the following functions to raise errors when .cast() is applied to their output:
    • from_json
    • size

Snowpark Local Testing Updates

Bug Fixes

  • Fixed a bug in aggregation that caused empty groups to still produce rows.
  • Fixed a bug in DataFrame.except_ that would cause rows to be incorrectly dropped.
  • Fixed a bug that caused to_timestamp to fail when casting filtered columns.

Snowpark pandas API Updates

New Features

  • Added support for list values in Series.str.__getitem__ (Series.str[...]).
  • Added support for pd.Grouper objects in group by operations. When freq is specified, the default values of the sort, closed, label, and convention arguments are supported; origin is supported when it is start or start_day.
  • Added support for relaxed consistency and ordering guarantees in pd.read_snowflake for both named data sources (e.g., tables and views) and query data sources by introducing the new parameter relaxed_ordering.

Improvements

  • Raise a warning whenever QUOTED_IDENTIFIERS_IGNORE_CASE is found to be set, asking the user to unset it.
  • Improved how a missing index_label in DataFrame.to_snowflake and Series.to_snowflake is handled when index=True. Instead of raising a ValueError, system-defined labels are used for the index columns.
  • Improved the error message for groupby, DataFrame, or Series.agg when the function name is not supported.
