snowflakedb/snowpark-python v1.30.0 on GitHub

1.30.0 (2024-03-27)

Added support for relaxed consistency and ordering guarantees in DataFrame.to_snowpark_pandas by introducing the new parameter relaxed_ordering.
DataFrameReader.dbapi (PrPr) now accepts a list of strings for the session_init_statement parameter, allowing multiple SQL statements to be executed during session initialization.
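
The two new parameters above might be used roughly as follows. This is a sketch under stated assumptions, not a definitive recipe: it assumes an existing Snowpark session and a user-supplied create_connection callable, the table names and SQL statements are placeholders, and because dbapi is in private preview its exact signature may differ.

```python
from typing import Any, Callable


def load_with_new_options(session: Any, create_connection: Callable[[], Any]):
    """Sketch: `session` is assumed to be a snowflake.snowpark.Session."""
    # Hypothetical table name and init statements, for illustration only.
    external_df = session.read.dbapi(
        create_connection,
        table="SOURCE_TABLE",
        # A list of strings is now accepted; each statement is meant to run
        # during session initialization on the external connection.
        session_init_statement=[
            "SET search_path = analytics",
            "SET statement_timeout = 60000",
        ],
    )

    # relaxed_ordering=True opts in to the relaxed consistency and ordering
    # guarantees when converting to a Snowpark pandas DataFrame.
    pandas_df = session.table("MY_TABLE").to_snowpark_pandas(relaxed_ordering=True)
    return external_df, pandas_df
```

Passing the session and connection factory in as arguments keeps the sketch independent of any particular account or source database.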

Improved query generation for DataFrame.stat.sample_by to generate a single flat query that scales well with a large fractions dictionary, compared to the older method of creating a UNION ALL subquery for each key in fractions. To enable this feature, call session.conf.set("use_simplified_query_generation", True).
Improved performance of DataFrameReader.dbapi by enabling the vectorized option when copying Parquet files into a table.
Improved query generation for DataFrame.random_split in the following ways. They can be enabled by setting session.conf.set("use_simplified_query_generation", True):
- Removed the need to call cache_result on the input DataFrame in the internal implementation, making this a purely lazy DataFrame operation.
- The seed argument now behaves as expected with repeatable results across multiple calls and sessions.
DataFrame.fillna and DataFrame.replace now both support fitting int and float into Decimal columns if include_decimal is set to True.
Added documentation for the following UDF and stored procedure functions in files.py as a result of their General Availability:
- SnowflakeFile.write
- SnowflakeFile.writelines
- SnowflakeFile.writeable
Made minor documentation changes for SnowflakeFile and SnowflakeFile.open().
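
The use_simplified_query_generation setting mentioned for sample_by and random_split can be sketched as below. The session and DataFrame are assumed to already exist, and the column name, fractions, weights, and seed are made-up values for illustration.

```python
from typing import Any


def sample_and_split(session: Any, df: Any):
    """Sketch: `session` and `df` are assumed to be a Snowpark Session and DataFrame."""
    # Opt in to the simplified query generation described above.
    session.conf.set("use_simplified_query_generation", True)

    # Stratified sample: keep roughly 10% of rows where "category" is "a"
    # and roughly 50% where it is "b" (hypothetical column and keys).
    sampled = df.stat.sample_by("category", {"a": 0.1, "b": 0.5})

    # 80/20 split; with the setting enabled, the same seed is expected to
    # give repeatable results across calls and sessions.
    train, test = df.random_split([0.8, 0.2], seed=42)
    return sampled, train, test
```

Because the setting is applied on the session, it only needs to be set once before the DataFrame operations that should use it.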

Fixed a bug that caused the following functions to raise errors when .cast() is applied to their output:
- from_json
- size

Fixed a bug in aggregation that caused empty groups to still produce rows.
Fixed a bug in DataFrame.except_ that would cause rows to be incorrectly dropped.
Fixed a bug that caused to_timestamp to fail when casting filtered columns.

Added support for list values in Series.str.__getitem__ (Series.str[...]).
Added support for pd.Grouper objects in group by operations. When freq is specified, the default values of the sort, closed, label, and convention arguments are supported; origin is supported when it is start or start_day.
Added support for relaxed consistency and ordering guarantees in pd.read_snowflake for both named data sources (e.g., tables and views) and query data sources by introducing the new parameter relaxed_ordering.
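
To illustrate the kind of pd.Grouper call this refers to, here is a small sketch using plain pandas so that it runs without a Snowflake connection. With Snowpark pandas, one would instead import modin.pandas as pd together with snowflake.snowpark.modin.plugin. The column names and values are invented for the example.

```python
import pandas as pd

# Hypothetical data: three events across two days.
df = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            ["2024-01-01 09:00", "2024-01-01 17:00", "2024-01-02 10:00"]
        ),
        "amount": [10, 5, 7],
    }
)

# Group into daily buckets. sort, closed, label, and convention are left at
# their defaults, and origin is one of the two supported values.
daily = df.groupby(pd.Grouper(key="ts", freq="D", origin="start_day"))["amount"].sum()
```

Here daily holds one total per calendar day, indexed by the start of each daily bucket.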

A warning is now raised whenever QUOTED_IDENTIFIERS_IGNORE_CASE is found to be set, asking the user to unset it.
Improved how a missing index_label in DataFrame.to_snowflake and Series.to_snowflake is handled when index=True. Instead of raising a ValueError, system-defined labels are used for the index columns.
Improved the error message for groupby, DataFrame, or Series.agg when the function name is not supported.