1.30.0 (2024-03-27)
Snowpark Python API Updates
New Features
- Added support for relaxed consistency and ordering guarantees in DataFrame.to_snowpark_pandas by introducing the new parameter relaxed_ordering.
- DataFrameReader.dbapi (PrPr) now accepts a list of strings for the session_init_statement parameter, allowing multiple SQL statements to be executed during session initialization.
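A minimal sketch of how the two features above might be used. The helper names, the table name, and the SQL statements are hypothetical; `snowpark_df` is assumed to be a Snowpark DataFrame, `session` a Snowpark `Session`, and `create_connection` a caller-supplied callable that returns a DB-API connection. The exact `dbapi` signature may differ between releases, so treat this as an illustration rather than a reference.

```python
def load_with_relaxed_ordering(snowpark_df):
    """Convert a Snowpark DataFrame to a Snowpark pandas DataFrame.

    relaxed_ordering is the new parameter named in these release notes;
    True opts in to the relaxed consistency and ordering guarantees.
    """
    return snowpark_df.to_snowpark_pandas(relaxed_ordering=True)


def read_with_init_statements(session, create_connection):
    """Read an external table, running several SQL statements at session start.

    The statements and the table name below are hypothetical placeholders;
    use whatever the source database actually requires.
    """
    init_statements = [
        "SET search_path TO analytics",   # hypothetical statement
        "SET statement_timeout TO 60000",  # hypothetical statement
    ]
    return session.read.dbapi(
        create_connection,
        table="SOURCE_TABLE",  # hypothetical table name
        # Previously a single string; a list of strings is now accepted.
        session_init_statement=init_statements,
    )
```

Passing a list keeps each initialization statement separate instead of requiring them to be combined into one string.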
Improvements
- Improved query generation for DataFrame.stat.sample_by to generate a single flat query that scales well with a large fractions dictionary, compared to the older method of creating a UNION ALL subquery for each key in fractions. To enable this feature, call session.conf.set("use_simplified_query_generation", True).
- Improved performance of DataFrameReader.dbapi by enabling the vectorized option when copying Parquet files into a table.
- Improved query generation for DataFrame.random_split in the following ways. They can be enabled by calling session.conf.set("use_simplified_query_generation", True):
  - Removed the need to cache_result the input dataframe in the internal implementation, resulting in a pure lazy dataframe operation.
  - The seed argument now behaves as expected, with repeatable results across multiple calls and sessions.
- DataFrame.fillna and DataFrame.replace now both support fitting int and float into Decimal columns if include_decimal is set to True.
- Added documentation for the following UDF and stored procedure functions in files.py as a result of their General Availability:
  - SnowflakeFile.write
  - SnowflakeFile.writelines
  - SnowflakeFile.writeable
- Minor documentation changes for SnowflakeFile and SnowflakeFile.open().
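The opt-in setting and the DataFrame calls mentioned in this list could be exercised roughly as follows. This is a sketch under assumptions: `session` is an existing Snowpark `Session`, `df` is a Snowpark DataFrame, and the column name `TIER`, the fraction values, the split weights, and the helper names are all hypothetical.

```python
def enable_simplified_query_generation(session):
    """Opt in to the flat query generation described in the release notes."""
    session.conf.set("use_simplified_query_generation", True)


def stratified_sample(df):
    """Sample each stratum of a hypothetical TIER column at its own rate."""
    # Keys are values of the stratification column, values are sampling fractions.
    fractions = {"bronze": 0.1, "silver": 0.5, "gold": 1.0}
    return df.stat.sample_by("TIER", fractions)


def train_test_split(df, seed=42):
    """Split a DataFrame 80/20; a fixed seed is meant to make the split repeatable."""
    train, test = df.random_split([0.8, 0.2], seed=seed)
    return train, test


def fill_missing_including_decimals(df):
    """Fill nulls with an int, also covering Decimal columns via include_decimal."""
    return df.fillna(0, include_decimal=True)
```

Calling `enable_simplified_query_generation(session)` once, before building the sampled or split DataFrames, should be enough for the setting to apply to later `sample_by` and `random_split` calls in that session.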
Bug Fixes
- Fixed a bug that caused the following functions to raise errors when .cast() is applied to their output:
  - from_json
  - size
Snowpark Local Testing Updates
Bug Fixes
- Fixed a bug in aggregation that caused empty groups to still produce rows.
- Fixed a bug in DataFrame.except_ that would cause rows to be incorrectly dropped.
- Fixed a bug that caused to_timestamp to fail when casting filtered columns.
Snowpark pandas API Updates
New Features
- Added support for list values in Series.str.__getitem__ (Series.str[...]).
- Added support for pd.Grouper objects in group by operations. When freq is specified, the default values of the sort, closed, label, and convention arguments are supported; origin is supported when it is start or start_day.
- Added support for relaxed consistency and ordering guarantees in pd.read_snowflake for both named data sources (e.g., tables and views) and query data sources by introducing the new parameter relaxed_ordering.
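To make the first two items concrete, the sketch below shows the semantics of `Series.str[...]` on list values and of `pd.Grouper` with `freq`. It uses native pandas as a stand-in so that it runs without a Snowflake connection; with Snowpark pandas the import would instead be `import modin.pandas as pd` together with `import snowflake.snowpark.modin.plugin`. The data and column names are made up for illustration.

```python
import pandas as pd

# Series.str[...] on list values: index into each list element-wise.
tags = pd.Series([["a", "b", "c"], ["d", "e"]])
first_items = tags.str[0]  # first element of each list

# pd.Grouper with freq: bucket rows into fixed-size time windows.
orders = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            ["2024-01-01 08:00", "2024-01-01 17:30", "2024-01-02 09:15"]
        ),
        "amount": [10, 5, 7],
    }
)
# Group by calendar day on the "ts" column, leaving sort, closed, label,
# and convention at their default values.
daily_totals = orders.groupby(pd.Grouper(key="ts", freq="1D"))["amount"].sum()

print(first_items.tolist())
print(daily_totals.tolist())
```

Here the two orders on 2024-01-01 fall into the same daily bucket, so the grouped result has one row per day rather than one per order.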
Improvements
- Raise a warning whenever QUOTED_IDENTIFIERS_IGNORE_CASE is found to be set, asking the user to unset it.
- Improved how a missing index_label in DataFrame.to_snowflake and Series.to_snowflake is handled when index=True. Instead of raising a ValueError, system-defined labels are used for the index columns.
- Improved the error message for groupby, DataFrame, or Series.agg when the function name is not supported.