## 1.27.0 (2025-02-03)

### Snowpark Python API Updates

#### New Features
- Added support for the following functions in `functions.py`:
  - `array_reverse`
  - `divnull`
  - `map_cat`
  - `map_contains_key`
  - `map_keys`
  - `nullifzero`
  - `snowflake_cortex_sentiment`
  - `acosh`
  - `asinh`
  - `atanh`
  - `bit_length`
  - `bitmap_bit_position`
  - `bitmap_bucket_number`
  - `bitmap_construct_agg`
  - `cbrt`
  - `equal_null`
  - `from_json`
  - `ifnull`
  - `localtimestamp`
  - `max_by`
  - `min_by`
  - `nth_value`
  - `nvl`
  - `octet_length`
  - `position`
  - `regr_avgx`
  - `regr_avgy`
  - `regr_count`
  - `regr_intercept`
  - `regr_r2`
  - `regr_slope`
  - `regr_sxx`
  - `regr_sxy`
  - `regr_syy`
  - `try_to_binary`
  - `base64`
  - `base64_decode_string`
  - `base64_encode`
  - `editdistance`
  - `hex`
  - `hex_encode`
  - `instr`
  - `log1p`
  - `log2`
  - `log10`
  - `percentile_approx`
  - `unbase64`
- Added support for specifying a schema string (including implicit struct syntax) when calling `DataFrame.create_dataframe`.
- Added support for `DataFrameWriter.insert_into/insertInto`. This method also supports local testing mode.
- Added support for `DataFrame.create_temp_view` to create a temporary view. It will fail if the view already exists.
- Added support for multiple columns in the functions `map_cat` and `map_concat`.
- Added an option `keep_column_order` for keeping original column order in `DataFrame.with_column` and `DataFrame.with_columns`.
- Added options to column casts that allow renaming or adding fields in `StructType` columns.
- Added support for `contains_null` parameter to `ArrayType`.
- Added support for creating a temporary view via `DataFrame.create_or_replace_temp_view` from a DataFrame created by reading a file from a stage.
- Added support for `value_contains_null` parameter to `MapType`.
- Added `interactive` to telemetry that indicates whether the current environment is an interactive one.
- Allow `session.file.get` in a Native App to read file paths starting with `/` from the current version.
- Added support for multiple aggregation functions after `DataFrame.pivot`.
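Several of the new `functions.py` helpers mirror familiar SQL null-handling semantics. As a minimal pure-Python sketch of what `ifnull`, `nullifzero`, and `equal_null` compute (with `None` standing in for SQL NULL; this illustrates the semantics only and is not Snowpark code):

```python
# Pure-Python illustration of the SQL semantics behind three of the
# newly supported functions; None stands in for SQL NULL.

def ifnull(a, b):
    """Return b when a is NULL, otherwise a (SQL IFNULL / NVL)."""
    return b if a is None else a

def nullifzero(x):
    """Return NULL when x equals zero, otherwise x."""
    return None if x == 0 else x

def equal_null(a, b):
    """NULL-safe equality: NULL = NULL is True, NULL = value is False."""
    if a is None and b is None:
        return True
    if a is None or b is None:
        return False
    return a == b

print(ifnull(None, 42))        # 42
print(nullifzero(0))           # None
print(equal_null(None, None))  # True
```

In Snowpark these are column expressions imported from `snowflake.snowpark.functions` and evaluated server-side; the sketch above only shows the per-value behavior they implement.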
#### Experimental Features
- Added `Catalog` class to manage Snowflake objects. It can be accessed via `Session.catalog`. `snowflake.core` is a dependency required for this feature.
- Allow user input schema when reading a JSON file on a stage.
- Added support for specifying a schema string (including implicit struct syntax) when calling `DataFrame.create_dataframe`.
#### Improvements
- Updated README.md to include instructions on how to verify package signatures using `cosign`.
#### Bug Fixes
- Fixed a bug in local testing mode that caused a column to contain `None` when it should contain `0`.
- Fixed a bug in `StructField.from_json` that prevented `TimestampType`s with `tzinfo` from being parsed correctly.
- Fixed a bug in function `date_format` that caused an error when the input column was date type or timestamp type.
- Fixed a bug in `DataFrame` where a null value could be inserted into a non-nullable column.
- Fixed a bug in `replace` and `lit` which raised a type hint assertion error when passing `Column` expression objects.
- Fixed a bug in `pandas_udf` and `pandas_udtf` where the `session` parameter was erroneously ignored.
- Fixed a bug that raised an incorrect type conversion error for system functions called through `session.call`.
### Snowpark pandas API Updates

#### New Features
- Added support for `Series.str.ljust` and `Series.str.rjust`.
- Added support for `Series.str.center`.
- Added support for `Series.str.pad`.
- Added support for applying Snowpark Python function `snowflake_cortex_sentiment`.
- Added support for `DataFrame.map`.
- Added support for `DataFrame.from_dict` and `DataFrame.from_records`.
- Added support for mixed case field names in struct type columns.
- Added support for `SeriesGroupBy.unique`.
- Added support for `Series.dt.strftime` with the following directives:
  - `%d`: Day of the month as a zero-padded decimal number.
  - `%m`: Month as a zero-padded decimal number.
  - `%Y`: Year with century as a decimal number.
  - `%H`: Hour (24-hour clock) as a zero-padded decimal number.
  - `%M`: Minute as a zero-padded decimal number.
  - `%S`: Second as a zero-padded decimal number.
  - `%f`: Microsecond as a decimal number, zero-padded to 6 digits.
  - `%j`: Day of the year as a zero-padded decimal number.
  - `%X`: Locale's appropriate time representation.
  - `%%`: A literal `'%'` character.
- Added support for `Series.between`.
- Added support for `include_groups=False` in `DataFrameGroupBy.apply`.
- Added support for `expand=True` in `Series.str.split`.
- Added support for `DataFrame.pop` and `Series.pop`.
- Added support for `first` and `last` in `DataFrameGroupBy.agg` and `SeriesGroupBy.agg`.
- Added support for `Index.drop_duplicates`.
- Added support for aggregations `"count"`, `"median"`, `np.median`, `"skew"`, `"std"`, `np.std`, `"var"`, and `np.var` in `pd.pivot_table()`, `DataFrame.pivot_table()`, and `pd.crosstab()`.
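The `Series.dt.strftime` directives listed above follow standard CPython `strftime` formatting, so their behavior can be sanity-checked with the stdlib alone (plain `datetime` here, not Snowpark pandas):

```python
from datetime import datetime

ts = datetime(2025, 2, 3, 9, 5, 7, 123)

# Each directive below matches the behavior described in the list above.
print(ts.strftime("%d"))        # "03"     - zero-padded day of month
print(ts.strftime("%m"))        # "02"     - zero-padded month
print(ts.strftime("%Y"))        # "2025"   - year with century
print(ts.strftime("%H:%M:%S"))  # "09:05:07"
print(ts.strftime("%f"))        # "000123" - microseconds, padded to 6 digits
print(ts.strftime("%j"))        # "034"    - zero-padded day of year
print(ts.strftime("%%"))        # "%"      - literal percent character
```

`%X` is omitted from the example because its output is locale-dependent.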
#### Improvements
- Improve performance of `DataFrame.map`, `Series.apply` and `Series.map` methods by mapping numpy functions to Snowpark functions if possible.
- Added documentation for `DataFrame.map`.
- Improve performance of `DataFrame.apply` by mapping numpy functions to Snowpark functions if possible.
- Added documentation on the extent of Snowpark pandas interoperability with scikit-learn.
- Infer return type of functions in `Series.map`, `Series.apply` and `DataFrame.map` if type-hint is not provided.
- Added `call_count` to telemetry that counts method calls including interchange protocol calls.