1.41.0 (2025-10-23)
Snowpark Python API Updates
New Features
- Added a new function service in snowflake.snowpark.functions that allows users to create a callable representing a Snowpark Container Services (SPCS) service.
- Added connection_parameters parameter to DataFrameReader.dbapi() (PuPr) method to allow passing keyword arguments to the create_connection callable.
- Added support for Session.begin_transaction, Session.commit and Session.rollback.
- Added support for the following functions in functions.py:
  - Geospatial functions:
    - st_interpolate
    - st_intersection
    - st_intersection_agg
    - st_intersects
    - st_isvalid
    - st_length
    - st_makegeompoint
    - st_makeline
    - st_makepolygon
    - st_makepolygonoriented
    - st_disjoint
    - st_distance
    - st_dwithin
    - st_endpoint
    - st_envelope
    - st_geohash
    - st_geomfromgeohash
    - st_geompointfromgeohash
    - st_hausdorffdistance
    - st_makepoint
    - st_npoints
    - st_perimeter
    - st_pointn
    - st_setsrid
    - st_simplify
    - st_srid
    - st_startpoint
    - st_symdifference
    - st_transform
    - st_union
    - st_union_agg
    - st_within
    - st_x
    - st_xmax
    - st_xmin
    - st_y
    - st_ymax
    - st_ymin
    - st_geogfromgeohash
    - st_geogpointfromgeohash
    - st_geographyfromwkb
    - st_geographyfromwkt
    - st_geometryfromwkb
    - st_geometryfromwkt
    - try_to_geography
    - try_to_geometry
- Added a parameter to enable and disable automatic column name aliasing for interval_day_time_from_parts and interval_year_month_from_parts functions.
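The new Session.begin_transaction, Session.commit and Session.rollback methods lend themselves to a commit-on-success, rollback-on-error pattern. The following is a minimal sketch of that pattern, not part of the Snowpark API: the transaction helper and the RecordingSession stand-in are hypothetical, and the sketch assumes the three methods can be called without arguments.

```python
from contextlib import contextmanager


@contextmanager
def transaction(session):
    """Hypothetical helper: run a block inside an explicit transaction.

    Assumes `session` exposes begin_transaction(), commit() and rollback()
    that can be called without arguments (an assumption; check the API
    reference for the exact signatures).
    """
    session.begin_transaction()
    try:
        yield session
    except Exception:
        # Undo everything done inside the block, then re-raise.
        session.rollback()
        raise
    else:
        session.commit()


class RecordingSession:
    """Stand-in that only records calls, so the sketch runs without Snowflake."""

    def __init__(self):
        self.calls = []

    def begin_transaction(self):
        self.calls.append("begin")

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")


# Successful block: the transaction is committed.
ok = RecordingSession()
with transaction(ok):
    pass  # with a real session, DML statements would go here

# Failing block: the transaction is rolled back and the error propagates.
failed = RecordingSession()
try:
    with transaction(failed):
        raise ValueError("simulated failure")
except ValueError:
    pass
```

With a real session, the stand-in would be replaced by the Session object itself; the helper only relies on the three method names listed above.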
Bug Fixes
- Fixed a bug where DataFrameReader.xml failed to parse XML files with undeclared namespaces when ignoreNamespace is True.
- Added a fix for floating point precision discrepancies in interval_day_time_from_parts.
- Fixed a bug where writing Snowpark pandas dataframes on the pandas backend with a column multiindex to Snowflake with to_snowflake would raise KeyError.
- Fixed a bug where DataFrameReader.dbapi (PuPr) was not compatible with oracledb 3.4.0.
- Fixed a bug where modin would unintentionally be imported during session initialization in some scenarios.
- Fixed a bug where session.udf|udtf|udaf|sproc.register failed when an extra session argument was passed. These methods do not expect a session argument; please remove it if provided.
Improvements
- The default maximum length for inferred StringType columns during schema inference in DataFrameReader.dbapi has been increased from 16MB to 128MB in parquet file based ingestion.
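Several entries in this release touch DataFrameReader.dbapi, including the new connection_parameters argument that is described as passing keyword arguments to the create_connection callable. The sketch below shows only the shape of such a callable. It uses the standard-library sqlite3 module purely as a DB-API stand-in (it may not be a source that dbapi supports), and the parameter names database and timeout are illustrative assumptions.

```python
import sqlite3


def create_connection(database=":memory:", timeout=5.0):
    """Factory returning a DB-API 2.0 connection built from keyword arguments."""
    return sqlite3.connect(database, timeout=timeout)


# Hypothetical keyword arguments to be forwarded to the factory.
connection_parameters = {"database": ":memory:", "timeout": 10.0}

# What forwarding the keyword arguments amounts to:
conn = create_connection(**connection_parameters)
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])
rows = conn.execute("SELECT id, name FROM t ORDER BY id").fetchall()
conn.close()

# With a live Snowpark session, the same two pieces might be passed along
# these lines (unverified; consult the DataFrameReader.dbapi reference):
#   session.read.dbapi(create_connection, table="t",
#                      connection_parameters=connection_parameters)
```

Keeping connection details in a dictionary rather than hard-coding them in the factory lets the same callable be reused across environments.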
Dependency Updates
- Updated the snowflake-connector-python dependency to >=3.17,<5.0.0.
Snowpark pandas API Updates
New Features
- Added support for the dtypes parameter of pd.get_dummies.
- Added support for nunique in df.pivot_table, df.agg and other places where aggregate functions can be used.
- Added support for DataFrame.interpolate and Series.interpolate with the "linear", "ffill"/"pad", and "backfill"/"bfill" methods. These use the SQL INTERPOLATE_LINEAR, INTERPOLATE_FFILL, and INTERPOLATE_BFILL functions (PuPr).
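To make the three interpolation methods concrete, here is a plain-Python illustration of what linear interpolation, forward fill and backward fill do to gaps between known values. It is a sketch of the general semantics only, not how Snowpark pandas or the SQL functions implement them. The fill_gaps helper is hypothetical, and in this sketch the linear method leaves leading and trailing gaps unfilled, which may differ from pandas.

```python
def fill_gaps(values, method="linear"):
    """Fill None entries in a list of numbers (hypothetical helper)."""
    out = list(values)
    if method in ("ffill", "pad"):
        # Carry the last known value forward.
        last = None
        for i, v in enumerate(out):
            if v is None:
                out[i] = last
            else:
                last = v
    elif method in ("backfill", "bfill"):
        # Carry the next known value backward.
        nxt = None
        for i in range(len(out) - 1, -1, -1):
            if out[i] is None:
                out[i] = nxt
            else:
                nxt = out[i]
    elif method == "linear":
        # Draw a straight line between each pair of neighbouring known values,
        # treating positions as equally spaced.
        known = [i for i, v in enumerate(values) if v is not None]
        for left, right in zip(known, known[1:]):
            step = (values[right] - values[left]) / (right - left)
            for i in range(left + 1, right):
                out[i] = values[left] + step * (i - left)
    else:
        raise ValueError(f"unsupported method: {method!r}")
    return out


data = [1.0, None, None, 4.0, None, 8.0]
linear = fill_gaps(data, "linear")  # [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
ffill = fill_gaps(data, "ffill")    # [1.0, 1.0, 1.0, 4.0, 4.0, 8.0]
bfill = fill_gaps(data, "bfill")    # [1.0, 4.0, 4.0, 4.0, 8.0, 8.0]
```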
Improvements
- Improved performance of Series.to_snowflake and pd.to_snowflake(series) for large data by uploading data via a parquet file. You can control the dataset size at which Snowpark pandas switches to parquet with the variable modin.config.PandasToSnowflakeParquetThresholdBytes.
- Enhanced autoswitching functionality from Snowflake to native pandas for methods with unsupported argument combinations:
  - get_dummies() with dummy_na=True, drop_first=True, or custom dtype parameters
  - cumsum(), cummin(), cummax() with axis=1 (column-wise operations)
  - skew() with axis=1 or numeric_only=False parameters
  - round() with decimals parameter as a Series
  - corr() with method!=pearson parameter
- Set cte_optimization_enabled to True for all Snowpark pandas sessions.
- Added support for the following in faster pandas:
  - isin
  - isna
  - isnull
  - notna
  - notnull
  - str.contains
  - str.startswith
  - str.endswith
  - str.slice
  - dt.date
  - dt.time
  - dt.hour
  - dt.minute
  - dt.second
  - dt.microsecond
  - dt.nanosecond
  - dt.year
  - dt.month
  - dt.day
  - dt.quarter
  - dt.is_month_start
  - dt.is_month_end
  - dt.is_quarter_start
  - dt.is_quarter_end
  - dt.is_year_start
  - dt.is_year_end
  - dt.is_leap_year
  - dt.days_in_month
  - dt.daysinmonth
  - sort_values
  - loc (setting columns)
  - to_datetime
  - rename
  - drop
  - invert
  - duplicated
  - iloc
  - head
  - columns (e.g., df.columns = ["A", "B"])
  - agg
  - min
  - max
  - count
  - sum
  - mean
  - median
  - std
  - var
  - groupby.agg
  - groupby.min
  - groupby.max
  - groupby.count
  - groupby.sum
  - groupby.mean
  - groupby.median
  - groupby.std
  - groupby.var
  - drop_duplicates
- Reused the row count from the relaxed query compiler in get_axis_len.
Bug Fixes
- Fixed a bug where the row count was not cached in the ordered dataframe each time count_rows() was called.