github snowflakedb/snowpark-python v1.41.0
1.41.0 (2025-10-23)

Snowpark Python API Updates

New Features

  • Added a new function service in snowflake.snowpark.functions that allows users to create a callable representing a Snowpark Container Services (SPCS) service.
  • Added connection_parameters parameter to DataFrameReader.dbapi() (PuPr) method to allow passing keyword arguments to the create_connection callable.
  • Added support for Session.begin_transaction, Session.commit and Session.rollback.
  • Added support for the following functions in functions.py:
    • Geospatial functions:
      • st_interpolate
      • st_intersection
      • st_intersection_agg
      • st_intersects
      • st_isvalid
      • st_length
      • st_makegeompoint
      • st_makeline
      • st_makepolygon
      • st_makepolygonoriented
      • st_disjoint
      • st_distance
      • st_dwithin
      • st_endpoint
      • st_envelope
      • st_geohash
      • st_geomfromgeohash
      • st_geompointfromgeohash
      • st_hausdorffdistance
      • st_makepoint
      • st_npoints
      • st_perimeter
      • st_pointn
      • st_setsrid
      • st_simplify
      • st_srid
      • st_startpoint
      • st_symdifference
      • st_transform
      • st_union
      • st_union_agg
      • st_within
      • st_x
      • st_xmax
      • st_xmin
      • st_y
      • st_ymax
      • st_ymin
      • st_geogfromgeohash
      • st_geogpointfromgeohash
      • st_geographyfromwkb
      • st_geographyfromwkt
      • st_geometryfromwkb
      • st_geometryfromwkt
      • try_to_geography
      • try_to_geometry
  • Added a parameter to enable and disable automatic column name aliasing for interval_day_time_from_parts and interval_year_month_from_parts functions.
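
The new transaction methods lend themselves to a commit-on-success, rollback-on-error wrapper. The following is a minimal sketch, assuming Session.begin_transaction, Session.commit and Session.rollback can each be called without arguments. The transaction helper and the RecordingSession class are hypothetical names used only for illustration; RecordingSession stands in for a real Snowpark Session so the sketch runs without a Snowflake connection.

```python
from contextlib import contextmanager


@contextmanager
def transaction(session):
    """Commit if the block succeeds, roll back if it raises.

    Hypothetical helper: assumes the session exposes begin_transaction(),
    commit() and rollback() callable with no arguments.
    """
    session.begin_transaction()
    try:
        yield session
    except Exception:
        session.rollback()
        raise
    else:
        session.commit()


class RecordingSession:
    """Hypothetical stand-in for a Snowpark Session; it only records calls."""

    def __init__(self):
        self.calls = []

    def begin_transaction(self):
        self.calls.append("begin_transaction")

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")


# Successful block: the transaction is committed.
ok = RecordingSession()
with transaction(ok):
    pass

# Failing block: the transaction is rolled back and the error propagates.
failed = RecordingSession()
try:
    with transaction(failed):
        raise ValueError("simulated failure")
except ValueError:
    pass

print(ok.calls)
print(failed.calls)
```

With a real session, the statements to run atomically would go inside the with block in place of pass.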

Bug Fixes

  • Fixed a bug where DataFrameReader.xml failed to parse XML files with undeclared namespaces when ignoreNamespace is True.
  • Fixed floating-point precision discrepancies in interval_day_time_from_parts.
  • Fixed a bug where writing Snowpark pandas dataframes on the pandas backend with a column multiindex to Snowflake with to_snowflake would raise KeyError.
  • Fixed a bug where DataFrameReader.dbapi (PuPr) was not compatible with oracledb 3.4.0.
  • Fixed a bug where modin would unintentionally be imported during session initialization in some scenarios.
  • Fixed a bug where session.udf|udtf|udaf|sproc.register failed when an extra session argument was passed. These methods do not expect a session argument; please remove it if provided.

Improvements

  • The default maximum length for inferred StringType columns during schema inference in DataFrameReader.dbapi has been increased from 16MB to 128MB for Parquet-file-based ingestion.

Dependency Updates

  • Updated the snowflake-connector-python dependency requirement to >=3.17,<5.0.0.

Snowpark pandas API Updates

New Features

  • Added support for the dtype parameter of pd.get_dummies.
  • Added support for nunique in df.pivot_table, df.agg and other places where aggregate functions can be used.
  • Added support for DataFrame.interpolate and Series.interpolate with the "linear", "ffill"/"pad", and "backfill"/"bfill" methods. These use the SQL INTERPOLATE_LINEAR, INTERPOLATE_FFILL, and INTERPOLATE_BFILL functions (PuPr).
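
As a rough illustration of what the "linear" method computes, here is a small sketch that uses native pandas as a stand-in so it runs without a Snowflake connection. The assumption is that Snowpark pandas follows the same pandas semantics for this call; the imports shown in the comment are the usual way to enable Snowpark pandas and are not exercised here.

```python
# Native pandas is used as a stand-in. With Snowpark pandas, the usual
# (assumed, not exercised here) imports would be:
#   import modin.pandas as pd
#   import snowflake.snowpark.modin.plugin
import pandas as pd

# Two missing values sit between two known points.
s = pd.Series([1.0, None, None, 4.0])

# Linear interpolation fills the gap with evenly spaced values.
linear = s.interpolate(method="linear")

print(linear.tolist())
```

The gap between 1.0 and 4.0 is filled with evenly spaced values, so the missing entries become 2.0 and 3.0.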

Improvements

  • Improved performance of Series.to_snowflake and pd.to_snowflake(series) for large data by uploading data via a parquet file. You can control the dataset size at which Snowpark pandas switches to parquet with the variable modin.config.PandasToSnowflakeParquetThresholdBytes.
  • Enhanced autoswitching functionality from Snowflake to native pandas for methods with unsupported argument combinations:
    • get_dummies() with dummy_na=True, drop_first=True, or custom dtype parameters
    • cumsum(), cummin(), cummax() with axis=1 (column-wise operations)
    • skew() with axis=1 or numeric_only=False parameters
    • round() with decimals parameter as a Series
    • corr() with a method other than pearson
  • Set cte_optimization_enabled to True for all Snowpark pandas sessions.
  • Added support for the following in faster pandas:
    • isin
    • isna
    • isnull
    • notna
    • notnull
    • str.contains
    • str.startswith
    • str.endswith
    • str.slice
    • dt.date
    • dt.time
    • dt.hour
    • dt.minute
    • dt.second
    • dt.microsecond
    • dt.nanosecond
    • dt.year
    • dt.month
    • dt.day
    • dt.quarter
    • dt.is_month_start
    • dt.is_month_end
    • dt.is_quarter_start
    • dt.is_quarter_end
    • dt.is_year_start
    • dt.is_year_end
    • dt.is_leap_year
    • dt.days_in_month
    • dt.daysinmonth
    • sort_values
    • loc (setting columns)
    • to_datetime
    • rename
    • drop
    • invert
    • duplicated
    • iloc
    • head
    • columns (e.g., df.columns = ["A", "B"])
    • agg
    • min
    • max
    • count
    • sum
    • mean
    • median
    • std
    • var
    • groupby.agg
    • groupby.min
    • groupby.max
    • groupby.count
    • groupby.sum
    • groupby.mean
    • groupby.median
    • groupby.std
    • groupby.var
    • drop_duplicates
  • Reused the row count from the relaxed query compiler in get_axis_len.

Bug Fixes

  • Fixed a bug where the row count was not cached in the ordered dataframe each time count_rows() was called.
