0.4.0 (2022-02-15)
New Features
- You can now specify which Anaconda packages to use when defining UDFs.
  - Added `add_packages()`, `get_packages()`, `clear_packages()`, and `remove_package()` to class `Session`.
  - Added `add_requirements()` to `Session` so you can use a requirements file to specify which packages this session will use.
  - Added parameter `packages` to function `snowflake.snowpark.functions.udf()` and method `UserDefinedFunction.register()` to indicate UDF-level Anaconda package dependencies when creating a UDF.
  - Added parameter `imports` to `snowflake.snowpark.functions.udf()` and `UserDefinedFunction.register()` to specify UDF-level code imports.
- Added a parameter `session` to function `udf()` and `UserDefinedFunction.register()` so you can specify which session to use to create a UDF if you have multiple sessions.
- Added types `Geography` and `Variant` to `snowflake.snowpark.types` to be used as type hints for Geography and Variant data when defining a UDF.
- Added support for Geography GeoJSON data.
- Added `Table`, a subclass of `DataFrame` for table operations:
  - Methods `update` and `delete` update and delete rows of a table in Snowflake.
  - Method `merge` merges data from a `DataFrame` to a `Table`.
  - Overrides method `DataFrame.sample()` with an additional parameter `seed`, which works on tables but not on views and sub-queries.
- Added `DataFrame.to_local_iterator()` and `DataFrame.to_pandas_batches()` to allow getting results from an iterator when the result set returned from the Snowflake database is too large.
- Added `DataFrame.cache_result()` for caching the operations performed on a `DataFrame` in a temporary table. Subsequent operations on the original `DataFrame` have no effect on the cached result `DataFrame`.
- Added property `DataFrame.queries` to get SQL queries that will be executed to evaluate the `DataFrame`.
- Added `Session.query_history()` as a context manager to track SQL queries executed on a session, including all SQL queries to evaluate `DataFrame`s created from a session. Both query ID and query text are recorded.
- You can now create a `Session` instance from an existing established `snowflake.connector.SnowflakeConnection`. Use parameter `connection` in `Session.builder.configs()`.
- Added `use_database()`, `use_schema()`, `use_warehouse()`, and `use_role()` to class `Session` to switch database/schema/warehouse/role after a session is created.
- Added `DataFrameWriter.copy_into_location()` to unload a `DataFrame` to stage files.
- Added `DataFrame.unpivot()`.
- Added `Column.within_group()` for sorting the rows by columns with some aggregation functions.
- Added functions `listagg()`, `mode()`, `div0()`, `acos()`, `asin()`, `atan()`, `atan2()`, `cos()`, `cosh()`, `sin()`, `sinh()`, `tan()`, `tanh()`, `degrees()`, `radians()`, `round()`, `trunc()`, and `factorial()` to `snowflake.snowpark.functions`.
- Added an optional argument `ignore_nulls` to functions `lead()` and `lag()`.
- The `condition` parameter of functions `when()` and `iff()` now accepts SQL expressions.
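As a hedged sketch of how the package-related additions at the top of this list might be used together, the following registers a UDF with both session-level and UDF-level Anaconda dependencies. The names `register_packaged_udf` and `add_one` and the choice of `numpy` are illustrative assumptions, and `session` is assumed to be an already-created `Session`.

```python
def register_packaged_udf(session):
    """Register a UDF that depends on an Anaconda package (illustrative only).

    `session` is assumed to be an existing snowflake.snowpark.Session.
    """
    # Imported lazily so this sketch can be loaded without the library installed.
    from snowflake.snowpark.functions import udf
    from snowflake.snowpark.types import IntegerType

    # Session-level dependency: available to UDFs created later in this session.
    session.add_packages("numpy")

    def add_one(x: int) -> int:
        # Hypothetical UDF body; numpy is imported where the UDF executes.
        import numpy as np
        return int(np.int64(x) + 1)

    # `packages` lists the dependencies for this UDF specifically, and
    # `session` selects which session registers it when several exist.
    return udf(
        add_one,
        return_type=IntegerType(),
        input_types=[IntegerType()],
        packages=["numpy"],
        session=session,
    )
```

Listing `numpy` at both levels is redundant here and is done only to show both call sites; in practice one of the two would usually be enough.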
Improvements
- All function and method names have been renamed to use the snake case naming style, which is more Pythonic. For convenience, some camel case names are kept as aliases to the snake case APIs. It is recommended to use the snake case APIs.
  - Deprecated these methods on class `Session` and replaced them with their snake case equivalents: `getImports()`, `addImports()`, `removeImport()`, `clearImports()`, `getSessionStage()`, `getDefaultSchema()`, `getCurrentDatabase()`, and `getFullyQualifiedCurrentSchema()`.
  - Deprecated these methods on class `DataFrame` and replaced them with their snake case equivalents: `groupingByGroupingSets()`, `naturalJoin()`, `withColumns()`, and `joinTableFunction()`.
- Property `DataFrame.columns` is now consistent with `DataFrame.schema.names` and the Snowflake database `Identifier Requirements`.
- `Column.__bool__()` now raises a `TypeError`. This bans the use of the logical operators `and`, `or`, and `not` on `Column` objects; for instance, `col("a") > 1 and col("b") > 2` raises a `TypeError`. Use `(col("a") > 1) & (col("b") > 2)` instead.
- Changed `PutResult` and `GetResult` to subclass `NamedTuple`.
. - Fixed a bug which raised an error when the local path or stage location has a space or other special characters.
- Changed `DataFrame.describe()` so that non-numeric and non-string columns are ignored instead of raising an exception.
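To illustrate why the `Column.__bool__()` change rules out `and`, `or`, and `not`, here is a minimal pure-Python sketch. The `Column` class and `col()` helper below are hypothetical stand-ins written for this example, not the Snowpark implementation; they only mimic the behavior described above.

```python
class Column:
    """Hypothetical stand-in for a Snowpark Column (not the real class)."""

    def __init__(self, expr: str):
        self.expr = expr

    def __gt__(self, other) -> "Column":
        # Comparison builds a new expression instead of returning a bool.
        return Column(f"({self.expr} > {other!r})")

    def __and__(self, other: "Column") -> "Column":
        # The bitwise operator `&` is overloaded to mean logical AND.
        return Column(f"({self.expr} AND {other.expr})")

    def __bool__(self):
        # Python's `and`, `or`, and `not` call bool() on their operands,
        # so raising here turns a silent logic bug into an immediate error.
        raise TypeError("Cannot convert a Column to bool; use '&', '|', '~'.")


def col(name: str) -> Column:
    """Hypothetical helper mirroring the shape of `col()` in the text."""
    return Column(name)


# Supported form: parenthesize each comparison and combine with `&`.
combined = (col("a") > 1) & (col("b") > 2)
print(combined.expr)

# Unsupported form: `and` triggers bool() on the left operand.
try:
    col("a") > 1 and col("b") > 2
except TypeError as exc:
    print(f"TypeError: {exc}")
```

The parentheses in the supported form matter because `&` binds more tightly than `>` in Python.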
Dependency updates
- Updated `snowflake-connector-python` to 2.7.4.