Summary
This release optimises performance in Databricks for the base_sessions_lifecycle_manifest, bringing the behaviour in line with the mobile package. We also add standard actions to aid development and categorize issues better.
Features
Add missing start_tstamp_date for base_sessions_lifecycle_manifest on Databricks (Close #132)
Add standard actions and templates
Upgrading
To upgrade, simply bump the snowplow-web version in your packages.yml file. If you are running the web package on Databricks, you will also need to run the following SQL to take advantage of the performance optimizations in this release. Replace {catalog_name} with your catalog name if your environment has Unity Catalog (UC) enabled, or remove it (together with the dot that follows it) if not. Replace {manifest_schema} with the name of the schema where your manifest table currently lives.
CREATE TABLE {catalog_name}.{manifest_schema}.snowplow_web_base_sessions_lifecycle_manifest_tmp
USING DELTA
PARTITIONED BY (start_tstamp_date)
TBLPROPERTIES (
  'delta.autoOptimize.optimizeWrite' = 'true',
  'delta.autoOptimize.autoCompact' = 'true'
) AS
SELECT
  *,
  DATE(start_tstamp) AS start_tstamp_date
FROM {catalog_name}.{manifest_schema}.snowplow_web_base_sessions_lifecycle_manifest;

DROP TABLE IF EXISTS {catalog_name}.{manifest_schema}.snowplow_web_base_sessions_lifecycle_manifest;

ALTER TABLE {catalog_name}.{manifest_schema}.snowplow_web_base_sessions_lifecycle_manifest_tmp
  RENAME TO {catalog_name}.{manifest_schema}.snowplow_web_base_sessions_lifecycle_manifest;
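
Once the rename has completed, it may be worth confirming that the migrated table looks as expected. The queries below are an optional sanity check, not part of the official upgrade steps. They assume the same {catalog_name} and {manifest_schema} placeholders as above and that the table is a Delta table.

```sql
-- Optional check (illustrative only): compare row counts against the new column.
-- rows_with_date can be lower than total_rows only if start_tstamp is NULL for some rows.
SELECT
  COUNT(*) AS total_rows,
  COUNT(start_tstamp_date) AS rows_with_date,
  SUM(CASE WHEN start_tstamp_date = DATE(start_tstamp) THEN 1 ELSE 0 END) AS rows_matching
FROM {catalog_name}.{manifest_schema}.snowplow_web_base_sessions_lifecycle_manifest;

-- Optional check: the partitionColumns field of the output should list start_tstamp_date.
DESCRIBE DETAIL {catalog_name}.{manifest_schema}.snowplow_web_base_sessions_lifecycle_manifest;
```

If rows_matching equals rows_with_date and the partition column is listed, the table has the layout this release expects.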