Now validates incoming event and context JSONs against JSON Schema, then automatically shreds them into dedicated tables in Amazon Redshift.
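As a rough illustration of the validate-then-shred flow described above, here is a minimal Python sketch. It is not the Scala Hadoop Shred implementation: the schema keys, the `REQUIRED_FIELDS` registry, the `table_name` naming rule, and the `shred` function are all hypothetical, and a simple required-fields check stands in for real JSON Schema validation.

```python
from collections import defaultdict

# Hypothetical registry: schema key -> required fields.
# A stand-in for real JSON Schemas; not the actual validation logic.
REQUIRED_FIELDS = {
    "com.example/link_click/1-0-0": {"targetUrl"},
    "com.example/screen_view/1-0-0": {"name"},
}


def table_name(schema_key):
    """Derive an illustrative table name from a vendor/name/version key."""
    vendor, name, version = schema_key.split("/")
    major = version.split("-")[0]
    return "{}_{}_{}".format(vendor.replace(".", "_"), name, major)


def shred(events):
    """Split valid JSONs into per-table rows; collect invalid ones separately."""
    tables = defaultdict(list)
    rejected = []
    for event in events:
        schema_key = event.get("schema")
        data = event.get("data", {})
        required = REQUIRED_FIELDS.get(schema_key)
        # Reject unknown schemas and payloads missing required fields.
        if required is None or not required.issubset(data):
            rejected.append(event)
            continue
        tables[table_name(schema_key)].append(data)
    return dict(tables), rejected
```

Each valid JSON ends up in the row list for its own table, while anything that fails the check is set aside rather than loaded.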
Trackers
- Ruby Tracker: added git submodule. Version 0.1.0 (#645)
- Java Tracker: added git submodule. Version 0.2.0 (#843)
- JavaScript Tracker: bumped git submodule to 2.0.0 (#635)
- Python Tracker: bumped git submodule to 0.4.0 (#634)
Scala Hadoop Shred
- Added. Version 0.1.0
EmrEtlRunner
- Bumped to 0.8.0
- Updated S3DistCp steps to use new S3DistCpStep from Elasticity (#629)
- Added --skip s3distcp option (#313)
- Added ability to start Lingual in EmrEtlRunner (#623)
- Added ability to start HBase in EmrEtlRunner (#622)
- Improved load performance by switching ETL to write out to HDFS (#278)
- Now invoking Scala Hadoop Shredder after main job (#644)
- Added :iglu: section to config.yml for Scala Hadoop Shred (#814)
- Updated to run Scala Hadoop Shred following Hadoop Enrich (#815)
- Added --skip shred option (#659)
StorageLoader
- Bumped to 0.3.0
- Bumped Sluice to 0.2.1 (#881)
- Added initial Ruby.contracts support (#391)
- Updated config.yml to support shredding (#897)
- Added ACCEPTINVCHARS to StorageLoader (#411)
- Wrote JSON Path files for ad_* events (#642)
- Wrote JSON Path file for link_click (#599)
- Wrote JSON Path file for screen_view (#643)
- Wrote JSON Path file for schema.org's WebPage (#772)
- Added :jsonpath_assets: setting for StorageLoader (#606)
- Added ability to load custom tables using JSON Paths (#607)
- Added --skip shred option (#660)
- Added :in: hint on StorageLoader configuration, thanks @joaolcorreia! (#755)
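The JSON Path files mentioned above follow the format that Redshift's COPY command accepts: a JSON object with a single "jsonpaths" array of path expressions, one per target column. The sketch below shows how such a file could be generated in Python. The field names in `LINK_CLICK_FIELDS` and the helper functions are hypothetical and do not reflect the contents of the files shipped in this release.

```python
import json


def jsonpaths_document(field_paths):
    """Build a Redshift JSONPaths document from dotted field paths."""
    return {"jsonpaths": ["$." + path for path in field_paths]}


# Hypothetical column order for an illustrative link_click table.
LINK_CLICK_FIELDS = ["schema.vendor", "data.targetUrl", "data.elementId"]


def render(field_paths):
    """Serialize the JSONPaths document as indented JSON text."""
    return json.dumps(jsonpaths_document(field_paths), indent=4)


if __name__ == "__main__":
    print(render(LINK_CLICK_FIELDS))
```

The order of the paths matters, since COPY maps each path to a table column by position.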
Redshift
- Added Redshift DDL for ad_* events (#639)
- Added Redshift DDL for link_click events (#600)
- Added Redshift DDL for screen_view events (#640)
- Added Redshift DDL for schema.org's WebPage (#771)