Loads bad rows from the batch pipeline into Elasticsearch, and formally separates the Snowplow enriched event format from the TSV format used to load Redshift.
EmrEtlRunner
- Bumped to 0.19.0
- Added hadoop_elasticsearch to config.yml.sample (#2124)
- Added support for Elasticsearch in targets section of config (#826)
- Bumped Elasticity to 6.0.5 (#2026)
- Stopped skipping the whole job just because enrich and shred are being skipped (#2049)
Scala Common Enrich
- Bumped Iglu Scala Client to 0.3.1 (#2079)
- Bumped version to 0.18.0
- Moved ScalazArgs into shared library (#2010)
- Removed executable bit from Scala source files (#2022)
- Removed JSON length checks (#2041)
- Removed truncation code (#2044)
- Stopped attempting to catch fatal errors (#2045)
Scala Hadoop Enrich
- Bumped to 1.3.0
- Bumped Scala Common Enrich to 0.18.0 (#2015)
- Added Iglu Scala Client as an explicit dependency (#2115)
- Added .forceToDisk to speed up run (#859)
- Started using Scala Common Enrich's version of ScalazArgs (#2013)
Scala Hadoop Shred
- Bumped to 0.6.0
- Added .forceToDisk to common to speed up run (#2039)
- Bumped Iglu Scala Client to 0.3.1 (#2081)
- Bumped Scala Common Enrich to 0.18.0 (#2016)
- Applied truncation logic to atomic-events TSV (#2042)
- Processed enriched events for atomic.events, removing JSON fields (#1731)
- Started using Scala Common Enrich's version of ScalazArgs (#2014)
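The two atomic-events changes above (#2042 and #1731) can be illustrated with a minimal Python sketch. It is not the Scala Hadoop Shred implementation: the field list is a shortened, hypothetical layout, the column widths in MAX_LENGTHS are assumed values, and to_atomic_tsv is a made-up helper name. It only shows the general idea of dropping the JSON fields from an enriched event line and truncating the remaining values to fit the target columns.

```python
# Hypothetical, shortened field layout; the real enriched event has many more fields.
FIELDS = ["app_id", "event_id", "contexts", "page_url", "unstruct_event", "derived_contexts"]

# Fields holding JSON, which are left out of the atomic-events TSV.
JSON_FIELDS = {"contexts", "unstruct_event", "derived_contexts"}

# Assumed column widths, for illustration only.
MAX_LENGTHS = {"app_id": 255, "page_url": 4096}


def to_atomic_tsv(enriched_line):
    """Convert one enriched event TSV line into an atomic-events TSV line."""
    values = enriched_line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")

    out = []
    for name, value in zip(FIELDS, values):
        if name in JSON_FIELDS:
            continue  # drop JSON fields entirely
        limit = MAX_LENGTHS.get(name)
        # Truncate only the fields that have a known maximum width.
        out.append(value[:limit] if limit is not None else value)
    return "\t".join(out)
```

Applied to a six-field line in this layout, the helper returns three tab-separated values, with page_url cut to at most 4096 characters.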
Storage
Hadoop Elasticsearch Sink
- Added this component (#824)
StorageLoader
- Bumped to 0.6.0
- Added tcpKeepAlive=true to JDBC for long-running COPYs via NAT (#2145)
- Fixed setup guide link in README, thanks @diamondo25! (#2025)
- Loaded atomic.events from shredded folder (#1795)
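As a rough illustration of the tcpKeepAlive change (#2145), the Python sketch below builds a JDBC URL with that connection parameter set. It assumes the PostgreSQL JDBC driver's URL scheme and its tcpKeepAlive parameter; the function name, host and database are hypothetical, and StorageLoader's actual connection code may differ.

```python
from urllib.parse import urlencode


def redshift_jdbc_url(host, port, database, tcp_keep_alive=True):
    """Build a PostgreSQL-style JDBC URL with the tcpKeepAlive parameter set."""
    # tcpKeepAlive asks the driver to enable TCP keep-alive probes, so that an
    # idle connection is less likely to be dropped by a NAT during a long COPY.
    params = {"tcpKeepAlive": "true" if tcp_keep_alive else "false"}
    return f"jdbc:postgresql://{host}:{port}/{database}?{urlencode(params)}"
```

For example, redshift_jdbc_url("example.redshift.amazonaws.com", 5439, "snowplow") produces a URL ending in ?tcpKeepAlive=true.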
Postgres
Redshift
- Added migration script for 0.4.0 to 0.8.0 (#2155)
- Added migration script for 0.5.0 to 0.8.0 (#2119)
- Added migration script for 0.6.0 to 0.8.0 (#2120)
- Added migration script for 0.7.0 to 0.8.0 (#2048)
- Removed JSON fields from atomic.events (#1849)