Release Notes for Chaos Genius 0.1.3
- What's New
- New Connector(s)
- New Features
- Bug Fixes
✨ What's New?
We're excited to announce the release of the new and improved Chaos Genius 0.1.3. We want to sincerely thank our early users for their feedback (shout-out to @gxu-kangaroo, @davidhayter-karhoo, @mvaerle, @nitsujri, @miike, @coindcx-gh) and all our contributors for their relentless efforts towards improving Chaos Genius.
Chaos Genius 0.1.3 focuses on improving the onboarding process and on compatibility with large datasets and varied sub-population types.
Key highlights:
- Amazon Redshift Integration
- Global Configuration to support large datasets (aggregated views up to 10M rows) and varied sub-populations (1-250 subgroups)
- Optimized data fetching for large datasets
- Improved Anomaly Detection via handling of missing data points in time series, a higher number of drill-downs, higher cardinality support (1000+) and enhancements to the Anomaly Detector configuration
- DeepDrills bug fixes
- Improved logging
- Other bug fixes
🔌 New Connector(s)
With the 0.1.3 release, Chaos Genius now supports Amazon Redshift as a data source. With this, Chaos Genius works with the three major data warehouses: Snowflake, BigQuery and Amazon Redshift.
Please find the documentation for Redshift here.
We will soon release public data sets on Redshift for our community to test out!
- Add the redshift connector (#348)
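For reference, connecting to Redshift from Python via the sqlalchemy-redshift dialect typically looks like the sketch below. The cluster endpoint, database and credentials are placeholders, and this is not the Chaos Genius connector's internal API.

```python
from sqlalchemy import create_engine, text

# Placeholder Redshift endpoint and credentials; Redshift listens on port 5439 by default.
engine = create_engine(
    "redshift+psycopg2://user:password@example-cluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev"
)

# Run a trivial query to verify the connection works.
with engine.connect() as conn:
    print(conn.execute(text("SELECT current_database()")).fetchone())
```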
🎉 New Features
Global Configuration to support large datasets & varied sub-group characteristics
Using a global configuration setting, Chaos Genius can now support aggregated views of up to 10M rows and varied sub-group characteristics (1-250+ subgroups). This enables config control over the statistical filtering calculations carried out while running both DeepDrills and Anomaly Detection at the sub-group level; a rough sketch of such settings follows the PR list below.
The Chaos Genius team will be happy to help you set up this configuration.
- Fine Grained control on Anomaly Detection for different series_type (#324)
- Add support for subgroup calculation global config in anomaly detection core (#341)
- Make population calculation & statistical filtering parameters globally configurable (#340)
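To give a flavour of what these knobs control, here is a purely hypothetical sketch; the actual parameter names and defaults live in Chaos Genius's global configuration and may differ.

```python
# Hypothetical global analytics settings (names are illustrative, not the real config keys).
ANALYTICS_CONFIG = {
    "max_rows_in_aggregated_view": 10_000_000,   # aggregated views up to ~10M rows
    "max_subgroups_per_dimension": 250,          # 1-250+ subgroups
    "min_subgroup_population_fraction": 0.01,    # statistical filter on very small subgroups
}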
Anomaly Detection Enhancements
Missing Data in Time Series
Handling missing data points in time-series analysis is a hairy problem. Chaos Genius 0.1.3 now treats missing data points as zero when plotting time-series graphs and identifying anomalies, as sketched below. We will continue to invest more deeply in this area going forward, for instance by adding alerts for missing data that certain algorithms might otherwise leave undetected.
- Handle completely missing data in time series as zero (#367)
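A minimal sketch of the general idea using pandas; the data and column names are illustrative, not the internal implementation.

```python
import pandas as pd

# A daily series with a gap on 2021-10-03.
ts = pd.DataFrame(
    {"date": pd.to_datetime(["2021-10-01", "2021-10-02", "2021-10-04"]),
     "value": [120, 95, 110]}
).set_index("date")

# Reindex onto a complete daily range; missing dates become NaN, then zero.
full_range = pd.date_range(ts.index.min(), ts.index.max(), freq="D")
ts = ts.reindex(full_range).fillna(0)
print(ts)
```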
Higher Cardinality Support for Dimension Definitions
We've further optimized subgroup time-series creation to handle higher-cardinality dimensions and now support dimensions with 1000+ distinct values. Previously, high-cardinality dimensions were excluded from the analysis. We'll continue to optimize this over upcoming releases.
- Refactor anomaly detection subgroup detection to handle higher cardinality (#350)
Higher Number of Drill-downs in Anomaly Detection
While investigating anomalies via drill-downs, Chaos Genius now shows the 10 most relevant sub-groups sorted by relevance (a mix of anomaly severity and sub-group population); this number is configurable. We also upgraded the algorithms used to create these sub-groups; a rough sketch of the ranking idea follows the PR list below.
Going forward, we will expose this via configuration and also enable multi-dimensional drill-downs to help you detect the top drivers behind anomalies in your time series.
- Enable support for a higher number of drill downs (#319)
- Create new algorithm for subgroup list generation (#351)
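The ranking described above can be pictured roughly like this; the scoring formula and weights are assumptions for illustration, not Chaos Genius's actual algorithm.

```python
TOP_N = 10  # configurable, as described above

# Illustrative subgroup summaries: anomaly severity and share of the overall population.
subgroups = [
    {"name": "country=US", "severity": 0.90, "population": 0.40},
    {"name": "device=ios", "severity": 0.70, "population": 0.25},
    {"name": "plan=free",  "severity": 0.95, "population": 0.05},
]

def relevance(sg):
    # Weighted mix of how anomalous the subgroup is and how much data it covers (weights assumed).
    return 0.7 * sg["severity"] + 0.3 * sg["population"]

top = sorted(subgroups, key=relevance, reverse=True)[:TOP_N]
```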
Support for Multivariate Subdimensional Groups
Chaos Genius can now detect anomalies on multivariate subdimensional groups that are mutually exclusive. All possible combinations of the selected dimensions are generated and statistically filtered based on population characteristics; anomaly detection and drill-downs are now available for these as an option (see the sketch below).
We'll continue investing in subdimensional anomaly detection including clustering & grouping for subdimensions that behave alike.
- Configurable sub-dimension settings for anomaly detection (#349)
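A minimal sketch of enumerating multi-dimensional subgroups from a set of selected dimensions, before any statistical filtering; the dimensions and values are placeholders.

```python
from itertools import combinations, product

# Placeholder dimensions and their values.
dimensions = {
    "country": ["US", "IN"],
    "device": ["ios", "android"],
}

# Build every subgroup definition: each size of dimension combination, each value combination.
subgroups = []
for r in range(1, len(dimensions) + 1):
    for dims in combinations(dimensions, r):
        for values in product(*(dimensions[d] for d in dims)):
            subgroups.append(dict(zip(dims, values)))

# e.g. {'country': 'US'}, {'device': 'ios'}, {'country': 'US', 'device': 'ios'}, ...
```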
Improved UX for Anomaly Settings for hourly time-series
Chaos Genius now offers improved UX for setting anomaly detection configuration for hourly time-series.
You can now specify the historical training data in days instead of units of the time-series frequency, e.g. 7 days instead of 168 hours if you want to train on the last week of hourly data :)
- Set anomaly period's value in days, irrespective of frequency (#336)
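Conceptually, the translation from days to series points is just a multiplication by the number of points per day; a minimal illustration (the function and its name are ours, not part of Chaos Genius):

```python
def training_points(days: int, frequency: str) -> int:
    # Convert a training window given in days into points for the series frequency.
    points_per_day = {"D": 1, "H": 24}
    return days * points_per_day[frequency]

training_points(7, "H")  # 168 points for a week of hourly data
```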
Optimized Data Fetching for Large Datasets
In this release, we've optimized fetching of large datasets by adding a chunk size specification. Data is fetched in chunks (the chunk size is currently set to 50,000 rows) and then merged into a single dataframe, along the lines of the sketch below.
- Benchmark & enable chunk size for pandas data fetching (#332)
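A minimal sketch of chunked fetching with pandas; the connection string and query are placeholders, and 50,000 mirrors the chunk size mentioned above.

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection and query.
engine = create_engine("postgresql://user:password@host:5432/analytics")

# Fetch the result set in 50,000-row chunks, then merge into one DataFrame.
chunks = pd.read_sql("SELECT * FROM orders", engine, chunksize=50_000)
df = pd.concat(chunks, ignore_index=True)
```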
Enhanced Logging
We're working extensively on improving logging in Chaos Genius. In this release, we've centralized logging, added an option to ship logs to Fluentd for persistence, and now include data params in the logs to help identify edge cases where the analytics might fail to run (see the sketch after this list).
- Centralized logging and spawning of loggers throughout the flask app (#313)
- Fluentd for persistence
- Data params passed in logs for easier replication of edge case issues
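A small illustrative sketch of centralized logger creation with data params attached to log records; the function, handler and field names are assumptions, not the actual Chaos Genius setup.

```python
import logging

def get_logger(name: str) -> logging.Logger:
    # Hand out a logger configured once, so every module logs through the same setup.
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger

logger = get_logger("anomaly")
# Data params are attached to the log record via `extra`, so a structured
# handler (e.g. one shipping to Fluentd) could forward them as fields.
logger.info("training run started", extra={"kpi_id": 42, "frequency": "H"})
```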
In subsequent releases, we plan to surface the status of all tasks.
Other enhancements
- Added nginx-based front-end deployment
- Update docker-compose for 0.1.3 release (#419)
- Global configuration for multidimensional drill-downs
- Make multidimensional drill down to be configurable for DeepDrills (#369)
- Improved Error Message copy in UI
🐛 Bug Fixes
- DeepDrill UI fixes
- Count & size columns in the DeepDrills table are swapped (#310)
- Anomaly interface fixes
- Snowflake metadata ingestion issue raised by Grant Xu
- Using Snowflake timestamp when casted can create issues while adding KPI (#320)
- Handle KPIs that have only one subgroup
- UnboundLocalError when a KPI has only one subgroup (#342)
- Handle edge cases in data with multiple frequencies
- Fix the edit functionality for data sources
- Data Source isn't being updated properly (#308)
- Other UI fixes
- Modified timestamp isn't coming in the alert (#309)
- Fixes to improve handling of missing or incomplete data
- Fixes to handle anomaly & DeepDrill edge cases
- Incorrect confidence intervals for anomaly detection after the first training session (#388)
- Inconsistent analytics occurs between hourly panel metrics, DeepDrill & anomaly data (#390)
- Wrong Start Dates for Anomaly (#399)
- Anomaly training not till expected end date (#400)
- Missing data point in DeepDrills when we have missing data (#413)
- Inconsistent Last Updated between Anomaly and DeepDrills (#411)
- Validation sequence logic & platform update for KPI addition