New Features and Enhancements
- Introduces the ChaosSchedule CRD and controller to run background chaos jobs under several scheduling policies: immediately, at a specific timestamp, or between a defined start and end time. Supports both randomized and strictly scheduled execution of chaos.
- Introduces Argo-based Chaos Workflows to help users construct complex scenarios around chaos experiments, such as running benchmarks in parallel with chaos operations. The initial commits include workflows that gauge the impact of pod failures on the performance of a multi-replica nginx deployment.
- Introduces litmus-go, a repo that holds experiments and chaoslib written in Go, along with an alpha litmus-go SDK that can scaffold Go experiments complete with all artifacts, including the chaosexperiment custom resources. Also introduces litmus-python, which primarily holds chaostoolkit-based chaos experiments.
- Introduces an alpha validation webhook for Litmus to offload experiment dependency validation checks from the chaos-operator and chaos-runner components.
- Adds support for chaos on DeploymentConfig resources on OpenShift.
- Introduces the ability to insert user-defined annotations into chaos resources (chaos-runner, experiment pods) via the chaosengine.
- Adds support for user-defined, instance-specific metadata (an ID), set via a chaosengine environment variable, to record the purpose of a chaos experiment, track it, and make its chaosresult unique.
- Refactors the chaos-exporter metrics to provide aggregated, cluster-level chaos metrics with an improved naming convention.
- Introduces a suite of standard observability resources to aid visualization and monitoring of chaos experiments, covering events (heptio eventrouter-prometheus-grafana, metricbeat-elasticsearch-kibana), metrics (chaos-exporter-prometheus-grafana), and logs (promtail-loki-grafana).
- Homogenizes chaos experiments to use the LIB model to invoke chaos injection functions.
- Improves the litmus helm chart to support admin-mode installation and an optional install of chaos-exporter.
- Switches the chaos libraries from stress to stress-ng to enable broader chaos support.
- Adds helm chart testing in CI for the litmus-helm repo.
- Updates the litmus-e2e GitLab job scripts to work on on-prem Kubernetes clusters over NAT.
- Shifts to Go modules for dependency management across litmus components.
- Improves the general and troubleshooting FAQs in litmus-docs around failed chaos experiment execution.
Major Bug Fixes
- Fixes the inability to run litmus experiment containers on OpenShift due to “AnsibleError: Unable to create local directories” by generating resource manifests from jinja templates into /tmp.
- Fixes disk-fill experiment execution on Gravity Kubernetes clusters via a dynamic container data path.
- Fixes exceptions seen in chaos-operator due to missing resource permissions for replicasets.
- Fixes transient “unable to update resource” / “operation cannot be fulfilled” errors in chaos-operator.
- Fixes broken BDD tests in the chaos-runner and chaos-operator CI pipelines.
- Enforces a hard stop of the pod-delete chaos experiment at total_chaos_duration via chaos timestamp comparisons.
- Fixes the algolia-based search functionality in litmus-docs.
- Fixes the analytics count round-off issue for the operator installation and experiment run counts in the charthub.
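The pod-delete hard stop mentioned in the list above can be pictured with a small loop. This is only an illustrative sketch, not the experiment's actual code: `run_chaos`, `inject_chaos`, and `chaos_interval` are hypothetical names introduced here, and only `total_chaos_duration` comes from the notes. The idea is to compare the current timestamp against the start timestamp before each round, so no new round starts once the duration has elapsed.

```shell
#!/bin/sh
# Sketch of a duration-bounded chaos loop.
# inject_chaos is a hypothetical placeholder for one round of pod deletion;
# it is not a Litmus function.
inject_chaos() { :; }

# Usage: run_chaos TOTAL_CHAOS_DURATION CHAOS_INTERVAL (both in whole seconds).
# Prints the number of chaos rounds that were started.
run_chaos() {
  total_chaos_duration=$1
  chaos_interval=$2   # assumed name for the pause between rounds
  start=$(date +%s)
  rounds=0
  while :; do
    now=$(date +%s)
    # Hard stop: check elapsed time before every round instead of
    # counting iterations, so no round starts past the duration.
    if [ $((now - start)) -ge "$total_chaos_duration" ]; then
      break
    fi
    inject_chaos
    rounds=$((rounds + 1))
    sleep "$chaos_interval"
  done
  echo "$rounds"
}
```

Because the check uses wall-clock timestamps, a slow round shortens the remaining budget rather than extending the overall run.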
Getting Started
Prerequisites to install
- Make sure you have a healthy Kubernetes cluster.
- Kubernetes 1.12+ is installed.
Installation
kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-v1.4.0.yaml
Verify your installation
- Verify that the chaos operator is running:
kubectl get pods -n litmus
- Verify that the chaos CRDs are installed:
kubectl get crds | grep chaos
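The two checks above can also be scripted. The following is a minimal sketch under stated assumptions: the helper names `has_running_pod` and `has_chaos_crds` are hypothetical, the pod check assumes the default `kubectl get pods` column layout, and the three CRD names are an assumption about what the operator manifest installs. Both helpers read kubectl's output on stdin, so they can be tried without a cluster.

```shell
#!/bin/sh
# Succeeds if at least one pod row reports STATUS "Running".
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
has_running_pod() {
  awk 'NR > 1 && $3 == "Running" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Succeeds only if every expected CRD name appears at the start of a line.
# The CRD names below are assumptions, not taken from these notes.
has_chaos_crds() {
  crds=$(cat)
  for name in chaosengines chaosexperiments chaosresults; do
    printf '%s\n' "$crds" | grep -q "^${name}\.litmuschaos\.io" || {
      echo "missing CRD: ${name}.litmuschaos.io" >&2
      return 1
    }
  done
}

# Example use against a live cluster:
#   kubectl get pods -n litmus | has_running_pod && echo "operator pod is running"
#   kubectl get crds | has_chaos_crds && echo "chaos CRDs are installed"
```

Reading from stdin keeps the parsing separate from the cluster call, which makes the helpers easy to test with canned output.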
For more details refer to the documentation at Docs