github robusta-dev/robusta 0.10.0

2 years ago

Overview

This release makes it easier than ever to monitor Kubernetes changes and remediate Prometheus alerts automatically.

Conceptually, every Robusta automation has three parts:

  • Triggers that identify a problem, like a crashing pod
  • Actions that gather data about the problem (e.g. fetch logs) or fix it automatically (e.g. restart a pod)
  • Sinks that send notifications to Slack, MS Teams, and other destinations

This release focuses on adding new triggers and actions. In the coming weeks, we will focus on adding new sinks as well.
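As a rough illustration, the three parts map onto Robusta's Helm values along these lines. This is a sketch under stated assumptions, not a complete configuration: the trigger and action are the ones used in the OOM Kill example in these notes, while the sink name, Slack channel, and API key are placeholders.

```yaml
# Sketch only: the sink name, channel, and API key are placeholders.
customPlaybooks:
- triggers:               # 1. identify a problem
  - on_pod_oom_killed: {}
  actions:                # 2. gather data about the problem
  - pod_graph_enricher:
      resource_type: Memory
      display_limits: true

sinksConfig:              # 3. send the result to a destination
- slack_sink:
    name: main_slack_sink
    slack_channel: my-alerts-channel
    api_key: "REPLACE_WITH_SLACK_API_KEY"
```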

What's New

Run Kubernetes jobs in response to alerts

You can now create a Kubernetes job whenever a specific Prometheus alert fires. After the job is created, you will receive a notification like the following:

[Image: run kubernetes job on prometheus alert]
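A playbook for this might look roughly like the sketch below. It assumes the job-creating action is alert_handling_job and that it accepts image, command, and notify parameters; check the Robusta documentation for your version before relying on these names. The alert name, image, and command are placeholders.

```yaml
# Sketch only: verify the action name and its parameters against the Robusta docs.
customPlaybooks:
- triggers:
  - on_prometheus_alert:
      alert_name: KubePodCrashLooping   # placeholder alert name
  actions:
  - alert_handling_job:
      image: busybox                    # placeholder image
      # placeholder command run inside the job's container
      command:
      - sh
      - -c
      - "echo handling alert"
      # assumed to enable the notification sent after the job is created
      notify: true
```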

Enrich OOM Kills with extra data

Jump start your investigation of OOM Kills with extra data right in your messaging app:

[Image: why do oom kills happen on kubernetes]

By default, this also sends graphs of memory usage for easier troubleshooting:

[Image: kubernetes pod memory]

Finally, we've added a new Robusta trigger on_pod_oom_killed for custom automations. For example:

customPlaybooks:
- triggers:
  - on_pod_oom_killed: {}
  actions:
  - pod_graph_enricher:
      resource_type: Memory
      display_limits: true

Automate the response to failed Kubernetes jobs

This implements a widely requested feature: a new trigger for failing Kubernetes jobs. You can use it to send a notification whenever specific jobs fail, or to take automated actions.

customPlaybooks:
- triggers:
  - on_job_failure:
      namespace_prefix: robusta
  actions:
  - create_finding:
      title: "Job $name on namespace $namespace failed"
      aggregation_key: "Job Failure"
  - job_events_enricher: {}

The example above also uses the new create_finding action, which lets you customize the message in Robusta notifications.

Launch self-hosting beta

The Robusta SaaS platform is now available for self-hosting via our commercial plans. Contact support@robusta.dev if you're interested.

As always, the open source version of Robusta can be used without the SaaS platform, in which case everything already runs in-cluster.

Most users should continue to use the cloud version of the Robusta UI instead of self-hosting.

Support for additional community requests

We've added a --dry-run flag to robusta playbooks trigger in response to a request by Subramanyeswara Bhavirisetty.

We've also added support for running debug pods as specific service accounts in response to a request by @SamAlex0808.

Friendly reminder: we love hearing from users! Let us know what you like and what we can improve.

Breaking changes

This isn't new, but Robusta can fetch and render graphs from Grafana in response to events in your cluster. For example, 15 minutes after a new deployment is rolled out, you can send a report with important graphs to your messaging app.

Supporting this feature adds some memory and CPU overhead even when it is unused. Therefore, it is now disabled by default.

If you use this feature, simply set grafanaRenderer.enableContainer: true in your Helm values.
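In a Helm values file, that dotted key corresponds to the nested YAML below. This is a minimal fragment to merge into your existing values, not a complete file.

```yaml
# Re-enable the Grafana renderer container, which is now off by default.
grafanaRenderer:
  enableContainer: true
```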

ADOPTERS.md

We will soon be adding an ADOPTERS.md file listing Robusta users. Let us know if we can list your company there!

Additional Changes

Full Changelog: 0.9.17...0.10.0
