github grafana/beyla v1.9.0


What's Changed

Beyla 1.9.0 is released with major internal changes, in preparation for the upcoming Beyla 2.0 release.

Breaking changes 🔨

Removed override_instance_id configuration option

This option was intended solely for debugging purposes.

More info: #1125

Fix instance and job in Prometheus exporter

The target_instance Prometheus attribute has been renamed to instance, and the job attribute has been added to the Prometheus exporter.

Now all metrics are consistent, no matter whether they are exported via OTEL or Prometheus.
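As an illustration (the metric name and label values here are hypothetical, not taken from the release), a series exposed by the Prometheus exporter changes shape like this:

```
# before (target_instance label, no job):
http_server_request_duration_seconds_count{target_instance="10.0.0.5:8080"} 42

# after (instance and job, matching the OTEL-exported attributes):
http_server_request_duration_seconds_count{instance="10.0.0.5:8080",job="my-namespace/my-service"} 42
```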

More info: #1130

Set OTEL service name and namespace from application environment variables

If the application sets the OTEL_SERVICE_NAME or OTEL_SERVICE_NAMESPACE variables in its environment,
Beyla will use them to set the reported service name and namespace.

If those variables are not set, Beyla falls back to the previously existing mechanism to set the service name and namespace.
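For a containerized application, these variables go on the instrumented workload itself, since Beyla reads them from the target process environment. A sketch of a Kubernetes Deployment fragment (container names, image, and values are illustrative):

```yaml
# The env vars are set on the application container, not on Beyla.
containers:
  - name: checkout                 # the instrumented application (illustrative)
    image: example/checkout:latest
    env:
      - name: OTEL_SERVICE_NAME
        value: checkout
      - name: OTEL_SERVICE_NAMESPACE
        value: shop
```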

Bug fixes 🐞

Fix cgroup ID parsing in newest Docker versions

More info: #1287

Fix OS capability checking

There were a few bugs in the OS capability checking, which are fixed in this PR:

  1. If SYS_ADMIN is present, it effectively grants all capabilities.
  2. On kernels older than 5.8, SYS_ADMIN is a must, since the finer-grained capabilities weren't split off yet.
  3. If we have NET_ADMIN, we also have NET_RAW, so we can relax that check.
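In practice, on a 5.8+ kernel this means a Beyla container can request finer-grained capabilities instead of SYS_ADMIN. The set below is an illustrative sketch, not the authoritative list from the Beyla documentation:

```yaml
# Illustrative container securityContext for kernels >= 5.8
securityContext:
  capabilities:
    add:
      - BPF        # split off from SYS_ADMIN in kernel 5.8
      - PERF_MON   # split off from SYS_ADMIN in kernel 5.8
      - NET_ADMIN  # NET_RAW no longer needs to be added separately (check 3)
```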

More info: #1131

What's new

Introduce option for high volume request tracking

Beyla tracks the full request completion time, which typically means we check whether the application responds
with more data after the first HTTP response. One example is a large file download, where the majority of the time
is actually spent serializing the data on the wire. When the client uses keep-alive, we don't necessarily see the
connection close event, but newly pushed requests tell us that an earlier request should be terminated.

This approach doesn't work well when there's a high volume of requests, e.g. beyond our current map sizing. The delayed
requests will likely be evicted from the map before we have a chance to complete them.

The BEYLA_BPF_HIGH_REQUEST_VOLUME configuration option forces Beyla to complete the request as soon as the response
is finished. It produces less accurate accounting for large file downloads, but it avoids losing data under a high
volume of requests.
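A minimal way to turn this on is via the environment variable named above (the final echo just confirms the setting; the launch command is illustrative):

```shell
# Enable early request completion for high-volume services.
export BEYLA_BPF_HIGH_REQUEST_VOLUME=true

# Beyla would then be launched as usual, e.g. (illustrative):
# beyla --config /etc/beyla/config.yml

echo "BEYLA_BPF_HIGH_REQUEST_VOLUME=$BEYLA_BPF_HIGH_REQUEST_VOLUME"
```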

More info: #1192

Use scratch as the base to build the Beyla docker images

It produces smaller images and removes the risk of potential vulnerabilities in the base image.
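Conceptually, this is a static-binary multi-stage build; the Dockerfile below is an illustrative sketch (stage names, Go version, and paths are assumptions, not the project's actual Dockerfile):

```dockerfile
# Build a static binary so it can run on scratch, which ships no libc.
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /beyla ./cmd/beyla

# The final image contains only the binary: smaller, and nothing else to patch.
FROM scratch
COPY --from=builder /beyla /beyla
ENTRYPOINT ["/beyla"]
```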

More info: #1367

Kubernetes: no need for a privileged init container anymore

The way Beyla internally mounts and shares some eBPF data structures has changed. This removes the need to
give Beyla elevated privileges or to create a privileged init container to mount the BPF file system.

More info: #1251

Experimental: Kubernetes API cache service

⚠️ This is an experimental service intended only for developer preview. Expect breaking changes. Make sure that the
deployed image of the cache service (grafana/beyla-k8s-cache:1.9.x) matches the version of the Beyla image.

To decorate traces and metrics with Kubernetes metadata, each Beyla instance establishes a connection to the
Kubernetes API. On big clusters (500+ nodes, 500+ Beyla instances), this can greatly overload the
Kubernetes API server, because listening for cluster-global resources is really expensive.

Experimentally, you can configure Beyla to move the Kube API subscription logic to an external service (with fewer
instances), and connect Beyla to the Kubernetes API cache service instead of the Kubernetes API directly.

The easiest way to enable this service is via our latest Helm chart, in values.yml:

```yaml
k8sCache:
  replicas: <typically 1 cache replica for 50 Beyla instances>
```
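Assuming Beyla is installed from Grafana's Helm chart repository, applying the values could look like the following (the chart name and repository URL are stated as assumptions; verify them against the chart's documentation):

```shell
# Hedged example: verify chart name and repo against the Beyla Helm docs.
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
helm upgrade --install beyla grafana/beyla -f values.yml
```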

Other changes/additions

New Contributors

Full Changelog: v1.8.8...v1.9.0
