What's Changed
Beyla 1.9.0 is released with major internal changes, in preparation for the upcoming Beyla 2.0 release.
Breaking changes 🔨
Removed override_instance_id configuration option
This option was intended only for debugging purposes.
More info: #1125
Fix instance and job in Prometheus exporter
The target_instance Prometheus attribute has been renamed to instance. Also, the job attribute has been added to Prometheus.
Now all the metrics are consistent, regardless of whether they are exported via OTEL or Prometheus.
More info: #1130
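Existing dashboards or alert rules that still reference the old label need updating after this rename. The following is a minimal sketch of one way to do that, assuming the queries are available as plain strings; `migrate_query` is a hypothetical helper, not part of Beyla, and the metric name in the usage line is only an illustration.

```python
import re


def migrate_query(promql: str) -> str:
    """Rename the target_instance label to instance in a PromQL string.

    The word-boundary match only touches the standalone label name, so
    identifiers that merely contain the substring are left unchanged.
    """
    return re.sub(r"\btarget_instance\b", "instance", promql)


old = 'sum by (target_instance) (rate(http_requests_total{target_instance="pod-a"}[5m]))'
new = migrate_query(old)
```

Review rewritten queries before deploying them, because a plain text substitution cannot tell a Beyla label from an identically named label produced by another exporter.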
Set OTEL service name and namespace from application environment variables
If the application has set the OTEL_SERVICE_NAME or OTEL_SERVICE_NAMESPACE variables in its environment,
Beyla will use them to set the reported service name and namespace.
If the variables are not set, Beyla will use the previously existing mechanism to set service name and namespace.
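The precedence described above can be sketched as follows. This is an illustration only: it assumes the environment is available as a NUL-separated block (the format of /proc/<pid>/environ on Linux), and `parse_environ`, `resolve_service` and the fallback values are hypothetical names, not Beyla's actual code.

```python
def parse_environ(raw: bytes) -> dict:
    """Parse a NUL-separated KEY=VALUE block, as found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep and key:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def resolve_service(env: dict, fallback_name: str, fallback_namespace: str = ""):
    """Prefer the OTEL_* variables; otherwise keep the previously detected values."""
    name = env.get("OTEL_SERVICE_NAME") or fallback_name
    namespace = env.get("OTEL_SERVICE_NAMESPACE") or fallback_namespace
    return name, namespace


raw = b"PATH=/usr/bin\0OTEL_SERVICE_NAME=checkout\0"
service = resolve_service(parse_environ(raw), "app", "default")
```

In this sketch each variable falls back independently, so an application that only sets OTEL_SERVICE_NAME keeps the namespace from the existing mechanism.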
Bug fixes 🐞
Fix cgroup ID parsing in newest Docker versions
More info: #1287
Fix OS capability checking
There were a few bugs in the OS capability checking, which are fixed with this PR:
- If SYS_ADMIN is present, it effectively means all capabilities.
- If the kernel is older than 5.8, SYS_ADMIN is a must, because the other capabilities weren't split off yet.
- If we have NET_ADMIN we also have NET_RAW, so we can relax that check.
More info: #1131
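The three rules above can be expressed as a small decision function. This is a sketch of the logic as described in these notes, not Beyla's implementation; the function name, the capability names as plain strings and the kernel version as a tuple are all assumptions made for illustration.

```python
def missing_capabilities(required, held, kernel):
    """Return the set of capabilities still missing, following the three rules.

    required / held: iterables of capability names without the CAP_ prefix.
    kernel: (major, minor) tuple of the running kernel version.
    """
    required, held = set(required), set(held)
    # Rule 1: SYS_ADMIN effectively grants everything.
    if "SYS_ADMIN" in held:
        return set()
    # Rule 2: before 5.8 the finer-grained capabilities did not exist yet,
    # so nothing short of SYS_ADMIN is sufficient.
    if kernel < (5, 8):
        return {"SYS_ADMIN"}
    # Rule 3: NET_ADMIN is treated as covering NET_RAW.
    if "NET_ADMIN" in held:
        required.discard("NET_RAW")
    return required - held
```

For example, a process holding only BPF and NET_ADMIN on a 6.1 kernel passes a check that requires BPF and NET_RAW, while the same process on a 5.4 kernel is reported as missing SYS_ADMIN.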
What's new
Introduce option for high volume request tracking
Beyla tracks the full request completion time. This typically means that we check whether the application keeps
responding with more data after the first HTTP response. One example is a large file download, where most of the time
is spent serializing the data on the wire. When the client uses keep-alive, we don't necessarily see the connection
close event, but new incoming requests tell us that an earlier request should be terminated.
This approach doesn't work well when there's a high volume of requests, e.g. beyond our current map sizing. The delayed
requests will likely be booted out of the map before we have a chance to complete them.
The BEYLA_BPF_HIGH_REQUEST_VOLUME configuration option forces Beyla to complete the request as soon as the response
is finished. It produces less accurate accounting for large file downloads, but it avoids losing data under a high
volume of requests.
More info: #1192
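The trade-off can be illustrated with a toy model. It is not Beyla's eBPF code: it assumes a bounded map that drops its oldest entry when full, and `simulate` with its parameters is a hypothetical name chosen for this sketch.

```python
from collections import OrderedDict


def simulate(num_connections, map_size, high_request_volume):
    """Toy model of delayed versus immediate request completion.

    Each connection sends one request, then a second one that would
    terminate the first. Returns (reported, lost) counts for the first
    round of requests.
    """
    pending = OrderedDict()  # connection id -> request awaiting completion
    reported = lost = 0

    # Phase 1: every connection gets a request and its response finishes.
    for conn in range(num_connections):
        if high_request_volume:
            reported += 1  # complete as soon as the response is finished
            continue
        if len(pending) >= map_size:
            pending.popitem(last=False)  # oldest entry is booted out of the map
            lost += 1
        pending[conn] = "awaiting next request or close"

    # Phase 2: a new request on a connection terminates the earlier one,
    # but only if that earlier request is still in the map.
    for conn in range(num_connections):
        if pending.pop(conn, None) is not None:
            reported += 1
    return reported, lost
```

With 10 connections and a map of 4 entries, the delayed mode reports 4 requests and loses 6, while the immediate mode reports all 10. What the immediate mode gives up is the time spent after the response is considered finished, which this model does not measure.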
Use scratch as the base to build the Beyla docker images
This produces smaller images and removes the risk of potential vulnerabilities in the base image.
More info: #1367
Kubernetes: no need for a privileged init container anymore
The way Beyla internally mounts and shares some eBPF data structures has changed. This removes the necessity of
giving Beyla elevated privileges, or creating a privileged init container to mount the BPF file system.
More info: #1251
Experimental: Kubernetes API cache service
⚠️ This is an experimental service intended only as a developer preview. Expect breaking changes. Make sure that the
deployed image of the cache service (grafana/beyla-k8s-cache:1.9.x) matches the version of the Beyla image.
To decorate the traces and metrics with Kubernetes metadata, each Beyla instance establishes a connection to the
Kubernetes API. On big clusters (500+ nodes, 500+ Beyla instances), this could greatly overload the
Kubernetes API because listening for cluster-global resources is really expensive.
Experimentally, you can configure Beyla to move the Kube API subscription logic to an external service (with fewer
instances), and connect Beyla to the Kubernetes API cache service instead of the Kubernetes API directly.
The easiest way to enable this service is via our latest Helm chart, in values.yml:
k8sCache:
  replicas: <typically 1 cache replica for 50 Beyla instances>
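As a rough sizing aid, the rule of thumb of one cache replica per 50 Beyla instances can be turned into a small calculation. The helper below is hypothetical and only encodes that ratio; actual needs depend on the cluster and should be validated against observed load.

```python
import math


def cache_replicas(beyla_instances: int, instances_per_replica: int = 50) -> int:
    """Suggest a k8sCache.replicas value from the number of Beyla instances."""
    if beyla_instances < 0 or instances_per_replica <= 0:
        raise ValueError("instance counts must be positive")
    # Round up and keep at least one replica so the service is always available.
    return max(1, math.ceil(beyla_instances / instances_per_replica))


suggested = cache_replicas(500)
```

For the 500-instance cluster mentioned above, this suggests 10 replicas, so the Kubernetes API would serve on the order of 10 subscribers instead of 500.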
Other changes/additions
- Add 'watch services' permission to unprivileged example by @marevers in #1126
- Deduplicate instance ids and restore target_instance in Prometheus by @mariomac in #1129
- Update OTEL collector library to v0.108.1 by @mariomac in #1133
- Helm chart: allow unprivileged deployment of Beyla by @marevers in #1128
- Update OTEL collector library to v0.108.1 (1.8 backport) by @mariomac in #1134
- Automatic update of offsets.json by @github-actions in #1136
- Docs: Fix link to 'Beyla and Kubernetes walkthrough' by @marevers in #1141
- Update rust test dependencies versions by @rafaelroquetto in #1142
- Automatic update of offsets.json by @github-actions in #1149
- Refactor to have only one Go tracer by @marctc in #1132
- Update rails test Dockerfile by @rafaelroquetto in #1148
- Add target for ARM integration tests by @rafaelroquetto in #1139
- Avoid that a Pod update removes the container metadata by @mariomac in #1156
- Add Linux Traffic Control probes for App O11y by @grcevski in #1160
- Increase buffer size to 192 to capture longer URLs by @marevers in #1150
- Process metrics dashboard by @mariomac in #1109
- Automatic update of offsets.json by @github-actions in #1163
- Propagate context through TCP packets by @grcevski in #1161
- Allow filtering by client/server in application traces by @mariomac in #1166
- Fixing Docker Generator build action by @mariomac in #1164
- feat(helm): additional labels for ServiceMonitor by @nlamirault in #1167
- Revert OTel expiration code by @grcevski in #1143
- Fix bounds check in kafka parsing by @grcevski in #1171
- Enforce clang-format for C source files by @rafaelroquetto in #1177
- Fix clang-format-check workflow file by @rafaelroquetto in #1179
- Support for RHEL 4.18 kernels by @rafaelroquetto in #1175
- Add two ports to service, daemonset and servicemonitor conditionally by @marevers in #1168
- Split eBPF load and attach for Go programs by @grcevski in #1169
- Add some default settings for beyla application metrics by @xujiaxj in #1184
- Use git-lfs to track .o files by @rafaelroquetto in #1183
- Use clang-tidy on ebpf code by @rafaelroquetto in #1180
- Automatic update of offsets.json by @github-actions in #1191
- Add clang-tidy make target by @rafaelroquetto in #1189
- Add quickstart build instructions to the README file by @rafaelroquetto in #1188
- Move bin files back to git lfs by @rafaelroquetto in #1193
- Introduce option for high volume request tracking by @rafaelroquetto in #1196
- Add workflow for checking git-lfs files by @rafaelroquetto in #1194
- Use struct with pid and Go routine addr for Go BPF maps by @marctc in #1182
- Fix linting/compilation on Darwin environments by @mariomac in #1199
- Add metrics to measure latency of k8s informer by @marctc in #1200
- Extract ReplicaSet name from pod name by @mariomac in #1202
- Try to fix unmounting of BPF FS during integration tests by @mariomac in #1205
- Remove ReplicaSet informer by @mariomac in #1204
- Use struct with pid and Go routine addr for Go BPF maps by @marctc in #1201
- Discover service names from process env vars by @grcevski in #1195
- Add option to skip ConfigMap check by @marevers in #1208
- Use only the required informers by @mariomac in #1210
- Allow configuring informer resync time by @mariomac in #1216
- Automatic update of offsets.json by @github-actions in #1220
- update helm chart to use Beyla 1.8.4 by @mariomac in #1223
- Account for deleted files in workflow files by @rafaelroquetto in #1218
- Always decorate k8s_owner_name by @mariomac in #1226
- Make EBPF tracer config visible by @mariomac in #1222
- Move already instrumented executable log messages to Debug level by @mariomac in #1227
- Automatic update of offsets.json by @github-actions in #1230
- Revert "Add metrics to measure latency of k8s informer (#1200)" by @marctc in #1214
- Revert "Add some default settings for beyla application metrics (#1184)" by @mariomac in #1231
- update helm chart version before re-releasing by @mariomac in #1233
- Don't wait for BPF unmount more than 5 seconds by @grcevski in #1238
- Rework TC context propagation to use the IP options by @grcevski in #1237
- Fix traces sampler by @grcevski in #1240
- Revert: reducing scope of informer by @mariomac in #1245
- Unify HTTP SSL, K probes and NodeJS tracer in a single tracer by @marctc in #1215
- fix flaky K8s network integration test by @mariomac in #1250
- Fix edge condition with kafka request parsing by @grcevski in #1252
- Share bpf maps internally and remove pinning / bpffs requirement by @rafaelroquetto in #1251
- Automatic update of offsets.json by @github-actions in #1257
- Better Java context propagation by @grcevski in #1260
- Update vendored dependencies & fix Darwin compilation by @mariomac in #1262
- Use informer code from beyla-k8s-cache by @mariomac in #1256
- Restoring disabled informers in beyla-k8s-cache by @mariomac in #1264
- parse sql host address and port by @esara in #1255
- Automatic update of offsets.json by @github-actions in #1268
- Docs information architecture refactor pass 1 by @grafsean in #1259
- Split tc programs from generic tracer by @rafaelroquetto in #1267
- Enabling external informer by @mariomac in #1266
- K8s integration tests: export logs before killing beyla by @mariomac in #1274
- flaky language detection test: add extra logs by @mariomac in #1275
- Replace drone by github actions for image publishing by @mariomac in #1271
- Fix wrong language detection by @mariomac in #1276
- Moving here the code from beyla-k8s-meta repository by @mariomac in #1278
- Helm chart: enable profile_port by @mariomac in #1272
- Fix network flows flaky test by @mariomac in #1282
- K8s cache: fix coverage report and add graceful stop by @mariomac in #1281
- Move K8s cache Docker publish actions here by @mariomac in #1284
- Remote K8s meta service: wait for synchronization at startup by @mariomac in #1283
- Check third-party licenses on PR by @mariomac in #1285
- K8s env vars by @grcevski in #1279
- Unblock remote cache synchronization by @mariomac in #1289
- Typo in docs by @duncan485 in #1288
- VM tests: explicit install bash by @rafaelroquetto in #1293
- Fix the return value of bpf_strstr_tp_loop when it does not meet the … by @tsint in #1294
- Try to fix build-push-to-dockerhub by @mariomac in #1292
- Fix repository in docker push action by @mariomac in #1297
- separate image builders by architecture by @mariomac in #1298
- GitHub Actions: revert separate docker publish builders by @mariomac in #1299
- GitHub Action, docker build: replace arm64 runner by amd64 runner by @mariomac in #1300
- regenerate BPF binaries after PR #1294 by @mariomac in #1295
- Fix docker image generation with LFS by @mariomac in #1301
- Fix codecov flags of VM integration tests by @mariomac in #1303
- Tune up versioned docker release scripts by @mariomac in #1302
- Add configuration options to Kube Cache service by @mariomac in #1304
- add namespace to server and peer name across namespaces by @esara in #1247
- Complete the work on TCP packet context propagation by @grcevski in #1290
- Nuke nodejs uretprobes by @rafaelroquetto in #1305
- Change kafka to use Statement instead of Othernamespace by @grcevski in #1306
- Use topic provided in the key first by @grcevski in #1307
- (Experimental) Trace context propagation via HTTP headers by @rafaelroquetto in #1291
- Add K8s metadata cache service to helm chart by @mariomac in #1296
- Update opentelemetry collector library to 0.112.0 by @mariomac in #1310
- Ensure http clients can nest under SQL for Go by @grcevski in #1308
- Fix missing store cleanup on podsByContainer by @grcevski in #1312
- Helm chart: remove duplicity of labels by @mariomac in #1313
- Add Formatting to Variable Name by @SeamusGrafana in #1316
- Helm chart: fix cache port configuration by @mariomac in #1319
- Add missing mutex in kube store functions by @marctc in #1320
- Fix potential deadlock in store.go by @mariomac in #1321
- Refactor TC code for code reuse by @grcevski in #1314
- Refactoring of the L7 CP BPF code and some other tweaks by @grcevski in #1315
- K8s store: fix access to mutex to avoid concurrent map read/write by @mariomac in #1328
- Update github workflows to use upload-artifact@v4 by @marctc in #1329
- Automatic update of offsets.json by @github-actions in #1334
- Informers cache: don't send updates for non-meaningful Pod/Service updates by @mariomac in #1330
- Helm chart: allow setting limits to beyla cache by @mariomac in #1335
- Add default excluded services to our Beyla Helm chart by @grcevski in #1332
- Fix crash on start by @mariomac in #1337
- Automatic update of offsets.json by @github-actions in #1340
- Reverting kubernetes version library by @mariomac in #1341
- Fix memory leak on kubernetes metadata store by @mariomac in #1342
- Improve k8s env parsing by @grcevski in #1339
- Fix imagepullpolicy in helm chart for k8s cache by @mariomac in #1343
- Fix crash on k8s container env parsing by @mariomac in #1344
- Automatic update of offsets.json by @github-actions in #1345
- Stop flooding cache logs on client disconnection/context cancelation by @mariomac in #1347
- Add timeout to grpc server.Send by @mariomac in #1350
- K8s cache: Improving performance of client cancellation by @mariomac in #1353
- make sure cache service connection stops on error by @mariomac in #1355
- Asynchronous synchronization of Beyla cache by @mariomac in #1358
- Enable batching for traces by @marctc in #1352
- Kube meta store: Make sure all the object metadata is deleted by @mariomac in #1359
- Some informer optimizations by @mariomac in #1360
- Revert "K8s cache: Improving performance of client cancellation (#1353)" by @mariomac in #1362
- internal instrumentation for k8s cache by @mariomac in #1365
- Use scratch for the rest of Beyla images by @marctc in #1373
- Ensure ring buf maps have sane max entries values by @rafaelroquetto in #1374
- Fix helm chart selector conflicts and update Beyla version to 1.8.8 by @mariomac in #1376
- Fix flaky unit test by @mariomac in #1378
- Helm chart: removed unneeded exposed port from Cache Service by @mariomac in #1380
- Rename Beyla cache internal metrics to use attributes by @mariomac in #1382
- Fix rounding function for max entries by @rafaelroquetto in #1381
- Fix environment variable name for configuring otel traces features by @bjor-joh in #1387
- Set instance ID from pod:container and let setting metadata from annotations by @mariomac in #1391
- Rename svc.ID to svc.Attrs by @mariomac in #1393
New Contributors
- @duncan485 made their first contribution in #1288
- @tsint made their first contribution in #1294
- @bjor-joh made their first contribution in #1387
Full Changelog: v1.8.8...v1.9.0