github cilium/tetragon v1.2.0

14 days ago

v1.2.0 Release notes

Upgrade notes

Read the upgrade notes carefully before upgrading Tetragon.
Depending on your setup, changes listed here might require a manual intervention.

Helm Values

  • The Tetragon container now uses the gRPC liveness probe by default. To continue using "tetra status" for the
    liveness probe, specify the tetragon.livenessProbe Helm value. For example:
    tetragon:
      livenessProbe:
        timeoutSeconds: 60
        exec:
          command:
          - tetra
          - status
          - --server-address
          - "54321"
          - --retries
          - "5"
  • Deprecated tetragon.skipCRDCreation Helm value is removed. Use crds.installMethod=none instead.

  • tetragon.ociHookSetup Helm value is deprecated. Use tetragon.rthooks instead.

Events (protobuf API)

  • Sensor managing methods have been deprecated:
    • ListSensors
    • EnableSensor
    • DisableSensor
    • RemoveSensor

Metrics

  • tetragon_policyfilter_metrics_total metric is renamed to tetragon_policyfilter_operations_total, and its op
    label is renamed to operation.
  • tetragon_missed_events_total metric is renamed to tetragon_bpf_missed_events_total.
  • Metrics related to ring buffer and events queue are renamed:
    • tetragon_ringbuf_perf_event_errors_total -> tetragon_observer_ringbuf_errors_total
    • tetragon_ringbuf_perf_event_received_total -> tetragon_observer_ringbuf_events_received_total
    • tetragon_ringbuf_perf_event_lost_total -> tetragon_observer_ringbuf_events_lost_total
    • tetragon_ringbuf_queue_received_total -> tetragon_observer_ringbuf_queue_events_received_total
    • tetragon_ringbuf_queue_lost_total -> tetragon_observer_ringbuf_queue_events_lost_total
  • tetragon_errors_total{type="process_cache_evicted"} metric is replaced by tetragon_process_cache_evicted_total.
  • tetragon_errors_total{type=~"process_cache_miss_on_get|process_cache_miss_on_remove"} metrics are replaced by
    tetragon_process_cache_misses_total{operation=~"get|remove"}.
  • tetragon_event_cache_<entry_type>_errors_total metrics are replaced by
    tetragon_event_cache_fetch_failures_total{entry_type="<entry_type>"}.
  • tetragon_event_cache_accesses_total metric is renamed to tetragon_event_cache_inserts_total.
  • tetragon_event_cache_retries_total metric is renamed to tetragon_event_cache_fetch_retries_total.
  • tetragon_errors_total{type="event_missing_process_info"} metric is replaced by
    tetragon_events_missing_process_info_total.
  • tetragon_errors_total{type="handler_error"} metric is removed. Use tetragon_handler_errors_total instead.
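
The one-to-one renames above can be applied mechanically to existing PromQL expressions (dashboards, alert rules). The following is a minimal sketch, not part of Tetragon: the helper names rename_metrics and needs_manual_review are hypothetical, and the mapping covers only the plain renames listed in these notes. Metrics whose labels changed or that were split out of tetragon_errors_total are only flagged, since their selectors have to be adjusted by hand.

```python
import re

# One-to-one renames copied from the Metrics upgrade notes above.
RENAMED = {
    "tetragon_policyfilter_metrics_total": "tetragon_policyfilter_operations_total",
    "tetragon_missed_events_total": "tetragon_bpf_missed_events_total",
    "tetragon_ringbuf_perf_event_errors_total": "tetragon_observer_ringbuf_errors_total",
    "tetragon_ringbuf_perf_event_received_total": "tetragon_observer_ringbuf_events_received_total",
    "tetragon_ringbuf_perf_event_lost_total": "tetragon_observer_ringbuf_events_lost_total",
    "tetragon_ringbuf_queue_received_total": "tetragon_observer_ringbuf_queue_events_received_total",
    "tetragon_ringbuf_queue_lost_total": "tetragon_observer_ringbuf_queue_events_lost_total",
    "tetragon_event_cache_accesses_total": "tetragon_event_cache_inserts_total",
    "tetragon_event_cache_retries_total": "tetragon_event_cache_fetch_retries_total",
}

# Matches any old metric name as a whole identifier.
_RENAME_RE = re.compile(r"\b(" + "|".join(re.escape(n) for n in RENAMED) + r")\b")

# Metrics whose labels or structure changed (hypothetical review list built
# from the notes above); a person has to rewrite these selectors.
_REVIEW_RE = re.compile(
    r"\b(tetragon_errors_total"
    r"|tetragon_event_cache_\w+_errors_total"
    r"|tetragon_policyfilter_(?:metrics|operations)_total)\b"
)


def rename_metrics(expr: str) -> str:
    """Replace old metric names in a PromQL expression with the v1.2.0 names."""
    return _RENAME_RE.sub(lambda m: RENAMED[m.group(1)], expr)


def needs_manual_review(expr: str) -> bool:
    """Report whether the expression uses a metric whose labels or type selector changed."""
    return _REVIEW_RE.search(expr) is not None
```

For example, rename_metrics("rate(tetragon_missed_events_total[5m])") returns "rate(tetragon_bpf_missed_events_total[5m])", while needs_manual_review('tetragon_errors_total{type="handler_error"}') returns True because that series has to be replaced by hand with tetragon_handler_errors_total.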

Major Changes:

Bugfixes:

  • bpf: use CORE for execve hook (#2399) by @kkourt
  • Don't create PodInfo if the pod is being deleted (#2431) by @michi-covalent
  • tetragon: allow namespaced and non-namespaced policies to have the same name (#2337) by @joshuajorel
  • operator: Don't start metrics server if Helm value tetragonOperator.prometheus.enabled is set to false. (#2484) by @yukinakanaka
  • enforcer: fix issue when using multiple calls with fmod_ret (#2524) by @kkourt
  • Reduce the kernel memory footprint (accounted by the cgroup memory controller) of the stack trace feature when unused. (#2546) by @mtardy
  • Reduce the kernel memory footprint (accounted by the cgroup memory controller) of the ratelimit feature when unused (around 10MB per kprobe). (#2551) by @mtardy
  • Reduce the kernel memory footprint (accounted by the cgroup memory controller) of the fdinstall feature when unused (around 11MB per kprobe). (#2563) by @mtardy
  • Do not increase the reference count when we cannot find a parent in kthreads. (#2620) by @tpapagian
  • Reduce the kernel memory footprint (accounted by the cgroup v2 memory controller) of the override feature when unused (around 3MB per kprobe). (#2692) by @mtardy
  • Fix a bug related to the matchBinaries Prefix operator by increasing the buffer size used by our dentry walk. Now the matchBinaries Prefix operator can correctly trigger a match on any path above 255 chars. (#2764) by @mtardy
  • Fix a bug where the tetra getevents command would time out even if the connection was successful. (#2765) by @mtardy
  • Fix missing cases in the compact encoder for tetra. (#2819) by @willfindlay
  • add support for pod association via cgroup id (#2776) by @kkourt
  • Allow disabling gRPC either by selecting 'enabled:false' in the helm chart or by passing an empty address to the agent (#2826) by @kkourt
  • Fix tetragon_process_cache_size metric (#2827) by @lambdanis

Minor Changes:

  • proc: set auid to -1 for generated kernel pid 0 (#2400) by @tixxdz
  • Wait for Tetragon's images to exist before running tests (#2401) by @Trung-DV
  • tetragon: Add cgroup rate support (#2177) by @olsajiri
  • oci-hook: allow users to set a list of namespace exceptions and define default (#2404) by @f1ko
  • test: fix TestTraceKernelModule test (#2433) by @tixxdz
  • tetragon: Add inline function macro (#2452) by @olsajiri
  • helm: Add tetragon.livenessProbe value (#2469) by @michi-covalent
  • tetragon: Use static funcs in few places (#2453) by @olsajiri
  • btf: print original error returned by ebpf btf.TypeByName() (#2458) by @tixxdz
  • tetragon: cache username lookups (#2448) by @tixxdz
  • helm: Remove deprecated tetragon.skipCRDCreation value (#2498) by @lambdanis
  • btf: take first entry on multiple btf validation (#2488) by @tixxdz
  • tetragon: Add LoadProgramOpts function (#2489) by @olsajiri
  • tetragon: Remove bpf_globals object (#2521) by @olsajiri
  • sensors: allow reporting policy status when loading/unloading sensors (#2506) by @kkourt
  • tetragon: Limit max entries of cgroup_rate_map when it's not used (#2555) by @olsajiri
  • tetragon: Factor the maps max entries setup (#2565) by @olsajiri
  • tetragon:username: use login name instead of display name (#2585) by @tixxdz
  • process:bpf: report euid as the process.uid (#2575) by @tixxdz
  • Implement an export filter to target parent process binary name. (#2607) by @willfindlay
  • tetragon: fail if --username-metadata receives invalid value (#2596) by @tixxdz
  • tetragon: resolve uid to username for exec events from /proc fs (#2588) by @tixxdz
  • cmd: Move metrics-docs out of tetra and refactor it (#2611) by @lambdanis
  • Reorg to factor max entries setup and add a max entries test (#2587) by @olsajiri
  • tetragon: Add debug interface to track cgroups to workload/ns mappings (#2540) by @jrfastab
  • rthooks: support NRI (#2608) by @kkourt
  • helm, doc: Added debug Helm flag for the agent (#2622) by @PhilipSchmid
  • deprecate sensors gRPC API (#2630) by @kkourt
  • helm: Don't give operator permissions to create CRDs if not needed (#2326) by @itsCheithanya
  • store thread leader namespaces at fork and reduce false positives (#2695) by @tixxdz
  • tetragon: make resolving uid to username work with a processapi struct (#2705) by @tixxdz
  • tetra: LSM events compact print support (#2703) by @anfedotoff
  • tetragon: only allow single instance to run on a node (#2747) by @inliquid
  • tetragon: Factor loader tailcall setup (#2719) by @olsajiri
  • tracing: introduce FollowChildren attribute in MatchBinaries selector (#2720) by @kkourt
  • Add missed probes metrics (#1941) by @olsajiri
  • tetragon_policyfilter_metrics_total metric is renamed to tetragon_policyfilter_operations_total, and its op label is renamed to operation. (#2784) by @lambdanis
  • tetragon: persistent monitoring fixes (#2795) by @olsajiri
  • Add the Postfix and NotPostfix operators to the matchBinaries selector. (#2689) by @anfedotoff
  • metrics: Expose go_sched_latencies_seconds (#2802) by @lambdanis
  • tetra: Added dynamic log level change option (#2643) by @PhilipSchmid
  • cgidmap: fix initialization bug (#2829) by @kkourt
  • helm: Add tetragon_pod label to metrics via ServiceMonitor (#2828) by @lambdanis
  • Expose kernel ringbuffer errors in metrics (#2839) by @lambdanis
  • Refactor & rename ringbuf metrics (#2833) by @lambdanis
  • helm: Support adding extra labels to ServiceMonitors (#2830) by @lambdanis
  • metrics: Expose more errors in tetragon_bpf_missed_events_total counter (#2855) by @lambdanis
  • Replace process cache evictions and misses metrics (#2857) by @lambdanis
  • Refactor and rename eventcache metrics (#2861) by @lambdanis
  • Replace missing process info metric (#2863) by @lambdanis
  • Remove tetragon_errors_total{type="handler_error"} metric (#2862) by @lambdanis
  • tetragon: fixes (#2823) by @olsajiri

CI Changes:

  • TestLabelsDemoApp: Replace isovalent/jobs-app by Opentelemetry demo app (#2345) by @Trung-DV
  • renovate: add v1.1 in stable branches in config (#2432) by @mtardy
  • tetragon: debugging map duplication extending prog/map testers (#2455) by @jrfastab
  • Minor improvements to the release process (#2482) by @lambdanis
  • vmtests: deduplicating code using LVH library for arm64 support (#2333) by @mtardy
  • renovate: switch to get Go version from toolchain directive (#2494) by @mtardy
  • renovate: update Go version properly for v1.1 (#2509) by @mtardy
  • renovate: fix Go postUpgradeTasks for stable branches (#2514) by @mtardy
  • renovate: group all go updates together and fix a rule (#2522) by @mtardy
  • docs: ignore some index.html link from link checker (#2526) by @mtardy
  • Increase maximum number of tries in WaitForTracingPolicy (#2547) by @tpapagian
  • Uninstall Tetragon after each e2e test. (#2541) by @tpapagian
  • docs: update docs dev deps for security fixes (#2577) by @mtardy
  • policyfilter K8s test fix (#2629) by @kkourt
  • rthook: finish renovate config update due to rename (#2655) by @mtardy
  • tests/e2e: clone proto event in rpcchecker (#2688) by @willfindlay
  • workflows: fix the PR link checker script for raw GitHub links (#2712) by @mtardy
  • Increase timeout in WaitForTracingPolicy. (#2755) by @tpapagian
  • CI: Changed lint Helm CI trigger (#2804) by @PhilipSchmid
  • CI: Improved K8s Kubeconformance validation (#2811) by @PhilipSchmid
  • CI: Helm lint: Remove pipenv dependency (#2837) by @PhilipSchmid
  • fork_test: remove pid export filter (#2831) by @kkourt
  • verify.sh: Handle when bpf_override_return is unavailable (#2838) by @russellb
  • CI: Improve/stabilize lint Helm CI workflow (#2847) by @PhilipSchmid
  • renovate: add 'make metrics-docs' to post upgrade cmds (#2864) by @mtardy

Documentation changes:

Dependency updates:

  • chore(deps): update docker.io/golangci/golangci-lint docker tag to v1.59.0 (main) (#2415) by @cilium-renovate[bot]
  • update github.com/cilium/ebpf (#2717) by @lmb
  • Update Dockerfiles to build clang-18 images (#2814) by @mtardy
  • Upgrade to Cilium 1.16.1 and Kubernetes 1.31.0 (#2820) by @mtardy
  • Revert "Upgrade to Cilium 1.16.1 and Kubernetes 1.31.0" (#2849) by @mtardy
  • chore(deps): update docker.io/golangci/golangci-lint docker tag to v1.60.3 (main) (#2813) by @cilium-renovate[bot]

Misc Changes:
