Release v1.23.0
The v1.23.0 release of the Netdata Agent is all about unlocking new depths of visibility for your applications, services, and systems. We have Kubernetes service discovery, new eBPF metrics like virtual filesystem switch and bandwidth per process out of the Linux kernel at event frequency, more interoperability with your monitoring stack thanks to a new exporting engine, and much more.
This release contains 2 new collectors, 1 new exporting connector, 1 new alarm notification method, 55 improvements, 45 documentation updates, and 40 bug fixes.
At a glance
Our service discovery collector detects Kubernetes (k8s) pods and immediately collects metrics from 22 different services as the associated pods are created, destroyed, and scaled. Service discovery is installed when you use our Helm chart, which means you can now collect and visualize service-, pod-, Kubelet-, kube-proxy-, and node-level k8s metrics with one helm install command and zero configuration. All our Kubernetes monitoring components are open source and free for clusters of any size.
Our low-level Linux kernel monitoring via eBPF is now supercharged. Thanks to an integration with apps.plugin, you can now monitor how a specific application interacts with the Linux kernel. This update also includes new metrics, such as virtual filesystem switch and bandwidth per process, among others. Netdata collects these metrics at event frequency, which is even finer than our famous 1s granularity, so you can debug applications or anomalies with pinpoint accuracy. The eBPF collector is also now installed and enabled by default, except on static builds.
Read our guide on troubleshooting apps with eBPF metrics for more details.
Netdata is now more interoperable with your existing monitoring stack thanks to the exporting engine, which replaces the backends system. You can now export to multiple external databases through Graphite, Google Cloud Pub/Sub, Prometheus remote write, MongoDB, and JSON connectors, plus others. Send metrics as soon as they're collected to enrich single pane of glass views or analyze Netdata's metrics with machine learning.
Read our guide on exporting metrics to Graphite for specifics on just one of many pipelines you can set up to archive your Netdata metrics.
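To make the Graphite pipeline a little more concrete, here is a minimal sketch of the line format used by Graphite's plaintext protocol, where each sample is sent as a metric path, a value, and a Unix timestamp. The helper name `graphite_line` and the metric path in the example are hypothetical and are not taken from the Agent's exporting engine; the naming Netdata actually uses may differ.

```python
import time
from typing import Optional


def graphite_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    # Graphite's plaintext protocol expects "<path> <value> <timestamp>"
    # terminated by a newline, one sample per line.
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


# Hypothetical metric path, for illustration only.
sample = graphite_line("netdata.myhost.system.cpu.user", 1.5, 1590000000)
print(sample, end="")
```

A receiver listening for plaintext metrics would read one such line per sample.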
We're also releasing an improvement for the availability of your monitoring and metrics: persistent metadata. The Agent now writes metadata to disk alongside metrics to allow access to non-active charts from Netdata Cloud and enable future features.
We added some enhancements to our documentation site, including a new guides section. We'll continue to populate it with more use case- and scenario-based content to help you monitor, troubleshoot, visualize, and export your Netdata metrics.
Acknowledgments
- okias for adding support for Matrix notifications.
- elelayan for adding an OSD size collection chart to the Ceph collector.
- vsc55 for fixing the required packages for Gentoo builds.
- rushikeshjadhav for fixing the Xenstat collector to correctly track the last number of vCPUs.
- Saruspete for removing conflicting EPEL packages.
- MrFreezeex for fixing suid bits in Debian packaging.
- Neamar for fixing a typo in the dashboard's description of the mem.kernel chart.
- jeffgdotorg for fixing incorrectly formatted TYPE lines in the Prometheus backend/exporter.
- tnyeanderson for continuing to improve his dash.html custom dashboard.
- dpsy4 for fixing our Swagger API file.
- araemo for fixing alarms around RAM usage in ZFS systems.
- slavaGanzin for implementing a fix to the PostgreSQL collector.
- pkrasam, thoggs, oneoneonepig, Steve8291, stephenrauch, waybeforenow, zvarnes, electropup42, cherouvim, thenktor, webash and gruentee for contributing documentation changes.
Improvements
- Added libuv thread names support to FATAL log level. (#9382) by mfundul
- Updated the React dashboard to v1.0.14_2. (#9350) by jacekkolasa
- Improved PR guidelines for developers and contributors. (#8809) by prologic
- Removed master-slave verbiage and replaced it with parent-child. (#9323) by amoss, (#9312) by joelhans
- Added support for persistent metadata. (#9324) by stelfrag
- Added verbose prints when the spawn server fails to spawn. (#9305) by mfundul
- Updated the streaming protocol to calculate clock-slew and gap-size when child nodes reconnect to a parent. (#9214) by amoss
- Implemented a new incremental parser for internal plugins and child nodes. (#9074) by stelfrag
- Improved database engine by reducing its minimum size to 64 MiB. (#9094) by mfundul
- Added alphabetical sort and automatic scroll to dash.html. (#8762) by tnyeanderson
- Added a spawn server to improve Agent scalability by reducing the impact of alarm execution and notification on critical sections in the main health thread. (#8407) by mfundul
Netdata Cloud
- Added metrics for ACLK performance and status to the Netdata Monitoring section of the dashboard. (#9269) by underhood
- Improved the node re-claiming process by regenerating the topic base. (#9044) by amoss
Collectors
- Updated the Go orchestrator to v0.19.2. (#9340) by ilyam8
- Added the agent-service-discovery collector plugin to apps_groups.conf. (#9315) by ilyam8
- Improved consistency of Kubernetes cgroup names. (#9303) by cakrit
- Updated the Go orchestrator to v0.19.1. (#9309) by ilyam8
- Added imunify and lsphp to apps_groups.conf. (#9284) by thiagoftsm
- Updated the Go orchestrator to v0.19.0. (#9294) by ilyam8
- Added support for the eBPF collector in static installations (kickstart-static64.sh). (#8879) by prologic
- Updated the eBPF kernel-collector to v0.4.0. See the changelog for details. (#9212) by Ferroin
- Added integration between ebpf.plugin and apps.plugin. (#9178) by thiagoftsm
- Converted the eBPF collector into a modular design to allow multiple eBPF programs to run in parallel. (#9148) by thiagoftsm
- Added an OSD size collection chart to the Ceph collector. (#8649) by elelayan
- Updated the eBPF kernel-collector to v0.2.0. See the changelog for details. (#9118) by prologic
- Improved system-info.sh to better handle certain cases when gathering info on the system's disk capacity. (#7902) by Ferroin
- Changed the eBPF collector to install and enable it by default. (#8665) by Ferroin
- Enhanced the Samba collector to only use sudo when not running as the root user. (#9038) by Duffyx
- Renamed the eBPF collector from ebpf_process.plugin to ebpf.plugin. (#8822) by thiagoftsm
- Added more command line options to the eBPF collector to support upcoming features. (#8879) by thiagoftsm
- Added compatibility for Varnish Cache Plus in the varnish collector. (#8940) by pgjavier
Packaging/installation
- Added new streaming files into CMake build. (#9316) by underhood
- Added support for macOS/Homebrew in install-required-packages.sh. (#8286) by Ferroin
- Improved reliability of checksums for kickstart.sh/kickstart-static64.sh installation scripts. (#9165) by prologic
- Added required bundle for libuuid on ClearLinux. (#9060) by Ferroin
- Removed conflicting EPEL packages. (#9108) by Saruspete
Exporting
- Moved nc backend to exporting. (#9030) by thiagoftsm
- Added missing checks to exporting engine. (#9034) by thiagoftsm
- Added new alarms for exporting engine resource usage and deprecation of backends. (#9075) by thiagoftsm
- Added an error report to the AWS Kinesis connector. (#9048) by thiagoftsm
- Added memory cleanup to remaining exporting connectors. (#9098) by thiagoftsm
- Added a warning if the exporting engine's update interval is not a multiple of the database's update interval. (#9131) by vlvkobal
- Added anonymous statistics to exporting engine to collect usage data. (#9125) by vlvkobal
- Improved dynamic memory cleanup for Pub/Sub exporting connector. (#9112) by vlvkobal
- Improved dynamic memory cleanup for the MongoDB exporting connector. (#9103) by vlvkobal
- Finalized the main cleanup function for the exporting engine. (#9099) by vlvkobal
- Added a function to help clean up memory on exit. (#9081) by vlvkobal
- Added a Google Cloud Pub/Sub connector to the exporting engine. (#8855) by vlvkobal
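The warning about update intervals in the list above comes down to a divisibility check. The following is only a sketch of that idea, not the engine's actual code; the function name `interval_is_aligned` and its parameters are hypothetical.

```python
def interval_is_aligned(exporting_every: int, db_every: int) -> bool:
    # Both intervals are in seconds and must be positive.
    if exporting_every <= 0 or db_every <= 0:
        raise ValueError("intervals must be positive integers")
    # The exporting interval is aligned when it is a whole multiple
    # of the database's update interval.
    return exporting_every % db_every == 0


# Exporting every 10s with a 1s database interval is aligned;
# exporting every 5s with a 2s database interval is not.
for exporting_every, db_every in [(10, 1), (5, 2)]:
    if not interval_is_aligned(exporting_every, db_every):
        print(f"warning: {exporting_every}s is not a multiple of {db_every}s")
```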
Notifications
CI/CD
- Removed Gentoo from CI checks. (#9327) by prologic
- Added a random offset to the update script when running non-interactively. (#9245) by Ferroin
- Added a CI check for building against LibreSSL. (#9216) by prologic
- Added a health check functionality to Docker images. (#9172) by Ferroin
- Added CI for static builds of the Netdata Agent (used by kickstart-static64.sh). (#9130) by prologic
- Removed deprecated documentation Dockerfile and associated Docker Hub image. (#9126) by prologic
- Removed deprecated documentation tooling. (#8783) by prologic
- Added a CI job to check Markdown links during PRs. (#9003) by joelhans
- Removed Polyverse Polymorphic Linux from Docker builds to reduce the image size. (#8802) by Ferroin
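As an illustration of the random offset mentioned in the list above, the sketch below delays a non-interactive run by a random number of seconds so that many nodes do not start updating at the same moment. It is not the updater's actual logic: the function name `update_delay_seconds` and the one-hour default maximum are assumptions made for this example.

```python
import random
import sys


def update_delay_seconds(interactive: bool, max_offset: int = 3600) -> int:
    # Interactive runs start immediately; the user is waiting.
    if interactive:
        return 0
    # Non-interactive runs (for example, from a scheduler) wait for a
    # random offset in [0, max_offset]. The default of 3600 seconds is
    # a hypothetical value chosen for this sketch.
    return random.randint(0, max_offset)


# Treat the run as interactive when standard input is a terminal.
delay = update_delay_seconds(sys.stdin.isatty())
print(f"would sleep {delay}s before updating")
```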
Documentation
- Fixed a typo in the Synology installation documentation. (#9400) by pkrasam
- Added a guide for troubleshooting with eBPF metrics. (#9352) by joelhans
- Improved the FreeBSD installation documentation. (#9116) by thoggs
- Added a missing slash to the claiming documentation. (#9257) by oneoneonepig
- Changed the recommended repository for CentOS 8 users. (#9308) by Ferroin
- Added a guide for exporting metrics to Graphite. (#9285) by joelhans
- Added a link in the eBPF documentation to the kernel documentation for ftrace. (#9211) by Steve8291
- Fixed curly to straight apostrophe. (#8723) by zack-shoylev
- Added documentation and dashboard information for new eBPF-apps.plugin integration. (#9199) by thiagoftsm
- Moved and refactored docs to accommodate new Guides section on Learn. (#9266) by joelhans
- Removed outdated information/links from main README and registry doc. (#9265) by joelhans
- Added notes/known issues section to installation page. (#9053) by joelhans
- Fixed ambiguity in health reference for the "of" and "foreach" options in lookup line. (#9255) by underhood
- Added a new "home base" document for the exporting engine. (#9246) by joelhans
- Improved database engine documentation for streaming setups. (#9177) by joelhans
- Fixed typo in eBPF collector README.md. (#9205) by Steve8291
- Fixed typo in README.md. (#9151) by stephenrauch
- Removed the "experimental" label from the exporting engine documentation. (#9171) by vlvkobal
- Fixed typo in step 3 of step-by-step guide. (#9150) by waybeforenow
- Added a Certbot troubleshooting section to step 10 of the step-by-step guide. (#9000) by Jelmerrevers
- Updated eBPF documentation to reflect default enabled status. (#9105) by joelhans
- Added ACLK connection details. (#9047) by zack-shoylev
- Added CMake to the list of packages to install on FreeBSD installations. (#9031) by zvarnes
- Improved Synology installation document with better formatting and instructions. (#8658) by thenktor
- Updated pfSense installation document with new packages and processes. (#8544) by electropup42
- Updated documentation contributing guidelines and Netdata style guide. (#8781) by joelhans
- Added links to promote database engine calculator. (#9067) by joelhans
- Updated exporting engine documentation to prepare for enabling it by default. (#9066) by vlvkobal
- Added requirements to the ProxySQL collector documentation. (#9071) by ilyam8
- Added proc.plugin configuration example for high-processor systems. (#9062) by joelhans
- Added frontmatter for exporting connectors. (#9052) by joelhans
- Fixed grammar error in HAProxy documentation. (#8703) by cherouvim
- Updated FreeBSD package installation documentation. (#8643) by thenktor
- Fixed docker run instruction in claiming document. (#9058) by ilyam8
- Added a note about restarting a node during reclaiming. (#9049) by zack-shoylev
- Removed mentions of old Cloud and replaced them with new Cloud/dashboard. (#8874) by joelhans
- Fixed broken link in web server log guide on GitHub. (#9033) by joelhans
- Removed emoji from step-by-step guide. (#8872) by MeganBishopMoore
- Added text to claiming documentation about reclaiming. (#9027) by joelhans
- Updated daemon output with new URLs and dates. (#8965) by joelhans
- Added netdatalib and netdatacache volumes to the Docker-with-Caddy documentation. (#8999) by webash
- Fixed an incorrect file name in the Go-based web log collector. (#8964) by gruentee
- Removed incorrect UNUSED from flood protection configuration options documentation. (#8964) by mfundul
- Fixed internal links and removed obsolete admonitions. (#8946) by joelhans
- Updated docs with go-live claiming and ACLK information. (#8960) by joelhans
Bug fixes
- Fixed a Coverity defect. (#9402) by amoss
- Fixed a bug in the simple exporting connector that caused crashes when both opentsdb:https and another connector were enabled together. (#9389) by vlvkobal
- Fixed missing host variables on stream. (#9396) by thiagoftsm
- Fixed race-hazard in streaming during the shutdown sequence. (#9370) by amoss
- Fixed error handling and recovery during compaction and metadata log replay. (#9354) by stelfrag
- Fixed ACLK shutdown sequence. (#9367) by underhood
- Fixed logging by replacing assert() calls with the new fatal_assert(). (#9349) by mfundul
- Fixed issues with CentOS 6 installations by getting Netdata execution path early to avoid user permission issues. (#9339) by mfundul
- Fixed issues with ebpf.plugin and apps.plugin integration. (#9333) by thiagoftsm
- Fixed Coverity warnings in database. (#9338) by mfundul
- Fixed compiler warnings from the database when the Agent is compiled with the --disable-cloud flag. (#9337) by stelfrag
- Fixed invalid memory access in databases to avoid Coverity errors. (#9326) by stelfrag
- Fixed broken updates caused by enabling the eBPF collector by default, using a dummy --enable-ebpf flag. (#9310) by Ferroin
- Fixed exporting to Cortex by adding an additional HTTP header to the Prometheus remote write connector. (#9302) by vlvkobal
- Fixed a race hazard causing crashes in streaming configurations. (#9297) by amoss
- Fixed handling of OpenSSL on CentOS/RHEL by bundling a static copy and selecting a configuration directory at install time. (#9263) by Ferroin
- Fixed static installation to stop overwriting netdata.conf. (#9174) by Ferroin
- Fixed compilation on older systems (Ubuntu 14.04 LTS, Debian 8, CentOS 6). (#9198) by ktsaou
- Fixed broken unit tests for the exporting engine. (#9183) by vlvkobal
- Fixed an issue with the exporting engine not cleaning a string on exit. (#9188) by vlvkobal
- Fixed issue with incremental parser breaking CMake builds. (#9186) by stelfrag
- Fixed the eBPF collector failing to install on certain systems. (#9182) by prologic
- Fixed Coverity warning. (#9180) by thiagoftsm
- Fixed required packages for Gentoo builds. (#9141) by vsc55
- Fixed Coverity warning. (#9157) by stelfrag
- Fixed broken collector plugins due to bug in parser. (#9158) by stelfrag
- Fixed the Xenstat collector to correctly track the last number of vCPUs. (#8720) by rushikeshjadhav
- Fixed incorrect link in install-required-packages.sh to help users submit a GitHub issue. (#8911) by prologic
- Fixed enable/start of netdata service in Debian package. (#9005) by MrFreezeex
- Fixed buffer splitting in the Kinesis exporting connector. (#9122) by vlvkobal
- Fixed suid bits on plugin for Debian packaging. (#8996) by MrFreezeex
- Fixed zombie processes in Docker image by restoring SIGCHLD signal handler. (#9107) by mfundul
- Fixed static installation to not overwrite netdata.conf when updating. (#9046) by Ferroin
- Fixed typo in the dashboard's description of the mem.kernel chart. (#9096) by Neamar
- Fixed incorrectly formatted TYPE lines in the Prometheus backend/exporter. (#9086) by jeffgdotorg
- Fixed error handling in the exporting connector. (#8910) by vlvkobal
- Added a missing bracket to the Netdata API swagger .json file. (#8814) by dpsy4
- Fixed the health entity calculation used for ram_in_use and used_ram_to_ignore in systems using ZFS. (#8913) by araemo
- Fixed incorrect hostnames in the exporting engine. (#8892) by vlvkobal
- Fixed an issue with the PostgreSQL collector to correctly ignore template1/template0 databases. (#8929) by slavaGanzin