Agent
Prelude
Released on: 2025-01-13
- Please refer to the 7.61.0 tag on integrations-core for the list of changes on the Core Checks
Upgrade Notes
- Upgraded JMXFetch to 0.49.6, which fixes a NullPointerException on JBoss when the user and password are not set. See 0.49.6 for more details.
- Windows containers were updated to use OpenJDK 11.0.25+9.
New Features
- Add metrics origins for Nvidia Nim integration.
- APM: New configuration apm_config.obfuscation.credit_cards.keep_values (DD_APM_OBFUSCATION_CREDIT_CARDS_KEEP_VALUES) can be used to skip specific tag keys that are known to never contain credit card numbers. This is especially useful in cases where a span tag value is a number that triggers false positives from the credit card obfuscator.
- Add new metric, container.restarts, which indicates the number of times a container has been restarted due to the restart policy. For more details: https://docs.docker.com/engine/containers/start-containers-automatically/.
- APM: Introducing the Error Tracking Standalone config option. Only span chunks that contain errors or exception OpenTelemetry span events are taken into consideration by sampling.
- Add new Windows images for LTSC 2019 and LTSC 2022:
  - datadog-agent:7-servercore-ltsc2019-amd64
  - datadog-agent:7-servercore-ltsc2022-amd64
  - datadog-agent:7-servercore-ltsc2019-jmx-amd64
  - datadog-agent:7-servercore-ltsc2022-jmx-amd64
  - datadog-agent:latest-servercore-ltsc2019-jmx
  - datadog-agent:latest-servercore-ltsc2022-jmx
  - datadog-agent:latest-servercore-ltsc2019
  - datadog-agent:latest-servercore-ltsc2022
  - datadog-agent:7.X.Y-ltsc2019
  - datadog-agent:7.X.Y-ltsc2022
  - datadog-agent:7.X.Y-ltsc2019-jmx
  - datadog-agent:7.X.Y-ltsc2022-jmx
  - datadog-agent:7.X.Y-servercore-ltsc2019
  - datadog-agent:7.X.Y-servercore-ltsc2022
  - datadog-agent:7.X.Y-servercore-ltsc2019-jmx
  - datadog-agent:7.X.Y-servercore-ltsc2022-jmx
  - datadog-agent:latest-ltsc2019
  - datadog-agent:latest-ltsc2022
- [ha-agent] Add haagent component used for HA Agent feature.
- The Cluster Agent can now collect pod disruption budgets from the cluster.
- Added support for collecting container image metadata when running on a CRI-O runtime.
- USM now monitors TLS traffic encrypted with Go TLS by default. To disable this feature, set the service_monitoring_config.tls.go.enabled configuration option to false.
- USM now monitors traffic encrypted with Istio mTLS by default. To disable this feature, set the service_monitoring_config.tls.istio.enabled configuration option to false.
- Introduced a new configuration variable logs_config.http_protocol, allowing users to enforce HTTP/1.1 for outgoing HTTP connections in the Datadog Agent. This provides better control over transport protocols and improves compatibility with systems that do not support HTTP/2. By default, the log agent will now attempt to use HTTP/2 (unless a proxy is configured) and fall back to the best available protocol if HTTP/2 is not supported.
- Added a new feature flag enable_operation_and_resource_name_logic_v2 in DD_APM_FEATURES. Enabling this flag modifies the logic for computing operation and resource names from OTLP spans to produce shorter, more readable names and improve alignment with OpenTelemetry specifications.
- Add support for PHP Single Step Instrumentation in Kubernetes (not enabled by default)
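
As an illustration of why apm_config.obfuscation.credit_cards.keep_values is useful, here is a minimal Python sketch. It assumes a simple Luhn-checksum detector, which is not necessarily how the Agent detects card numbers; the function names luhn_valid and obfuscate_tags, the tag keys, and the "?" placeholder are all hypothetical. It shows how a purely numeric tag value can look like a card number, and how a keep list of tag keys avoids the false positive.

```python
from typing import Dict, Iterable


def luhn_valid(digits: str) -> bool:
    """Return True if the string is 12-19 digits and passes the Luhn checksum."""
    if not digits.isdigit() or not 12 <= len(digits) <= 19:
        return False
    total = 0
    # Walk the digits right to left, doubling every second one.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def obfuscate_tags(tags: Dict[str, str], keep_values: Iterable[str] = ()) -> Dict[str, str]:
    """Replace card-like tag values with "?" unless the tag key is in keep_values.

    Hypothetical helper: the placeholder and the detection rule are assumptions.
    """
    keep = set(keep_values)
    return {
        key: value if key in keep or not luhn_valid(value) else "?"
        for key, value in tags.items()
    }
```

In this sketch, a numeric identifier stored under a tag key listed in keep_values passes through unchanged even when it happens to satisfy the checksum, while the same value under any other key is replaced.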
Enhancement Notes
- [ha-agent] Run HA enabled integrations only on leader Agent
- [ha-agent] Add agent_group tag to datadog.agent.running metric
- Cluster Agent: DatadogAgent custom resource, cluster Agent deployment, and node Agent daemonset manifests are now added to the flare archive when the Cluster Agent is deployed with the Datadog Operator (version 1.11.0+).
- Add new host tag provider_kind from the value of DD_PROVIDER_KIND for Agents running in GCE.
- Add query_timeout to customize the timeout for queries in the Oracle check. Previously, this was fixed at 20,000 seconds.
- Add the ability to show the Agent telemetry payloads that the Agent sends when telemetry is enabled. Run the following command: agent diagnose show-metadata agent-telemetry. For more details: https://docs.datadoghq.com/data_security/agent/#telemetry-collection.
- Convert Prometheus-style Counters and Histograms used in Agent telemetry from monotonically increasing to non-monotonic values (reset on each scrape). In addition, de-accumulate Prometheus Histogram bucket values on each scrape.
- Added support for more than 100 Aurora clusters in a user's account when using database autodiscovery
- Adds some information about the SNMP autodiscovery status in the Agent status.
- Adds a dedicated CRI-O Workloadmeta collector, enabling metadata collection for containers running on a CRI-O runtime.
- Enables a cache for SQL and MongoDB obfuscation. This cache is enabled by default but can be disabled by setting apm_config.obfuscation.cache.enabled to false.
- Improved logging to add visibility for latency and transport protocol
- Add a new configuration option log_level for commands where the logger is disabled by default.
- Adds initial Windows support for TCP probes in Network Path.
- Query Aurora instances per cluster to allow up to 100 instances per cluster rather than 100 instances total.
- The AWS Lambda Extension is now able to read the full 128-bit trace ID from the headers of the end-invocation HTTP request made by dd-trace or the datadog-lambda-go library.
- Standardized cluster check tagging across all environments, allowing DD_TAGS, DD_EXTRA_TAGS, DD_CLUSTER_CHECKS_EXTRA_TAGS, and DD_ORCHESTRATOR_EXPLORER_EXTRA_TAGS to apply to all cluster check data when operating on the Cluster Agent, Node Agent, or Cluster Checks Runner.
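
The Agent telemetry change listed above (monotonic counters converted to per-scrape values, and cumulative histogram buckets de-accumulated) can be sketched in Python as follows. This is an illustration of the general idea only, not the Agent's implementation; the function names counter_delta and deaccumulate_buckets, and the way a counter reset is handled, are assumptions.

```python
from typing import List, Tuple


def counter_delta(previous: float, current: float) -> float:
    """Per-scrape increase of a monotonically increasing counter.

    Assumption: if the counter went backwards (for example after a
    restart), the current value is treated as the increase since the reset.
    """
    if current < previous:
        return current
    return current - previous


def deaccumulate_buckets(cumulative: List[Tuple[float, int]]) -> List[Tuple[float, int]]:
    """Turn cumulative (upper_bound, count) buckets into per-bucket counts.

    Prometheus histogram buckets are cumulative: the bucket with upper
    bound `le` counts every observation <= le. Subtracting the previous
    bucket's count leaves only the observations that fall in that bucket.
    """
    result = []
    previous_count = 0
    for upper_bound, count in sorted(cumulative, key=lambda bucket: bucket[0]):
        result.append((upper_bound, count - previous_count))
        previous_count = count
    return result
```

For example, cumulative buckets with counts 2, 5, and 6 for upper bounds 0.1, 0.5, and +Inf become per-bucket counts 2, 3, and 1.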
Deprecation Notes
- Deprecates the apm_config.obfuscation.sql.cache option in favor of apm_config.obfuscation.cache.
- Remove deprecated config otlp_config.metrics.instrumentation_library_metadata_as_tags. Use otlp_config.metrics.instrumentation_scope_metadata_as_tags instead.
- The remote tagger will attempt to connect to the core agent indefinitely until it is successful. The remote_tagger_timeout_seconds configuration is removed, and the timeout is no longer configurable.
- The remote tagger for the trace-agent and security-agent is now always enabled and cannot be disabled. The apm_config.remote_tagger, security_agent.remote_tagger, and event_monitoring_config.remote_tagger config entries are removed.
Security Notes
- Fix CVE-2025-21613
- Update golang.org/x/crypto to fix CVE-2024-45337.
Bug Fixes
- Cluster Agent: Don't overwrite the LD_PRELOAD environment variable if it's already set; append the path to Datadog's injection library instead.
- Fix an issue where the remote workloadmeta was not receiving some unset events for ECS containers, causing incorrect billing in CWS, CSPM, CSM Pro, CSM Enterprise, and DevSecOps Enterprise Containers.
- Corrects the method call for gauges to be Set instead of Add.
- Fix Oracle execution plan collection failures caused by an out-of-range position column, which can occur if the execution plan is excessively large.
- Fix excessive number of rows coming from active session history.
- OTLP ingestion: Stop prefixing http_server_duration, http_server_request_size and http_server_response_size with otelcol.
- Fixes the issue of disabled services producing an error message in the event log on start. Now produces an informational message.
- Change kubernetes.memory.working_set and kubernetes.memory.usage metrics to be of type gauge instead of rate.
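
The LD_PRELOAD fix listed above amounts to appending to an existing value instead of replacing it. A minimal Python sketch of that behavior follows, under stated assumptions: the function name append_ld_preload and the library path used in the example are hypothetical, and a colon is assumed as the separator (the dynamic linker accepts colon- or space-separated lists).

```python
from typing import Optional


def append_ld_preload(existing: Optional[str], library_path: str) -> str:
    """Return an LD_PRELOAD value that keeps existing entries.

    If LD_PRELOAD is unset or empty, the result is just the library path.
    Otherwise the path is appended, unless it is already listed.
    """
    if not existing:
        return library_path
    entries = existing.split(":")
    if library_path in entries:
        return existing
    return existing + ":" + library_path


# Hypothetical paths, for illustration only.
print(append_ld_preload("/usr/lib/libfoo.so", "/path/to/datadog-injector.so"))
```

Checking for an existing entry before appending keeps the operation idempotent if it runs more than once for the same container.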
Other Notes
- Add metric origins for Platform Integrations: Fly.io, Kepler, Octopus Deploy, and Scaphandre.
- Extend Agent Telemetry to start reporting the logs.sender_latency metric.
- The enable_receive_resource_spans_v2 flag now defaults to true in Converged Agent. This enables the refactored version of the OTLP span receiver in trace agent, improves performance by 10%, and deprecates the following functionality:
  - No longer checks for information about the resource in HTTP headers (ContainerID, Lang, LangVersion, Interpreter, LangVendor).
  - No longer checks for resource-related values (container, env, hostname) in span attributes. This previous behavior did not follow the OTel spec.
- Bumps the default value for kube_cache_sync_timeout_seconds from 5 to 10 seconds.
- Added origin for new Milvus integration.
Datadog Cluster Agent
Prelude
Released on: 2025-01-10
Pinned to datadog-agent v7.61.0: CHANGELOG.
New Features
- Implements the Kubernetes Admission Events webhooks. These new webhooks emit Datadog Events when receiving Validation Admission requests and track deployment operations made by non-system users. The webhooks are controlled by the admission_controller.kubernetes_admission_events.enabled setting.
Bug Fixes
- The auto-instrumentation webhook no longer injects the default environment variables when disabled.