github DataDog/datadog-agent 7.74.0

Agent

Prelude

Released on: 2026-01-07

Upgrade Notes

  • Added the agent workloadfilter verify-cel subcommand, which validates CEL rules from a YAML file.
  • Migrate from batch processor to exporter helper batch configs in DDOT. The batch processor is deprecated upstream and is now removed from the default DDOT config.

New Features

  • Added the agent workloadfilter CLI command, which shows the active workload filter bundles,
    their load status, and the effective filter configuration.

  • Adds new Cluster Autoscaling controller in Cluster Agent.

  • Adds the hpflare extension, which provides flare information for the host-profiler.

  • Introduce a new Health Platform component that provides a unified way to detect, collect, and report host system health issues. The component runs health checks periodically and exposes telemetry for monitoring detected problems.

  • The datadog-agent now uses datadog-secret-backend v1.4.0 which supports GCP secrets via Google Secret Manager.

  • Create an inferred span to represent the entire duration of a Cloud Run Job task.

  • Checks can be scheduled to run only once with the run_once configuration option.

  • Added Data Streams Kafka actions, which perform actions on Kafka clusters.

  • gpu: add count metrics for NVIDIA ECC errors

  • Logs Agent is able to restart its pipeline in place to enable switching between endpoint types (HTTP/TCP) without full Agent restart.

  • The OTEL logs agent exporter now supports exporting Kubernetes orchestrator data. The exporter consumes Kubernetes resource manifests from the k8sobjectsreceiver and forwards them to Datadog's orchestrator endpoint. This enables Kubernetes cluster visibility through the OTEL agent pipeline.

    Use the OrchestratorConfig section to configure cluster name, API key, site, endpoint, and enablement toggle.

  • The SNMP integration now automatically performs a default device scan for each configured and auto-discovered device.

  • Adds a new argument, DD_INSTALL_ONLY, to the Windows MSI. Set DD_INSTALL_ONLY=true to install the Agent without starting the services.

Enhancement Notes

  • The Agent's embedded Python has been upgraded from 3.13.7 to 3.13.10

  • Provide FIPS-compliant builds for the Datadog distribution of OpenTelemetry (DDOT).

  • Expose new OTLP -> DD semantic transformation methods in the opentelemetry-mapping-go package.

  • Add an 'instance-type' field to the inventoryhost payload.

  • Add Docker log permissions health check that detects when the Agent cannot access container log files due to restrictive filesystem permissions. The check provides remediation guidance and an optional script to fix permission issues.

  • Agents are now built with Go 1.24.10.

  • Agents are now built with Go 1.24.11.

  • Add host tag to associate host to a NodePool

  • Add annotation to associate a replica NodePool to its target

  • Move the chmod operation for the dogstatsd binary from runtime (entrypoint.sh) to build time (Dockerfile).

  • Expand logs file rotation analytics to include more detailed information using telemetry metrics.

  • Add fingerprint configuration information to the Logs Agent status page.

  • Add remote config ID tagging to events generated by kafka_action integration for easy UI filtering

  • Optimized the Kubernetes State Metrics (KSM) check by replacing fmt.Sprintf() calls with direct string concatenation in the ownerTags() function. This reduces memory allocation churn and saves approximately 20% CPU usage for the KSM check.

  • The kubelet pod list cache is now disabled by default to reduce staleness. The Agent lists pods from the kubelet every 5s. Users who explicitly set kubelet_cache_pods_duration retain their existing behavior (the Agent lists pods approximately every 5 + cache duration seconds).

  • [pkg/netflow] Add a new config option network.netflow.aggregator_max_flows_per_flush_interval that controls the maximum number of flows sent in a flush interval. Only the top flows, ranked by number of bytes in the period, are sent, up to the configured value.

  • Add container metric support for any CRI compliant runtime specified in the cri_socket_path configuration.

  • Openmetrics-based checks using send_histograms_buckets now handle histogram resets without emitting a warning.

  • Optimize auto multiline detection JSON aggregator to improve performance and reduce memory usage for single line JSON messages

  • Optimize memory allocation in the KSM Core check by preallocating metric slices and skipping empty metrics in the store's Push() method. This should reduce the memory usage of the KSM check by 15% to 20%, improving performance in clusters with large numbers of pods.

  • The otel-agent can now be told not to contact the core-agent by setting DD_CMD_PORT to 0

  • Add support for batch settings in the OTLP ingest endpoint (logs & metrics).

    • batch.min_size
    • batch.max_size
    • batch.flush_timeout

    These settings can be configured in the Agent config file or by using the environment variables.

  • Change serverless-init default log level to error.

  • Skip noisy Kubernetes metadata error logs in serverless-init.

  • Change startup failure log level from debug to error.

  • Increase the default EVP proxy maximum payload size from 5 MB to 10 MB in the Trace Agent.

  • Fixes missing tags at container startup by buffering spans and APM stats until Kubernetes metadata is resolved.

  • The Agent can now automatically trigger a secret refresh when an API key expires or becomes invalid, as detected through either 403 responses or periodic API key validation. The refresh rate is throttled by the secret_refresh_on_api_key_failure_interval configuration option (in minutes).

  • Enforces that the DDOT service is stopped by the core Agent service.

  • Included tags for TLS offered versions and TLS chosen version as part of TCP connections stats on Windows.
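
The KSM ownerTags() optimization listed above replaces fmt.Sprintf() calls with direct string concatenation. The sketch below illustrates that general technique only. The function names, the "key:value" tag format, and the sample inputs are hypothetical and are not taken from the Agent source.

```go
package main

import "fmt"

// tagWithSprintf builds a "key:value" tag through fmt.Sprintf, which parses
// the format string and passes its arguments as interface values on every call.
func tagWithSprintf(key, value string) string {
	return fmt.Sprintf("%s:%s", key, value)
}

// tagWithConcat builds the same tag with plain concatenation, which avoids
// the format parsing and typically needs at most one allocation for the result.
func tagWithConcat(key, value string) string {
	return key + ":" + value
}

func main() {
	// Both helpers return identical strings; only the cost differs.
	fmt.Println(tagWithSprintf("kube_deployment", "web"))
	fmt.Println(tagWithConcat("kube_deployment", "web"))
}
```

Any actual saving depends on how often the function is called, so a benchmark on the real code path is the only reliable way to confirm a figure such as the one quoted in the note.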

Deprecation Notes

  • Remove the OpenTelemetry Collector ecstaskobserver extension from DDOT. This extension has been removed from upstream OpenTelemetry Collector Contrib repo.

Bug Fixes

  • [DBM] Bump go-sqllexer to v0.1.10 to fix the following bugs:
    • Fixes a normalization bug in SqlServer parameterized queries containing multiple depths of parentheses.
    • Fixes identifier quote removal to preserve quotes for aliases that aren't strictly alphanumeric.
  • For ECS Managed Instances, the Agent no longer overrides the runtime to ECS. The runtime is now left for Docker to determine, ensuring correct backend configuration.
  • Fixed a bug which caused events from pause containers to not be filtered out when using the containerd collector
  • Refactored the KSM custom resource handling to support wildcard matching of version/kind and CRD discovery.
  • Exclude the 'aws-fargate-pause' container from the default pause container exclusion list.
  • Ensure NodePool spec hash has not changed before updating
  • Creates replica NodePool by copying the target NodePool Spec, rather than creating from scratch
  • Update reconcile to requeue when delays in LeaderElection may occur
  • Fixed a file descriptor leak in the log file tailer where rotated tailers were not being properly removed from the active tailer container, causing them to remain active indefinitely. Rotated tailers now drain their remaining content while allowing new tailers to be created for the rotated file.
  • Fixes ReplaceBindParameter obfuscation config handling in Python DBMS checks.
  • Fix a bug preventing Fleet Automation from updating configuration files with the same YAML key set multiple times.
  • GPU: Fixed some bugs that could cause incorrect container/process tags for Docker workloads.
  • APM: Limit the size of the buffers used to decode the request body in the Trace Agent. This prevents the Agent from allocating memory for requests that are too large.
  • Send ECS task lifecycle event for ECS Managed Instances when agent is deployed as a sidecar.
  • When OTel db spans contain a db.statement or db.query.text that differs from the resource name, perform a separate obfuscation to avoid overriding their contents.
  • CPU and wall clock time collection in Python profiling is re-enabled.
  • Fixes a panic that occurs when running a manual CLI rtprocess check via datadog-agent processchecks rtprocess.
  • Fixed a deadlock in the workloadmeta event pipeline where goroutines could block indefinitely when sending events to subscribers. Added a timeout to channel send operations to prevent blocking when a subscriber's channel buffer is full.
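
The workloadmeta fix above bounds channel sends with a timeout. A minimal sketch of that pattern follows, assuming a string event type. The sendWithTimeout helper and the demoTimeout value are hypothetical and do not reflect the Agent's actual implementation.

```go
package main

import (
	"fmt"
	"time"
)

// demoTimeout is an arbitrary value chosen for this illustration.
const demoTimeout = 50 * time.Millisecond

// sendWithTimeout tries to deliver ev on ch and gives up after d. It returns
// false on timeout so the caller can drop or log the event instead of
// blocking indefinitely on a full subscriber buffer.
func sendWithTimeout(ch chan<- string, ev string, d time.Duration) bool {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case ch <- ev:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	sub := make(chan string, 1) // subscriber with a one-slot buffer and no reader

	fmt.Println(sendWithTimeout(sub, "first", demoTimeout))  // buffer has room
	fmt.Println(sendWithTimeout(sub, "second", demoTimeout)) // buffer is full
}
```

The first send succeeds immediately; the second returns false once the timeout elapses, because nothing drains the channel.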

Other Notes

  • Add metrics origins for 2025 Q4 Agent integrations.
    • ControlM
    • N8N
    • Nutanix
    • Palo Alto Panorama
    • Perfect
  • Add new telemetry metric health_platform.issues_detected tagged by health_check_id to track the number of detected health issues over time.
  • Reverts RunOnce added in #43325 (not released)

Datadog Cluster Agent

Prelude

Released on: 2026-01-07. Pinned to datadog-agent v7.74.0: CHANGELOG.

New Features

  • Add KSM Resource Type Sharding for improved performance in large Kubernetes clusters. This feature automatically splits the kubernetes_state_core check into multiple shards based on resource type groups (pods, nodes, others), enabling parallel execution across multiple Cluster Check Runners.
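
To make the sharding idea above concrete, the sketch below partitions a list of resource types into the three groups the note names (pods, nodes, others). It is an assumption-laden illustration: shardFor, splitIntoShards, and the exact membership of each group are hypothetical, and the Cluster Agent's real grouping and scheduling logic may differ.

```go
package main

import "fmt"

// shardFor maps a Kubernetes resource type to one of three groups.
// Assumption: "pods" and "nodes" each get a dedicated group and
// every other resource type falls into "others".
func shardFor(resource string) string {
	switch resource {
	case "pods":
		return "pods"
	case "nodes":
		return "nodes"
	default:
		return "others"
	}
}

// splitIntoShards partitions the configured resource types by group so that
// each group could be scheduled as its own check instance.
func splitIntoShards(resources []string) map[string][]string {
	shards := make(map[string][]string)
	for _, r := range resources {
		g := shardFor(r)
		shards[g] = append(shards[g], r)
	}
	return shards
}

func main() {
	shards := splitIntoShards([]string{"pods", "nodes", "deployments", "services"})
	for _, group := range []string{"pods", "nodes", "others"} {
		fmt.Println(group, shards[group])
	}
}
```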

Enhancement Notes

  • In the Helm check, the "helm_status" tag is now always set to "uninstalled" in delete events.

Bug Fixes

  • Fixed a deadlock in the Cluster Agent language detection handler that could cause event drops with the error "collector language-detection-follower dropped event(s) after 10s timeout". The fix releases the mutex before pushing events to workloadmeta to prevent blocking while holding the lock.
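
The fix above releases the mutex before pushing events. A minimal sketch of that locking pattern is shown here. The handler type, its fields, and the push callback are hypothetical stand-ins and are not the Cluster Agent's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// handler buffers events under a mutex and forwards them to a sink.
type handler struct {
	mu      sync.Mutex
	pending []string
}

// add records an event and holds the lock only for the append.
func (h *handler) add(ev string) {
	h.mu.Lock()
	h.pending = append(h.pending, ev)
	h.mu.Unlock()
}

// flush takes ownership of the pending slice while holding the lock,
// releases the lock, and only then calls push. A slow or blocking push
// therefore cannot stall other goroutines that need the mutex.
func (h *handler) flush(push func([]string)) {
	h.mu.Lock()
	batch := h.pending
	h.pending = nil
	h.mu.Unlock()

	if len(batch) > 0 {
		push(batch) // runs without the lock held
	}
}

func main() {
	h := &handler{}
	h.add("lang:go")
	h.add("lang:python")
	h.flush(func(batch []string) {
		// Calling back into the handler is safe because the lock was released.
		h.add("late-event")
		fmt.Println("pushed", len(batch), "events")
	})
}
```

Had flush called push while still holding mu, the h.add call inside the callback would block forever, which is the same shape of problem the release note describes.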
