github opensearch-project/data-prepper 2.16.0

Pre-release

2026-07-02 Version 2.16.0


Breaking Changes

  • Default to point-in-time for the OpenSearch source on Amazon OpenSearch Serverless (#6335)

Features

  • Add experimental pull-based ingestion to write to an existing OpenSearch index through Kafka (#6835)
  • Continuously tail files with the file source, including offset tracking, rotation detection, and glob patterns (#6782)
  • Add filter_list processor to keep only the array elements matching a condition (#6610)
  • Scrape metrics from Prometheus endpoints with the pull-based Prometheus source (#1997)
  • Support writing metrics to OpenSearch TSDB indices with index_type: tsdb in the OpenSearch sink (#6644)
  • Convert OpenTelemetry traces into span events with a new codec (#6650)
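
As a rough illustration of the TSDB item above (#6644), a pipeline that writes metrics to a TSDB index might look like the sketch below. Only index_type: tsdb comes from these notes. The pipeline name, the choice of otel_metrics_source as the source, the host, and the index name are placeholders chosen for the example, and a real deployment would also need authentication settings.

```yaml
# Minimal sketch, not a tested configuration.
# "metrics-pipeline", the host, and the index name are hypothetical.
metrics-pipeline:
  source:
    otel_metrics_source:
      ssl: false                            # plaintext for local experiments only
  sink:
    - opensearch:
        hosts: ["https://localhost:9200"]   # placeholder endpoint
        index: metrics-tsdb                 # placeholder index name
        index_type: tsdb                    # value named in #6644
```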

Enhancements

  • Create the log group and log stream automatically in the CloudWatch Logs sink (#6861)
  • Attach Entity attributes to requests in the CloudWatch Logs sink (#6860)
  • Read the Confluence and Jira bearer token from a secrets manager (#6844)
  • Support legacy MD5 checksum validation for S3-compatible storage (#6780)
  • Support log signals, a configurable SigV4 signing service, and additional headers in the OTLP sink (#6763)
  • Add a source-layer shuffle to the Iceberg source for correct and scalable CDC processing (#6666)
  • Connect to an OpenSearch instance behind a reverse proxy in the OpenSearch sink (#6654)
  • Support Confluence and Jira Data Center by allowing local addresses (#6496)
  • Filter S3 objects by prefix and suffix for both SQS and scan in the S3 source (#6386)
  • Support path-style access in the S3 source (#6340)
  • Discover indexes with a single scan in the OpenSearch source (#6169)
  • Support named credentials in the AWS extension (#4637)
  • Support conditional script updates of documents in the OpenSearch sink (#3563)
  • Support client certificate authentication for OpenSearch (#633)
  • Split array fields into separate events with the split_event processor (#5707)
  • Configure the sort fields used for pagination in the OpenSearch source (#6332)
  • Look up private IP addresses in the GeoIP processor (#6079)

Bug Fixes

  • Fix the file source re-reading a file indefinitely when using a codec in non-tail mode (#6934)
  • Prevent the CloudWatch Logs sink uploader thread from silently terminating on unchecked errors (#6887)
  • Safely handle non-String PluginConfigVariable values in the Confluence and Jira OAuth2 configuration (#6874)
  • Resolve derived.environment from resource attributes in the otel_apm_service_map processor instead of always returning generic:default (#6786)
  • Fix cardinality explosion and Prometheus compatibility in otel_apm_service_map metrics (#6710)
  • Fix the Iceberg source initial-load completion detection race between leader and worker (#6686)
  • Continue reading from the OpenSearch source when some documents fail to load, logging and counting the failures (#6337)
  • Accept escaped JSON pointer syntax in processor keys such as rename_keys (#5121)
  • Fix delete_source removing the parsed field when writing to root in the parse_json processor (#6443)
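
For the escaped JSON pointer item above (#5121), the escapes in question are presumably those of RFC 6901, where ~1 stands for a literal "/" and ~0 stands for a literal "~". The sketch below assumes an event whose key is literally named "http/status". The field names are hypothetical, and the exact resolution behavior is an assumption based on the wording of the entry rather than on the change itself.

```yaml
# Minimal sketch, assuming RFC 6901 escaping is what #5121 accepts.
# The key names are hypothetical.
processor:
  - rename_keys:
      entries:
        # "http~1status" is meant to address the single key "http/status",
        # not a "status" field nested under "http".
        - from_key: "http~1status"
          to_key: "http_status"
```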

Security

Maintenance

  • Update the release process for OpenSearch project organization changes (#6912)
  • Support automatic plugin loading in Data Prepper core (#4838)
  • Support experimental features within Data Prepper plugins (#6811)
  • Fix the flaky DefaultAcknowledgementSetManagerTests (#6719)
  • Fix the KafkaSourceJsonTypeIT ClassCastException on the kafka_headers cast (#6865)
  • Move common HTTP Basic and Bearer Token authentication into a shared module (#6767)
  • Generate the OpenSearch sink mTLS test certificates at runtime instead of committing them (#6826)
