github apache/druid druid-24.0.0
Druid 24.0.0

latest releases: druid-31.0.0, druid-31.0.0-rc2, druid-31.0.0-rc1...
2 years ago

Apache Druid 24.0.0 contains over 300 new features, bug fixes, performance enhancements, documentation improvements, and additional test coverage from 67 contributors. See the complete set of changes for additional details.

# New Features

# Multi-stage query task engine

SQL-based ingestion for Apache Druid uses a distributed multi-stage query architecture, which includes a query engine called the multi-stage query task engine (MSQ task engine). The MSQ task engine extends Druid's query capabilities, so you can write queries that reference external data as well as perform ingestion with SQL INSERT and REPLACE. Essentially, you can perform SQL-based ingestion instead of using JSON ingestion specs that Druid's native ingestion uses. In addition to the easy-to-use syntax, the SQL interface lets you perform transformations that involve multiple shuffles of data.

SQL-based ingestion using the multi-stage query task engine is the recommended solution starting in Druid 24.0.0. Alternative ingestion solutions such as native batch and Hadoop-based ingestion systems will still be supported. We recommend you read all known issues and test the feature in a development environment before rolling out in production. Using the multi-stage query task engine with SELECT statements that do not write to a datasource is experimental.

The extension for it (druid-multi-stage-query) is loaded by default. If you're upgrading from an earlier version of Druid or you're using Docker, you'll need to add the extension to druid.extensions.loadlist in your common.runtime.properties file.

For more information, see the overview for the multi-stage query architecture.

#12524
#12386
#12523
#12589

# Nested columns

Druid now supports directly storing nested data structures in a newly added COMPLEX<json> column type. COMPLEX<json> columns store a copy of the structured data in JSON format as well as specialized internal columns and indexes for nested literal values—STRING, LONG, and DOUBLE types. An optimized virtual column allows Druid to read and filter these values at speeds consistent with standard Druid LONG, DOUBLE, and STRING columns.

Newly added Druid SQL, native JSON functions, and virtual column allow you to extract, transform, and create COMPLEX<json> values in at query time. You can also use the JSON functions in INSERT and REPLACE statements in SQL-based ingestion, or in a transformSpec in native ingestion as an alternative to using a flattenSpec object to "flatten" nested data for ingestion.

See SQL JSON functions, native JSON functions, Nested columns, virtual columns, and the feature summary for more detail.

#12753
#12714
#12753
#12920

# Updated Java support

Java 11 is fully supported is no longer experimental. Java 17 support is improved.

#12839

# Query engine updates

# Updated column indexes and query processing of filters

Reworked column indexes to be extraordinarily flexible, which will eventually allow us to model a wide range of index types. Added machinery to build the filters that use the updated indexes, while also allowing for other column implementations to implement the built-in index types to provide adapters to make use indexing in the current set filters that Druid provides.

#12388

# Time filter operator

You can now use the Druid SQL operator TIME_IN_INTERVAL to filter query results based on time. Prefer TIME_IN_INTERVAL over the SQL BETWEEN operator to filter on time. For more information, see Date and time functions.

#12662

# Null values and the "in" filter

If a values array contains null, the "in" filter matches null values. This differs from the SQL IN filter, which does not match null values.

For more information, see Query filters and SQL data types.
#12863

# Virtual columns in search queries

Previously, a search query could only search on dimensions that existed in the data source. Search queries now support virtual columns as a parameter in the query.

#12720

# Optimize simple MIN / MAX SQL queries on __time

Simple queries like select max(__time) from ds now run as a timeBoundary queries to take advantage of the time dimension sorting in a segment. You can set a feature flag to enable this feature.

#12472
#12491

# String aggregation results

The first/last string aggregator now only compares based on values. Previously, the first/last string aggregator’s values were compared based on the _time column first and then on values.

If you have existing queries and want to continue using both the _time column and values, update your queries to use ORDER BY MAX(timeCol).

#12773

# Reduced allocations due to Jackson serialization

Introduced and implemented new helper functions in JacksonUtils to enable reuse of
SerializerProvider objects.

Additionally, disabled backwards compatibility for map-based rows in the GroupByQueryToolChest by default, which eliminates the need to copy the heavyweight ObjectMapper. Introduced a configuration option to allow administrators to explicitly enable backwards compatibility.

#12468

# Updated IPAddress Java library

Added a new IPAddress Java library dependency to handle IP addresses. The library includes IPv6 support. Additionally, migrated IPv4 functions to use the new library.

#11634

# Query performance improvements

Optimized SQL operations and functions as follows:

  • Vectorized numeric latest aggregators (#12439)
  • Optimized isEmpty() and equals() on RangeSets (#12477)
  • Optimized reuse of Yielder objects (#12475)
  • Operations on numeric columns with indexes are now faster (#12830)
  • Optimized GroupBy by reducing allocations. Reduced allocations by reusing entry and key holders (#12474)
  • Added a vectorized version of string last aggregator (#12493)
  • Added Direct UTF-8 access for IN filters (#12517)
  • Enabled virtual columns to cache their outputs in case Druid calls them multiple times on the same underlying row (#12577)
  • Druid now rewrites a join as a filter when possible in IN joins (#12225)
  • Added automatic sizing for GroupBy dictionaries (#12763)
  • Druid now distributes JDBC connections more evenly amongst brokers (#12817)

# Streaming ingestion

# Kafka consumers

Previously, consumers that were registered and used for ingestion persisted until Kafka deleted them. They were only used to make sure that an entire topic was consumed. There are no longer consumer groups that linger.

#12842

# Kinesis ingestion

You can now perform Kinesis ingestion even if there are empty shards. Previously, all shards had to have at least one record.

#12792

# Batch ingestion

# Batch ingestion from S3

You can now ingest data from endpoints that are different from your default S3 endpoint and signing region.
For more information, see S3 config.
#11798

# Improvements to ingestion in general

This release includes the following improvements for ingestion in general.

# Increased robustness for task management

Added setNumProcessorsPerTask to prevent various automatically-sized thread pools from becoming unreasonably large. It isn't ideal for each task to size its pools as if it is the only process on the entire machine. On large machines, this solves a common cause of OutOfMemoryError due to "unable to create native thread".

#12592

# Avatica JDBC driver

The JDBC driver now follows the JDBC standard and uses two kinds of statements, Statement and PreparedStatement.

#12709

# Eight hour granularity

Druid now accepts the EIGHT_HOUR granularity. You can segment incoming data to EIGHT_HOUR buckets as well as group query results by eight hour granularity.
#12717

# Ingestion general

# Updated Avro extension

The previous Avro extension leaked objects from the parser. If these objects leaked into your ingestion, you had objects being stored as a string column with the value as the .toString(). This string column will remain after you upgrade but will return Map.toString() instead of GenericRecord.toString. If you relied on the previous behavior, you can use the Avro extension from an earlier release.

#12828

# Sampler API

The sampler API has additional limits: maxBytesInMemory and maxClientResponseBytes. These options augment the existing options numRows and timeoutMs. maxBytesInMemory can be used to control the memory usage on the Overlord while sampling. maxClientResponseBytes can be used by clients to specify the maximum size of response they would prefer to handle.

#12947

# SQL

# Column order

The DruidSchema and SegmentMetadataQuery properties now preserve column order instead of ordering columns alphabetically. This means that query order better matches ingestion order.

#12754

# Converting JOINs to filter

You can improve performance by pushing JOINs partially or fully to the base table as a filter at runtime by setting the enableRewriteJoinToFilter context parameter to true for a query.

Druid now pushes down join filters in case the query computing join references any columns from the right side.

#12749
#12868

# Add is_active to sys.segments

Added is_active as shorthand for (is_published = 1 AND is_overshadowed = 0) OR is_realtime = 1). This represents "all the segments that should be queryable, whether or not they actually are right now".

#11550

# useNativeQueryExplain now defaults to true

The useNativeQueryExplain property now defaults to true. This means that EXPLAIN PLAN FOR returns the explain plan as a JSON representation of equivalent native query(s) by default. For more information, see Broker Generated Query Configuration Supplementation.

#12936

# Running queries with inline data using druid query engine

Some queries that do not refer to any table, such as select 1, are now always translated to a native Druid query with InlineDataSource before execution. If translation is not possible, for queries such as SELECT (1, 2), then an error occurs. In earlier versions, this query would still run.

#12897

# Coordinator/Overlord

# You can configure the Coordinator to kill segments in the future

You can now set druid.coordinator.kill.durationToRetain to a negative period to configure the Druid cluster to kill segments whose interval_end is a date in the future. For example, PT-24H would allow segments to be killed if their interval_end date was 24 hours or less into the future at the time that the kill task is generated by the system.
A cluster operator can also disregard the druid.coordinator.kill.durationToRetain entirely by setting a new configuration, druid.coordinator.kill.ignoreDurationToRetain=true. This ignores interval_end date when looking for segments to kill, and can instead kill any segment marked unused. This new configuration is turned off by default, and a cluster operator should fully understand and accept the risks before enabling it.

# Improved Overlord stability

Reduced contention between the management thread and the reception of status updates from the cluster. This improves the stability of Overlord and all tasks in a cluster when there are large (1000+) task counts.

#12099

# Improved Coordinator segment logging

Updated Coordinator load rule logging to include current replication levels. Added missing segment ID and tier information from some of the log messages.

#12511

# Optimized overlord GET tasks memory usage

Addressed the significant memory overhead caused by the web-console indirectly calling the Overlord’s GET tasks API. This could cause unresponsiveness or Overlord failure when the ingestion tab was opened multiple times.

#12404

# Reduced time to create intervals

In order to optimize segment cost computation time by reducing time taken for interval creation, store segment interval instead of creating it each time from primitives and reduce memory overhead of storing intervals by interning them. The set of intervals for segments is low in cardinality.

#12670

# Brokers/Overlord

Brokers now have a default of 25MB maximum queued per query. Previously, there was no default limit. Depending on your use case, you may need to increase the value, especially if you have large result sets or large amounts of intermediate data. To adjust the maximum memory available, use the druid.broker.http.maxQueuedBytes property.
For more information, see Configuration reference.

# Web console

Prepare to have your Web Console experience elevated! - @vogievetsky

# New query view (WorkbenchView) with tabs and long running query support

You can use the new query view to execute multi-stage, task based, queries with the /druid/v2/sql/task and /druid/indexer/v1/task/* APIs as well as native and sql-native queries just like the old Query view. A key point of the sql-msq-task based queries is that they may run for a long time. This inspired / necessitated many UX changes including, but not limited to the following:

# Tabs

You can now have many queries stored and running at the same time, significantly improving the query view UX.

You can open several tabs, duplicate them, and copy them as text to paste into any console and reopen there.

# Progress reports (counter reports)

Queries run with the multi-stage query task engine have detailed progress reports shown in the summary progress bar and the in detail execution table that provides summaries of the counters for every step.

# Error and warning reports

Queries run with the multi-stage query task engine present user friendly warnings and errors should anything go wrong.
The new query view has components to visualize these with their full detail including a stack-trace.

# Recent query tasks panel

Queries run with the multi-stage query task engine are tasks. This makes it possible to show queries that are executing currently and that have executed in the recent past.

For any query in the Recent query tasks panel you can view the execution details for it and you can also attach it as a new tab and continue iterating on the query. It is also possible to download the "query detail archive", a JSON file containing all the important details for a given query to use for troubleshooting.

# Connect external data flow

Connect external data flow lets you use the sampler to sample your source data to, determine its schema and generate a fully formed SQL query that you can edit to fit your use case before you launch your ingestion job. This point-and-click flow will save you much typing.

# Preview button

The Preview button appears when you type in an INSERT or REPLACE SQL query. Click the button to remove the INSERT or REPLACE clause and execute your query as an "inline" query with a limi). This gives you a sense of the shape of your data after Druid applies all your transformations from your SQL query.

# Results table

The query results table has been improved in style and function. It now shows you type icons for the column types and supports the ability to manipulate nested columns with ease.

# Helper queries

The Web Console now has some UI affordances for notebook and CTE users. You can reference helper queries, collapsable elements that hold a query, from the main query just like they were defined with a WITH statement. When you are composing a complicated query, it is helpful to break it down into multiple queries to preview the parts individually.

# Additional Web Console tools

More tools are available from the ... menu:

  • Explain query - show the query plan for sql-native and multi-stage query task engine queries.
  • Convert ingestion spec to SQL - Helps you migrate your native batch and Hadoop based specs to the SQL-based format.
  • Open query detail archive - lets you open a query detail archive downloaded earlier.
  • Load demo queries - lets you load a set of pre-made queries to play around with multi-stage query task engine functionality.

# New SQL-based data loader

The data loader exists as a GUI wizard to help users craft a JSON ingestion spec using point and click and quick previews. The SQL data loader is the SQL-based ingestion analog of that.

Like the native based data loader, the SQL-based data loader stores all the state in the SQL query itself. You can opt to manipulate the query directly at any stage. See (#12919) for more information about how the data loader differs from the Connect external data workflow.

# Other changes and improvements

  • The query view has so much new functionality that it has moved to the far left as the first view available in the header.
  • You can now click on a datasource or segment to see a preview of the data within.
  • The task table now explicitly shows if a task has been canceled in a different color than a failed task.
  • The user experience when you view a JSON payload in the Druid console has been improved. There’s now syntax highlighting and a search.
  • The Druid console can now use the column order returned by a scan query to determine the column order for reindexing data.
  • The way errors are displayed in the Druid console has been improved. Errors no longer appear as a single long line.

See (#12919) for more details and other improvements

# Metrics

# Sysmonitor stats for Peons

Sysmonitor stats, like memory or swap, are no longer reported since Peons always run on the same host as MiddleManagerse. This means that duplicate stats will no longer be reported.

#12802

# Prometheus

You can now include the host and service as labels for Prometheus by setting the following properties to true:

  • druid.emitter.prometheus.addHostAsLabel
  • druid.emitter.prometheus.addServiceAsLabel

#12769

# Rows per segment

(Experimental) You can now see the average number of rows in a segment and the distribution of segments in predefined buckets with the following metrics: segment/rowCount/avg and segment/rowCount/range/count.
Enable the metrics with the following property: org.apache.druid.server.metrics.SegmentStatsMonitor
#12730

# New sqlQuery/planningTimeMs metric

There’s a new sqlQuery/planningTimeMs metric for SQL queries that computes the time it takes to build a native query from a SQL query.

#12923

# StatsD metrics reporter

The StatsD metrics reporter extension now includes the following metrics:

  • coordinator/time
  • coordinator/global/time
  • tier/required/capacity
  • tier/total/capacity
  • tier/replication/factor
  • tier/historical/count
  • compact/task/count
  • compactTask/maxSlot/count
  • compactTask/availableSlot/count
  • segment/waitCompact/bytes
  • segment/waitCompact/count
  • interval/waitCompact/count
  • segment/skipCompact/bytes
  • segment/skipCompact/count
  • interval/skipCompact/count
  • segment/compacted/bytes
  • segment/compacted/count
  • interval/compacted/count
    #12762

# New worker level task metrics

Added a new monitor, WorkerTaskCountStatsMonitor, that allows each middle manage worker to report metrics for successful / failed tasks, and task slot usage.

#12446

# Improvements to the JvmMonitor

The JvmMonitor can now handle more generation and collector scenarios. The monitor is more robust and works properly for ZGC on both Java 11 and 15.

#12469

# Garbage collection

Garbage collection metrics now use MXBeans.

#12481

# Metric for task duration in the pending queue

Introduced the metric task/pending/time to measure how long a task stays in the pending queue.

#12492

# Emit metrics object for Scan, Timeseries, and GroupBy queries during cursor creation

Adds vectorized metric for scan, timeseries and groupby queries.

#12484

# Emit state of replace and append for native batch tasks

Druid now emits metrics so you can monitor and assess the use of different types of batch ingestion, in particular replace and tombstone creation.

#12488
#12840

# KafkaEmitter emits queryType

The KafkaEmitter now properly emits the queryType property for native queries.

#12915

# Security

You can now hide properties that are sensitive in the API response from /status/properties, such as S3 access keys. Use the druid.server.hiddenProperties property in common.runtime.properties to specify the properties (case insensitive) you want to hide.

#12950

# Other changes

  • You can now configure the retention period for request logs stored on disk with the druid.request.logging.durationToRetain property. Set the retention period to be longer than P1D (#12559)
  • You can now specify liveness and readiness probe delays for the historical StatefulSet in your values.yaml file. The default is 60 seconds (#12805)
  • Improved exception message for native binary operators (#12335)
  • ​​Improved error messages when URI points to a file that doesn't exist (#12490)
  • ​​Improved build performance of modules (#12486)
  • Improved lookups made using the druid-kafka-extraction-namespace extension to handle records that have been deleted from a kafka topic (#12819)
  • Updated core Apache Kafka dependencies to 3.2.0 (#12538)
  • Updated ORC to 1.7.5 (#12667)
  • Updated Jetty to 9.4.41.v20210516 (#12629)
  • Added Zstandard compression library to CompressionStrategy (#12408)
  • Updated the default gzip buffer size to 8 KB to for improved performance (#12579)
  • Updated the default inputSegmentSizeBytes in Compaction configuration to 100,000,000,000,000 (~100TB)

# Bug fixes

Druid 24.0 contains over 68 bug fixes. You can find the complete list here

# Upgrading to 24.0

# Permissions for multi-stage query engine

To read external data using the multi-stage query task engine, you must have READ permissions for the EXTERNAL resource type. Users without the correct permission encounter a 403 error when trying to run SQL queries that include EXTERN.

The way you assign the permission depends on your authorizer. For example, with [basic security]((/docs/development/extensions-core/druid-basic-security.md) in Druid, add the EXTERNAL READ permission by sending a POST request to the roles API.

The example adds permissions for users with the admin role using a basic authorizer named MyBasicMetadataAuthorizer. The following permissions are granted:

  • DATASOURCE READ
  • DATASOURCE WRITE
  • CONFIG READ
  • CONFIG WRITE
  • STATE READ
  • STATE WRITE
  • EXTERNAL READ
curl --location --request POST 'http://localhost:8081/druid-ext/basic-security/authorization/db/MyBasicMetadataAuthorizer/roles/admin/permissions' \
--header 'Content-Type: application/json' \
--data-raw '[
{
  "resource": {
    "name": ".*",
    "type": "DATASOURCE"
  },
  "action": "READ"
},
{
  "resource": {
    "name": ".*",
    "type": "DATASOURCE"
  },
  "action": "WRITE"
},
{
  "resource": {
    "name": ".*",
    "type": "CONFIG"
  },
  "action": "READ"
},
{
  "resource": {
    "name": ".*",
    "type": "CONFIG"
  },
  "action": "WRITE"
},
{
  "resource": {
    "name": ".*",
    "type": "STATE"
  },
  "action": "READ"
},
{
  "resource": {
    "name": ".*",
    "type": "STATE"
  },
  "action": "WRITE"
},
{
  "resource": {
    "name": "EXTERNAL",
    "type": "EXTERNAL"
  },
  "action": "READ"
}
]'

# Behavior for unused segments

Druid automatically retains any segments marked as unused. Previously, Druid permanently deleted unused segments from metadata store and deep storage after their duration to retain passed. This behavior was reverted from 0.23.0.
#12693

# Default for druid.processing.fifo

The default for druid.processing.fifo is now true. This means that tasks of equal priority are treated in a FIFO manner. For most use cases, this change can improve performance on heavily loaded clusters.

#12571

# Update to JDBC statement closure

In previous releases, Druid automatically closed the JDBC Statement when the ResultSet was closed. Druid closed the ResultSet on EOF. Druid closed the statement on any exception. This behavior is, however, non-standard.
In this release, Druid's JDBC driver follows the JDBC standards more closely:
The ResultSet closes automatically on EOF, but does not close the Statement or PreparedStatement. Your code must close these statements, perhaps by using a try-with-resources block.
The PreparedStatement can now be used multiple times with different parameters. (Previously this was not true since closing the ResultSet closed the PreparedStatement.)
If any call to a Statement or PreparedStatement raises an error, the client code must still explicitly close the statement. According to the JDBC standards, statements are not closed automatically on errors. This allows you to obtain information about a failed statement before closing it.
If you have code that depended on the old behavior, you may have to change your code to add the required close statement.

#12709

# Known issues

# Credits

@2bethere
@317brian
@a2l007
@abhagraw
@abhishekagarwal87
@abhishekrb19
@adarshsanjeev
@aggarwalakshay
@AmatyaAvadhanula
@BartMiki
@capistrant
@chenrui333
@churromorales
@clintropolis
@cloventt
@CodingParsley
@cryptoe
@dampcake
@dependabot[bot]
@dherg
@didip
@dongjoon-hyun
@ektravel
@EsoragotoSpirit
@exherb
@FrankChen021
@gianm
@hellmarbecker
@hwball
@iandr413
@imply-cheddar
@jarnoux
@jasonk000
@jihoonson
@jon-wei
@kfaraz
@LakshSingla
@liujianhuanzz
@liuxiaohui1221
@lmsurpre
@loquisgon
@machine424
@maytasm
@MC-JY
@Mihaylov93
@nishantmonu51
@paul-rogers
@petermarshallio
@pjfanning
@rockc2020
@rohangarg
@somu-imply
@suneet-s
@superivaj
@techdocsmith
@tejaswini-imply
@TSFenwick
@vimil-saju
@vogievetsky
@vtlim
@williamhyun
@wiquan
@writer-jill
@xvrl
@yuanlihan
@zachjsh
@zemin-piao

Don't miss a new druid release

NewReleases is sending notifications on new releases.