What Changed

This release brings several improvements and bug fixes to the Multistage Engine, Upserts, and Compaction, along with many smaller features and general bug fixes.

Multistage Engine Improvements

Features

New Window Functions: LEAD, LAG, FIRST_VALUE, LAST_VALUE #12878 #13340

LEAD allows you to access values after the current row in a frame.
LAG allows you to access values before the current row in a frame.
FIRST_VALUE and LAST_VALUE return the first and last values in the frame, respectively.
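The semantics of these four functions can be illustrated with a minimal Python sketch. This is not Pinot's implementation; the helper names and the `offset`/`default` parameters are assumptions chosen for illustration, and the "frame" is modelled as an already-ordered list of values.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def lead(frame: Sequence[T], i: int, offset: int = 1, default: Optional[T] = None) -> Optional[T]:
    """Value `offset` rows after row i, or `default` when that row does not exist."""
    j = i + offset
    return frame[j] if 0 <= j < len(frame) else default


def lag(frame: Sequence[T], i: int, offset: int = 1, default: Optional[T] = None) -> Optional[T]:
    """Value `offset` rows before row i, or `default` when that row does not exist."""
    j = i - offset
    return frame[j] if 0 <= j < len(frame) else default


def first_value(frame: Sequence[T]) -> Optional[T]:
    """First value of the ordered frame (None for an empty frame)."""
    return frame[0] if frame else None


def last_value(frame: Sequence[T]) -> Optional[T]:
    """Last value of the ordered frame (None for an empty frame)."""
    return frame[-1] if frame else None


# Hypothetical ordered frame of values.
values = [10, 20, 30]
following = [lead(values, i) for i in range(len(values))]
preceding = [lag(values, i) for i in range(len(values))]
```

Here `following` holds the next value for each row and `preceding` the previous one, with `None` where no such row exists.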

Support for Logical Database in V2 Engine #12591 #12695

V2 Engine now supports a "database" construct, enabling table namespace isolation within the same Pinot cluster.
Improves user experience when multiple users are using the same Pinot Cluster.
Access control policies can be set at the database level.
Database can be selected in a query using a SET statement, such as SET database=my_db;.

Improved Multi-Value (MV) and Array Function Support

Added array sum aggregation functions for point-wise array operations #13324.
Added support for valueIn MV transform function #13443.
Fixed bug in numeric casts for MV columns in filters #13425.
Fixed NPE in ArrayAgg when a column contains no data #13358.
Fixed array literal handling #13345.

Support for WITHIN GROUP Clause and ListAgg #13146

WITHIN GROUP Clause can be used to process rows in a given order within a group.
One of the most common use-cases for this is the ListAgg function, which when combined with WITHIN GROUP can be used to concatenate strings in a given order.
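As a rough illustration of what ListAgg with WITHIN GROUP computes, the following Python sketch groups rows, orders each group, and concatenates a string column. It only models the result; the function name, the row layout, and the column names are hypothetical and not part of Pinot.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def list_agg_within_group(
    rows: Iterable[Mapping[str, Any]],
    group_key: str,
    value_key: str,
    order_key: str,
    separator: str = ",",
) -> Dict[Any, str]:
    """Concatenate `value_key` per group, with rows ordered by `order_key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    return {
        key: separator.join(
            str(r[value_key]) for r in sorted(members, key=lambda r: r[order_key])
        )
        for key, members in groups.items()
    }


# Hypothetical rows: events per user, arriving out of order.
events = [
    {"user": "a", "page": "checkout", "ts": 3},
    {"user": "a", "page": "home", "ts": 1},
    {"user": "b", "page": "home", "ts": 5},
    {"user": "a", "page": "cart", "ts": 2},
]
paths = list_agg_within_group(events, "user", "page", "ts", separator=">")
```

The ordering is applied inside each group before concatenation, which is the part that WITHIN GROUP contributes.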

Scalar/Transform Function and Set Operation Improvements

Added Geospatial Scalar Function support for use in intermediate stage in the v2 query engine #13457.
Fix 'WEEK' transform function #13483.
Support EXTRACT as a scalar function #13463.
Added support for ALL modifier for INTERSECT and EXCEPT Set Operations #13151 #13166.
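The ALL modifier switches these set operations from set semantics to bag (multiset) semantics, so duplicates are counted rather than collapsed. A small Python sketch of the standard SQL behaviour, using `collections.Counter` as the multiset (the function names are illustrative only):

```python
from collections import Counter
from typing import Hashable, Iterable, List


def intersect_all(left: Iterable[Hashable], right: Iterable[Hashable]) -> List[Hashable]:
    """Each value appears min(count_left, count_right) times."""
    return sorted((Counter(left) & Counter(right)).elements())


def except_all(left: Iterable[Hashable], right: Iterable[Hashable]) -> List[Hashable]:
    """Each value appears max(count_left - count_right, 0) times."""
    return sorted((Counter(left) - Counter(right)).elements())


left_rows = [1, 1, 1, 2, 3]
right_rows = [1, 1, 2, 2]

with_all_intersect = intersect_all(left_rows, right_rows)
with_all_except = except_all(left_rows, right_rows)

# Without ALL, duplicates are removed and plain set semantics apply.
plain_intersect = sorted(set(left_rows) & set(right_rows))
plain_except = sorted(set(left_rows) - set(right_rows))
```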

Improved Literal Handling Support

Fixed bug in handling literal arguments in aggregation functions like Percentile #13282.
Allow INT and FLOAT literals #13078.
Fixed literal handling for all types #13344 #13345.
Fixed null literal handling for null intolerant functions #13255.

Metrics Improvements

Added new metrics for tracking queries executed globally and at the table level #12982.
New metrics to track join counts and window function counts #13032.
Multiple meters and timers to track Multistage Engine Internals #13035.

Notable Improvements and Bug Fixes

Improved the resiliency of Window operators, with new checks to make sure the window doesn't grow too large #13180 #13428 #13441.
Optimized Group Key generation #12394.
Fixed SortedMailboxReceiveOperator to honor convention of pulling at most 1 EOS block #12406.
Improvement in how execution stats are handled #12517 #12704 #13136.
Use Protobuf instead of Reflection for Plan Serialization #13221.

Upsert Compaction and Minion Improvements

Features and Improvements

Minion Resource Isolation #12459 #12786

Minions now support resource isolation based on an instance tag.
Instance tag is configured at table level, and can be set for each task on a table.
This enables you to implement arbitrary resource isolation strategies; for example, you can dedicate a set of Minion nodes to any set of tasks across any set of tables.

Greedy Upsert Compaction Scheduling #12461

Upsert compaction now schedules segments for compaction based on the number of invalid docs.
This helps the compaction task to handle arbitrary temporal distribution of invalid docs.
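The greedy idea can be sketched in a few lines of Python: rank segments by their invalid-doc count and take the worst offenders first, regardless of how old the segments are. This is a simplified illustration, not the actual task generator; the function name, the `max_segments` limit, and the segment names are assumptions.

```python
from typing import Dict, List


def pick_segments_for_compaction(invalid_doc_counts: Dict[str, int], max_segments: int) -> List[str]:
    """Return up to `max_segments` segment names, highest invalid-doc count first.

    Segments with no invalid docs are never selected.
    """
    ranked = sorted(invalid_doc_counts.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, count in ranked[:max_segments] if count > 0]


# Hypothetical per-segment invalid-doc counts.
counts = {"seg_a": 5, "seg_b": 120, "seg_c": 0, "seg_d": 40}
selected = pick_segments_for_compaction(counts, max_segments=2)
```

Because the ranking ignores segment age, invalid docs concentrated in recent segments are handled just as well as those in old ones.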

Notable Improvements

Minions can now download segments from servers when deepstore copy is missing. This feature is enabled via a cluster level config allowDownloadFromServer #12960 #13247.
Added support for TLS Port in Minions #12943.
New metrics added for Minions to track segment/record processing information #12710.

Bug Fixes

Minions can now handle invalid instance tags in Task Configs gracefully. Prior to this change, Minions would be stuck in IN_PROGRESS state until task timeout #13092.
Fix bug to return validDocIDsMetadata from all servers #12431.
Fixed a bug where upsert compaction did not retain maxLength information and trimmed string fields #13157.

Upsert Improvements

Features and Improvements

Consistent Table View for Upsert Tables #12976

Adds different modes of consistency guarantees for Upsert tables.
Adds a new UpsertConfig called consistencyMode which can be set to NONE, SYNC, SNAPSHOT.
SYNC is optimized for data freshness but can lead to elevated query latencies, so it is best suited to low-QPS use cases. In this mode, the ingestion threads take a write lock when updating validDocID bitmaps.
SNAPSHOT mode can handle high-QPS/high-ingestion use cases by reading the list of valid docs from a snapshot of the validDocIDs. The snapshot can be refreshed every few seconds, and the staleness tolerance can be set via the query option upsertViewFreshnessMs.

Pluggable Partial Upsert Merger #11983

Partial Upsert merges the old record and the new incoming record to generate the final ingested record.
Pinot now allows users to customize how this merge of an old row and the new row is computed.
This allows a column value in the new row to be an arbitrary function of the old and the new row.
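Conceptually, a custom merger is a function from the old row and the new row to the row that gets ingested. The Python sketch below models that idea only; it does not reflect the actual Pinot merger interface, and every name in it (`merge_rows`, `column_mergers`, the column names) is hypothetical.

```python
from typing import Any, Callable, Dict, Mapping

Row = Mapping[str, Any]
ColumnMerger = Callable[[Row, Row], Any]


def merge_rows(old_row: Row, new_row: Row, column_mergers: Dict[str, ColumnMerger]) -> Dict[str, Any]:
    """Start from the new row, then override selected columns with merged values.

    Each merger sees both full rows, so a merged column can depend on any
    combination of old and new column values.
    """
    merged = dict(new_row)
    for column, merger in column_mergers.items():
        merged[column] = merger(old_row, new_row)
    return merged


# Hypothetical rows for the same primary key.
old = {"id": 1, "clicks": 3, "last_seen": 100}
new = {"id": 1, "clicks": 2, "last_seen": 90}

mergers = {
    "clicks": lambda o, n: o["clicks"] + n["clicks"],               # accumulate
    "last_seen": lambda o, n: max(o["last_seen"], n["last_seen"]),  # keep latest
}
result = merge_rows(old, new, mergers)
```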

Support for Uploading Externally Partitioned Segments for Upsert Backfill #13107

Segments uploaded for Upsert Backfill can now explicitly specify the Kafka partition they belong to.
This enables backfilling an Upsert table where the externally generated segments are partitioned using an arbitrary hash function on an arbitrary primary key.

Misc Improvements and Bug Fixes

Fixed a Bug in Handling Equal Comparison Column Values in Upsert, which could lead to data inconsistency (#12395)
Upsert snapshots now include only those segments that have updates #13285.

Notable Features

JSON Support Improvements

JSON Index can now be used for evaluating Regex and Range Predicates. #12568
jsonExtractIndex now supports contextual array filters. #12683 #12531.
JSON column type now supports filter predicates like =, !=, IN and NOT IN. This is convenient for scenarios where the JSON values are very small. #13283.
JSON_MATCH now supports exclusive predicates correctly. For instance, you can use predicates such as JSON_MATCH(person, '"$.addresses[*].country" != ''us''') to find all people who have at least one address that is not in the US. #13139.
jsonExtractIndex supports extracting Multi-Value JSON Fields, and also supports providing any default value when the key doesn't exist. #12748.
Added isJson UDF which increases your options to handle invalid JSONs. This can be used in queries and for filtering invalid json column values in ingestion. #12603.
Fix ArrayIndexOutOfBoundsException in jsonExtractIndex. #13479.

Lucene and Text Search Improvements

Improved Segment Build Time for Lucene Text Index by 40-60%. The improvement applies when a consuming segment commits and is converted to an ImmutableSegment, and it significantly lowers the ingestion lag that a large text index causes at commit time #12744 #13094 #13050.
Phrase Search can run 3x faster when the Lucene Index Config enablePrefixSuffixMatchingInPhraseQueries is set to true. This is achieved by rewriting phrase search query to a wildcard and prefix matching query #12680.
Fixed bug in TextMatchFilterOptimizer that was not applying precedence to the filter expressions properly, which could lead to incorrect results. #13009.
Fixed bug in handling NOT text_match which could have returned incorrect results. #12372.
Added SchemaConformingTransformerV2 to enhance text search abilities. #12788.
Added metrics to track Lucene NRT Refresh Delay #13307.
Switched to NRTCachingDirectory for Realtime segments and prevented duplicates in the Realtime Lucene Index to avoid IndexOutOfBounds query time exceptions. #13308.
Lucene Version is upgraded to 9.11.1. #13505.

New Funnel Functions #13176 #13231 #13228

Added funnelMaxStep function which can be used to calculate max funnel steps for a given sliding window.
Added funnelCompleteCount to calculate the number of completed funnels, and funnelMatchStep to get the funnel match array.

Support for Interning for OnHeapByteDictionary #12342

This can reduce the heap usage of a dictionary encoded byte column, for a certain distribution of duplicate values. See #12223 for details.

Column Major Builder On By Default for New Tables #12770

Prior to this feature, on a segment commit, Pinot would convert all the columnar data from the Mutable Segment to row-major, and then re-build column major Immutable Segments.
This feature skips the row-major conversion and is expected to be both space and time efficient.
It can help lower ingestion lag from segment commits, especially when your segments are large.

Support for SQL Formatting in Query Editor #11725

You can now prettify SQL right in the Controller UI!

Hash Function for UUID Primary Keys #12538

Added a new lossless hash-function for Upsert Primary Keys optimized for UUIDs.
The hash function can reduce Old Gen heap usage by up to 30%.
It maps a UUID to a 16 byte array, vs encoding it in a UTF-8 string which would take 36 bytes.
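The size difference is easy to verify with Python's standard `uuid` module. This sketch only demonstrates the 36-byte versus 16-byte encodings and that the mapping is reversible; it is not the hash function Pinot ships, and the helper names are made up for the example.

```python
import uuid


def uuid_to_bytes(value: str) -> bytes:
    """Pack a textual UUID into its 16-byte binary form."""
    return uuid.UUID(value).bytes


def bytes_to_uuid(raw: bytes) -> str:
    """Recover the canonical lowercase textual UUID from 16 bytes."""
    return str(uuid.UUID(bytes=raw))


key = "123e4567-e89b-12d3-a456-426614174000"
as_text = key.encode("utf-8")      # 32 hex digits + 4 hyphens
as_binary = uuid_to_bytes(key)     # 128 bits
round_trip = bytes_to_uuid(as_binary)
```

Since the original string can be recovered from the 16 bytes, no two distinct UUIDs collide, which is what "lossless" means here.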

Column Level Index Skip Query Option #12414

Convenient for debugging impact of indexes on query performance or results.
You can add the skipIndexes option to your query to skip any number of indexes. e.g. SET skipIndexes=inverted,range;

New UDFs and Scalar Functions

New GeoHash functions: encodeGeoHash, decodeGeoHash, decodeGeoHashLatitude and decodeGeoHashLongitude.
dateBin can be used to align a timestamp to the nearest time bucket.
prefixes, suffixes and uniqueNgrams UDFs for generating all respective string subsequences from a string input. #12392.
Added isJson UDF which increases your options to handle invalid JSONs. This can be used in queries and for filtering invalid json column values in ingestion. #12603.
splitPart UDF has minor improvements. #12437.

CLP Compression Codec in Forward Indexes #12504

CLP (Compressed Log Processor) achieves a very high compression ratio for certain log types.
To enable this, you can set the compressionCodec in the fieldConfigList of the column you want to target.

Misc. Improvements

Enable segment preloading at partition level #12451.
Use Temurin instead of AdoptOpenJdk #12533
Adding record reader config/context param to record transformer #12520
Removing legacy commons-lang dependency #13480
12508: Feature add segment rows flush config #12681
ADSS Race Condition and update to client error codes #13104
Add ExceptionMapper to convert Exception to Response Object for Broker REST API's #13292
Add FunnelMaxStepAggregationFunction and FunnelCompleteCountAggregationFunction #13231
Add GZIP Compression Codec (#11434) #12668
Add PodDisruptionBudgets to the Pinot Helm chart #13153
Add Postgres compliant name aliasing for String Functions. #12795
Add SchemaConformingTransformerV2 to enhance text search abilities #12788
Add a benchmark to measure multi-stage block serde cost #13336
Add a plan version field to QueryRequest Protobuf Message #13267
Add a post-validator visitor that verifies there are no cast to bytes #12475
Add a safe version of CLStaticHttpHandler that disallows path traversal. #13124
Add ability to track filtered messages offset #12602
Add back 'numRowsResultSet' to BrokerResponse, and retain it when result table id hidden #13198
Add back profile for shade #12979
Add back some exclude deps from hadoop-mapreduce-client-core #12638
Add backward compatibility regression test suite for multi-stage query engine #13193
Add base class for custom object accumulator #12685
Add clickstream example table for funnel analysis #13379
Add config option for timezone #12386
Add config to skip record ingestion on string column length exceeding configured max schema length #13103
Add controller API to get allLiveInstances #12498
Add isJson UDF #12603
Add list of collaborators to asf.yaml #13346
Add locking logic to get consistent table view for upsert tables #12976
Add metric to track number of segments missed in upsert-snapshot #12581
Add metrics for SEGMENTS_WITH_LESS_REPLICAS monitoring #12336
Add mode to allow adding dummy events for non-matching steps #13382
Add offset based lag metrics #13298
Add protobuf codegen decoder #12980
Add retry policy to wait for job id to persist during rebalancing #13372
Add round-robin logic during downloadSegmentFromPeer #12353
Add schema as input to the decoder. #12981
Add splitPartWithLimit and splitPartFromEnd UDFs #12437
Add support for creating raw derived columns during segment reload #13037
Add support for raw JSON filter predicates #13283
Add the possibility of configuring ForwardIndexes with compressionCodec #12218
Add upsert-snapshot timer metric #12383
Add validation check for forward index disabled if it's a REALTIME table #12838
Added PR compatibility test against release 1.1.0 #12921
Added kafka partition number to metadata. #13447
Added pinot-error-code header in query response #12338
Added tests for additional data types in SegmentPreProcessorTest.java #12755
Adding a cluster config to enable instance pool and replica group configuration in table config #13131
Adding batch api support for WindowFunction #12993
Adding bytes string data type integration tests #12387
Adding registerExtraComponents to allow registering additional components in various services #13465
Adding support of insecure TLS #12416
Adding support to insecure TLS when creating SSLFactory #12425
Adds AGGREGATE_CASE_TO_FILTER rule #12643
Adds per-column, query-time index skip option #12414
Allow Aggregations in Case Expressions #12613
Allow PinotHelixResourceManager subclasses to be used in the controller starter by providing an overridable PinotHelixResourceManager object creator function #13495
Allow RequestContext to consider http-headers case-insensitivity #13169
Allow Server throttling just before executing queries on server to allow max CPU and disk utilization #12930
Allow all raw index config in star-tree index #13225
Allow apply both environment variables and system properties to user and table configs, Environment variables take precedence over system properties #13011
Allow configurable queryWorkerThreads in Pinot server side GrpcQueryServer #13404
Allow dynamically setting the log level even for loggers that aren't already explicitly configured #13156
Allow passing custom record reader to be inited/closed in SegmentProcessorFramework #12529
Allow passing database context through database http header #12417
Allow stop to interrupt the consumer thread and safely release the resource #13418
Allow user configurable regex library for queries #13005
Allow using 'serverReturnFinalResult' to optimize server partitioned table #13208
Assign default value to newly added derived column upon reload #12648
Avoid port conflict in integration tests #13390
Better handling of null tableNames #12654
CLP as a compressionCodec #12504
Change helm app version to 1.0.0 for Apache Pinot latest release version #12436
Clean Google Dependencies #13297
Clean up BrokerRequestHandler and BrokerResponse #13179
Clean up arbitrary sleep in /GrpcBrokerClusterIntegrationTest #12379
Cleaning up vector index comments and exceptions #13150
Cleanup HTTP components dependencies and upgrade Thrift #12905
Cleanup Javax and Jakarta dependencies #12760
Cleanup deprecated query options #13040
Cleanup the consumer interfaces and legacy code #12697
Cleanup unnecessary dependencies under pinot-s3 #12904
Cleanup unused aggregate internal hint #13295
Consistency in API response for live broker #12201
Consolidate bouncycastle libraries #12706
Consolidate nimbus-jose-jwt version to 9.37.3 #12609
ControllerRequestClient accepts headers. Useful for authN tests #13481
Custom configuration property reader for segment metadata files #12440
Delete database API #12765
Deprecate PinotHelixResourceManager#getAllTables() in favour of getAllTables(String databaseName) #12782
Detect expired messages in Kafka. Log and set a gauge. #12608
Do not hard code resource class in BaseClusterIntegrationTest #13400
Do not pause ingestion when upsert snapshot flow errors out #13257
Don't drop original field during flatten #13490
Don't enforce -realTimeInstanceCount and -offlineInstanceCount options when creating broker tenants #13236
Egalpin/skip indexes minor changes #12514
Emit Metrics for Broker Adaptive Server Selector type #12482
Emit table size related metrics only in lead controller #12747
Enable complexType handling in SegmentProcessFramework #12942
Enable more integration tests to run on the v2 multi-stage query engine #13467
Enabling avroParquet to read Int96 as bytes #12484
Enhance Kinesis consumer #12806
Enhance Parquet Test #13082
Enhance ProtoSerializationUtils to handle class move #12946
Enhance Pulsar consumer #12812
Enhance PulsarConsumerTest #12948
Enhance commit threshold to accept size threshold without setting rows to 0 #12684
Enhance json index to support regexp and range predicate evaluation #12568
Enhancement: Sketch value aggregator performance #13020
Ensure FieldConfig.getEncodingType() is never null #12430
Ensure all the lists used in PinotQuery are ArrayList #13017
Ensure brokerId and requestId are always set in BrokerResponse #13200
Enable segment preloading at partition level #12451
Exclude dimensions from star-tree index stored type check #13355
Expose more helper API in TableDataManager #13147
Extend compatibility verifier operation timeout from 1m to 2m to reduce flakiness #13338
Extract json individual array elements from json index for the transform function jsonExtractIndex #12466
Fetch query quota capacity utilization rate metric in a callback function #12767
First with time #12235
GitHub Actions checkout v4 #12550
Gzip compression, ensure uncompressed size can be calculated from compressed buffer #12802
Handle errors gracefully during multi-stage stats collection in the broker #13496
Handle shaded classes in all methods of kafka factory #13087
Hash Function for UUID Primary Keys #12538
Ignore case when checking for Direct Memory OOM #12657
Improve Retention Manager Segment Lineage Clean Up #13232
Improve error message for max rows in join limit breach #13394
Improve exception logging when we fail to index / transform message #12594
Improve logging in range index handler for index updates #13381
Improve upsert compaction threshold validations #13424
Improve warn logs for requesting validDocID snapshots #13280
Improved metrics for server grpc query #13177
Improved null check for varargs #12673
Improved segment build time for Lucene text index realtime to offline conversion #12744
In ClusterTest, make start port higher to avoid potential conflict with Kafka #13402
Introduce PinotLogicalAggregate and remove internal hint #13291
Introduce retries while creating stream message decoder for more robustness #13036
Isolate bad server configs during broker startup phase #12931
Issue #12367 #12922
Json extract index filter support #12683
Json extract index mv #12532
Keep get tables API with and without database #12804
Lint failure #12294
Logging a warn message instead of throwing exception #12546
Made the error message around dimension table size clearer #13163
Make Helix state transition handling idempotent #12886
Make KafkaConsumerFactory method less restrictive to avoid incompatibility #12815
Make task manager APIs database aware #12766
Metric for count of tables configured with various tier backends #12940
Metric for upsert tables count #12505
Metrics for Realtime Rows Fetched and Stream Consumer Create Exceptions #12522
Minmaxrange null #12252
Modify consumingSegmentsInfo endpoint to indicate how many servers failed #12523
Move offset validation logic to consumer classes #13015
Move package org.apache.calcite to org.apache.pinot.calcite #12837
Move resolveComparisonTies from addOrReplaceSegment to base class #13396
Move some mispositioned tests under pinot-core #12884
Move wildfly-openssl dependency management to root pom #12597
Moving deleteSegment call from POST to DELETE call #12663
Optimize unnecessary extra array allocation and conversion for raw derived column during segment reload #13115
Pass explicit TypeRef when evaluating MV jsonPath #12524
Percentile operations supporting null #12271
Prepare for next development iteration #12530
Propagate Disable User Agent Config to Http Client #12479
Properly handle complex type transformer in segment processor framework #13258
Properly return response if SegmentCompletion is aborted #13206
Publish helm 0.2.8 #12465
Publish helm 0.2.9 #13230
Pull janino dependency to root pom #12724
Pull pulsar version definition into root POM #13002
Query response opt #13420
Re-enable the Spotless plugin for Java 21 #12992
Readme - How to setup Pinot UI for development #12408
Record enricher #12243
Refactor PinotTaskManager class #12964
Refactored CommonsConfigurationUtils for loading properties configuration. #13201
Refactored compatibility-verifier module #13359
Refactoring removeSegment flow in upsert #13449
Refine PeerServerSegmentFinder #12933
Refine SegmentFetcherFactory #12936
Replace custom fmpp plugin with fmpp-maven-plugin #12737
Reposition query submission spot for adaptive server selection #13327
Reset controller port when stopping the controller in ControllerTest #13399
Rest Endpoint to Create ZNode #12497
Return clear error message when no common broker found for multi-stage query with tables from different tenants #13235
Returning table names failing authorization in Exception of Multi Stage Engine Queries #13195
Revert " Adding record reader config/context param to record transformer (#12520)" #12526
Revert "Using local copy of segment instead of downloading from remote (#12863)" #13114
Short circuit SubPlanFragmenter because we don't support multiple sub-plans yet #13306
Simplify Google dependencies by importing BOM #12456
Specify version for commons-validator #12935
Support NOT in StarTree Index #12988
Support empty strings as json nodes #12555
Supporting human-readable format when configuring broker response size #12510
Use ArrayList instead of LinkedList in SortOperator #12783
Use a two server setup for multi-stage query engine backward compatibility regression test suite #13371
Use more efficient variants of URLEncoder::encode and URLDecoder::decode #13030
Use parameterized log messages instead of string concatenation #13145
Use separate action for /tasks/scheduler/jobDetails API #13054
Use try-with-resources to close file walk stream in LocalPinotFS #13029
Using local copy of segment instead of downloading from remote #12863
[Adaptive Server Selector] Add metrics for Stats Manager Queue Size #12340
[Cleanup] Move classes in pinot-common to the correct package #13478
[Feature] Add Support for SQL Formatting in Query Editor #11725
[HELM]: Added additional probes options and startup probe. #13165
[HELM]: Added checksum config annotation in stateful set for broker, controller and server #13059
[HELM]: Added namespace support in K8s deployment. #13380
[HELM]: zookeeper chart upgrade to version 13.2.0 #13083
[Minor] Add Nullable annotation to HttpHeaders in BrokerRequestHandler #12816
[Minor] Small refactor of raw index creator constructor to be more clear #13093
[Multi-stage] Clean up RelNode to Operator handling #13325
[null-aggr] Add null handling support in mode aggregation #12227
[partial-upsert] configure early release of _partitionGroupConsumerSemaphore in RealtimeSegmentDataManager #13256
[spark-connector] Add option to fail read when there are invalid segments #13080
add Netty arm64 dependencies #12493
add Netty unit test #12486
add SegmentContext to collect validDocIds bitmaps for many segments together #12694
add skipUnavailableServers query option #13387
add insecure mode when Pinot uses TLS connections #12525
add instrumentation to json index getMatchingFlattenedDocsMap() #13164
add jmx to prometheus metric exporting rule for realtimeRowsFiltered #12759
add metrics for IdealState update #13266
add some metrics for upsert table preloading #12722
add some tests on jsonPathString #12954
add test cases in RequestUtilsTest #12557
add unit test for JsonAsyncHttpPinotClientTransport #12633
add unit test for QueryServer #12599
add unit test for ServerChannels #12616
add unit test for StringFunctions encodeUrl #13391
add unit tests for pinot-jdbc-client #13137
add url assertion to SegmentCompletionProtocolTest #13373
adjust the llc partition consuming metric reporting logic #12627
allow passing null http headers object to translateTableName #12764
allow to set segment when use SegmentProcessorFramework #13341
auto renew jvm default sslcontext when it's loaded from files #12462
avoid useless intermediate byte array allocation for VarChunkV4Reader's getStringMV #12978
aws sdk 2.25.3 #12562
build-helper-maven-plugin 3.5.0 #12548
cache ssl contexts and reuse them #12404
clean up jetbrain nullable annotation #13427
cleanup: maven no transfer progress #12444
close JDBC connections #12494
do not fail on duplicate relaxed vars #13214
dropwizard metrics 4.2.25 #12600
dynamic chunk sizing for v4 raw forward index #12945
enable Netty leak detection #12483
enable parallel Maven in pinot linter script #12751
ensure inverse And/OrFilterOperator implementations match the query #13199
exclude .mvn directory from source assembly #12558
extend CompactedPinotSegmentRecordReader so that it can skip deleteRecord #13352
get startTime outside the executor task to avoid flaky time checks #13250
handle absent segments so that catchup checker doesn't get stuck on them #12883
handle overflow for MutableOffHeapByteArrayStore buffer starting size #13215
handle segments not tracked by partition mgr and add skipUpsertView query option #13415
handle table name translation on missed api resources #12792
hash4j version upgrade to 0.17.0 #12968
including the underlying exception in the logging output #13248
int96 parity with native parquet reader #12496
jsonExtractIndex support array of default values #12748
log the log rate limiter rate for dropped broker logs #13041
make http listener ssl config swappable #12455
make reflection calls compatible with 0.9.11 #12958
maven: no transfer progress #12528
missed to delete the temp dir #12637
move shouldReplaceOnComparisonTie to base class to be more reusable #13353
reduce Java enum .values() usage in TimerContext #12579
reduce logging for SpecialValueTransformer #12970
reduce regex pattern compilation in Pinot jdbc #13138
refactor TlsUtils class #12515
refine when to registerSegment while doing addSegment and replaceSegment for upsert tables for better data consistency #12709
reformat AdminConsoleIntegrationTest.java #12552
reformat ClusterTest.java #12531
release segment mgrs more reliably #13216
replaced getServer with getServers #12545
report rebalance job status for the early returns like noops #13281
require noDictionaryColumns with aggregationConfigs #12464
share the same table config object #12463
track segments for snapshotting even if they lost all comparisons #13388
untrack the segment out of TTL #12449
update ControllerJobType from enum to string #12518
update RewriterConstants so that expr min max would not collide with columns start with "parent" #13357
update access control check error handling to catch throwable and log errors #13209

Bug Fixes

Use gte(lte) to replace between() which has a bug #12595
Fix the ConcurrentModificationException for And/Or DocIdSet #12611
Upgrade RoaringBitmap to 1.0.5 to pick up the fix for RangeBitmap.between() #12604
bugfix: do not move src ByteBuffer position for LZ4 length prefixed decompress #12539
Bug Fix createDictionaryForColumn does not take into account inverted index #13048
fix Cluster Manager error #12632
fix for quick start Cluster Manager issue #12610
Adding config for having suffix for client ID for realtime consumer #13168
Addressed comments and fixed tests from pull request 12389. /uptime and /start-time endpoints working all components #12512
Bugfix. Added missing paramName #13060
Bug fix: Do not ignore scheme property #12332
Bug fix: Handle missing shade config overwrites for Kafka #13437
BugFix: Fix merge result from more than one server #12778
Bugfix. Allow tenant rebalance with downtime as true #13246
Bugfix. Avoid passing null table name input to translation util #12726
Bugfix. Correct wrong method call from scheduleTask() to scheduleTaskForDatabase() #12791
Bugfix. Maintain literal data type during function evaluation #12607
Cleanup: Fix grammar in error message, also improve readability. #13451
Fix Bug in Handling Equal Comparison Column Values in Upsert #12395
Fix ColumnMinMaxValueGenerator #12502
Fix JavaEE related dependencies #13058
Fix Logging Location for CPU-Based Query Killing #13318
Fix PulsarUtils to not share buffer #12671
Fix URI construction so that AddSchema command line tool works when override flag is set to true #13320
Fix [Type]ArrayList elements() method usage #13354
Fix a typo when calculating query freshness #12947
Fix an overflow in PinotDataBuffer.readFrom #13152
Fix bug in logging in UpsertCompaction task #12419
Fix bug to return validDocIDsMetadata from all servers #12431
Fix connection issues if using JDBC and Hikari (#12267) #12411
Fix controller host / port / protocol CLI option description for admin commands #13237
Fix environment variables not applied when creating table #12560
Fix error message for insufficient number of untagged brokers during tenant creation #13234
Fix few metric rules which were affected by the database prefix handling #13290
Fix file handle leaks in Pinot Driver (#12263) #12356
Fix flakiness of ControllerPeriodicTasksIntegrationTest #13337
Fix issue with startree index metadata loading for columns with '__' in name #12554
Fix metric rule pattern regex #12856
Fix pinot-parquet NoClassFound issue #12615
Fix segment size check in OfflineClusterIntegrationTest #13389
Fix some resource leak in tests #12794
Fix the NPE from IS update metrics #13313
Fix the NPE when metadataTTL is enabled without delete column #13262
Fix the ServletConfig loading issue with swagger. #13122
Fix the issue that map flatten shouldn't remove the map field from the record #13243
Fix the race condition for H3InclusionIndexFilterOperator #12487
Fix the time segment pruner on TIMESTAMP data type #12789
Fix time stats in SegmentIndexCreationDriverImpl #13429
Fixed infer logical type name from avro union schema #13224
Fixing instance type to resolve #12677 and #12678
Helm: bug fix for chart rendering issue. #13264
Try to amend kafka common package with pinot shaded package prefix #13056
Update getValidDocIdsMetadataFromServer to make call in batches to servers and other bug fixes #13314
Upgrade com.microsoft.azure:msal4j from 1.3.5 to 1.3.10 for CVE fixing #12580
[bugfix] Handling null value for kafka client id suffix #13279
bugfix: fixing jdbc client sql feature not supported exception #12480
bugfix: re-add support for not text_match #12372
bugfix: reduce enum array allocation in QueryLogger #12478
bugfix: use consumerDir during lucene realtime segment conversion #13094
cleanup: fix apache rat violation #12476
fix GuavaRateLimiter acquire method #12500
fix fieldsToRead class not in decoder #13186
fix flakey test, avoid early finalization #13095
fix merging null multi value in partial upsert #13031
fix race condition in ScalingThreadPoolExecutor #13360
fix shared buffer, tests #12587
fix(build): update node version to 16 #12924
fixing CVE critical issues by resolving kerby/jline and wildfly libraries #12566
fixing pinot-adls high severity CVEs #12571
fixing swagger setup using localhost as host name #13254
swagger-ui upgrade to 5.15.0 Fixes #12908
upgrade jettison version to fix CVE #12567

apache/pinot release-1.2.0 Apache Pinot Release 1.2.0 on GitHub
