What Changed
This release brings several improvements and bug fixes for the Multistage Engine, Upserts, and Compaction, along with many smaller features and general bug fixes.
Multistage Engine Improvements
Features
New Window Functions: LEAD, LAG, FIRST_VALUE, LAST_VALUE #12878 #13340
- LEAD allows you to access values after the current row in a frame.
- LAG allows you to access values before the current row in a frame.
- FIRST_VALUE and LAST_VALUE return the respective extremal values in the frame.
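The semantics of these four functions can be sketched in plain Python. This is only an illustration over a single, already ordered partition, not Pinot's implementation; the helper names and the sample data are hypothetical, and the whole partition is treated as the frame for simplicity.

```python
from typing import List, Optional


def lead(values: List[int], i: int, offset: int = 1) -> Optional[int]:
    """Value `offset` rows after row i, or None when that row does not exist."""
    j = i + offset
    return values[j] if 0 <= j < len(values) else None


def lag(values: List[int], i: int, offset: int = 1) -> Optional[int]:
    """Value `offset` rows before row i, or None when that row does not exist."""
    j = i - offset
    return values[j] if 0 <= j < len(values) else None


def first_value(frame: List[int]) -> int:
    """First value of the frame."""
    return frame[0]


def last_value(frame: List[int]) -> int:
    """Last value of the frame."""
    return frame[-1]


# One partition, assumed to be sorted by the window's ORDER BY key already.
prices = [10, 12, 11, 15]

# For every row: (current value, LAG, LEAD).
rows = [(p, lag(prices, i), lead(prices, i)) for i, p in enumerate(prices)]
```

A typical use is computing row-over-row deltas, for example `p - lag(prices, i)` for every row that has a predecessor.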
Support for Logical Database in V2 Engine #12591 #12695
- V2 Engine now supports a "database" construct, enabling table namespace isolation within the same Pinot cluster.
- Improves user experience when multiple users are using the same Pinot Cluster.
- Access control policies can be set at the database level.
- Database can be selected in a query using a SET statement, such as SET database=my_db;
Improved Multi-Value (MV) and Array Function Support
- Added array sum aggregation functions for point-wise array operations #13324.
- Added support for valueIn MV transform function #13443.
- Fixed bug in numeric casts for MV columns in filters #13425.
- Fixed NPE in ArrayAgg when a column contains no data #13358.
- Fixed array literal handling #13345.
Support for WITHIN GROUP Clause and ListAgg #13146
- WITHIN GROUP Clause can be used to process rows in a given order within a group.
- One of the most common use-cases for this is the ListAgg function, which when combined with WITHIN GROUP can be used to concatenate strings in a given order.
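To make the ordering behaviour concrete, here is a small Python sketch of what ListAgg combined with WITHIN GROUP computes: rows are grouped, sorted inside each group, and then concatenated. The function name, column names, and sample rows are hypothetical, and this models the semantics only, not how Pinot executes the query.

```python
from collections import defaultdict


def list_agg_within_group(rows, group_key, value_key, order_key, separator=","):
    """Concatenate `value_key` per group, ordered by `order_key` inside each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    result = {}
    for key, members in groups.items():
        ordered = sorted(members, key=lambda r: r[order_key])
        result[key] = separator.join(str(r[value_key]) for r in ordered)
    return result


# Hypothetical clickstream rows, deliberately out of order.
events = [
    {"user": "a", "page": "checkout", "ts": 3},
    {"user": "a", "page": "home", "ts": 1},
    {"user": "b", "page": "home", "ts": 2},
    {"user": "a", "page": "cart", "ts": 2},
]

paths = list_agg_within_group(events, "user", "page", "ts", separator=">")
```

Without the ordering step the concatenation order would depend on row arrival order, which is why the clause matters for use-cases such as building a per-user page path.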
Scalar/Transform Function and Set Operation Improvements
- Added Geospatial Scalar Function support for use in intermediate stage in the v2 query engine #13457.
- Fix 'WEEK' transform function #13483.
- Support EXTRACT as a scalar function #13463.
- Added support for ALL modifier for INTERSECT and EXCEPT Set Operations #13151 #13166.
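The ALL modifier switches INTERSECT and EXCEPT from set semantics to multiset semantics, so duplicates are counted rather than collapsed. The sketch below models the standard SQL behaviour with Python's `collections.Counter`; the function names and inputs are illustrative, and it makes no claim about how Pinot implements the operators.

```python
from collections import Counter


def intersect_all(left, right):
    """Each value appears min(count in left, count in right) times."""
    return sorted((Counter(left) & Counter(right)).elements())


def except_all(left, right):
    """Each value appears max(count in left - count in right, 0) times."""
    return sorted((Counter(left) - Counter(right)).elements())


left = [1, 1, 1, 2, 3]
right = [1, 1, 2, 2]

with_all = (intersect_all(left, right), except_all(left, right))

# Without ALL, duplicates are removed before comparing.
without_all = (sorted(set(left) & set(right)), sorted(set(left) - set(right)))
```

Comparing `with_all` and `without_all` shows the difference: the value 1 survives EXCEPT ALL once because the left side has one more copy than the right, while plain EXCEPT drops it entirely.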
Improved Literal Handling Support
- Fixed bug in handling literal arguments in aggregation functions like Percentile #13282.
- Allow INT and FLOAT literals #13078.
- Fixed literal handling for all types #13344 #13345.
- Fixed null literal handling for null intolerant functions #13255.
Metrics Improvements
- Added new metrics for tracking queries executed globally and at the table level #12982.
- New metrics to track join counts and window function counts #13032.
- Multiple meters and timers to track Multistage Engine Internals #13035.
Notable Improvements and Bug Fixes
- Improved Window operator resiliency, with new checks to make sure the window doesn't grow too large #13180 #13428 #13441.
- Optimized Group Key generation #12394.
- Fixed SortedMailboxReceiveOperator to honor convention of pulling at most 1 EOS block #12406.
- Improvement in how execution stats are handled #12517 #12704 #13136.
- Use Protobuf instead of Reflection for Plan Serialization #13221.
Upsert Compaction and Minion Improvements
Features and Improvements
Minion Resource Isolation #12459 #12786
- Minions now support resource isolation based on an instance tag.
- Instance tag is configured at table level, and can be set for each task on a table.
- This enables you to implement arbitrary resource isolation strategies, i.e. you can use a set of Minion Nodes for running any set of tasks across any set of tables.
Greedy Upsert Compaction Scheduling #12461
- Upsert compaction now schedules segments for compaction based on the number of invalid docs.
- This helps the compaction task to handle arbitrary temporal distribution of invalid docs.
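The idea of greedy scheduling can be sketched as follows: rank segments by their invalid document count and pick the worst ones first, regardless of how old they are. This is a simplified illustration; the function name, the `max_segments` cap, and the segment names are assumptions made for the example and do not reflect the task's actual configuration or thresholds.

```python
def pick_segments_for_compaction(invalid_docs_by_segment, max_segments):
    """Greedy choice: segments with the most invalid docs are scheduled first."""
    ranked = sorted(
        invalid_docs_by_segment.items(),
        key=lambda item: item[1],
        reverse=True,
    )
    # Segments without invalid docs have nothing to reclaim, so skip them.
    return [name for name, invalid in ranked[:max_segments] if invalid > 0]


# Hypothetical invalid doc counts; note the counts do not follow segment age.
invalid_docs = {
    "seg_2021": 50,
    "seg_2022": 0,
    "seg_2023": 900,
    "seg_2024": 300,
}

scheduled = pick_segments_for_compaction(invalid_docs, max_segments=2)
```

Because the ranking looks only at invalid doc counts, the selection works equally well when invalid docs are concentrated in recent segments, old segments, or scattered across both.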
Notable Improvements
- Minions can now download segments from servers when deepstore copy is missing. This feature is enabled via a cluster level config allowDownloadFromServer #12960 #13247.
- Added support for TLS Port in Minions #12943.
- New metrics added for Minions to track segment/record processing information #12710.
Bug Fixes
- Minions can now handle invalid instance tags in Task Configs gracefully. Prior to this change, Minions would be stuck in IN_PROGRESS state until task timeout #13092.
- Fix bug to return validDocIDsMetadata from all servers #12431.
- Fixed a bug where Upsert compaction didn't retain maxLength information and trimmed string fields #13157.
Upsert Improvements
Features and Improvements
Consistent Table View for Upsert Tables #12976
- Adds different modes of consistency guarantees for Upsert tables.
- Adds a new UpsertConfig called consistencyMode which can be set to NONE, SYNC, SNAPSHOT.
- SYNC is optimized for data freshness but can lead to elevated query latencies and is best for low-qps use-cases. In this mode, the ingestion threads will take a WLock when updating validDocID bitmaps.
- SNAPSHOT mode can handle high-qps/high-ingestion use-cases by getting the list of valid docs from a snapshot of validDocID. The snapshot can be refreshed every few seconds and the tolerance can be set via a query option upsertViewFreshnessMs.
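As a rough sketch of where the new setting lives, the Python snippet below builds a table config fragment with consistencyMode inside upsertConfig. The helper name is hypothetical, the `"mode": "FULL"` entry is included only as an assumed example of an existing upsert setting, and the exact placement should be checked against the Pinot documentation for your version.

```python
import json

# The three values named in these release notes.
VALID_CONSISTENCY_MODES = {"NONE", "SYNC", "SNAPSHOT"}


def upsert_config(consistency_mode):
    """Build a table config fragment; rejects modes not listed in the notes."""
    if consistency_mode not in VALID_CONSISTENCY_MODES:
        raise ValueError(f"unknown consistencyMode: {consistency_mode}")
    return {
        "upsertConfig": {
            "mode": "FULL",  # assumed existing upsert mode, shown for context
            "consistencyMode": consistency_mode,
        }
    }


fragment = upsert_config("SNAPSHOT")
rendered = json.dumps(fragment, indent=2)
```

Printing `rendered` gives a JSON fragment that could be merged into a table config by hand.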
Pluggable Partial Upsert Merger #11983
- Partial Upsert merges the old record and the new incoming record to generate the final ingested record.
- Pinot now allows users to customize how this merge of an old row and the new row is computed.
- This allows a column value in the new row to be an arbitrary function of the old and the new row.
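Conceptually, a custom merger is a function that receives the old row and the new row and returns the column values to ingest. The Python sketch below illustrates that idea only; it is not Pinot's plugin interface, and the function, column names, and merge rules are all hypothetical.

```python
def merge_rows(old_row, new_row, column_mergers):
    """Start from the new row, then overwrite columns that have a custom merger."""
    merged = dict(new_row)
    for column, merger in column_mergers.items():
        # Each merger sees both full rows, so a column can depend on any field.
        merged[column] = merger(old_row, new_row)
    return merged


# Hypothetical merge rules for a customer table.
mergers = {
    # Accumulate spend across updates.
    "total_spend": lambda old, new: old["total_spend"] + new["total_spend"],
    # Derive the tier from the accumulated spend rather than the incoming value.
    "tier": lambda old, new: (
        "gold" if old["total_spend"] + new["total_spend"] >= 100 else new["tier"]
    ),
}

old = {"user_id": 7, "total_spend": 80, "tier": "silver"}
new = {"user_id": 7, "total_spend": 30, "tier": "silver"}

merged = merge_rows(old, new, mergers)
```

The `tier` rule shows the point of the feature: its result depends on a different column of both rows, which a fixed per-column strategy could not express.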
Support for Uploading Externally Partitioned Segments for Upsert Backfill #13107
- Segments uploaded for Upsert Backfill can now explicitly specify the Kafka partition they belong to.
- This enables backfilling an Upsert table where the externally generated segments are partitioned using an arbitrary hash function on an arbitrary primary key.
Misc Improvements and Bug Fixes
- Fixed a bug in handling equal comparison column values in Upsert, which could lead to data inconsistency #12395.
- Upsert snapshot will now snapshot only those segments which have updates. #13285.
Notable Features
JSON Support Improvements
- JSON Index can now be used for evaluating Regex and Range Predicates. #12568
- jsonExtractIndex now supports contextual array filters. #12683 #12531.
- JSON column type now supports filter predicates like =, !=, IN and NOT IN. This is convenient for scenarios where the JSON values are very small. #13283.
- JSON_MATCH now supports exclusive predicates correctly. For instance, you can use predicates such as JSON_MATCH(person, '"$.addresses[*].country" != ''us''') to find all people who have at least one address that is not in the US. #13139.
- jsonExtractIndex supports extracting Multi-Value JSON Fields, and also supports providing any default value when the key doesn't exist. #12748.
- Added isJson UDF which increases your options to handle invalid JSONs. This can be used in queries and for filtering invalid json column values in ingestion. #12603.
- Fix ArrayIndexOutOfBoundsException in jsonExtractIndex. #13479.
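The exclusive predicate described above matches a person when at least one address has a country other than the given one. A plain Python sketch of that reading, using hypothetical sample documents and assuming every address carries a country field, looks like this:

```python
def has_address_outside(person, country):
    """True when at least one address is in a country other than `country`."""
    return any(address["country"] != country for address in person["addresses"])


# Hypothetical documents; every address is assumed to have a "country" key.
people = [
    {"name": "ana", "addresses": [{"country": "us"}, {"country": "fr"}]},
    {"name": "bo", "addresses": [{"country": "us"}]},
    {"name": "cy", "addresses": [{"country": "de"}]},
]

matches = [p["name"] for p in people if has_address_outside(p, "us")]
```

Note that "ana" matches even though one of her addresses is in the US: the predicate asks whether some address differs, not whether all of them do.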
Lucene and Text Search Improvements
- Improved Segment Build Time for Lucene Text Index by 40-60%. This improvement is realized when a consuming segment commits and changes to an ImmutableSegment. This significantly helps in lowering ingestion lag at commit time due to a large text index #12744 #13094 #13050.
- Phrase Search can run 3x faster when the Lucene Index Config enablePrefixSuffixMatchingInPhraseQueries is set to true. This is achieved by rewriting phrase search query to a wildcard and prefix matching query #12680.
- Fixed bug in TextMatchFilterOptimizer that was not applying precedence to the filter expressions properly, which could lead to incorrect results. #13009.
- Fixed bug in handling NOT text_match which could have returned incorrect results. #12372.
- Added SchemaConformingTransformerV2 to enhance text search abilities. #12788.
- Added metrics to track Lucene NRT Refresh Delay #13307.
- Switched to NRTCachingDirectory for Realtime segments and prevented duplicates in the Realtime Lucene Index to avoid IndexOutOfBounds query time exceptions. #13308.
- Lucene Version is upgraded to 9.11.1. #13505.
New Funnel Functions #13176 #13231 #13228
- Added funnelMaxStep function which can be used to calculate max funnel steps for a given sliding window.
- Added funnelCompleteCount to calculate the number of completed funnels, and funnelMatchStep to get the funnel match array.
Support for Interning for OnHeapByteDictionary #12342
- This can reduce the heap usage of a dictionary encoded byte column, for a certain distribution of duplicate values. See #12223 for details.
Column Major Builder On By Default for New Tables #12770
- Prior to this feature, on a segment commit, Pinot would convert all the columnar data from the Mutable Segment to row-major, and then re-build column major Immutable Segments.
- This feature skips the row-major conversion and is expected to be both space and time efficient.
- It can help lower ingestion lag from segment commits, especially helpful when your segments are large.
Support for SQL Formatting in Query Editor #11725
- You can now prettify SQL right in the Controller UI!
Hash Function for UUID Primary Keys #12538
- Added a new lossless hash-function for Upsert Primary Keys optimized for UUIDs.
- The hash function can reduce Old Gen by up to 30%.
- It maps a UUID to a 16 byte array, vs encoding it in a UTF string which would take 36 bytes.
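The size difference is easy to verify with Python's standard `uuid` module. This only demonstrates the 36-byte versus 16-byte encoding and that the mapping is lossless; it is not Pinot's hash function, and the sample key is arbitrary.

```python
import uuid

# Arbitrary example primary key in canonical UUID text form.
key = "123e4567-e89b-12d3-a456-426614174000"

# Stored as text: 32 hex digits plus 4 hyphens.
as_text = key.encode("utf-8")

# Stored as raw bytes: the 128-bit value itself.
as_bytes = uuid.UUID(key).bytes

# The original string can be rebuilt from the bytes, so nothing is lost.
round_trip = str(uuid.UUID(bytes=as_bytes))
```

Per key this saves 20 bytes of payload, which adds up quickly for upsert tables that hold a large number of primary keys on heap.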
Column Level Index Skip Query Option #12414
- Convenient for debugging impact of indexes on query performance or results.
- You can add the skipIndexes option to your query to skip any number of indexes, e.g. SET skipIndexes=inverted,range;
New UDFs and Scalar Functions
- New GeoHash functions: encodeGeoHash, decodeGeoHash, decodeGeoHashLatitude and decodeGeoHashLongitude.
- dateBin can be used to align a timestamp to the nearest time bucket.
- prefixes, suffixes and uniqueNgrams UDFs for generating all respective string subsequences from a string input. #12392.
- Added isJson UDF which increases your options to handle invalid JSONs. This can be used in queries and for filtering invalid json column values in ingestion. #12603.
- splitPart UDF has minor improvements. #12437.
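To illustrate what the prefixes, suffixes and uniqueNgrams UDFs generate, here is a Python sketch of the three string operations. The parameter names and argument order are assumptions made for the example and may not match the UDF signatures in Pinot.

```python
def prefixes(text, max_length):
    """All prefixes of `text` up to `max_length` characters, shortest first."""
    return [text[:n] for n in range(1, min(max_length, len(text)) + 1)]


def suffixes(text, max_length):
    """All suffixes of `text` up to `max_length` characters, shortest first."""
    return [text[-n:] for n in range(1, min(max_length, len(text)) + 1)]


def unique_ngrams(text, length):
    """Distinct substrings of exactly `length` characters, sorted for stable output."""
    return sorted({text[i:i + length] for i in range(len(text) - length + 1)})


prefix_list = prefixes("pinot", 3)
suffix_list = suffixes("pinot", 3)
ngram_list = unique_ngrams("banana", 2)
```

Outputs like these are commonly indexed to speed up prefix, suffix, or substring lookups on string columns.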
CLP Compression Codec in Forward Indexes #12504
- CLP (Compressed Log Processor) achieves a very high compression ratio for certain log types.
- To enable this, you can set the compressionCodec in the fieldConfigList of the column you want to target.
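A field config entry for a CLP-compressed column might look like the fragment built below. This is a sketch under assumptions: the column name `logLine` is hypothetical, and `"encodingType": "RAW"` is assumed because compression codecs apply to raw forward indexes; verify both against the Pinot documentation before use.

```python
import json


def clp_field_config(column_name):
    """Field config entry selecting CLP for one column (keys partly assumed)."""
    return {
        "name": column_name,
        "encodingType": "RAW",  # assumption: codec applies to raw forward index
        "compressionCodec": "CLP",
    }


# Hypothetical log message column.
table_config_fragment = {"fieldConfigList": [clp_field_config("logLine")]}
rendered = json.dumps(table_config_fragment, indent=2)
```

The `rendered` string is a JSON fragment intended to be merged into the table config of the table that holds the log column.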
Misc. Improvements
- Enable segment preloading at partition level #12451.
- Use Temurin instead of AdoptOpenJdk #12533
- Adding record reader config/context param to record transformer #12520
- Removing legacy commons-lang dependency #13480
- 12508: Feature add segment rows flush config #12681
- ADSS Race Condition and update to client error codes #13104
- Add ExceptionMapper to convert Exception to Response Object for Broker REST API's #13292
- Add FunnelMaxStepAggregationFunction and FunnelCompleteCountAggregationFunction #13231
- Add PodDisruptionBudgets to the Pinot Helm chart #13153
- Add Postgres compliant name aliasing for String Functions. #12795
- Add SchemaConformingTransformerV2 to enhance text search abilities #12788
- Add a benchmark to measure multi-stage block serde cost #13336
- Add a plan version field to QueryRequest Protobuf Message #13267
- Add a post-validator visitor that verifies there are no cast to bytes #12475
- Add a safe version of CLStaticHttpHandler that disallows path traversal. #13124
- Add ability to track filtered messages offset #12602
- Add back 'numRowsResultSet' to BrokerResponse, and retain it when result table is hidden #13198
- Add back profile for shade #12979
- Add back some exclude deps from hadoop-mapreduce-client-core #12638
- Add backward compatibility regression test suite for multi-stage query engine #13193
- Add base class for custom object accumulator #12685
- Add clickstream example table for funnel analysis #13379
- Add config option for timezone #12386
- Add config to skip record ingestion on string column length exceeding configured max schema length #13103
- Add controller API to get allLiveInstances #12498
- Add isJson UDF #12603
- Add list of collaborators to asf.yaml #13346
- Add locking logic to get consistent table view for upsert tables #12976
- Add metric to track number of segments missed in upsert-snapshot #12581
- Add metrics for SEGMENTS_WITH_LESS_REPLICAS monitoring #12336
- Add mode to allow adding dummy events for non-matching steps #13382
- Add offset based lag metrics #13298
- Add protobuf codegen decoder #12980
- Add retry policy to wait for job id to persist during rebalancing #13372
- Add round-robin logic during downloadSegmentFromPeer #12353
- Add schema as input to the decoder. #12981
- Add splitPartWithLimit and splitPartFromEnd UDFs #12437
- Add support for creating raw derived columns during segment reload #13037
- Add support for raw JSON filter predicates #13283
- Add the possibility of configuring ForwardIndexes with compressionCodec #12218
- Add upsert-snapshot timer metric #12383
- Add validation check for forward index disabled if it's a REALTIME table #12838
- Added PR compatibility test against release 1.1.0 #12921
- Added kafka partition number to metadata. #13447
- Added pinot-error-code header in query response #12338
- Added tests for additional data types in SegmentPreProcessorTest.java #12755
- Adding a cluster config to enable instance pool and replica group configuration in table config #13131
- Adding batch api support for WindowFunction #12993
- Adding bytes string data type integration tests #12387
- Adding registerExtraComponents to allow registering additional components in various services #13465
- Adding support of insecure TLS #12416
- Adding support to insecure TLS when creating SSLFactory #12425
- Adds AGGREGATE_CASE_TO_FILTER rule #12643
- Adds per-column, query-time index skip option #12414
- Allow Aggregations in Case Expressions #12613
- Allow PinotHelixResourceManager subclasses to be used in the controller starter by providing an overridable PinotHelixResourceManager object creator function #13495
- Allow RequestContext to consider http-headers case-insensitivity #13169
- Allow Server throttling just before executing queries on server to allow max CPU and disk utilization #12930
- Allow all raw index config in star-tree index #13225
- Allow applying both environment variables and system properties to user and table configs; environment variables take precedence over system properties #13011
- Allow configurable queryWorkerThreads in Pinot server side GrpcQueryServer #13404
- Allow dynamically setting the log level even for loggers that aren't already explicitly configured #13156
- Allow passing custom record reader to be inited/closed in SegmentProcessorFramework #12529
- Allow passing database context through database http header #12417
- Allow stop to interrupt the consumer thread and safely release the resource #13418
- Allow user configurable regex library for queries #13005
- Allow using 'serverReturnFinalResult' to optimize server partitioned table #13208
- Assign default value to newly added derived column upon reload #12648
- Avoid port conflict in integration tests #13390
- Better handling of null tableNames #12654
- CLP as a compressionCodec #12504
- Change helm app version to 1.0.0 for Apache Pinot latest release version #12436
- Clean Google Dependencies #13297
- Clean up BrokerRequestHandler and BrokerResponse #13179
- Clean up arbitrary sleep in GrpcBrokerClusterIntegrationTest #12379
- Cleaning up vector index comments and exceptions #13150
- Cleanup HTTP components dependencies and upgrade Thrift #12905
- Cleanup Javax and Jakarta dependencies #12760
- Cleanup deprecated query options #13040
- Cleanup the consumer interfaces and legacy code #12697
- Cleanup unnecessary dependencies under pinot-s3 #12904
- Cleanup unused aggregate internal hint #13295
- Consistency in API response for live broker #12201
- Consolidate bouncycastle libraries #12706
- Consolidate nimbus-jose-jwt version to 9.37.3 #12609
- ControllerRequestClient accepts headers. Useful for authN tests #13481
- Custom configuration property reader for segment metadata files #12440
- Delete database API #12765
- Deprecate PinotHelixResourceManager#getAllTables() in favour of getAllTables(String databaseName) #12782
- Detect expired messages in Kafka. Log and set a gauge. #12608
- Do not hard code resource class in BaseClusterIntegrationTest #13400
- Do not pause ingestion when upsert snapshot flow errors out #13257
- Don't drop original field during flatten #13490
- Don't enforce -realTimeInstanceCount and -offlineInstanceCount options when creating broker tenants #13236
- Egalpin/skip indexes minor changes #12514
- Emit Metrics for Broker Adaptive Server Selector type #12482
- Emit table size related metrics only in lead controller #12747
- Enable complexType handling in SegmentProcessFramework #12942
- Enable more integration tests to run on the v2 multi-stage query engine #13467
- Enabling avroParquet to read Int96 as bytes #12484
- Enhance Kinesis consumer #12806
- Enhance Parquet Test #13082
- Enhance ProtoSerializationUtils to handle class move #12946
- Enhance Pulsar consumer #12812
- Enhance PulsarConsumerTest #12948
- Enhance commit threshold to accept size threshold without setting rows to 0 #12684
- Enhance json index to support regexp and range predicate evaluation #12568
- Enhancement: Sketch value aggregator performance #13020
- Ensure FieldConfig.getEncodingType() is never null #12430
- Ensure all the lists used in PinotQuery are ArrayList #13017
- Ensure brokerId and requestId are always set in BrokerResponse #13200
- Enter segment preloading at partition level #12451
- Exclude dimensions from star-tree index stored type check #13355
- Expose more helper API in TableDataManager #13147
- Extend compatibility verifier operation timeout from 1m to 2m to reduce flakiness #13338
- Extract json individual array elements from json index for the transform function jsonExtractIndex #12466
- Fetch query quota capacity utilization rate metric in a callback function #12767
- First with time #12235
- GitHub Actions checkout v4 #12550
- Gzip compression, ensure uncompressed size can be calculated from compressed buffer #12802
- Handle errors gracefully during multi-stage stats collection in the broker #13496
- Handle shaded classes in all methods of kafka factory #13087
- Hash Function for UUID Primary Keys #12538
- Ignore case when checking for Direct Memory OOM #12657
- Improve Retention Manager Segment Lineage Clean Up #13232
- Improve error message for max rows in join limit breach #13394
- Improve exception logging when we fail to index / transform message #12594
- Improve logging in range index handler for index updates #13381
- Improve upsert compaction threshold validations #13424
- Improve warn logs for requesting validDocID snapshots #13280
- Improved metrics for server grpc query #13177
- Improved null check for varargs #12673
- Improved segment build time for Lucene text index realtime to offline conversion #12744
- In ClusterTest, make start port higher to avoid potential conflict with Kafka #13402
- Introduce PinotLogicalAggregate and remove internal hint #13291
- Introduce retries while creating stream message decoder for more robustness #13036
- Isolate bad server configs during broker startup phase #12931
- Json extract index filter support #12683
- Json extract index mv #12532
- Keep get tables API with and without database #12804
- Lint failure #12294
- Logging a warn message instead of throwing exception #12546
- Made the error message around dimension table size clearer #13163
- Make Helix state transition handling idempotent #12886
- Make KafkaConsumerFactory method less restrictive to avoid incompatibility #12815
- Make task manager APIs database aware #12766
- Metric for count of tables configured with various tier backends #12940
- Metric for upsert tables count #12505
- Metrics for Realtime Rows Fetched and Stream Consumer Create Exceptions #12522
- Minmaxrange null #12252
- Modify consumingSegmentsInfo endpoint to indicate how many servers failed #12523
- Move offset validation logic to consumer classes #13015
- Move package org.apache.calcite to org.apache.pinot.calcite #12837
- Move resolveComparisonTies from addOrReplaceSegment to base class #13396
- Move some mispositioned tests under pinot-core #12884
- Move wildfly-openssl dependency management to root pom #12597
- Moving deleteSegment call from POST to DELETE call #12663
- Optimize unnecessary extra array allocation and conversion for raw derived column during segment reload #13115
- Pass explicit TypeRef when evaluating MV jsonPath #12524
- Percentile operations supporting null #12271
- Prepare for next development iteration #12530
- Propagate Disable User Agent Config to Http Client #12479
- Properly handle complex type transformer in segment processor framework #13258
- Properly return response if SegmentCompletion is aborted #13206
- Publish helm 0.2.8 #12465
- Publish helm 0.2.9 #13230
- Pull janino dependency to root pom #12724
- Pull pulsar version definition into root POM #13002
- Query response opt #13420
- Re-enable the Spotless plugin for Java 21 #12992
- Readme - How to setup Pinot UI for development #12408
- Record enricher #12243
- Refactor PinotTaskManager class #12964
- Refactored CommonsConfigurationUtils for loading properties configuration. #13201
- Refactored compatibility-verifier module #13359
- Refactoring removeSegment flow in upsert #13449
- Refine PeerServerSegmentFinder #12933
- Refine SegmentFetcherFactory #12936
- Replace custom fmpp plugin with fmpp-maven-plugin #12737
- Reposition query submission spot for adaptive server selection #13327
- Reset controller port when stopping the controller in ControllerTest #13399
- Rest Endpoint to Create ZNode #12497
- Return clear error message when no common broker found for multi-stage query with tables from different tenants #13235
- Returning tables names failing authorization in Exception of Multi Stage Engine Queries #13195
- Revert "Adding record reader config/context param to record transformer (#12520)" #12526
- Revert "Using local copy of segment instead of downloading from remote (#12863)" #13114
- Short circuit SubPlanFragmenter because we don't support multiple sub-plans yet #13306
- Simplify Google dependencies by importing BOM #12456
- Specify version for commons-validator #12935
- Support NOT in StarTree Index #12988
- Support empty strings as json nodes #12555
- Supporting human-readable format when configuring broker response size #12510
- Use ArrayList instead of LinkedList in SortOperator #12783
- Use a two server setup for multi-stage query engine backward compatibility regression test suite #13371
- Use more efficient variants of URLEncoder::encode and URLDecoder::decode #13030
- Use parameterized log messages instead of string concatenation #13145
- Use separate action for /tasks/scheduler/jobDetails API #13054
- Use try-with-resources to close file walk stream in LocalPinotFS #13029
- Using local copy of segment instead of downloading from remote #12863
- [Adaptive Server Selector] Add metrics for Stats Manager Queue Size #12340
- [Cleanup] Move classes in pinot-common to the correct package #13478
- [Feature] Add Support for SQL Formatting in Query Editor #11725
- [HELM]: Added additional probes options and startup probe. #13165
- [HELM]: Added checksum config annotation in stateful set for broker, controller and server #13059
- [HELM]: Added namespace support in K8s deployment. #13380
- [HELM]: zookeeper chart upgrade to version 13.2.0 #13083
- [Minor] Add Nullable annotation to HttpHeaders in BrokerRequestHandler #12816
- [Minor] Small refactor of raw index creator constructor to be more clear #13093
- [Multi-stage] Clean up RelNode to Operator handling #13325
- [null-aggr] Add null handling support in mode aggregation #12227
- [partial-upsert] configure early release of _partitionGroupConsumerSemaphore in RealtimeSegmentDataManager #13256
- [spark-connector] Add option to fail read when there are invalid segments #13080
- add Netty arm64 dependencies #12493
- add Netty unit test #12486
- add SegmentContext to collect validDocIds bitmaps for many segments together #12694
- add skipUnavailableServers query option #13387
- add insecure mode when Pinot uses TLS connections #12525
- add instrumentation to json index getMatchingFlattenedDocsMap() #13164
- add jmx to prometheus metric exporting rule for realtimeRowsFiltered #12759
- add metrics for IdealState update #13266
- add some metrics for upsert table preloading #12722
- add some tests on jsonPathString #12954
- add test cases in RequestUtilsTest #12557
- add unit test for JsonAsyncHttpPinotClientTransport #12633
- add unit test for QueryServer #12599
- add unit test for ServerChannels #12616
- add unit test for StringFunctions encodeUrl #13391
- add unit tests for pinot-jdbc-client #13137
- add url assertion to SegmentCompletionProtocolTest #13373
- adjust the llc partition consuming metric reporting logic #12627
- allow passing null http headers object to translateTableName #12764
- allow to set segment when use SegmentProcessorFramework #13341
- auto renew jvm default sslcontext when it's loaded from files #12462
- avoid useless intermediate byte array allocation for VarChunkV4Reader's getStringMV #12978
- aws sdk 2.25.3 #12562
- build-helper-maven-plugin 3.5.0 #12548
- cache ssl contexts and reuse them #12404
- clean up jetbrain nullable annotation #13427
- cleanup: maven no transfer progress #12444
- close JDBC connections #12494
- do not fail on duplicate relaxed vars #13214
- dropwizard metrics 4.2.25 #12600
- dynamic chunk sizing for v4 raw forward index #12945
- enable Netty leak detection #12483
- enable parallel Maven in pinot linter script #12751
- ensure inverse And/OrFilterOperator implementations match the query #13199
- exclude .mvn directory from source assembly #12558
- extend CompactedPinotSegmentRecordReader so that it can skip deleteRecord #13352
- get startTime outside the executor task to avoid flaky time checks #13250
- handle absent segments so that catchup checker doesn't get stuck on them #12883
- handle overflow for MutableOffHeapByteArrayStore buffer starting size #13215
- handle segments not tracked by partition mgr and add skipUpsertView query option #13415
- handle table name translation on missed api resources #12792
- hash4j version upgrade to 0.17.0 #12968
- including the underlying exception in the logging output #13248
- int96 parity with native parquet reader #12496
- jsonExtractIndex support array of default values #12748
- log the log rate limiter rate for dropped broker logs #13041
- make http listener ssl config swappable #12455
- make reflection calls compatible with 0.9.11 #12958
- maven: no transfer progress #12528
- missed to delete the temp dir #12637
- move shouldReplaceOnComparisonTie to base class to be more reusable #13353
- reduce Java enum .values() usage in TimerContext #12579
- reduce logging for SpecialValueTransformer #12970
- reduce regex pattern compilation in Pinot jdbc #13138
- refactor TlsUtils class #12515
- refine when to registerSegment while doing addSegment and replaceSegment for upsert tables for better data consistency #12709
- reformat AdminConsoleIntegrationTest.java #12552
- reformat ClusterTest.java #12531
- release segment mgrs more reliably #13216
- replaced getServer with getServers #12545
- report rebalance job status for the early returns like noops #13281
- require noDictionaryColumns with aggregationConfigs #12464
- share the same table config object #12463
- track segments for snapshotting even if they lost all comparisons #13388
- untrack the segment out of TTL #12449
- update ControllerJobType from enum to string #12518
- update RewriterConstants so that expr min max would not collide with columns start with "parent" #13357
- update access control check error handling to catch throwable and log errors #13209
Bug Fixes
- Use gte(lte) to replace between() which has a bug #12595
- Fix the ConcurrentModificationException for And/Or DocIdSet #12611
- Upgrade RoaringBitmap to 1.0.5 to pick up the fix for RangeBitmap.between() #12604
- bugfix: do not move src ByteBuffer position for LZ4 length prefixed decompress #12539
- Bug Fix createDictionaryForColumn does not take into account inverted index #13048
- fix Cluster Manager error #12632
- fix for quick start Cluster Manager issue #12610
- Adding config for having suffix for client ID for realtime consumer #13168
- Addressed comments and fixed tests from pull request 12389. /uptime and /start-time endpoints working all components #12512
- Bugfix. Added missing paramName #13060
- Bug fix: Do not ignore scheme property #12332
- Bug fix: Handle missing shade config overwrites for Kafka #13437
- BugFix: Fix merge result from more than one server #12778
- Bugfix. Allow tenant rebalance with downtime as true #13246
- Bugfix. Avoid passing null table name input to translation util #12726
- Bugfix. Correct wrong method call from scheduleTask() to scheduleTaskForDatabase() #12791
- Bugfix. Maintain literal data type during function evaluation #12607
- Cleanup: Fix grammar in error message, also improve readability. #13451
- Fix Bug in Handling Equal Comparison Column Values in Upsert #12395
- Fix ColumnMinMaxValueGenerator #12502
- Fix JavaEE related dependencies #13058
- Fix Logging Location for CPU-Based Query Killing #13318
- Fix PulsarUtils to not share buffer #12671
- Fix URI construction so that AddSchema command line tool works when override flag is set to true #13320
- Fix [Type]ArrayList elements() method usage #13354
- Fix a typo when calculating query freshness #12947
- Fix an overflow in PinotDataBuffer.readFrom #13152
- Fix bug in logging in UpsertCompaction task #12419
- Fix bug to return validDocIDsMetadata from all servers #12431
- Fix connection issues if using JDBC and Hikari (#12267) #12411
- Fix controller host / port / protocol CLI option description for admin commands #13237
- Fix environment variables not applied when creating table #12560
- Fix error message for insufficient number of untagged brokers during tenant creation #13234
- Fix few metric rules which were affected by the database prefix handling #13290
- Fix file handle leaks in Pinot Driver (#12263) #12356
- Fix flakiness of ControllerPeriodicTasksIntegrationTest #13337
- Fix issue with startree index metadata loading for columns with '__' in name #12554
- Fix metric rule pattern regex #12856
- Fix pinot-parquet NoClassFound issue #12615
- Fix segment size check in OfflineClusterIntegrationTest #13389
- Fix some resource leak in tests #12794
- Fix the NPE from IS update metrics #13313
- Fix the NPE when metadataTTL is enabled without delete column #13262
- Fix the ServletConfig loading issue with swagger. #13122
- Fix the issue that map flatten shouldn't remove the map field from the record #13243
- Fix the race condition for H3InclusionIndexFilterOperator #12487
- Fix the time segment pruner on TIMESTAMP data type #12789
- Fix time stats in SegmentIndexCreationDriverImpl #13429
- Fixed infer logical type name from avro union schema #13224
- Fixing instance type to resolve #12677 and #12678
- Helm: bug fix for chart rendering issue. #13264
- Try to amend kafka common package with pinot shaded package prefix #13056
- Update getValidDocIdsMetadataFromServer to make call in batches to servers and other bug fixes #13314
- Upgrade com.microsoft.azure:msal4j from 1.3.5 to 1.3.10 for CVE fixing #12580
- [bugfix] Handling null value for kafka client id suffix #13279
- bugfix: fixing jdbc client sql feature not supported exception #12480
- bugfix: re-add support for not text_match #12372
- bugfix: reduce enum array allocation in QueryLogger #12478
- bugfix: use consumerDir during lucene realtime segment conversion #13094
- cleanup: fix apache rat violation #12476
- fix GuavaRateLimiter acquire method #12500
- fix fieldsToRead class not in decoder #13186
- fix flakey test, avoid early finalization #13095
- fix merging null multi value in partial upsert #13031
- fix race condition in ScalingThreadPoolExecutor #13360
- fix shared buffer, tests #12587
- fix(build): update node version to 16 #12924
- fixing CVE critical issues by resolving kerby/jline and wildfly libraries #12566
- fixing pinot-adls high severity CVEs #12571
- fixing swagger setup using localhost as host name #13254
- swagger-ui upgrade to 5.15.0 Fixes #12908
- upgrade jettison version to fix CVE #12567