Bug fixes
- Fix a consistency issue with transactions. by @rystsov in #3247
- #1275 Fix kafka server send group topic partition offset metric to prometheus. by @ZeDRoman in #3440
- #3263 k8s: fix bug reconciling clusters with fixed nodeport. by @alenkacz in #3281
- #3383 k8s: fix bug in external port binding when port is specified in configuration. by @alenkacz in #3387
- #3211 Schema Registry: Fix a crash during compatibility checks. by @BenPope in #3222
- #3305 Fix archival upload data corruption. by @Lazin in #3309
- Fix incorrect handling of failed snapshot delivery that could cause the snapshot to be redelivered in a tight loop. by @mmaslankaprv in #3286
- #2142 #2568 Better handling of consumer group related errors. by @mmaslankaprv in #3286
- #3323 Fix a rare crash that could happen when log segment eviction ran concurrently with a fetch near the start of the log. by @ztlpn in #3541
- #3324 #3334 Fix disk space issues. by @Lazin in #3373
- #3378 cloud_storage: ensure readers only destroyed in remote_partition. by @jcsp in #3448
- #3334 cloud_storage: fix file descriptor leak. by @jcsp in #3437
- archival: fix bug in logging on success. by @LenaAn in #3512
- #3494 k8s: fix how schema reg node cert is mounted. by @simon0191 in #3568
- #3528 Fix possible deadlock of raft groups. by @mmaslankaprv in #3566
- Fix a regression that caused an assertion to be triggered during Fetch request handling. by @mmaslankaprv in #3597
- Shadow Indexing (Tech Preview): Kubernetes operator: All Redpanda nodes created through the Kubernetes operator will have a default cloud storage maximum upload interval (cloud_storage_segment_max_upload_interval_sec) of 30 minutes. by @RafalKorepta in #3241
- #3272 Shadow Indexing (Tech Preview): Fix data loss in shadow indexing archived data that could occur after quick partition leadership transfer back and forth between two nodes. Compatibility note: previous redpanda versions won't be able to read shadow indexing data archived by newer versions. by @ztlpn in #3434
Features
- schema_registry: Support GET /subjects/{subject}/versions/{version}/referencedBy. by @BenPope in #3402
- Cluster level configuration parameter cloud_storage_enable_remote_read can be used to enable shadow indexing fetch for all topics (disabled by default). Cluster level configuration parameter cloud_storage_enable_remote_write can be used to enable archival uploads for all topics (disabled by default). by @Lazin in #3248
Improvements
- #3269 redpanda/cluster: improve logging for leader_balancer. by @BenPope in #3284
- #3304 Gracefully handle the case where a segment in S3 is truncated. Previously the client would be stuck in a cycle trying to proceed; now the client gets an error. by @LenaAn in #3343
- #3428 Improved stability when doing large kafka fetch requests under low memory conditions. by @jcsp in #3472
- Consuming from a very large number of partitions at once is now subject to a per-request size limit, reducing the risk of exhausting memory if a client specifies a partition count and per-partition size limit that together exceed available RAM. The per-fetch size limit is set via the kafka_max_bytes_per_fetch configuration property (default 64MiB). by @jcsp in #3438
- #3412 Memory utilization on systems with a large number of partitions can now be tuned using the configuration properties storage_read_buffer_size (default 128KiB) and storage_read_readahead_count (default 10). These properties may be decreased to more conservative values such as buffer_size=16KiB, readahead_count=1 to reduce the per-partition memory overhead and improve stability when the number of partitions is large (e.g. more than 10000). by @jcsp in #3436
- #3269 redpanda/cluster: improve logging for leader_balancer. by @Lazin in #3364
- #3333 Fine-tune logging. by @LenaAn in #3357
- #3322 cloud_storage: suppress warning on successful topic creation. by @LenaAn in #3484
- #3376 archival: make upload loop more resilient. by @Lazin in #3441
- cloud_storage: limit remote partition concurrency. by @Lazin in #3510
- #3310 archival: warnings generated when topic leadership changes. by @LenaAn in #3379
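
To put the memory-related settings above in perspective, here is a rough back-of-the-envelope sketch in Python. It is not Redpanda code: the function names are hypothetical, and the model rests on two simplifying assumptions, namely that read buffer memory scales as partitions × storage_read_buffer_size × storage_read_readahead_count (one active reader per partition), and that kafka_max_bytes_per_fetch acts as a simple cap on the total bytes a single fetch may request. Actual memory use depends on workload and internals.

```python
KIB = 1024
MIB = 1024 * KIB


def read_buffer_bytes(partitions, buffer_size, readahead_count):
    """Rough estimate of read buffer memory.

    Assumption (not from the release notes): one active reader per
    partition, each holding `readahead_count` buffers of `buffer_size`.
    """
    return partitions * buffer_size * readahead_count


def capped_fetch_bytes(partition_count, per_partition_max_bytes,
                       max_bytes_per_fetch=64 * MIB):
    """Bytes a fetch could need, modelled as capped by the per-fetch limit.

    Assumption: kafka_max_bytes_per_fetch behaves like min(requested, limit).
    """
    requested = partition_count * per_partition_max_bytes
    return min(requested, max_bytes_per_fetch)


# 10000 partitions with the defaults (128KiB buffers, readahead of 10)
defaults = read_buffer_bytes(10_000, 128 * KIB, 10)
# 10000 partitions with the conservative values (16KiB buffers, readahead of 1)
conservative = read_buffer_bytes(10_000, 16 * KIB, 1)

print(f"defaults:     {defaults / MIB:.2f} MiB")      # 12500.00 MiB
print(f"conservative: {conservative / MIB:.2f} MiB")  # 156.25 MiB

# A client asking for 1MiB from each of 5000 partitions would request
# 5000MiB in total; under the default limit this model caps it at 64MiB.
print(f"fetch: {capped_fetch_bytes(5_000, 1 * MIB) / MIB:.0f} MiB")  # 64 MiB
```

Under these assumptions, the conservative values cut the estimated read buffer footprint for 10000 partitions from roughly 12GiB to under 160MiB, which is why lowering them can help on partition-dense nodes.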