Features
- Partitions of topics enabled for remote storage now follow the topic's retention policy specified via retention.bytes and retention.ms. If those are unspecified, cluster level defaults are used. The migration of existing topics is done in such a way that they preserve the previous behavior (i.e. data will not be deleted from cloud storage for topics created before v22.3). However, note that for new topics retention is applied and data can be expired from cloud storage automatically. by @VladLazar in #6833
- Tiered Storage will now clean up objects in S3 when topics are deleted. This may be avoided by disabling Tiered Storage on the topic before deleting it. by @jcsp in #6683
- Introduce retention.local-target.ms and retention.local-target.bytes topic configuration options. They control the retention policy of each partition within the topic and are only relevant for topics with remote write enabled. by @VladLazar in #6613
- Adds HTTP Basic Auth to all Schema Registry endpoints by @BenPope in #6639
- Adds HTTP Basic Auth to all Pandaproxy endpoints and uses the kafka client cache on /brokers by @NyaliaLui in #6452
- Support HTTP basic authentication from Redpanda console to schema registry by @pvsune in #7144
- Allow authentication method per kafka listener. by @RafalKorepta in #6940
- Enables transactions feature (enable_transactions=true) by default on the service side. by @bharathv in #6770
- #4824 #4826 Transactions are now supported on compacted topics. by @bharathv in #6664
- Schema Registry and REST Proxy can now use ephemeral credentials to authenticate with Redpanda. by @BenPope in #6931
- Users are no longer required to supply a node ID in each node's node configuration file. All nodes must be upgraded before using this feature. Node IDs on existing nodes will be preserved when using this. by @andrwng in #6659
- #2760 Add defaulting webhook for Console by @pvsune in #6282
- #2760 Introduce SecretKeyRef type to reference Secret objects by @pvsune in #6282
- #2760 Introduces NamespaceNameRef type instead of using the full glory of corev1.ObjectReference by @pvsune in #6282
- #333 Configurations of all nodes across the cluster can be identical by @dlex in #6744
- #333 Seed-driven cluster bootstrap mode. Disable empty_seed_starts_cluster to use it. This allows the set of servers listed as seeds to start a cluster together without a root node. All seed servers must be available, with identical node configuration, for a cluster to be created. Afterwards, none of the seed servers will try to form another new cluster if their local storage is wiped, unless all seed nodes are wiped at the same time. The cluster now gets a cluster UUID, which is reflected in a new controller log message and stored in the kvstore. by @dlex in #6744
- #333 Cluster is bootstrapped with all its seed servers by @dlex in #6744
- #333 Wiped out seed cluster members will not start their own new cluster by @dlex in #6744
- #6355 Added rack awareness constraint repair in the continuous partition balancing mode. by @ztlpn in #6845
- Kubernetes Operator: it's now possible to specify a TLS issuer for Pandaproxy API by @nicolaferraro in #6637
- Kubernetes Operator: external ports can be explicitly specified for the admin API, Pandaproxy and Schema Registry by @nicolaferraro in #6564
- Support AlterConfig/IncrementalAlterConfig request for replication.factor property. by @VadimPlh in #6460
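
The retention fallback described in the first Features item (topic-level retention.bytes and retention.ms, with cluster-level defaults used when those are unspecified) can be sketched as below. This is only an illustration of the precedence rule, not Redpanda's implementation; the function name and the use of None for "unspecified" are assumptions made for the example.

```python
from typing import Optional


def effective_retention(topic_value: Optional[int],
                        cluster_default: Optional[int]) -> Optional[int]:
    """Pick the retention limit that applies to a partition.

    Hypothetical helper: a topic-level value (retention.bytes or
    retention.ms) wins when it is set; otherwise the cluster-level
    default is used. None stands for "unspecified" in this sketch.
    """
    if topic_value is not None:
        return topic_value
    return cluster_default


# Topic sets retention.ms explicitly, so the cluster default is ignored.
topic_override = effective_retention(3_600_000, 604_800_000)

# Topic leaves retention.ms unspecified, so the cluster default applies.
fallback = effective_retention(None, 604_800_000)
```

Here topic_override takes the topic's own value and fallback takes the cluster default, mirroring the order of precedence stated in the note.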
Bug Fixes
- #5163 Fix compaction for group_*_tx log records by @VadimPlh in #6086
- Fix possible shadow indexing manifest corruption under memory pressure. by @ztlpn in #6507
- Time queries are more reliable on topics using client-set timestamps via the CreateTime mode by @jcsp in #6606
- Improve robustness of Schema Registry and HTTP Proxy under std::errc::broken_pipe. by @BenPope in #6687
- #6561 It's now possible to set the log level for kafka/client and r/heartbeat through the admin API. by @BenPope in #6688
- #6508 Returning retryable kafka error code in raft replication failure events that require the client to retry by @graphcareful in #6712
- #6827 Fix rack aware placement after node rack id changes. by @ztlpn in #6900
- Fix an issue where state machine snapshots could degrade performance under certain workloads (#6854) by @jcsp in #6932
- Fixes license setting in Redpanda cluster by @pvsune in #7110
- Fixed a bug that prevented redpanda from uploading the last batch in the log to cloud storage if timeboxed uploads were enabled and the batch contained exactly one message. by @ztlpn in #7096
- Redpanda shuts down more promptly when client reads from S3 are in progress at shutdown time. by @jcsp in #7181
- Fix init_producer_id timeouts by @rystsov in #6312
- Fix a bug that configures all clusters as development clusters by @joejulian in #7107
- Fix bug in remote read/write enablement. Topic level overrides are now respected in all cases. by @VladLazar in #6663
- Fix for flex request parsing failure when request header client id is empty or null by @graphcareful in #6585
- Fix an incorrect assertion in vote_stm that in some situations may lead to a Redpanda crash by @mmaslankaprv in #6546
- #6018 Fix consistency violation caused by split-brain of the txn coordinator by @rystsov in #6019
- #6063 Fix a need for retrying truncation of compacted topic partition when it failed by @mmaslankaprv in #6071
- #6795 #5507 Enable cluster config editing when IAM roles are used by @abhijat in #6864
Improvements
- New configuration property storage_strict_data_init. When this property is enabled, a user must manually add an empty magic file called .redpanda_data_dir to Redpanda's data directory for Redpanda to start. by @ballard26 in #6786
- New metric for partition movement available bandwidth by @ZeDRoman in #6110
- #4871 New metrics for partition movements amount: redpanda_cluster_partition_moving_to_node, redpanda_cluster_partition_moving_from_node, redpanda_cluster_partition_node_cancelled_movements by @ZeDRoman in #5749
- Improve moving partitions at scale by @mmaslankaprv in #6905
- Add fields for RedpandaCloud login provider under .spec.login.redpandaCloud by @pvsune in #6359
- #3278 Support safe epoch incrementing for idempotent/transactional producer in retries cases by @VadimPlh in #5362
- Tunable cluster configuration properties are added to set bounds on the segment.bytes topic property. If log_segment_size_min and/or log_segment_size_max are set, then any segment.bytes outside these bounds will be silently clamped to the permitted range. This prevents poorly-chosen configurations from inducing the cluster to create very large numbers of small segment files, or extremely large segment files. by @jcsp in #6492
- A new tunable cluster property log_segment_size_jitter_percent is added, to enable greater determinism in test/benchmark environments by disabling jitter. The default 5% jitter is the same as in previous versions. by @jcsp in #6515
- Kubernetes Operator: Install Console from the operator that connects to Redpanda Kafka API via mTLS. by @pvsune in #6280
- rpk topic consume now supports %a to print attributes; see rpk topic consume --help for more details by @twmb in #6894
- rpk topic consume now has --print-control-records to opt into printing control records (for advanced use cases) by @twmb in #6894
- rpk cloud has a new byoc command, which manages the byoc plugin directly and makes it easier to use by @twmb in #7102
- rpk redpanda admin config log-level has been updated for v22.3 loggers by @twmb in #7197
- rpk topic produce now has --allow-auto-topic-creation, which can create non-existent topics if the cluster has auto_create_topics_enabled set to true by @twmb in #7197
- #6844 Introduced a configurable limit for the number of segments pending deletion from the cloud. This limit is controlled by the cloud_storage_max_segments_pending_deletion cluster config. by @VladLazar in #7191
- Support RedpandaAdmin in the Console CR by @pvsune in #6667
- Adds license field in Cluster spec by @pvsune in #6863
- Improved admin API error handling to reduce 500 errors on internal RPC failures. by @mmaslankaprv in #5916
- Schema Registry: Disable compression on the _schemas topic to better support manually creating schemas. by @BenPope in #6156
- Simplified schema registry deployment: the schema registry now always runs as part of a redpanda cluster. Running it separately is no longer supported. by @jcsp in #4324
- Simplified HTTP proxy deployment: the HTTP proxy now always runs as part of a redpanda cluster. Running it separately is no longer supported. by @jcsp in #4324
- Internal topics are created with an appropriate replication factor more reliably on clusters with at least 3 nodes, whereas previously in some circumstances they could exist in a single-replica state for a period of time before being upgraded to a replicated state. by @jcsp in #6299
- Configuration property id_allocator_replication is deprecated in favor of internal_topic_replication_factor by @jcsp in #6299
- Configuration property transaction_coordinator_replication is deprecated in favor of internal_topic_replication_factor by @jcsp in #6299
- The Schema Registry topic's default replication factor is now controlled by internal_topic_replication_factor rather than default_topic_replication. by @jcsp in #6299
- #2760 Use Redpanda wildcard certificate for Ingress since Console is exposed through https://console. by @pvsune in #6282
- #2760 Rename ClusterKeyRef to ClusterRef by @pvsune in #6282
- Kubernetes Operator: added option to customize the external advertised address of Redpanda nodes by @nicolaferraro in #6304
- The cluster configuration properties raft_heartbeat_interval_ms and raft_heartbeat_timeout_ms may now be modified without restarting redpanda. by @jcsp in #6426
- #5154 During shutdown, spurious "offset_monitor::wait_aborted" log error messages are no longer emitted. by @jcsp in #6419
- #5460 Replicas of __consumer_offsets partitions are distributed evenly across brokers, resulting in better (although not perfect) distribution of consumer group coordinators. by @dlex in #6251
- Kubernetes Operator: added options to configure generated Ingress resources for Console and Pandaproxy by @nicolaferraro in #6456
- Suppress logging for harmless 404 responses from S3 while probing for transaction range objects by @jcsp in #6526
- Logging verbosity is reduced when S3 backends unexpectedly close connections by @jcsp in #6524
- Console deletion is fully independent of Cluster by @pvsune in #6474
- Redpanda now cleans up empty directories in the tiered storage cache directory on startup, as well as after removing segments. by @jcsp in #6533
- RedpandaCloud AllowedOrigins can be set as a list by @pvsune in #6679
- Improved shadow indexing memory efficiency by @Lazin in #6558
- Incorporates the kafka client cache on /consumers Pandaproxy endpoints. The cache supports multiple authenticated connections with HTTP Basic Auth. by @NyaliaLui in #6693
- Incorporate the kafka client cache on /topics Pandaproxy endpoints. The cache supports multiple authenticated connections with HTTP Basic Auth. by @NyaliaLui in #6618
- Improve robustness of Schema Registry and HTTP Proxy under std::errc::timed_out. by @BenPope in #6885
- rpk now allows setting hostnames with dashes or numbers in the final domain segment by @twmb in #6894
- rpk now seeks to end offsets if you seek to a future timestamp, rather than -1 by @twmb in #6894
- rpk now supports using basic auth while creating a new acl user (--password for basic auth, --new-password or -p for the new user's password) by @twmb in #6894
- rpk now defaults to SCRAM-SHA-256 if SASL is specified, and now rejects invalid SASL mechanisms by @twmb in #7197
- #6495 rpk redpanda config bootstrap no longer changes configuration settings that have already been manually modified (e.g., redpanda.kafka_api[0].port) by @twmb in #7026
- The properties cloud_storage_enable_remote_[read|write] are now applied to topics at creation time, and if they subsequently change, then existing topics' properties do not change. To modify the tiered storage mode of existing topics, you may set the redpanda.remote.[read|write] properties on the topic. by @jcsp in #6950
- Support retrieving credentials from kube2iam. by @missingcharacter in #7030
- #6892 #7025 #7016 faster recovery from rolling restart by @mmaslankaprv in #7017
- Recreate Console's referenced ConfigMap if manually deleted. by @pvsune in #7077
- #6111 #6023 Improved stability under random read workloads to tiered storage topics. by @jcsp in #7042
- Improved stability under read workloads touching many tiered storage segments in quick succession by @jcsp in #7082
- #6111 #6023 A new cluster configuration property cloud_storage_max_readers_per_shard is added, which controls the maximum number of cloud storage reader objects that may exist per CPU core. This may be tuned downward to reduce memory consumption at the possible cost of read throughput. The default setting is one per partition (i.e. the value of topic_partitions_per_shard is used). by @jcsp in #7042
- A new cluster configuration property cloud_storage_max_segments_per_shard is added, which controls the maximum number of segments per CPU core that may be promoted into a readable state from cloud storage. This may be tuned downward to reduce memory consumption at the possible cost of read throughput. The default setting is two per partition (i.e. the value of topic_partitions_per_shard multiplied by 2 is used). by @jcsp in #7082
- Two new metrics are added to the /public_metrics endpoint: redpanda_cloud_storage_active_segments and redpanda_cloud_storage_readers. by @jcsp in #7082
- Two new metrics are introduced to help track the lifetime of segments uploaded to the cloud by @VladLazar in #7133
  - redpanda_cloud_storage_segments:
    - Description: Total number of accounted segments in the cloud for the topic
    - Labels: redpanda_namespace, redpanda_topic
    - Type: gauge
  - redpanda_cloud_storage_segments_pending_deletion:
    - Description: Total number of segments awaiting deletion from the cloud for the topic
    - Labels: redpanda_namespace, redpanda_topic
    - Type: gauge
- pandaproxy: consumer fetch: More gracefully handle partition movement by @BenPope in #7210
- pandaproxy: Shut down consumers more gracefully during shutdown. by @BenPope in #7210
- Set Kafka SASL password as environment variable by @pvsune in #7112
- Improve transaction metadata handling in Shadow Indexing by @Lazin in #6001
- Improves logging in tx subsystem by moving everything under one logger and adds additional partition context. by @bharathv in #6556
- Print the timestamp along with the version info at startup by @daisukebe in #6321
- #5324 Recover from failures quickly by cleaning up resources. by @bharathv in #5730
- #6214 Transactions can now span leadership changes of transaction coordinator. by @bharathv in #6252
- #6795 #5507 Extend validation to make sure secrets do not get supplied when they are not used by Redpanda. by @abhijat in #6864
- #7119 The /v1/features/license (GET) response now includes the checksum of the loaded license by @graphcareful in #7130
- #7119 The /v1/features/license (PUT) call is now fully idempotent by @graphcareful in #7130
- Added new property kafka_request_max_bytes to control the maximum size of a request processed by server. by @mmaslankaprv in #6283
- Controller log limiting mechanism. by @ZeDRoman in #6641
- The rate (requests per second) of requests that create entries in the controller log can now be limited. by @ZeDRoman in #6641
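
The clamping of segment.bytes against log_segment_size_min and log_segment_size_max described in the Improvements list can be pictured with the short sketch below. It illustrates the rule as stated in the note rather than the broker's actual code; the function name and the treatment of an unset bound as None are assumptions.

```python
from typing import Optional


def clamp_segment_bytes(requested: int,
                        size_min: Optional[int] = None,
                        size_max: Optional[int] = None) -> int:
    """Clamp a topic's segment.bytes into the permitted range.

    Hypothetical helper: size_min and size_max stand in for the
    log_segment_size_min and log_segment_size_max cluster properties.
    A bound left as None is treated as "not set" and is not enforced.
    """
    if size_min is not None and requested < size_min:
        return size_min
    if size_max is not None and requested > size_max:
        return size_max
    return requested


MiB = 1024 * 1024

# A very small segment size is raised to the minimum.
too_small = clamp_segment_bytes(4096, size_min=1 * MiB, size_max=1024 * MiB)

# A value inside the bounds is left untouched.
in_range = clamp_segment_bytes(128 * MiB, size_min=1 * MiB, size_max=1024 * MiB)
```

Because the clamping is silent, a topic may report a segment.bytes value that differs from the size actually used, so it can help to check the cluster bounds when segment sizes look unexpected.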
Full Changelog: v22.2.7...v22.3.1