Release Highlights
This release of Percona Operator for MongoDB includes the following new features and improvements:
Percona Server for MongoDB 8.0 is now the default version
To let you enjoy all the features and improvements that come with the latest major version out of the box, the Operator now deploys clusters with Percona Server for MongoDB 8.0 by default. You can still choose a different version for installation and update. Check the list of Percona certified images for the database versions available for this release. For previous Operator versions, learn how to query the Version Service and retrieve the available images from it.
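If you prefer to stay on a specific version, here is a minimal sketch of pinning the database image in the Custom Resource (the tag below is one of the certified images for this release; substitute the one you need):

spec:
  # pin the database to a certified image instead of the 8.0 default
  image: percona/percona-server-mongodb:7.0.24-13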
PMM3 support
The Operator is natively integrated with PMM 3, enabling you to monitor the health and performance of your Percona Distribution for MongoDB deployment and at the same time enjoy enhanced performance, new features, and improved security that PMM 3 provides.
Note that the Operator supports both PMM2 and PMM3. Which PMM version is used depends on the authentication method you provide in the Operator configuration: PMM2 uses API keys, while PMM3 uses service account tokens. If the Operator configuration contains both authentication methods with non-empty values, PMM3 takes priority.
To use PMM, ensure that the PMM client image is compatible with the PMM Server version. Check Percona certified images for the correct client image.
For instructions on configuring monitoring with PMM, see the documentation.
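As an illustration, here is a minimal sketch of the users Secret carrying both authentication methods. The key names PMM_SERVER_API_KEY and PMM_SERVER_TOKEN are assumptions based on the Operator's Secrets layout, so verify them against the documentation for your version:

apiVersion: v1
kind: Secret
metadata:
  name: my-cluster-name-secrets
type: Opaque
stringData:
  # PMM2 authentication (API key)
  PMM_SERVER_API_KEY: ""
  # PMM3 authentication (service account token);
  # if both values are non-empty, PMM3 takes priority
  PMM_SERVER_TOKEN: "<service-account-token>"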
Hidden nodes support
In addition to arbiters and non-voting nodes, you can now deploy hidden nodes in your Percona Server for MongoDB cluster. These nodes hold a full copy of the data but remain invisible to client applications, which makes them well suited for tasks like backups and reporting: they can access the data without affecting normal traffic.
Hidden nodes are added as voting members and can participate in primary elections. Therefore, the Operator enforces rules to ensure the number of voting members is odd and doesn't exceed seven, which is the maximum allowed number of voting members:
- If the total number of voting members is even, the Operator converts one node to non-voting to maintain an odd number of voters. The node to convert is typically the last Pod in the list.
- If the number of voting members is odd and not more than 7, all nodes participate in elections.
- If the number of voting members exceeds 7, the Operator automatically converts some nodes to non-voting to stay within MongoDB’s limit.
To inspect the current configuration, connect to the cluster with clusterAdmin privileges and run the rs.config().members command.
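For illustration, here is a minimal sketch of enabling hidden nodes in the Custom Resource. The hidden block and its fields are assumptions modeled on the existing arbiter and nonvoting blocks, so check the CR reference for the exact names:

spec:
  replsets:
    - name: rs0
      size: 3          # regular data-bearing members
      hidden:
        enabled: true
        size: 2        # hidden members: 3 + 2 = 5 voters, an odd number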
Support for Google Cloud Client library in PBM
The Operator comes with the latest PBM version 2.11.0, which includes support for the Google Cloud client library and authentication with service account keys.
To use Google Cloud Storage for backups with service account keys, you need to do the following:
- Create a service account key
- Create a Secrets object with this key
- Configure the storage in the Custom Resource
See the Configure Google Cloud Storage documentation for detailed steps.
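For illustration, a minimal sketch of such a storage definition follows. The type: gcs value and the gcs block field names are assumptions about the new native connection type, so confirm them in the linked documentation:

spec:
  backup:
    storages:
      gcs-backups:
        type: gcs
        gcs:
          bucket: my-backup-bucket
          prefix: psmdb
          # Secret holding the service account JSON key
          credentialsSecret: my-gcs-sa-key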
The configuration of Google Cloud Storage with HMAC keys remains unchanged. However, PBM has a known issue with HMAC keys and GCS, reported in PBM-1605: uploading large files (~512 MB and above) can fail when the network is unstable. Such backups may be corrupted or incomplete, yet they are incorrectly treated as valid and pose a risk of restore failures. Therefore, we recommend migrating to the native GCS connection type with service account (JSON) keys after the upgrade.
Improved operational resilience and observability with persistent cluster-level logging for MongoDB Pods
Debugging distributed systems just got easier. The Percona Operator for MongoDB now supports cluster-level logging, ensuring that logs from your mongod instances are stored persistently, even across Pod restarts.
Cluster-level logging is done with Fluent Bit, which runs as a sidecar container within each database Pod.
Currently, logs are collected only for the mongod instances. All other logs are ephemeral, meaning they will not persist after a Pod restart. Logs are stored for 7 days and rotated afterwards.
Learn more about cluster-level logging in the documentation.
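For illustration, a minimal sketch of turning the log collector on in the Custom Resource. The logcollector block mirrors the one used by other Percona Operators and is an assumption here, so verify the exact field names in the CR reference:

spec:
  logcollector:
    enabled: true
    # resources for the Fluent Bit sidecar can be tuned as usual
    resources:
      requests:
        cpu: 100m
        memory: 100M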
Improved backup retention for streamlined management of scheduled backups in cloud storage
A new backup retention configuration gives you more control over how backups are managed in storage and retained in Kubernetes.
With the deleteFromStorage flag, you can disable automatic deletion from AWS S3 or Azure Blob storage and instead rely on native cloud lifecycle policies. This makes backup cleanup more efficient and better aligned with flexible storage strategies.
The legacy keep option is now deprecated and mapped to the new retention block for compatibility. We encourage you to start using the backup.tasks.retention configuration:
spec:
  backup:
    tasks:
      - name: daily-s3-us-west
        enabled: true
        schedule: "0 0 * * *"
        retention:
          count: 3
          type: count
          deleteFromStorage: true
        storageName: s3-us-west
        compressionType: gzip
        compressionLevel: 6
Improved operational efficiency with support for concurrent cluster reconciliation
Reconciliation is a Kubernetes mechanism to keep your cluster in sync with its desired state. Previously, the Operator ran only one reconciliation loop at a time. This sequential processing meant that other clusters managed by the same Operator had to wait for the current reconciliation to complete before receiving updates.
With this release, the Operator supports concurrent reconciliation and can process several clusters simultaneously. You can define the maximum number of concurrent reconciles with an environment variable on the Operator Deployment.
This enhancement significantly improves scalability and responsiveness, especially in multi-cluster environments.
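For example, here is a sketch of setting this variable on the Operator Deployment; the variable name MAX_CONCURRENT_RECONCILES is an assumption, so verify it against the Operator documentation:

spec:
  template:
    spec:
      containers:
        - name: percona-server-mongodb-operator
          env:
            # maximum number of clusters reconciled in parallel
            - name: MAX_CONCURRENT_RECONCILES
              value: "5"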
Added labels to identify the version of the Operator
The Custom Resource Definition (CRD) is compatible with the last three Operator versions. To let you know which Operator version is attached to it, we've added labels to all Custom Resource Definitions. The labels help you identify the current Operator version and decide whether you need to update the CRD.
To view the labels, run:
$ kubectl get crd perconaservermongodbs.psmdb.percona.com --show-labels
View backup size
You can now see the size of each backup when viewing the backup list, whether via the command line or from Everest and other apps integrated with the Operator. This improvement makes it easier to monitor storage usage and manage your backups efficiently.
Delegate PVC resizing to an external autoscaler
You can now configure the Operator to use an external storage autoscaler instead of its own resizing logic. This may be useful for organizations that need centralized, advanced, or cross-application scaling policies.
To use an external autoscaler, set the spec.enableExternalVolumeAutoscaling option to true in the Custom Resource manifest.
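For example:

spec:
  # skip the Operator's own PVC resizing logic and let an
  # external storage autoscaler manage volume growth
  enableExternalVolumeAutoscaling: true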
Deprecation, rename and removal
- The backup.schedule.keep field is deprecated and will be removed in future releases. We recommend using the backup.schedule.retention field instead, as follows:

  schedule:
    - name: "sat-night-backup"
      schedule: "0 0 * * 6"
      retention:
        count: 3
        type: count
        deleteFromStorage: true
      storageName: s3-us-west
- The S3-compatible implementation of Google Cloud Storage (GCS) using HMAC keys is deprecated in the Operator. We encourage you to switch to the native GCS connection type with service account (JSON) keys after the upgrade.
Changelog
New features
- K8SPSMDB-297 - Added cluster-level logging with the Fluent Bit log collector
- K8SPSMDB-1268 - Added support for PMM v3.
- K8SPSMDB-723 - Added the ability to add hidden members to MongoDB replica sets for specialized purposes.
Improvements
- K8SPSMDB-1072 - Added the ability to configure retention policy for scheduled backups
- K8SPSMDB-1216 - Updated the command to describe the mongod instance role to db.hello(), which is the one currently in use.
- K8SPSMDB-1243 - Added the ability to pass PBM restore configuration options to the Operator.
- K8SPSMDB-1261 - Improved the test suite for physical backups to run on every supported platform individually.
- K8SPSMDB-1262 - Improved the test suite for on-demand backups to run on OpenShift
- K8SPSMDB-1272 - The helm upgrade command now displays warnings to clarify when CRDs are not updated.
- K8SPSMDB-1284 - Clearer error messages are now displayed if a filesystem backup deletion fails.
- K8SPSMDB-1285 - CRDs now include labels that make it easy to identify their associated Operator version.
- K8SPSMDB-1304 - Added labels recommended by Kubernetes to the Operator deployment object
- K8SPSMDB-1318 - Added the ability to configure concurrent reconciles to speed up cluster reconciliation in setups where the Operator manages several database clusters.
- K8SPSMDB-1319 - Scheduled database backups now wait for the database to be healthy before starting, preventing unnecessary failures.
- K8SPSMDB-1339 - Added validation for the selected restore time, preventing the point-in-time restore process from starting with an invalid date or time.
- K8SPSMDB-1344, K8SPSMDB-871 - Added the ability to retrieve and store the backup size
- K8SPSMDB-1398 - Added the ability to configure the use of an external autoscaler (Thank you, Terry, for the contribution)
- K8SPSMDB-1412 - Added support for Google Cloud Storage with authentication via service account keys.
Fixed bugs
- K8SPSMDB-1154 - MongoDB clusters using the inMemory storage engine now deploy correctly (Thank you, user KOS, for reporting this issue).
- K8SPSMDB-1292 - Fixed an issue where physical restores failed when a TLS configuration was defined; the Operator now uses it to construct the correct MongoDB connection string URL.
- K8SPSMDB-1297 - Exposed the data directory for the pmm-client sidecar container to enable it to gather required metrics.
- K8SPSMDB-1308 - Improved PBM restore logging to store logs for the latest restore in the /data/db/pbm-restore-logs directory.
- K8SPSMDB-1336 - Logical backups can now be restored to a new cluster without encountering Time monotonicity violation errors or service restarts.
- K8SPSMDB-1371 - Physical point-in-time recovery using the latest type no longer crashes but gracefully fails the restore process when oplog data is unavailable.
- K8SPSMDB-1400 - Resolved an issue that caused physical restores to fail on AKS and EKS environments.
- K8SPSMDB-1425 - Restoring a MongoDB cluster with point-in-time recovery now succeeds even when source and target storage prefixes differ.
- K8SPSMDB-1480 - Fixed an issue that caused cluster errors when scaling replica sets resulted in an invalid number of voting members.
Documentation improvements
- The multi-cluster and multi-region deployment section has been improved and expanded with information about multi-cluster deployment, its value, and how it works. It provides improved guidance on multi-cluster services, a step-by-step tutorial for enabling multi-cluster deployments on GKE, and revised instructions for deploying and interconnecting sites for replication. The docs also walk you through planned switchover and controlled failover procedures in disaster scenarios.
- Updated the Scale Percona Server for MongoDB on Kubernetes topic with information about the pvc-resize-in-progress annotation and how it works.
- Updated the Configure backup storage topic with the Google Cloud Storage configuration.
- Configuration for config server split horizons is now accurately documented, simplifying multi-cluster deployments and external DNS integration.
- The Data-at-rest encryption topic has been updated with the correct steps for using HashiCorp Vault.
- New documentation is available detailing important considerations for upgrading your Kubernetes cluster before updating any Operator.
Supported software
The Operator was developed and tested with the following software:
- Percona Server for MongoDB 6.0.25-20, 7.0.24-13, and 8.0.12-4
- Percona Backup for MongoDB 2.11.0
- PMM Client 3.4.1
- LogCollector based on fluent-bit 4.0.1
Other options may also work but have not been tested.
Supported platforms
Percona Operators are designed for compatibility with all CNCF-certified Kubernetes distributions. Our release process includes targeted testing and validation on major cloud provider platforms and OpenShift, as detailed below for Operator version {{release}}:
- Google Kubernetes Engine (GKE) 1.31-1.33
- Amazon Elastic Kubernetes Service (EKS) 1.31-1.34
- OpenShift Container Platform 4.16 - 4.19
- Azure Kubernetes Service (AKS) 1.31-1.33
- Minikube 1.37.0 based on Kubernetes 1.34.0
This list only includes the platforms that the Percona Operators are specifically tested on as part of the release process. Other Kubernetes flavors and versions depend on the backward compatibility offered by Kubernetes itself.