longhorn/longhorn v1.1.1 on GitHub

v1.1.1 released! 🎆

This release includes many new features, enhancements, and bug fixes across the areas listed below. Most of the major items were driven by community users to improve Longhorn's functionality and stability. Thanks for all the contributions!

Automatic Longhorn Engine Upgrade
- Before v1.1.1, a live engine image upgrade required users to manually upgrade the engine image of each healthy volume. v1.1.1 supports automatic engine upgrades to avoid needless user intervention. [doc]
Virtual Machine Volume Support
- Virtual machine volume creation by Longhorn backing image [doc]
- Backing image management native support without any 3rd-party image repository [LEP]
- Virtual machine live migration support by Longhorn migratable RWO volume [LEP]
- Ability to serve as a volume storage service for virtualization orchestration solutions like Harvester
Granular Resource Reservation for Longhorn Engine/Replica
- Before v1.1.1, users could only configure a single global setting for the CPU resource reservation of each engine/replica. v1.1.1 adds granular node-specific settings for each engine/replica. [doc]
AWS IAM Role Support to Generate Temporary AWS credentials
- Besides using IAM user credentials, v1.1.1 allows users to configure IAM Assume Role so that the AWS Security Token Service generates temporary credentials for backup and restore operations. [doc]
Longhorn Frontend WebSocket Performance Enhancement to Improve User Experience
- Improves UI performance by adopting event-driven data updates, which reduces the needless network traffic caused by frequently polling the backend for updated information.
Dependency Installation Manifests for Node Prerequisites
- Provides installation manifests for prerequisites (like iSCSI and NFS client libraries). They are not integrated into the Longhorn installation process; they are optional, but they help users prepare their environment for Longhorn.

Installation

Longhorn supports three installation methods: Rancher catalog, kubectl, and Helm. Follow the installation instructions here.

Upgrade

Follow the upgrade instructions here.

Automatic Longhorn Engine Upgrade supported

Besides manual engine upgrades, Longhorn can now upgrade engines automatically, subject to configurable settings and conditions.

Concurrent Automatic Engine Upgrade Per Node Limit Setting

Controls the maximum number of engines per node that are allowed to automatically upgrade to the default engine image at the same time. The default value is 0, which means Longhorn does not automatically upgrade volumes’ engines to the default version. If you enable it, we recommend setting the value to 3 to leave some room for error without overwhelming the system with too many failed upgrades. [doc]
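The effect of a per-node limit can be sketched as follows. This is a minimal illustration, not Longhorn's actual scheduling code: the function name `pick_engines_to_upgrade`, the volume and node names, and the data shapes are all hypothetical.

```python
from collections import defaultdict


def pick_engines_to_upgrade(candidates, upgrading_per_node, limit):
    """Return the volumes allowed to start an automatic upgrade now.

    candidates: list of (volume_name, node_name) pairs awaiting upgrade.
    upgrading_per_node: dict of node_name -> upgrades already in progress.
    limit: the per-node limit; 0 (the default) disables automatic upgrades.
    """
    if limit <= 0:
        return []
    # Copy the in-progress counts so the caller's dict is not modified.
    in_flight = defaultdict(int, upgrading_per_node)
    selected = []
    for volume, node in candidates:
        if in_flight[node] < limit:
            in_flight[node] += 1
            selected.append(volume)
    return selected


# Hypothetical example: node-1 already has two upgrades running.
candidates = [
    ("vol-a", "node-1"),
    ("vol-b", "node-1"),
    ("vol-c", "node-1"),
    ("vol-d", "node-2"),
]
print(pick_engines_to_upgrade(candidates, {"node-1": 2}, limit=3))
# prints ['vol-a', 'vol-d']
```

With a limit of 3, node-1 has room for only one more upgrade, so `vol-b` and `vol-c` wait for a later pass, while `vol-d` on node-2 starts immediately.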

Conditions of Automatic Engine Upgrade

If the value of Concurrent Automatic Engine Upgrade Per Node Limit Setting is greater than 0, Longhorn automatically upgrades engines in the following cases:

Healthy attached volumes - Live upgrade
Detached volumes - Offline upgrade

However, Longhorn doesn’t automatically upgrade disaster recovery volumes to the new default engine image, because doing so would trigger a full restoration for those volumes. A full restoration might affect the performance of other running Longhorn volumes in the system, so users are left to decide when to upgrade these engines manually. For details, see [doc]
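The conditions above can be summarized as a small decision function. This is a sketch of the rules as stated in these notes, not Longhorn's implementation: the function name `auto_upgrade_mode` and the string values for state and robustness are assumptions chosen for illustration.

```python
from typing import Optional


def auto_upgrade_mode(
    limit: int, state: str, robustness: str, is_dr_volume: bool
) -> Optional[str]:
    """Return "live", "offline", or None (no automatic upgrade)."""
    # A limit of 0 (the default) turns automatic upgrades off entirely.
    if limit <= 0:
        return None
    # Disaster recovery volumes are skipped to avoid a full restoration.
    if is_dr_volume:
        return None
    # Healthy attached volumes are upgraded live.
    if state == "attached" and robustness == "healthy":
        return "live"
    # Detached volumes are upgraded offline.
    if state == "detached":
        return "offline"
    # Anything else (for example an attached but degraded volume) is left alone.
    return None


print(auto_upgrade_mode(3, "attached", "healthy", False))   # prints live
print(auto_upgrade_mode(3, "detached", "unknown", False))   # prints offline
print(auto_upgrade_mode(3, "attached", "healthy", True))    # prints None
```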

Engine Manager CPU Request and Replica Manager CPU Request introduced, but Guaranteed Engine CPU deprecated

This release introduces the new node-specific Engine Manager CPU Request and Replica Manager CPU Request reservations and deprecates the existing Guaranteed Engine CPU setting. To avoid crashing existing volumes after the Longhorn Manager upgrade, Longhorn automatically sets Engine Manager CPU Request and Replica Manager CPU Request on each node during the upgrade, based on the value of the deprecated Guaranteed Engine CPU setting. Because these node-specific values are set, the new global instance manager CPU settings Guaranteed Engine Manager CPU and Guaranteed Replica Manager CPU won’t take effect.

For details, see [doc: post-upgrade], [doc: guaranteed-engine-manager-cpu], and [doc: guaranteed-replica-manager-cpu]
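The precedence described above, where a node-specific request overrides the global setting, can be sketched as below. The units are assumptions not stated in these notes: the deprecated setting is treated as CPU cores, the node-specific requests as millicores, and the global settings as a percentage of a node's allocatable CPU. The function names are hypothetical; check the linked docs for the actual semantics.

```python
def migrate_guaranteed_engine_cpu(guaranteed_engine_cpu_cores: float) -> int:
    """Convert the deprecated value (assumed to be in cores) to millicores."""
    return int(round(guaranteed_engine_cpu_cores * 1000))


def effective_cpu_request(
    node_request_milli: int, global_percent: int, allocatable_milli: int
) -> int:
    """Return the CPU request (millicores) for one instance manager pod.

    A non-zero node-specific request wins; otherwise fall back to the
    global percentage of the node's allocatable CPU (assumed semantics).
    """
    if node_request_milli > 0:
        return node_request_milli
    return allocatable_milli * global_percent // 100


# Hypothetical upgraded node: the deprecated setting was 0.25 cores.
node_request = migrate_guaranteed_engine_cpu(0.25)
print(node_request)                                   # prints 250
print(effective_cpu_request(node_request, 12, 4000))  # prints 250
# Hypothetical fresh node with no node-specific request set.
print(effective_cpu_request(0, 12, 4000))             # prints 480
```

Under these assumptions, an upgraded node keeps its previous reservation until the node-specific request is cleared, at which point the global percentage applies.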

Highlights

[FEATURE] Use AWS IAM Role for Backup / Restore (1526) - @jenting
[FEATURE] Help user to install dependencies when installing Longhorn (2033) - @cclhsu
[FEATURE] Live migration support for KubeVirt (2127) - @joshimoo @khushboo-rancher
[FEATURE] Upgrade longhorn engine automatically (2152) - @meldafrawi @PhanLe1010 @anupama2501 @khushboo-rancher
[FEATURE] Use fsfreeze instead of sync before snapshot (2187) - @joshimoo @khushboo-rancher
[FEATURE] Enhanced resource reservation mechanism (2207) - @shuo-wu @khushboo-rancher
[FEATURE] Enhance the backing image feature (2295) - @shuo-wu @khushboo-rancher
[BUG] Longhorn UI websocket traffic is way to high (2372) - @c3y1huang @smallteeths @khushboo-rancher

Enhancements

[QUESTION] PVCs takes a long time (or fails) to be attached due to fsgroup change (1221) -
[FEATURE] PVC or Workload name visible in backup list section of Dashboard (1539) - @smallteeths @jenting @khushboo-rancher
[ENHANCEMENT] refactor replica-controller to use structured logging (1731) - @c3y1huang @khushboo-rancher
[FEATURE] Improve documentation for installing open-iscsi (1741) - @cclhsu @anupama2501 @khushboo-rancher
[BUG] sparse-tools: ssync should explicitly asking for directIO or not, rather than determine is using filesize (1943) - @cclhsu @khushboo-rancher
[BUG] S3 Backup fails with "failed to put object" (1967) - @PhanLe1010 @khushboo-rancher
[FEATURE] Enable backing image feature in Longhorn (2006) - @shuo-wu
[FEATURE] Option of creating RWX volume from the Longhorn UI (2048) - @joshimoo @smallteeths @khushboo-rancher
[BUG] The 'Attached to' column needs to incorporate all pods history for RWX volume (2056) -
[FEATURE] Handle NFS client info for the RWX feature (2058) - @cclhsu @khushboo-rancher
[FEATURE] The detach message should include the RWX volume behavior (2059) - @smallteeths @khushboo-rancher
[TASK] switch the image base for Longhorn share manager to ubuntu (2111) - @c3y1huang @khushboo-rancher
[TASK] Update NFS examples (2114) - @joshimoo
[BUG] annotation last-applied-tolerations magically added in 1.1.0 (2120) - @PhanLe1010
[FEATURE] change row size of volumes page in web ui (2142) - @smallteeths
[FEATURE] Add new setting descriptions in the doc (2153) - @cclhsu @khushboo-rancher
[FEATURE] Set tolerations and node selectors for dynamically provisioned Longhorn workloads (2199) - @PhanLe1010
[FEATURE] UI: Only allow to upgrade to the default engine image when the setting automatically upgrade volumes' engine to default version is enable (2205) - @smallteeths @khushboo-rancher
[QUESTION] unable to mount NFS backup (2242) - @cclhsu @khushboo-rancher
[FEATURE] Add support for ingressClassName in helm chart (2257) - @meldafrawi
[FEATURE] Upgrade engine image other options to be greyed out while the upgrade is going on (2260) - @smallteeths @khushboo-rancher
[BUG] Show indication on the UI that the automatic upgrade is in progress (2281) - @smallteeths @khushboo-rancher
[FEATURE] Add annotations support for resources in helm chart (2387) - @jenting @khushboo-rancher
[FEATURE] Add tabular display of the visual snapshots, backups and volume head summary (2400) - @smallteeths
[FEATURE] rwx shared directory should be rw accessible when running as non-root user (2418) - @joshimoo @PhanLe1010
[TASK] Add the way of generating support bundle into the doc (2458) - @c3y1huang @khushboo-rancher
[BUG] Show the backing image size on the backing image page once it is downloaded. (2497) - @smallteeths @shuo-wu @khushboo-rancher

Bugs

[BUG] MountVolume.SetUp failed for volume ~ rpc error: code = Internal desc = exit status 1 due to multipathd on the node (1210) - @cclhsu
UI API request sometimes failed on 1.0.0 (1442) - @jenting
[BUG] zombie processes (1457) -
[BUG] Service account secret mount spam on longhorn-manager restart (1877) - @PhanLe1010 @khushboo-rancher
[BUG] SnapshotPurge error cases handle improvement (1895) - @PhanLe1010 @khushboo-rancher
[BUG] Cronjob for backup with Node who has taint (1903) - @PhanLe1010 @khushboo-rancher
[BUG] Add Pagination support to the S3 client (currently we only list the first 1000 results) (1904) - @jenting @khushboo-rancher
[BUG] Volume remains to attach to the wrong node when recurring backup job fails during the auto attachment (1922) - @meldafrawi @PhanLe1010
[BUG] Backup - S3 Timeout (1955) - @jenting @khushboo-rancher
[BUG] Volume stuck in attaching state if one of the replica location can't be found on scaling up a pod. (1999) - @meldafrawi @shuo-wu
[BUG] The feature Pod Deletion Policy When Node is Down doesn't work on Kubernetes >= v1.19.0 (2062) - @PhanLe1010 @khushboo-rancher
[BUG] Volumes used by MinIO workloads cannot be attached with state Degraded (2073) - @PhanLe1010
[BUG] Engine Image fail to reach ready state if there are tainted worker node (2081) - @PhanLe1010 @khushboo-rancher
[BUG] Longhorn node is not cleaned up automatically when the kube node is removed in v1.0.x and the Longhorn is upgraded to v1.1.0 (2100) - @shuo-wu @khushboo-rancher
[BUG] Recurring backup job stuck on K8s 1.19.4 if volume is attached to the same node and powered down (2106) - @khushboo-rancher
[BUG] longhorn-ui should not require IPv6 (2136) - @anupama2501
[BUG] Panic down longhorn-csi-plugin by no param in SC (2154) - @joshimoo @khushboo-rancher
[BUG] Provide multiarch image for longhorn-ui (2175) - @yasker
[QUESTION] Expand a RWX volume (2181) - @joshimoo
[BUG] Manual node deletion for down node (2186) - @c3y1huang @khushboo-rancher
[BUG] Wrong ui size sorting (2193) - @smallteeths @khushboo-rancher
[BUG] ganesha in the share-manager pod requires on IPv6 on the host (2197) - @c3y1huang @khushboo-rancher
[BUG] Secret doesn't refresh in Longhorn-manager unless longhorn-manager is restarted or access the backupstore (2198) - @jenting @khushboo-rancher
[BUG] Shouldn't bind mount /var/run/ in Helm chart (2200) - @jenting @khushboo-rancher
[BUG] All longhorn-manager pods crushloopbackoff (2208) - @cclhsu @khushboo-rancher
[BUG] Fresh Install longhorn volumes not getting attached to pods (2241) - @cclhsu @khushboo-rancher
[BUG] Sort functionality for size on longhorn UI for nodes is not working (2247) - @smallteeths @khushboo-rancher
[BUG] Longhorn crash due to adding disk to a node that's in evicting/unschedulable (2250) - @cclhsu
[BUG] Error during WebSocket handshake for Longhorn UI (2264) - @khushboo-rancher
[BUG] DR volume is not getting upgraded automatically by default (2268) - @PhanLe1010
[BUG] volume name limitation breaks CSI sanity check (2270) - @c3y1huang
[BUG] longhorn-post-upgrade job fails due to "Error: container has runAsNonRoot and image will run as root" (2292) - @jenting @khushboo-rancher
[BUG] UI - 'Enable scheduling' option appears for 'down' node on the Longhorn UI (2308) - @smallteeths
[QUESTION] Volume faulted after reboot, need manually salvage? (2309) - @joshimoo @khushboo-rancher
[BUG] list backup volume for NFS backupstore should not returns error when directory not exist (2312) - @c3y1huang
[BUG] UI - Upgrade menu option is not enabled on Longhorn UI. (2315) - @smallteeths
[BUG] RWX volume fails to attach to a pod (2316) - @joshimoo
[BUG] e2e backup_image tests fails (w/upgrade disabled) (2318) - @c3y1huang
[BUG] Configuration file /etc/iscsi/initiatorname.iscsi does not exist iscsi using longhorn-iscsi-installation.yaml. (2319) - @cclhsu
[BUG] e2e test_deleting_backup_volume fails (2323) - @c3y1huang
[BUG] e2e test_instance_manager_cpu_reservation failed (2325) - @c3y1huang @shuo-wu
[BUG] Number of volumes upgrading appeared to be more than Concurrent Automatic Engine Upgrade Per Node Limit setting value. (2328) - @PhanLe1010 @khushboo-rancher
[BUG] Volume unable to recover when upgrading several StatefulSets (2329) - @joshimoo @khushboo-rancher
[QUESTION] After attaching volume in maintenance mode, the volume can't no longer be attached or detached (2338) - @PhanLe1010 @smallteeths @khushboo-rancher
[BUG] Deleting a backup causes longhorn-manager memory to spike for 30 minutes (2339) - @PhanLe1010 @c3y1huang
[BUG] csi-resizer encounter error (2347) - @joshimoo
[BUG] RWX mount share ownership is being reset to user nobody (2357) - @c3y1huang
[BUG] Engine image and instance manager state is not correct on the node page of Longhorn UI (2377) - @meldafrawi
[BUG] uninstall-controller does not cleanup share-manager crds (2384) - @cclhsu
[BUG] UI - 'Expand volume' option is enabled for attached RWX volume. (2389) - @smallteeths
[BUG] Deployed image showing 'deploying' state forever. (2399) - @PhanLe1010 @khushboo-rancher
[BUG] Attach option available for an already attached RWX volume. (2411) - @smallteeths @khushboo-rancher
[BUG] There is no way to cancel the file syncing when the caller becomes invalid (2416) - @PhanLe1010 @shuo-wu
[BUG] Unable to attach RWX to a node using Longhorn UI (2420) - @smallteeths @khushboo-rancher
[BUG] Volume starts with uppercase in the volume details page (2424) - @smallteeths
[BUG] Longhorn setting page toggles with blank page. (2439) - @smallteeths
[BUG] Instance Manager Pods will be deleted during Longhorn installation (2446) - @PhanLe1010 @khushboo-rancher
[BUG] Error logs in csi-snapshotter after Longhorn deployment. (2459) -
[BUG] Node can't be removed from Longhorn UI if scheduling not disabled before removing Longhorn components from the node (2462) - @smallteeths
[BUG] RWX doesn't work with Degraded Availability (2463) - @joshimoo @PhanLe1010
[BUG] After adding taint to a node, volume cannot be attached to any other node (2475) - @PhanLe1010
[BUG] Unable to add disk using Longhorn UI node page. (2477) - @PhanLe1010
[BUG] Backing image 'URL' should be uppercase on the Longhorn UI (2494) - @smallteeths
[BUG] Increase the size of the create backing image form on the Longhorn UI (2516) - @smallteeths
[BUG] degraded RWX performance on RHEL 7.9 (2528) - @joshimoo

Misc

[QUESTION] Help needed to understand reason of readonly. (2075) - @shuo-wu @khushboo-rancher
[FEATURE] Support CSI fsGroupPolicy (2131) - @PhanLe1010 @khushboo-rancher
[FEATURE] Find a way to backup and restore Longhorn cluster before v1.2.0 (2228) - @meldafrawi @shuo-wu
[FEATURE] Type of the "Concurrent Automatic Engine Upgrade Per Node Limit" to be integer (2253) - @smallteeths
[QUESTION] Longhorn-UI Error during WebSocket handshake: Unexpected response code: 200 (2265) - @cclhsu
[UI] Change engine image state from ready to deployed (2311) - @smallteeths @khushboo-rancher

Contributors

Thanks to all contributors!
