We're excited to announce the release of AWS ParallelCluster 3.1.1
Upgrade
How to upgrade?
sudo pip install --upgrade aws-parallelcluster
ENHANCEMENTS
- Add support for multi-user cluster environments by integrating with Active Directory (AD) domains managed via AWS Directory Service.
- Enable cluster creation in subnets with no internet access.
- Add abbreviated flags for `cluster-name` (-n), `region` (-r), `image-id` (-i) and `cluster-configuration`/`image-configuration` (-c) to the CLI.
- Add support for multiple compute resources with same instance type per queue.
- Add support for `UseEc2Hostnames` in the cluster configuration file. When set to `true`, use EC2 default hostnames (e.g. ip-1-2-3-4) for compute nodes.
- Add support for GPU scheduling with Slurm on ARM instances with NVIDIA cards. Install NVIDIA drivers and CUDA library for ARM.
- Add `parallelcluster:compute-resource-name` tag to LaunchTemplates used by compute nodes.
- Add support for `NEW_CHANGED_DELETED` as value of FSx for Lustre `AutoImportPolicy` option.
- Explicitly set cloud-init datasource to be EC2. This saves boot time for Ubuntu and CentOS platforms.
- Improve Security Groups created within the cluster to allow inbound connections from custom security groups when `SecurityGroups` parameter is specified for head node and/or queues.
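
As a rough illustration of the abbreviated flags listed above, the sketch below builds the same `create-cluster` call in long and short form, plus a `build-image` call using `-i`. The cluster name, image id, region, and file names are placeholders chosen for this example, not values from the release. The commands are printed rather than executed, so the snippet runs without AWS credentials or an installed `pcluster`.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration only).
CLUSTER_NAME="demo-cluster"
IMAGE_ID="demo-image"
REGION="us-east-1"
CLUSTER_CONFIG="cluster-config.yaml"
IMAGE_CONFIG="image-config.yaml"

# Long-form flags.
CREATE_LONG="pcluster create-cluster --cluster-name ${CLUSTER_NAME} --region ${REGION} --cluster-configuration ${CLUSTER_CONFIG}"

# The same call with the abbreviated flags: -n, -r, -c.
CREATE_SHORT="pcluster create-cluster -n ${CLUSTER_NAME} -r ${REGION} -c ${CLUSTER_CONFIG}"

# For image builds, -i abbreviates image-id and -c abbreviates image-configuration.
BUILD_SHORT="pcluster build-image -i ${IMAGE_ID} -r ${REGION} -c ${IMAGE_CONFIG}"

# Print the commands instead of running them.
printf '%s\n' "$CREATE_LONG" "$CREATE_SHORT" "$BUILD_SHORT"
```

To run one of the commands for real, paste the printed line into a shell where `pcluster` 3.1.1 is installed and the referenced configuration file exists.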
CHANGES
- Upgrade Slurm to version 21.08.5.
- Upgrade NICE DCV to version 2021.3-11591.
- Upgrade NVIDIA driver to version 470.103.01.
- Upgrade CUDA library to version 11.4.4.
- Upgrade NVIDIA Fabric manager to version 470.103.01.
- Upgrade Intel MPI Library to 2021.4.0.441.
- Upgrade PMIx to version 3.2.3.
- Disable package update at instance launch time on Amazon Linux 2.
- Enable possibility to suppress `SlurmQueues` and `ComputeResources` length validators.
- Use compute resource name rather than instance type in compute fleet Launch Template name.
- Disable EC2 ImageBuilder enhanced image metadata when building ParallelCluster custom images.
- Remove dumping of failed compute nodes to `/home/logs/compute`. Compute nodes log files are available in CloudWatch and in EC2 console logs.
BUG FIXES
- Redirect stderr and stdout to CLI log file to prevent unwanted text from polluting the `pcluster` CLI output.
- Fix exporting of cluster logs when there is no prefix specified, previously exported to a `None` prefix.
- Fix rollback not being performed in case of cluster update failure.
- Do not configure GPUs in Slurm when NVIDIA driver is not installed.
- Fix `ecs:ListContainerInstances` permission in `BatchUserRole`.
- Fix `RootVolume` schema for the `HeadNode` by raising an error if unsupported `KmsKeyId` is specified.
- Fix `EfaSecurityGroupValidator`. Previously, it could produce false failures when custom security groups were provided and EFA was enabled.
- Fix FSx metrics not displayed in CloudWatch Dashboard.