Releases: SchedMD/slurm-gcp
6.1.1
- Fix suspend issue with TPU nodes
- Add TPU job example
- Changed Slurm dependency from man2html to man2html-base and man2html-core to reduce image size
- Changed default Docker image name to remove the OS reference
Full Changelog: 6.1.0...6.1.1
6.1.0
- Add on_host_maintenance to packer module to support instances with GPUs.
- Fix retry of powering up static nodes on failure.
- Add support for H3 machines and enumerated multi-socket processors.
- Fix munge failing after manual reboot of node.
- [Beta feature] Added support for TPU-vm nodes.
- [Beta feature] Added support for TPU-vm multi-rank nodes.
- Add `ignore_prefer_validation` to SchedulerParameters in generated cloud.conf.
- Remove unaltered centos-7 image from actively published and supported images.
- Upgrade installed Slurm to 23.02.4.
- Fix CUDA install on Ubuntu 20.04.
Full Changelog: 6.0.0...6.1.0
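As a rough illustration of the `ignore_prefer_validation` entry above: Slurm's SchedulerParameters setting is a comma-separated list of options, so adding one amounts to appending it to that list. The sketch below is not how slurm-gcp generates cloud.conf; the function name `add_scheduler_parameter` is hypothetical, and `bf_continue` in the usage lines is only an example of an option that might already be present.

```python
def add_scheduler_parameter(conf_line, option):
    """Append `option` to a 'SchedulerParameters=a,b' line unless already present.

    Hypothetical helper for illustration only; not part of slurm-gcp.
    """
    key, sep, value = conf_line.partition("=")
    if key.strip() != "SchedulerParameters" or not sep:
        raise ValueError("not a SchedulerParameters line")
    # Options are comma-separated; drop empty entries from an empty value.
    options = [item for item in value.strip().split(",") if item]
    if option not in options:
        options.append(option)
    return "SchedulerParameters=" + ",".join(options)


line = add_scheduler_parameter(
    "SchedulerParameters=bf_continue", "ignore_prefer_validation"
)
print(line)
```

Calling the helper a second time with the same option leaves the line unchanged, so it can be applied repeatedly without duplicating entries.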
6.0.0
- Add slurm cluster management daemon
- Update default Slurm version to 23.02.2.
- Make `slurm_cluster` root module use terraform 1.3 and optional object fields.
- Reconfigure now is a service on the instances.
- Move from project metadata to GCS bucket to store cluster files.
- Factored out nodeset modules (regular, dynamic) from partition module.
- Replace `zone_policy_*` with `zones` in nodeset module.
- Replace `access_config` with `enable_public_ip` and `network_tier`.
- Add partition options `default`, `resume_timeout`, `suspend_time`, `suspend_timeout`.
- Increase `nodeset_name` length to 15 characters (from 7).
- Remove `partition_name` length limit.
- Add `bandwidth_tier` support to instance templates.
- Move `spot` preemptible support to instance template.
- Fix login template name not using `group_name` in name schema.
- Add `enable_login` to toggle creation of login node resources.
- Remove partition level startup-scripts and network mounts.
- Fix Ubuntu 20.04 NVIDIA install.
- Change partition level placement policy to nodeset level.
- Use `topology.conf` to prioritize nodes within nodesets.
- Remove debian-10 and vanilla rocky-linux-8 images from build process and support.
- Fix threads per core inference.
- Upgrade Slurm to 23.02.3
Full Changelog: 5.7.4...6.0.0
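When upgrading a configuration across the 6.0.0 renames listed above, it can help to scan existing files for the replaced option names. The following is a minimal sketch under stated assumptions: the rename table is taken from these notes, while the scanner, its function name `find_renamed_options`, and the sample input (including `zone_policy_allow` as one possible match for the `zone_policy_*` wildcard) are hypothetical and not part of slurm-gcp.

```python
import re

# Old option patterns -> replacement named in the 6.0.0 notes.
RENAMED_OPTIONS = {
    r"zone_policy_\w+": "zones",
    r"access_config": "enable_public_ip and network_tier",
}


def find_renamed_options(config_text):
    """Return (line_number, old_name, replacement) for each outdated option."""
    findings = []
    for line_number, line in enumerate(config_text.splitlines(), start=1):
        for pattern, replacement in RENAMED_OPTIONS.items():
            # Word boundaries avoid matching inside longer identifiers.
            for match in re.finditer(rf"\b{pattern}\b", line):
                findings.append((line_number, match.group(0), replacement))
    return findings


# Hypothetical configuration text, for illustration only.
sample = "zone_policy_allow = []\naccess_config = []\nzones = []"
for line_number, old_name, replacement in find_renamed_options(sample):
    print(f"line {line_number}: replace {old_name} with {replacement}")
```

The scan only flags names; choosing values for the new options still depends on the module documentation for the release being installed.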