Releases: oracle-quickstart/oci-hpc

v2.11.0

16 Oct 21:11
6c9652e

What's Changed

  • Slurm Update (24.05.1)
  • Slurm nodes are dynamic instead of pre-defined (+ custom hostnames)
  • Multi mount target (MT) FSS deployment (using DNS round robin)
  • New monitoring solution (Prometheus and Node Exporter; can be run on a separate instance)
  • Fix to Slurm Healthchecks
  • New shapes added: MI300X, L40S, A100 VMs, ...
  • OpenLDAP fix for OL8
  • Meshpinger added (Used to validate RDMA connectivity between all RDMA NICs on all hosts)
  • NVMe drives (switch from LVM to mdadm)
  • New OL8 images (OCA hold)
  • Bug Fixes

v2.10.6

17 May 16:55
382c496

What's Changed

  • Add Healthchecks for GPU nodes in Slurm (Idle nodes and at job start)
  • Scratch from NVMe is no longer the default
  • OCI Algo Tuner example for A100
  • GPU and RDMA monitoring turned on
  • New images
  • Bug fixes

v2.10.5

25 Mar 15:51
3c6f243

What's Changed

  • Rename bastion to controller
  • Added Slurm names to DNS
  • Enable BIOS change
  • New images
  • Fix Compute Cluster to use Compute Agent

v2.10.4.1

16 Jan 22:43
b9531c4

What's Changed

  • Quick fix for modified HashiCorp repos

v2.10.4

05 Jan 19:46
7f6f274

What's Changed

  • Support for Ubuntu 22.04 and OL8 on GPU nodes, update of all images
  • Support for Oracle Cloud Agent for RDMA auth
  • Support for the H100, E5 std and E5 HPC
  • Update Slurm to 23.02.5-1
  • Add automatic backup of bastion boot volume

v2.10.3

16 Sep 05:28
f0499b7

What's Changed

  • Support for OL8 on bastion
  • Support for Compute Clusters
  • Add GPU monitoring
  • Support for Hyperthreading on nodes with 256+ threads in Slurm
  • Add IB Write tests
  • Mount multiple disks as one (with or without redundancy)
  • Bug Fixes and improvements

v2.10.2.1

14 Jun 23:31
763d350

What's Changed

  • Ubuntu support for PAM
  • Update oci-cn-auth if the image has an outdated version
  • Update some default variables.

v2.10.2

24 May 17:02
3595386

What's Changed

  • Updated to Slurm 23.02 (which removes the need for node ordering in large GPU clusters)
  • Updated marketplace images for OL7, OL8, and GPU with version 2.1.4 of the OCI authentication packages (needed for better performance on GPU clusters)
  • Fixed LDAP on Ubuntu
  • Added the option to mount all NVMe drives as separate namespaces or as one logical volume (with or without redundancy)
  • Added Hyperthreading for Ubuntu BMs
  • Support for PMIx in Slurm
  • Fix a Slurm bug due to long Rack IDs
  • Other small bug fixes

v2.10.1.1

27 Mar 17:57
6640d3e

What's Changed

  • Fix bug affecting bastion and login node Flex Shapes

v2.10.1

24 Mar 22:36
25c5704

What's Changed

  • Slurm User limits and PAM
  • Updated marketplace images for OL7, OL8 and GPU with latest drivers.
  • Support for the upcoming E5, A10 VMs and Dense.E4.Flex
  • Add the ability to run a login node separate from the bastion.
  • Update OCI provider version to 4.112.0
  • Other small bug fixes