Commit

Driver upgrade with unhealthy nodes
CNT-4913

NVIDIA/gpu-operator#688

Signed-off-by: Mike McKiernan <[email protected]>
mikemckiernan committed Apr 30, 2024
1 parent 9c9b2b1 commit 296e3ad
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions gpu-operator/release-notes.rst
@@ -111,6 +111,9 @@ Fixed Issues
* Previously, under load, the Operator could fail with the message ``fatal error: concurrent map read and map write``.
  In this release, the Operator controller is refactored to prevent the race condition.
  Refer to GitHub `issue #689 <https://github.com/NVIDIA/gpu-operator/issues/689>`__ for more details.
* Previously, if any node in the cluster was in the ``NotReady`` state, the GPU driver upgrade controller failed to make progress.
  In this release, the upgrade library is updated to skip unhealthy nodes.
  Refer to GitHub `issue #688 <https://github.com/NVIDIA/gpu-operator/issues/688>`__ for more details.


.. _v24.3.0-known-limitations:
