Replies: 3 comments
-
Will these work?
Pruning objects to reclaim resources: <https://docs.okd.io/4.11/applications/pruning-objects.html>
Freeing node resources using garbage collection: <https://docs.okd.io/4.11/nodes/nodes/nodes-nodes-garbage-collection.html>
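For reference, the first page covers the oc adm prune family of commands. A rough sketch of the kind of invocations it describes (the retention values here are arbitrary examples, and image pruning needs cluster-admin plus a reachable registry):

```sh
# Dry run first: without --confirm, oc adm prune only reports what it would remove
oc adm prune images --keep-tag-revisions=3 --keep-younger-than=60m
# Then prune for real
oc adm prune images --keep-tag-revisions=3 --keep-younger-than=60m --confirm

# Old completed deployments and builds can be pruned the same way
oc adm prune deployments --keep-complete=5 --keep-younger-than=60m --confirm
oc adm prune builds --keep-complete=5 --keep-younger-than=60m --confirm
```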
-
This is expected. By default, kubelet only starts cleaning up unused images when disk usage reaches 90%, so the free disk space you see at any moment depends on each node's image history. Michael has linked to the docs pages if you want to tweak the image GC parameters.
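For reference, those image GC parameters are exposed through a KubeletConfig object targeting a MachineConfigPool. A minimal sketch (the name, pool label, and threshold values below are placeholders, not recommendations):

```sh
# Apply a KubeletConfig that tunes image GC for a labeled MachineConfigPool
cat <<'EOF' | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-image-gc
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: image-gc        # the target MCP must carry this label
  kubeletConfig:
    imageMinimumGCAge: 5m             # never collect images younger than this
    imageGCHighThresholdPercent: 80   # start image GC above this disk usage
    imageGCLowThresholdPercent: 75    # collect until usage drops below this
EOF
```

Note that applying a KubeletConfig rolls out a new MachineConfig, so the nodes in the affected pool will drain and reboot.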
-
Thanks, Michael and Vadim. I had earlier looked at the pruning objects page, but it did not seem useful as I did not have many completed pods. I had not consulted the GC page. On how GC works, there is also this Red Hat blog: https://cloud.redhat.com/blog/image-garbage-collection-in-openshift. None of the above discusses manual image removal, or getting a list of unused images on a node via oc or kubectl, which is possible for pods (and from the console you can see the difference between the total number of pods and the number of running pods). A GC dry run would also be useful, to see how much of the disk pressure is a non-issue.
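For what it's worth, here is a sketch of the kind of manual inspection I mean, using a node debug shell and crictl (oc itself does not list node-local images); this is not an official workflow:

```sh
# Open a debug pod on a node and switch into the host filesystem
oc debug node/<node-name>
chroot /host

# List the images on the node and how much space the image filesystem uses
crictl images
crictl imagefsinfo

# Remove every image not referenced by a running container
# (roughly what kubelet image GC would eventually do; use with care)
crictl rmi --prune
```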
-
I have a cluster that has been running for a while (608 days), and I recently noticed some odd disk usage. I sized the cluster as per the documents of the time and gave each node 120 GB of disk space (control plane: 32 GiB memory, 4 cores; workers: 128 GiB memory, 8 cores). The Rook-Ceph disks are on top of this, as is the registry.
I have also had the 4.11 memory/CPU leak issue and have to reboot some of the nodes every few days.
While looking at the memory leak, I noticed that the actual disk usage seemed abnormal and varied from node to node. This is most obvious on the control plane nodes, which are using 89 GiB, 87 GiB, and 21 GiB. There is no correlation between disk usage and the number of running pods.
It seems that the control plane nodes, at least, should have roughly the same disk usage.
A quick search of the docs did not uncover an easy way of pruning disk space that is not currently needed. Any ideas, before I dig more deeply into how to clean up wasted space on the nodes?
Thanks
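For context, the per-node numbers above can be compared with something along these lines (a sketch; the container storage path may differ on your install):

```sh
# Report container image/storage usage for every node
for node in $(oc get nodes -o name); do
  echo "== ${node} =="
  oc debug "${node}" -- chroot /host df -h /var/lib/containers/storage
done
```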