Mode where nfd-worker updates the labels #2022
Comments
One big issue with the nodefeature objects currently is their size. Quickly thinking, I can see two big culprits adding to that. One is the "managed fields metadata", basically every feature (like ...

Second improvement, which I think we really need (and which I have thought about for a long time), would be sharding of nfd-master, i.e. distributing the nodes across multiple instances of nfd-master. E.g. calculate a checksum of the node name and take it modulo the number of shards to get the instance (shard) that is responsible for that node.

Splitting the functionality into two daemons is deliberate, e.g. for security considerations.

Thoughts?
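For illustration, the checksum-and-modulo idea could be sketched roughly like this. This is not NFD code: the function name `shardFor`, the choice of FNV-1a as the checksum, and the example node names are all assumptions made for the sketch.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor returns the index of the (hypothetical) nfd-master shard that
// would be responsible for nodeName. It hashes the node name with FNV-1a,
// an arbitrary choice for this sketch, and reduces the result modulo the
// number of shards.
func shardFor(nodeName string, numShards uint32) (uint32, error) {
	if numShards == 0 {
		return 0, fmt.Errorf("numShards must be greater than zero")
	}
	h := fnv.New32a()
	h.Write([]byte(nodeName)) // Write on a hash.Hash never returns an error
	return h.Sum32() % numShards, nil
}

func main() {
	// Example node names and shard count are made up for the illustration.
	nodes := []string{"node-a", "node-b", "node-c", "node-d"}
	const numShards = 3

	for _, n := range nodes {
		shard, err := shardFor(n, numShards)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> shard %d\n", n, shard)
	}
}
```

One caveat with plain modulo: changing the number of shards reassigns most nodes to a different instance, so something like consistent hashing might be worth considering if the shard count is expected to change often.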
Yeah, I think the deny list as suggested in #2026 is a great idea and a smaller change than the other suggestions. This would be a great place to start ASAP if possible. Sharding is an interesting idea for sure, would love to discuss that further. The problem, though, is that even if we paginate the list calls or shard the master, that helps automation work smoothly but still leaves a footgun if a user gets curious and calls ...

Just for my understanding though, what are the security concerns for the split? Just giving the daemonset permissions to edit nodes?
We can start there for sure
That might help, need to follow up on the work more closely. One thing would be to try to make kubectl smarter and support pagination there, too. Another thing we could do is document this in NFD and warn about the consequences in big clusters.
Yes. Run nfd-worker with the smallest possible privileges. We could also explore the NodeFeature-less nfd-worker-standalone option as an alternative operating mode. But then nfd-worker would have to replicate all the functionality of nfd-master (NodeFeatureRules, NodeFeatureGroups etc).
We could/should create a separate issue about sharding.
What would you like to be added:
A mode where nfd-worker can update labels itself, without needing to run nfd-master with an informer cache of all nodefeatures. Maybe keep just some subset map of nodefeatures for gc if this issue is implemented: #2021, i.e. just store nodefeatures by name/node.
Why is this needed:
nodefeature CRs can be a footgun if users list them i.e.
k get nodefeatures
in a high-scale environment. nfd-master also uses a ton of memory at scale.
If nfd-worker just handled the labels for its own node, it would alleviate a lot of scale concerns.