-
Notifications
You must be signed in to change notification settings - Fork 787
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add kuberenetes resource limits & security context configs #1612
base: master
Are you sure you want to change the base?
Add kuberenetes resource limits & security context configs #1612
Conversation
Just an FYI, but there is a (semi-)related PR for adding support for limits. There is a related issue on it too, which is linked in the PR. Not sure if the approach in the issue is still the desired approach from the Metaflow team though. Currently waiting for a review on the PR. |
Thanks. Maybe 'limit_memory' and 'limit_cpu' in the decorator is the cleanest way to specify it |
Perhaps, the implementation in the other PR was based on a discussion with the Metaflow team in Slack. (Think there is a link to the Slack thread in the original issue.) You are not the first person to raise the alternative suggestion. Someone else raised it in the issue too. I've referenced it in the other PR and was waiting to hear back from the Metaflow folk. |
If it is decided to go with the approach in this PR, then I guess Argo Workflow functionality will need to be updated too. Also, I guess ephemeral-storage? |
@roofurmston I've just refactored the code to support argo as well. There was a 'TODO' in the original code to use shared code for the Argo and Kubernetes container creation. I did that by creating a utility function used by both. |
…-limits-security-settings
@roofurmston checking in on this. Since there is a refactor in here it will be hard to keep this PR open & up-to date for an extended period. |
I'm not a maintainer of Metaflow, I'm afraid. :) I left the initial comment as I had a related PR. Mine is still waiting to be reviewed. Maybe it has slipped the radar of the Metaflow folk and it is worth raising with them? |
For all K8s options, there is a possibility to specify them also on (a) the command line and (b) as args to the @kubernetes decorator. Will the security context be able to be specified the same way? The user-experience won't be great since it's a json object. The benefit of having the @kubernetes decorator is that there can be different values for different steps. Especially if there are different docker images used for different steps, a global security context may not work. E.g. a Docker image the user authored could have a non-root user, but a Docker image taken from an OSS project may only have a root user. Not sure if this is a good idea but one option could be to specify the security_context in the @kubernetes decorator to point to a file. This file could have the security context json. Similarly, will it be able to specify resource_limits in the decorator? |
Yes @shrinandj all 3 fields that have been added to the config also support the kubernetes decorator. For example we support
Where secure_environment_config is some complex setting like the below (which is what we use):
There are so many recursive params in the security settings it seemed to me to make sense to use json (config file) and dict (in python code) so that everything can be specified in detail. |
Is this PR going to happen? We wanted to apply some Kubernetes quota on one of our namespaces for limiting excessive overuse but realized that Kubernetes quotas require limits, which is currently unsupported. Would be interested in that feature. |
The feedback I have had from the Metaflow team is that you need to set limits through a policy agent. This is the approach we will be investigating in due course. |
Hey @roofurmston thanks for your response. Cheers |
I was pointed in the direction of https://kyverno.io/. Other policy agents also allow mutations, such as open policy agent gatekeeper, but I couldn't find a way to set the resources through that policy agent. |
Many thanks for the pointer. Will definitely try that. |
This PR adds more support for secure computing environments for metaflow in kubernetes and with argo.
• Refactored so that common code is used to create the kubernetes client container object. This fixes and existing TODO in the code.
• Added support for kubernetes security_context field in the metaflow config file or decorators.
Use the 'METAFLOW_KUBERNETES_SECURITY_CONTEXT' field in the config file or 'security_context' (see #1262)
The input is a JSONified version of the YML in the kuberentes specification for security context.
Example from the config file
Note that modifications in your docker images may be needed to to support this (such as creating a non-root user etc)
• Added support for kubernetes resource memory and cpu limits field in the metaflow config file or decorators.
This follows the format for memory and cpu requirements for the resource requests field. Examples usage:
... "METAFLOW_KUBERNETES_RESOURCE_LIMITS_MEMORY": 4096, "METAFLOW_KUBERNETES_RESOURCE_LIMITS_CPU": 1, ...