Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: change user of operator to 1001 #1155

Draft
wants to merge 2 commits into
base: incubation
Choose a base branch
from

Conversation

zdtsw
Copy link
Member

@zdtsw zdtsw commented Aug 6, 2024

  • this is the user we used for our image
  • the previous only set to non-root user which can be conflict with customized SCC

Description

https://issues.redhat.com/browse/RHOAIENG-10272

How Has This Been Tested?

local build: quay.io/wenzhou/opendatahub-operator-catalog:v2.13.10272-3
test:

  • create a customized SCC as stated in the jira ticket ^
  • install this operator build
  • operator pod should still work and no error on permission

Screenshot or short clip

Merge criteria

  • You have read the contributors guide.
  • Commit messages are meaningful - have a clear and concise summary and detailed explanation of what was changed and why.
  • Pull Request contains a description of the solution, a link to the JIRA issue, and to any dependent or related Pull Request.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

- this is the one we used for our image
- the previous only set to non-root user which can be conflict with customized SCC

Signed-off-by: Wen Zhou <[email protected]>
Copy link

openshift-ci bot commented Aug 6, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from zdtsw. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zdtsw zdtsw requested review from lburgazzoli and VaishnaviHire and removed request for lphiri and MarianMacik August 6, 2024 16:02
@bartoszmajsak
Copy link
Contributor

bartoszmajsak commented Aug 7, 2024

I recall facing a similar problem as described in linked JIRA when looking at bringing Workbenches to Service Mesh - #616

It turned out runAs on pods were not aligned with what openshift was setting

annotations:
    openshift.io/sa.scc.mcs: s0:c26,c25
    openshift.io/sa.scc.supplemental-groups: 1000700000/10000
    openshift.io/sa.scc.uid-range: 1000700000/10000

Changing default SA to restricted-v2 did the job,

Name: "system:openshift:scc:anyuid",

but I am not sure this is relevant in this context. Though might be a thing to work on as well.

@bartoszmajsak
Copy link
Contributor

If the RunAsUser field is absent in the manifest, vanilla k8s will honor the value you set in the Dockerfile, but when running on OpenShift, it will automatically assign UID that satisfies its requirements. Have you tried that @zdtsw?

We did similar work in the context of Istio and Openshift istio/istio#45394

@ykaliuta
Copy link
Contributor

ykaliuta commented Aug 7, 2024

Can we just make /opt/manifests group writeable? (in theory volatile content should be copied to /var, but it's too much for us)

@zdtsw
Copy link
Member Author

zdtsw commented Aug 7, 2024

If the RunAsUser field is absent in the manifest, vanilla k8s will honor the value you set in the Dockerfile, but when running on OpenShift, it will automatically assign UID that satisfies its requirements. Have you tried that @zdtsw?

We did similar work in the context of Istio and Openshift istio/istio#45394

interesting topic of these permissions :D

I think we have 2 parts here:

  1. which uid to start Operator pod: related to the one we used to build image, the SA we assigned in deployment which passed to pod runtime (for this PR and its jira)
  2. which role (in our case it is set to anyuid) to be used for binding to "default" SA in application namespace: this probably is for component's controllers to use if they do not specify SA (e.g workbench?) and override "restricted-v2" to "anyuid" to grant more permission? tbh, i am not really sure why or if we do need this block of code , unless by using "restricted-v2" can cause any problem (e.g components controller wanna run as root????)

after some reading from https://docs.openshift.com/container-platform/4.16/authentication/managing-security-context-constraints.html#security-context-constraints-pre-allocated-values_configuring-internal-oauth
this sounds like, these annotations on NS level are taking effects if pod .spec does not set anything.

As for the change in istio
if i understand it correctly:func getPreallocatedSupplementalGroups(ns *corev1.Namespace) ([]securityv1.IDRange, error) is called after pod is up, in order to generate template? which is different from our case in Operator itself

I did a test:

  • remove .spec.securityContext from CSV
  • make a new build (in Dockerfile, still we set USER 1001)
  • install build in OCP
  • manually create restricted-pipeline-scc with
priority: 10
runAsUser:
  type: MustRunAs
  uid: 1000
  • check operator namespace's annotations:
    openshift.io/sa.scc.supplemental-groups: 1000490000/10000
    openshift.io/sa.scc.mcs: 's0:c22,c14'
    openshift.io/sa.scc.uid-range: 1000490000/10000
  • create DSC with Dashboard enabled
  • check operator pod:
    run as 1000
    log with error: "error":"failed to update params.env from /opt/manifests/dashboard/odh

BUT
if i remove priority from restricted-pipeline-scc
then Operator Pod works: uid is to 1001

I guess/think/assume the whole combination:

  • if user create their customized SCC with priority set, we will have to use RunAsUser in CSV (which must match the value we set in containerimage, or it will use this value to run and have same permission issue ) so the serviceaccount e.g opendatahub-operator-controller-manager will set run as the same uid
  • if user create their customized SCC but not specify priority, we can leave SCC in CSV unset
  • if user does not create their SCC, we can leave SCC in CSV unset too.

@zdtsw
Copy link
Member Author

zdtsw commented Aug 7, 2024

Can we just make /opt/manifests group writeable? (in theory volatile content should be copied to /var, but it's too much for us)

the use case for our Operator with write permission:

  • update existing manifests files: this is something i dont have a better solution for now.
  • create tmp files before final dump into ^ these files. (this we could have re-write the logic to avoid)

This is likely the biggest difference of ours and other products operator.
they do not need to set uid and can use namespace populated range as running id, which does not have problem unless there is any on-disk write operator happens there.

we could make all manifests live under /var/manifests instead of /opt/manifests
but this does not really resolve the problem.
the current user in Operator pod is uid=1001(1001) gid=0(root) groups=0(root)
so i feel skeptical to open more (on group level) than narrow down what exactly is needed(for specific user)

@ykaliuta
Copy link
Contributor

ykaliuta commented Aug 7, 2024

Can we just make /opt/manifests group writeable? (in theory volatile content should be copied to /var, but it's too much for us)

the use case for our Operator with write permission:

  • update existing manifests files: this is something i dont have a better solution for now.

What the problem to make it group writable?

  • create tmp files before final dump into ^ these files. (this we could have re-write the logic to avoid)

This is likely the biggest difference of ours and other products operator. they do not need to set uid and can use namespace populated range as running id, which does not have problem unless there is any on-disk write operator happens there.

we could make all manifests live under /var/manifests instead of /opt/manifests but this does not really resolve the problem.

If on startup it copies RO /opt/manifest to /var/manifests under its UID with typical umask, it gets writable directory.

the current user in Operator pod is uid=1001(1001) gid=0(root) groups=0(root) so i feel skeptical to open more (on group level) than narrow down what exactly is needed(for specific user)

What are the security concerns about that UIDs in the container? There is no sensitive content, the UIDs are virtual from the host point of view.

@zdtsw zdtsw marked this pull request as draft August 16, 2024 08:59
@zdtsw
Copy link
Member Author

zdtsw commented Aug 16, 2024

we will need to revisit this UID late.
mark it to draft (should close but then often it gets lost in a while)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Todo
Development

Successfully merging this pull request may close these issues.

3 participants