
Support for securityContext for pods #30

Open
lvikstro opened this issue Mar 30, 2022 · 43 comments

Comments

@lvikstro

Installing the prometheus operator Helm chart (https://prometheus-community.github.io/helm-charts, kube-prometheus-stack) with its defaults sets this for the Prometheus instance:

securityContext:
  runAsGroup: 2000
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 2000

This makes the "prometheus-kube-prometheus-stack-prometheus-0" pod go into a crash loop with this error in the logs: "Unable to create mmap-ed active query log".

Changing the prometheusSpec securityContext like this:

securityContext:
  runAsGroup: 0
  runAsNonRoot: true
  runAsUser: 0
  fsGroup: 2000

makes it all work. But then it is most likely running with root permissions on the file system.

This seems to be an issue with the CSI implementation, where it doesn't support fsGroup handling (fsGroupPolicy) or similar. For example, Longhorn handles this with "fsGroupPolicy: ReadWriteOnceWithFSType", which makes each volume get examined at mount time to determine whether permissions should be applied recursively.
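
For reference, the fsGroupPolicy mentioned above lives on the cluster-scoped CSIDriver object. A minimal sketch of such a declaration, assuming the field is added to this project's CSIDriver manifest (whether the shipped manifests actually set it is an assumption):

apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: csi.san.synology.com
spec:
  attachRequired: true
  podInfoOnMount: true
  # ReadWriteOnceWithFSType: the kubelet applies fsGroup ownership at mount
  # time for RWO volumes that have an fsType, which is how Longhorn declares it
  fsGroupPolicy: ReadWriteOnceWithFSType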

@inductor
Contributor

@lvikstro I get the following error even when I specify runAsUser: 0. Are you sure it works? 🤔

securityContext:
  runAsGroup: 0
  runAsNonRoot: true
  runAsUser: 0
  fsGroup: 2000
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  3m52s                 default-scheduler  Successfully assigned monitoring/prometheus-prometheus-kube-prometheus-prometheus-0 to node
  Warning  Failed     75s (x12 over 3m30s)  kubelet            Error: container's runAsUser breaks non-root policy (pod: "prometheus-prometheus-kube-prometheus-prometheus-0_monitoring(9e509cb5-e8af-4ac4-8ce0-c96fb6ca19c5)", container: init-config-reloader)
  Normal   Pulled     60s (x13 over 3m30s)  kubelet            Container image "quay.io/prometheus-operator/prometheus-config-reloader:v0.56.2" already present on machine

@schoeppi5

Hi there @lvikstro,
I ran into the exact same problem (and I mean the same down to the error message). I am using Kubernetes 1.21 with the newest release of the driver.

The security context itself does work, since applying it is the responsibility of Kubernetes rather than the driver, but in order for it to work I had to configure some things:

  1. This apparently does not work with btrfs
  2. fsType in the storage class is deprecated and was replaced with csi.storage.k8s.io/fstype: ext4
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: normal
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.san.synology.com
# if all params are empty, synology CSI will choose an available location to create volume
parameters:
  dsm: "<dsmip>"
  location: /volume<n>
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true

This combination fixed it for me

@inductor
Contributor

inductor commented Jun 9, 2022

@schoeppi5

Hello.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: synology-iscsi-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.san.synology.com
parameters:
  dsm: '192.168.16.240'
  location: '/volume1'
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Retain
allowVolumeExpansion: true

This is my SC spec, but it still doesn't work for me .-. Do you have any ideas?

@jjdiazgarcia

I am also facing this issue.
I am using the latest version of this repository, and PostgreSQL is not able to start because the owner of the NFS mount is not set properly (although I have set up the securityContext and StorageClass properly).

statefulset definition

apiVersion: apps/v1
kind: StatefulSet
metadata:
  creationTimestamp: "2022-06-17T14:55:44Z"
  generation: 18
  labels:
    app.kubernetes.io/component: primary
    app.kubernetes.io/instance: vaultarden
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: postgresql
    helm.sh/chart: postgresql-10.16.2
  name: vaultarden-postgresql
  namespace: vaultwarden
  resourceVersion: "7462911"
  uid: 239fd77b-e13b-4303-b007-431424ce526e
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: vaultarden
      app.kubernetes.io/name: postgresql
      role: primary
  serviceName: vaultarden-postgresql-headless
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: primary
        app.kubernetes.io/instance: vaultarden
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: postgresql
        helm.sh/chart: postgresql-10.16.2
        role: primary
      name: vaultarden-postgresql
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  app.kubernetes.io/component: primary
                  app.kubernetes.io/instance: vaultarden
                  app.kubernetes.io/name: postgresql
              namespaces:
              - vaultwarden
              topologyKey: kubernetes.io/hostname
            weight: 1
      automountServiceAccountToken: false
      containers:
      - env:
        - name: BITNAMI_DEBUG
          value: "true"
        - name: POSTGRESQL_PORT_NUMBER
          value: "5432"
        - name: POSTGRESQL_VOLUME_DIR
          value: /bitnami/postgresql
        - name: PGDATA
          value: /bitnami/postgresql/data
        - name: POSTGRES_POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              key: postgresql-postgres-password
              name: vaultarden-postgresql
        - name: POSTGRES_USER
          value: vaultwarden
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              key: postgresql-password
              name: vaultarden-postgresql
        - name: POSTGRES_DB
          value: vaultwarden
        - name: POSTGRESQL_ENABLE_LDAP
          value: "no"
        - name: POSTGRESQL_ENABLE_TLS
          value: "no"
        - name: POSTGRESQL_LOG_HOSTNAME
          value: "false"
        - name: POSTGRESQL_LOG_CONNECTIONS
          value: "false"
        - name: POSTGRESQL_LOG_DISCONNECTIONS
          value: "false"
        - name: POSTGRESQL_PGAUDIT_LOG_CATALOG
          value: "off"
        - name: POSTGRESQL_CLIENT_MIN_MESSAGES
          value: error
        - name: POSTGRESQL_SHARED_PRELOAD_LIBRARIES
          value: pgaudit
        image: docker.io/bitnami/postgresql:11.14.0-debian-10-r28
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - exec pg_isready -U "vaultwarden" -d "dbname=vaultwarden" -h 127.0.0.1
              -p 5432
          failureThreshold: 6
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: vaultarden-postgresql
        ports:
        - containerPort: 5432
          name: tcp-postgresql
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - -e
            - |
              exec pg_isready -U "vaultwarden" -d "dbname=vaultwarden" -h 127.0.0.1 -p 5432
              [ -f /opt/bitnami/postgresql/tmp/.initialized ] || [ -f /bitnami/postgresql/.initialized ]
          failureThreshold: 6
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /dev/shm
          name: dshm
        - mountPath: /bitnami/postgresql
          name: postgresql
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1001
      terminationGracePeriodSeconds: 30
      volumes:
      - name: postgresql
        persistentVolumeClaim:
          claimName: test
      - emptyDir:
          medium: Memory
        name: dshm
      - emptyDir: {}
        name: data
  updateStrategy:
    type: RollingUpdate

Storage class definition

allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2022-06-21T07:53:58Z"
  name: synology-smb-storage
  resourceVersion: "7438914"
  uid: 19290577-5044-427d-a4a6-5532b83c49bb
parameters:
  csi.storage.k8s.io/node-stage-secret-name: cifs-csi-credentials
  csi.storage.k8s.io/node-stage-secret-namespace: synology-csi
  dsm: 192.168.30.13
  fsType: ext4
  location: /volume1
  protocol: smb
provisioner: csi.san.synology.com
reclaimPolicy: Retain
volumeBindingMode: Immediate

These are the logs from PostgreSQL when using this configuration:

postgresql 11:04:36.12
postgresql 11:04:36.12 Welcome to the Bitnami postgresql container
postgresql 11:04:36.12 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-postgresql
postgresql 11:04:36.12 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-postgresql/issues
postgresql 11:04:36.13
postgresql 11:04:36.13 DEBUG ==> Configuring libnss_wrapper...
postgresql 11:04:36.14 INFO  ==> ** Starting PostgreSQL setup **
postgresql 11:04:36.18 INFO  ==> Validating settings in POSTGRESQL_* env vars..
postgresql 11:04:36.18 INFO  ==> Loading custom pre-init scripts...
postgresql 11:04:36.19 INFO  ==> Initializing PostgreSQL database...
postgresql 11:04:36.19 DEBUG ==> Ensuring expected directories/files exist...
mkdir: cannot create directory ‘/bitnami/postgresql/data’: Permission denied

Anyway, if I change the directory to an emptyDir (instead of a PVC from the Synology NFS) it works, and I can verify that the owner is 1001 (which I have set in the securityContext):

The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /bitnami/postgresql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default timezone ... Etc/UTC
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
syncing data to disk ... ok

Success. You can now start the database server using:

    /opt/bitnami/postgresql/bin/pg_ctl -D /bitnami/postgresql/data -l logfile start

postgresql 11:09:21.51 INFO  ==> Starting PostgreSQL in background...
waiting for server to start....2022-06-21 11:09:21.540 GMT [66] LOG:  listening on IPv6 address "::1", port 5432
2022-06-21 11:09:21.540 GMT [66] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2022-06-21 11:09:21.543 GMT [66] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-06-21 11:09:21.555 GMT [67] LOG:  database system was shut down at 2022-06-21 11:09:21 GMT
2022-06-21 11:09:21.558 GMT [66] LOG:  database system is ready to accept connections
 done
server started
CREATE DATABASE
postgresql 11:09:21.97 INFO  ==> Changing password of postgres
ALTER ROLE
postgresql 11:09:22.01 INFO  ==> Creating user vaultwarden
CREATE ROLE
postgresql 11:09:22.03 INFO  ==> Granting access to "vaultwarden" to the database "vaultwarden"
GRANT
ALTER DATABASE
postgresql 11:09:22.06 INFO  ==> Setting ownership for the 'public' schema database "vaultwarden" to "vaultwarden"
ALTER SCHEMA
postgresql 11:09:22.10 INFO  ==> Configuring replication parameters
postgresql 11:09:22.14 INFO  ==> Configuring synchronous_replication
postgresql 11:09:22.14 INFO  ==> Configuring fsync
postgresql 11:09:22.18 INFO  ==> Loading custom scripts...
postgresql 11:09:22.19 INFO  ==> Enabling remote connections
postgresql 11:09:22.20 INFO  ==> Stopping PostgreSQL...
waiting for server to shut down....2022-06-21 11:09:22.214 GMT [66] LOG:  received fast shutdown request
2022-06-21 11:09:22.216 GMT [66] LOG:  aborting any active transactions
2022-06-21 11:09:22.220 GMT [66] LOG:  background worker "logical replication launcher" (PID 73) exited with exit code 1
2022-06-21 11:09:22.221 GMT [68] LOG:  shutting down
2022-06-21 11:09:22.239 GMT [66] LOG:  database system is shut down
 done
server stopped
postgresql 11:09:22.32 INFO  ==> ** PostgreSQL setup finished! **

postgresql 11:09:22.37 INFO  ==> ** Starting PostgreSQL **
2022-06-21 11:09:22.393 GMT [1] LOG:  pgaudit extension initialized
2022-06-21 11:09:22.395 GMT [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2022-06-21 11:09:22.395 GMT [1] LOG:  listening on IPv6 address "::", port 5432
2022-06-21 11:09:22.398 GMT [1] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-06-21 11:09:22.412 GMT [156] LOG:  database system was shut down at 2022-06-21 11:09:22 GMT
2022-06-21 11:09:22.415 GMT [1] LOG:  database system is ready to accept connections

Owner of the directory

$ ls -l /bitnami/
total 8
drwxrwsrwx. 3 root 1001 4096 Jun 21 11:09 postgresql

$ ls -l /bitnami/postgresql
total 8
drwx------. 19 1001 1001 4096 Jun 21 11:09 data

$ ls -l /bitnami/postgresql/data
total 176
drwx------. 6 1001 root 4096 Jun 21 11:09 base
drwx------. 2 1001 root 4096 Jun 21 11:10 global
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_commit_ts
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_dynshmem
-rw-------. 1 1001 root 1636 Jun 21 11:09 pg_ident.conf
drwx------. 4 1001 root 4096 Jun 21 11:09 pg_logical
drwx------. 4 1001 root 4096 Jun 21 11:09 pg_multixact
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_notify
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_replslot
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_serial
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_snapshots
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_stat
drwx------. 2 1001 root 4096 Jun 21 11:10 pg_stat_tmp
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_subtrans
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_tblspc
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_twophase
-rw-------. 1 1001 root    3 Jun 21 11:09 PG_VERSION
drwx------. 3 1001 root 4096 Jun 21 11:09 pg_wal
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_xact
-rw-------. 1 1001 root   88 Jun 21 11:09 postgresql.auto.conf
-rw-------. 1 1001 root  249 Jun 21 11:09 postmaster.opts
-rw-------. 1 1001 root   79 Jun 21 11:09 postmaster.pid

@schoeppi5

Depending on the K8s version you are using, there is a problem with the DelegateFSGroupToCSIDriver feature gate, which is enabled by default starting with K8s 1.23.

Normally, the kubelet is responsible for "fulfilling" the securityContext chown and chmod requirements. This feature gate allows the kubelet to delegate that work to the CSI driver, if the driver supports it.

The Synology CSI driver declares that it is able to do that, but it just isn't doing it.

The quick workaround is to disable this feature gate and always let the kubelet handle it.

The proper solution would be for the CSI driver either to not declare this capability, or to actually implement it.
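
For nodes where you manage the kubelet configuration file directly, the gate can be switched off with something like the following sketch (the file is commonly /var/lib/kubelet/config.yaml, but that depends on your installation), followed by a kubelet restart:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  # keep the fsGroup chown/chmod in the kubelet instead of delegating it to the CSI driver
  DelegateFSGroupToCSIDriver: false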

@rblundon

rblundon commented Sep 6, 2022

I am also running into this issue with OpenShift 4.11 (based on k8s 1.24). The CSI driver provisions and mounts the volume with no issues, but pod instantiation always fails with Permission Denied.

@schoeppi5

nfs

The kubelet isn't doing the chown for NFS volumes. You'll have to use an init container for that, along the lines of the sketch below.
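
A rough sketch of that init container approach for the PostgreSQL StatefulSet above (volume name, path, and IDs taken from the earlier manifest; treat it as illustrative rather than a chart option):

spec:
  template:
    spec:
      initContainers:
        - name: volume-permissions
          image: busybox
          # chown the mounted volume so the non-root postgres user (1001) can write to it
          command: ["sh", "-c", "chown -R 1001:1001 /bitnami/postgresql"]
          securityContext:
            runAsUser: 0
          volumeMounts:
            - name: postgresql
              mountPath: /bitnami/postgresql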

@schoeppi5

@rblundon Do you have a bit more info? Maybe the StorageClass you're using and the pod spec?

@inductor
Contributor

inductor commented Sep 6, 2022

Disabling DelegateFSGroupToCSIDriver worked perfectly for me btw. Thank you so much!

@Ryanznoco

Disabling DelegateFSGroupToCSIDriver worked perfectly for me btw. Thank you so much!

How do I disable DelegateFSGroupToCSIDriver on an existing cluster?

@inductor
Contributor

inductor commented Sep 6, 2022

@Ryanznoco Hi, it depends on which Kubernetes distribution you use, but you need to look up "feature gates".

@Ryanznoco

https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/

It is not working for me. I added "--feature-gates=DelegateFSGroupToCSIDriver=false" to the API server manifest, and after a complete redeploy my Redis pod still gets a "Permission denied" error. My Kubernetes version is 1.23.10.

@schoeppi5

You need to add the feature gate to every kubelet config in your cluster, since this is a kubelet gate, not an API server one. Depending on your installation method, there might be an easier way of changing the kubelet config.

@Ryanznoco

Sorry, I don't know how to do this. Can I delete the line csi.NodeServiceCapability_RPC_VOLUME_MOUNT_GROUP, in the source code, then recompile and install the CSI driver?

@schoeppi5

That would be, like, a lot more work than disabling the feature gate.

Please read the whole message before proceeding.

There are a few ways of doing that; the manual approach should work most of the time, while the other approaches depend on your installation.

Manually

If you have K8s installed locally, on VMs, or in any other way where your installation isn't managed by a cloud provider, you can:

  1. Go to one of your nodes
  2. Find the kubelet process (PID): ps -ef | grep /usr/bin/kubelet
  3. Find its command line: cat /proc/<pid>/cmdline
  4. Find the path of the kubelet config, i.e. the path after --config= (it is probably going to be /var/lib/kubelet/config.yaml)
  5. In that file, you can add:
featureGates:
  DelegateFSGroupToCSIDriver: false
  6. Restart the kubelet service (systemctl restart kubelet)
  7. Congrats, you just disabled a feature gate in the kubelet. Now repeat steps 5 and 6 on all other worker nodes.

kubeadm

kubeadm keeps the current kubelet config in a configmap on the cluster in the kube-system namespace, called kubelet-config.

You can find the whole process well documented here.
It works pretty much the same as the manual approach.
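
Roughly, the ConfigMap you end up editing looks like this sketch (on some kubeadm versions the ConfigMap name carries a version suffix, and the change still has to be rolled out to each node's kubelet; both of those details are assumptions about your setup):

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubelet-config
  namespace: kube-system
data:
  kubelet: |
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    featureGates:
      DelegateFSGroupToCSIDriver: false
    # ...rest of the existing kubelet configuration...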

Talos Linux

Just putting that one in here, since I am working with that right now:

In your worker configuration, add the feature gates you want to enable or disable, as depicted in their docs here.

Others

For other installation methods, you'll have to consult their respective docs on how to disable kubelet feature gates.

But yes, you could probably also remove this capability, recompile, rebuild the image, and patch your deployment, but no guarantee on that.

At this point, I might as well open a PR for this issue. We'll see.

Hope this helps you get the driver running correctly. Let me know how it went, I'm invested now 😆

@rblundon

rblundon commented Sep 8, 2022

This makes sense, but how would it be accomplished on OpenShift?

@schoeppi5

Sorry, I don't have any experience in OpenShift

@inductor
Contributor

inductor commented Sep 8, 2022

@rblundon for OpenShift problems you really SHOULD ask Red Hat

@Ryanznoco

@schoeppi5 Thank you for your help. I solved it by modifying the source code. I also tried modifying the kubeadm configmap, but it still doesn't work. Looking forward to the new release with your PR.
@inductor And thank you too.

@rblundon

rblundon commented Sep 9, 2022

@Ryanznoco - I will try reinstalling the CSI driver from your repo. @inductor - I work for RH, but in sales, not engineering. Pretty sure this wouldn't get any traction, as it looks to be an issue with the Synology driver not doing what it advertises it is capable of doing.

@mazzy89

mazzy89 commented Nov 26, 2022

The problem here is that this CSI driver just declares the capability but does not do anything to actually implement it, unlike other CSI drivers. See an example here: https://github.com/kubernetes-csi/csi-driver-smb/pull/379/files

We should point out this limitation in the README, because more and more k8s clusters and distros implement security contexts nowadays, and the days when everything ran as root are hopefully gone.

@inductor
Contributor

inductor commented Jan 4, 2023

DelegateFSGroupToCSIDriver is locked to enabled as of 1.26, since the feature is considered GA.

invakid404 added a commit to invakid404/synology-csi that referenced this issue Jan 23, 2023
@vaskozl
Contributor

vaskozl commented Jan 31, 2023

I also just upgraded to v1.26.1, and the kubelet complained and wouldn't allow me to disable the feature gate anymore. This is a pretty major bug, which means the CSI driver doesn't work on k8s v1.26 or later.

It's a pretty easy fix.
@chihyuwu would you mind cutting a new release please?

@chihyuwu
Collaborator

chihyuwu commented Feb 3, 2023

Hi @vaskozl, of course! Thanks for letting us know about it!
A new version without the RPC_VOLUME_MOUNT_GROUP capability will be released soon to make sure that the plugin is compatible with k8s v1.26.
We'll definitely put more emphasis on CSI's flexibility and compatibility in the future as well!

@chihyuwu
Collaborator

chihyuwu commented Feb 6, 2023

A new version without the RPC_VOLUME_MOUNT_GROUP capability will be released soon to make sure that the plugin is compatible with k8s v1.26.

A new update of synology/synology-csi:latest is available now.

The problem here is that this CSI driver just declares the capability but does not do anything to actually implement it, unlike other CSI drivers. See an example here: https://github.com/kubernetes-csi/csi-driver-smb/pull/379/files

In the previous version, we tried to implement securityContext support like in the example, but still missed something. We'll check and fix it in the future.

@inductor
Contributor

inductor commented Feb 6, 2023

@chihyuwu Would you be willing to move the image to GitHub instead of Docker Hub? The rate limit can be problematic in environments that share the same outgoing global IP address(es).

inductor added a commit to GiganticMinecraft/seichi_infra that referenced this issue Feb 23, 2023
@camaeel

camaeel commented Mar 30, 2023

When will it be released?

@tomasodehnal

Thanks @schoeppi5 for the detailed description! Adding the K3s instructions for those that might need them until it is resolved:

  1. add these lines to /etc/rancher/k3s/config.yaml:
kubelet-arg:
  - feature-gates=DelegateFSGroupToCSIDriver=false
  2. systemctl restart k3s

Works for me on K3s v1.25.3.

@camaeel

camaeel commented Apr 1, 2023

Works on <=1.25. On 1.26 the feature gate can no longer be disabled.

@salzig

salzig commented May 30, 2023

@chihyuwu does v1.1.1 still contain this bug?

@chihyuwu
Collaborator

chihyuwu commented Jun 5, 2023

Hi @salzig,
v1.1.1 is a version that still has this bug; please try v1.1.2, which we just released.
If you still encounter problems, please let us know.

@salzig

salzig commented Jun 5, 2023

@chihyuwu you may close #53 then.

@dperez-sct

dperez-sct commented Aug 1, 2023

I'm trying to get the CSI to work on 1.27

The problem I see is that, despite setting the fsGroup, the mount in the pod is done with uid=0 and gid=0:

//192.168.1.1/k8s******************b-45e6-b on /prometheus type cifs (rw,relatime,vers=3.1.1,cache=strict,username=******,uid=0,noforceuid,gid=0,noforcegid,addr=192.168.1.1,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1)

I had understood that this was solved as of version 1.1.2, but perhaps I have not understood it correctly. Any ideas?

Thanks and regards.

Sorry, I've already found the parameter for GID and UID, problem solved.

@sirjobzy-yorku

I am still unable to use the CSI driver to change the ownership of the mounted FS, @chihyuwu. I am running v1.1.3.

@chris-sanders

chris-sanders commented Jan 20, 2024

Starting a container with a security context:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    helm.sh/hook: test
    helm.sh/hook-delete-policy: hook-succeeded
  name: pvc-test
  namespace: synology-csi
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: synology-csi-iscsi
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: my-shell-pod
  namespace: synology-csi
spec:
  securityContext:
    fsGroup: 107
  containers:
  - name: shell
    image: busybox
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 30; done;" ]
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 107
      seccompProfile:
        type: RuntimeDefault
    volumeMounts:
    - name: my-volume
      mountPath: /mnt/pvc
  volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: pvc-test

This still results in a disk that is root:root when using the iSCSI provider. Is there a way to get the user to match? I thought the fsGroup should be applied to the volume.

@chris-sanders

After much work, kubevirt/containerized-data-importer#2523 (comment) pointed me at setting the CSIDriver fsGroupPolicy to "File", which is now working. However, the chart doesn't support this, so I had to disable the CSIDriver object in the chart and install my own. If this is expected (it doesn't sound like it is), the chart should be updated. However, I'm not clear why this is necessary; it seems like a bug, and the default driver setting should work.
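
For anyone in the same spot, a minimal sketch of that replacement CSIDriver object (the driver name matches the one used elsewhere in this thread; the other spec values are assumptions about the chart's defaults):

apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: csi.san.synology.com
spec:
  attachRequired: true
  podInfoOnMount: true
  # File: the kubelet always applies fsGroup ownership to the volume,
  # regardless of fsType or access mode
  fsGroupPolicy: File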

@schoeppi5

@chris-sanders If I remember this issue correctly, the fix discussed above does not solve the issue but is a workaround. The CSI spec defines that the driver declares certain capabilities, for example whether it is able to set the uid and gid of the mount. When a new PV is requested, the kubelet watches for this capability and defers setting the uid and gid in accordance with the security context to the CSI driver. Disabling the feature gate mentioned above, which was possible in the kubelet until v1.25, turned this behaviour off and instructed the kubelet to always chown the new mount.
I am not sure what the current state of things is, but it seems like the kubelet might have removed this functionality (for obvious reasons, since chowning a mount recursively might take a long time). Can you confirm that this is the case?

@chris-sanders

@schoeppi5 Yes, I can confirm that in 1.28 the feature gate was removed and the CSI driver no longer works with the default fsGroupPolicy. I had to set the policy to "File" to get things to work correctly. It is slow, but adding the -E nodiscard flag makes mkfs.ext4 skip the discard step, so an empty volume comes up very quickly now.

I guess ideally the CSI driver should be doing the right thing; until then, the helm chart should at least be defaulting to File with nodiscard.

@jakejx

jakejx commented Feb 3, 2024

Ran into this issue when using the v1.1.3 image also. Turns out the v1.1.3 image provided by Synology includes the older binary v1.1.2 (#71). Building my own image resolved the issue.

@bgulla

bgulla commented Feb 9, 2024

Ran into this issue when using the v1.1.3 image also. Turns out the v1.1.3 image provided by Synology includes the older binary v1.1.2 (#71). Building my own image resolved the issue.

Which commit did you build? I am still getting the error on HEAD.

@jakejx

jakejx commented Feb 10, 2024

Which commit did you build? I am still getting the error on HEAD.

I built fbe1b7e. You can also check your container logs to see the version that is running. The version number is logged on start up.

@marcelofernandez

Hi!

I've run into this bug as well. Despite my container's process running as uid/gid 1000, I had to configure its pod to run as root in order to mount the PVC successfully and access it:

kind: Pod
spec:
  [...]
  volumes:
    - name: pvc
      persistentVolumeClaim:
        claimName: synology-pvc
  [...]
  containers:
    - name: hub
      image: jupyterhub/k8s-hub
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        allowPrivilegeEscalation: false
  [...]
  securityContext:  # Had to add this extra securityContext to the pod
    runAsUser: 0
    runAsGroup: 0

Otherwise, I could see the driver creating the dynamic volume on the NAS successfully and mounting it on the node from the pod, but the hub container didn't have enough permissions to read the volume.
