[BGD-4448] run DP components as non root #173

Merged Jan 19, 2024 (1 commit)

@ImpSy ImpSy commented Jan 4, 2024

This is the default pull request template. You can customize it by adding a pull_request_template.md at the root of your repo or inside the .github folder.

Jira Ticket

Include a link to your Jira Ticket
Example: JIRAISS-1234

Demo

Please add a recording of the feature/bug fix at work. If you added new routes, the recording should show the request and response for each new or changed route.

Checklist:

  • I have filled in the relevant self-assessment (NodeJS, Frontend, Backend)
  • I have run ESLint on my changes and fixed all warnings and errors (NodeJS & Frontend Services)
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have validated all the requirements in the Jira task were answered
  • I have all necessary approvals for the design/mini design of this task
  • I have approved the API changes and granular permission patterns (documentation subtask) (For public services only)

@ImpSy ImpSy requested a review from a team as a code owner January 4, 2024 14:31

ImpSy commented Jan 8, 2024

Bigdata Operator
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2024-01-08T15:35:33Z"
  generateName: bigdata-operator-5fc56c795d-
  labels:
    app.kubernetes.io/instance: bigdata-operator
    app.kubernetes.io/name: bigdata-operator
    bigdata.spot.io/component: bigdata-operator
    pod-template-hash: 5fc56c795d
  name: bigdata-operator-5fc56c795d-bfcll
  namespace: spot-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: bigdata-operator-5fc56c795d
    uid: cc6682c0-9826-473e-8d44-0f315f4f147e
  resourceVersion: "9915080"
  uid: 1bce0527-c398-4b58-8194-6e4fe53b2352
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: spotinst.io/node-lifecycle
            operator: In
            values:
            - od
        weight: 100
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: spotinst.io/ocean-vng-id
            operator: Exists
          - key: bigdata.spot.io/vng
            operator: NotIn
            values:
            - ocean-spark
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: bigdata.spot.io/component
              operator: Exists
          topologyKey: kubernetes.io/hostname
        weight: 100
  containers:
  - args:
    - --leader-elect
    env:
    - name: SPOTINST_BASE_URL
      value: https://api.spotinst.io
    - name: CHART_VERSION
      value: 0.4.5
    - name: OCEAN_CONTROLLER_NAMESPACE
    - name: BIGDATA_OPERATOR_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: HTTP_PROXY
      valueFrom:
        configMapKeyRef:
          key: proxyUrl
          name: spot-ofas-cluster-info
          optional: true
    - name: HTTPS_PROXY
      valueFrom:
        configMapKeyRef:
          key: proxyUrl
          name: spot-ofas-cluster-info
          optional: true
    image: public.ecr.aws/f4k1p1n4/bigdata-operator:0.4.3-aa3aad93
    imagePullPolicy: IfNotPresent
    name: manager
    ports:
    - containerPort: 9443
      name: webhook
      protocol: TCP
    - containerPort: 8080
      name: metrics
      protocol: TCP
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    securityContext:
      runAsNonRoot: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-4vft6
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: ip-192-168-254-144.ap-south-1.compute.internal
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: bigdata-operator
  serviceAccountName: bigdata-operator
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: bigdata.spot.io/unschedulable
    operator: Equal
    value: ocean-spark-system
  - effect: NoSchedule
    key: kubernetes.azure.com/scalesetpriority
    operator: Equal
    value: spot
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-4vft6
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:35:33Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:35:40Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:35:40Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:35:33Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://79469a6dee02bd351bed1707faeda78bdb9ca2bbdc234c0801db465e62ef5bba
    image: public.ecr.aws/f4k1p1n4/bigdata-operator:0.4.3-aa3aad93
    imageID: public.ecr.aws/f4k1p1n4/bigdata-operator@sha256:f6835b9b2fb9786f375f3c5809e530b5e752a154d8fe8ac35751ad72e68358c6
    lastState: {}
    name: manager
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2024-01-08T15:35:40Z"
  hostIP: 192.168.254.144
  phase: Running
  podIP: 192.168.254.160
  podIPs:
  - ip: 192.168.254.160
  qosClass: Guaranteed
  startTime: "2024-01-08T15:35:33Z"
Bigdata Proxy
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2024-01-08T15:43:27Z"
  generateName: bigdata-proxy-bdenv-v69-558f7d68fb-
  labels:
    app.kubernetes.io/instance: bigdata-proxy-bdenv-v69
    app.kubernetes.io/name: bigdata-proxy
    bigdata.spot.io/component: bigdata-proxy
    pod-template-hash: 558f7d68fb
  name: bigdata-proxy-bdenv-v69-558f7d68fb-89cqm
  namespace: spot-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: bigdata-proxy-bdenv-v69-558f7d68fb
    uid: 8367631c-c419-4c70-a0b3-2d1079cc024e
  resourceVersion: "9917120"
  uid: 2ff2e5d9-2d82-4e9b-acb0-77740e677879
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: spotinst.io/node-lifecycle
            operator: In
            values:
            - od
          - key: spotinst.io/ocean-vng-id
            operator: Exists
          - key: bigdata.spot.io/vng
            operator: NotIn
            values:
            - ocean-spark
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: bigdata.spot.io/component
              operator: Exists
          topologyKey: kubernetes.io/hostname
        weight: 100
  containers:
  - env:
    - name: BIGDATA_PROXY_PORT
      value: "8080"
    - name: BIGDATA_PROXY_LOG_LIMIT
      value: "0"
    - name: BIGDATA_PROXY_EVENT_TIMEOUT
      value: 10m
    - name: BIGDATA_PROXY_APPS_NAMESPACE
      value: spark-apps
    - name: BIGDATA_PROXY_LOG_LEVEL
      value: info
    - name: BIGDATA_PROXY_UPSTREAM_TTL
      value: 5m
    image: 066597193667.dkr.ecr.us-east-1.amazonaws.com/private/bigdata-proxy:0.5.3-3862d889
    imagePullPolicy: IfNotPresent
    name: bigdata-proxy
    ports:
    - containerPort: 8080
      name: proxy
      protocol: TCP
    resources:
      limits:
        cpu: 500m
        memory: 500Mi
      requests:
        cpu: 500m
        memory: 500Mi
    securityContext:
      runAsNonRoot: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-g8h9k
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: spot-bigdata-image-pull
  nodeName: ip-192-168-233-66.ap-south-1.compute.internal
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: bigdata-proxy
  serviceAccountName: bigdata-proxy
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: bigdata.spot.io/unschedulable
    operator: Equal
    value: ocean-spark-system
  - effect: NoSchedule
    key: kubernetes.azure.com/scalesetpriority
    operator: Equal
    value: spot
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-g8h9k
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:43:27Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:43:29Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:43:29Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:43:27Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://2c99a19676e9a0b5ce353c14c50b2159582a444022d580485c0d9437e5b596b1
    image: 066597193667.dkr.ecr.us-east-1.amazonaws.com/private/bigdata-proxy:0.5.3-3862d889
    imageID: 066597193667.dkr.ecr.us-east-1.amazonaws.com/private/bigdata-proxy@sha256:bd106cc61e481eb31bb4f7e22dad22be045e5cd04f3c5c2206b45c23df1e4503
    lastState: {}
    name: bigdata-proxy
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2024-01-08T15:43:28Z"
  hostIP: 192.168.233.66
  phase: Running
  podIP: 192.168.216.153
  podIPs:
  - ip: 192.168.216.153
  qosClass: Guaranteed
  startTime: "2024-01-08T15:43:27Z"
Bigdata Spark Watcher
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2024-01-08T15:47:40Z"
  generateName: bigdata-spark-watcher-bdenv-v69-64f4c8c985-
  labels:
    app.kubernetes.io/instance: bigdata-spark-watcher-bdenv-v69
    app.kubernetes.io/name: bigdata-spark-watcher
    bigdata.spot.io/component: bigdata-spark-watcher
    pod-template-hash: 64f4c8c985
  name: bigdata-spark-watcher-bdenv-v69-64f4c8c985-8lf5j
  namespace: spot-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: bigdata-spark-watcher-bdenv-v69-64f4c8c985
    uid: dfff92c5-c05d-4122-a066-bf131c84c5df
  resourceVersion: "9918378"
  uid: d36a8480-04f3-4355-9231-15d8d905623d
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: spotinst.io/node-lifecycle
            operator: In
            values:
            - od
          - key: spotinst.io/ocean-vng-id
            operator: Exists
          - key: bigdata.spot.io/vng
            operator: NotIn
            values:
            - ocean-spark
    podAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: bigdata.spot.io/component
              operator: Exists
          topologyKey: kubernetes.io/hostname
        weight: 100
  containers:
  - args:
    - --metrics-bind-address
    - :8080
    - --watch-label
    - ""
    - --log-level
    - debug
    env:
    - name: SPOTINST_BASE_URL
      value: https://api.spotinst.io
    - name: APP_SYNC_PERIOD
      value: 5m
    - name: APP_SYNC_KILL_GRACE_PERIOD
      value: 5m
    - name: APP_SYNC_GHOST_GRACE_PERIOD
      value: 6m
    - name: APP_SYNC_REVERSE_GHOST_GRACE_PERIOD
      value: 30s
    - name: CREDS_REFRESH_INTERVAL
      value: 5m
    - name: SPARK_APP_FAILED_EXECUTOR_LIMIT
      value: "200"
    - name: SPARK_APP_TERMINATED_CRITICAL_SIDECAR_GRACE_PERIOD
      value: 3m
    - name: HTTP_PROXY
      valueFrom:
        configMapKeyRef:
          key: proxyUrl
          name: spot-ofas-cluster-info
          optional: true
    - name: HTTPS_PROXY
      valueFrom:
        configMapKeyRef:
          key: proxyUrl
          name: spot-ofas-cluster-info
          optional: true
    image: 066597193667.dkr.ecr.us-east-1.amazonaws.com/private/bigdata-spark-watcher:0.4.5-78b84f0c
    imagePullPolicy: IfNotPresent
    name: manager
    ports:
    - containerPort: 8080
      name: metrics
      protocol: TCP
    resources:
      limits:
        cpu: "2"
        memory: 2000Mi
      requests:
        cpu: "2"
        memory: 2000Mi
    securityContext:
      runAsNonRoot: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-ldxn2
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: spot-bigdata-image-pull
  nodeName: ip-192-168-143-51.ap-south-1.compute.internal
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: bigdata-spark-watcher
  serviceAccountName: bigdata-spark-watcher
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: bigdata.spot.io/unschedulable
    operator: Equal
    value: ocean-spark-system
  - effect: NoSchedule
    key: kubernetes.azure.com/scalesetpriority
    operator: Equal
    value: spot
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: kube-api-access-ldxn2
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:47:40Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:47:52Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:47:52Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2024-01-08T15:47:40Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: containerd://519b3703c38d79612e6570c047d392b66131bf73c5b858b59bc7bb2b3749a4be
    image: 066597193667.dkr.ecr.us-east-1.amazonaws.com/private/bigdata-spark-watcher:0.4.5-78b84f0c
    imageID: 066597193667.dkr.ecr.us-east-1.amazonaws.com/private/bigdata-spark-watcher@sha256:4d68aa48d9e628932c3f4cd05b31defb6c6cf72834c1760f952a9a16f4798b7c
    lastState: {}
    name: manager
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2024-01-08T15:47:51Z"
  hostIP: 192.168.143.51
  phase: Running
  podIP: 192.168.163.156
  podIPs:
  - ip: 192.168.163.156
  qosClass: Guaranteed
  startTime: "2024-01-08T15:47:40Z"
Bigdata Telemetry (thanos receiver) 🔴
Spark Operator 🔴
Bigdata Notebook Service 🔴
Bigdata Notebook Service Storage Server 🔴
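
The three pod dumps above all set `runAsNonRoot: true` on the container while leaving the pod-level `securityContext` empty. A minimal sketch of how that could be checked mechanically, assuming the manifests are exported as JSON (for example with `kubectl get pod -o json`); the function name `effective_run_as_non_root` and the trimmed sample manifest are hypothetical and not part of this PR:

```python
import json


def effective_run_as_non_root(pod: dict) -> dict:
    """Map each container name to its effective runAsNonRoot setting."""
    spec = pod.get("spec", {})
    pod_level = (spec.get("securityContext") or {}).get("runAsNonRoot")
    result = {}
    for container in spec.get("containers", []):
        ctx = container.get("securityContext") or {}
        # A container-level value takes precedence; otherwise the
        # pod-level value (if any) applies.
        value = ctx.get("runAsNonRoot", pod_level)
        result[container["name"]] = bool(value)
    return result


# Trimmed-down, hypothetical stand-in for the Bigdata Operator manifest.
sample = json.loads("""
{
  "spec": {
    "securityContext": {},
    "containers": [
      {"name": "manager", "securityContext": {"runAsNonRoot": true}}
    ]
  }
}
""")

print(effective_run_as_non_root(sample))  # {'manager': True}
```

The sketch only looks at `spec.containers`; init containers and other security context fields are left out to keep it short.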

@ImpSy ImpSy force-pushed the dp-non-root branch 2 times, most recently from e37766c to 1156085 Compare January 9, 2024 14:04
Z4ck404 previously approved these changes Jan 9, 2024

@Z4ck404 Z4ck404 left a comment

LGTM

Should we add a comment explaining why the other components should not be run as non-root, at least for now (so that people don't change it and break the component)?
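
The thread does not say why the remaining components cannot run as non-root yet. One common way that enabling `runAsNonRoot` breaks a workload is the kubelet refusing to start a container whose image user is root or cannot be verified. A simplified model of that check, as a sketch only: the function name and the error strings are illustrative, and the image user values are hypothetical.

```python
from typing import Optional


def non_root_check(run_as_non_root: bool,
                   run_as_user: Optional[int],
                   image_user: Optional[str]) -> Optional[str]:
    """Return a reason the container would be rejected, or None if it may start."""
    if not run_as_non_root:
        return None
    if run_as_user is not None:
        # An explicit runAsUser decides the outcome on its own.
        return "runAsUser is 0" if run_as_user == 0 else None
    if not image_user:
        # An image without a USER instruction runs as root (UID 0).
        return "image will run as root"
    uid = image_user.split(":", 1)[0]  # USER may be written as 'uid:gid'
    if not uid.isdigit():
        # A user name cannot be verified as non-root without the image's passwd file.
        return f"image has non-numeric user ({image_user}), cannot verify non-root"
    return "image will run as root" if int(uid) == 0 else None


for user in (None, "0", "nobody", "65532:65532"):
    print(repr(user), "->", non_root_check(True, None, user))
```

Under this model, a component whose image has no numeric non-zero `USER` would need either an image change or an explicit `runAsUser` before the flag can be turned on, which is the kind of constraint a comment next to the setting could record.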

@Z4ck404 Z4ck404 left a comment

LGTM

@ImpSy ImpSy merged commit eec4c06 into main Jan 19, 2024
1 check passed
@ImpSy ImpSy deleted the dp-non-root branch January 19, 2024 10:00