Grafana Alloy container fails to start: exec /bin/alloy: operation not permitted #630
Comments
Hmm, interested to see if it's similar to #177 🤔
Yes; however, our policy requires all capabilities to be dropped.
We were able to deploy the Grafana Agent in
To put it bluntly: the solution for issue #177 won't work in our case.
Is this the right place for this issue, or would it be better to address it in the Grafana Helm Chart repository (https://github.com/grafana/helm-charts)?
Updated the description.
Can you share your Grafana Agent Flow config that is working? I can reproduce this: dropping all capabilities causes the permission error. Others have shared lists of required capabilities that work on OpenShift. I am not really aware of a way around that, but I can try to figure out a diff between a working Flow config and a non-working Alloy one.
Hi @captncraig

```yaml
grafana-agent:
  # -- Overrides the chart's name. Used to change the infix in the resource names.
  nameOverride: null
  # -- Overrides the chart's computed fullname. Used to change the full prefix of
  # resource names.
  fullnameOverride: null

  ## Global properties for image pulling override the values defined under `image.registry` and `configReloader.image.registry`.
  ## If you want to override only one image registry, use the specific fields but if you want to override them all, use `global.image.registry`
  global:
    image:
      # -- Global image registry to use if it needs to be overridden for some specific use cases (e.g. local registries, custom images, ...)
      registry: "docker.io"
      # -- Optional set of global image pull secrets.
      pullSecrets: []
    # -- Security context to apply to the Grafana Agent pod.
    podSecurityContext:
      fsGroup: 2000
      runAsNonRoot: true
      runAsUser: 1000

  crds:
    # -- Whether to install CRDs for monitoring.
    create: false

  # Various agent settings.
  agent:
    # -- Mode to run Grafana Agent in. Can be "flow" or "static".
    mode: 'flow'
    configMap:
      # -- Create a new ConfigMap for the config file.
      create: false
      # -- Content to assign to the new ConfigMap. This is passed into `tpl` allowing for templating from values.
      content: ''
      # -- Name of existing ConfigMap to use. Used when create is false.
      name: grafana-agent-config
      #name: null
      # -- Key in ConfigMap to get config from.
      key: config.river
      #key: null
    clustering:
      # -- Deploy agents in a cluster to allow for load distribution. Only
      # applies when agent.mode=flow.
      enabled: true
    # -- Path to where Grafana Agent stores data (for example, the Write-Ahead Log).
    # By default, data is lost between reboots.
    storagePath: /tmp/agent
    # -- Address to listen for traffic on. 0.0.0.0 exposes the UI to other
    # containers.
    listenAddr: 0.0.0.0
    # -- Port to listen for traffic on.
    listenPort: 12345
    # -- Scheme is needed for readiness probes. If enabling tls in your configs, set to "HTTPS"
    listenScheme: HTTP
    # -- Base path where the UI is exposed.
    uiPathPrefix: /
    # -- Enables sending Grafana Labs anonymous usage stats to help improve Grafana
    # Agent.
    enableReporting: false
    # -- Extra environment variables to pass to the agent container.
    extraEnv: []
    # -- Maps all the keys on a ConfigMap or Secret as environment variables. https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#envfromsource-v1-core
    envFrom: []
    # -- Extra args to pass to `agent run`: https://grafana.com/docs/agent/latest/flow/reference/cli/run/
    extraArgs: []
    # -- Extra ports to expose on the Agent
    extraPorts:
      - name: "otlpgrpc"
        port: 4317
        targetPort: 4317
        protocol: "TCP"
      - name: "otlphttp"
        port: 4318
        targetPort: 4318
        protocol: "TCP"
      # - name: "flow-port"
      #   port: 12345
      #   targetPort: 12345
      #   protocol: "TCP"
      # - name: "faro"
      #   port: 12347
      #   targetPort: 12347
      #   protocol: "TCP"
    mounts:
      # -- Mount /var/log from the host into the container for log collection.
      varlog: false
      # -- Mount /var/lib/docker/containers from the host into the container for log
      # collection.
      dockercontainers: false
      # -- Extra volume mounts to add into the Grafana Agent container. Does not
      # affect the watch container.
      extra:
        - name: gfagent-tmp
          mountPath: /tmp/agent
    # -- Security context to apply to the Grafana Agent container.
    securityContext:
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL
    # -- Resource requests and limits to apply to the Grafana Agent container.
    resources: {}

  image:
    # -- Grafana Agent image registry (defaults to docker.io)
    registry: "docker.io"
    # -- Grafana Agent image repository.
    repository: grafana/agent
    # -- (string) Grafana Agent image tag. When empty, the Chart's appVersion is
    # used.
    tag: v0.40.3
    # -- Grafana Agent image's SHA256 digest (either in format "sha256:XYZ" or "XYZ"). When set, will override `image.tag`.
    digest: null
    # -- Grafana Agent image pull policy.
    pullPolicy: IfNotPresent
    # -- Optional set of image pull secrets.
    pullSecrets: []

  rbac:
    # -- Whether to create RBAC resources for the agent.
    create: false

  serviceAccount:
    # -- Whether to create a service account for the Grafana Agent deployment.
    create: true
    # -- Additional labels to add to the created service account.
    additionalLabels: {}
    # -- Annotations to add to the created service account.
    annotations: {}
    # -- The name of the existing service account to use when
    # serviceAccount.create is false.
    name: null

  # Options for the extra controller used for config reloading.
  configReloader:
    # -- Enables automatically reloading when the agent config changes.
    enabled: true
    image:
      # -- Config reloader image registry (defaults to docker.io)
      registry: "ghcr.io"
      # -- Repository to get config reloader image from.
      repository: jimmidyson/configmap-reload
      # -- Tag of image to use for config reloading.
      tag: v0.9.0
      # -- SHA256 digest of image to use for config reloading (either in format "sha256:XYZ" or "XYZ"). When set, will override `configReloader.image.tag`
      digest: ""
    # -- Override the args passed to the container.
    customArgs: []
    # -- Resource requests and limits to apply to the config reloader container.
    resources:
      requests:
        cpu: "1m"
        memory: "5Mi"
    # -- Security context to apply to the Grafana configReloader container.
    securityContext:
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL

  controller:
    # -- Type of controller to use for deploying Grafana Agent in the cluster.
    # Must be one of 'daemonset', 'deployment', or 'statefulset'.
    type: 'deployment'
    # -- Number of pods to deploy. Ignored when controller.type is 'daemonset'.
    replicas: 2
    # -- Annotations to add to controller.
    extraAnnotations: {}
    # -- Whether to deploy pods in parallel. Only used when controller.type is
    # 'statefulset'.
    parallelRollout: true
    # -- Configures Pods to use the host network. When set to true, the ports that will be used must be specified.
    hostNetwork: false
    # -- Configures Pods to use the host PID namespace.
    hostPID: false
    # -- Configures the DNS policy for the pod. https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
    dnsPolicy: ClusterFirst
    # -- Update strategy for updating deployed Pods.
    updateStrategy: {}
    # -- nodeSelector to apply to Grafana Agent pods.
    nodeSelector: {}
    # -- Tolerations to apply to Grafana Agent pods.
    tolerations: []
    # -- Topology Spread Constraints to apply to Grafana Agent pods.
    topologySpreadConstraints: []
    # -- priorityClassName to apply to Grafana Agent pods.
    priorityClassName: ''
    # -- Extra pod annotations to add.
    podAnnotations: {}
    # -- Extra pod labels to add.
    podLabels: {}
    # -- Whether to enable automatic deletion of stale PVCs due to a scale down operation, when controller.type is 'statefulset'.
    enableStatefulSetAutoDeletePVC: false
    autoscaling:
      # -- Creates a HorizontalPodAutoscaler for controller type deployment.
      enabled: false
      # -- The lower limit for the number of replicas to which the autoscaler can scale down.
      minReplicas: 2
      # -- The upper limit for the number of replicas to which the autoscaler can scale up.
      maxReplicas: 5
      # -- Average CPU utilization across all relevant pods, a percentage of the requested value of the resource for the pods. Setting `targetCPUUtilizationPercentage` to 0 will disable CPU scaling.
      targetCPUUtilizationPercentage: 0
      # -- Average Memory utilization across all relevant pods, a percentage of the requested value of the resource for the pods. Setting `targetMemoryUtilizationPercentage` to 0 will disable Memory scaling.
      targetMemoryUtilizationPercentage: 80
      scaleDown:
        # -- List of policies to determine the scale-down behavior.
        policies: []
        # - type: Pods
        #   value: 4
        #   periodSeconds: 60
        # -- Determines which of the provided scaling-down policies to apply if multiple are specified.
        selectPolicy: Max
        # -- The duration that the autoscaling mechanism should look back on to make decisions about scaling down.
        stabilizationWindowSeconds: 300
      scaleUp:
        # -- List of policies to determine the scale-up behavior.
        policies: []
        # - type: Pods
        #   value: 4
        #   periodSeconds: 60
        # -- Determines which of the provided scaling-up policies to apply if multiple are specified.
        selectPolicy: Max
        # -- The duration that the autoscaling mechanism should look back on to make decisions about scaling up.
        stabilizationWindowSeconds: 0
    # -- Affinity configuration for pods.
    affinity: {}
    volumes:
      # -- Extra volumes to add to the Grafana Agent pod.
      extra:
        - name: gfagent-tmp
          emptyDir: {}
    # -- volumeClaimTemplates to add when controller.type is 'statefulset'.
    volumeClaimTemplates: []
    ## -- Additional init containers to run.
    ## ref: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
    ##
    initContainers: []
    # -- Additional containers to run alongside the agent container and initContainers.
    extraContainers: []

  service:
    # -- Creates a Service for the controller's pods.
    enabled: true
    # -- Service type
    type: ClusterIP
    # -- NodePort port. Only takes effect when `service.type: NodePort`
    nodePort: 31128
    # -- Cluster IP, can be set to None, empty "" or an IP address
    clusterIP: ''
    # -- Value for internal traffic policy. 'Cluster' or 'Local'
    internalTrafficPolicy: Cluster
    annotations: {}

  serviceMonitor:
    enabled: false
    # -- Additional labels for the service monitor.
    additionalLabels: {}
    # -- Scrape interval. If not set, the Prometheus default scrape interval is used.
    interval: ""
    # -- MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
    # ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    metricRelabelings: []
    # - action: keep
    #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
    #   sourceLabels: [__name__]
    # -- Customize tls parameters for the service monitor
    tlsConfig: {}
    # -- RelabelConfigs to apply to samples before scraping
    # ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
    relabelings: []
    # - sourceLabels: [__meta_kubernetes_pod_node_name]
    #   separator: ;
    #   regex: ^(.*)$
    #   targetLabel: nodename
    #   replacement: $1
    #   action: replace

  ingress:
    # -- Enables ingress for the agent (faro port)
    enabled: false
    # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
    # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
    # ingressClassName: nginx
    # Values can be templated
    annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
    labels: {}
    path: /
    faroPort: 12347
    # pathType is only for k8s >= 1.18
    pathType: Prefix
    hosts:
      - chart-example.local
    ## Extra paths to prepend to every host configuration. This is useful when working with annotation based services.
    extraPaths: []
    # - path: /*
    #   backend:
    #     serviceName: ssl-redirect
    #     servicePort: use-annotation
    ## Or for k8s > 1.19
    # - path: /*
    #   pathType: Prefix
    #   backend:
    #     service:
    #       name: ssl-redirect
    #       port:
    #         name: use-annotation
    tls: []
    # - secretName: chart-example-tls
    #   hosts:
    #     - chart-example.local
```

I hope this helps.

BR,
Is this the policy you are using? https://kyverno.io/policies/best-practices/require-drop-all/require-drop-all/ If I am reading it correctly, it requires a blanket drop of all capabilities, but it does not preclude adding specific capabilities back in. So I suspect a solution like the one in #177 may work for you.
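If that reading of the policy is right, the usual pattern is to drop everything and then add back only what the binary needs. A sketch of that container `securityContext` (the capability shown is illustrative; the exact list required depends on the image, so verify against your workload):

```yaml
securityContext:
  capabilities:
    drop:
      - ALL                 # satisfies a require-drop-all style policy
    add:
      - NET_BIND_SERVICE    # illustrative: add back only what the binary needs
```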
@captncraig
@captncraig Got feedback from our Kubernetes Team:
I have it running with just
@PatMis16 can you try docker.io/grafana/alloy-dev:v1.1.0-devel-0ad55da2c with all capabilities dropped? It runs for me, but I'm not sure about your policies.
I am in a similar situation with policies, and I can confirm that
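When comparing policies like this, it can help to check which capabilities a container actually retains at runtime by reading `/proc/self/status` from inside the pod (e.g. via `kubectl exec`). A small Linux-only sketch; with `drop: [ALL]`, the effective, permitted, and bounding sets should all be zero:

```python
def read_caps(path="/proc/self/status"):
    """Parse the capability sets (CapInh, CapPrm, CapEff, CapBnd, CapAmb)
    of the current process from its /proc status file."""
    caps = {}
    with open(path) as f:
        for line in f:
            if line.startswith("Cap"):
                name, _, value = line.partition(":")
                # values are hexadecimal capability bitmasks
                caps[name] = int(value.strip(), 16)
    return caps

if __name__ == "__main__":
    for name, value in read_caps().items():
        print(f"{name}: {value:#018x}")
```

Run inside the container in question; a non-zero `CapEff` means some capability survived the drop.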
@captncraig @pzmi-f3 I am going to try it today.
@captncraig @pzmi-f3 There is some delay; I have to add the image repository for alloy-dev to the allowlist first.
I am currently on PTO. I will proceed with this next week.
I have the same problem (as far as I know) on OpenShift, where we are also running with the above-mentioned
Hey all,

This seems to work now, without any Security Context Constraint shenanigans or special rights for the

I believe the problem in the 'latest' build I used was that the below was set within the Docker build:
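For context, this is my understanding of the mechanism (an assumption on my part, not confirmed by the maintainers): if an image's Docker build stamps file capabilities onto the binary, for example something like

```dockerfile
# Hypothetical build-step fragment: granting a file capability to the binary.
# A binary carrying file capabilities can fail execve() with
# "operation not permitted" when the container's bounding set has
# dropped the capability the file requires.
RUN setcap 'cap_net_bind_service=+ep' /bin/alloy
```

then running that image with `capabilities: { drop: [ALL] }` leaves the kernel unable to grant the file capability at exec time, which would surface as exactly the `exec /bin/alloy: operation not permitted` error in this issue's title. Removing the `setcap` step from the build would make the binary start cleanly with all capabilities dropped.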
Two Questions:
Thanks in advance! Below is more information on my setup, in order to (try to) provide complete information.

My Setup

My Helm chart/Kustomize setup (via ArgoCD), to give you an idea of what I set up as parameters:
Logs from the Alloy container within the pod:
Logs from the config-reloader container within the pod:
v1.1.0 is already released, and it fixed the issue on OpenShift for me (using the anyuid SCC to set the right UID).
Great! @debovema Thanks for the heads-up!
Confirmed: works as of v1.1.0.
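For anyone landing here later, a minimal sketch of pinning the fixed image while keeping the restrictive container security context (key names assumed from the grafana/alloy Helm chart; verify against your chart version):

```yaml
alloy:
  securityContext:
    readOnlyRootFilesystem: true
    allowPrivilegeEscalation: false
    capabilities:
      drop:
        - ALL
image:
  repository: grafana/alloy
  tag: v1.1.0   # first release confirmed to start with all capabilities dropped
```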
@benoitschipper @debovema @pzmi-f3 @captncraig |
What's wrong?

Grafana Alloy needs to be deployable to an EKS cluster with enforced `podSecurityContext` and container `securityContext` policies. Our policy mandates that the container operates as a non-root user (`runAsNonRoot: true`) with all capabilities dropped; both the pod-level `podSecurityContext` and the container `securityContext` enforce this.

Deploying Grafana Alloy to EKS with a `podSecurityContext` that specifies `runAsUser` results in the container failing to start. The error logged is `exec /bin/alloy: operation not permitted`. In our organization, running containers as root is not permitted, and this is enforced by Kyverno policies.
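For concreteness, a minimal sketch of the kind of security settings our policies require (illustrative; the user/group IDs match the chart values shared earlier in this thread):

```yaml
# Pod-level settings (podSecurityContext in the chart values)
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 2000
# Container-level settings (securityContext in the chart values)
securityContext:
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
```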
Steps to reproduce

To reproduce, install Grafana Alloy using the specified `podSecurityContext`, while ensuring that no Custom Resource Definitions (CRDs), Custom Resources (CRs), ClusterRoleBindings (CRBs), or Role-Based Access Control (RBAC) configurations are deployed, as they are also not allowed.

System information

Deployment on AWS EKS, with Kyverno policies enforcing the organization's standards.
Software version
v1.0.0
Configuration
Logs