Unable to connect opa-docker-authz.sock #51

ramapalani · 2020-09-16T22:05:14Z

I'm trying to run the OPA Docker plugin as part of a DIND (docker-in-docker) DaemonSet.
I followed the steps in this tutorial: https://www.openpolicyagent.org/docs/latest/docker-authorization/#goals

The only rule in the Rego file prevents privileged containers. This works as expected in a pre-prod environment. When we run it in the prod environment, it works as expected for about an hour; after that, the OPA plugin is not reachable. The Docker logs contain messages like these:

time="2020-09-06T19:08:06.723350267Z" level=warning msg="Unable to connect to plugin: /run/docker/plugins/e680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8/opa-docker-authz.sock/AuthZPlugin.AuthZReq: Post http://%2Frun%2Fdocker%2Fplugins%2Fe680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8%2Fopa-docker-authz.sock/AuthZPlugin.AuthZReq: dial unix /run/docker/plugins/e680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8/opa-docker-authz.sock: connect: connection refused, retrying in 1s"

time="2020-09-06T19:08:21.759791345Z" level=error msg="Handler for POST /v1.39/images/create returned error: plugin openpolicyagent/opa-docker-authz-v2:0.7 failed with error: Post http://%2Frun%2Fdocker%2Fplugins%2Fe680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8%2Fopa-docker-authz.sock/AuthZPlugin.AuthZReq: dial unix /run/docker/plugins/e680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8/opa-docker-authz.sock: connect: connection refused"

Daemonset definition:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dind-daemonset
spec:
...
  template:
    spec:
      containers:
      - name: dind
        image: docker:18.09.5-dind
        command: ['sh', '-c', 'if [ -d /var/run/dind/docker.sock ]; then rm -rf /var/run/dind/docker.sock;fi && /usr/local/bin/dockerd-entrypoint.sh dockerd --storage-driver=overlay2 -H unix:///var/run/dind/docker.sock']
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh", "-c", "mkdir -p /etc/docker/policies && cp /etc/docker/opa-policy/authz.rego /etc/docker/policies && docker -H unix:///var/run/dind/docker.sock plugin install --grant-all-permissions openpolicyagent/opa-docker-authz-v2:0.7 opa-args=\"-policy-file /opa/policies/authz.rego\" && echo '{ \"authorization-plugins\": [\"openpolicyagent/opa-docker-authz-v2:0.7\"] }' > /etc/docker/daemon.json && kill -HUP $(pidof dockerd)"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: varlibdocker
          mountPath: /var/lib/docker
        - name: rundind
          mountPath: /var/run/dind
        - name: opa-policy
          mountPath: /etc/docker/opa-policy
...
      volumes:
      - name: varlibdocker
        emptyDir: {}
      - name: opa-policy
        configMap: 
          name: docker-opa-policy
      - name: rundind
        hostPath:
          path: /var/run/dind/

authz.rego:

apiVersion: v1
kind: ConfigMap
metadata:
  name: docker-opa-policy
data:
  authz.rego: |-
    package docker.authz

    default allow = false

    allow {
        not input.Body.HostConfig.Privileged
    }

ashutosh-narkar · 2020-09-16T22:15:40Z

Are there any other logs? Is there any more information from running docker plugin inspect?

ashutosh-narkar · 2020-09-16T22:16:49Z

Also, what's different between the pre-prod and prod environments?

ramapalani · 2020-09-17T04:08:05Z

They are the same, except that prod has more traffic.

ashutosh-narkar · 2020-09-17T05:29:46Z

In that case, have you tried allotting more resources to check whether the system is being exhausted?

ramapalani · 2020-09-17T18:30:30Z

Ashutosh,

You were right, both CPU and memory (RAM) in the pod spiked way above the requested amount.

Resources in pod spec:

resources:
  requests:
    memory: 4G
    cpu: 1

Actual consumption:

![image](https://user-images.githubusercontent.com/7064234/93518039-2462b480-f8e1-11ea-94f1-4fb6a5f2b1c5.png)

We can increase the resources to a higher level, but not to 25G, and memory keeps on increasing. I suspect a memory leak or some other issue. Could you help me debug this?

Thanks,
Rama

ramapalani · 2020-09-17T19:28:33Z

Actual consumption screenshot

ashutosh-narkar · 2020-09-18T00:49:11Z

Memory usage typically depends on the size of the data and policy that you load into OPA. This page provides more details on resource utilization. Do you have an estimate of these values?

ramapalani · 2020-09-18T02:03:33Z

This is the policy; it evaluates only one field.

    package docker.authz

    default allow = false

    allow {
        not input.Body.HostConfig.Privileged
    }

I don't control the data; Docker sends the input data to the OPA plugin.

Here is a sample input with Body as null:

time="2020-09-05T19:40:15Z" level=error msg="2020/09/05 19:40:15 {\"config_hash\":\"f418bd1c862c2178ff5c93054aa8c8adae2ddae3aa90a68e4011c07d396839d4\",\"decision_id\":\"78c32ebd-a216-4ea1-a971-acbc879df361\",\"input\":{\"AuthMethod\":\"\",\"Body\":null,\"Headers\":{\"Accept-Encoding\":\"gzip\",\"Connection\":\"close\",\"User-Agent\":\"go-dockerclient\"},\"Method\":\"GET\",\"Path\":\"/images/sha256:xxxxxxcc040e350e848dd39bf1cabc09653adb7ede6f050cbd16a7503de6/json\",\"User\":\"\"},\"labels\":{\"app\":\"opa-docker-authz\",\"id\":\"b6b53359-69d3-45e8-acbf-b7258ea848cf\",\"opa_version\":\"v0.18.0\",\"plugin_version\":\"0.7\"},\"result\":true,\"timestamp\":\"2020-09-05T19:40:15.136801273Z\"}" plugin=e680e3fff81e36d08a68f15256251be43a41a9a090f37f1c353f8d5fb95465a8

When Body is not null, the data is around 6 KB.

In 60 minutes, the OPA Docker plugin processed around 2000 requests.

Is there a way for me to control the size of the data?
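
For scale, a back-of-the-envelope estimate from the figures quoted in this thread (about 2000 requests per hour, about 6 KB per non-null body, 25 GB observed usage) can be sketched as follows. It assumes, as an upper bound, that every request carries a full 6 KB body, and it uses decimal units; the variable names are illustrative only.

```shell
#!/bin/sh
# Rough estimate only; all inputs are the approximate figures quoted in this thread.
requests_per_hour=2000   # "around 2000 requests" in 60 minutes
body_kb=6                # "around 6 KB" when Body is not null

# Upper bound: assume every request carries a full 6 KB body.
total_kb=$((requests_per_hour * body_kb))
echo "input per hour: ${total_kb} KB"

# Compare with the observed 25 GB (decimal units: 1 GB = 1000000 KB).
observed_kb=$((25 * 1000 * 1000))
echo "observed / input: $((observed_kb / total_kb))x"
```

Under these assumptions the request bodies add up to roughly 12 MB per hour, so the input volume alone would not seem to account for memory growth on the order of gigabytes.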

ramapalani · 2020-09-20T08:49:40Z

@ashutosh-narkar Can you suggest a way to reduce the data, or another way to avoid this 'huge' memory consumption by the OPA Docker plugin?

ashutosh-narkar · 2020-09-21T16:22:25Z

The data seems pretty small. Have you documented OPA's memory usage over time? And how much memory have you allocated so far?
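
One way to record memory usage over time is to sample the resident set size of the plugin process from /proc. The following is a minimal sketch, assuming a Linux host where /proc/PID/status is readable; the function names rss_kb and sample_rss are hypothetical helpers, not part of OPA or Docker.

```shell
#!/bin/sh
# rss_kb PID: print the resident set size (VmRSS) of PID in kB, read from /proc.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# sample_rss PID COUNT INTERVAL: print COUNT lines of "epoch_seconds rss_kb",
# waiting INTERVAL seconds between samples.
sample_rss() {
    pid=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$(date +%s) $(rss_kb "$pid")"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `sample_rss "$(pidof opa-docker-authz)" 60 60 >> /tmp/opa-rss.log` would take one sample per minute for an hour, assuming pidof resolves the plugin process by that name inside the DIND container.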

ramapalani · 2020-09-21T18:08:21Z

The resource request is 4 GB, but the actual usage went up to 25 GB, and then the connection to the socket was lost. So we had to start Docker DIND without the OPA plugin to get it working again.

ashutosh-narkar · 2020-09-21T19:04:23Z

@ramapalani Can you provide an example of how to reproduce the issue? Any scripts that you have to simulate the traffic, etc., would be helpful.
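
A traffic simulation could be as simple as repeating a Docker API call and counting failures. The sketch below is only an illustration: run_load is a hypothetical helper, and the request mix in the commented example is an assumption, not the actual production traffic.

```shell
#!/bin/sh
# run_load COUNT CMD [ARGS...]: run CMD COUNT times and print "ok=<n> failed=<m>".
run_load() {
    count=$1
    shift
    ok=0
    failed=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # Discard the command's output; only its exit status is counted.
        if "$@" >/dev/null 2>&1; then
            ok=$((ok + 1))
        else
            failed=$((failed + 1))
        fi
        i=$((i + 1))
    done
    echo "ok=${ok} failed=${failed}"
}

# Example (assumed request mix): 2000 API calls against the DIND daemon.
# run_load 2000 docker -H unix:///var/run/dind/docker.sock version
```

Each `docker version` call sends a request to the daemon, which should pass through the authorization plugin, so a rising failed count would indicate the point at which the plugin stops responding.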

ramapalani · 2020-09-21T20:07:45Z

I'll try to reproduce this in our pre-prod environment and share it with you.

ramapalani · 2020-09-22T21:34:28Z

@ashutosh-narkar I'm trying to reproduce this in the pre-prod environment. As part of this effort, I check every second whether the socket is open, using a simple shell script. When a check fails, the script also collects the open files and running processes.

Though I can't reproduce the issue exactly as in the prod environment, I see that the OPA socket is often not listening. Here is one instance of the failure. Often the next check works fine, but failures do happen frequently.

Test script

#!/bin/sh
# Probe the OPA plugin socket once per second; on failure, log open files and processes.

# Install socat if it is missing (the DIND image is Alpine-based).
if ! command -v socat >/dev/null 2>&1; then apk add socat; fi

testsocket()
{
    # The socket lives under a per-plugin ID directory, so locate it by name.
    socket=$(find /run/docker/plugins/ -name "*.sock" | grep opa | head -n 1)
    socat -u OPEN:/dev/null "UNIX-CONNECT:${socket}"
    EXIT_CODE=$?
    if [ "${EXIT_CODE}" -eq 0 ];
    then
        echo "$(date): Connection to Socket successful"
    else
        echo "$(date): Connection to Socket FAILED"
        echo "Open files"
        lsof | grep opa
        echo "Running processes"
        ps -ef
    fi
}

output_file=/tmp/testsocket.log
# Record the plugin state once before starting the probe loop.
set -x
docker -H unix:///var/run/dind/docker.sock plugin ls | tee "${output_file}"
docker -H unix:///var/run/dind/docker.sock plugin inspect openpolicyagent/opa-docker-authz-v2:0.7 | tee -a "${output_file}"
set +x
while true
do
    testsocket | tee -a "${output_file}"
    sleep 1
done

Failure

Tue Sep 22 21:28:33 UTC 2020: Connection to Socket successful
Tue Sep 22 21:28:34 UTC 2020: Connection to Socket successful
2020/09/22 21:28:35 socat[28132] E exiting on signal 11
Tue Sep 22 21:28:35 UTC 2020: Connection to Socket FAILED
Open files
219	/opa-docker-authz	/dev/null
219	/opa-docker-authz	pipe:[208415330]
219	/opa-docker-authz	pipe:[208415331]
219	/opa-docker-authz	anon_inode:[eventpoll]
219	/opa-docker-authz	pipe:[208410360]
219	/opa-docker-authz	pipe:[208410360]
219	/opa-docker-authz	socket:[208410361]
219	/opa-docker-authz	socket:[208482551]
Running processes
PID   USER     TIME  COMMAND
    1 root     13:45 dockerd --storage-driver=overlay2 -H unix:///var/run/dind/docker.sock
   24 root      0:10 containerd --config /var/run/docker/containerd/containerd.toml --log-level info
  201 root      0:00 containerd-shim -namespace plugins.moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/plugins.moby/2946790b93416011fcf7eed801b307afbea481a8d3992b6a538e91ede4bf96e8 -address /var/run/docker/containerd/containerd.sock -containerd-binary /usr/local/bin/containerd -runtime-root /run/docker/plugins/runtime-root
  219 root      0:11 /opa-docker-authz -policy-file /opa/policies/authz.rego
 5276 root      0:00 sh
 9215 root      0:00 sh
19177 root      0:00 sh
21420 root      0:00 {test-socket.sh} /bin/sh ./test-socket.sh
22230 root      0:00 tail -f /tmp/testsocket.log
28127 root      0:00 {test-socket.sh} /bin/sh ./test-socket.sh
28128 root      0:00 tee -a /tmp/testsocket.log
28136 root      0:00 ps -ef
Tue Sep 22 21:28:36 UTC 2020: Connection to Socket successful
Tue Sep 22 21:28:37 UTC 2020: Connection to Socket successful

Full log file is attached: testsocket.log
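
To gauge how often the probe fails, the log can be summarized by counting the two status messages the test script prints. This is a small sketch; summarize_log is a hypothetical helper, and it assumes the log uses exactly the message strings shown above.

```shell
#!/bin/sh
# summarize_log FILE: count successful and failed probes in a testsocket.log-style file.
summarize_log() {
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
    ok=$(grep -c 'Connection to Socket successful' "$1" || true)
    failed=$(grep -c 'Connection to Socket FAILED' "$1" || true)
    echo "successful=${ok} failed=${failed}"
}
```

Running `summarize_log /tmp/testsocket.log` prints one line with both counts, which makes it easier to compare failure rates between environments.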

ashutosh-narkar · 2020-09-29T07:31:17Z

Hmm, you're getting a segmentation fault. What system are you running this on?

ramapalani · 2020-09-29T15:16:29Z

We run a Docker DIND (docker-in-docker) container as a Kubernetes DaemonSet. The image is docker:18.09.5-dind. The OPA Docker plugin is installed into this instance of Docker.

ramapalani · 2020-10-05T17:20:22Z

@ashutosh-narkar I couldn't reproduce this issue in the pre-prod environment, but we encounter it consistently in the production environment (with higher traffic) after a short period.

So I created a custom plugin to prevent privileged container creation, and that works well.

ashutosh-narkar · 2020-10-05T17:47:24Z

That's great! Is that custom plugin using OPA?

ramapalani · 2020-10-05T19:30:58Z

No, I created a fresh Docker authorization plugin, totally separate from OPA.
