Timeout in case of one-shot container #19

Open
marden opened this issue Sep 9, 2022 · 9 comments

@marden

marden commented Sep 9, 2022

I have 7 services in my docker-compose.swarm.yml. Everything runs as a daemon except the storage-minio-client service.
The storage-minio-client acts like a one-shot container: it executes the command defined by its entrypoint and exits.

storage-minio-client:
    image: ${CI_REGISTRY}/service/storage/minio:client
    networks:
      - local
    deploy:
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.labels.worker-stateless == 1

The image was built from this Dockerfile:

FROM bitnami/minio-client

COPY setup-buckets.sh /tmp/setup-buckets.sh

ENTRYPOINT ["/tmp/setup-buckets.sh"]

The setup-buckets.sh creates buckets and exits with code 0.

This kind of container causes docker-stack-wait to time out, even though the services were deployed successfully.
Portainer also shows a complete status for the storage-minio-client service.

Logs from the script:

13:19:16  Status: Downloaded newer image for sudobmitch/docker-stack-wait:latest
13:19:17  Service storage-minio3 state: deployed
13:19:17  Service storage-minio4 state: deployed
13:19:17  Service storage-minio1 state: deployed
13:19:17  Service storage-minio-client state: replicating 0/1
13:19:17  Service storage-minio2 state: deployed
13:19:17  Service storage-nginx state: deployed
13:19:17  Service storage-php-web state: deployed
14:19:24  Error: Timeout exceeded

It looks like the script expects all services to be in the running state.

P.S. I found a workaround: I added sleep 300 to my bash script, which keeps the storage-minio-client container running long enough for docker-stack-wait to finish successfully.
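
A minimal sketch of that workaround, under stated assumptions: `create_buckets` is a placeholder for whatever the real setup-buckets.sh does, and `HOLD_SECONDS` is a hypothetical variable introduced here so the delay can be tuned. Neither name comes from the original script.

```shell
#!/bin/sh
# Hypothetical stand-in for the real bucket-creation commands.
create_buckets() {
  echo "buckets created"
}

# Run the one-shot work, then keep the container alive so the task
# is still running while docker-stack-wait polls the service.
main() {
  create_buckets || return 1
  sleep "${HOLD_SECONDS:-300}"
}

# In the entrypoint script, the last line would be a call to: main
```

The delay only hides the problem: if the deployment takes longer than the sleep, the task exits and the wait can still time out.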

P.P.S. A similar issue was reported earlier.

@sudo-bmitch
Owner

sudo-bmitch commented Sep 10, 2022

I don't know a good way to differentiate between an expected exit, an unexpected exit, and a container that hasn't started yet. PRs are welcome, but a change should gracefully handle all of these scenarios.
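
One possible starting point for telling those cases apart, sketched here rather than taken from docker-stack-wait: classify a task by the first word of its current state, as printed by `docker service ps --format '{{.CurrentState}}'`. The function name `classify_task` and the category labels are made up for illustration, and treating `complete` as an expected exit is an assumption that only holds for services meant to run once.

```shell
#!/bin/sh
# Classify one Swarm task from its current-state text, for example the
# value printed by:
#   docker service ps --format '{{.CurrentState}}' "$service"
# which typically looks like "Complete 5 minutes ago".
classify_task() {
  # Keep only the first word and lowercase it.
  state=$(printf '%s\n' "$1" | awk '{print tolower($1)}')
  case "$state" in
    complete) echo "expected-exit" ;;        # assumption: exit code 0 was intended
    failed|rejected) echo "unexpected-exit" ;;
    new|pending|assigned|accepted|ready|preparing|starting) echo "not-started" ;;
    running) echo "running" ;;
    *) echo "unknown" ;;                     # shutdown, orphaned, remove, or unparsed text
  esac
}
```

A daemon that exits with code 0 would also show up as complete, so a real implementation would still need to know which services are one-shot.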

Ideally docker stack deploy would support mode: replicated-job, but that hasn't happened yet. Until then, filtering out the services you aren't interested in tracking may be a better option. E.g.: if you set the label wait-service: "true" on your services, you can then run

docker-stack-wait.sh -f label=wait-service=true $stack_name

@marden
Author

marden commented Sep 11, 2022

I know about the filter option, but it's not suitable in my case. docker-stack-wait is used in a CI/CD process that handles universal preparation for dozens of services (and the number is growing), and 99% of them would need a label to make the script wait. Every developer (especially a newcomer) would have to remember to add it.

I think a better option is to introduce a new argument, ignore, so the script can skip services that carry a given label. In that case all I need to do is set a label on a few specific services.

@marden
Author

marden commented Nov 9, 2022

Any updates on this?

@sudo-bmitch
Owner

None here. I haven't seen any PRs that implement this while also differentiating between an expected exit, an unexpected exit, and a container that hasn't started yet.

@marden
Author

marden commented Nov 11, 2022

Well, it's hard to reproduce the exact case.

As I mentioned before, a new argument (e.g. -ignore or -skip) that is the opposite of -f would help. It tells the script not to wait for services that are marked with a label.

Here is a concrete situation: I have 7 services and I need to wait for 6 of them.
Compare 2 configs:

  1. docker-stack-wait.sh -f label=wait=true
services:
  srv-1:
      deploy:
        labels:
            wait: "true"
  srv-2:
      deploy:
        labels:
            wait: "true"
  srv-3:
      # don't need to wait for
  srv-4:
      deploy:
        labels:
            wait: "true"
  srv-5:
      deploy:
        labels:
            wait: "true"
  srv-6:
      deploy:
        labels:
            wait: "true"
  srv-7:
      deploy:
        labels:
            wait: "true"

Requires 6 labels.

  2. docker-stack-wait.sh -skip label=wait=false
services:
  srv-1:
      ...
  srv-2:
      ...
  srv-3:
      deploy:
        labels:
            wait: "false"  # mark only this service
  srv-4:
      ...
  srv-5:
      ...
  srv-6:
      ...
  srv-7:
      ...

Requires only 1 label.

Obviously, the second one is much easier to read and maintain.
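
Until such an option exists, the same effect could be approximated outside the script by subtracting the labeled services from the full list. This is only a sketch: `services_to_wait_for` is a hypothetical helper, and how the resulting names are handed back to docker-stack-wait is left open.

```shell
#!/bin/sh
# Print every name from the first list that does not appear in the second.
# Both arguments are newline-separated lists of service names, which could
# be gathered with commands such as:
#   all=$(docker stack services --format '{{.Name}}' "$stack")
#   skip=$(docker stack services --filter label=wait=false --format '{{.Name}}' "$stack")
services_to_wait_for() {
  all=$1
  skip=$2
  if [ -z "$skip" ]; then
    # Nothing is marked to be skipped, so wait for everything.
    printf '%s\n' "$all"
    return 0
  fi
  # -x matches whole lines only, so skipping srv-1 does not drop srv-10.
  # grep exits non-zero when every service is skipped; that is not an error here.
  printf '%s\n' "$all" | grep -vxF -e "$skip" || true
}
```

With the 7 services above and only srv-3 labeled wait: "false", the helper prints the other 6 names.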

adriannieto-attechnest added a commit to adriannieto-attechnest/docker-stack-wait that referenced this issue Mar 23, 2023
@pschichtel

@sudo-bmitch replicated-job is supported now by docker stack deploy, but your script still runs into the timeout. So can I assume these jobs are still not supported?

@sudo-bmitch
Owner

@pschichtel see my comment above #19 (comment)

@pschichtel

ah sorry I missed that. In that case I might work on this soon-ish. I'm currently ignoring the affected services as suggested, as it still works out fine in this setup timing-wise, but I'd prefer a proper solution.
