[mgbl] support for http timeout? #87

Open
ARolek opened this issue Jul 26, 2019 · 2 comments

Comments

@ARolek
Member

ARolek commented Jul 26, 2019

We're dealing with a situation where a tile provider takes a very long time to respond (tens of minutes) and does not close the connection, so atlante hangs and never actually fails. The queue then kills the job after 30 minutes, and no failure is reported from the worker. A couple of ideas:

  1. Can an HTTP timeout be configured for tile requests?
  2. If an atlante worker is killed, can it fire a fail event to the coordinator before it shuts down so the error is reported?
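
For idea 1, here is a rough sketch of what a request timeout could look like with Go's standard `net/http` client. This is not atlante's actual code: `newTileClient`, `fetchTile`, and `demoTimeout` are hypothetical names, and the slow tile provider is simulated with a local `httptest` server.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// newTileClient returns an HTTP client that abandons a request once timeout
// has elapsed. http.Client.Timeout covers connecting, waiting for headers,
// and reading the response body. (Hypothetical helper, not atlante's API.)
func newTileClient(timeout time.Duration) *http.Client {
	return &http.Client{Timeout: timeout}
}

// fetchTile requests url and discards the response; it only reports errors.
func fetchTile(c *http.Client, url string) error {
	resp, err := c.Get(url)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

// isTimeout reports whether err was caused by a timeout.
func isTimeout(err error) bool {
	var nerr net.Error
	return errors.As(err, &nerr) && nerr.Timeout()
}

// demoTimeout starts a local server that stalls for 500ms (a stand-in for the
// slow tile provider) and calls it with a 50ms client timeout.
// It returns true if the request failed with a timeout error.
func demoTimeout() bool {
	slow := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(500 * time.Millisecond):
		case <-r.Context().Done(): // client gave up
		}
	})
	srv := httptest.NewServer(slow)
	defer srv.Close()

	err := fetchTile(newTileClient(50*time.Millisecond), srv.URL)
	return isTimeout(err)
}

func main() {
	if demoTimeout() {
		fmt.Println("tile request timed out")
	} else {
		fmt.Println("tile request did not time out")
	}
}
```

With something like this in place, a hung provider would surface as an ordinary error that the worker can report, instead of blocking until the queue kills the job.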
@gdey
Member

gdey commented Jul 29, 2019

Do you know what type of kill is happening? Is it a hard kill (I assume it is) or a soft one? If it's a soft kill, I can catch it and try sending the message. But if it's a hard kill, then no. It could also be that it does both, in which case we would sometimes be able to send it. I think it makes sense to do this regardless.

@ARolek
Member Author

ARolek commented Jul 29, 2019

@gdey looks like it does both: https://docs.aws.amazon.com/batch/latest/userguide/job_timeouts.html

You specify an attemptDurationSeconds parameter, which must be at least 60 seconds, either in your job definition, or when you submit the job. When this number of seconds has passed following the job attempt's startedAt timestamp, AWS Batch terminates the job. On the compute resource, your job's container receives a SIGTERM signal to give your application a chance to shut down gracefully. If the container is still running after 30 seconds, a SIGKILL signal is sent to forcefully shut down the container.
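
Given the SIGTERM-then-SIGKILL behavior quoted above, a worker has up to 30 seconds to report the failure. A minimal sketch of catching the soft kill in Go, assuming a reporting hook exists: `reportFunc`, `runJob`, and `demoCancelReports` are hypothetical names and not part of atlante, and the fail event is just printed here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// reportFunc stands in for whatever sends a fail event to the coordinator.
// (Hypothetical; the real reporting call is not shown in this thread.)
type reportFunc func(ctx context.Context, reason string) error

// runJob runs job until it returns or ctx is cancelled (for example by SIGTERM).
// On cancellation it calls report with its own deadline of grace, which should
// stay well under the 30 seconds before SIGKILL arrives.
func runJob(ctx context.Context, job func(context.Context) error, report reportFunc, grace time.Duration) error {
	done := make(chan error, 1)
	go func() { done <- job(ctx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		rctx, cancel := context.WithTimeout(context.Background(), grace)
		defer cancel()
		if err := report(rctx, "worker terminated before the job finished"); err != nil {
			return fmt.Errorf("reporting failure: %w", err)
		}
		return ctx.Err()
	}
}

// demoCancelReports simulates a soft kill by cancelling the context while a
// job is hung, and returns true if the fail event was reported.
func demoCancelReports() bool {
	ctx, cancel := context.WithCancel(context.Background())
	release := make(chan struct{})
	defer close(release) // let the hung job goroutine exit afterwards

	reported := false
	report := func(ctx context.Context, reason string) error { reported = true; return nil }
	hung := func(ctx context.Context) error { <-release; return nil } // ignores ctx, like a stuck request

	cancel() // stand-in for SIGTERM
	err := runJob(ctx, hung, report, time.Second)
	return reported && errors.Is(err, context.Canceled)
}

func main() {
	// ctx is cancelled when the process receives SIGTERM or an interrupt.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, os.Interrupt)
	defer stop()

	job := func(ctx context.Context) error {
		time.Sleep(100 * time.Millisecond) // stand-in for rendering work
		return nil
	}
	report := func(ctx context.Context, reason string) error {
		fmt.Println("fail event:", reason)
		return nil
	}
	if err := runJob(ctx, job, report, 10*time.Second); err != nil {
		fmt.Println("job ended with error:", err)
		return
	}
	fmt.Println("job finished")
}
```

This only helps for the soft kill. If the process receives SIGKILL directly, nothing can be caught, so the coordinator would still need its own way to notice a worker that disappeared.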

@gdey gdey mentioned this issue Aug 1, 2019