ECS Deployment Circuit Breaker Support #185
Comments
Digging around a bit more, it seems like it considers the deployment successful once the rollback removes the attempted deployment from ecs-deploy. Not sure where that lives in the code, but a possible fix might be logic that says "if AWS returns a deployment does not exist / is deleted response, consider the deployment failed".
@fabfuel any thoughts on the above? I can provide more context if it's helpful.
Hi Brett, sorry for the late response. In March I added support for the ECS Circuit Breaker and added parsing of the rollout state. This means ecs-deploy should recognize if the deployment was rolled back. Could you share some more details, especially about the timing? How long did you instruct ecs-deploy to wait for the deployment to finish? Best
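For illustration, the two ideas mentioned so far (treating a vanished deployment as failed, and reading the rollout state) could be combined roughly as below. This is a minimal sketch, not ecs-deploy's actual code: the function name `classify_deployment` and its return values are hypothetical, and it assumes `service` is one entry of the `services` list returned by the ECS `DescribeServices` API, where each deployment carries `id` and `rolloutState` fields.

```python
from typing import Any, Dict


def classify_deployment(service: Dict[str, Any], deployment_id: str) -> str:
    """Classify one deployment as 'succeeded', 'failed' or 'in_progress'.

    Hypothetical helper: `service` is assumed to be a single service
    description from ECS DescribeServices; `deployment_id` is the id of
    the deployment that was started.
    """
    deployments = service.get("deployments", [])
    match = next((d for d in deployments if d.get("id") == deployment_id), None)

    # Assumption from the thread: once a rollback has removed the attempted
    # deployment, it no longer shows up, so treat "not found" as a failure.
    if match is None:
        return "failed"

    state = match.get("rolloutState")
    if state == "FAILED":
        # The circuit breaker marked this deployment as failed.
        return "failed"
    if state == "COMPLETED":
        return "succeeded"

    # IN_PROGRESS, or no rollout state reported yet: keep waiting.
    return "in_progress"
```

With boto3, the `service` argument would presumably come from `describe_services(cluster=..., services=[...])["services"][0]`. A caller polling with a timeout could stop as soon as the result is no longer `"in_progress"`.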
Hi Brett, thanks for all the details! I'm trying to reproduce it myself and will keep you updated. Best
Hey Fabian! Appreciate you digging into this! The issue was that the container was missing a required env var, which caused it to fail on start-up. However, the container isn't exiting properly, so eventually it just times out and never passes ALB healthchecks. This causes the deployment to continue recycling containers until the circuit breaker is tripped. For added color, when I tested this with circuit breaker turned on, eventually ecs deploy would time out (since the deploy was never successful). If I turned on the --rollback flag, it would successfully roll back. If I didn't, the deploy would continue indefinitely, spinning up new containers and then recycling them after the ALB healthcheck fails. Let me know if any other info would be helpful! Brett
Hi @fabfuel! I hope you had a great New Year! I was wondering if there were any updates on the above? We'd like to enable circuit breaker support, but we're hitting a bit of a snag because of the issue described above. Thanks! Brett
Hi Brett, thanks a lot, I also wish you a happy and healthy new year! I still need to dig into this a bit more and double-check whether it might be related to this ECS behavior: Here it is described, with a screenshot, how … In your case, what happened to the container due to the missing env var? Did the app crash or exit with exit code 1 or something similar? Thanks for the details and sorry for moving slow, busy times 😅 Best
Thanks for the details. I will look into the Python error. Could you share your task definition (JSON)? Thanks
Hi @fabfuel, I completely missed your message. Unfortunately, this specific task definition was deleted. Is there anything else I can do to help? I'll try to replicate the issue on our side as well.
Is there current support for tracking when a deployment fails and the ECS deployment breaker rolls it back? I just tried forcefully failing a deployment, and while ECS correctly triggered the circuit breaker and rolled back, ecs deploy considered it as a successful deployment.
Not sure if this is a bug or working as intended.