You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
First, thanks for the great mesos and marathon tooling -- it's all been extremely solid and reliable!
I am using marathon-consul along with haproxy-consul for load balancing.
As soon as a task is running, it's added to consul and thus the haproxy config. If an application has a non-trivial start up time, I will start to see 503s. That is, tasks are added to the LB before they are reporting healthy.
I've been trying to figure out the best way to solve this. The app definition in consul does have health check info, so, theoretically I could translate the marathon checks to haproxy checks in my haproxy-consul template. Practically though the marathon checks could be more complex than haproxy can express (e.g. some of the apps are using "COMMAND" marathon health checks). It also seems non-trivial to be able to do the translation in the template, and, there might be a chance haproxy and marathon don't agree.
My current thinking is that it makes sense to leave the health checks evaluated by marathon, and include this information in the KV store (perhaps /<app>/<task>/healthCheck ?)
Then in the template I can check that tasks where their app has a health check is defined is only rendered if the task's health check is passing (i.e. not unknown or failing).
I believe this can be done by adding support for the following event types from the event bus:
add_health_check_event
remove_health_check_event
failed_health_check_event
health_status_changed_event
and including this new information in the KV.
I am happy to submit a PR for this, and would love to get any feedback on this approach, if there's a better way, etc.
thanks!
The text was updated successfully, but these errors were encountered:
I've simplified this a bit, by including a HealthCheckResults field directly into the task (already reported by marathon for GET /v2/apps/<app>/tasks), and updating this in the KV when the health_status_changed_event event is fired.
We are using this in production now and it's working great. Really helps out with the lag time between new tasks being added and actually being alive and healthy. 👍
I recently updated to this version to get rid of 503's, It works great if the healthcheck is http or tcp.
Though if healthcheck is based on command ( say curl), Marathon actually shows it as healthy but consul json has "healthCheckResults" set to null. So now Haproxy fails to pick it up based on healthy value ( "healthy":false).
First, thanks for the great mesos and marathon tooling -- it's all been extremely solid and reliable!
I am using marathon-consul along with haproxy-consul for load balancing.
As soon as a task is running, it's added to consul and thus the haproxy config. If an application has a non-trivial start up time, I will start to see 503s. That is, tasks are added to the LB before they are reporting healthy.
I've been trying to figure out the best way to solve this. The app definition in consul does have health check info, so, theoretically I could translate the marathon checks to haproxy checks in my haproxy-consul template. Practically though the marathon checks could be more complex than haproxy can express (e.g. some of the apps are using "COMMAND" marathon health checks). It also seems non-trivial to be able to do the translation in the template, and, there might be a chance haproxy and marathon don't agree.
My current thinking is that it makes sense to leave the health checks evaluated by marathon, and include this information in the KV store (perhaps
/<app>/<task>/healthCheck
?)Then in the template I can check that tasks where their app has a health check is defined is only rendered if the task's health check is passing (i.e. not unknown or failing).
I believe this can be done by adding support for the following event types from the event bus:
add_health_check_event
remove_health_check_event
failed_health_check_event
health_status_changed_event
and including this new information in the KV.
I am happy to submit a PR for this, and would love to get any feedback on this approach, if there's a better way, etc.
thanks!
The text was updated successfully, but these errors were encountered: