
Move Health Checks to a top level configuration object #603

Closed
helgi opened this issue Apr 5, 2016 · 16 comments · Fixed by #836
Comments

@helgi
Contributor

helgi commented Apr 5, 2016

Right now all health check logic lives as a normal config value, using a HEALTHCHECK_*-style env var naming convention. Given how important this element has become, and how much its functionality is likely to expand in the future, we should consider moving it to a top-level configuration object (like tags) and adding new CLI commands for it.

deis health:set url=/, deis health:set port=4999 and so on

A migration can be written fairly easily to move things within the config object to the new location.
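
For illustration, a minimal sketch of what such a data migration could look like, assuming a Django-style Config model with a values field for env vars and a new healthcheck field; the app label, field names and key handling here are assumptions, not the actual schema:

# Illustrative data migration: move HEALTHCHECK_* entries out of `values`
# into a hypothetical top-level `healthcheck` field.
from django.db import migrations

PREFIX = 'HEALTHCHECK_'

def move_healthchecks(apps, schema_editor):
    Config = apps.get_model('api', 'Config')  # app label assumed
    for config in Config.objects.all():
        values = config.values or {}
        moved = {k[len(PREFIX):].lower(): v
                 for k, v in values.items() if k.startswith(PREFIX)}
        if moved:
            config.healthcheck = moved
            config.values = {k: v for k, v in values.items()
                             if not k.startswith(PREFIX)}
            config.save()

class Migration(migrations.Migration):
    dependencies = [('api', '0001_initial')]  # placeholder dependency
    operations = [
        migrations.RunPython(move_healthchecks, migrations.RunPython.noop),
    ]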

While we are at it, there should be a discussion around separating liveness and readiness probes (while keeping the ability to set a global one that applies to both for a given application), and around how we want to architect the introduction of Exec and TCPSocket probes in addition to HTTPGet.

http://kubernetes.io/docs/user-guide/pod-states/#container-probes
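
For reference, the probe shapes described in those docs, written as the kind of dict payloads a controller might attach to a container spec; the field names follow the Kubernetes API, while the ports, paths and command are made-up examples:

# Kubernetes container probes: HTTPGet, TCPSocket and Exec variants,
# expressed as plain dicts a controller could send to the API.
liveness_probe = {
    'httpGet': {'path': '/healthz', 'port': 5000},
    'initialDelaySeconds': 30,
    'timeoutSeconds': 5,
}

readiness_probe = {
    'tcpSocket': {'port': 5000},
    'initialDelaySeconds': 5,
    'periodSeconds': 10,
}

exec_probe = {
    'exec': {'command': ['cat', '/tmp/healthy']},
    'initialDelaySeconds': 5,
}

container = {
    'name': 'web',
    'image': 'example/app',  # example image name
    'livenessProbe': liveness_probe,
    'readinessProbe': readiness_probe,
}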

@arschles
Member

arschles commented Apr 6, 2016

I'm generally 👍 on this. Would this change have to happen as part of a minor release (at least), since it breaks the UX of the deis CLI?

If I'm reading this right, deis config:set HEALTHCHECK_* ... wouldn't work as it does now. Same with the config API endpoint.

@helgi
Contributor Author

helgi commented Apr 6, 2016

The endpoint would not change much; as with tags, it would still be the /config endpoint, but the settings would be embedded under a health key instead of under values, or something similar.
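
As a purely illustrative sketch of that shape (the key names are hypothetical, not the actual API), the /config payload could go from health settings buried in values to a dedicated key:

# Hypothetical /config payload before and after the move; shape is
# illustrative only.
config_before = {
    'values': {
        'HEALTHCHECK_URL': '/healthz',
        'HEALTHCHECK_INITIAL_DELAY': '30',
        'DATABASE_URL': 'postgres://example',
    },
    'tags': {},
}

config_after = {
    'values': {'DATABASE_URL': 'postgres://example'},
    'tags': {},
    'healthcheck': {'path': '/healthz', 'port': 5000, 'initialDelaySeconds': 30},
}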

The CLI code would be hit the least by this change, but the way people use the CLI would have to change, yeah, so the UX would be different.

We could possibly detect HEALTHCHECK_* and direct people to use health instead. I wouldn't recommend a direct translation of the command for people if we add the other probe types.

@bacongobbler
Member

I'd be okay with this if we want to make application health checks a first-class object. With the current proposal the UI would stray from our current standard of deis <object>:<action>, since this is an object within an object.

Perhaps the following, which maps directly to Kubernetes liveness and readiness probes?

deis healthcheck:add liveness 8888 /healthz --initial-delay-seconds 15 --timeout-seconds 1
deis healthcheck:info -a myapp
=== myapp Liveness
Path: /healthz
Port: 8888
Initial Delay (seconds): 15
Timeout (seconds): 1
=== myapp Readiness

Eventually we could add new healthcheck arguments like liveness-exec or liveness-tcp which would perform TCP or exec healthchecks within the container.

@helgi
Contributor Author

helgi commented Apr 6, 2016

We have to store the value in the DB for recovery purposes.

How is it breaking the convention, though? We already do what I proposed with tags and limits, or are you talking about something else?

@slack
Member

slack commented Apr 6, 2016

At this stage of the stable game I'm fine with keeping "magic environment variables", though it's not ideal.

If we are to unify or describe "application configuration", what other hunks of data should live in this object?

  • resource limits
  • tags
  • environment variables / secrets (should they be separate)
  • health checks
  • attached services (via add-ons)

Edit: added secrets to env vars

@bacongobbler
Member

Resource limits, tags and environment variables are all currently part of the application config object. I agree that health checks and attached services should be part of the application's config as well, though not necessarily as a 2.0 thing. I'm just voicing how I'd prefer this feature to be rolled out. :)

We have to store the value in the DB for recovery purposes.

I agree with you there. I'm just concerned about the client UX. Sorry if that wasn't made clear. I just don't like deis config health:set and would rather have deis healthchecks:set, like the other application config before it (tags, limits).

@helgi
Contributor Author

helgi commented Apr 6, 2016

Oh yeah, I see now. It was my mistake: I meant to write deis when I wrote config in the commands above. It would be top-level like the others.

@helgi
Contributor Author

helgi commented Apr 6, 2016

@slack I think you are starting to describe a higher-level abstraction here, which could be its own ticket.

@arschles
Member

arschles commented Apr 6, 2016

Late to respond here: I was focusing on changing the CLI UX. The API endpoint could, in theory, get fancy and detect magic env vars, but at this point in the discussion it seems like an implementation detail...

@slack
Member

slack commented Apr 6, 2016

@helgi pretty much. What we need is the ability to dump/load all of this stuff holistically, and extend it over time.

All of this config, beyond env vars, needs to be dumped/loaded by folks using the platform. I'll open a second ticket for that broader work.

@kmala
Contributor

kmala commented Jun 14, 2016

I would like to store the health check data as part of the config object, because changing a health check parameter should create a new release, which continues to follow our model of build + config = release.
As for how the CLI should look, I have two ideas:

  1. Have handlers and health check probes as separate configuration, where our CLI looks like:
deis handlers:<action> <handler_name> <handler_type> <port/command> <flags>
  • action will be add, modify, delete or list
  • handler_name can be any unique name to identify the handler
  • handler_type will be exec, http or tcp
  • the handler will expect a port or a command depending on the handler type
  • flags will be all the non-mandatory fields, like path, host and scheme

Handlers are not bound to apps (like deis keys), which means they are stored as a separate object.

deis healthchecks:<action> <probe_type> <handler_name> <flags>
  • probe_type will be either liveness or readiness

Health checks are created for a specific app.
Advantages:

  • we can use the same handler for multiple apps.
  • we can reuse this model if and when we want to support lifecycle/container hooks
  • it will be two simpler commands rather than one large and confusing command

Disadvantages:

  • Adding a health check would be a two-step process instead of one
  2. Have handlers and health check probes in the same configuration, where our CLI looks like:
deis healthchecks:<action> <probe_type> <handler_type> <port/command> <flags>

The reason I want separate probe_type and handler_type on the command line, instead of #603 (comment), is that an app shouldn't have multiple handlers for the same probe (it should be one of http, tcp or exec), and I feel the duplicate cases during creation and update would be simpler to handle with the form I mentioned (see the sketch after this list).

Advantages:

  • single command to create a health check.

Disadvantages:

  • Implementation/maintenance of the code might be more difficult, as everything is in a single command
  • not modular enough
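
A rough sketch of how the data behind proposal 2 might be stored per app, keyed by probe type so only one handler can exist per probe; the key and field names are illustrative, not an actual schema:

# Hypothetical per-app healthcheck storage for proposal 2: one handler per
# probe type, keyed by probe name so duplicates cannot occur.
healthcheck = {
    'liveness': {
        'type': 'httpGet',          # one of httpGet, tcpSocket, exec
        'path': '/healthz',
        'port': 8888,
        'initialDelaySeconds': 15,
        'timeoutSeconds': 1,
    },
    'readiness': {
        'type': 'tcpSocket',
        'port': 8888,
    },
}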

cc @bacongobbler @helgi @slack

@helgi
Contributor Author

helgi commented Jun 14, 2016

I would like to store the health check data as part of the config object

That's what I want as well; that's why I drew the parallels with registry / tags, since they are in the config object but no longer live within the values subfield.

As for the two proposals, both have their merits, but I am personally leaning towards the latter. That has less to do with health checks and more to do with a future feature of "common" (shared / global) settings (see #383), where we could achieve the former while using the latter. But obviously #383 isn't on any list to work on right now, so take that with a grain of salt.

Another thing that comes to mind: I generally associate health checks 1:1 with an application, which is driving some of the above. How do you envision the scenario where 20 [identical] apps share the same health check, but then I need to change (even temporarily) a setting for one of the apps due to a bug, or because I'm doing a slow rollout that affects how health checks are done? Would I create a new handler and associate that instead? The risk of a mishap is a bit higher than normal when it is shared.

Nice proposals @kmala!

@bacongobbler
Member

This was how I was envisioning the CLI would look as well.

Should we come up with some default handlers which the user can plug into and use for the general use case (or at least some examples in the docs)? Would we ever consider making handlers "template-able" so a user could create a handler for one app, then re-use the same handler but with a different port?

Instead of a comment, we should make a PR against deis/workflow with all this fleshed out so we can merge the document with all this information after the work is done.

@helgi
Contributor Author

helgi commented Jun 14, 2016

@bacongobbler can you elaborate on what "this" means? There are 2 proposals :)

As for the "template-able" aspect you mentioned, wouldn't that complicate health checks and introduce indirection that most people would avoid in a production context by having duplicate handlers tied to each individual app? We're not talking about many configurable things per app, but I suppose it's sizeable enough if they are all modified.

@bacongobbler
Member

bacongobbler commented Jun 14, 2016

can you elaborate on what "this" means?

I'm in favour of @kmala's proposal for the healthchecks side of things, but not for the handlers. I'm still more in favour of my own proposal in #603 (comment) but I don't like the way the CLI is handled either.

For starters, I'm not too fond of handler being a top-level object name. It's too generic a name. I'd prefer a name that explains what the "handler" is supposed to "handle".

It also breaks down health checks into two separate commands. Why is this necessary, other than for "storing" commonly used health checks? Can't we keep it all in just the Config object for the application, since they won't be adding more than one type of handler to their app?

As far as the "template-able" aspect you mentioned, wouldn't that complicate healthchecks and introduce indirection that most people would avoid in a production context by having duplicate handlers tied to each individual apps.

I think the current approach of having 30 different http handlers, all identical except for the port or the initial delay timeout, is overly complicated.

@bacongobbler
Member

I have created a design document at deis/workflow#322. Any input is appreciated there!
