(TK-487) Allow friendly init/start fail fast via ::exit throw
Allow init and start methods to throw a request-shutdown style ex-info
map to short circuit the startup process and exit with a specified
message and status, rather than a backtrace.

This just provides a short circuiting (immediate) counterpart to the
existing, deferred shutdown requests provided by request-shutdown.
rbrw committed Jul 24, 2020
1 parent c943510 commit c450c97
Showing 4 changed files with 56 additions and 20 deletions.
5 changes: 5 additions & 0 deletions documentation/Built-in-Shutdown-Service.md
@@ -82,6 +82,11 @@ The `:messages` should include any desired newlines, and when relying
 on `:puppetlabs.trapperkeeper.core/main`, the `:messages` will be
 printed and `exit` will be called with the given `:status`.

+This map is exactly the same map that can be thrown from an `init` or
+`start` method via `ex-info` to initiate an immediate shutdown.
+(Calls to `request-shutdown` only trigger a shutdown later, currently
+after all of the services have been initialized and started.)
+
 ### `shutdown-on-error`

 `shutdown-on-error` is a higher-order function that can be used as a wrapper around some logic in your services; its functionality is simple:
3 changes: 3 additions & 0 deletions documentation/Defining-Services.md
@@ -31,6 +31,9 @@ The default implementation of the lifecycle functions is to simply return the se

 Trapperkeeper will call the lifecycle functions in order based on the dependency list of the services; in other words, if your service has a dependency on service `Foo`, you are guaranteed that `Foo`'s `init` function will be called prior to yours, and that your `stop` function will be called prior to `Foo`'s.

+If an exception is thrown by `init` or `start`, Trapperkeeper will
+[initiate an immediate shutdown](Error-Handling.md).
+
 ### Example Service

 Let's look at a concrete example:
13 changes: 13 additions & 0 deletions documentation/Error-Handling.md
@@ -6,6 +6,19 @@ If the `init` or `start` function of any service throws a `Throwable`, it will c

 If the `init` or `start` function of your service launches a background thread to perform some costly initialization computations (like, say, populating a pool of objects which are expensive to create), it is advisable to wrap that computation inside a call to `shutdown-on-error`; however, you should note that `shutdown-on-error` does *not* short-circuit Trapperkeeper's start-up sequence - the app will continue booting. The `init` and `start` functions of all services will still be run, and once that has completed, all `stop` functions will be called, and the process will terminate.

+If the exception thrown by `init` or `start` is an `ex-info` exception
+containing the same kind of map that
+[`request-shutdown`](Built-in-Shutdown-Service.md#request-shutdown)
+accepts, then Trapperkeeper will print the specified messages and exit
+with the specified status as described there. For example:
+
+    (ex-info ""
+             {:kind :puppetlabs.trapperkeeper.core/exit
+              :status 3
+              :messages [["Unexpected filesystem error ..." *err*]]})
+
+The `ex-info` message string is currently ignored.
+
 ## Services Should Fail Fast

 Trapperkeeper embraces fail-fast behavior. With that in mind, we advise writing services that also fail-fast. In particular, if your service needs to spin-off a background thread to perform some expensive initialization logic, it is a best practice to push as much code as possible outside of the background thread (for example, validating configuration data), because `Throwables` on the main thread will propagate out of `init` or `start` and cause the application to shut down - i.e., it will *fail fast*. There are different operational semantics for errors thrown on a background thread (see previous section).
55 changes: 35 additions & 20 deletions src/puppetlabs/trapperkeeper/internal.clj
@@ -176,6 +176,29 @@
         required []]
     (first (ks/cli! cli-args specs required))))

+(def exit-request-schema
+  "A process exit request like
+  {:status 7
+   :messages [[\"something for stderr\n\" *err*]]
+              [\"something for stdout\n\" *out*]]
+              [\"something else for stderr\n\" *err*]]"
+  {:status schema/Int
+   :messages [[(schema/one schema/Str "message")
+               (schema/one java.io.Writer "stream")]]})
+
+(defn exit-exception? [ex]
+  (and (instance? ExceptionInfo ex)
+       (not (schema/check {(schema/optional-key :puppetlabs.trapperkeeper.core/exit)
+                           exit-request-schema}
+                          (ex-data ex)))))
+
+(defn shutdown-reason-for-ex
+  [exception]
+  (if (exit-exception? exception)
+    (merge {:cause :requested}
+           (select-keys (ex-data exception) [:puppetlabs.trapperkeeper.core/exit]))
+    {:cause :service-error :error exception}))
+
 (schema/defn ^:always-validate run-lifecycle-fn!
   "Run a lifecycle function for a service. Required arguments:
@@ -234,9 +257,15 @@
       (log/debug (i18n/trs "Finished running lifecycle function ''{0}'' for service ''{1}''"
                            lifecycle-fn-name
                            service-id)))
-    (catch Throwable t
-      (log/error t (i18n/trs "Error during service {0}!!!" lifecycle-fn-name))
-      (throw t))))
+    (catch ExceptionInfo ex
+      (if (exit-exception? ex)
+        (log/info (i18n/trs "Immediate shutdown requested during service {0}"
+                            lifecycle-fn-name))
+        (log/error ex (i18n/trs "Error during service {0}!!!" lifecycle-fn-name)))
+      (throw ex))
+    (catch Throwable ex
+      (log/error ex (i18n/trs "Error during service {0}!!!" lifecycle-fn-name))
+      (throw ex))))

 (schema/defn ^:always-validate initialize-lifecycle-worker :- (schema/protocol async-prot/Channel)
   "Initializes a 'worker' which will listen for lifecycle-related tasks and perform
@@ -286,9 +315,7 @@
             (log/debug (i18n/trs "Lifecycle worker completed {0} lifecycle task; awaiting next task." type))
             (catch Exception e
               (log/debug e (i18n/trs "Exception caught in lifecycle worker loop"))
-              (deliver shutdown-reason-promise
-                       {:cause :service-error
-                        :error e})))
+              (deliver shutdown-reason-promise (shutdown-reason-for-ex e))))
           (recur))

         (do
@@ -345,16 +372,6 @@
 ;;;; regarding the cause of the shutdown, and is intended to be passed back
 ;;;; in to the top-level functions that perform various shutdown steps.

-(def exit-request-schema
-  "A process exit request like
-  {:status 7
-   :messages [[\"something for stderr\n\" *err*]]
-              [\"something for stdout\n\" *out*]]
-              [\"something else for stderr\n\" *err*]]"
-  {:status schema/Int
-   :messages [[(schema/one schema/Str "message")
-               (schema/one java.io.Writer "stream")]]})
-
 (def ^{:private true
        :doc "The possible causes for shutdown to be initiated."}
   shutdown-causes #{:requested :service-error :jvm-shutdown-hook})
@@ -617,8 +634,7 @@
             (inc-restart-counter! this)
             this
             (catch Throwable t
-              (deliver shutdown-reason-promise {:cause :service-error
-                                                :error t})))))))
+              (deliver shutdown-reason-promise (shutdown-reason-for-ex t))))))))

 (schema/defn ^:always-validate boot-services-for-app**
   "Boots services for a TK app. WARNING: This should only ever be called
@@ -630,8 +646,7 @@
       (a/init app)
       (a/start app)
       (catch Throwable t
-        (deliver shutdown-reason-promise {:cause :service-error
-                                          :error t})))
+        (deliver shutdown-reason-promise (shutdown-reason-for-ex t))))
     (deliver result-promise app)))

 (schema/defn ^:always-validate boot-services-for-app* :- (schema/protocol a/TrapperkeeperApp)
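
The flow this commit introduces (a lifecycle function throws an exit-style `ex-info`, and the boot code turns it into a `:requested` shutdown reason instead of a `:service-error`) can be sketched without Trapperkeeper on the classpath. This is a minimal illustration, not the library's implementation. It assumes the ex-data shape checked by `exit-exception?` in `internal.clj` above, with the request nested under `:puppetlabs.trapperkeeper.core/exit`. The namespace, the `init-or-exit` function, and the `:data-dir` setting are hypothetical, and the predicate is a simplified stand-in for the schema check.

```clojure
(ns example.exit-sketch
  (:import (clojure.lang ExceptionInfo)))

;; Simplified, schema-free stand-in for the commit's `exit-exception?`.
;; ASSUMPTION: unlike the schema check in internal.clj, this version requires
;; the exit key to be present and only spot-checks the shape of its value.
(defn exit-exception?
  [ex]
  (and (instance? ExceptionInfo ex)
       (let [{:keys [status messages]}
             (:puppetlabs.trapperkeeper.core/exit (ex-data ex))]
         (and (integer? status)
              (sequential? messages)))))

;; Same branching as `shutdown-reason-for-ex` in the diff: an exit request
;; becomes a :requested shutdown, anything else is reported as a service error.
(defn shutdown-reason-for-ex
  [ex]
  (if (exit-exception? ex)
    (merge {:cause :requested}
           (select-keys (ex-data ex) [:puppetlabs.trapperkeeper.core/exit]))
    {:cause :service-error :error ex}))

;; Hypothetical `init` body: validate configuration up front and fail fast
;; with a friendly message and exit status rather than a backtrace.
(defn init-or-exit
  [context config]
  (when-not (:data-dir config)
    (throw (ex-info "" ; the message string is currently ignored, per the docs
                    {:puppetlabs.trapperkeeper.core/exit
                     {:status 3
                      :messages [["Missing required setting: data-dir\n" *err*]]}})))
  context)
```

Calling `(try (init-or-exit {} {}) (catch Throwable t (shutdown-reason-for-ex t)))` yields a map whose `:cause` is `:requested`, while an ordinary `RuntimeException` passed to `shutdown-reason-for-ex` is classified as `:service-error`.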
