Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Time error groupping #99

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
61 changes: 41 additions & 20 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -91,34 +91,38 @@ defmodule YourApp.Router do
options: # ...
]
]

#...
end
```

## Notification Trigger
## Throttling notifications

By default, `BoomNotifier` will send a notification every time an exception is
raised.

However, there are different strategies to decide when to send the
notifications using the `:notification_trigger` option with one of the
following values: `:always` and `:exponential`.

### Always
There are two throttling mechanisms you can use to reduce notification rate. Counting
based and time based throttling. The former allows notifications to be delivered
on exponential error counting and the latter based on time throttling, that is
if a predefined amount of time has passed from the last error.

This option is the default one. It will trigger a notification for every
exception.
### Count Based Throttle

```elixir
defmodule YourApp.Router do
use Phoenix.Router

use BoomNotifier,
notification_trigger: :always,
notifiers: [
# ...
]
# ...
],
groupping: :count,
count: :exponential,
time_limit: :timer.minutes(30)
end
```

### Exponential
#### Exponential

It uses a formula of `log2(errors_count)` to determine whether to send a
notification, based on the accumulated error count for each specific
Expand All @@ -132,7 +136,7 @@ defmodule YourApp.Router do
use Phoenix.Router

use BoomNotifier,
notification_trigger: :exponential,
count: :exponential,
notifiers: [
# ...
]
Expand All @@ -143,32 +147,49 @@ defmodule YourApp.Router do
use Phoenix.Router

use BoomNotifier,
notification_trigger: [exponential: [limit: 100]]
count: [exponential: [limit: 100]]
notifiers: [
# ...
]
```

### Notification trigger time limit
### Time Based Throttle

```elixir
defmodule YourApp.Router do
use Phoenix.Router

use BoomNotifier,
notifiers: [
# ...
],
groupping: :time,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it makes sense to have :time as other notification_trigger strategy instead of adding a new option? And maybe we could use the same time limit key we use for the others

throttle: :timer.minutes(1)
end
```

### Time Limit

If you've defined a triggering strategy which holds off notification delivering you can define a time limit value
which will be used to deliver the notification after a time limit milliseconds have passed from the last error. The
time counter is reset on new errors and only applies for cases where notifications are not sent.
Both groupping strategies allow you to define a time limit that tells boom to deliver a notification if the amount
of time has passed from the last time a notification was sent or error groupping started.

If specified, a notification will be triggered even though the groupping strategy does not met the criteria. For
time based throttling this is usefull if errors keep appearing before the throttle time and for count based throttle
to reset (and notify) the grupping after a certain time period.

```elixir
defmodule YourApp.Router do
use Phoenix.Router

use BoomNotifier,
notification_trigger: [:exponential],
count: [:exponential],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though I like this name better I'd rather not to introduce api changes in this pull request

I think before releasing v1.0.0 we should revisit the whole api and address the changes in a separate PR

time_limit: :timer.minutes(30),
notifier: CustomNotifier

# ...
end
```


## Custom data or Metadata

By default, `BoomNotifier` will **not** include any custom data from your
Expand Down
19 changes: 15 additions & 4 deletions lib/boom_notifier/error_storage.ex
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,8 @@ defmodule BoomNotifier.ErrorStorage do
]

@type t :: %__MODULE__{}
@type error_strategy :: :always | :exponential | [exponential: [limit: non_neg_integer()]]
@type error_strategy ::
:always | :none | :exponential | [exponential: [limit: non_neg_integer()]]

use Agent, start: {__MODULE__, :start_link, []}

Expand All @@ -32,8 +33,8 @@ defmodule BoomNotifier.ErrorStorage do
occurrences is increased and it also updates the first and last time it
happened.
"""
@spec store_error(ErrorInfo.t()) :: :ok
def store_error(error_info) do
@spec accumulate(ErrorInfo.t()) :: :ok
def accumulate(error_info) do
%{key: error_hash_key} = error_info
timestamp = error_info.timestamp || DateTime.utc_now()

Expand Down Expand Up @@ -137,7 +138,17 @@ defmodule BoomNotifier.ErrorStorage do
|> Map.replace!(:last_occurrence, nil)
end

@spec do_send_notification?(nil | __MODULE__.t()) :: boolean()
def eleapsed(nil), do: 0

def eleapsed(%__MODULE__{} = error_info) do
DateTime.diff(
error_info.last_occurrence,
error_info.first_occurrence,
:millisecond
)
end

@spec do_send_notification?(ErrorInfo.t() | nil) :: boolean()
defp do_send_notification?(nil), do: false

defp do_send_notification?(error_storage_item) do
Expand Down
69 changes: 46 additions & 23 deletions lib/boom_notifier/notification_sender.ex
Original file line number Diff line number Diff line change
Expand Up @@ -14,48 +14,49 @@ defmodule BoomNotifier.NotificationSender do
GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
end

def async_notify(notifier, occurrences, options) do
GenServer.cast(__MODULE__, {:notify, notifier, occurrences, options})
end

def async_trigger_notify(settings, error_info) do
GenServer.cast(__MODULE__, {:trigger_notify, settings, error_info})
end

def notify(notifier, occurrences, options) do
spawn_link(fn ->
notifier.notify(occurrences, options)
end)
end

def notify_all(settings, error_info) do
notification_trigger = Keyword.get(settings, :notification_trigger, :always)
occurrences = ErrorStorage.reset_stats(error_info, notification_trigger)
error_info = Map.put(error_info, :occurrences, occurrences)
def trigger_notify(settings, error_info) do
ErrorStorage.accumulate(error_info)

BoomNotifier.walkthrough_notifiers(
do_trigger_notify(
Keyword.get(settings, :groupping, :count),
settings,
fn notifier, options -> notify(notifier, error_info, options) end
error_info
)
end

def trigger_notify(settings, error_info) do
timeout = Keyword.get(settings, :time_limit)

ErrorStorage.store_error(error_info)
defp do_trigger_notify(:count, settings, error_info) do
time_limit = Keyword.get(settings, :time_limit)

if ErrorStorage.send_notification?(error_info) do
notify_all(settings, error_info)
:ok
else
if timeout do
{:schedule, timeout}
if time_limit do
{:schedule, time_limit}
else
:ok
:ignored
end
end
end

defp do_trigger_notify(:time, settings, error_info) do
throttle = Keyword.get(settings, :throttle, 100)
time_limit = Keyword.get(settings, :time_limit)

stats = ErrorStorage.get_stats(error_info)

if ErrorStorage.eleapsed(stats) >= time_limit do
notify_all(settings, error_info)
:ok
else
{:schedule, throttle}
end
end

# Server callbacks

@impl true
Expand All @@ -78,6 +79,9 @@ defmodule BoomNotifier.NotificationSender do
cancel_timer(timer)

case trigger_notify(settings, error_info) do
:ignored ->
{:noreply, state}

:ok ->
{:noreply, state}

Expand Down Expand Up @@ -115,6 +119,25 @@ defmodule BoomNotifier.NotificationSender do
{:noreply, state |> Map.delete(error_info.key)}
end

# Private methods

defp notify(notifier, occurrences, options) do
spawn_link(fn ->
notifier.notify(occurrences, options)
end)
end

defp notify_all(settings, error_info) do
count_strategy = Keyword.get(settings, :count)
occurrences = ErrorStorage.reset_stats(error_info, count_strategy)
error_info = Map.put(error_info, :occurrences, occurrences)

BoomNotifier.walkthrough_notifiers(
settings,
fn notifier, options -> notify(notifier, error_info, options) end
)
end

defp cancel_timer(nil), do: nil
defp cancel_timer(timer), do: Process.cancel_timer(timer)
end
2 changes: 1 addition & 1 deletion test/example_app/lib/example_app_web/router.ex
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ defmodule ExampleAppWeb.Router do
use ExampleAppWeb, :router

use BoomNotifier,
notification_trigger: [exponential: [limit: 8]],
count: [exponential: [limit: 8]],
custom_data: [:assigns, :logger],
ignore_exceptions: [IgnoreExceptionError],
notifiers: [
Expand Down
Loading
Loading