-
-
Notifications
You must be signed in to change notification settings - Fork 392
api: Use exponential backoff in call_on_each_event. #586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -601,8 +601,16 @@ def call_on_each_event( | |
if narrow is None: | ||
narrow = [] | ||
|
||
def do_register() -> Tuple[str, int]: | ||
while True: | ||
queue_id = None | ||
# NOTE: Back off exponentially to cover against potential bugs in this | ||
# library causing a DoS attack against a server when getting errors | ||
# (explicit values listed for clarity) | ||
backoff = RandomExponentialBackoff(maximum_retries=10, | ||
timeout_success_equivalent=300, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think we want a |
||
delay_cap=90) | ||
while backoff.keep_going(): | ||
# Ensure event queue exists (or continues to do so) | ||
if queue_id is None: | ||
if event_types is None: | ||
res = self.register() | ||
else: | ||
|
@@ -611,18 +619,15 @@ def do_register() -> Tuple[str, int]: | |
if 'error' in res['result']: | ||
if self.verbose: | ||
print("Server returned error:\n%s" % res['msg']) | ||
time.sleep(1) | ||
backoff.fail() | ||
continue | ||
else: | ||
return (res['queue_id'], res['last_event_id']) | ||
|
||
queue_id = None | ||
# Make long-polling requests with `get_events`. Once a request | ||
# has received an answer, pass it to the callback and before | ||
# making a new long-polling request. | ||
while True: | ||
if queue_id is None: | ||
(queue_id, last_event_id) = do_register() | ||
backoff.succeed() | ||
queue_id, last_event_id = res['queue_id'], res['last_event_id'] | ||
|
||
# Make long-polling requests with `get_events`. Once a request | ||
# has received an answer, pass it to the callback and before | ||
# making a new long-polling request. | ||
res = self.get_events(queue_id=queue_id, last_event_id=last_event_id) | ||
if 'error' in res['result']: | ||
if res["result"] == "http-error": | ||
|
@@ -649,12 +654,11 @@ def do_register() -> Tuple[str, int]: | |
# | ||
# Reset queue_id to register a new event queue. | ||
queue_id = None | ||
# Add a pause here to cover against potential bugs in this library | ||
# causing a DoS attack against a server when getting errors. | ||
# TODO: Make this back off exponentially. | ||
time.sleep(1) | ||
|
||
backoff.fail() | ||
continue | ||
|
||
backoff.succeed() | ||
for event in res['events']: | ||
last_event_id = max(last_event_id, int(event['id'])) | ||
callback(event) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can go with
maximum_retries=50
, i.e. basically arranging it so that it will eventually give up, but only if it's really clear the server is down for good and not just having temporary downtime.I also notice there's a
time.sleep(1)
indo_register
; probably that should be changed as well.(Or maybe
do_register
should be folded into this loop?)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this PR, the behavior is changed from before, which is a little concerning, though arguably useful - the handler will return after some time, rather than be in an infinite loop, so the caller can decide whether to retry again?
We could put the backoff parameters into the method call, or something like them; that could also enable behavior like previously (regarding the infinite loop), but with backoff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is correct to potentially fail, so the caller knows something is horribly wrong. We have a bunch of clients that are forever looping and getting 401s during registration, and I don't think that their owners have any sign that anything is going wrong.
I think we should raise an exception if we hit the max retries, since existing code is likely not expecting the function to ever return. Raising an exception means it won't unexpectedly fall through to other code.
This behavior change certainly merits a documentation update as well.