feat: behaviour penalty when non-priority queue reaches maxNumElementsInNonPriorityQueue #1083
base: master
Conversation
The branch was force-pushed from b21e90f to a6237bd.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:
@@ Coverage Diff @@
## unstable #1083 +/- ##
============================================
- Coverage 84.82% 84.60% -0.23%
============================================
Files 91 91
Lines 15428 15486 +58
============================================
+ Hits 13087 13102 +15
- Misses 2341 2384 +43
This log shows the moment when peer 16U*23VggQ's non-prio queue reaches maxNumElementsInNonPriorityQueue.
p.closeSendConn(PubSubPeerEventKind.DisconnectionRequested)
else:
  Future[void].completed()
p.behaviourPenalty += 0.0001
I'd suggest moving the value used here to a constant.
done
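Presumably the change looks something like this (a sketch only; the constant name is hypothetical, not necessarily what the PR uses):

const
  # Hypothetical name: penalty applied per message discarded because the
  # peer's non-priority queue is already at maxNumElementsInNonPriorityQueue.
  NonPriorityQueueFullPenalty = 0.0001

p.behaviourPenalty += NonPriorityQueueFullPenalty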
p.behaviourPenalty += 0.0001
trace "Peer has reached maxNumElementsInNonPriorityQueue. Discarding message and applying behaviour penalty.",
  peer = p, score = p.score, behaviourPenalty = p.behaviourPenalty, agent = p.getAgent()
Future[void].completed()
Given most return paths are Future[void].completed(), how about moving it outside the if statement and doing an explicit return on each arm for the cases that are not Future[void].completed()?
could you please elaborate on what you believe is bad with the current approach?
There's nothing wrong per se. What I was hinting at was, when you have code akin to this:

if <cond>:
  foo()
  return A
elif <cond>:
  bar()
  return A
elif <cond>:
  baz()
  return B
...
else:
  foobar()
  return A

Sometimes it's clearer to group/extract the returns:

if <cond>:
  foo()
elif <cond>:
  bar()
elif <cond>:
  baz()
  return B
...
else:
  foobar()
return A
But I just realised here it's almost 50/50, so not a lot of improvement can be made.
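For illustration, the grouped form applied to the completed-future shape in this PR might read like this (a sketch under assumptions: the helper names and conditions are hypothetical, not the PR's actual code):

proc trySend(p: PubSubPeer): Future[void] =
  if nonPriorityQueueFull(p):      # hypothetical helper
    p.behaviourPenalty += 0.0001   # discard the message, apply the penalty
  elif mustDisconnect(p):          # hypothetical helper
    return p.closeSendConn(PubSubPeerEventKind.DisconnectionRequested)
  # every arm that falls through shares this single completed future
  Future[void].completed()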
proc clearNonPriorityQueue*(p: PubSubPeer) =
  if len(p.rpcmessagequeue.nonPriorityQueue) > 0:
    p.rpcmessagequeue.nonPriorityQueue.clear()

    when defined(pubsubpeer_queue_metrics):
      libp2p_gossipsub_non_priority_queue_size.set(labelValues = [$p.peerId], value = 0)
How asynchronous is this operation? That is: how feasible is it that, between the clear call and the metrics update, some value is added to the nonPriorityQueue? 🤔
This proc isn't async and won't be suspended, so nothing can run between the clear call and the metrics update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's right. nvm then
This PR applies a behaviour penalty to peers whose non-prio queue reaches the configured maximum, instead of the previous strategy of disconnecting the peer. A conservative penalty of 0.0001 is added to behaviourPenalty for each message we attempt to send while the queue is over the limit, and the message is discarded. This usually results in a behaviourPenalty around [0.1, 0.2] by the time the score is updated, with a score contribution around [-0.4, -0.1]. That causes the peer to be pruned due to its negative score. This PR also clears the non-prio queue at that moment.
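A rough sanity check on those numbers (an assumption on my part, not stated in the PR: gossipsub v1.1's behaviour-penalty term is quadratic, score contribution = behaviourPenaltyWeight * behaviourPenalty^2, with behaviourPenaltyWeight = -10 assumed here):

# Hypothetical back-of-the-envelope check; the message count is illustrative.
let
  penaltyPerMessage = 0.0001
  discardedMessages = 1500.0
  behaviourPenalty = penaltyPerMessage * discardedMessages        # 0.15
  scoreContribution = -10.0 * behaviourPenalty * behaviourPenalty
echo scoreContribution   # ~ -0.225, inside the reported [-0.4, -0.1] range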