Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSDK-8802: Add StoppableWorkers.StartTimer/NextTick. #353

Merged
merged 2 commits into from
Sep 19, 2024

Conversation

dgottlieb
Copy link
Member

I like the idea of having a NextTick for my worker. But I think there are some oddities with this so floating it around first.

We should chat about what our options are here as a next step.

@viambot viambot added the safe to test Mark as safe to test label Sep 17, 2024
Copy link
Member

@benjirewis benjirewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Some thoughts.

// }
func (sw *StoppableWorkers) NextTick() bool {
select {
case <-sw.timer.C:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would guess NextTick() before StartTimer could panic due to accessing .C.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct -- also talked offline and also leveraging the type system to prevent this programmer error.

// if the timer has ticked and false if the context is canceled. Such that one can write a loop to
// do work as such:
//
// for sw.NextTick() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I like the ergonomics of this example. But:

You likely already alluded to this in our offline discussion, but my main concern is about multiple goroutines calling NextTick() concurrently. IIUC only one can <- a tick of the timer, so all concurrent NextTick() calls are subject to missing ticks <-ed by other goroutines. If you wanted multiple goroutines to be able to check ticks, you would need some "fan-out" pattern that would require more than one channel (or, hackily, relying on the fact that closeing a channel "notifies" all receivers).

Did you plan to limit usage of this to only one goroutine? If so, we'd probably need to document that and, preferably, somehow disallow calling NextTick() when another goroutine is already waiting for a tick. If you wanted more of "fan-out" pattern then would be happy to discuss alternate designs here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked offline -- I have no need for multiple goroutines being managed by a ticker (or set of tickers) event. @benjirewis and I decided to make a special constructor that would leverage the type system to prevent those usages.

@viambot viambot added safe to test Mark as safe to test and removed safe to test Mark as safe to test labels Sep 18, 2024
@@ -1,4 +1,4 @@
package utils_test
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry -- I wrote my test calling NewStoppableWorkerWithTicker and saw red when getting the compile error. I didn't need to do this, but it was already done.

I don't think* there's a concern with using the non _test package here.

defer timer.Stop()
for {
select {
case <-ctx.Done():
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I normally prefer to not see this extra "select" statement. But I left it in here because otherwise I'd have to justify why my test assertion needs to check for ">= 1" instead of exactly 1.

And in the process of justifying it I felt pretty petty for avoiding writing this code.

That said, I'd still prefer to not see these. Most use cases do not need them at all. But using tickers with a small timeout that kicks off jobs is one of the exceptions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I buy that this is an exception. I agree that this pattern is generally confusing and usually unnecessary, though.

Copy link
Member

@benjirewis benjirewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks for going the constructor route.

sw.mu.Lock()
defer sw.mu.Unlock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@penguinland I had mentioned before that a worker could not add another worker to the group as it would result in a deadlock. That was because I wrote the Stop code slightly incorrectly. Dan fixed it here by making sure Wait is not called while holding the mutex.

defer timer.Stop()
for {
select {
case <-ctx.Done():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I buy that this is an exception. I agree that this pattern is generally confusing and usually unnecessary, though.

@dgottlieb dgottlieb merged commit b5700ec into viamrobotics:main Sep 19, 2024
6 checks passed
@dgottlieb dgottlieb deleted the RSDK-8802 branch September 19, 2024 15:20
sw.mu.Lock()
defer sw.mu.Unlock()
if sw.ctx.Err() != nil {
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we hit this return statement, we never unlock the mutex, and the program could deadlock. I think you're right that in this specific function, it's better to manually unlock the mutex and not use a defer statement, but usually I prefer using defer because it makes these things much easier to reason about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof yeah @dgottlieb I think we're missing an Unlock here. You're definitely right that defer normally handles this stuff, but we really wanted that Wait to be outside of a lock. Thanks for taking a look post-hoc @penguinland !

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linking things up: I made #361 to fix this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test Mark as safe to test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants