define concurrency model for traversal #255

Open
plastikfan opened this issue May 6, 2023 · 5 comments

Comments

@plastikfan
Contributor

plastikfan commented May 6, 2023

The navigator would probably have to return a channel of traverse result rather than the traverse result itself. When a goroutine completes, it signals its completion by sending the result back through the channel. The navigator keeps track of outstanding requests via these channels and continues navigating while there are still available slots. By available slots we mean that if we allow n concurrent goroutines, we keep spawning until we reach n; as we are notified of completions via the associated channels, we can dispatch another unit of work. A point of complexity is how to implement the listen feature. We can handle this by performing the fast-forward phase single-threaded, then proceeding concurrently from the trigger point.
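As a rough illustration of the "available slots" idea, here is a minimal sketch that caps the number of in-flight goroutines with a buffered channel and delivers results through a returned channel. The names `TraverseResult`, `visit` and `navigate` are placeholders invented for this sketch, not the project's actual API.

```go
package main

import (
	"fmt"
	"sync"
)

// TraverseResult is a hypothetical stand-in for the navigator's result type.
type TraverseResult struct {
	Path string
	Err  error
}

// visit is a hypothetical unit of work performed for each path.
func visit(path string) TraverseResult {
	return TraverseResult{Path: path}
}

// navigate visits every path with at most n goroutines in flight and
// returns a channel that delivers one result per path, then closes.
func navigate(paths []string, n int) <-chan TraverseResult {
	results := make(chan TraverseResult)
	slots := make(chan struct{}, n) // buffered channel used as a counting semaphore

	go func() {
		var wg sync.WaitGroup
		for _, p := range paths {
			slots <- struct{}{} // blocks while all n slots are taken
			wg.Add(1)
			go func(path string) {
				defer wg.Done()
				defer func() { <-slots }() // free the slot for the next unit of work
				results <- visit(path)
			}(p)
		}
		wg.Wait()
		close(results)
	}()

	return results
}

func main() {
	count := 0
	for r := range navigate([]string{"a", "b", "c", "d", "e"}, 2) {
		if r.Err == nil {
			count++
		}
	}
	fmt.Println("visited:", count)
}
```

Results arrive in completion order rather than traversal order, which may or may not be acceptable for the navigator.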

@plastikfan plastikfan self-assigned this May 6, 2023
@plastikfan plastikfan added the design issue that captures design ideas label May 6, 2023
@plastikfan
Contributor Author

plastikfan commented May 6, 2023

So how would we build in concurrency without breaking all existing functionality? One idea would be to allow the client to define an alternative callback: an async callback that returns a channel and signifies completion by sending a traverse result to that channel.

The trouble with this approach is that it would require duplicating the code that interacts with the callback. It would be better to stick with the single callback model but change the callback signature to return a channel, so that both the sync and async models communicate via a traverse result channel.

Write a quick unit test to see how this would work. We need to check that this model would still work on a sync basis.

For sync, just invoke the callback directly; for async, launch it with the go keyword. So we need to define two different launchers.
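A minimal sketch of the two launchers, assuming a hypothetical `Callback` signature and `TraverseResult` type (neither is taken from the project). Both launchers report completion through a channel, so the calling code is the same in either mode.

```go
package main

import "fmt"

// TraverseResult is a hypothetical stand-in for the navigator's result type.
type TraverseResult struct{ Err error }

// Callback is an assumed shape for the client callback.
type Callback func(path string) TraverseResult

// launcher invokes a callback and reports completion on a channel.
type launcher func(cb Callback, path string) <-chan TraverseResult

// launchSync runs the callback on the calling goroutine. The channel is
// buffered so the send completes before the caller starts receiving.
func launchSync(cb Callback, path string) <-chan TraverseResult {
	out := make(chan TraverseResult, 1)
	out <- cb(path)
	return out
}

// launchAsync runs the callback on a new goroutine via the go keyword.
func launchAsync(cb Callback, path string) <-chan TraverseResult {
	out := make(chan TraverseResult, 1)
	go func() { out <- cb(path) }()
	return out
}

func main() {
	cb := func(path string) TraverseResult { return TraverseResult{} }
	for _, launch := range []launcher{launchSync, launchAsync} {
		result := <-launch(cb, "some/path")
		fmt.Println(result.Err == nil)
	}
}
```

In this sketch the callback itself stays synchronous and the launcher owns the channel, which is one way to avoid changing the callback signature.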

@plastikfan
Contributor Author

We don't have to expose the channel to the client; in fact, that would be the wrong solution. The channel can be internalised, which is good for us as we don't have to break the existing interface. See the channels inside channels pattern.
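One way to read the channels inside channels idea is sketched below: each internal request carries its own reply channel, while the client-facing function keeps a plain synchronous signature. The names `request`, `worker` and `Navigate`, and the fixed pool of two workers, are assumptions made for illustration only.

```go
package main

import "fmt"

// TraverseResult is a hypothetical stand-in for the navigator's result type.
type TraverseResult struct{ Path string }

// request is internal: it pairs a unit of work with its own reply channel,
// so a channel of requests is effectively a channel of channels.
type request struct {
	path  string
	reply chan TraverseResult
}

// worker serves requests until the requests channel is closed.
func worker(requests <-chan request, cb func(string) TraverseResult) {
	for req := range requests {
		req.reply <- cb(req.path)
	}
}

// Navigate exposes no channels to the client; they stay internal.
func Navigate(paths []string, cb func(string) TraverseResult) []TraverseResult {
	requests := make(chan request)
	for i := 0; i < 2; i++ { // assumed pool size of two workers
		go worker(requests, cb)
	}

	// One buffered reply channel per path, so workers never block on reply.
	pending := make([]chan TraverseResult, len(paths))
	for i := range paths {
		pending[i] = make(chan TraverseResult, 1)
	}

	go func() {
		for i, p := range paths {
			requests <- request{path: p, reply: pending[i]}
		}
		close(requests)
	}()

	// Reading the reply channels in order preserves the input order.
	results := make([]TraverseResult, 0, len(paths))
	for _, reply := range pending {
		results = append(results, <-reply)
	}
	return results
}

func main() {
	cb := func(path string) TraverseResult { return TraverseResult{Path: path} }
	for _, r := range Navigate([]string{"a", "b", "c"}, cb) {
		fmt.Println(r.Path)
	}
}
```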

@plastikfan
Contributor Author

plastikfan commented May 9, 2023

IWorkload:

  • next() returns the next Job, should return nil when no work left

Stream: is an open-ended IWorkload, used when we don't know the number of jobs up front; e.g. the directory tree

Batch: is an IWorkload used when we know all the jobs up front; e.g. navigator folder with files subscription

Job: is a small unit of work that can be parallelised

Worker: is the entity that executes a job (probably needs a context and a wait group, unless we use a chan of chan)
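The definitions above could be sketched in Go roughly as follows. Everything here is an assumption made for illustration: `Next` is exported where the notes say `next()`, `Job` is modelled as a function taking a context, and `runWorkers` and `demo` are hypothetical helpers rather than part of any existing design.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Job is a small unit of work that can be parallelised.
type Job func(ctx context.Context) error

// IWorkload hands out jobs; Next returns nil when no work is left.
type IWorkload interface {
	Next() Job
}

// Batch is a workload whose jobs are all known up front.
type Batch struct{ jobs []Job }

func (b *Batch) Next() Job {
	if len(b.jobs) == 0 {
		return nil
	}
	job := b.jobs[0]
	b.jobs = b.jobs[1:]
	return job
}

// Stream is an open-ended workload fed from a channel.
type Stream struct{ source <-chan Job }

// Next blocks until a job arrives; a closed channel yields nil.
func (s *Stream) Next() Job { return <-s.source }

// runWorkers drains the workload with n workers and returns the number of
// jobs that completed without error. Next is only called from this
// goroutine, so workloads need not be safe for concurrent use.
func runWorkers(ctx context.Context, w IWorkload, n int) int {
	jobs := make(chan Job)
	var wg sync.WaitGroup
	var mu sync.Mutex
	completed := 0

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range jobs {
				if job(ctx) == nil {
					mu.Lock()
					completed++
					mu.Unlock()
				}
			}
		}()
	}
	for job := w.Next(); job != nil; job = w.Next() {
		jobs <- job
	}
	close(jobs)
	wg.Wait()
	return completed
}

// demo runs a three-job batch and a two-job stream with two workers each.
func demo() (batchDone, streamDone int) {
	noop := func(ctx context.Context) error { return nil }
	ctx := context.Background()

	batchDone = runWorkers(ctx, &Batch{jobs: []Job{noop, noop, noop}}, 2)

	src := make(chan Job, 2)
	src <- noop
	src <- noop
	close(src)
	streamDone = runWorkers(ctx, &Stream{source: src}, 2)
	return batchDone, streamDone
}

func main() {
	fmt.Println(demo())
}
```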

@plastikfan
Contributor Author

plastikfan commented May 10, 2023

I think the best way forward in implementing concurrency would be to use the reactive model based upon RxJS. However, the RxGo project seems to be dead. This is partly because generics were released in Go 1.18, and creating a version 3 of RxGo based upon generics is no small task and has yet to be completed.

One approach we could use would be to re-implement RxGo with generics, implementing only what is required by this project but maintain compatibility with RxGo so that if and when a new version becomes available, it can be integrated easily.

Another approach would be to fork the existing RxGo and test whether what it provides is adequate for the needs of this project.

@plastikfan
Contributor Author

After playing around with RxGo and a few forks, I couldn't find one whose tests would run, so I am going to build a minimal version from scratch, adhering to its current design, except that I'll use generics rather than relying on typecasting interface{}'s. I'm only going to implement what is required by the navigator.
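To give a feel for what a generics-based core might look like, here is a very small sketch loosely modelled on RxGo's item-over-channel design. The types and functions (`Item[T]`, `Observable[T]`, `Just`, `Map`) are assumptions for illustration and do not claim to match RxGo's API or the eventual implementation. `Map` is a free function because Go methods cannot declare their own type parameters.

```go
package main

import "fmt"

// Item carries either a typed value or an error, avoiding interface{} casts.
type Item[T any] struct {
	V T
	E error
}

// Observable lazily produces a channel of items each time it is observed.
type Observable[T any] struct {
	source func() <-chan Item[T]
}

// Observe starts the underlying producer and returns its output channel.
func (o Observable[T]) Observe() <-chan Item[T] { return o.source() }

// Just emits the given values in order and then closes the channel.
func Just[T any](values ...T) Observable[T] {
	return Observable[T]{source: func() <-chan Item[T] {
		out := make(chan Item[T])
		go func() {
			defer close(out)
			for _, v := range values {
				out <- Item[T]{V: v}
			}
		}()
		return out
	}}
}

// Map transforms each value with fn and passes upstream errors through.
func Map[T, R any](in Observable[T], fn func(T) (R, error)) Observable[R] {
	return Observable[R]{source: func() <-chan Item[R] {
		out := make(chan Item[R])
		go func() {
			defer close(out)
			for item := range in.Observe() {
				if item.E != nil {
					out <- Item[R]{E: item.E}
					continue
				}
				v, err := fn(item.V)
				out <- Item[R]{V: v, E: err}
			}
		}()
		return out
	}}
}

func main() {
	lengths := Map(Just("a", "bb", "ccc"), func(s string) (int, error) {
		return len(s), nil
	})
	for item := range lengths.Observe() {
		fmt.Println(item.V)
	}
}
```

This sketch has no cancellation: a consumer that stops reading early would leave the producer goroutines blocked, so a real version would likely need a context.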
