Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-blocking handling of results #163

Closed
wants to merge 1 commit into from

Conversation

jcheng5
Copy link

@jcheng5 jcheng5 commented Sep 16, 2017

(Please DO NOT merge, for discussion only. It's possible the promises package will be renamed in the next few days.)

I have two goals with this PR, one of which benefits future users and one of which benefits promises/Shiny users.

  1. Provide a way for future users to be able to be notified when their jobs are complete, without blocking or having the user manually remember to go poll for results. This would allow e.g. a message to be popped up to show that a large background job is done. Currently this can be accomplished using the later package to implement a polling loop, but in the future it could be made part of the Extended API so some future backends could implement using e.g. background threads or file descriptors.
  2. Let people who are doing async programming with promises (especially with Shiny) be able to use Future seamlessly as a promise. Until now I have accomplished this by implementing as.promise.Future in the promises package, but this feels backwards to me, like having DBI bundling the RMySQL implementation.

I was originally going to send you separate PRs, with the first one just implementing goal 1 without foisting any dependency on promises on you, in case you don't want to take a position on promises. This would look like an onResolved(future, callback) function. But as I set to implementing this function, I realized that it's a disservice to you and your users to implement something that is just like a promise, but worse. then(future, callback) is no harder to learn or type than onResolved(future, callback), and it provides many additional benefits like well-structured error handling, composing together multiple tasks in serial or parallel, the fact that it is a first-class object (and thus be reused, passed into functions, returned from functions), and more.

So now I am advocating for a stronger position than before, that you consider taking a PR like this one, and let goal 2 subsume goal 1, since it automatically provides a better version of it. I would like to spend some time with you talking more about promises, and after that if you prefer to have an arms-length relationship with them rather than taking a direct dependency, then I'd be happy to implement another PR that provides onResolved and not actual promises.

@jcheng5
Copy link
Author

jcheng5 commented Sep 16, 2017

Just to be clear about what this package lets you do:

library(promises)
plan(multiprocess)

f <- future({ ... })

# traditional API
then(f, function(value) {
  # Save it in the global environment
  result <<- value
  message("Your job is complete")
})

# promise-pipe API
f %...>% head() %...>% View()

This entire script would execute very quickly regardless of how long the future expr takes to execute, leaving you back at the R prompt. Whenever the future finishes executing, then you would see the message() and View() etc. take effect, as long as the R process is not busy doing something else; if it is busy, then you'll see the handlers executing either when control returns to the R prompt, or somebody calls later::run_now() (which is how Shiny, which blocks the console, is able to make use of these).

@jcheng5 jcheng5 changed the title Support conversion to promise Non-blocking handling of results Sep 16, 2017
@HenrikBengtsson
Copy link
Collaborator

Thanks Joe. I'm pretty sure I understand your objectives and suggestions here and from a first look at it I believe I agree with this plan/strategy. I'll try to do give some good thoughts in the next coming days to make sure I understand the ins-and-outs and what implications etc it'll have. Then we can vchat.

@HenrikBengtsson
Copy link
Collaborator

HenrikBengtsson commented Sep 20, 2017

In order for the S3 method as.promise.Future() to work in a setup with:

library("promise")
library("future")

the future package would have to Import promise (and not just Suggest it), correct? I just wanna make sure this is still the case in R, there are no hacks to avoid it, and that I haven't missed any changes to this.

If so, then this comes down to the usual "I wanna avoid heavy dependency" problem. Yes, I would prefer avoiding to force additional directly dependencies on code/users using futures (i.e. promise + R6, Rcpp, later, rlang, magrittr, and BH, totaling 14 GiB MiB and native code). That would go the opposite directory on make the core Future API lighter (currently totaling 270 kB + 120 kB on a basically unused dependency on digest).

The only obvious solution I see is adding a dependency on a new very lightweight promise.api package without any additional dependencies where the Promise API/generics are defined. I'd be happy to import such a promise.api::as.promise generic.

This corresponds to my ideas on having a lightweight future.api package (< 10 kB and no dependencies) that defines the API/generics but without actual functionality (or in my case with a default bare bone eager sequential future backend ~= default R evaluation). Any package that implements a Future API backend would then Import this package, e.g. existing the future[.parallel], future.batchtools, and future.BatchJobs packages. Would a such a 'future.api' package be a feasible solution for you, i.e. @importFrom future.api resolved value, or would you consider that a dependency going in the wrong design direction? (If it clarifies my view on where the Future API is on the stack, in my dream world, the 'parallel' package (or even 'base') will one day declare/export those three very-basic and not-too-controversial generics and utilize them internally for its own mc*/par*/... code).

What do you think? What can be done?

@jcheng5
Copy link
Author

jcheng5 commented Sep 20, 2017

Hi Henrik, thanks for taking a look!

  1. I believe you can Suggests promises, not Imports. I tried this and it seems to work fine. Would that course of action be good enough for you?
  2. I think you mean 14MiB, not 14 GiB? :)
  3. I don't think it'd be possible to create a promise.api package that would let future implement as.promise.Future but also doesn't pull in the dependencies we're trying to avoid. The later package is fundamental here, it's not possible to implement the promise interface correctly without it (due to section 2.2.4, a requirement that eliminates a dangerous class of race conditions that would otherwise be very common). And once promise.api depends on later, it may as well depend on the rest (BH and Rcpp are responsible for 12.3MiB out of that 14MiB, and later depends on both). It's also possible I've misunderstood your intent here, so let me know if I'm not making sense.

I admire and relate to your goal of keeping dependencies minimal, and I also try to keep native code out of most packages. Thanks for the reminder. Some thoughts, in case Suggests doesn't work out:

  1. I could get the native code out of promises, but not out of later. And life is too short to do the kinds of things I'm doing in later while coding directly against R's C API, sorry. :) Seriously though, it'd be pretty hard to do safely--I don't consider myself a good enough C programmer to do it safely (i.e. without being able to rely on C++ destructors).
  2. BH alone is 10MiB or so, ouch. We (RStudio) generally don't hesitate to depend on it because it's LinkingTo only, so Windows and Mac users at least will never see it. That said, I could probably remove the dependency with only a few hours' work if I needed to.
  3. I take the magrittr dependency only for convenience to end users (so they don't have to library(promises) and library(magrittr), I think the regular %>% will be commonly used with promises as well). So this one would be very easy to drop.
  4. rlang lets us specify promise later and then callbacks using tidy evaluation. rlang itself weighs in at 200KB and doesn't have any other dependencies.
  5. R6 is nice, ubiquitous, and gives us better performance than reference classes. I could go to raw environments like you do with future but I personally think it's worth the tradeoff in having the structure and visibility features of R6.

@HenrikBengtsson
Copy link
Collaborator

Fingers type GiB's automatically these days - too much of those at UCSF (or actually TiB's and soon PiB's).

I would certainly be happy with a Suggests dependency, but I'm not sure I can do it without skipping a proper declaring the S3 method. If one attempts to declare S3method(as.promise, Future) in the NAMESPACE, then we'll get:

Error: package or namespace load failed for 'future`:
 object 'as.promise' not found whilst loading namespace 'future'

at install time (or at load time if we install with --no-test-load).

It does indeed work if I just use:

export(as.promise.Future)

in NAMESPACE. (This is illustrated with mockups future2 and promises0 in https://github.com/HenrikBengtsson/future-and-promises).

However, that would require future to be attached in order for calls to as.promise(<Future object>) to work - it is not enough for future to be just loaded. For instance, if there is a third package tries something like:

#' @importFrom promises as.promise
#' @importFrom future future as.promise.Future
#' @export
call_me <- function(x, ...) {
  f <- future(x)
  as.promise(f)
}

it will fail with:

Error in UseMethod("as.promise") : 
  no applicable method for 'as.promise' applied to an object of class "c('SequentialFuture', 'UniprocessFuture', 'Future', 'environment')"

(This is illustrated with mockup somepkg in https://github.com/HenrikBengtsson/future-and-promises).

It might be a good-enough solution, but at the same time I'm not sure if it's a "hack" not to declare an S3 method and if that might break in a future version of R. I need to dive back into the R internals for namespaces and S3 dispatching, which I haven't done in a very long time, in order fully understand what is legal, possible, and dangerous. AFAIK, the just-export-and-dont-declare the S3 method has been around for a long time and before namespaces were introduced, so I think there's lots of code / CRAN packages that would break if it stopped being supported.

A completely different route would be to add a package future.promises that bridges future and promises with just the needed S3 method:

#' @importFrom promises as.promise
#' @importFrom future resolved value
#' @export
as.promise.Future <- function(x, ...) {
  [...]
}

(This is illustrated with mockup future.promises in https://github.com/HenrikBengtsson/future-and-promises).

Brain is getting mushy - too long working day. I'll try to think more about this.

@HenrikBengtsson
Copy link
Collaborator

Hi again. I've been looking more into this one in different ways.

Adding a lightweight, no-dependency, standalone future.api package would be an option that could be added to Suggests: for the promises package. It turns out that that route might be very cumbersome requiring multiple iterations of releases to keep CRAN happy. It would also require lots of validation (and redesign) of the existing future ecosystem. So, that won't happen anytime soon.

I've also looked back into the few ideas I had above, e.g. a future.promises package that would depend on both future and promises, which means user would do library(future.promises). I've made a proof-of-concept at https://github.com/HenrikBengtsson/future.promises/tree/develop. Only when I got around to validate it, I realize that you now do Suggests: future in promises and implements a as.promise.Future(). Are you ok/happy with that solution? That would certainly avoid the various ad-hoc workarounds I'm looking into, including your original idea of adding it to future (with Suggests: promises, later) and only export(as.promise.Future) without a proper S3method(as.promise,Future). If you stick with Suggests: future, you'll certainly make my life easier :)

@jcheng5
Copy link
Author

jcheng5 commented Jan 11, 2018

I'm happy to have promises suggest future, at least for the time being. We can always switch to something else later if it becomes necessary, without (too) much pain.

@jcheng5 jcheng5 closed this Jan 11, 2018
@HenrikBengtsson
Copy link
Collaborator

That's great to hear. I'll try to follow your use cases (promises alone and shiny) closely. Feel free to shoot me issues and other feature requests (preferable less tricky than this one). Also, if you've got package tests that I can run (revdepcheck but also standalone), please let me know. The other day I had some troubles testing promises in batch mode, which I assume is because I someone needs to give control back to the event loop.

@jcheng5
Copy link
Author

jcheng5 commented Jan 11, 2018

Really, the only thing I don't like about the integration with future right now is that I have to poll the future for completion (using resolved(f, timeout=0)) rather than some more efficient mechanism of being notified. That's a big request, I know.

@HenrikBengtsson
Copy link
Collaborator

You mean without polling? A mechanism where at the very end of the future evaluation the "worker" pushes a notification? If so, that has to go in under the Optional/Extended Future API since not all possible future workers will be capable of that/such channels of communication.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants