This repository has been archived by the owner on Jan 2, 2025. It is now read-only.

Clean Up Interface to Build a Workflow Graph #20

Open
cwschultz88 opened this issue Sep 12, 2018 · 2 comments
Labels
enhancement New feature or request

Comments

@cwschultz88
Collaborator

Building connection and module objects is currently a pain. I want to rework the object interfaces so that building and executing workflows is as simple as possible.

This is a major project that needs to be done on the way to a 1.0 release.

@cwschultz88 cwschultz88 added the enhancement New feature or request label Sep 12, 2018
@jayqi
Contributor

jayqi commented Sep 12, 2018

One way to simplify the interface is to have many of the methods return invisible(self), so that calls can be chained.
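For illustration, here is a minimal sketch of that pattern. It assumes the classes are built with R6 (the `$new()` calls suggest so, but that is an assumption), and `ToyWorkflow` is a hypothetical stand-in, not the package's actual `DAGWorkflow`.

```r
library(R6)

# Hypothetical toy class used only to illustrate chaining;
# it is NOT the real DAGWorkflow implementation.
ToyWorkflow <- R6Class(
  "ToyWorkflow",
  public = list(
    name = NULL,
    modules = list(),
    connections = list(),
    initialize = function(name) {
      self$name <- name
    },
    addModules = function(modules) {
      self$modules <- c(self$modules, modules)
      # Returning self (invisibly) is what makes chaining possible
      invisible(self)
    },
    addConnections = function(connections) {
      self$connections <- c(self$connections, connections)
      invisible(self)
    }
  )
)

# Each call returns the workflow itself, so the next call can follow directly
wf <- ToyWorkflow$new(name = "wf")$addModules(list("m1", "m2"))$addConnections(list("c1"))
```

Using `invisible()` rather than a plain `self` keeps the object from being printed at the console when a method is called on its own for its side effect.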

So concretely, here's the setup from test_execution.R:

workflow1 <- DAGWorkflow$new(name = "workflow1") # Dependency -- assumes working WorkflowDAG, etc.
module1_1 <- PackageFunctionModule$new(name = "module1_1", fun = "rnorm", package = "stats")
module2_1 <- PackageFunctionModule$new(name = "module2_1", fun = "rnorm", package = "stats")
module3_1 <- PackageFunctionModule$new(name = "module3_1", fun = "rnorm", package = "stats")
module4_1 <- CustomFunctionModule$new(name = "module4_1", fun = function(a, b, c) {cat(a + b + c, file = file.path(workingDir, "workflow1_output.txt"))})
connection1_1 <- DirectedConnection$new(name = "connection1_1", headModule = module1_1, tailModule = module4_1, inputArgument = c('a'))
connection2_1 <- DirectedConnection$new(name = "connection2_1", headModule = module2_1, tailModule = module4_1, inputArgument = c('b'))
connection3_1 <- DirectedConnection$new(name = "connection3_1", headModule = module3_1, tailModule = module4_1, inputArgument = c('c'))
workflow1$addModules(list(module1_1
                          , module2_1
                          , module3_1
                          , module4_1))
workflow1$addConnections(list(connection1_1
                              , connection2_1
                              , connection3_1))

If the methods can be chained, you could write the same setup like this. You could still use intermediate variables partway through if you wanted, but you would no longer need to keep writing workflow1 over and over. Note that, with no intermediate variables, the connections here refer to modules by name rather than by object.

workflow1 <- (DAGWorkflow$new(name = "workflow1")
              $addModules(list(
                  PackageFunctionModule$new(name = "module1_1", fun = "rnorm", package = "stats"),
                  PackageFunctionModule$new(name = "module2_1", fun = "rnorm", package = "stats"),
                  PackageFunctionModule$new(name = "module3_1", fun = "rnorm", package = "stats"),
                  CustomFunctionModule$new(name = "module4_1", fun = function(a, b, c) {cat(a + b + c, file = file.path(workingDir, "workflow1_output.txt"))})
              ))
              $addConnections(list(
                  DirectedConnection$new(name = "connection1_1", headModule = "module1_1", tailModule = "module4_1", inputArgument = c('a')),
                  DirectedConnection$new(name = "connection2_1", headModule = "module2_1", tailModule = "module4_1", inputArgument = c('b')),
                  DirectedConnection$new(name = "connection3_1", headModule = "module3_1", tailModule = "module4_1", inputArgument = c('c'))
              ))
)
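Since the chained version passes module names as strings to headModule and tailModule, something would have to resolve those names to module objects. One possible way, sketched here under the assumption that the classes are R6-based, is to do the lookup inside addConnections. `NameResolvingWorkflow` and the plain lists standing in for modules and connections are hypothetical, not the package's actual classes or behavior.

```r
library(R6)

# Hypothetical toy class; the real DAGWorkflow may resolve names
# differently, or may not support name-based references at all.
NameResolvingWorkflow <- R6Class(
  "NameResolvingWorkflow",
  public = list(
    modules = list(),
    connections = list(),
    addModules = function(modules) {
      # Index modules by name so they can be looked up later
      for (m in modules) self$modules[[m$name]] <- m
      invisible(self)
    },
    addConnections = function(connections) {
      for (conn in connections) {
        for (end in c("headModule", "tailModule")) {
          ref <- conn[[end]]
          # A character reference is swapped for the registered module
          if (is.character(ref)) {
            if (is.null(self$modules[[ref]])) stop("Unknown module: ", ref)
            conn[[end]] <- self$modules[[ref]]
          }
        }
        self$connections[[conn$name]] <- conn
      }
      invisible(self)
    }
  )
)

# Plain lists stand in for module and connection objects in this sketch
wf <- NameResolvingWorkflow$new()$
  addModules(list(list(name = "module1_1"), list(name = "module4_1")))$
  addConnections(list(
    list(name = "connection1_1", headModule = "module1_1",
         tailModule = "module4_1", inputArgument = "a")
  ))
```

With this approach, modules have to be added before any connection that names them, which the chained ordering above already satisfies.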

@jameslamb
Collaborator

I am merely an observer here, but want to be on record as saying that I like this.
