Separate executor configuration from computation #389
Comments
Interesting. I'm not convinced we should completely fold the executor into the spec when they represent different choices (runtime vs storage layer + config)... But then again I guess if the "spec" means just "all configuration needed to run any given workload" then it would make sense.
You're talking about a documentation issue here now right? I agree it feels silly to have the same workload examples documented twice for two executors. It also makes it much more likely that something in the docs gets out of date without us noticing.
I like the idea of a config file to set the defaults - I expect users will in practice set this up once and then never touch it again. Which is exactly what we want - them never to have to worry about configuration once they start doing science work. We should just think about reproducibility / clarity if the options are being read from another file rather than from within the notebook. Maybe all options should be printed once the computation begins?
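One way to picture the "print all options once the computation begins" idea is the sketch below. Everything in it is an assumption made for illustration: `format_options`, `compute`, the logger name, and the option names are hypothetical and are not part of cubed.

```python
import logging

logger = logging.getLogger("example.compute")


def format_options(options, sources):
    """Render each resolved option together with where its value came from."""
    return "\n".join(
        f"{key} = {options[key]!r} (from {sources.get(key, 'default')})"
        for key in sorted(options)
    )


def compute(fn, options, sources):
    # Log the effective configuration once, before any work starts,
    # so a notebook run records what was picked up from a config file.
    logger.info("Running with options:\n%s", format_options(options, sources))
    return fn()


# Hypothetical resolved options and their origins.
options = {"executor_name": "lithops", "allowed_mem": 2_000_000_000}
sources = {"executor_name": "config file"}

report = format_options(options, sources)
result = compute(lambda: "done", options, sources)
```

Recording the source of each value next to the value itself is one way to address the reproducibility concern, since it shows which settings came from outside the notebook.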
That's how I think about "spec". Having the ability to pass runtime parameters at the time
Yes, quite.
💯
That's a good idea.
The way that the examples are currently written means that we have executor configuration mixed up with the computation itself. For example, notice how the executor and its parameters (`runtime`, `runtime_memory`) are found at the beginning and end of this example: cubed/examples/lithops/aws-lambda/lithops-add-asarray.py (lines 8 to 24 in 738b70d).
We could improve the separation by setting the executor on the spec, so everything is set up in one go, like this:
This is better, but it still means that every example is duplicated for every executor (Lithops AWS Lambda, Lithops GCF, Modal AWS, Modal GCP, Coiled, Dataflow, etc). While some duplication is OK for examples, it does feel excessive.
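As a rough illustration of setting the executor on the spec, here is a minimal sketch. `Spec`, `Executor`, and `compute` below are hypothetical stand-ins written for this example, not cubed's actual classes, and the bucket and runtime names are placeholders.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


# Hypothetical stand-ins for illustration only; these are not cubed's classes.
@dataclass
class Executor:
    name: str
    options: Dict[str, Any] = field(default_factory=dict)

    def run(self, fn: Callable[[], Any]) -> Any:
        # A real executor would dispatch the work to a remote runtime.
        return fn()


@dataclass
class Spec:
    work_dir: str
    allowed_mem: int
    executor: Optional[Executor] = None  # runtime choice held next to storage config


def compute(fn: Callable[[], Any], spec: Spec, executor: Optional[Executor] = None) -> Any:
    # An explicitly passed executor wins; otherwise fall back to the spec's.
    chosen = executor or spec.executor
    if chosen is None:
        raise ValueError("no executor configured")
    return chosen.run(fn)


# All configuration lives in one place, ahead of the computation.
spec = Spec(
    work_dir="s3://example-bucket/tmp",  # placeholder location
    allowed_mem=2_000_000_000,
    executor=Executor("lithops", {"runtime": "example-runtime", "runtime_memory": 2000}),
)

# The computation itself no longer mentions the executor.
result = compute(lambda: 1 + 2, spec)
```

Keeping an optional `executor` argument on `compute` means runtime parameters could still be passed at call time, with the spec acting as the default.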
To improve this further we could use something like donfig to allow the spec to be read from a config file, such as this one:
Then the example would look like this (note that the spec object disappears, and is automatically picked up from config instead):
It is run as follows, assuming the config file is in the lithops/aws-lambda directory:
CUBED_CONFIG=$(pwd)/lithops/aws-lambda python add-asarray.py
Note that the existing way of using a spec object programmatically would still work; this is just another way to configure things.
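The lookup order described above (built-in defaults, then a config directory named by `CUBED_CONFIG`, then anything set programmatically) could work roughly as in the following sketch. It is not donfig and not cubed: `load_config`, `resolve_spec`, the default values, and the option names are assumptions, and JSON stands in for the config file format so that the example only needs the standard library.

```python
import json
import os
import tempfile
from pathlib import Path

# Hypothetical built-in defaults; not cubed's real defaults.
DEFAULTS = {"spec": {"work_dir": None, "allowed_mem": 200_000_000, "executor_name": "local"}}


def load_config(env_var="CUBED_CONFIG"):
    """Merge every *.json file in the directory named by env_var over DEFAULTS."""
    config = {"spec": dict(DEFAULTS["spec"])}
    config_dir = os.environ.get(env_var)
    if config_dir:
        for path in sorted(Path(config_dir).glob("*.json")):
            config["spec"].update(json.loads(path.read_text()).get("spec", {}))
    return config


def resolve_spec(explicit=None, env_var="CUBED_CONFIG"):
    """Explicit options override the file, which overrides the defaults."""
    spec = load_config(env_var)["spec"]
    spec.update(explicit or {})
    return spec


# Demo: write a throwaway config directory and point the env var at it.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "cubed.json").write_text(
        json.dumps({"spec": {"work_dir": "s3://example-bucket/tmp", "executor_name": "lithops"}})
    )
    os.environ["CUBED_CONFIG"] = tmp
    from_file = resolve_spec()
    overridden = resolve_spec({"executor_name": "local"})
    os.environ.pop("CUBED_CONFIG")
```

Here `from_file` takes the executor from the config directory, while `overridden` shows a programmatic setting taking precedence over the file.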
(It would also make it possible to implement #310 by using a config context manager to set the executor to one that raises if `compute` is called.)
Any thoughts @TomNicholas?