These are just my stream-of-consciousness notes immediately post-meeting with @pmrv.
I'll come back and update it, and split it into two or more sensible actionable issues next week when I have more time.
Meanwhile, I wanted it available in light of my request (#181) to meet with both @pmrv and @jan-janssen so we can move towards a storage solution that satisfies everyone.
(De)serialization for remote processes
Change the bit of run that gets shipped off to other processes to include info about serializing the output
We'll probably also need to send off the fully scoped working directory
For performance, we'd ship out the run function and inputs, and ship back the outputs (not all of self)
Serialize to some temp file, then at the last minute move the file to the "expected" location
On the main process side, update your output, and either delete the temp
When loading a graph, if you find a node that deserializes with a running state, go look for the "expected" save file; if you find it, deserialize it and use this output data in your regular callback (including a full save, not just output, if requested; a checkpointing save, if requested; and firing output signals)
In this way submitting to an Executor process or a queue process should look very similar
With the exception that what gets sent to the queue process may also be instructions for using an Executor nested inside that process
This should provide compatibility with both queues and restarting partially executed runs!
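The temp-file-then-move idea and the load-time recovery check could look roughly like the sketch below. Everything here is an assumption for illustration: the file name `output.pckl`, the function names `run_and_save` and `recover_output`, and the use of plain `pickle` as the serializer are all hypothetical, not existing API.

```python
import os
import pickle
import tempfile
from pathlib import Path

# Hypothetical name for the "expected" save file inside the working directory
OUTPUT_FILE = "output.pckl"


def run_and_save(run_function, inputs, working_directory):
    """Remote side: run, serialize the output to a temp file, then move it into place."""
    working_directory = Path(working_directory)
    working_directory.mkdir(parents=True, exist_ok=True)
    output = run_function(**inputs)
    # Keep the temp file in the same directory so the final move stays on one filesystem
    fd, tmp_name = tempfile.mkstemp(dir=working_directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(output, f)
    # Last-minute move: the "expected" file only ever appears fully written
    os.replace(tmp_name, working_directory / OUTPUT_FILE)
    return output


def recover_output(working_directory):
    """Main side: for a node loaded in a running state, look for the "expected" file.

    Returns (found, output) so that a legitimate output of None stays distinguishable.
    """
    expected = Path(working_directory) / OUTPUT_FILE
    if not expected.exists():
        return False, None
    with open(expected, "rb") as f:
        return True, pickle.load(f)
```

If `recover_output` reports `found`, the node would pass the output on to its regular callback; otherwise the process is presumably still running or has died, and the node stays in its running state.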
Storage implementation
The second topic was storage details.
One thing I forgot to mention that I liked about tinybase's storage solution was that it has a generic abstraction that would allow us to plug in non-hdf5 back ends.
Marvin was very amenable to the idea of using __get/setstate__ natively (where it exists) to do the re/store process, so that we don't necessarily need to define those custom functions.
The rough idea now is to use __getstate__ recursively to get key-value pairs and assign them to the generic storage instance with storage["key"] = value.
This can be wrapped in a try-except clause, so that when the assignment fails we (cloud?)pickle the object and try again, which would circumvent the current h5io failure for, e.g., ASE Calculator objects.
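A minimal sketch of the recursive `__getstate__` walk with the pickle fallback, under some loud assumptions: `PrimitiveStorage` is a hypothetical stand-in for the generic storage (it only mimics a back end that rejects complex values), the `/`-joined keys are my own convention, and stdlib `pickle` stands in for whichever of pickle or cloudpickle we settle on.

```python
import pickle


class PrimitiveStorage(dict):
    """Hypothetical back end that, like a strict file format, only takes simple values."""

    ALLOWED = (int, float, str, bool, bytes, type(None))

    def __setitem__(self, key, value):
        if not isinstance(value, self.ALLOWED):
            raise TypeError(f"Cannot store {type(value).__name__} at {key!r}")
        super().__setitem__(key, value)


def get_state(obj):
    """Return the state of obj as a dict, or None if obj should be treated as a leaf."""
    getter = getattr(obj, "__getstate__", None)
    # Before Python 3.11 plain objects have no default __getstate__, so fall back to __dict__
    state = getter() if getter is not None else getattr(obj, "__dict__", None)
    return state if isinstance(state, dict) else None


def store(obj, storage, key):
    """Recursively assign the state of obj into storage under key."""
    state = get_state(obj)
    if state is not None:
        for name, value in state.items():
            store(value, storage, f"{key}/{name}")
        return
    try:
        storage[key] = obj
    except TypeError:
        # The back end refused the value: pickle it and try again
        storage[key] = pickle.dumps(obj)
```

Only the leaves that the back end refuses end up pickled, so most of the stored data stays browsable.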
Replacing _restore with something based on __setstate__ is a little more opaque to me, because I'm still not totally clear how h5io is getting me my initial classes, but maybe it will make more sense when I go back and look there.
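For the restore direction, one option is to do roughly what pickle does: record where the class lives, re-import it, make an empty instance without calling `__init__`, and hand it the state. To be clear, this is not a description of how h5io finds the classes; `describe` and `rebuild` are hypothetical names, and the record layout is an assumption.

```python
import importlib


def describe(obj):
    """Capture enough information to rebuild obj later: class location plus state."""
    cls = type(obj)
    getter = getattr(obj, "__getstate__", None)
    # Before Python 3.11 plain objects have no default __getstate__, so fall back to __dict__
    state = getter() if getter is not None else obj.__dict__
    return {"module": cls.__module__, "qualname": cls.__qualname__, "state": state}


def rebuild(record):
    """Re-import the class and restore the state, preferring __setstate__ where it exists."""
    cls = importlib.import_module(record["module"])
    for part in record["qualname"].split("."):
        cls = getattr(cls, part)
    obj = cls.__new__(cls)  # Skip __init__, as pickle does
    state = record["state"]
    if hasattr(obj, "__setstate__"):
        obj.__setstate__(state)
    elif state is not None:
        obj.__dict__.update(state)
    return obj
```

This only works for classes that are importable by module and qualified name, which is the same restriction pickle has.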
Maybe we could also work in something extra like before- and after-storage hooks, e.g. for storing extra-state data like version controlling, so we can fail hard if we're deserializing the wrong version or whatever.
But that's a bell-and-whistle.
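If we do want the hooks, a sketch of the version-check case might look like this. All names (`Versioned`, `before_storage`, `after_storage`, `check_version`, the `_version` and `_complete` keys) are made up for illustration, and a plain dict stands in for the storage.

```python
class VersionMismatchError(RuntimeError):
    """Raised when stored data was written by a different version."""


class Versioned:
    """Hypothetical mixin providing before- and after-storage hooks."""

    storage_version = "0.1"

    def before_storage(self, storage):
        # Extra-state data: record the version before any regular state is written
        storage["_version"] = self.storage_version

    def after_storage(self, storage):
        # Mark the save as finished so partial writes are detectable
        storage["_complete"] = True

    def check_version(self, storage):
        found = storage.get("_version")
        if found != self.storage_version:
            raise VersionMismatchError(
                f"Stored version {found!r} does not match {self.storage_version!r}"
            )


def save(obj, storage):
    obj.before_storage(storage)
    for key, value in vars(obj).items():
        storage[key] = value
    obj.after_storage(storage)


def load(cls, storage):
    obj = cls.__new__(cls)
    obj.check_version(storage)  # Fail hard before touching any state
    obj.__dict__.update(
        {k: v for k, v in storage.items() if not k.startswith("_")}
    )
    return obj
```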