Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to make SPHInX input writer independent of pyiron while using DataContainer #1274

Open
samwaseda opened this issue Dec 30, 2023 · 6 comments
Labels
question Further information is requested

Comments

@samwaseda
Copy link
Member

I'm going through hell in this PR.

In short, the problem is that SPHInX input is highly nested and currently strongly dependent on DataContainer. More precisely, it sort of needs the ability to have multiple appearance of a single key (which dict doesn't allow) and recursive HDF saving. I'm now wondering whether I should rewrite Group in SPHInX to make it derive from dict, or if there's a way to still use DataContainer. Maybe I'm also totally missing something super simple, so I'd be thankful for any comments.

@samwaseda samwaseda added the question Further information is requested label Dec 30, 2023
@liamhuber
Copy link
Member

It is of interest to me/tentatively my plan to extract DataContainer as a stand-alone tool. If that would also solve your problem here, it would make me more likely to pursue that and we could work together to get it done quicker.

@samwaseda
Copy link
Member Author

Ah yeah that would definitely solve the problem. Is it more difficult than just copy and paste?

@jan-janssen
Copy link
Member

I am a bit confused about the recursive HDF part. I thought the idea of separating the input writer into a separate module was for this newly separated module to be independent of the data storage. While currently HDF is our preferred storage format I do not think we want to require the standalone modules to already have a HDF interface.

@samwaseda
Copy link
Member Author

I am a bit confused about the recursive HDF part. I thought the idea of separating the input writer into a separate module was for this newly separated module to be independent of the data storage. While currently HDF is our preferred storage format I do not think we want to require the standalone modules to already have a HDF interface.

Yeah after I wrote it I also realized that it wasn't really part of what should be done, so you are right about it. But the general point remains: Whether to rewrite parts of DataContainer in the parser or to import them somehow.

@niklassiemer
Copy link
Member

In tinybase @pmrv introduced an abstraction layer for storage. He can afterwards use different storage backends to 'store' the information in hdf or in a plain dict if I recall correctly. As such, the DataContainer could be separated in such an abstract version with some serialize method that is able to take some storage interface and populate it.

@liamhuber
Copy link
Member

Ah yeah that would definitely solve the problem. Is it more difficult than just copy and paste?

Only marginally; we want to make sure we preserve the git history. This isn't hard though.

More complex is just (a) thinking about what exactly we want in the new package, e.g. Niklas' comments, and (b) actually removing the extracted stuff from their original locations and replacing them with the new extracted version. I don't think these things are super complicated either, but they're reaching beyond copy and paste.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants