Brainstorm: More transparency with regards to default parameters #1038

Open
Andrew-S-Rosen opened this issue Nov 2, 2024 · 17 comments

@Andrew-S-Rosen
Member

Andrew-S-Rosen commented Nov 2, 2024

I have a big picture question I wanted to kick off here. One of the challenges I have with using atomate2, particularly with new students who are still learning DFT, is that the default parameters for a given job are not particularly clear. The user has to dig through several subclasses to eventually arrive at an answer, which can be daunting. At the same time, hard-coding these into the documentation isn't ideal because then there is likely to be a mismatch if one is changed but not the other.

I find it pretty critical to know, in advance, what parameters are being specified. Does anyone here have suggestions on how to improve this part of the user experience? Of course, one of the challenges is that some settings may be dynamic, depending on the structure itself. Maybe adding a section to the docs on how to inspect the instantiated class for its parameters is enough?
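
As a rough illustration of what "inspecting the instantiated class" could look like, here is a minimal sketch that walks a nested dataclass instance and lists every field with its current value. The classes `DemoInputSetGenerator` and `DemoRelaxMaker`, the helper `flatten_parameters`, and all values are hypothetical stand-ins, not atomate2 code; the sketch only assumes that makers and their input set generators are dataclasses.

```python
from dataclasses import dataclass, field, fields, is_dataclass


# Hypothetical stand-ins: these are NOT atomate2 classes. They only mimic
# a maker that holds a nested, dataclass-based input set generator.
@dataclass
class DemoInputSetGenerator:
    user_incar_settings: dict = field(
        default_factory=lambda: {"EDIFFG": -0.02, "ISMEAR": 0}  # illustrative values
    )
    auto_kspacing: bool = True


@dataclass
class DemoRelaxMaker:
    name: str = "relax"
    input_set_generator: DemoInputSetGenerator = field(
        default_factory=DemoInputSetGenerator
    )


def flatten_parameters(obj, prefix=""):
    """Return {dotted.field.name: value} for a (possibly nested) dataclass instance."""
    flat = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        key = f"{prefix}{f.name}"
        if is_dataclass(value) and not isinstance(value, type):
            # Recurse into nested dataclass instances such as the generator.
            flat.update(flatten_parameters(value, prefix=f"{key}."))
        else:
            flat[key] = value
    return flat


params = flatten_parameters(DemoRelaxMaker())
for key, value in params.items():
    print(f"{key} = {value!r}")
```

This only surfaces static defaults; settings that depend on the structure would still need the input set to be generated for a concrete structure.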

@JaGeo
Member

JaGeo commented Nov 2, 2024

Yes, I agree with your statement but I don't have a clear way forward. It also gets very complicated in combination with custodian. We should definitely think about it.

@Andrew-S-Rosen
Member Author

I think we can exclude Custodian from this part of the discussion because, at least for me, I am mostly interested in knowing how a given job is set up to begin with. For instance, "what is the force tolerance used in this job? Is it too low for me? Too high?" is a question based only on atomate2 logic.

That said, I also don't have a clear way forward... :(

@JaGeo
Member

JaGeo commented Nov 2, 2024

Mh, not sure about custodian. I am always wondering about consistent calculations (e.g., for phonons, elastic constants). Algo/sigma changes can make this hard. Thus, we should maybe at least mention the logic there as well.

@JaGeo
Member

JaGeo commented Nov 2, 2024

Could we show for one input set/generator how it works and add documentation on how to get computations compatible with MP Inputs? I feel this would already cover a big part.

Or can we add something to the generators/makers that lets us produce documentation of the input settings automatically? (E.g., this will be the INCAR, this will happen to metals specifically, etc.)

@utf
Member

utf commented Nov 2, 2024

Agreed this is a problem. I'd hoped this would be made easier by having input sets that are not subclassed but what's happened is we now just have loads of makers that override the input sets there instead.

One technological solution would be to add a custom pipeline into building the docs that takes a job maker, initialises it, generates an input set with a dummy structure, and somehow inserts this into the website documentation. That said, it's not clear to me how much this would help. Personally, I rarely look at the built docs, just the source code.

Edit: should have read @JaGeo's message properly first - seems like we're on the same page.
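
A docs-build step along the lines suggested above might look something like the following sketch: instantiate a maker, generate its input set for a dummy structure, and render the result as text that could be written into a docs page. Everything here is a hypothetical stand-in (`DemoInputSetGenerator`, `DemoStaticMaker`, `render_defaults_section`, the dict used as a "structure", and all values); it is not the atomate2 API or an existing docs plugin.

```python
from dataclasses import dataclass, field


@dataclass
class DemoInputSetGenerator:
    """Hypothetical generator: static defaults plus one structure-dependent rule."""

    defaults: dict = field(default_factory=lambda: {"ENCUT": 520, "EDIFF": 1e-5})

    def get_input_set(self, structure: dict) -> dict:
        settings = dict(self.defaults)
        # Illustrative dynamic rule only: the smearing choice depends on the structure.
        settings["ISMEAR"] = 1 if structure["is_metal"] else 0
        return settings


@dataclass
class DemoStaticMaker:
    name: str = "static"
    input_set_generator: DemoInputSetGenerator = field(
        default_factory=DemoInputSetGenerator
    )


def render_defaults_section(maker_cls, dummy_structure: dict) -> str:
    """Initialise the maker, build its inputs for a dummy structure, render as text."""
    maker = maker_cls()
    settings = maker.input_set_generator.get_input_set(dummy_structure)
    lines = [f"Default settings for {maker_cls.__name__} ({maker.name})", ""]
    lines += [f"{key} = {settings[key]}" for key in sorted(settings)]
    return "\n".join(lines)


dummy = {"formula": "Si", "is_metal": False}  # stand-in for a real structure object
section = render_defaults_section(DemoStaticMaker, dummy)
print(section)
```

Because the text is generated from the instantiated maker at build time, it cannot drift out of sync with the code the way hand-written tables can, although it only documents the settings for the chosen dummy structure.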

@JaGeo
Member

JaGeo commented Nov 2, 2024

@utf Can we maybe also make it obvious how to get the info from the makers directly?

@Andrew-S-Rosen
Member Author

Andrew-S-Rosen commented Nov 2, 2024

> One technological solution would be to add a custom pipeline into building the docs that takes a job maker, initialises it, generates an input set with a dummy structure, and somehow inserts this into the website documentation. That said, it's not clear to me how much this would help. Personally, I rarely look at the built docs, just the source code.

I think this would help a lot for many users. Sure, I also rarely look at the built docs right now, but that's because it doesn't have this kind of information in it. If it did have this information, I imagine I would rely on it quite a bit. Also, we are all very Python-proficient and have no problem digging through source code, but that's not the case for many. I'm thinking about this from the perspective of users who are not as code-savvy as us three.

It doesn't seem trivial to implement. That said, I think something along these lines is a reasonable idea!

> Can we maybe also make it obvious how to get the info from the makers directly?

The exact details would depend on the code (e.g. VASP vs. CP2K) of course, but I think in the short term this is probably the wisest thing to do. Maybe we can even have a single class method that compiles all the parameters together, so that the user doesn't need to fetch different attributes?

@QuantumChemist
Contributor

I would say that the best solution to providing students the necessary means of understanding what exactly is going on and what impact each parameter has is to improve and add more tutorials. Maybe even some guidance on how to dig through the source code?

@JaGeo
Member

JaGeo commented Nov 3, 2024

I personally think that it is a problem even if you know how to go through the code, as one can easily overlook something. Having this in the documentation would surely help already. I would then also refer people to the documentation if they had questions about it.

@JaGeo
Member

JaGeo commented Nov 3, 2024

@QuantumChemist of course, in an ideal world everyone should be perfectly trained. I am also sure that everyone involved in this thread is training their students and coworkers as much as they can. However, there will always be a learning curve, and being clearer about these settings will definitely help, also for the reason mentioned above. With the growing popularity of automation codes, we will also get many less experienced Python users. And we cannot train everyone on IDEs.

@hongyi-zhao

This comment was marked as off-topic.

@JaGeo
Member

JaGeo commented Nov 4, 2024

@hongyi-zhao thanks for pointing this out.

I would like to add to this thread that we are all doing as much as we can. However, as far as I know, none of us has explicit funding for atomate2 support in general. Thus, we always have to connect our contributions to a research project. There are many different obligations arising from such research projects, and documenting code is only one of them.

@Andrew-S-Rosen
Member Author

@hongyi-zhao: Yes, this thread is for brainstorming new ideas (without any guarantee of action in the immediate future). For bug reports, a new issue should be opened dedicated to that topic.

@hongyi-zhao
Contributor

hongyi-zhao commented Nov 5, 2024

I opened a new issue for my above posting.

@gpetretto
Contributor

I have also got similar questions from atomate's users, regarding how to know beforehand which inputs are going to be used at runtime. To address this I have provided a function that uses the input set generator in the Maker to generate the inputs. In this way a user can verify which parameters are effectively going to be used, even those generated dynamically based on the input structure. One option could be to add a similar function, or even a method on BaseVaspMaker, that takes a Structure as an input and generates the input set. In this way an inexperienced user would have a relatively quick way of generating the actual inputs and inspecting them before running the Flow.

Of course this would not be a replacement for a more detailed documentation, as already suggested.

To add one more point of view, I should say that final users (e.g. people who are not going to collaborate on the development at any stage and are not necessarily Python experts) often do not want to be bothered at all with the details of how the whole machinery works. After getting a basic understanding, they would rather just know the inputs that will be used for their calculations. So, while the training of students and collaborators is definitely valuable, there is also a whole class of users that will not consider inspecting the code or referring to very technical documentation as a viable option.
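
To make the idea of a "preview" method concrete, here is a minimal sketch in which `make()` and a separate `get_input_set()` method share the same input-generation step, so the preview returns exactly what would later be written to file. `DemoInputSetGenerator`, `DemoMaker`, the dict used as a structure, and the smearing rule are all hypothetical; this is not the actual BaseVaspMaker code, only an illustration of the pattern under those assumptions.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class DemoInputSetGenerator:
    """Hypothetical generator; the defaults and the ISMEAR rule are illustrative only."""

    user_settings: dict = field(default_factory=dict)

    def get_input_set(self, structure: dict) -> dict:
        settings = {"ENCUT": 520, "ISMEAR": 1 if structure["is_metal"] else 0}
        settings.update(self.user_settings)  # user overrides win over defaults
        return settings


@dataclass
class DemoMaker:
    input_set_generator: DemoInputSetGenerator = field(
        default_factory=DemoInputSetGenerator
    )

    def get_input_set(self, structure: dict) -> dict:
        """Preview: build the inputs for this structure without writing anything."""
        return self.input_set_generator.get_input_set(structure)

    def make(self, structure: dict, directory: str) -> Path:
        """Write the same inputs that get_input_set() previews."""
        path = Path(directory) / "INCAR"
        lines = [f"{k} = {v}" for k, v in self.get_input_set(structure).items()]
        path.write_text("\n".join(lines) + "\n")
        return path


maker = DemoMaker(DemoInputSetGenerator(user_settings={"ENCUT": 600}))

# Inspect the structure-dependent inputs before running anything.
metal_preview = maker.get_input_set({"formula": "Cu", "is_metal": True})
insulator_preview = maker.get_input_set({"formula": "MgO", "is_metal": False})
print(metal_preview)
print(insulator_preview)

# The written file matches the preview because both use the same code path.
with tempfile.TemporaryDirectory() as tmp:
    written = maker.make({"formula": "Cu", "is_metal": True}, tmp).read_text()
```

Routing `make()` through the preview method keeps the two from diverging, which is the main reason to expose it as a method rather than as a separate helper that re-implements the logic.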

@JaGeo
Member

JaGeo commented Nov 8, 2024

@gpetretto this function/method would be extremely valuable for atomate2. I would be very happy if you could provide the function at least within a PR. We might want to include its usage in the documentation and also use this to document the settings for some standard structures.

@gpetretto
Contributor

To be fair, it is nothing fancy: basically just the initial part of BaseVaspMaker.make(), returning the input set instead of writing it to file.
I will try to contribute this.
