The problem of using the Materials Project database and the Perovskite database #31

Open
yuyouyu32 opened this issue Oct 5, 2021 · 3 comments

Comments

@yuyouyu32

Hello, and thanks for your great work!
I am a computer science student, so I am not familiar with materials databases.
After reading your data.py code, I see that you read CIF files and extract everything you need from them.
However, I can't follow how the information is extracted from a CIF file. I would be very grateful if you could explain it briefly.
Also, your CIF files come from the COD database. How can I get CIF files from the Materials Project database and the Perovskite database in the same format as the COD ones?

@yuyouyu32
Author

yuyouyu32 commented Oct 5, 2021

In addition, I use the MPRester API (from pymatgen.ext.matproj import MPRester) to get information from the Materials Project database, but I cannot get CIF files this way.
As for the Perovskite database, which I open with ase.db.connect('cubic_perovskites.db'), I don't know how to get the same information as in the CIF files from the COD database.
I would be very grateful if you could answer my questions!

@txie-93
Owner

txie-93 commented Oct 5, 2021

@yuyouyu32 The CIF file format is described at https://en.wikipedia.org/wiki/Crystallographic_Information_File
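
As a rough illustration of what such a file contains, here is a minimal sketch using only the Python standard library. The CIF text and the helper read_simple_cif are made up for this example. It only handles single-value items and one whitespace-separated loop_ table, so a real parser (for example the one in pymatgen) should be used for actual data.

```python
# A tiny, hand-written CIF: cell parameters as "_tag value" items,
# followed by one loop_ table listing the atoms in fractional coordinates.
CIF_TEXT = """\
data_example
_cell_length_a 4.2
_cell_length_b 4.2
_cell_length_c 4.2
_cell_angle_alpha 90
_cell_angle_beta 90
_cell_angle_gamma 90
loop_
_atom_site_type_symbol
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
Cs 0.0 0.0 0.0
Cl 0.5 0.5 0.5
"""


def read_simple_cif(text):
    """Return (items, rows) for a very simple CIF (hypothetical helper).

    items maps single-value tags to their string values; rows is a list of
    dicts, one per line of the (single) loop_ table.
    """
    items, columns, rows = {}, [], []
    in_loop = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("data_"):
            continue
        if line == "loop_":
            in_loop = True
            continue
        if line.startswith("_"):
            parts = line.split(None, 1)
            if in_loop and len(parts) == 1:
                columns.append(parts[0])  # column header of the loop table
            else:
                in_loop = False
                items[parts[0]] = parts[1] if len(parts) > 1 else ""
        elif in_loop:
            rows.append(dict(zip(columns, line.split())))
    return items, rows


items, rows = read_simple_cif(CIF_TEXT)
print(items["_cell_length_a"], rows[1]["_atom_site_type_symbol"])
```

The cell lengths and angles define the lattice, and each row of the loop_ table gives one atom's element and fractional position inside that lattice.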

This function from pymatgen converts a CIF string to a Structure object: https://pymatgen.org/pymatgen.core.structure.html?highlight=structure#pymatgen.core.structure.IStructure.from_str
From that object you can get the coordinates and atom types as numpy arrays.

You should be able to find similar functions in the ASE documentation to read the Perovskite database: https://wiki.fysik.dtu.dk/ase/

Hope that this is helpful.

@yuyouyu32
Author

yuyouyu32 commented Oct 6, 2021

Thank you very much for your detailed answer. I have a few more questions; sorry to bother you again.

To be direct, I can't find the code in your data.py that reads the Materials Project database or the Perovskite database.
I can only find the function that reads CIF files, so I assume you obtained CIF files from both databases. But I don't know how to download CIF files with MPRester (from pymatgen.ext.matproj import MPRester),
and I don't know how to get CIF files from the cubic_perovskites.db file with ASE.

After reading your answer, I realized that you may be converting string data directly into CGCNN's input format. Could you share example code that reads data from the two databases (the Materials Project database and the Perovskite database) in a form that can be used directly for CGCNN training?
