Commit 5d14bb7

Add files via upload
1 parent f4684dd commit 5d14bb7

24 files changed, +3519 -0 lines changed

OpenMat/Matformer/README.md

+70
@@ -0,0 +1,70 @@
# Periodic Graph Transformers for Crystal Material Property Prediction

<!-- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/MinkaiXu/GeoDiff/blob/main/LICENSE) -->

[[OpenReview](https://openreview.net/forum?id=pqCT3L-BU9T)] [[arXiv](https://arxiv.org/abs/2209.11807)] [[Code](https://github.com/YKQ98/Matformer)]

The official implementation of Periodic Graph Transformers for Crystal Material Property Prediction (NeurIPS 2022).

![cover](assets/matformer_graph.png)
![cover](assets/matformer.png)

## Dataset

### The Materials Project Dataset
We provide benchmark results for previous works, including CGCNN, SchNet, MEGNet, GATGNN, and ALIGNN, on the Materials Project dataset.

In particular, for the formation energy and band gap tasks, we follow ALIGNN and use the same training, validation, and test sets, containing 60000, 5000, and 4239 crystals, respectively. For the bulk moduli and shear moduli tasks, we follow GATGNN, the recent state-of-the-art method for these two tasks, and use the same training, validation, and test sets, containing 4664, 393, and 393 crystals, respectively. For shear moduli, one validation sample is removed because of its negative GPa value. We either directly use the publicly available code from the authors or re-implement models based on their official code and configurations to produce the results.

### JARVIS dataset
We also provide benchmark results for previous works, including CGCNN, SchNet, MEGNet, GATGNN, and ALIGNN, on the JARVIS dataset.

JARVIS is a newly released database proposed by Choudhary et al. For the JARVIS dataset, we follow ALIGNN and use the same training, validation, and test sets. We evaluate Matformer on five important crystal property tasks: formation energy, bandgap(OPT), bandgap(MBJ), total energy, and Ehull. The training, validation, and test sets contain 44578, 5572, and 5572 crystals for formation energy, total energy, and bandgap(OPT). The numbers are 44296, 5537, and 5537 for Ehull, and 14537, 1817, and 1817 for bandgap(MBJ). The evaluation metric is test MAE. The results for CGCNN and CFID are taken from ALIGNN; the other baseline results are obtained from retrained models.
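
As a quick sanity check on the JARVIS numbers quoted above, the sketch below computes each split's share of the total. The sizes are copied from the text; the names `JARVIS_SPLITS` and `split_fractions` are hypothetical helpers introduced only for illustration and are not part of the repository. Under these numbers, every task comes out at roughly an 80/10/10 partition.

```python
# Split sizes (train, validation, test) as quoted in the README text.
# JARVIS_SPLITS and split_fractions are illustrative names, not repository code.
JARVIS_SPLITS = {
    "formation energy / total energy / bandgap(OPT)": (44578, 5572, 5572),
    "Ehull": (44296, 5537, 5537),
    "bandgap(MBJ)": (14537, 1817, 1817),
}


def split_fractions(sizes):
    """Return each split's share of the total number of crystals."""
    total = sum(sizes)
    return tuple(n / total for n in sizes)


for task, sizes in JARVIS_SPLITS.items():
    train, val, test = split_fractions(sizes)
    print(
        f"{task}: total={sum(sizes)} "
        f"train={train:.3f} val={val:.3f} test={test:.3f}"
    )
```
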


## Benchmarked results

### The Materials Project Dataset
![cover](assets/mp.png)
### JARVIS dataset
![cover](assets/jarvis.png)
## Training and Prediction

You can train and test the model with the following commands:

```bash
conda create --name matformer python=3.10
conda activate matformer
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
conda install pyg -c pyg
pip install jarvis-tools==2022.9.16
# setup.py needs a command to run; "install" is assumed here
python setup.py install
# Training Matformer for the Materials Project (run from the repository root)
cd matformer/scripts/mp
python train.py
# Training Matformer for JARVIS (run from the repository root)
cd matformer/scripts/jarvis
python train.py
```

## Efficiency
![cover](assets/efficient.png)

## Citation
Please cite our paper if you find the code helpful or if you want to use the benchmark results of the Materials Project and JARVIS. Thank you!
```
@article{yan2022periodic,
  title={Periodic Graph Transformers for Crystal Material Property Prediction},
  author={Yan, Keqiang and Liu, Yi and Lin, Yuchao and Ji, Shuiwang},
  journal={arXiv preprint arXiv:2209.11807},
  year={2022}
}
```

## Acknowledgement

This repo is built upon the [[codebase]](https://github.com/usnistgov/alignn) of the previous work ALIGNN. Thank you very much for the excellent codebase.

## Contact

If you have any questions, please contact me at [email protected].

OpenMat/Matformer/assets/jarvis.png

104 KB

OpenMat/Matformer/assets/mp.png

118 KB
+1
@@ -0,0 +1 @@
1+

OpenMat/Matformer/matformer/config.py

+193
@@ -0,0 +1,193 @@
"""Pydantic model for default configuration and validation.

Implementation based on the template of ALIGNN.
"""

import subprocess
from typing import Optional, Union
import os
from pydantic import root_validator

# from pydantic import Field, root_validator, validator
from pydantic.typing import Literal
from matformer.utils import BaseSettings
from matformer.models.pyg_att import MatformerConfig

# from typing import List

# Record the current git commit hash so that runs can be traced to a code version.
try:
    VERSION = (
        subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()
    )
except Exception:
    # Not a git checkout, or git is unavailable.
    VERSION = "NA"


# Node feature dimensionality for each supported atom featurization.
FEATURESET_SIZE = {"basic": 11, "atomic_number": 1, "cfid": 438, "cgcnn": 92}


TARGET_ENUM = Literal[
    "formation_energy_peratom",
    "optb88vdw_bandgap",
    "bulk_modulus_kv",
    "shear_modulus_gv",
    "mbj_bandgap",
    "slme",
    "magmom_oszicar",
    "spillage",
    "kpoint_length_unit",
    "encut",
    "optb88vdw_total_energy",
    "epsx",
    "epsy",
    "epsz",
    "mepsx",
    "mepsy",
    "mepsz",
    "max_ir_mode",
    "min_ir_mode",
    "n-Seebeck",
    "p-Seebeck",
    "n-powerfact",
    "p-powerfact",
    "ncond",
    "pcond",
    "nkappa",
    "pkappa",
    "ehull",
    "exfoliation_energy",
    "dfpt_piezo_max_dielectric",
    "dfpt_piezo_max_eij",
    "dfpt_piezo_max_dij",
    "gap pbe",
    "e_form",
    "e_hull",
    "energy_per_atom",
    "formation_energy_per_atom",
    "band_gap",
    "e_above_hull",
    "mu_b",
    "bulk modulus",
    "shear modulus",
    "elastic anisotropy",
    "U0",
    "HOMO",
    "LUMO",
    "R2",
    "ZPVE",
    "omega1",
    "mu",
    "alpha",
    "homo",
    "lumo",
    "gap",
    "r2",
    "zpve",
    "U",
    "H",
    "G",
    "Cv",
    "A",
    "B",
    "C",
    "all",
    "target",
    "max_efg",
    "avg_elec_mass",
    "avg_hole_mass",
    "_oqmd_band_gap",
    "_oqmd_delta_e",
    "_oqmd_stability",
    "edos_up",
    "pdos_elast",
    "bandgap",
    "energy_total",
    "net_magmom",
    "b3lyp_homo",
    "b3lyp_lumo",
    "b3lyp_gap",
    "b3lyp_scharber_pce",
    "b3lyp_scharber_voc",
    "b3lyp_scharber_jsc",
    "log_kd_ki",
    "max_co2_adsp",
    "min_co2_adsp",
    "lcd",
    "pld",
    "void_fraction",
    "surface_area_m2g",
    "surface_area_m2cm3",
    "indir_gap",
    "f_enp",
    "final_energy",
    "energy_per_atom",
]


class TrainingConfig(BaseSettings):
    """Training config defaults and validation."""

    version: str = VERSION

    # dataset configuration
    dataset: Literal[
        "dft_3d",
        "megnet",
    ] = "dft_3d"
    target: TARGET_ENUM = "formation_energy_peratom"
    atom_features: Literal["basic", "atomic_number", "cfid", "cgcnn"] = "cgcnn"
    neighbor_strategy: Literal["k-nearest", "voronoi", "pairwise-k-nearest"] = "k-nearest"
    id_tag: Literal["jid", "id", "_oqmd_entry_id"] = "jid"

    # logging configuration

    # training configuration
    random_seed: Optional[int] = 123
    classification_threshold: Optional[float] = None
    n_val: Optional[int] = None
    n_test: Optional[int] = None
    n_train: Optional[int] = None
    train_ratio: Optional[float] = 0.8
    val_ratio: Optional[float] = 0.1
    test_ratio: Optional[float] = 0.1
    target_multiplication_factor: Optional[float] = None
    epochs: int = 300
    batch_size: int = 64
    weight_decay: float = 0
    learning_rate: float = 1e-2
    filename: str = "sample"
    warmup_steps: int = 2000
    criterion: Literal["mse", "l1", "poisson", "zig"] = "mse"
    optimizer: Literal["adamw", "sgd"] = "adamw"
    scheduler: Literal["onecycle", "none", "step"] = "onecycle"
    pin_memory: bool = False
    save_dataloader: bool = False
    write_checkpoint: bool = True
    write_predictions: bool = True
    store_outputs: bool = True
    progress: bool = True
    log_tensorboard: bool = False
    standard_scalar_and_pca: bool = False
    use_canonize: bool = True
    num_workers: int = 2
    cutoff: float = 8.0
    max_neighbors: int = 12
    keep_data_order: bool = False
    distributed: bool = False
    n_early_stopping: Optional[int] = None  # typically 50
    output_dir: str = os.path.abspath(".")  # defaults to the current working directory
    matrix_input: bool = False
    pyg_input: bool = False
    use_lattice: bool = False
    use_angle: bool = False

    # model configuration
    model = MatformerConfig(name="matformer")
    # Debug output: runs once, when the class body is evaluated at import time.
    print(model)

    @root_validator()
    def set_input_size(cls, values):
        """Automatically configure node feature dimensionality."""
        # Look up the input width that matches the chosen atom featurization.
        values["model"].atom_input_features = FEATURESET_SIZE[
            values["atom_features"]
        ]

        return values
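
The `set_input_size` validator above derives the model's input width from the chosen `atom_features`. The sketch below reproduces that pattern with standard-library dataclasses so it can run without pydantic or the `matformer` package. `ModelSketch` and `TrainingSketch` are hypothetical stand-ins, not the repository's classes, and the explicit membership check only approximates what the `Literal` annotation enforces in the original.

```python
from dataclasses import dataclass, field

# Same mapping as in config.py: atom featurization name -> feature width.
FEATURESET_SIZE = {"basic": 11, "atomic_number": 1, "cfid": 438, "cgcnn": 92}


@dataclass
class ModelSketch:
    """Hypothetical stand-in for MatformerConfig (only the field used here)."""

    name: str = "matformer"
    atom_input_features: int = 0


@dataclass
class TrainingSketch:
    """Hypothetical stand-in for TrainingConfig."""

    atom_features: str = "cgcnn"
    model: ModelSketch = field(default_factory=ModelSketch)

    def __post_init__(self):
        # Reject unknown feature sets, then derive the model input width,
        # mirroring what set_input_size does after field validation.
        if self.atom_features not in FEATURESET_SIZE:
            raise ValueError(f"unknown atom_features: {self.atom_features!r}")
        self.model.atom_input_features = FEATURESET_SIZE[self.atom_features]


print(TrainingSketch().model.atom_input_features)                      # 92
print(TrainingSketch(atom_features="cfid").model.atom_input_features)  # 438
```

Deriving the width in one place means the featurization name and the model's input dimension cannot drift apart when only one of them is changed in a config file.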
