## Contributors: Ryan DeFever, Colin Targonski, Steven Hall
## Sarupria Research Group, Smith Research Group
## Clemson University
## 2019 Jun 22
--------------------------------------------------
## PLEASE CITE OUR WORK IF YOU USE THIS IN YOUR RESEARCH
Chem. Sci., 2019, 10, 7503-7515
https://doi.org/10.1039/C9SC02097G
--------------------------------------------------
## REQUIREMENTS:
- Python 3 [tested with 3.6.6]
- TensorFlow [tested with 1.12.0]
- Cython [tested with 0.29.3]
- MDAnalysis (only required for reading .gro/.xtc files) [tested with 0.19.0]
--------------------------------------------------
--------------------------------------------------
## OVERVIEW:
scripts/ --> Example scripts for training and evaluation
models/ --> PointNet model in TensorFlow
utils/ --> Definition of datacontainer for PointNet
datasets/ --> Example training data and example trajectory
mda_custom/ --> Custom edits of MDAnalysis neighbor search
--------------------------------------------------
--------------------------------------------------
## RELATED WORK:
This software would not be possible without the work of previous
researchers. The neighbor search functionality is adapted from
MDAnalysis (https://mdanalysis.org). The PointNet architecture is taken
from arXiv:1612.00593. Also see http://stanford.edu/~rqi/pointnet/
and https://github.com/charlesq34/pointnet.
--------------------------------------------------
## COMPILING CUSTOM MDANALYSIS NEIGHBOR SEARCH:
cd mda_custom/nsgrid/
python setup.py build_ext --inplace
--------------------------------------------------
--------------------------------------------------
## EXAMPLES:
### TRAINING POINTNET:
python scripts/train.py --dataset datasets/lj-r2.0_scaled_shuffled_equal_samples.npy \
    --labels datasets/lj-r2.0_scaled_shuffled_equal_labels.npy \
    --weights PATH_TO_SAVE_WEIGHTS/weights
Comments:
- See prepare_train.py for an example of how the samples and labels .npy files were prepared
- WARNING: running train.py takes ~4 hours on one NVIDIA V100 GPU
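Because training takes hours, it can be worth checking the input arrays first. The sketch below is not part of this repository; `check_training_arrays` is a hypothetical helper that only tests the layout described in the general procedure (samples shaped [NEXAMPLES,NPOINTS,3], one-hot labels shaped [NEXAMPLES,NCLASSES]).

```python
import numpy as np


def check_training_arrays(samples, labels):
    """Raise ValueError if the arrays do not match the expected layout.

    Hypothetical helper, not part of the repository.
    Returns (NPOINTS, NCLASSES) when the checks pass.
    """
    if samples.ndim != 3 or samples.shape[2] != 3:
        raise ValueError("samples must have shape [NEXAMPLES, NPOINTS, 3]")
    if labels.ndim != 2:
        raise ValueError("labels must have shape [NEXAMPLES, NCLASSES]")
    if samples.shape[0] != labels.shape[0]:
        raise ValueError("samples and labels must hold the same number of examples")
    # A one-hot row contains only 0s and 1s and sums to exactly 1.
    only_zeros_and_ones = np.isin(labels, (0, 1)).all()
    if not only_zeros_and_ones or not (labels.sum(axis=1) == 1).all():
        raise ValueError("labels must be one-hot encoded")
    return samples.shape[1], labels.shape[1]
```

To use it, load both files with `np.load` and pass the arrays to `check_training_arrays` before starting train.py.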
### USING POINTNET TO CLASSIFY CRYSTAL STRUCTURES:
python scripts/mda_cluster.py --weights PATH_TO_SAVE_WEIGHTS/weights \
    --nclass 4 \
    --trjpath datasets/lj-seed \
    --cutoff 2.0 \
    --maxneigh 43 \
    --outname example
Comments:
- nclass = 4: the network was trained to recognize four classes (liq, fcc, hcp, bcc)
- maxneigh = 43: each point cloud in the training set has 43 points. The number of points in each point cloud affects the network architecture.
- cutoff = 2.0: same value as used for training
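The output format of mda_cluster.py is not described here, so the following is only a sketch of a possible post-processing step. It assumes you have a per-atom array of class scores with shape [NATOMS,NCLASSES] and that the columns follow the order liq, fcc, hcp, bcc. Both are assumptions; check the label columns used in training. `class_fractions` and `CLASS_NAMES` are hypothetical names.

```python
import numpy as np

# Assumed column order; confirm it matches the one-hot labels used in training.
CLASS_NAMES = ("liq", "fcc", "hcp", "bcc")


def class_fractions(scores, class_names=CLASS_NAMES):
    """Return the fraction of atoms assigned to each class.

    scores: array of shape [NATOMS, NCLASSES], one score per class per atom.
    Each atom is assigned to the class with the highest score.
    """
    scores = np.asarray(scores)
    if scores.ndim != 2 or scores.shape[1] != len(class_names):
        raise ValueError("scores must have shape [NATOMS, NCLASSES]")
    if scores.shape[0] == 0:
        raise ValueError("scores must contain at least one atom")
    predicted = scores.argmax(axis=1)
    counts = np.bincount(predicted, minlength=len(class_names))
    return {name: count / len(predicted) for name, count in zip(class_names, counts)}
```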
--------------------------------------------------
--------------------------------------------------
## GENERAL PROCEDURE FOR CRYSTAL STRUCTURE IDENTIFICATION:
- Run simulations of pure phases at a range of (T,P) conditions
- Pick a cutoff distance (targeting an average of ~30-50 points appears to give accurate classification)
- Calculate the mean and standard deviation of the number of points within the cutoff distance for all phases
- Choose the max number of points in each point cloud (mean + 2*stdev seems to work well)
- Extract training examples from the pure phase simulations and create .npy arrays (see prepare_train.py for an example) with training point clouds and labels in the SAME ORDER
    (a) Shape of numpy array for training point clouds should be [NEXAMPLES,NPOINTS,3]
    (b) Labels are one-hot encoded: shape of numpy array for training labels should be [NEXAMPLES,NCLASSES]
    (c) The central atom of each point cloud is always translated to (0,0,0)
    (d) Points in each point cloud are scaled such that the closest point to the central atom is at a distance of 1.0
- Train PointNet (see train.py)
- Use PointNet for classification (see mda_cluster.py)
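
The cutoff, max-points, and normalization steps above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the code used in prepare_train.py: the neighbor count is brute force and ignores periodic boundaries (the repository uses a custom MDAnalysis neighbor search instead), the max-points rule is applied per phase and the largest value is taken, and the counts exclude the central atom. All function names are hypothetical.

```python
import numpy as np


def neighbor_counts(positions, cutoff):
    """Count atoms within `cutoff` of each atom.

    Brute force, O(N^2) memory, no periodic images (an assumption made
    to keep the sketch short).
    """
    deltas = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(deltas, axis=-1)
    # Every atom is at distance 0 from itself, so subtract 1 to exclude it.
    return (dists < cutoff).sum(axis=1) - 1


def max_points(counts_by_phase):
    """Largest mean + 2*stdev over all phases, rounded up to an integer.

    counts_by_phase: dict mapping a phase name to an array of neighbor counts.
    Taking the maximum over phases is an assumption of this sketch.
    """
    limits = [c.mean() + 2.0 * c.std() for c in counts_by_phase.values()]
    return int(np.ceil(max(limits)))


def normalize_cloud(center, neighbors):
    """Move the central atom to (0,0,0) and scale the nearest neighbor to 1.0."""
    shifted = neighbors - center
    nearest = np.linalg.norm(shifted, axis=1).min()
    if nearest == 0.0:
        raise ValueError("a neighbor coincides with the central atom")
    return shifted / nearest
```

The sketch does not show how clouds with fewer or more than NPOINTS neighbors are brought to a fixed size; see prepare_train.py for how the example data were prepared.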
--------------------------------------------------
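The requirement that point clouds and labels stay in the SAME ORDER can be met by shuffling both arrays with one shared permutation. A minimal sketch, assuming the normalized point clouds are already grouped by class; `build_training_arrays` is a hypothetical helper and the class index is simply the position in the input list.

```python
import numpy as np


def build_training_arrays(clouds_by_class, seed=0):
    """Stack per-class point clouds, one-hot encode labels, shuffle both together.

    clouds_by_class: list with one array per class, each of shape
    [N_i, NPOINTS, 3]. The list index is used as the class index.
    Returns (samples, labels) with shapes [NEXAMPLES, NPOINTS, 3] and
    [NEXAMPLES, NCLASSES].
    """
    nclasses = len(clouds_by_class)
    samples = np.concatenate(clouds_by_class, axis=0)
    class_ids = np.concatenate(
        [np.full(len(clouds), i) for i, clouds in enumerate(clouds_by_class)]
    )
    # Row i of the identity matrix is the one-hot vector for class i.
    labels = np.eye(nclasses, dtype=np.float32)[class_ids]
    # One permutation applied to both arrays keeps samples and labels aligned.
    order = np.random.RandomState(seed).permutation(len(samples))
    return samples[order], labels[order]
```

The two returned arrays can then be written with `np.save` and passed to train.py through --dataset and --labels.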