
Commit

Merge pull request #92 from lanl/mlmd-bugfix
Mlmd bugfix
gshipman authored Mar 19, 2024
2 parents 41c514e + 4024fc3 commit 0bc4e4e
Showing 2 changed files with 31 additions and 14 deletions.
23 changes: 11 additions & 12 deletions doc/sphinx/05_mlmd/gpu.csv
@@ -1,13 +1,12 @@
 No. Particles,Actual
-568, 110.105
-1136, 257.538
-2272, 501.101
-3408, 628.085
-4544, 890.587
-6816, 1255.000
-9088, 1932.000
-11360, 2234.000
-13632, 2257.000
-15904, 2545.000
-18176, 2634.000
-
+568,133.438
+1136,219.606
+2272,425.377
+3408,659.993
+4544,836.888
+6816,1241
+9088,1908
+11360,2205
+13632,2257
+15904,2365
+18176,2529
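
The ``gpu.csv`` file changed above is a plain two-column table: a particle count per row and a measured value in the ``Actual`` column. As a purely illustrative sketch (the path and column names come from the diff; the plotting choices are assumptions), it can be read and plotted like this::

    import csv
    import matplotlib.pyplot as plt

    particles, actual = [], []
    with open("doc/sphinx/05_mlmd/gpu.csv", newline="") as f:
        for row in csv.DictReader(f):          # header row: "No. Particles,Actual"
            particles.append(int(row["No. Particles"]))
            actual.append(float(row["Actual"]))

    plt.plot(particles, actual, marker="o")
    plt.xlabel("No. Particles")
    plt.ylabel("Actual")
    plt.savefig("mlmd_gpu_scaling.png")
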
22 changes: 20 additions & 2 deletions doc/sphinx/05_mlmd/mlmd.rst
@@ -197,9 +197,11 @@ Training on CPU or GPU is configurable by editing the ``train_model.py`` script.
 The process can take quite some time. This will write several files to disk. The final errors of
-the model are captured in ``model_results.txt``. An example is shown here::
+the model are captured in ``model_results.txt``. Examples for Crossroads and Chicoma are shown here::
 
-               train        valid         test
+Training Accuracy on Crossroads:
+
+               train        valid         test
 -----------------------------------------------------
 EpA-RMSE :     0.53794      0.59717       0.5623
 EpA-MAE  :     0.42529      0.50263       0.45122
@@ -214,6 +216,22 @@ the model are captured in ``model_results.txt``. An example is shown here::
 Loss     :     0.058131     0.060652      0.058545
 -----------------------------------------------------
 
+Training Accuracy on Chicoma:
+               train        valid         test
+-----------------------------------------------------
+EpA-RMSE :     0.63311      0.67692       0.65307
+EpA-MAE  :     0.49966      0.56358       0.51061
+EpA-RSQ  :     0.998        0.99789       0.99756
+ForceRMSE:     31.36        32.088        30.849
+ForceMAE :     24.665       25.111        24.314
+ForceRsq :     0.99825      0.99817       0.99831
+T-Hier   :     0.00084411   0.0008716     0.00085288
+L2Reg    :     98.231       98.231        98.231
+Loss-Err :     0.067352     0.069605      0.0668
+Loss-Reg :     0.00094234   0.00096983    0.00095111
+Loss     :     0.068294     0.070575      0.067751
+-----------------------------------------------------
+
 The numbers will vary from run to run due to random seeds and the non-deterministic nature of multi-threaded / data-parallel execution. However, you should find that the Energy Per Atom mean absolute error "EpA-MAE" for test is below 0.7 (meV/atom). The test force MAE "ForceMAE" should be below 25 (meV/Angstrom).

 The training script will also output the initial box file ``ag_box.data`` as well as a file used to run the resulting potential with LAMMPS, ``hippynn_lammps_model.pt``. Several other files for the training run are placed in the ``model_files`` directory.
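
The acceptance numbers quoted above (test EpA-MAE below 0.7 meV/atom, test ForceMAE below 25 meV/Angstrom) can be checked mechanically. A minimal sketch, assuming ``model_results.txt`` keeps the ``name : train valid test`` layout shown in the examples (exact spacing may differ between hippynn versions)::

    import re

    # Thresholds quoted in the text: meV/atom and meV/Angstrom respectively.
    thresholds = {"EpA-MAE": 0.7, "ForceMAE": 25.0}

    with open("model_results.txt") as f:
        text = f.read()

    for name, limit in thresholds.items():
        # Match "name : train valid test"; use the last match if several tables exist.
        rows = re.findall(rf"^{name}\s*:\s*(\S+)\s+(\S+)\s+(\S+)", text, re.MULTILINE)
        if not rows:
            print(f"{name}: not found, check the file format")
            continue
        test = float(rows[-1][2])
        verdict = "OK" if test < limit else "above target"
        print(f"{name}: test = {test} (target < {limit}) -> {verdict}")
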
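
The first hunk above also notes that training on CPU or GPU is selected by editing ``train_model.py``. That script is not part of this diff, so the following is only a hypothetical sketch of what such a switch typically looks like for a PyTorch-based model such as hippynn's; every name below is illustrative::

    import torch

    USE_GPU = True  # edit this flag (or whatever the real script exposes) to pick the device

    device = torch.device("cuda" if USE_GPU and torch.cuda.is_available() else "cpu")
    print(f"Training on {device}")

    # model = build_model(...)   # hypothetical: however train_model.py constructs the network
    # model.to(device)           # parameters must live on the chosen device before training
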
