
Commit

Merge pull request #92 from lanl/mlmd-bugfix
Mlmd bugfix
gshipman authored Mar 19, 2024
2 parents 41c514e + 4024fc3 commit 0bc4e4e
Showing 2 changed files with 31 additions and 14 deletions.
23 changes: 11 additions & 12 deletions doc/sphinx/05_mlmd/gpu.csv
@@ -1,13 +1,12 @@
 No. Particles,Actual
-568, 110.105
-1136, 257.538
-2272, 501.101
-3408, 628.085
-4544, 890.587
-6816, 1255.000
-9088, 1932.000
-11360, 2234.000
-13632, 2257.000
-15904, 2545.000
-18176, 2634.000
-
+568,133.438
+1136,219.606
+2272,425.377
+3408,659.993
+4544,836.888
+6816,1241
+9088,1908
+11360,2205
+13632,2257
+15904,2365
+18176,2529
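
The ``gpu.csv`` file changed above is a plain two-column table: a particle count per row and a measured value in the ``Actual`` column. As a purely illustrative sketch (the path and column names come from the diff; the plotting choices are assumptions), it can be read and plotted like this::

    import csv
    import matplotlib.pyplot as plt

    particles, actual = [], []
    with open("doc/sphinx/05_mlmd/gpu.csv", newline="") as f:
        for row in csv.DictReader(f):          # header row: "No. Particles,Actual"
            particles.append(int(row["No. Particles"]))
            actual.append(float(row["Actual"]))

    plt.plot(particles, actual, marker="o")
    plt.xlabel("No. Particles")
    plt.ylabel("Actual")
    plt.savefig("mlmd_gpu_scaling.png")
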
22 changes: 20 additions & 2 deletions doc/sphinx/05_mlmd/mlmd.rst
@@ -197,9 +197,11 @@ Training on CPU or GPU is configurable by editing the ``train_model.py`` script.
 The process can take quite some time. This will write several files to disk. The final errors of
-the model are captured in ``model_results.txt``. An example is shown here::
+the model are captured in ``model_results.txt``. Examples for Crossroads and Chicoma are shown here::
 
-               train        valid         test
+Training Accuracy on Crossroads:
+
+               train        valid         test
 -----------------------------------------------------
 EpA-RMSE :     0.53794      0.59717       0.5623
 EpA-MAE  :     0.42529      0.50263       0.45122
@@ -214,6 +216,22 @@ the model are captured in ``model_results.txt``. An example is shown here::
 Loss     :     0.058131     0.060652      0.058545
 -----------------------------------------------------
 
+Training Accuracy on Chicoma:
+               train        valid         test
+-----------------------------------------------------
+EpA-RMSE :     0.63311      0.67692       0.65307
+EpA-MAE  :     0.49966      0.56358       0.51061
+EpA-RSQ  :     0.998        0.99789       0.99756
+ForceRMSE:     31.36        32.088        30.849
+ForceMAE :     24.665       25.111        24.314
+ForceRsq :     0.99825      0.99817       0.99831
+T-Hier   :     0.00084411   0.0008716     0.00085288
+L2Reg    :     98.231       98.231        98.231
+Loss-Err :     0.067352     0.069605      0.0668
+Loss-Reg :     0.00094234   0.00096983    0.00095111
+Loss     :     0.068294     0.070575      0.067751
+-----------------------------------------------------
+
 The numbers will vary from run to run due to random seeds and the non-deterministic nature of multi-threaded / data-parallel execution. However, you should find that the Energy Per Atom mean absolute error "EpA-MAE" for test is below 0.7 (meV/atom). The test force MAE "ForceMAE" should be below 25 (meV/Angstrom).

 The training script will also output the initial box file ``ag_box.data`` as well as a file used to run the resulting potential with LAMMPS, ``hippynn_lammps_model.pt``. Several other files for the training run are placed in the ``model_files`` directory.
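
The acceptance numbers quoted above (test EpA-MAE below 0.7 meV/atom, test ForceMAE below 25 meV/Angstrom) can be checked mechanically. A minimal sketch, assuming ``model_results.txt`` keeps the ``name : train valid test`` layout shown in the examples (exact spacing may differ between hippynn versions)::

    import re

    # Thresholds quoted in the text: meV/atom and meV/Angstrom respectively.
    thresholds = {"EpA-MAE": 0.7, "ForceMAE": 25.0}

    with open("model_results.txt") as f:
        text = f.read()

    for name, limit in thresholds.items():
        # Match "name : train valid test"; use the last match if several tables exist.
        rows = re.findall(rf"^{name}\s*:\s*(\S+)\s+(\S+)\s+(\S+)", text, re.MULTILINE)
        if not rows:
            print(f"{name}: not found, check the file format")
            continue
        test = float(rows[-1][2])
        verdict = "OK" if test < limit else "above target"
        print(f"{name}: test = {test} (target < {limit}) -> {verdict}")
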
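
The first hunk above also notes that training on CPU or GPU is selected by editing ``train_model.py``. That script is not part of this diff, so the following is only a hypothetical sketch of what such a switch typically looks like for a PyTorch-based model such as hippynn's; every name below is illustrative::

    import torch

    USE_GPU = True  # edit this flag (or whatever the real script exposes) to pick the device

    device = torch.device("cuda" if USE_GPU and torch.cuda.is_available() else "cpu")
    print(f"Training on {device}")

    # model = build_model(...)   # hypothetical: however train_model.py constructs the network
    # model.to(device)           # parameters must live on the chosen device before training
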
