Changes to be committed:

    modified: README.md
    modified: cli.py
    modified: initialSampling.py
    modified: psi4calc.py

Untracked files:

    examples/SLURM/build_nwx.slurm
    examples/SLURM/doNWChemExSingleTrajectory.slurm
    examples/SLURM/doQCEngineGAMESSSingleTrajectory.slurm
    examples/SLURM/doSingleTrajectory.slurm
    examples/input.gamess.qcengine
    examples/input.nwchemex

Update to add more detailed building and testing instructions
kaka-zuumi · Apr 17, 2024 · 6b91ab3
1 parent d301203
commit 6b91ab3
Showing 4 changed files with 146 additions and 28 deletions.
diff --git a/README.md b/README.md
@@ -7,14 +7,23 @@ git clone https://github.com/kaka-zuumi/bimolecularInitialSampling.git
 cd bimolecularInitialSampling
 ```
 
-The initial sampling package makes use of energy and force calls from an ASE `atoms` object with an attached `calculator`. The same calculator can then be used to do molecular dynamics with. The calculator can either be an analytical functional (e.g. some ML potential) or an _ab initio_ electronic structure calculation. The examples here make use of the `psi4` python package for the latter case.
+The initial sampling package makes use of energy and force calls from an ASE `atoms` object with an attached `calculator`. The same calculator can then be used to run molecular dynamics. The calculator can either be an analytical function (e.g. some ML potential) or an _ab initio_ electronic structure calculation. As of now, psi4, NWChemEx, and GAMESS (via QCEngine) interfaces are available for use. The examples here make use of the `psi4` python package for the latter case.
 
-For the ASE/psi4 interface, we suggest creating a conda environment with the two packages and activating it like so:
+For the psi4 interface, we suggest creating a conda environment with two packages and activating it like so:
 ```
 conda create -n bisamplepsi4ase psi4 ase                                                               
 conda activate bisamplepsi4ase                                                                                   
 ```
 
+For the QCEngine/GAMESS interface (which requires GAMESS pre-installed), we suggest creating a python virtual environment with four packages and activating it like so:
+```
+python -m venv .bisampleqcenginegamess
+source .bisampleqcenginegamess/bin/activate
+pip install qcengine qcelemental networkx ase
+```
+
+For the NWChemEx interface, building the package involves considerably more work. After locating the appropriate packages and modules on your system, edit the script "examples/SLURM/build_nwx.slurm" and the toolchain file "examples/SLURM/toolchain.cmake" accordingly, place both in a fresh directory, and then execute or submit the build_nwx.slurm script to do the build. Depending on the number of processes, the build may take anywhere from 1 to 6 hours. This assumes NWChem is already installed.
+
 
 ## Try it out yourself!
 
@@ -41,3 +50,29 @@ mkdir test3/; cd test3/
 python -u ../cli.py ../examples/input.h2br.ch3.xyz ../examples/input.psi4 . --atomsInFirstGroup "1 2 4" --collisionEnergy 5.0 --impactParameter 1.0 --centerOfMassDistance 10.0 --production 1000 --interval 1 --time_step 0.15 --INITQPa "thermal" --INITQPb "thermal" --TVIBa 298.15 --TROTa 298.15 --TVIBb 298.15 --TROTb 298.15 > asepsi4md0.out
 ```
 
+
+Command line arguments are described by "cli.py --help". In general, three positional arguments are always required: (1) the combined XYZ file of both reactants, (2) the potential energy surface file, whose file ending selects the method (".nwchemex", ".gamess.qcengine", ".psi4", and ".npz" correspond to the NWChemEx, QCEngine/GAMESS, psi4, and sGDML interfaces, respectively), and (3) the directory in which to place output files such as the trajectory.
+
+
+
+
+## Simulations on an HPC cluster
+
+In an interactive terminal, these simulations can be done one-by-one; when submitted to nodes on a cluster, hundreds or thousands of these simulations can be done at once. After getting thousands of trajectories, statistically meaningful averages (e.g., product yields, intermediate lifetimes, rate constants) can be calculated. However, configuring things on your own HPC cluster (with its own job scheduler) may be tricky. The examples shown below are for an HPC cluster with a SLURM job scheduler.
+
+Simulations making use of an sGDML or psi4 potential energy surface only require loading the python package, either with the conda or virtual environment described earlier. An example trajectory can be submitted with "examples/SLURM/doSingleTrajectory.slurm", which takes the impact parameter (here 1.0) as its argument:
+```
+sbatch examples/SLURM/doSingleTrajectory.slurm 1.0
+```
+
+Simulations making use of a QCEngine/GAMESS potential energy surface require loading the python package, specifying PATHs in the environment (with an "export" statement in bash) for the GAMESS executable, and changing PATHs in the "rungms.MPI" executable so that it uses an appropriate scratch directory (in this case, the current directory "."). An example trajectory can be submitted with "examples/SLURM/doQCEngineGAMESSSingleTrajectory.slurm", which takes the impact parameter (here 1.0) as its argument:
+```
+sbatch examples/SLURM/doQCEngineGAMESSSingleTrajectory.slurm 1.0
+```
+
+Simulations making use of a NWChemEx potential energy surface require loading the python package used in the build script "build_nwx.slurm", as well as all relevant modules and PATHs used in that script. Additional paths must be specified for the new install and module directories made during the build. An example trajectory can be submitted with "examples/SLURM/doNWChemExSingleTrajectory.slurm", which takes the impact parameter (here 1.0) as its argument:
+```
+sbatch examples/SLURM/doNWChemExSingleTrajectory.slurm 1.0
+```
+
+
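
The README text above ties each interface to a file ending on the second positional argument. A minimal sketch of that lookup follows. The helper name `guess_interface` and the table `SUFFIX_TO_INTERFACE` are hypothetical and do not appear in the repository. Unlike `cli.py` in this commit, which treats any unrecognized ending as a NWChemEx input, this sketch returns `None` for unknown endings so that a mistyped file name is easier to spot.

```python
# Hypothetical lookup table: file ending -> interface label (labels are illustrative).
SUFFIX_TO_INTERFACE = (
    (".gamess.qcengine", "QCEngine/GAMESS"),
    (".nwchemex", "NWChemEx"),
    (".psi4", "psi4"),
    (".npz", "sGDML"),
)


def guess_interface(pes_path):
    """Return the interface label for a PES file, or None if the ending is unknown."""
    for suffix, interface in SUFFIX_TO_INTERFACE:
        if pes_path.endswith(suffix):
            return interface
    return None


if __name__ == "__main__":
    for path in ("examples/input.psi4", "examples/input.nwchemex", "model.txt"):
        print(path, "->", guess_interface(path))
```

Returning `None` rather than falling back to a default is only one possible design; a fallback keeps the command line permissive, while an explicit failure surfaces typos earlier.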
diff --git a/cli.py b/cli.py
@@ -9,9 +9,32 @@
 import numpy as np
 import os
 
-from psi4calc import psi4calculator
 from initialSampling import initialSampling
 
+# Try importing psi4
+try:
+  from psi4calc import psi4calculator
+except ImportError:
+  print("WARNING: psi4 has not been loaded ... it will not be available for initial sampling")
+
+# Try importing NWChemEx
+try:
+  from nwchemexcalc import nwchemexcalculator
+except ImportError:
+  print("WARNING: NWChemEx has not been loaded ... it will not be available for initial sampling")
+
+# Try importing QCEngine/GAMESS
+try:
+  from qcengineGAMESScalc import qcengineGAMESScalculator
+except ImportError:
+  print("WARNING: QCEngine/GAMESS has not been loaded ... it will not be available for initial sampling")
+
+# Try importing sGDML
+try:
+  from sgdml.intf.ase_calc import SGDMLCalculator
+except ImportError:
+  print("WARNING: sGDML has not been loaded ... it will not be available for initial sampling")
+
 ###################################################
 
 # Define global constants up here in the correct
@@ -26,7 +49,7 @@
 
 parser = argparse.ArgumentParser(description="Do a single MD trajectory using a initial geometry (and momenta) and a sGDML model",formatter_class=argparse.ArgumentDefaultsHelpFormatter)
 parser.add_argument("initialGeometryFile", type=str, help="XYZ file with initial geometry; if initial conditions are sampled in the script, then this argument is required but is just an example XYZ")
-parser.add_argument("psi4inputFile", type=str, help="psi4 input file")
+parser.add_argument("PESinputFile", type=str, help="PES input file; the file ending selects the interface ('.psi4', '.gamess.qcengine', or a sGDML '.npz' model; anything else is read as a NWChemEx input)")
 parser.add_argument("outputDir", type=str, help="Directory to output stuff in")
 parser.add_argument("--isotopeMassesFile", type=str, help="Change masses of specific atoms e.g. like isotopic substitution", default=None)
 parser.add_argument("--initialMomentaFile", type=str, help="XYZ file with initial momenta")
@@ -40,23 +63,32 @@
 parser.add_argument("--n_threads", type=int, help="The number of threads to ask psi4 to use")
 
 parser.add_argument("--INITQPa", type=str, help="Initial sampling method for atoms in first group ('semiclassical', 'thermal', or None)", default=None)
-parser.add_argument("--NVIBa", type=int, help="Vibrational quantum number of atoms in first group (supply if using the 'QM' initial sampling)")
-parser.add_argument("--NROTa", type=int, help="Rotational quantum number of atoms in first group (supply if using the 'QM' initial sampling)")
+parser.add_argument("--NVIBa", type=int, help="Vibrational quantum number of atoms in first group (supply if using the 'semiclassical' initial sampling)")
+parser.add_argument("--NROTa", type=int, help="Rotational quantum number of atoms in first group (supply if using the 'semiclassical' initial sampling)")
 parser.add_argument("--TVIBa", type=float, help="Vibrational temperature of atoms in first group (supply if using the 'thermal' initial sampling)")
 parser.add_argument("--TROTa", type=float, help="Rotational temperature of atoms in first group (supply if using the 'thermal' initial sampling)")
 
 parser.add_argument("--INITQPb", type=str, help="Initial sampling method for atoms in second group ('semiclassical', 'thermal', or None)", default=None)
-parser.add_argument("--NVIBb", type=int, help="Vibrational quantum number of atoms in second group (supply if using the 'QM' initial sampling)")
-parser.add_argument("--NROTb", type=int, help="Rotational quantum number of atoms in second group (supply if using the 'QM' initial sampling)")
+parser.add_argument("--NVIBb", type=int, help="Vibrational quantum number of atoms in second group (supply if using the 'semiclassical' initial sampling)")
+parser.add_argument("--NROTb", type=int, help="Rotational quantum number of atoms in second group (supply if using the 'semiclassical' initial sampling)")
 parser.add_argument("--TVIBb", type=float, help="Vibrational temperature of atoms in second group (supply if using the 'thermal' initial sampling)")
 parser.add_argument("--TROTb", type=float, help="Rotational temperature of atoms in second group (supply if using the 'thermal' initial sampling)")
 args = vars(parser.parse_args())
 
 ########################################################################################
 
+# A function to print the potential, kinetic and total energy
+def printenergy(a):
+    epot = a.get_potential_energy() / (units.kcal/units.mol)
+    ekin = a.get_kinetic_energy() / (units.kcal/units.mol)
+    print('@Epot = %.3f  Ekin = %.3f (T=%3.0fK)  '
+          'Etot = %.3f  kcal/mol' % (epot, ekin, ekin * (units.kcal/units.mol) / (len(a) * 1.5 * units.kB), epot + ekin))
+
+########################################################################################
+
 # Get the various arguments
 Qfile = args["initialGeometryFile"]
-input_path = args["psi4inputFile"]
+input_path = args["PESinputFile"]
 output_path = args["outputDir"]
 
 Pfile = args["initialMomentaFile"]
@@ -70,6 +102,9 @@
 Nprint = args["interval"]
 dt = args["time_step"]
 
+if ((Nsteps is None) or (Nprint is None) or (dt is None)):
+  raise ValueError("For MD, need to specify these three: --production --interval --time_step")
+
 n_threads = args["n_threads"]
 if (n_threads is None): n_threads = 1
 
@@ -95,21 +130,67 @@
 
 # Adjust the maximum interatomic distance allowed
 # for the simulation
-r2threshold = 900.0
+r2threshold = 24.0*24.0
 if ((b is not None) and (dCM is not None) and (1.2*(b**2 + dCM**2) > r2threshold)):
   r2threshold = 1.2*(b**2 + dCM**2)
 
 ########################################################################################
 
-# Get the model ready
-calc = psi4calculator(input_path,n_threads=n_threads)
+# Look at the input file name to guess its identity
+try_psi4 = False
+try_nwchemex = False
+try_qcenginegamess = False
+if (input_path.endswith(('.npz',))):
+
+  print("Input file '"+input_path+"' looks like a sGDML file so will attempt to read it in as such...")
+  try:
+    calc = SGDMLCalculator(input_path)
+    try_psi4 = False
+  except Exception:
+    print("   Could not load file '"+input_path+"' as a sGDML model!")
+    try_psi4 = True
 
-# To conform to VENUS, we are going to keep the units
-# in kcal/mol and Angstroms (which the model was
-# originally trained on)
-calc.E_to_eV = units.Ha
-calc.Ang_to_R = units.Ang
-calc.F_to_eV_Ang = (units.Ha / units.Bohr)
+elif (input_path.endswith(('.psi4',))):
+  try_psi4 = True
+
+elif (input_path.endswith(('.gamess.qcengine',))):
+  try_qcenginegamess = True
+
+else:
+  try_nwchemex = True
+
+if (try_psi4):
+  print("Reading input file '"+input_path+"' as a psi4 input file...")
+  calc = psi4calculator(input_path,n_threads=n_threads)
+
+  # To conform to VENUS, we are going to keep the units
+  # in kcal/mol and Angstroms (which the model was
+  # originally trained on)
+  calc.E_to_eV = units.Ha
+  calc.Ang_to_R = units.Ang
+  calc.F_to_eV_Ang = (units.Ha / units.Bohr)
+
+if (try_qcenginegamess):
+  print("Reading input file '"+input_path+"' as a QCEngine/GAMESS input file...")
+  calc = qcengineGAMESScalculator(input_path,n_threads=n_threads)
+
+  # To conform to VENUS, we are going to keep the units
+  # in kcal/mol and Angstroms (which the model was
+  # originally trained on)
+  calc.E_to_eV = units.Ha
+  calc.Ang_to_R = (units.Ang / units.Bohr)
+  calc.F_to_eV_Ang = (units.Ha / units.Bohr)
+
+if (try_nwchemex):
+  print("Reading input file '"+input_path+"' as a NWChemEx input file...")
+  calc = nwchemexcalculator(input_path,n_threads=n_threads)
+
+  # To conform to VENUS, we are going to keep the units
+  # in kcal/mol and Angstroms (which the model was
+  # originally trained on)
+  calc.E_to_eV = units.Ha
+  calc.Ang_to_R = (units.Ang / units.Bohr)
+  calc.F_to_eV_Ang = (units.Ha / units.Bohr)
 
 # Read in the geometry; set it in the "calculator"
 mol = read(Qfile)
@@ -192,17 +273,9 @@
 
 ########################################################################################
 
-
 # Run MD with constant energy using the velocity verlet algorithm
 dyn = VelocityVerlet(mol, dt * units.fs, trajectory=trajfile) 
 
-# A function to print the potential, kinetic and total energy
-def printenergy(a):
-    epot = a.get_potential_energy() / (units.kcal/units.mol)
-    ekin = a.get_kinetic_energy() / (units.kcal/units.mol)
-    print('@Epot = %.3f  Ekin = %.3f (T=%3.0fK)  '
-          'Etot = %.3f  kcal/mol' % (epot, ekin, ekin / (len(a) * 1.5 * 8.617281e-5), epot + ekin))
-
 # A function to see if any interatomic distance is > 20 
 def checkGeneralReactionProgress(a):
     Natoms = len(a)
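
The temperature printed by `printenergy` in this file comes from the equipartition relation T = 2·Ekin / (3·N·kB). The Boltzmann constant used there is in eV/K, so the kinetic energy has to be in eV when it is divided. Dividing a kcal/mol value directly overstates T by a factor of about 23. The sketch below shows the unit bookkeeping under the assumption that ASE-style units (eV) are the starting point. The constants are written out so the snippet runs without ASE, and the function names are hypothetical.

```python
# Constants written out so the sketch runs without ASE installed;
# ase.units provides the same quantities as units.kB and units.kcal / units.mol.
KB_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV/K
KCALMOL_IN_EV = 0.0433641      # 1 kcal/mol expressed in eV


def kinetic_temperature(ekin_ev, n_atoms):
    """Kinetic temperature in K from a total kinetic energy in eV (equipartition)."""
    return ekin_ev / (1.5 * n_atoms * KB_EV_PER_K)


def kcalmol_to_ev(energy_kcalmol):
    """Convert an energy from kcal/mol to eV."""
    return energy_kcalmol * KCALMOL_IN_EV


if __name__ == "__main__":
    n_atoms = 6
    ekin_kcalmol = 5.0
    # Convert to eV first, then apply the equipartition relation.
    print("T = %.1f K" % kinetic_temperature(kcalmol_to_ev(ekin_kcalmol), n_atoms))
```
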

diff --git a/initialSampling.py b/initialSampling.py
@@ -31,7 +31,7 @@ class initialSampling:
 
     # If "debug" is true, then more information
     # is printed out during the sampling
-    debug=False
+    debug=True
 
     def __init__(self,mol,atomsInFirstGroup,optimize=False,optimization_file="optimization.traj",
                       samplingMethodA="thermal",vibrationalSampleA=298,rotationalSampleA=298,
@@ -897,7 +897,7 @@ def sampleRelativeQP(self):
 
             # Optimize the two molecules with ASE
             optimizer = QuasiNewton(
-                self.mol,maxstep=0.010,
+                self.mol,maxstep=0.0250,    # original value = 0.010,
                 trajectory=self.optimization_file,
             )
 

diff --git a/psi4calc.py b/psi4calc.py
@@ -53,6 +53,8 @@ def __init__(
         self.referencemethod = 'uhf'
         self.freeze_core = 0
         self.df_ints_io = "None"
+        self.dft_spherical_points = 302   # The default for DFT jobs
+        self.dft_radial_points = 75       # The default for DFT jobs
         self.psi4method = 'b3lyp/6-31g*'  # LevelOfTheory/BasisSet with no spaces
         self.mulliken = 0
         self.charge = 0
@@ -71,6 +73,8 @@ def __init__(
                 if (entries[0] == "referencemethod"): self.referencemethod = str(entries[1])
                 if (entries[0] == "freeze_core"): self.freeze_core = int(entries[1])
                 if (entries[0] == "df_ints_io"): self.df_ints_io = str(entries[1])
+                if (entries[0] == "dft_spherical_points"): self.dft_spherical_points = int(entries[1])
+                if (entries[0] == "dft_radial_points"): self.dft_radial_points = int(entries[1])
                 if (entries[0] == "psi4method"): self.psi4method = str(entries[1])
                 if (entries[0] == "mulliken"): self.mulliken = int(entries[1])
                 if (entries[0] == "charge"): self.charge = int(entries[1])
@@ -84,6 +88,11 @@ def __init__(
         psi4.set_options({'e_convergence': self.e_convergence})
         psi4.set_options({'reference': self.referencemethod})
         psi4.set_options({'freeze_core': self.freeze_core})
+        psi4.set_options({'df_ints_io': self.df_ints_io})
+        psi4.set_options({'dft_spherical_points': self.dft_spherical_points})
+#       psi4.set_options({'DFT_SPHERICAL_POINTS': self.dft_spherical_points})
+        psi4.set_options({'dft_radial_points': self.dft_radial_points})
+#       psi4.set_options({'DFT_RADIAL_POINTS': self.dft_radial_points})
         psi4.set_options({'df_ints_io': self.df_ints_io})
         self.movecs = self.scratchdir+"/md.wfn"
         self.ref_wfn = None
@@ -122,6 +131,7 @@ def calculate(self, atoms=None, *args, **kwargs):
         #    (2) Try the default superposition of atomic densities (SAD)
         tmp_d_convergence = self.d_convergence
         tmp_e_convergence = self.e_convergence
+        self.ref_wfn = None   # TEMPORARY (Kazuumi): discard the stored wfn to test running with no initial guess wfn
         for i in range(100):
             if (self.ref_wfn is None):
                 try: