Inputs and code to reproduce the results and analysis from "Automated Adaptive Absolute Binding Free Energy Calculations" (Clark, F.; Robb, G. R.; Cole, D. J.; Michel, J. Automated Adaptive Absolute Binding Free Energy Calculations. J. Chem. Theory Comput. 2024, 20 (18), 7806–7828. https://doi.org/10.1021/acs.jctc.4c00806).
├── analysis
│   ├── adaptive
│   ├── cyclod
│   ├── lambda_window_spacing
│   ├── non_adaptive
│   └── alibay
└── simulations
    ├── cyclod
    ├── initial_systems
    ├── scripts
    └── shared_config_files
It is recommended that you run the following in a TMUX session, or using nohup
for the a3fe script, to ensure that the calculations are not interrupted.
- Ensure that you have SLURM, mamba, and GROMACS installed and that gmx is in your path
- Clone this repository, install a3fe, and activate the environment
git clone https://github.com/michellab/Automated-ABFE-Paper.git
cd Automated-ABFE-Paper
make env
mamba activate a3fe_reproduce
- Modify run_somd.sh to match your SLURM configuration, e.g.
vim simulations/shared_config_files/run_somd.sh
- Change into the desired directory. For example, to run T4L:
cd simulations/initial_systems/t4l
- Run the desired ABFE script. For example, to run non-adaptively for 0.2 ns per window:
python ../../scripts/run_non_adaptive_200ps.py
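If neither tmux nor nohup is convenient, a similar effect can be sketched from Python by starting the script in its own session with its output redirected to a log file. This is a minimal sketch and not part of a3fe: the helper name launch_detached and the log file name are hypothetical, and start_new_session relies on POSIX behaviour, so it is assumed that you are on Linux or macOS.

```python
import subprocess
import sys


def launch_detached(script, log_path="a3fe_run.log", cwd=None):
    """Start a Python script in its own session (hypothetical helper, not part of a3fe).

    The child is detached from the controlling terminal, so closing the
    terminal should not interrupt it. stdout and stderr are appended to log_path.
    """
    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            [sys.executable, str(script)],
            stdout=log,
            stderr=subprocess.STDOUT,
            stdin=subprocess.DEVNULL,
            cwd=cwd,
            start_new_session=True,  # POSIX only: run the child in a new session
        )
    # The child keeps its own copy of the log file descriptor after we close ours.
    return proc


# Example usage from simulations/initial_systems/t4l (left commented out so that
# nothing is launched on import):
# proc = launch_detached("../../scripts/run_non_adaptive_200ps.py")
# print(proc.pid)
```

The returned process ID can be noted so that progress can be checked later, for example by inspecting the log file.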
If you want to repeat the adaptive runs and obtain similar simulation times to those shown in the paper, you need to determine the SOMD simulation speed for the MIF bound complex on your GPUs (this is taken as the reference). This can be done by running a short non-adaptive run (as described above). Once this is complete, you can determine the cost by starting an ipython session in the directory you have run in and running:
import a3fe as a3
calc = a3.Calculation()
cost = calc.legs[0].tot_gpu_time/calc.legs[0].tot_simtime # First leg is the bound leg
print(cost) # In hr/ns
You should then modify the run_adaptive.py script to match this cost (specifically the reference_sim_cost argument to get_optimal_lam_vals).
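The cost used above is simply GPU time divided by simulated time. If you have these two numbers from another source (for example, your scheduler's accounting), the same ratio can be computed by hand. The helper below is a hypothetical illustration rather than part of a3fe, and it assumes GPU time in hours and simulated time in ns, matching the hr/ns unit noted above.

```python
def sim_cost_hr_per_ns(gpu_time_hr: float, sim_time_ns: float) -> float:
    """Return the simulation cost in GPU hours per ns (hypothetical helper)."""
    if sim_time_ns <= 0:
        raise ValueError("sim_time_ns must be positive")
    if gpu_time_hr < 0:
        raise ValueError("gpu_time_hr must be non-negative")
    return gpu_time_hr / sim_time_ns


# Example with made-up numbers: 2 GPU hours spent on 10 ns of sampling.
example_cost = sim_cost_hr_per_ns(2.0, 10.0)
print(f"{example_cost:.3f} hr/ns")
```

The numbers in the example are illustrative only; the value to use is the one measured for the bound leg on your own GPUs.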
Note that the input files provided are not identical to those used to complete the study, because the original input files are incompatible with newer versions of Sire due to an issue with box vectors. Hence, the provided equilibrated input files were generated from the original parameterised ligands/complexes, and the original equilibration steps were repeated.
The analysis directory contains the code and inputs required to perform analyses not already performed by a3fe by default (when calc.analyse() or calc.analyse_convergence() is run). Each sub-directory contains two notebooks. The "analysis" notebooks use the pre-provided data in the final_analysis directories to generate the figures from the paper; these can be run without changes. The "preprocessing" notebooks show examples of the computationally intensive analyses run on the simulation outputs (not included due to size) to generate the data used in the "analysis" notebooks; these will not work without changes, as they are intended to be adapted to run on outputs generated by the user. Note that make analysis will skip the alibay directory to avoid downloading large files from Zenodo.
To rerun the "analysis" notebooks and regenerate the plots:
- Ensure that you have mamba installed
- Clone this repository, install a3fe, and run all of the notebooks
git clone https://github.com/michellab/Automated-ABFE-Paper.git
cd Automated-ABFE-Paper
make env
make analysis
Alternatively, you can activate the environment (mamba activate a3fe_reproduce) and run the notebooks cell-by-cell. The notebooks can be cleaned and the figures removed with make clean.