Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generation of grids is slow #39

Closed
manassharma07 opened this issue Aug 15, 2020 · 22 comments
Closed

Generation of grids is slow #39

manassharma07 opened this issue Aug 15, 2020 · 22 comments

Comments

@manassharma07
Copy link

manassharma07 commented Aug 15, 2020

I was playing around with the library and noticed that for a 74 atom molecule with C and H atoms, the generation of grids is very slow.
I am using the following code:

for center_index in range(num_centers):

            min_num_angular_points = 86
            max_num_angular_points = 302
            start=timer()
            context = numgrid.new_atom_grid_bse(radial_precision,
                                    min_num_angular_points,
                                    max_num_angular_points,
                                    proton_charges[center_index],
                                    basis_set_name)
            duration = timer() - start
            print('Duration for context: ',duration)


            #num_points = numgrid.get_num_grid_points(context)
            start = timer()
            # generate an atomic grid in the molecular environment
            x, y, z, w = numgrid.get_grid(context,
                                  num_centers,
                                  center_index,
                                  x_coordinates_bohr,
                                  y_coordinates_bohr,
                                  z_coordinates_bohr,
                                  proton_charges)
            duration = timer() - start
            print('Duration for xyz,w values: ',duration)

The context creation takes a fraction of a second, however, the numgrid.get_grid method takes a lot of time around 7 seconds for radial_precision = 1.0e-10

Is this expected or I could do something better?

For example, I understand that for the same type of atoms, I could just create the context once. But that is hardly the dominant step, so I don't think it makes much of a difference.

If there is nothing wrong with how I am using it, then should I look for parallelization?

@bast
Copy link
Member

bast commented Aug 15, 2020

It takes 74 x 7 seconds? This is indeed slow. Can you post a full example? Then I can look where the code is sitting and how we can speed it up. I was anyway considering porting it to Rust.

@manassharma07
Copy link
Author

manassharma07 commented Aug 15, 2020

The code is exactly the same as above for the calculation.

Outside the above for loop, I use the following details:

x_coordinates_bohr = [-13.735889090764081, -14.02067079730638, -16.62263451156048, -11.96464892127438, -9.37137794656968, -7.31535607053768, -4.72189612323408, -4.48757010059808, -4.52479770258138, -2.65472486386698, -2.8811140373491804, -2.74165225936098, -0.09320128577748, 1.6137882000862198, 1.96660004223252, 4.08687260189052, 3.7257459653926204, 5.751721198199521, 8.30852046131652, 9.26283208576152, 11.87726799154302, 13.50640076665992, 13.77190726811442, 15.909565306871219, 12.18850586193132, 9.69161091266562, 7.933031907302221, 7.34570506992102, 5.42876702667942, 5.811436539451919, 3.81210644308992, 1.1950249209238202, -0.050304505827180004, -0.10359477871698, -11.88225686815398, -13.92183812808168, -15.20061570483798, -13.83472175998878, -16.80858354887808, -16.825402110180182, -18.08736112563438, -12.26757199731108, -12.04458433060908, -9.06845487053298, -9.29144253723498, -7.6182791465743795, -7.3952914798723794, -2.67721260313608, -5.99765013840798, -4.75931269781628, -3.13055786789718, -4.18804853134158, 0.48222027787302, -0.08091806684898, 5.95033139964342, 4.88301416105622, 9.40531742533212, 12.72481009760952, 11.73421573417572, 16.85310549317892, 13.40530042624842, 11.815284979103819, 10.05349343955912, 8.80343969783562, 9.05779681595502, 5.90516694850632, 6.65595508393602, 4.77643361527662, 3.57513480406932, 1.3966586839501198, 0.05344145096892, 1.80181593599172, -0.81469866837768, -1.33607406874278]

y_coordinates_bohr =[-0.035829204751440004, 0.95552104907796, 0.22268531054376, -0.20004639319554, 0.7230847524309599, -0.43229371724363996, 0.49083742838286004, 0.06564908085786, 3.3464023703607597, -0.84463192804344, -0.45402556611714, -3.72948762285084, -4.47611836110474, -2.37549895173234, -2.6604696308735396, -1.9782785488445398, -1.60316794002804, -4.33042048735284, -3.80658844320204, -1.54647616035804, -1.2499781526839402, 0.29657359671366, -0.76072809413184, 0.7843118744745601, 2.8027282033254597, 2.29836033686136, 0.7947053674140601, 2.42686170411336, 0.28542421337856, -0.21969954348114, 2.6738488908756604, 2.2159682837409598, 0.00668963000106, 0.53505701652546, 0.48630208600926, -2.08712676581094, 0.78752440865586, 3.00681861013746, -1.8286122505157398, 0.9288759126330599, 1.0460389239510601, 0.37291852666925995, -2.2539895706396402, 0.15011983256616, 2.77702792987506, 0.14067120262116, -2.4862368946877402, 4.00402701453276, 4.29787940582226, 3.77159071788576, -4.73255417781204, -4.16639227150764, -6.3108533238248405, -4.48084267607724, -4.79056876567434, -5.91457778393154, -5.3771397126599405, -3.10568907388194, -0.26788755620064, -0.72312254695074, 3.95508311141766, 3.8048498952921603, 1.21573631776326, 4.09038749223006, 2.6742268360734602, 1.50165185989896, 4.26745481739936, 4.14840208009236, 3.30350559041046, 1.66794774693096, 3.92446955039586, 0.9628909804350599, -1.12317753882204, 2.14207999757106]

z_coordinates_bohr = [-3.81186077871135, -1.11068645003475, -0.08456523800775, 0.56172105023025, -0.32229276742395, 1.35011473284105, 0.46610091518685004, -1.53568582496085, 0.87428172881085, 1.9835508843538499, 3.99327447365535, 1.49505671619735, 0.47384879174174993, 1.50412740094455, 3.5134730450482503, 0.10573016908455, -1.88717485891485, 0.39599208099495, -0.71459988274035, -0.86842357824495, -2.01813286995255, -0.19303550977635, 1.55477205744975, -1.32384754159395, 0.38786625924225, 1.75564993008045, 0.017102020200450003, -2.29592259033555, 1.32592624018185, 3.28576106337375, 1.30079288452815, 2.4724229977081498, 1.08763179296895, -1.75432712188815, -4.54280679125655, -3.8135615321014504, -5.00333301477585, -1.1089856966446499, -0.08626599139785, 1.83974273659095, -1.27603747407225, 2.51702053104855, 0.42792845020905, -2.2775922482422497, -0.18850016740275, 3.30541421365935, 1.2163221328198501, 0.24443605667715, -0.20664153689715, 2.87606846895855, 3.25155702297285, 0.09514770354615, 1.21254268084185, -1.58595253626825, 2.3938103965657502, -0.59346844684545, -1.41852281364285, -2.30234765869815, -3.82301016204645, -1.7437446563497498, 1.58557459107045, -1.3726024721101502, 3.4700093473012497, 2.2481125228138503, -3.41426243062575, -3.44128551226845, -1.68024986311935, 2.36773217791755, -0.6460028293396499, 4.44775357400985, 2.32918176774195, -2.40911717707665, -2.7479450469043503, -2.13094951149585]

proton_charges = [6, 6, 6, 6, 6, 6, 6, 1, 6, 6, 1, 6, 6, 6, 1, 6, 1, 6, 6, 6, 6, 6, 1, 8, 6, 6, 6, 6, 6, 1, 6, 6, 6, 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

num_centers = 74

basis_set_name = '6-31G'

radial_precision=1.0e-10

I created the arrays for coordinates from the following XYZ file:

74
C27H46O
C         -7.26872       -0.01896       -2.01715
C         -7.41942        0.50564       -0.58775
C         -8.79632        0.11784       -0.04475
C         -6.33142       -0.10586        0.29725
C         -4.95912        0.38264       -0.17055
C         -3.87112       -0.22876        0.71445
C         -2.49872        0.25974        0.24665
H         -2.37472        0.03474       -0.81265
C         -2.39442        1.77084        0.46265
C         -1.40482       -0.44696        1.04965
H         -1.52462       -0.24026        2.11315
C         -1.45082       -1.97356        0.79115
C         -0.04932       -2.36866        0.25075
C          0.85398       -1.25706        0.79595
H          1.04068       -1.40786        1.85925
C          2.16268       -1.04686        0.05595
H          1.97158       -0.84836       -0.99865
C          3.04368       -2.29156        0.20955
C          4.39668       -2.01436       -0.37815
C          4.90168       -0.81836       -0.45955
C          6.28518       -0.66146       -1.06795
C          7.14728        0.15694       -0.10215
H          7.28778       -0.40256        0.82275
O          8.41898        0.41504       -0.70055
C          6.44988        1.48314        0.20525
C          5.12858        1.21624        0.92905
C          4.19798        0.42054        0.00905
C          3.88718        1.28424       -1.21495
C          2.87278        0.15104        0.70165
H          3.07528       -0.11626        1.73875
C          2.01728        1.41494        0.68835
C          0.63238        1.17264        1.30835
C         -0.02662        0.00354        0.57555
C         -0.05482        0.28314       -0.92835
H         -6.28782        0.25734       -2.40395
H         -7.36712       -1.10446       -2.01805
H         -8.04382        0.41674       -2.64765
H         -7.32102        1.59114       -0.58685
H         -8.89472       -0.96766       -0.04565
H         -8.90362        0.49154        0.97355
H         -9.57142        0.55354       -0.67525
H         -6.49172        0.19734        1.33195
H         -6.37372       -1.19276        0.22645
H         -4.79882        0.07944       -1.20525
H         -4.91682        1.46954       -0.09975
H         -4.03142        0.07444        1.74915
H         -3.91342       -1.31566        0.64365
H         -1.41672        2.11884        0.12935
H         -3.17382        2.27434       -0.10935
H         -2.51852        1.99584        1.52195
H         -1.65662       -2.50436        1.72065
H         -2.21622       -2.20476        0.05035
H          0.25518       -3.33956        0.64165
H         -0.04282       -2.37116       -0.83925
H          3.14878       -2.53506        1.26675
H          2.58398       -3.12986       -0.31405
H          4.97708       -2.84546       -0.75065
H          6.73368       -1.64346       -1.21835
H          6.20948       -0.14176       -2.02305
H          8.91828       -0.38266       -0.92275
H          7.09378        2.09294        0.83905
H          6.25238        2.01344       -0.72635
H          5.32008        0.64334        1.83625
H          4.65858        2.16454        1.18965
H          4.79318        1.41514       -1.80675
H          3.12488        0.79464       -1.82105
H          3.52218        2.25824       -0.88915
H          2.52758        2.19524        1.25295
H          1.89188        1.74814       -0.34185
H          0.73908        0.88264        2.35365
H          0.02828        2.07674        1.23255
H          0.95348        0.50954       -1.27485
H         -0.43112       -0.59436       -1.45415
H         -0.70702        1.13354       -1.12765

PC:
2.9 GHz i7 quad core
16 GB RAM

@manassharma07
Copy link
Author

BTW I am really loving the flexibility of the library right now. By flexibility I mean

  1. easy to install using PIP.
  2. no hardcore dependency on other libraries, meaning it can install on any OS easily. I was even able to install it on Android using PyDroid. This in big contrast to PySCF where it cannot install on Windows due to some BLAS dependencies.
  3. Simple and easy python interface and flexibility with parameters.

Therefore, if you do consider making it faster, I would request to keep the above things in mind. As speed is something that one often has to sacrifice to keep the above advantages.

Similarly if you port it to rust, I hope you would still keep this library because it is one of a kind based on what I have seen.

Thanks!

@bast
Copy link
Member

bast commented Aug 15, 2020

I will look at this example and report back. Thanks for that. And indeed I will definitely keep the pip-installability in case I go for the rewrite.

@manassharma07
Copy link
Author

Here is the code in full:

import numgrid
import numpy as np
from timeit import default_timer as timer


x_coordinates_bohr = [-13.735889090764081, -14.02067079730638, -16.62263451156048, -11.96464892127438, -9.37137794656968, -7.31535607053768, -4.72189612323408, -4.48757010059808, -4.52479770258138, -2.65472486386698, -2.8811140373491804, -2.74165225936098, -0.09320128577748, 1.6137882000862198, 1.96660004223252, 4.08687260189052, 3.7257459653926204, 5.751721198199521, 8.30852046131652, 9.26283208576152, 11.87726799154302, 13.50640076665992, 13.77190726811442, 15.909565306871219, 12.18850586193132, 9.69161091266562, 7.933031907302221, 7.34570506992102, 5.42876702667942, 5.811436539451919, 3.81210644308992, 1.1950249209238202, -0.050304505827180004, -0.10359477871698, -11.88225686815398, -13.92183812808168, -15.20061570483798, -13.83472175998878, -16.80858354887808, -16.825402110180182, -18.08736112563438, -12.26757199731108, -12.04458433060908, -9.06845487053298, -9.29144253723498, -7.6182791465743795, -7.3952914798723794, -2.67721260313608, -5.99765013840798, -4.75931269781628, -3.13055786789718, -4.18804853134158, 0.48222027787302, -0.08091806684898, 5.95033139964342, 4.88301416105622, 9.40531742533212, 12.72481009760952, 11.73421573417572, 16.85310549317892, 13.40530042624842, 11.815284979103819, 10.05349343955912, 8.80343969783562, 9.05779681595502, 5.90516694850632, 6.65595508393602, 4.77643361527662, 3.57513480406932, 1.3966586839501198, 0.05344145096892, 1.80181593599172, -0.81469866837768, -1.33607406874278]

y_coordinates_bohr =[-0.035829204751440004, 0.95552104907796, 0.22268531054376, -0.20004639319554, 0.7230847524309599, -0.43229371724363996, 0.49083742838286004, 0.06564908085786, 3.3464023703607597, -0.84463192804344, -0.45402556611714, -3.72948762285084, -4.47611836110474, -2.37549895173234, -2.6604696308735396, -1.9782785488445398, -1.60316794002804, -4.33042048735284, -3.80658844320204, -1.54647616035804, -1.2499781526839402, 0.29657359671366, -0.76072809413184, 0.7843118744745601, 2.8027282033254597, 2.29836033686136, 0.7947053674140601, 2.42686170411336, 0.28542421337856, -0.21969954348114, 2.6738488908756604, 2.2159682837409598, 0.00668963000106, 0.53505701652546, 0.48630208600926, -2.08712676581094, 0.78752440865586, 3.00681861013746, -1.8286122505157398, 0.9288759126330599, 1.0460389239510601, 0.37291852666925995, -2.2539895706396402, 0.15011983256616, 2.77702792987506, 0.14067120262116, -2.4862368946877402, 4.00402701453276, 4.29787940582226, 3.77159071788576, -4.73255417781204, -4.16639227150764, -6.3108533238248405, -4.48084267607724, -4.79056876567434, -5.91457778393154, -5.3771397126599405, -3.10568907388194, -0.26788755620064, -0.72312254695074, 3.95508311141766, 3.8048498952921603, 1.21573631776326, 4.09038749223006, 2.6742268360734602, 1.50165185989896, 4.26745481739936, 4.14840208009236, 3.30350559041046, 1.66794774693096, 3.92446955039586, 0.9628909804350599, -1.12317753882204, 2.14207999757106]

z_coordinates_bohr = [-3.81186077871135, -1.11068645003475, -0.08456523800775, 0.56172105023025, -0.32229276742395, 1.35011473284105, 0.46610091518685004, -1.53568582496085, 0.87428172881085, 1.9835508843538499, 3.99327447365535, 1.49505671619735, 0.47384879174174993, 1.50412740094455, 3.5134730450482503, 0.10573016908455, -1.88717485891485, 0.39599208099495, -0.71459988274035, -0.86842357824495, -2.01813286995255, -0.19303550977635, 1.55477205744975, -1.32384754159395, 0.38786625924225, 1.75564993008045, 0.017102020200450003, -2.29592259033555, 1.32592624018185, 3.28576106337375, 1.30079288452815, 2.4724229977081498, 1.08763179296895, -1.75432712188815, -4.54280679125655, -3.8135615321014504, -5.00333301477585, -1.1089856966446499, -0.08626599139785, 1.83974273659095, -1.27603747407225, 2.51702053104855, 0.42792845020905, -2.2775922482422497, -0.18850016740275, 3.30541421365935, 1.2163221328198501, 0.24443605667715, -0.20664153689715, 2.87606846895855, 3.25155702297285, 0.09514770354615, 1.21254268084185, -1.58595253626825, 2.3938103965657502, -0.59346844684545, -1.41852281364285, -2.30234765869815, -3.82301016204645, -1.7437446563497498, 1.58557459107045, -1.3726024721101502, 3.4700093473012497, 2.2481125228138503, -3.41426243062575, -3.44128551226845, -1.68024986311935, 2.36773217791755, -0.6460028293396499, 4.44775357400985, 2.32918176774195, -2.40911717707665, -2.7479450469043503, -2.13094951149585]

proton_charges = [6, 6, 6, 6, 6, 6, 6, 1, 6, 6, 1, 6, 6, 6, 1, 6, 1, 6, 6, 6, 6, 6, 1, 8, 6, 6, 6, 6, 6, 1, 6, 6, 6, 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

num_centers = 74

basis_set_name = '6-31G'

radial_precision=1.0e-10

for center_index in range(num_centers):

            min_num_angular_points = 86
            max_num_angular_points = 302
            start=timer()
            context = numgrid.new_atom_grid_bse(radial_precision,
                                    min_num_angular_points,
                                    max_num_angular_points,
                                    proton_charges[center_index],
                                    basis_set_name)
            duration = timer() - start
            print('Duration for context: ',duration)


            #num_points = numgrid.get_num_grid_points(context)
            start = timer()
            # generate an atomic grid in the molecular environment
            x, y, z, w = numgrid.get_grid(context,
                                  num_centers,
                                  center_index,
                                  x_coordinates_bohr,
                                  y_coordinates_bohr,
                                  z_coordinates_bohr,
                                  proton_charges)
            duration = timer() - start
            print('Duration for xyz,w values: ',duration)

@bast
Copy link
Member

bast commented Aug 15, 2020

And it's really nice to hear that you find the library useful. I am sure we will be able to solve the generation time problem.

@bast
Copy link
Member

bast commented Aug 15, 2020

I confirm that this takes forever also on my side. I have a feeling for what it is (Becke partitioning) but need to do some digging in the code. But this is now the "code's fault", not your fault.

@bast
Copy link
Member

bast commented Aug 15, 2020

Few years ago I used this to create big grids but maybe later I introduced something which made the Becke partitioning not scale well. Based on very quick tests the problem is neither the radial grid generation, nor the angular grid generation itself but probably this:

double get_becke_w(const int num_centers,
const int proton_charges[],
const double x_coordinates_bohr[],
const double y_coordinates_bohr[],
const double z_coordinates_bohr[],
const int center_index,
const double x,
const double y,
const double z)

@manassharma07
Copy link
Author

Thanks a lot for looking into it.

Also, I noticed that the smaller grids generated using numgrid, compared to PySCF give similar results. Even when the grids are half the size or even smaller. I am not sure if this is a coincidence or an advantage of the used method. But it is a very welcome feature for me.

I was also looking into parallelisation using Python, but don't seem to be getting somewhere with that.

I will also try to play with the older code that you mention and see if I can see something.

Thanks again!

@manassharma07
Copy link
Author

I am now able to parallelise the generation of grids. So basically the grids for different atoms are generated in parallel.

For a Benzene dimer (C12H12) which took 20-21 seconds serially, I was able to get the computation time down to 9-10 seconds.
But unfortunately, it is still not enough and the original example of 74 atoms is taking 240 seconds. It is much better than before, but I am not sure if it is enough. The parallel efficiency is around 50%. This is still nowhere near the PySCF though. It can generate grids with millions of grid points much faster. But I guess for small molecules (natoms<20), this parallelisation would be sufficient.

I did the parallelisation using JOBLIB. If you want I can post an example.

@bast
Copy link
Member

bast commented Aug 15, 2020

Long time ago I had this screening in:

// in principle good idea but fails for larger molecules containing
// diffuse sets if (a != icent && dist_a > BECKE_CUTOFF)
// {
// pa[a] = 0.0;
// continue;
// }

Without this the Becke partitioning scales quadratically. I removed that check because it lead to errors for very diffuse basis sets. So I could try to put that back in but needs some testing so that it is a conservative estimate.

@manassharma07
Copy link
Author

manassharma07 commented Aug 15, 2020

That's interesting. Would it be possible to have an optional argument to turn this on and provide some cutoff? By default it can be off. And if the optional argument is provided it can be turned on for testing/experimenting purposes.

I think C++ does support it.

@bast
Copy link
Member

bast commented Aug 15, 2020

I was also thinking of re-enabling it with a very conservative default which can be adjusted in the function call.

@manassharma07
Copy link
Author

manassharma07 commented Aug 15, 2020

Also, I am not sure but what do you think about using pre-computed interatomic distances

dist_ab = vx * vx + vy * vy + vz * vz;

I saw that PySCF and HORTON both are doing something like this (precomputing):
https://github.com/pyscf/pyscf/blob/master/pyscf/dft/gen_grid.py
https://github.com/theochem/horton/blob/master/horton/grid/becke.cpp

But I am not sure how much of an impact this is going to have.

@manassharma07
Copy link
Author

manassharma07 commented Aug 15, 2020

Similarly stuff like u_ab or rather effectively a_ab can also be pre-computed I guess:

u_ab = (R_a + R_b) / (R_b - R_a);

This and dist_ab should save like a million operations for a half million grid points.

@bast
Copy link
Member

bast commented Aug 23, 2020

I will work on this in September but the next two weeks are packed with events so I have to park this a bit but I see no reason why this should not be super fast. We will get there.

@bast
Copy link
Member

bast commented Sep 5, 2020

Good progress on porting the code to Rust and at the same time cleaning up some dusty corners. I will soon have it done (1-2 weeks?).

@bast
Copy link
Member

bast commented Sep 8, 2020

@manassharma07 would love to hear your view on #40.

@bast
Copy link
Member

bast commented Jan 3, 2021

Version 2.0.0 is out now and it should be a bit faster. Still, the bottleneck is the space partitioning and what I or somebody should do is to implement a more efficient space partitioning as suggested in #43.

@manassharma07
Copy link
Author

Thanks a lot! I could install it on a linux virtual machine and it works very well. Seems to be significantly faster. A big thank you!

@bast
Copy link
Member

bast commented Jan 17, 2021

That is great to hear - thanks for feedback! More work upcoming soon.

@bast
Copy link
Member

bast commented Jan 22, 2021

I will soon have the LKO and SSF partitioning schemes in place. Just need to do a bit more testing. These will significantly speed up grid generation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants