Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Orca to trexio #22

Merged
merged 6 commits into from
Sep 25, 2023
Merged

Orca to trexio #22

merged 6 commits into from
Sep 25, 2023

Conversation

v1j4y
Copy link
Contributor

@v1j4y v1j4y commented Sep 6, 2023

Added everything except ECPs.

Converter uses filename.json orca data repository created using orca_2json to obtain the corresponding trexio_file.h5 file. Note that the orca_2json is the long term supported data repository generator for ORCA.

  • ECPs are not yet supported by orca_2json. This will need to be implemented in ORCA before we can implement it here.
  • Needs testing.

@v1j4y
Copy link
Contributor Author

v1j4y commented Sep 6, 2023

@scemama what is the recommended way to test the converter ?

@scemama
Copy link
Member

scemama commented Sep 7, 2023

To test the produced TREXIO files, the best solution is to run:

trexio check-mos  FILE.trexio

It will compute numerically the overlap of MOs, and if your MOs are orthonormal it is easy to see that everything is OK. With the -n option, you can increase the number of grid points.
If the calculation takes too much time, you can install QMCkl (make python-install in qmckl) to accelerate the calculation :-)

@scemama
Copy link
Member

scemama commented Sep 7, 2023

You should also run

trexio check-basis

to compare the overlap matrix in the trexio file with a numerically computed one. If they match, it means that all the conventions and normalization factors are OK. Because it is possible that a normalization factor is missing in the AOs and that you include it inside the MO coefficients, so that the MO check doesn't detect the problem.

@q-posev
Copy link
Member

q-posev commented Sep 7, 2023

Amazing @v1j4y, thanks!

If you have some test files, you can upload them to the data folder of this repo and we can then modify the CI to test ORCA converter.

@q-posev
Copy link
Member

q-posev commented Sep 7, 2023

I see a lot of comments referring to PySCF. Is it because ORCA's notations are similar to PySCF?

@v1j4y
Copy link
Contributor Author

v1j4y commented Sep 11, 2023

OK thanks for the comments !

Currently, there's a problem with normalization of AOs which was revealed by trexio check-mos, I'm fixing this atm.

@v1j4y
Copy link
Contributor Author

v1j4y commented Sep 13, 2023

OK !
Just FYI, everything was ok except that the coordinates had to be given in Bohrs ! This fixes the normalization issues with AOs.

Also, Dagmar Lenk has implemented ECPs in orca_2json so I'll also add it here. 😉

@scemama
Copy link
Member

scemama commented Sep 14, 2023

So we wait for ECPs before we merge, or you make another PR?

@v1j4y
Copy link
Contributor Author

v1j4y commented Sep 14, 2023

Yes !
I'm implementing ECPs following the pyscf to trexio functions by kosuke.
I'll tell you when it's done.

@v1j4y
Copy link
Contributor Author

v1j4y commented Sep 15, 2023

@scemama I've implemented ECPs. Is there any way to check for ECPs ?

Also, I'm testing the water molecule which was used for the pyscf test by Kosuke.

Here's the output (I changed the integration domain to [40.,40.,40.]):

trexio check-basis h2o -b TEXT --n_points 500
Integration steps: [0.037033353988351204, 0.03301131648479499, 0.035256541967839575]

  0   0        0.999923        1.000000
  0   1       -0.344015       -0.344017
  0   2       -0.157869       -0.157870
  0   3       -0.000000       -0.000000
  0   4       -0.000000        0.000000
  0   5        0.000000       -0.000000
  0   6       -0.000000        0.000000
  0   7       -0.000000       -0.000000
  0   8        0.000000        0.000000
  0   9       -0.047843       -0.047844
  0  10       -0.062517       -0.062518
  0  11       -0.047843       -0.047844
  0  12       -0.062517       -0.062518
  1   1        1.000000        1.000000
  1   2        0.788941        0.788941
  1   3       -0.000000       -0.000000
  1   4       -0.000000        0.000000
  1   5        0.000000        0.000000
  1   6        0.000000        0.000000
  1   7       -0.000000        0.000000
  1   8        0.000000        0.000000
  1   9        0.329268        0.329268
  1  10        0.391443        0.391443
  1  11        0.329268        0.329268
  1  12        0.391443        0.391443
  2   2        1.000000        1.000000
  2   3        0.000000       -0.000000
  2   4       -0.000000       -0.000000
  2   5        0.000000        0.000000
  2   6        0.000000        0.000000
  2   7       -0.000000        0.000000
  2   8       -0.000000        0.000000
  2   9        0.511019        0.511019
  2  10        0.690325        0.690325
  2  11        0.511019        0.511018
  2  12        0.690325        0.690325
  3   3        0.177129        1.000000
  3   4       -0.000000        0.000000
  3   5       -0.000000        0.000000
  3   6        0.219506        0.502141
  3   7       -0.000000        0.000000
  3   8       -0.000000        0.000000
  3   9       -0.126986       -0.025610
  3  10       -0.047951       -0.009642
  3  11        0.051811       -0.247551
  3  12        0.019564       -0.093205
  4   4        0.177129        1.000000
  4   5       -0.000000        0.000000
  4   6       -0.000000        0.000000
  4   7        0.219506        0.502141
  4   8       -0.000000        0.000000
  4   9       -0.027811       -0.273671
  4  10       -0.010502       -0.103040
  4  11       -0.034081        0.111660
  4  12       -0.012869        0.042041
  5   5        0.177129        1.000000
  5   6       -0.000000        0.000000
  5   7       -0.000000        0.000000
  5   8        0.219506        0.502141
  5   9       -0.011883       -0.059936
  5  10       -0.004487       -0.022567
  5  11       -0.114866       -0.073448
  5  12       -0.043374       -0.027654
  6   6        0.904436        1.000000
  6   7       -0.000000        0.000000
  6   8        0.000000        0.000000
  6   9       -0.575744       -0.056653
  6  10       -0.361553       -0.035577
  6  11        0.234908       -0.547616
  6  12        0.147517       -0.343890
  7   7        0.904436        1.000000
  7   8        0.000000        0.000000
  7   9       -0.126093       -0.605398
  7  10       -0.079183       -0.380175
  7  11       -0.154519        0.247007
  7  12       -0.097034        0.155115
  8   8        0.904436        1.000000
  8   9       -0.053878       -0.132588
  8  10       -0.033834       -0.083262
  8  11       -0.520793       -0.162478
  8  12       -0.327045       -0.102032
  9   9        1.000000        1.000000
  9  10        0.684800        0.684800
  9  11        0.120869        0.120869
  9  12        0.307324        0.307324
 10  10        1.000000        1.000000
 10  11        0.307324        0.307324
 10  12        0.606683        0.606683
 11  11        1.000000        1.000000
 11  12        0.684800        0.684800
 12  12        1.000000        1.000000
Norm of the error: 2.624232

Does this look OK or does it need more points ?
Is the error big ?

@scemama
Copy link
Member

scemama commented Sep 15, 2023

@scemama I've implemented ECPs. Is there any way to check for ECPs ?

No. The simplest way is to import the TREXIO file in another code, and check that the energy is correct. You can do it with quantum package with qp_import_trexio , and the qp run print_energy.

Does this look OK or does it need more points ?
Is the error big ?

It is hard to tell... If you install the python interface to QMCkl, trexio-tools will use it to compute the AOs and MOs, so it will be orders of magnitude faster. Then, you can increase the number of points for the integration, ans see if it converges towards an identity matrix. But you can also compute the energy with quantum package to cross-check ;-)

@v1j4y
Copy link
Contributor Author

v1j4y commented Sep 22, 2023

Hi !

@scemama I'm almost done, I get the same energies as orca after doing qp_import_trexio , and then qp run print_energy for files without ECPs. I've also added an orca file to test the conversion h2o.json. I'm now testing the ECPs with orca.

@q-posev Do I need to do anything else for the CI ? How do I set up the CI ?

@v1j4y v1j4y requested a review from scemama September 22, 2023 13:09
@q-posev
Copy link
Member

q-posev commented Sep 23, 2023

Bravo @v1j4y !

The CI is already in place, see test.yml file. As I wrote above, for the CI we would need you to upload one (or a few) test ORCA JSON file(s) to the data folder of this repo and then we can add the corresponding test lines (calling trexio convert-from -t orca ...) to the CI. Unless there is a licensing issue which does not allow storage of ORCA JSON files in the open source repos.

For now we only test that converters "work" (i.e. they do not raise errors while processing output of a q-chem program and they do produce the TREXIO file in the end). The validity of the TREXIO file is only checked for GAMESS for now.

@scemama this is something related to non-regression of the converters which worries me sometimes (see my comments in #12), perhaps at some point we want to make sure that produced TREXIO files are valid and have not changed drastically compared to some fixed ref. We could do h5diff on TREXIO HDF5 files or we could do check-basis and check-mos in the CI for all converters. What do you think?

@v1j4y Let me know if you need help with the CI.

Copy link
Member

@q-posev q-posev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@v1j4y Sorry, my bad, I have not seen the h2o.json file that you pushed already. Then the only thing left for testing the ORCA->TREXIO is adding the following line to test.yml (I cannot do it since PR is from the fork)

trexio convert-from -t orca     -i data/h2o.json  -b hdf5  trexio_h2o_orca.h5

@scemama
Copy link
Member

scemama commented Sep 25, 2023

@scemama this is something related to non-regression of the converters which worries me sometimes (see my comments in #12), perhaps at some point we want to make sure that produced TREXIO files are valid and have not changed drastically compared to some fixed ref. We could do h5diff on TREXIO HDF5 files or we could do check-basis and check-mos in the CI for all converters. What do you think?

@q-posev I think it is a good idea to check that the files are OK. But I would use the text backend for that with diff, as text files are better handled with git.

@scemama scemama merged commit 1296e1c into TREX-CoE:master Sep 25, 2023
1 check passed
@q-posev
Copy link
Member

q-posev commented Sep 25, 2023

I'm adding the ORCA test to CI on master.

@v1j4y
Copy link
Contributor Author

v1j4y commented Sep 25, 2023

Thanks @q-posev !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants