-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hb solver #546
Conversation
…al, it's now possible to specify a solver for the heatbath, different from the one used in the MD evolution (computation of the forces). In order to do that I made the following changes to the code: 1. I added `int HB_solver; // solver for the HB only (optional) ` and ` solver_params_t HB_solver_params; // parameters of the HB solver ` to the `monomial` struct in `monomial/monomial.h`. 2. I wrote the declaration `%x HB_MSOLVER` in `read_input.l`. There I also wrote ` {SPC}*HB_Solver{EQL} { solver_caller=YY_START; BEGIN(NDMSOLVER); } ` in `<NDRATMONOMIAL,NDRATCORMONOMIAL,NDCLRATMONOMIAL,NDCLRATCORMONOMIAL,NDDETRATMONOMIAL,NDCLDETRATMONOMIAL>`, which I defined the block `<HB_MSOLVER>`, analogously to `<MSOLVER>`, replacing all the occurrencies of `mnl->solver =` with `mnl->HB_solver =`. ** To do**: do the same with the solver_parameters 3. I added `INVALID_SOLVER` to the `SOLVER_TYPE` enum in `solver/solver_type.h`. It's value is `26`. 4. I added the macro `#define _default_HB_solver_flag 26 // this is INVALID_SOLVER (see solver/solver_types.h)` in `default_input_values.h`. 5. In `monomial/cloverdetratio_monomial.c` I changed the function `cloverdetratio_heatbath()` replacing all the occurrencies of `mnl->solver` and `mnl->solver_params` with `mnl->HB_solver` and `mnl->HB_solver_params` respectively. 6. In `monomial/monomial.c`: - In `add_monomial()`, I added `monomial_list[no_monomials].HB_solver = _default_HB_solver_flag;` in the function `add_monomial()`. - In `init_monomials()`, when the HB solver is invalid, i.e. `(monomial_list[i].HB_solver == INVALID_SOLVER`, I set the `HB_solver` and ` Hb_solver_params` equal to the ones used in the MD evolution. In short: if the HB solver is not specified in the input file, the solver type and its parameters are copied from what was specified for the MD. (note that the `hbfunction` is the same, but now uses the HB solver and its parameters)
`<DETMONOMIAL,CLDETMONOMIAL,CLDETRATMONOMIAL,CLDETRATRWMONOMIAL,NDRATMONOMIAL,NDRATCORMONOMIAL,NDCLRATMONOMIAL,NDCLRATCORMONOMIAL,RATMONOMIAL,RATCORMONOMIAL,CLRATMONOMIAL,CLRATCORMONOMIAL,NDDETRATMONOMIAL,NDCLDETRATMONOMIAL>` which parses, for each monomial `mnl`, the `HB_solver_params` attributes. I did it by copying and pasting the same block for `mnl->solver_params` and preplacing those occurrences with `mnl->HB_solver_params`. Note that the default values don't need to be specified because the default corresponds to the solver used in the MD evolution.
da55dc9
to
bff599c
Compare
…atbath()` replacing all the occurrencies of `mnl->solver` and `mnl->solver_params` with `mnl->HB_solver` and `mnl->HB_solver_params` respectively.
read_input.l
Outdated
@@ -2446,6 +2498,10 @@ static inline double fltlist_next_token(int * const list_end){ | |||
solver_caller=YY_START; | |||
BEGIN(NDMSOLVER); | |||
} | |||
{SPC}*HB_Solver{EQL} { | |||
solver_caller=YY_START; | |||
BEGIN(NDMSOLVER); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you sure this is correct? Do we need to be able to set the HB solver also for the ND monomial?
@simone-romiti opened a pull request to review this, please see the inline comments |
@@ -248,6 +249,11 @@ int init_monomials(const int V, const int even_odd_flag) { | |||
} | |||
} | |||
else if(monomial_list[i].type == CLOVERDETRATIO) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need the same thing for DETRATIO
`HB_UseExternalInverter, HB_UseSloppyPrecision, HB_UseCompression`
` {SPC}*HB_Solver{EQL} { solver_caller=YY_START; BEGIN(NDMSOLVER); ` We don't need to be able to set the HB solver also for the ND monomial.
… HB_solver also for the DETRATIO monomial.
HB_solver specified only for the cloverdetratio and detratio monomials
- Removed extra spaces in blockS (flex syntax error) modified: read_input.l
monomial/monomial.c
Outdated
if (monomial_list[i].HB_solver == _default_HB_solver_flag) { | ||
monomial_list[i].HB_solver = monomial_list[i].solver; | ||
} | ||
monomial_list[i].HB_solver_params = monomial_list[i].solver_params; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please check this. As far as I can tell HB_solver_params
need always to be initialised , so I moved outside the if guard.
Using the monomial
BeginMonomial CLOVERDETRATIO
Timescale = 0
kappa = 0.1400645
2KappaMu = 0.000224103200
# numerator shift
rho = 0.0016
# denominator shift, should match CLOVERDET shift
rho2 = 0.0128
CSW = 1.74
AcceptancePrecision = 1.e-21
ForcePrecision = 1.e-18
Name = cloverdetratio2light
useexternalinverter = quda
MaxSolverIterations = 500
solver = mg
usesloppyprecision = single
# HB_solver = cg
# HB_usesloppyprecision = single
# HB_UseExternalInverter = quda
# HB_MaxSolverIterations = 50000
EndMonomial
I got the same onlinemeas after one trajectory:
- commenting out he HB_solver lines
==> HB_solver/onlinemeas.000001 <==
6 1 30 2.754953e-30 3.374063e-31
6 1 31 3.400805e-32 -2.501565e-31
6 1 32 -2.318780e-33 0.000000e+00
- keeping only the comment in the line
# HB_UseExternalInverter = quda
i.e. using the tmLQCD cg
==> HB_solver/onlinemeas.000001_with_HB_CPU_cg <==
6 1 30 2.754159e-30 3.371860e-31
6 1 31 3.399023e-32 -2.501224e-31
6 1 32 -2.242958e-33 0.000000e+00
- keeping
HB_solver
commented out
==> HB_solver_same_input/onlinemeas.000001 <==
6 1 30 2.754809e-30 3.377900e-31
6 1 31 3.415679e-32 -2.501734e-31
6 1 32 -2.284831e-33 0.000000e+00
- using 6b31350 ad of course
HB_solver
commented out
==> add_actions/onlinemeas.000001 <==
6 1 30 2.755184e-30 3.367085e-31
6 1 31 3.416626e-32 -2.500371e-31
6 1 32 -2.259483e-33 0.000000e+00
these these were done in
/qbigwork/garofalo/test_tmLQCD_QUDA/cA211.08.32
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, there should be defaults set for a few members in HB_solver_params
as is done for solver_params
. One way of doing this might indeed be to simply copy solver_params
, but only if this does not lead to things being overwritten.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will overwrite settings. Any defaults should be set in add_monomial
, which is called while reading the input file. init_monomials
, instead, is called after the input file has already been parsed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good way would be to set monomial_list[no_monomials].HB_solver_params = monomial_list[no_monomials].solver_params
in add_monomial
after the corresponding defaults in solver_params
have been set
.editorconfig
Outdated
@@ -0,0 +1,59 @@ | |||
[*] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please delete
configure_old
Outdated
@@ -0,0 +1,10320 @@ | |||
#! /bin/sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please delete
tmLQCD.kdev4
Outdated
@@ -0,0 +1,4 @@ | |||
[Project] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please delete
workspace.code-workspace
Outdated
@@ -0,0 +1,17 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please delete
@@ -0,0 +1,75 @@ | |||
# this is a sample input file for a single cloverdet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should not be here, I thought I removed this with the GPU removal commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I made, some confusion with the directory.
CG { | ||
if(myverbose) printf(" HB Solver set to \"%s\" line %d monomial %d\n", yytext, line_of_file, current_monomial); | ||
mnl->HB_solver = CG; | ||
BEGIN(solver_caller); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check this line. We found that BEGIN(solver_caller)
is necessary to continue reading the input file correctly but I do not understand the logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. This allows the parser to return to the point it came from (after having parsed the solver string). To be precise, this is what happens in flex:
<CLDETRATMONOMIAL,DETRATMONOMIAL>{
{SPC}*HB_Solver{EQL} {
solver_caller=YY_START;
BEGIN(HB_MSOLVER); -> jump to HB_MSOLVER parser
}
[...] -->
<HB_MSOLVER>{
CG {
if(myverbose) printf(" HB Solver set to \"%s\" line %d monomial %d\n", yytext, line_of_file, current_monomial);
mnl->HB_solver = CG;
BEGIN(solver_caller); --> jump back
}
[...]
} <-- return to this point
{SPC}*HB_MaxSolverIterations{EQL}{DIGIT}+ {
sscanf(yytext, " %[a-zA-Z_] = %d", name, &a);
mnl->HB_maxiter = a;
if(myverbose) printf(" HB_MaxSolverIterations set to %d line %d monomial %d\n", a, line_of_file, current_monomial);
}
[...]
While looking at this I realised that there are actually a few bugs in the parser. There are instances of name_caller
where instead there should be instances of solver_caller
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually surprised that the parsing of the monomials works correctly when the MG (or mixedbicgstab) is used as a solver... :)
Can you provide some more details on where these numbers were obtained and what is the ensemble in question? |
The only problematic case is the heatbath of the cloverdetratio2light monomial (at least with our current choice of mass preconditioning). You should be looking at the solve time rather than setup+solve as the setup has to be done anyway (as you say). The problem is most acute on M100, where the heatbath solve in the cloverdetratio2light takes several hundred seconds (much less on other machines). Also don't forget that the setup is only run once per job (when acceptance is sufficiently high). See the cA211.08.64 run:
First call is with the setup, the second call without. On a 32c64 test lattice (or another test lattice), the problem will not be this massive, of course. Similary, on Juwels Booster the MG performs much better, so even if it does 300-400 iterations, the impact on runtime is not too bad (using CG would still be faster though). |
Sorry I mean that test was on the same ensamble of the previous comment
There is a subdirectory add_actions which is my reference version with mg hb, please look at the last log file. For the cg hb there is a directory HB_solver and i was quoting the second last log file(cg hb only in cloverdet2light) and the last log file with cg hb also in cloverdet3light. |
…ere's a separate lib file now
The problem with the github action was due to the github runner moving to Ubuntu 22.04 and there, the MPI libs are distributed separately from the compiler wrapper and utilities. There's still another problem with the DDalphaAMG build because DDalphaAMG has issues with GCC 11.3 and how it organizes linking. I don't have the time to fix that problem right now. I'll try to check the documentation tomorrow (not sure if I'll manage) and will pull this in then. |
I've improved the github actions through "artifact upload" (https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts) which makes the various config.log (and output.data) accessible through the web interface post-build. |
alright, finally fixed our CI ... |
Damn, there's an issue with the initialisation of With
which should thus allow for 100k iterations in the HB, acceptance and derivative steps, I get:
|
@simone-romiti can you find the logic bug? It seems to me that it should work correctly:
|
The flow I see is the following, let me know it you agree on that:
|
Thanks for checking, I agree with your conclusions. When I reviewed I also agree that |
No description provided.