Problems with CG_METHOD=0 [encountered in many different use cases] #3

beddalumia · 2022-06-13T05:21:20Z

What's up

As I was performing some initial tests on the incoming changes on the fit_overhaul branch, I noticed that something is wrong with the Weiss field analytical gradient, as it currently is on master. By wrong I mean that the fitted field is qualitatively different from the original one, so seriously wrong.

Note that this happens only if I request the analytic gradient. Switching to CG_METHOD=1 (minimize routine) solves everything. Switching only to CG_GRAD=1 (numeric gradient) while keeping the NR routine gives the correct fit, but takes forever to compute, basically freezing for several minutes (✨holy minimize, mighty va10a✨).

Also note that the hybridization gradient appears to work fine, so maybe the actual problem lies within grad_g0and_replica. If that's the case I'll hit the wall very soon, when testing the gradients for the new 'elemental' norm. We'll see...
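A suspect analytic gradient can be checked against central finite differences. The following is a minimal sketch on a toy single-orbital Anderson Weiss field at half filling, not the code in ED_FIT_CHI2.f90: the names `g0_and`, `chi2`, `grad_chi2`, the flat frequency weights, and all numerical values are assumptions made for illustration.

```python
import math

BETA = 20.0  # illustrative inverse temperature (not the BETA of the input file)
FREQS = [1j * math.pi * (2 * n + 1) / BETA for n in range(32)]

def delta_and(iw, eps, v):
    """Toy Anderson hybridization: sum_k V_k^2 / (iw - eps_k)."""
    return sum(vk * vk / (iw - ek) for ek, vk in zip(eps, v))

def g0_and(iw, eps, v):
    """Toy Anderson Weiss field at half filling: 1 / (iw - Delta(iw))."""
    return 1.0 / (iw - delta_and(iw, eps, v))

def chi2(params, target):
    """(1/L) sum_n |G0and(iw_n) - G0(iw_n)|^2, params = [eps..., v...]."""
    nb = len(params) // 2
    eps, v = params[:nb], params[nb:]
    return sum(abs(g0_and(iw, eps, v) - t) ** 2
               for iw, t in zip(FREQS, target)) / len(FREQS)

def grad_chi2(params, target):
    """Analytic gradient, using dG0and/da = G0and^2 * dDelta/da."""
    nb = len(params) // 2
    eps, v = params[:nb], params[nb:]
    grad = [0.0] * (2 * nb)
    for iw, t in zip(FREQS, target):
        g = g0_and(iw, eps, v)
        common = (g - t).conjugate() * g * g  # conj(model - target) * dG0/dDelta
        for k in range(nb):
            grad[k] += 2.0 * (common * v[k] ** 2 / (iw - eps[k]) ** 2).real
            grad[nb + k] += 2.0 * (common * 2.0 * v[k] / (iw - eps[k])).real
    return [x / len(FREQS) for x in grad]

def numeric_grad(params, target, h=1e-6):
    """Central finite differences, one parameter at a time."""
    out = []
    for i in range(len(params)):
        up = list(params); up[i] += h
        dn = list(params); dn[i] -= h
        out.append((chi2(up, target) - chi2(dn, target)) / (2.0 * h))
    return out

# Target built from a three-site bath, trial point with a two-site bath.
target = [g0_and(iw, [-0.6, 0.0, 0.6], [0.3, 0.25, 0.3]) for iw in FREQS]
trial = [-0.4, 0.5, 0.35, 0.3]
analytic = grad_chi2(trial, target)
numeric = numeric_grad(trial, target)
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print("max |analytic - numeric| =", max_err)
```

If the two gradients disagree by an overall sign rather than by a small amount, the problem is most likely in the sign convention of the residual or of the prefactor, not in the derivative of the Anderson function itself.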

Evidence

I did not include the real part in the figure, since it is basically correct (though not totally: even with the numerical gradient I see some unwanted weight near the origin). I don't know if this helps, but I report it.
No restart file has been used, and here we are at one loop: we do not expect a brilliant fit, but a fair one, yes.
I attach the input file below, so that the problem can be reproduced with the current master head (dc046d8).

Notes

I've already added a warning for the user in commit b6351ff (fit_overhaul). We might want to copy those lines to master without waiting for the merge, if this is considered urgent.

Link to the lines

inputHM2D.conf

 WMIXING=0.500000000                           !Mixing bath parameter
 TS=2.500000000E-01                            !hopping parameter
 NX=1                                          !Number of cluster sites in x direction
 NY=1                                          !Number of cluster sites in y direction
 NKX=30                                        !Number of kx point for BZ integration
 NKY=30                                        !Number of ky point for BZ integration
 NLAT=1                                        !Number of cluster sites
 NORB=1                                        !Number of impurity orbitals (max 5).
 NBATH=7                                       !Number of bath sites:(normal=>Nbath per orb)(hybrid=>Nbath total)(replica=>Nbath=Nreplica)
 NSPIN=1                                       !Number of spin degeneracy (max 2)
 ULOC=1.000000000,0.d0,0.d0,0.d0,0.d0          !Values of the local interaction per orbital (max 5)
 UST=0.d0                                      !Value of the inter-orbital interaction term
 JH=0.d0                                       !Hunds coupling
 JX=0.d0                                       !S-E coupling
 JP=0.d0                                       !P-H coupling
 BETA=300.000000000                            !Inverse temperature, at T=0 is used as a IR cut-off.
 XMU=0.d0                                      !Chemical potential. If HFMODE=T, xmu=0 indicates half-filling condition.
 NLOOP=1                                       !Max number of DMFT iterations.
 DMFT_ERROR=1.000000000E-05                    !Error threshold for DMFT convergence
 SB_FIELD=1.000000000E-01                      !Value of a symmetry breaking field for magnetic solutions.
 GF_FLAG=T                                     !flag to evaluate GFs and related quantities.
 DM_FLAG=F                                     !flag to evaluate the cluster density matrix \rho_IMP = Tr_BATH(\rho))
 ED_TWIN=F                                     !flag to reduce (T) or not (F,default) the number of visited sector using twin symmetry.
 ED_SECTORS=F                                  !flag to reduce sector scan for the spectrum to specific sectors +/- ed_sectors_shift.
 ED_SECTORS_SHIFT=1                            !shift to ed_sectors
 ED_SPARSE_H=T                                 !flag to select  storage of sparse matrix H (mem--, cpu++) if TRUE, or direct on-the-fly H*v product (mem++, cpu--) if FALSE
 ED_GF_SYMMETRIC=F                             !flag to assume Gij = Gji
 ED_PRINT_SIGMA=T                              !flag to print impurity Self-energies
 ED_PRINT_G=T                                  !flag to print impurity Greens function
 ED_PRINT_G0=T                                 !flag to print non-interacting impurity Greens function
 ED_VERBOSE=5                                  !Verbosity level: 0=almost nothing --> 5:all. Really: all
 NSUCCESS=1                                    !Number of successive iterations below threshold for convergence
 LMATS=5000                                    !Number of Matsubara frequencies.
 LREAL=5000                                    !Number of real-axis frequencies.
 LTAU=1024                                     !Number of imaginary time points.
 LFIT=1000                                     !Number of Matsubara frequencies used in the \Chi2 fit.
 NREAD=0.d0                                    !Objective density for fixed density calculations.
 NERR=1.000000000E-04                          !Error threshold for fixed density calculations.
 NDELTA=1.000000000E-01                        !Initial step for fixed density calculations.
 NCOEFF=1.000000000                            !multiplier for the initial ndelta read from a file (ndelta-->ndelta*ncoeff).
 WINI=-5.000000000                             !Smallest real-axis frequency
 WFIN=5.000000000                              !Largest real-axis frequency
 HFMODE=T                                      !Flag to set the Hartree form of the interaction (n-1/2). see xmu.
 EPS=1.000000000E-02                           !Broadening on the real-axis.
 CUTOFF=1.000000000E-09                        !Spectrum cut-off, used to determine the number states to be retained.
 GS_THRESHOLD=1.000000000E-09                  !Energy threshold for ground state degeneracy loop up
 HWBAND=2.000000000                            !half-bandwidth for the bath initialization: flat in -hwband:hwband
 LANC_METHOD=arpack                            !select the lanczos method to be used in the determination of the spectrum. ARPACK (default), LANCZOS (T=0 only), DVDSON (no MPI)
 LANC_NSTATES_SECTOR=2                         !Initial number of states per sector to be determined.
 LANC_NSTATES_TOTAL=1                          !Initial number of total states to be determined.
 LANC_NSTATES_STEP=2                           !Number of states added to the spectrum at each step.
 LANC_NCV_FACTOR=10                            !Set the size of the block used in Lanczos-Arpack by multiplying the required Neigen (Ncv=lanc_ncv_factor*Neigen+lanc_ncv_add)
 LANC_NCV_ADD=0                                !Adds up to the size of the block to prevent it to become too small (Ncv=lanc_ncv_factor*Neigen+lanc_ncv_add)
 LANC_NITER=512                                !Number of Lanczos iteration in spectrum determination.
 LANC_NGFITER=200                              !Number of Lanczos iteration in GF determination. Number of momenta.
 LANC_TOLERANCE=1.000000000E-12                !Tolerance for the Lanczos iterations as used in Arpack and plain lanczos.
 LANC_DIM_THRESHOLD=1024                       !Min dimension threshold to use Lanczos determination of the spectrum rather than Lapack based exact diagonalization.
 CG_METHOD=0                                   !Conjugate-Gradient method: 0=NR, 1=minimize.
 CG_GRAD=0                                     !Gradient evaluation method: 0=analytic (default), 1=numeric.
 CG_FTOL=1.000000000E-05                       !Conjugate-Gradient tolerance.
 CG_STOP=0                                     !Conjugate-Gradient stopping condition: 0-3, 0=C1.AND.C2, 1=C1, 2=C2 with C1=|F_n-1 -F_n|<tol*(1+F_n), C2=||x_n-1 -x_n||<tol*(1+||x_n||).
 CG_NITER=500                                  !Max. number of Conjugate-Gradient iterations.
 CG_WEIGHT=1                                   !Conjugate-Gradient weight form: 1=1.0, 2=1/n , 3=1/w_n.
 CG_SCHEME=weiss                               !Conjugate-Gradient fit scheme: delta or weiss.
 CG_POW=2                                      !Fit power for the calculation of the Chi distance function as 1/L*|G0 - G0and|**cg_pow
 CG_MINIMIZE_VER=F                             !Flag to pick old/.false. (Krauth) or new/.true. (Lichtenstein) version of the minimize CG routine
 CG_MINIMIZE_HH=1.000000000E-04                !Unknown parameter used in the CG minimize procedure.
 HFILE=hamiltonian                             !File where to retrieve/store the bath parameters.
 IMPHFILE=inputHLOC.in                         !File read the input local H.
 LOGFILE=6                                     !LOG unit.


Here we start to unify and further develop the recent modifications to the fitting routines. More specifically, we start by re-introducing the old distance definition:

    \chi_{ij} = |FG_{ij}-FG_{ij}^Anderson|^q / W,
    \chi = \sum_{ij} \chi_{ij}

i.e. a generalized chi-square, computed element by element, and weighted with various schemes /on the Matsubara axis/. We plan to extend this definition to include weighting on the matrix structure, too (more later...). This had been replaced with a global matrix norm (Frobenius) in commit 3ad6af7. We keep the Frobenius distance available through a new input flag:

> CG_NORM={frobenius,elemental}

------------------------------------------------------------------------

RATIONALE

The reason for bringing back the old elemental distance is rooted in the need for flexibility in defining different weights for different matrix elements. A previous attempt at this can be found in the 'weighted_fit' (unmerged) branch, look here:

> https://github.com/QcmPlab/CDMFT-LANC-ED/commits/weighted_fit

Hence we define a generic infrastructure to assign weights on an element-by-element basis, as

    \chi_{ij} = \sum_{iw} |FG(iw)_{ij}-FG(iw)_{ij}^Anderson|^q / Wmats(iw),
    \chi = \sum_{ij} \chi_{ij} / Wmtrx_{ij}

For now two choices are available for Wmtrx:

• CG_MATRIX=0, giving equal weights to all components ('flat')
• CG_MATRIX=1, normalizing on the total spectral weight ('spectral')

> More specifically the spectral option defines:

    Wmtrx_{ij} = - \sum_{iw} Im[FG(iw)_{ij}] / beta
               = ∫A_{ij}(iw)diw
               = W_{diag}δ_{ij} + W_{off-diag}(1-δ_{ij})

where in general we expect W_{off-diag} << W_{diag}, making abundantly clear the rationale behind this weighting choice.

------------------------------------------------------------------------

NOTES (in no particular order)

• The actual value for W_{diag} is ≈1d0 for the Weiss field (we sum over all the Matsubara frequencies, not only the first Lfit ones). This way we can easily ensure the same normalization of the chi values if switching to the 'flat' matrix weights, thus allowing easier debugging and testing.

• The actual value for W_{diag} is NOT ≈1d0 for the bath hybridization. Recall that ∆=(D/2)^2*Gloc, so for D=1 it would be ≈0.25d0, which is what we find in our Nlat=Nspin=Norb=1 test runs (see below) on the 2d Hubbard model. For now I've hardcoded Wflat=0.25d0, but we should find a way to define it in terms of the hopping (not so trivial, since the hopping value is model dependent and the name of the variable is not enforced by the solver, with the possibility of different choices in different drivers).

• Speaking of normalization conventions, I've actually changed the last line of the Frobenius implementation(s), so as to divide also therein by Nlso = Nlat * Nspin * Norb, which corresponds to count(Hmask) in the elemental case.

• Actually the Hmask implementation is now totally different from what it used to be before the Frobenius update (which totally dropped Hmask). This is because we need the FGmatrix structure to be a whole NNN-array, since the Frobenius norm cannot in any (easy) way operate on a logical mask, being a whole-matrix formula. So we just define the mask and pass it to the sum() Fortran intrinsic when computing the final sum over matrix elements:

    \chi = \sum_{ij} \chi_{ij} / Wmtrx_{ij}

• More on Hmask: for now I just defined an internal ed_all_g=.true. flag and imported the current implementation from LIB_DMFT_ED, so all tests have been performed with Hmask=.true. (no mask). This is of course the safest option, thus appropriate for development. We should discuss the actual mask implementation for production, since I deem that to be the true reason for the improved fit quality of the Frobenius implementation over the old flat elemental chi-square: as far as I can tell, at least when CG_POW=2, the Frobenius norm has no way to produce different chi2 values wrt the old implementation, as long as you don't define a mask.

> does it really make sense to build the mask based on zeros in Hrepl?
> why not just exploit hermiticity of ∆ and g0, so a naive uplo mask?
> this has already been brought up a few times, e.g. I'm aware of
>   a. 0e5c272b45eda6b7ff652e2473b9ecda09e5ba8b on LIB_DMFT_ED
>   b. cb0af32 on CDMFT-LANC-ED
> so it might be time to discuss it all together.

• There are also many whitespace changes and new comments/printings, in line with https://github.com/QcmPlab/LIB_DMFT_ED/tree/0.5.2

------------------------------------------------------------------------

TESTING

For now all possible input flag combinations have been tested on the 2d Hubbard model driver only (cdn_hm_2dsquare) with Nlat=Nspin=Norb=1, so as to allow a cross-check with LIB_DMFT_ED. Everything has been tested with the minimize algorithm (CG_METHOD=1), since I've still not written the gradients for the elemental implementation. I'll point out only a few crucial outcomes:

• Frobenius norm and 'flat-weighted' elemental norm give the same fit, for CG_POW=2. I've not tested other powers, we might need to explore.

• Frobenius norm and 'spectral-normalized' elemental norm give slightly different fits of the real part of the Weiss field. I could not catch the reason for now (I surely expected an exact match with Nlso=1 and the same overall normalizations of the chi-square…). It could just be that ∫A(iw)diw is not really 1d0 (something like 0.97d0), so we actually increase the chi-square values and the provided tolerance changes scale (but I thought it was a relative tolerance… I might return to it).

• MOST IMPORTANTLY: Frobenius norm FAILS TO FIT the Weiss field, if the analytic gradient is used (the hybridization works fine instead). More info reported within issue #3.

> As I said, all cross-checks are evaluated with the numerical gradient, which is efficient only if using the minimize routine.
> I've added an explicit warning in the code, so as to alert users if they enter the function. (new lines 649-658 in ED_FIT_CHI2.f90)

------------------------------------------------------------------------

TODO

1. Write the analytical gradients for the elemental norm (ASAP).
2. Solve issue #3 for the Frobenius norm (I might defer it, sorry).
3. Test on true clusters (Nlat>1), where the Wmtrx choice is relevant.
4. Test on different models (I'd delegate to relevant people here).
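The relation between the two distances discussed in this commit message can be checked on toy data. The sketch below is an illustration under stated assumptions, not the ED_FIT_CHI2.f90 implementation: the function names are hypothetical, both distances are divided by the number of frequencies and of matrix elements (an assumed normalization chosen so that they can be compared directly), and the `uneven` weights are made-up numbers standing in for a 'spectral'-like choice with small off-diagonal weight.

```python
import math

def elemental_chi2(diff, w_mats, w_mtrx, q=2):
    """sum_ij [ sum_n |diff_ij(n)|^q / Wmats(n) ] / Wmtrx_ij, averaged."""
    n_freq, dim = len(diff), len(diff[0])
    total = 0.0
    for i in range(dim):
        for j in range(dim):
            chi_ij = sum(abs(diff[n][i][j]) ** q / w_mats[n]
                         for n in range(n_freq))
            total += chi_ij / w_mtrx[i][j]
    return total / n_freq / (dim * dim)

def frobenius_chi2(diff, w_mats, q=2):
    """sum_n ||diff(n)||_F^q / Wmats(n), with the same averaging."""
    n_freq, dim = len(diff), len(diff[0])
    total = 0.0
    for n in range(n_freq):
        frob = math.sqrt(sum(abs(x) ** 2 for row in diff[n] for x in row))
        total += frob ** q / w_mats[n]
    return total / n_freq / (dim * dim)

# Toy residual FG - FG^Anderson: 4 frequencies, 2x2 matrices.
diff = [[[complex(0.1 * (i + 1), -0.05 * (j + 1)) / (n + 1)
          for j in range(2)] for i in range(2)] for n in range(4)]
w_mats = [1.0] * 4
flat = [[1.0, 1.0], [1.0, 1.0]]
uneven = [[1.0, 0.1], [0.1, 1.0]]  # hypothetical small off-diagonal weights

same_q2 = abs(elemental_chi2(diff, w_mats, flat, 2) - frobenius_chi2(diff, w_mats, 2))
gap_q1 = abs(elemental_chi2(diff, w_mats, flat, 1) - frobenius_chi2(diff, w_mats, 1))
boost = elemental_chi2(diff, w_mats, uneven, 2) / elemental_chi2(diff, w_mats, flat, 2)
print(same_q2, gap_q1, boost)
```

With flat weights and q=2 the two values coincide, because the squared Frobenius norm is just the sum of the squared moduli of the elements. For q=1 they differ, and dividing by small off-diagonal weights increases the relative importance of the off-diagonal residuals.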

+ add debug printing of \grad{\chi^2} to CG_NORM=frobenius (to compare)

------------------------------------------------------------------------

TESTING [cdn_hm_2dsquare, Nlat=Nspin=Norb=1]

• CG_SCHEME = WEISS

We observe the very same problem reported in issue #3 for the Frobenius distance: the shape of the fitted function is not that of a Weiss field, but that of a hybridization function (there's a minimum, it goes to zero for iw -> 0). Does this imply that we have a problem within grad_g0and_replica()?

> I believe not, because it matches the DMFT_ED version quite literally.

• CG_SCHEME = DELTA

Recall that with the Frobenius distance we got a correct fit... Now with CG_NORM=elemental we get... again a qualitatively wrong fit (a similar situation really: the shape of the fitted function is not that of a hybridization function, but that of a Weiss field). This is becoming interesting... we call the same grad_delta_replica(), but with the Frobenius gradient we get the right fit, while the elemental gradient makes for a qualitatively wrong result (with the implementation being a literal port of the DMFT_ED one, which works totally fine!). The qualitative change of the function hints, to me, at a wrong /sign/ in the gradients, as if we were finding a maximum instead of a minimum.

> this appears to be indeed the case if we look at the printed dchi2 in two runs with everything equal but CG_NORM: dchi2(elemental) at first print is exactly -dchi2(frobenius). We have a lead.

------------------------------------------------------------------------

>> TO BE FURTHER INVESTIGATED (todo: update the issue report)

Recap: with CG_SCHEME=delta and CG_GRAD=1 we had

> a correct fit with CG_NORM=frobenius
> a wrong fit with CG_NORM=elemental
> dchi2(elemental) = -dchi2(frobenius) at first call.

>> So we try changing the sign of dchi2(elemental).

What happens: we fix the fitted \Delta function with CG_NORM=elemental.

Why this is suspicious: doing so we change sign wrt the DMFT_ED code (which works just fine in this test!)

• DMFT_ED (grad_chi2_delta_replica, line 363 of ED_FIT_REPLICA.f90)

    dchi2 = - cg_pow*sum(df,1) / Ldelta / totNso

• CDMFT_ED (grad_chi2_delta_replica_elemental, lines 650-655)

    do ia=1,size(a)
       dchi2(ia) = + cg_pow * sum( df(:,:,:,:,:,:,ia) / Wmat, Hmask)
       dchi2(ia) = dchi2(ia) / Ldelta / count(Hmask)
    enddo

> The change in sign has no clear justification!

------------------------------------------------------------------------

Similarly:

• Changing the sign of the dchi2 expression in grad_chi2_weiss_replica_elemental leads to a much improved fit of the Weiss field: at least it has no minimum and correctly diverges for iw -> 0.

> Again, there is no clear justification as to why the sign of dchi2 should change wrt the DMFT_ED implementation, which works fine.

• I actually found an analogous sign discrepancy between grad_chi2_weiss and grad_chi2_delta in the Frobenius implementation.

> Commit 38bb300 did swap Delta and FGmatrix in the expression defining df, effectively changing its sign (and no abs is taken downstream). But it left untouched the corresponding expression in grad_chi2_weiss_...
> So I swapped G0and and FGmatrix too and got the very same results as with grad_chi2_weiss_replica_elemental (meaning that the norm of the difference between the two fitted Weiss fields is of order d-15).

------------------------------------------------------------------------

WRAPPING UP)

So here I have swapped a few signs and pragmatically recovered decent fits of both Weiss and Delta, with both the Frobenius and the Elemental norm. But I find it very suspicious that these sign changes make the elemental implementation diverge from the analogous code in DMFT_ED, without a clear reason. One possibility is that the gradients for Delta and Weiss (not chi2, the Anderson functions themselves) introduce the wrong sign in their CDMFT_ED version, but I looked quite thoroughly at them and could not find the discrepancy. [Actually I touched grad_delta_replica a bit, only to make it formally identical to the DMFT_ED version, by just "compressing" some do loops...]

------------------------------------------------------------------------

NOTES)

For both Weiss and Delta, with both codes (DMFT and CDMFT), the numerical gradients give *way better* fits. For numerical gradients DMFT_ED works fine with both CG_METHOD={0,1}, but CG_METHOD=0 does *consistently* freeze (reaches CG_NITER without exiting) within CDMFT_ED. Since both codes call SciFortran for this, I cannot understand why this happens. Again, it's not random: DMFT_ED consistently succeeds with NR-CG and CDMFT_ED consistently fails with it (but all goes well if calling minimize-CG).

------------------------------------------------------------------------

TODO)

We may change the title of issue #3, for its scope appears to be wider.
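The sign bookkeeping at stake can be illustrated on a toy one-parameter hybridization fit. This is only a sketch under assumptions (single bath level at fixed energy `EPS`, flat weights, hypothetical function names), and it does not establish what actually happens in the CDMFT_ED routines. It shows that the prefactor of dchi2 must follow the order of the difference used to build df: with df built from (target - model) the prefactor is negative, with (model - target) it is positive, and a mismatched pair turns a descent step into an ascent step.

```python
import math

BETA, EPS = 20.0, 0.3  # illustrative values
FREQS = [1j * math.pi * (2 * n + 1) / BETA for n in range(32)]

def delta_and(iw, v):
    """Toy one-level hybridization: V^2 / (iw - eps)."""
    return v * v / (iw - EPS)

TARGET = [delta_and(iw, 0.5) for iw in FREQS]

def chi2(v):
    return sum(abs(t - delta_and(iw, v)) ** 2
               for iw, t in zip(FREQS, TARGET)) / len(FREQS)

def dchi2(v, residual, prefactor):
    """prefactor * 2 * mean(df), with df = Re[conj(residual) * dDelta/dV]."""
    total = 0.0
    for iw, t in zip(FREQS, TARGET):
        m = delta_and(iw, v)
        r = (t - m) if residual == "target-model" else (m - t)
        total += (r.conjugate() * 2.0 * v / (iw - EPS)).real
    return prefactor * 2.0 * total / len(FREQS)

v0, h, step = 0.3, 1e-6, 0.5
fd = (chi2(v0 + h) - chi2(v0 - h)) / (2.0 * h)  # finite-difference reference

consistent_a = dchi2(v0, "target-model", -1.0)  # matches fd
consistent_b = dchi2(v0, "model-target", +1.0)  # matches fd
mismatched = dchi2(v0, "model-target", -1.0)    # equals -fd

v_good = v0 - step * consistent_a  # moves towards the target V = 0.5
v_bad = v0 - step * mismatched     # moves away from it
print(fd, consistent_a, consistent_b, mismatched)
print(chi2(v_good), chi2(v0), chi2(v_bad))
```

A practical consequence is that comparing only the prefactor line of two implementations is not enough: the definition of df a few lines above has to be compared as well.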

beddalumia · 2022-06-17T05:53:21Z

NEWS (relative to fit_overhaul branch)

Commits 1bab32a and 3240d40 have shown that the issue is wider (hence the title change). Here I report some evidence and try to put together a brief recap.

RECAP

With numerical gradients everything works, with both "Frobenius" and "Elemental" definition of $\chi^2$ distance.

| | Deltaˆ | Weiss |
|---|---|---|
| Elemental | 🟡 | 🟢 |
| Frobenius | 🟡 | 🟢 |

ˆThe Delta fits are tagged "yellow" because they are of worse quality than the Weiss ones, when compared with NR-CG results from the DMFT_ED code (NR-CG consistently freezes with the CDMFT code; I don't know why). Yet minimize-CG results are all on the same level across the two codes, and very similar to the "yellow" ones (so we can take them as "fairly good"). More info on the freezing is in the 3240d40 commit message; plots of the fitted functions are reported below. Note that minimize and NR instead give the very same results for the Weiss field, hence the green tag.

With analytical gradients we see some problems if we a) leave unchanged Frobenius gradients (wrt master branch) and b) port analytic gradients for elemental $\chi^2$ from current version in LIB_DMFT_ED. The situation is:

| | Delta | Weiss |
|---|---|---|
| Elemental | 🔴 | 🔴 |
| Frobenius | 🟡 | 🔴 |

If we swap sign in the elemental implementation of $\nabla\chi^2$ (thus diverging from LIB_DMFT_ED!) we get:

| | Delta | Weiss |
|---|---|---|
| Elemental | 🟡 | 🟡 |
| Frobenius | 🟡 | 🔴 |

We further notice that commit 38bb300 had fixed the sign of $\nabla\chi^2(\Delta)$, but left the sign of $\nabla\chi^2(g_0)$ untouched. Applying the missing fix, we get:

| | Delta | Weiss |
|---|---|---|
| Elemental | 🟡 | 🟡 |
| Frobenius | 🟡 | 🟡 |

DETAILS

All plots with same input other than CG options, and at one loop.
Solid Line: FG
Dash-Dot: Fit

What I mean by "🔴" is

[Figure: two panels, Delta and Weiss]

What I mean by "🟡" is

[Figure: two panels, Delta and Weiss]

What I mean by "🟢" is

[Figure: two panels, Deltaˆ and Weiss]

ˆThis plot is the only one generated with DMFT_ED code, for this quality appears to be unreachable with minimize-CG, and NR-CG is de facto unavailable within CDMFT_ED.

beddalumia · 2022-06-28T15:05:09Z

Relevant update in SciFortran: QcmPlab/SciFortran@c742471

Effects on this issue to be tested (it could probably fix the freezing with CG_METHOD=0 and CG_GRAD=1).

edit: it does not.

Please be aware that issue #3 is still open and details some instances of the serious problems we still have with CG_METHOD=0. For that reason, here I switch the default to CG_METHOD=1 (the legacy minimize implementations), which has proven very reliable for many different clusters on the single-band square lattice. We'll return to the newer CG and to the analytic derivatives, but for now let's move on and merge the branch, which provides the new CG_NORM input parameter, either "elemental" or "frobenius". The latter amounts to what the latest master implemented; the former generalizes the old implementation, with the key difference of allowing different weights on different matrix elements for the chi evaluation.

You can control which weights to use with the CG_MATRIX input variable: 'flat' for the legacy way, 'spectral' for a new definition that has proven very robust in all our test cases. Note that 'flat' CG_MATRIX weights would lead to the very same chi2 as with CG_NORM="frobenius", if CG_POW=2 (and it should be 2 for a Frobenius norm). Otherwise the two norms give different chi values.

Any future restoring of a mask, to select which Weiss/Delta components to totally exclude from the chi, would make the two norms completely different a priori (there is no way to apply a mask to a whole-matrix operation such as the Frobenius norm). Note that we raise a warning if you request the Frobenius norm with CG_POW different from 2, and will do so for the mask too, when implemented (maybe even an error in that case).

beddalumia changed the title ~~Wrong Frobenius gradient for the Weiss field [Nlat=Nspin=Norb=1, 2d Hubbard model]~~ Problems with analytical gradients [Nlat=Nspin=Norb=1, 2d Hubbard model] Jun 17, 2022

beddalumia changed the title ~~Problems with analytical gradients [Nlat=Nspin=Norb=1, 2d Hubbard model]~~ Problems with CG_METHOD=0 [encountered in many different use cases] Nov 28, 2022
