
About the gauge warning in svd AD #139

Open
Confusio opened this issue Feb 22, 2025 · 5 comments
@Confusio

When running, I sometimes encounter the following warning:

Warning: `svd` cotangents sensitive to gauge choice: (|Δgauge| = 1.1322835304708984e10)

My understanding is that this warning indicates that the gradients (cotangents) produced by the SVD operation are extremely sensitive to the choice of gauge, due to the phase freedom of the complex SVD. Is it correct to assume that near-degeneracy (or degeneracy) of the singular values is also responsible for this gauge sensitivity? How can I work around this problem?

@pbrehmer (Collaborator)

I'm not yet sure of the exact origins of this gauge sensitivity. I also suspect that it has to do with the (near-)degeneracy of the singular values. Typically, one works around this by applying Lorentzian broadening to the singular value differences in the SVD reverse-rule. Currently, the TensorKit and KrylovKit SVD reverse-rules do not yet support this broadening. I could imagine that smaller gauge sensitivities are not too problematic and might be projected out later in the backpropagation.
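For reference, here is a minimal sketch of the kind of broadening I mean, written for a plain-matrix SVD reverse-rule (the function name and the placement of ε are illustrative, not the TensorKit/KrylovKit API): the divergent factor 1 / (s_j^2 - s_i^2) is replaced by the Lorentzian-regularized x / (x^2 + ε^2).

```julia
# Hypothetical sketch of Lorentzian broadening in an SVD reverse-rule.
# The bare factor F[i, j] = 1 / (s[j]^2 - s[i]^2) diverges for
# (near-)degenerate singular values; the broadened version
# x / (x^2 + ε^2) agrees with 1 / x whenever |x| >> ε.
function broadened_inv_differences(s::AbstractVector{<:Real}, ε::Real)
    n = length(s)
    F = zeros(float(eltype(s)), n, n)
    for j in 1:n, i in 1:n
        i == j && continue               # diagonal entries are unused
        x = s[j]^2 - s[i]^2
        F[i, j] = x / (x^2 + ε^2)        # Lorentzian-broadened 1 / x
    end
    return F
end
```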

I have wanted to investigate this further for a while now, since these gauge warnings really pop up in many cases. I have a custom SVD reverse-rule which supports broadening, so perhaps we can implement that as a workaround and see if it improves things.

@lkdvos Do you have any ideas on how to incorporate Lorentzian broadening into the TensorKit/KrylovKit adjoints? I would have to think about how to do that for KrylovKit's vector-wise formulation and also in the Arnoldi case.

@lkdvos (Member) commented Feb 23, 2025

I think if there are degeneracies in the spectrum, this warning will always fire, since then even our mathematical assumptions break down and I'm not sure the SVD rrule implementation is expected to work at all. Otherwise, this warning indeed means that the cost function depends on the choice of gauge, which should not happen. However, the actual implementation simply projects out this contribution, so this might not necessarily be a problem.
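For concreteness, here is a rough sketch of what such a check could look like on plain matrices, assuming the convention from the complex-SVD AD literature that a cotangent (ΔU, ΔV) is gauge invariant iff the imaginary diagonal of U' * ΔU + V' * ΔV vanishes; the names are illustrative, not the actual implementation:

```julia
using LinearAlgebra

# Illustrative only: the gauge-dependent component of an SVD cotangent
# (ΔU, ΔV), following the usual convention that gauge invariance requires
# Im(diag(U'ΔU + V'ΔV)) = 0. The warning would report the norm of this
# component as |Δgauge|.
gauge_component(U, V, ΔU, ΔV) = imag.(diag(U' * ΔU) .+ diag(V' * ΔV))

Δgauge_norm(U, V, ΔU, ΔV) = norm(gauge_component(U, V, ΔU, ΔV))
```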

It also depends a lot on the configuration of all the algorithms. For example, if we are doing linear solves with random initial guesses, I think we are indeed feeding in components that have contributions along these "gauge directions", but we really do want to project them out, so that would be completely okay, although it might lead to some loss of stability because of finite-precision effects.

I'm a bit confused about the Lorentzian broadening helping here though, as that is typically designed to resolve the problem that changes in the smallest singular values get blown up by the 1 / (s_i^2 - s_j^2) term. This does not actually alter the fact that your cost function depends on the gauge, and you can easily trigger this even with a completely well-behaved SVD: `f(x) = tr(tsvd(x)[1])` already has this problem, and is simply not a differentiable cost function since it is completely discontinuous: eps-sized changes in x can cause large changes in the phases, so the derivative is ill-defined.
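As a self-contained illustration of this phase freedom (using plain LinearAlgebra's dense `svd` rather than `tsvd`): multiplying U and V by the same diagonal phases leaves the decomposition intact but changes tr(U), so that cost function cannot be well-defined.

```julia
using LinearAlgebra

A = randn(ComplexF64, 4, 4)
U, S, V = svd(A)

# Any diagonal unitary Λ gives an equally valid SVD: (UΛ) S (VΛ)' == U S V'.
Λ = Diagonal(exp.(2π * im .* rand(4)))
U2, V2 = U * Λ, V * Λ

@show norm(U2 * Diagonal(S) * V2' - A)   # ≈ 0: same decomposition of A
@show abs(tr(U) - tr(U2))                # generically nonzero: tr(U) is gauge-dependent
```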

Of course, the entire PEPS stack is riddled with things that seem to be sensitive to numerical stability issues, so it is not unreasonable that one might affect the other.

Practically though, this whole SVD thing has been something that I should have done for quite a while, and I just don't find the time to get to it. With some novel changes in TensorKit coming, based around MatrixAlgebraKit, it also does not seem likely that I will find the time to get to it before that is fleshed out.

@pbrehmer (Collaborator)

> I'm a bit confused about the Lorentzian broadening helping here though, as that is typically designed to resolve the problem that changes in the smallest singular values get blown up by the 1 / (s_i^2 - s_j^2) term. This does not actually alter the fact that your cost function depends on the gauge

I am also quite unsure about this. Certainly, for exact degeneracies the math of the SVD adjoint breaks down, but quasi-degeneracies may be equally problematic. For those we wouldn't get a warning in the CTMRG forward pass (we only check for nearly exact degeneracies), but they might still lead to instabilities in the reverse pass, and hence to issues further down the PEPS stack. In any case, we shouldn't forget about the Lorentzian broadening, because it seems that it really is a necessary thing to have in some cases.
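A quick back-of-the-envelope example of why quasi-degeneracies can be harmless in the forward pass yet dangerous in the reverse pass (the broadening parameter ε here is just an illustrative choice):

```julia
# Two quasi-degenerate singular values that would pass a forward-pass
# degeneracy check, but inflate the adjoint factor 1 / (s_j^2 - s_i^2):
s₁, s₂ = 1.0, 1.0 - 1e-8
x = s₁^2 - s₂^2                  # ≈ 2e-8

@show 1 / x                      # ≈ 5e7: bare adjoint factor
ε = 1e-6
@show x / (x^2 + ε^2)            # ≈ 2e4: Lorentzian-broadened factor
```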

> Practically though, this whole SVD thing has been something that I should have done for quite a while, and I just don't find the time to get to it. With some novel changes in TensorKit coming, based around MatrixAlgebraKit, it also does not seem likely that I will find the time to get to it before that is fleshed out.

I wouldn't stress about it :-) But perhaps in the meantime we can try to get a better grasp on where these gauge dependencies really come from and if/when they are relevant. I also find it hard to find time for these things currently, but perhaps I can allocate some time next week.

@Confusio (Author)

Many thanks for the helpful discussions. After several tests, I've gained some insight into these warnings. In line with the discussion above, my findings suggest that nearly degenerate singular values are not the primary cause, and that the broadening may not help resolve the issue.

I think the problem lies in the construction of PEPS states with specific symmetry constraints in my tests. My trial states are classified by both SU(2) and point-group symmetries, and some of them become highly specialized and consequently more fragile during optimization, resulting in potentially large gradients.

I found that initializing from either random states or physically reasonable states significantly reduces the frequency of the warnings. Although warnings still occasionally appear, their magnitudes are typically close to the specified tolerance value `tol`, of order 1e-10 or smaller. In short, this appears to be a physical issue inherent to the symmetry-constrained parameter space.

@pbrehmer (Collaborator)

Thanks a lot for reporting!
