Update gauge fixing to work with symmetries #19
I can try to code up the truncated eigensolve adjoint similar to the SVD truncated adjoint (also detailed in here), which would add a term to the conventional eigensolve adjoint that accounts for the truncated ranks. This could just be added to PR #15. (An SVD would also work instead of an eigendecomposition here, if we decide that implementing the |
https://github.com/Jutho/KrylovKit.jl/tree/eigsolve_ad/src/adrules This has been stuck there for quite a long time, but maybe I'll find the time to properly implement and test these things so that this can be merged. I am pretty sure that these rules work, and they should work for generic functions, so it might be worth a try.
e0ee346 to f7cf904
I copied the This doesn't work yet, specifically the linear problem rrule sometimes doesn't converge because things inside the anonymous function |
Some questions I have:
[EDIT] |
This is kind of expected since the default tolerance for checking the element-wise convergence is set to
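For reference, an element-wise convergence check of the kind mentioned here can be sketched as follows. This is a plain-Python illustration, not the package's code: the function name `elementwise_converged` and the default `tol=1e-6` are hypothetical placeholders, since the actual default tolerance is not shown in this thread.

```python
def elementwise_converged(old, new, tol=1e-6):
    """Compare two flat sequences of (real or complex) numbers entry by entry.

    `tol` is a hypothetical placeholder, not the package default.
    Returns (converged, max_abs_diff).
    """
    if len(old) != len(new):
        raise ValueError("sequences must have the same length")
    # Largest absolute deviation between corresponding entries.
    max_diff = max((abs(x - y) for x, y in zip(old, new)), default=0.0)
    return max_diff < tol, max_diff


# Two iterates that agree up to 1e-9 in every entry count as converged.
print(elementwise_converged([1.0, 2.0], [1.0, 2.0 + 1e-9]))
```

A check like this only passes if the gauge is fixed consistently between iterations, which is why a loose or tight tolerance changes whether it reports convergence.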
Hmm, while I don't really understand why it works, I think the gradient comes out correct. If you run the optimization on the Heisenberg Hamiltonian you seem to get correct gradients; the ground-state energy converges correctly up to high accuracy. Edit: Also after some further experimenting, I still don't understand why |
…ent-wise convergence check
While I haven't made sense of the |
I'm still confused about the gradient. I'm quite sure that my @checkgrad implementation just discarded the gradient; you can test this by returning the gradient after printing, which breaks it again. The contribution might be small, which would let the Heisenberg example still converge, but it still feels wrong to just discard it.
I fully agree that just discarding the gradient is a bad idea; we should be able to figure out how to compute the adjoint properly. Still, I wasn't able to make any progress on the Another thought I had is that the adjoint of an eigendecomposition contains possibly diverging terms with
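For context, the textbook reverse-mode rule for a full eigendecomposition shows where such diverging terms can come from. The formula below is the standard result for a real symmetric matrix $A = U \Lambda U^{T}$, where $\bar{U}$ and $\bar{\Lambda}$ are the incoming adjoints. It is given as a reference point only and is not necessarily the form used in this PR, where the eigenproblem is solved iteratively for a few eigenpairs.

$$
\bar{A} = U \left( \bar{\Lambda} + F \circ \left( U^{T} \bar{U} \right) \right) U^{T},
\qquad
F_{ij} =
\begin{cases}
\dfrac{1}{\lambda_j - \lambda_i}, & i \neq j, \\
0, & i = j.
\end{cases}
$$

The entries of $F$ blow up whenever two eigenvalues become (nearly) degenerate, so even tiny eigenvalue gaps can produce very large or non-finite contributions to $\bar{A}$.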
When I was playing around with it, I also felt like it wasn't actually the rrule of eigsolve that caused the problem, and that the NaN actually appeared sooner. But it does feel like sometimes there are just some incredibly small values, which then seem to cause this behaviour.
Okay, so I think the problem must be somewhere in the differentiation of |
Just putting these things here so I don't forget them; I will try to address them myself tomorrow if I find some time.
I cleaned up the VectorInterface part, and did some minor tweaks. For me this is ready to go. |
Placeholder PR for an update to the gauge fixing algorithm to one that works for symmetric `TensorMap`s. This implementation fixes issue #18, but it breaks the AD since it uses `KrylovKit.eigsolve`, which is not differentiable. Hopefully this works once we add the corresponding rrule.

We should probably also try the other approaches, but this seemed the simplest one that worked, so I am just adding it here for reference and discussion.
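To illustrate what a reverse rule for an eigensolve has to supply, here is a minimal sketch in a much simpler setting than the operator eigenproblem in this PR: the dominant eigenvector of a real symmetric 2×2 matrix, with the sign gauge fixed by making the first component positive. It is written in dependency-free Python rather than Julia, all function names are hypothetical, and it is not KrylovKit's implementation. The analytic adjoint is compared against central finite differences.

```python
import math


def dominant_eigpair(a, b, c):
    """Both eigenpairs of [[a, b], [b, c]] (requires b != 0), dominant one first."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    lam1, lam2 = mean + radius, mean - radius

    def unit_vector(lam):
        x, y = b, lam - a  # (b, lam - a) solves (A - lam*I) x = 0
        norm = math.copysign(math.hypot(x, y), x)  # gauge: first component > 0
        return (x / norm, y / norm)

    return lam1, unit_vector(lam1), lam2, unit_vector(lam2)


def loss(a, b, c, w):
    """Scalar test function L = w . v, with v the gauge-fixed dominant eigenvector."""
    _, v, _, _ = dominant_eigpair(a, b, c)
    return w[0] * v[0] + w[1] * v[1]


def eigvec_adjoint(a, b, c, w):
    """Analytic (dL/da, dL/db, dL/dc) from first-order perturbation theory.

    With vbar = w, the matrix adjoint is Abar = (w . u) / (lam1 - lam2) * u v^T.
    """
    lam1, v, lam2, u = dominant_eigpair(a, b, c)
    coeff = (w[0] * u[0] + w[1] * u[1]) / (lam1 - lam2)
    abar = [[coeff * u[i] * v[j] for j in range(2)] for i in range(2)]
    # b appears in both off-diagonal entries, so their adjoints add up.
    return abar[0][0], abar[0][1] + abar[1][0], abar[1][1]


def finite_difference(a, b, c, w, h=1e-6):
    """Central finite differences of `loss` with respect to a, b and c."""
    da = (loss(a + h, b, c, w) - loss(a - h, b, c, w)) / (2 * h)
    db = (loss(a, b + h, c, w) - loss(a, b - h, c, w)) / (2 * h)
    dc = (loss(a, b, c + h, w) - loss(a, b, c - h, w)) / (2 * h)
    return da, db, dc


w = (0.3, -0.7)
print(eigvec_adjoint(2.0, 1.0, 0.0, w))
print(finite_difference(2.0, 1.0, 0.0, w))
```

The eigenvalue gap `lam1 - lam2` sits in the denominator, so the rule becomes ill-conditioned near degeneracies, and the result is only well defined once the gauge (here a sign) of the eigenvector is fixed.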