Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inelegant solution to #1821 #1934

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Conversation

ExpandingMan
Copy link
Contributor

@ExpandingMan ExpandingMan commented Oct 1, 2024

This is a particularly inelegant solution to #1821.

What I do here for Symmetric and Hermitian is take a redundant basis. This looks a bit strange because I both set the individual elements and also set the type of the resulting matrix. The reason for this is that if I do not set all the elements of the underlying buffer, Enzyme's autodiff returns the wrong result, but the wrapper type must also be set because currently BatchDuplicated only allows arguments and bases to be of the same type. I also go out of my way to call similar here (e.g. rather than zeros) to make it less likely we will run afoul of BatchDuplicated.

I think this works in general but I have not exhaustively checked it. I'm not sure how enzyme would be expected to deal with complex arguments here, so I've only tested purely real cases so far.

To see an example of how it works, consider symmetric matrix $X$ and
$$f(X) = X_{11}^2 + 2 X_{12}^2 + 3 X_{21}^2 - X_{22}^2$$
For extra clarity, write $X = \left(\matrix{a & b \cr b & c}\right)$ so that
$$f(X) = a^2 + 5 b^2 - c^2$$
The derivative with respect to the off-diagonal element $b$ is clearly $10b$.

I impelement this in Julia via

f0(x) = x[1,1]^2 + 2*x[1,2]^2 + 3*x[2,1]^2 - x[2,2]^2

Then, for example

X = Hermitian(Float64[1 2; 2 3])

so that

julia> gradient(Forward, f0, X)[1]
2×2 Enzyme.TupleArray{Float64, (2, 2), 4, 2}:
  2.0  20.0
 20.0  -6.0

This also works if the underlying data for X is [1.0 2.0; 0.0 3.0].

This will remain a draft until I come up with more comprehensive tests for it.

@ExpandingMan ExpandingMan marked this pull request as ready for review October 2, 2024 22:49
@ExpandingMan
Copy link
Contributor Author

Alright, this looks like it works. Again, while this may be a fix this solution is not very nice and involves a lot of redundancy, but I think anything better would be significantly more complicated.

I certainly would advise giving this a good think before merging. Keep in mind that if a user is really determined, they can create this on their own using only the existing public API (gradient and shadows) so I would find it perfectly reasonable if you decided it was better to close this and leave it broken for the foreseeable future rather than adding more behavior that it would be better to later break.

idx = idxs[i]
res = similar(parent(x))
for idx2 ∈ CartesianIndices(x)
@inbounds res[idx2] = _is_symmed_index(idx, idx2) ? 1 : 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need the check if symmed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is checking if it's either the index idx or its transpose. In my testing enzyme would return the wrong result if I did not set both elements (as if I had not wrapped it in Symmetric even though I had). I don't understand why that is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants