Follow up to #234
In that PR we are just saying that Hermitian and Symmetric matrices cannot be used for in-place accumulation of gradients.
Conceptually, though, they really can: because pullbacks and pushforwards are linear operators, they preserve this structure, so the differentials are also going to be Hermitian/Symmetric.
It's just that right now Julia is really fussy about what operations you are allowed to do to them, so it's not easy to just treat them like any other matrix type (see JuliaLang/LinearAlgebra.jl#773).
In contrast, we get away just fine treating (e.g.) `Diagonal` matrices like any other matrix type (which is likewise safe, because a linear operator avoids doing anything illegal).
Anyway, we could do something to work around Julia being fussy.
For example, we could use `parent` to unwrap it to get the matrix behind it, perform the in-place accumulation, and then wrap it back up again.
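A minimal sketch of that unwrap / accumulate / rewrap idea, assuming only the `LinearAlgebra` standard library. The helper name `add_via_parent!` is hypothetical (it is not an existing API), and it only handles `Symmetric`; a `Hermitian` version would look the same.

```julia
using LinearAlgebra

# Hypothetical helper, for illustration only: accumulate `dH` into `H` in place
# by writing to the dense matrix behind the wrapper, then wrapping it again.
function add_via_parent!(H::Symmetric, dH::Symmetric)
    P = parent(H)                        # the backing matrix, no copy
    P .+= dH                             # dense broadcast: writes every entry of P
    return Symmetric(P, Symbol(H.uplo))  # rewrap, keeping the same stored triangle
end

H1 = Symmetric([1.0 2.0; 2.0 3.0])
H2 = Symmetric([10.0 20.0; 20.0 30.0])
H3 = add_via_parent!(H1, H2)
```

Here `H3` wraps the same memory as `H1`, so no new matrix is allocated for the result.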
This might mean we end up doing 2x as much work as we conceptually need to, since we only need to actually update one half, and e.g. operations like `parent(H1) .= H2` (where `H1` and `H2` are both Hermitian/Symmetric) will update both halves.
The same applies to in-place multiplications.
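To make the "one half" point concrete, here is a rough sketch of an accumulation that only touches the stored triangle of the parent. The helper `add_stored_half!` is hypothetical and written as a plain loop for clarity, not speed; the `99.0` entry is deliberately junk in the unused triangle to show it is left alone.

```julia
using LinearAlgebra

# Hypothetical helper, for illustration only: update just the triangle
# that the Symmetric wrapper actually reads from.
function add_stored_half!(H::Symmetric, dH::Symmetric)
    P = parent(H)
    n = size(P, 1)
    upper = H.uplo == 'U'
    for j in 1:n
        rows = upper ? (1:j) : (j:n)   # rows of column j inside the stored triangle
        for i in rows
            P[i, j] += dH[i, j]
        end
    end
    return H
end

A = [1.0 2.0; 99.0 3.0]   # lower triangle is ignored by Symmetric(A) (uplo = :U)
H = Symmetric(A)
dH = Symmetric([10.0 20.0; 20.0 30.0])
add_stored_half!(H, dH)
```

After the call `H` holds the accumulated values while `A[2, 1]` is untouched, whereas a dense `parent(H) .+= dH` would have written that entry as well.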
It also feels pretty nasty, and means that internally it will look like some functions violate the principle that the primal and the differential have the same structure. That is something we might like to assert more often, and we won't really be able to do so as well with code internally manipulating the parent type.
But maybe that's OK.
One might think that we could unwrap then rewrap as `UpperTriangular(parent(H1))`, but that won't work out.
Even though it's much less fussy, and we don't run into these kinds of problems, it has different invariants.
In this case, the invariant is that you can't set the lower triangular part to a nonzero value.
And we will end up doing operations like adding a Symmetric/Hermitian matrix to it in place.
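A small sketch of the invariant mismatch, using only `LinearAlgebra`. The variable names are just for illustration.

```julia
using LinearAlgebra

H1 = Symmetric([1.0 2.0; 2.0 3.0])
U = UpperTriangular(parent(H1))

# Below the diagonal, U reads as zero, even though H1[2, 1] is 2.0.
lower_reads_as_zero = U[2, 1] == 0.0

# Writing a nonzero value below the diagonal is rejected with an error,
# so U cannot hold the lower half of a Symmetric/Hermitian sum.
write_rejected = try
    U[2, 1] = 5.0
    false
catch
    true
end
```

So an in-place update that needs to store a nonzero `[2, 1]` entry, as adding a Symmetric matrix does, has nowhere to put it in `U`.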