
Stop pirating eigen and svd functions from LinearAlgebra #147


Open: andreasnoack wants to merge 2 commits into master from an/nopirates

Conversation

andreasnoack
Member

Instead, define separate unexported functions in the GenericLinearAlgebra
module. Hence, users would have to explicitly opt in to using
GenericLinearAlgebra. A consequence is that the implementations
in this module will be used even for the element types supported by
LAPACK.

Also, delete unused functionality.

Finally, also run the formatter.


codecov bot commented Apr 18, 2025

Codecov Report

Attention: Patch coverage is 92.96875% with 9 lines in your changes missing coverage. Please review.

Project coverage is 91.83%. Comparing base (55a38e6) to head (e037d77).

Files with missing lines   Patch %   Lines
src/eigenGeneral.jl        76.92%    3 Missing ⚠️
src/eigenSelfAdjoint.jl    96.42%    3 Missing ⚠️
src/householder.jl         62.50%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #147      +/-   ##
==========================================
- Coverage   91.94%   91.83%   -0.11%     
==========================================
  Files          10        9       -1     
  Lines        1577     1556      -21     
==========================================
- Hits         1450     1429      -21     
  Misses        127      127              


@andreasnoack andreasnoack force-pushed the an/nopirates branch 2 times, most recently from 6ccaab3 to 98ca476 Compare April 18, 2025 16:16
@araujoms
Contributor

Please don't do this. My package Ket relies heavily on eigen and friends dispatching to LAPACK for BLAS floats and calling your code for general types. This will make it a nightmare to write generic code. And I'm not the only one in this situation; it will also break Hypatia and Clarabel, for example.

If your concern is avoiding type piracy, why don't you instead contribute the generic code to LinearAlgebra? I've recently done that for syrk/herk and they accepted it: JuliaLang/LinearAlgebra.jl#1249

@andreasnoack
Member Author

Please don't do this. My package Ket relies heavily on eigen and friends dispatching to LAPACK for BLASFloats and calling your code for general types. This will make it a nightmare to write generic code.

I don't think it will be a nightmare. Packages that intend to work for a broad range of element types should just call GenericLinearAlgebra.svdvals instead of LinearAlgebra.svdvals (see the small example after the list below). Right now, I see two possible limitations:

  1. If using this package for e.g. svdvals, you'd not use LAPACK even for the element types it supports. I'm not sure how important that is. The actual bi- and tridiagonal solvers in this package are pretty fast. There are places where this package will be noticeably slower. One is the general eigenvalue problem, where LAPACK uses large shifts (not without issues), which make better use of BLAS3. The other case is MKL's reductions to bi- and tridiagonal form, which are really, really fast, see Improve reductions to bi- and tridiagonal matrices #66. However, if you want your package to work for generic types, do you then also care about maximal performance for BLAS floats? Maybe you do, but my current assumption is that it is less likely.
  2. I'd have to reimplement some linear algebra functions that rely on the generic implementations. I already reimplemented cond, and there will be others, but it should be fairly easy.
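
As a small example of the opt-in usage (assuming the unexported GenericLinearAlgebra.svdvals discussed in this PR; the variable names are just for illustration):

using LinearAlgebra
import GenericLinearAlgebra

# Explicitly opt in by qualifying the call instead of relying on piracy.
A = big.(randn(4, 4))                   # BigFloat matrix, no LAPACK support
vals = GenericLinearAlgebra.svdvals(A)  # generic implementation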

There are two main reasons why I have prepared this PR.

  1. The piracy just causes all kinds of downstream issues, see Fix compat with GenericSchur and GenericSVD #71. There was also an issue where compilation of DiffEq methods became much slower when loading GenericLinearAlgebra because there was a call to norm or cond, which then required the compiler to analyze the svdvals implementation here for ForwardDiff.Dual elements.
  2. This package breaks whenever LinearAlgebra.jl makes changes to eigen and svd. It has happened a few times and will probably happen again. It is possible to patch up these breakages here but it is just an indication that 1. is a real issue.

why don't you contribute instead the generic code to LinearAlgebra?

I don't have an issue with contributing code to LinearAlgebra.jl. I've contributed quite a bit, actually. However, a while back when I started working on generic implementations of the linear algebra functions, we decided to limit the scope of the generic implementations in LinearAlgebra.jl to the linear solvers and delegate the spectral solvers to external modules (mostly but not only this package). Now that LinearAlgebra.jl has moved completely out of the main Julia repo, maybe the decision could be revisited. However, I think it might be nicer to opt in to the generic implementations here in the more transparent way of this PR.

As mentioned in the beginning, I don't think it should be too hard to just qualify the calls to svdvals etc. with GenericLinearAlgebra, but if it turns out that I'm wrong, then an option could also be that we introduce a GenericLinearAlgebraPirates package that simply overloads the definitions in LinearAlgebra with the implementations of this package.
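
For concreteness, such a hypothetical GenericLinearAlgebraPirates package might look roughly like the sketch below. The module name, the single method, and its signature are illustrative assumptions only; the actual piracy being removed covers more methods than this.

module GenericLinearAlgebraPirates

using LinearAlgebra
import GenericLinearAlgebra

# Re-commit the piracy this PR removes: overload LinearAlgebra.svdvals for
# strided real matrices with the generic implementation from this package.
# (A real version would need to be careful not to shadow the LAPACK paths for
# BLAS element types; this one-liner only illustrates the shape.)
LinearAlgebra.svdvals(A::StridedMatrix{<:Real}) = GenericLinearAlgebra.svdvals(A)

end # module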

@araujoms
Contributor

I just published a paper where I needed both generic types and maximal performance for BLAS floats. The point is that when I'm using BLAS floats I want to be faster than any other implementation, and I also need the capability of solving the problem with higher precision. There it's fine to be slower, because there's pretty much nobody who can do it. The situation is similar with the conic solvers I mentioned. The papers that introduce them both benchmark their performance against other solvers, and I don't think they would accept losing performance on BLAS floats for no reason.

The solution then wouldn't be to call GenericLinearAlgebra.svd instead of LinearAlgebra.svd, but to create a wrapper that dispatches to the proper implementation depending on the element type. It wouldn't be sensible to ask all 29 dependents of GenericLinearAlgebra to do so, though.
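
A minimal sketch of that kind of wrapper, assuming the unexported GenericLinearAlgebra.eigen! used in the benchmarks below stays available (myeigen! is a made-up name, not part of either package):

using LinearAlgebra
import GenericLinearAlgebra

# BLAS/LAPACK element types keep going through LinearAlgebra/LAPACK ...
myeigen!(A::Hermitian{<:LinearAlgebra.BlasFloat}) = LinearAlgebra.eigen!(A)
# ... while all other element types go to the generic implementation.
myeigen!(A::Hermitian) = GenericLinearAlgebra.eigen!(A)

Every dependent package would have to carry a shim like this, which is the duplication being objected to here.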

Sure, GenericLinearAlgebraPirates would be a workable solution, although inelegant. If piracy is so pervasive and convenient I think this means that the methods really just belong in LinearAlgebra itself. I urge you to reconsider the decision about splitting. Now that LinearAlgebra is excised from the base language I don't think making it slightly larger is a problem.

Do you mind if I ask people on Slack for their opinion?

@andreasnoack
Member Author

I just browsed the paper but it wasn't clear to me

  1. Which part of GenericLinearAlgebra is used (is it the svd you mentioned, and where is it being called?)
  2. Where the time is spent. Did you profile the run? Is a good chunk spent in a linear algebra routine?

Would it be easy for you to make a run based on this PR? It would be an interesting data point.

Do you mind if I ask people on Slack for their opinion?

I don't. It is fine if more people chime in. I can't promise that it will change the end result, but I'm open to input.

@araujoms
Contributor

  1. The source code of the paper is here. It doesn't use svd, it uses eigen. Specifically here.
  2. I did profile it, most of the time is spent doing matrix multiplications, but the time spent in eigen is significant.

I'm sorry, but I'm not going to benchmark it again; it is a pain in the ass. Furthermore, I don't think there's a point: I don't think generic code can ever be competitive with LAPACK optimized for specific architectures.

@andreasnoack
Member Author

andreasnoack commented Apr 20, 2025

How large, roughly, are the matrices passed to eigen?

I don't think generic code can ever be competitive with LAPACK optimized for specific architectures.

LAPACK isn't optimized for specific architectures beyond some blocking sizes that are rarely tuned. It is actually a central point of LAPACK that it doesn't have to be tuned per architecture. The architecture-specific optimizations are delegated to BLAS(3).

As mentioned, the implementations here are comparable to the ones from LAPACK wrt performance, e.g.

QL algorithm from GenericLinearAlgebra

julia> @btime GenericLinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  64.398 μs (3 allocations: 78.61 KiB)

QR/QL from LAPACK

julia> @btime GenericLinearAlgebra.LAPACK2.steqr!('V', S.dv, S.ev, Matrix(1.0*I, size(S)...)) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  105.929 μs (7 allocations: 79.89 KiB)

LinearAlgebra's default, Inderjit Dhillon's algorithm, which is fast for large matrices but has some overhead for smaller matrices

julia> @btime LinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  355.125 μs (34 allocations: 104.19 KiB)

This is just for the symmetric tridiagonal eigensolver. For full Symmetric/Hermitian matrices, the reduction to tridiagonal form will need a bit of work to make proper use of BLAS3, but that shouldn't be too hard.

@araujoms
Contributor

The size of the matrices varies a lot according to the protocol, but 100 x 100 is a good one to benchmark.

I got very different results on my machine, though:

julia> @btime GenericLinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  444.443 μs (14 allocations: 159.20 KiB)

julia> @btime GenericLinearAlgebra.LAPACK2.steqr!('V', S.dv, S.ev, Matrix(1.0*I, size(S)...)) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  792.057 μs (7 allocations: 79.90 KiB)

julia> @btime LinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  328.312 μs (34 allocations: 104.09 KiB)

In any case, what matters to me is the complex Hermitian case:

julia> @btime GenericLinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.Hermitian(randn(ComplexF64, 100, 100)));
  2.423 ms (28 allocations: 323.31 KiB)

julia> @btime LinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.Hermitian(randn(ComplexF64, 100, 100)));
  1.359 ms (23 allocations: 394.52 KiB)

(I did install the an/nopirates branch. This is Julia 1.11.5 on an Intel chip, if it makes any difference).

@andreasnoack
Member Author

I got very different results on my machine, though:

When preparing the previous comment, I noticed two annoying sources of overhead, which are fixed in #150 and #151. It would be great if you could rerun the timings with the latest master and let me know if your results are more in line with the ones I shared.

In any case, what matters to me is the complex Hermitian case:

I just profiled this one and the majority of the time is spent in

Base.Array(Q::EigenQ) = lmul!(Q, Matrix{eltype(Q)}(I, size(Q, 1), size(Q, 1)))
and
function symtriUpper!(
    AS::StridedMatrix{T},
    τ = zeros(T, size(AS, 1) - 1),
    u = Vector{T}(undef, size(AS, 1)),
) where {T}
    n = LinearAlgebra.checksquare(AS)
    # We ignore any non-real components of the diagonal
    @inbounds for i = 1:n
        AS[i, i] = real(AS[i, i])
    end
    @inbounds for k = 1:(n-2+!(T <: Real))
        # This is a bit convoluted method to get the conjugated vector but conjugation is required for correctness of arrays of quaternions. Eventually, it should be sufficient to write vec(x') but it currently (July 10, 2018) hits a bug in LinearAlgebra
        τk = LinearAlgebra.reflector!(vec(transpose(view(AS, k, (k+1):n)')))
        τ[k] = τk'
        for j = (k+1):n
            tmp = AS[k+1, j]
            for i = (k+2):j
                tmp += AS[k, i] * AS[i, j]
            end
            for i = (j+1):n
                tmp += AS[k, i] * AS[j, i]'
            end
            u[j] = τk' * tmp
        end
        vcAv = u[k+1]
        for i = (k+2):n
            vcAv += u[i] * AS[k, i]'
        end
        ξ = real(vcAv * τk)
        for j = (k+1):n
            ujt = u[j]
            hjt = j > (k + 1) ? AS[k, j] : one(ujt)
            ξhjt = ξ * hjt
            for i = (k+1):(j-1)
                hit = i > (k + 1) ? AS[k, i] : one(ujt)
                AS[i, j] -= hit' * ujt + u[i]' * hjt - hit' * ξhjt
            end
            AS[j, j] -= 2 * real(hjt' * ujt) - abs2(hjt) * ξ
        end
    end
    SymmetricTridiagonalFactorization(
        EigenQ(
            'U',
            AS,
            τ,
        ),
        SymTridiagonal(real(diag(AS)), real(diag(AS, 1))),
    )
end
Those methods would have to be improved by using a blocked approach. I can at least promise that I won't move forward with this PR until that work has been done.
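
To make concrete what a blocked approach would mean here: instead of applying each Householder reflector as its own rank-1/rank-2 update, a panel of b reflectors can be accumulated in the compact WY form I - V*T*V' and applied with a couple of matrix-matrix (BLAS3) products. The sketch below only illustrates that principle; apply_block_reflectors! is a made-up name, and this is not the implementation planned for the PR.

using LinearAlgebra

function apply_block_reflectors!(C::AbstractMatrix, V::AbstractMatrix, τ::AbstractVector)
    # Build the triangular factor T of the compact WY representation
    # H_1*H_2*...*H_b = I - V*T*V' (essentially the recurrence of LAPACK's larft).
    b = length(τ)
    T = zeros(eltype(C), b, b)
    for j = 1:b
        T[j, j] = τ[j]
        T[1:j-1, j] = -τ[j] * (T[1:j-1, 1:j-1] * (V[:, 1:j-1]' * V[:, j]))
    end
    # Apply all b reflectors at once: C ← (I - V*T*V')*C, i.e. two matrix-matrix
    # products instead of b separate rank-1 updates.
    C .-= V * (T * (V' * C))
    return C
end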

@araujoms
Contributor

With the latest master:

julia> @btime GenericLinearAlgebra._eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  328.754 μs (7 allocations: 80.02 KiB)

julia> @btime LinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  329.976 μs (34 allocations: 104.09 KiB)

Clearly we're not running the same code, since even the number of allocations is different.

Those methods would have to be improved by using a blocked approach. I can at least promise that I won't move forward with this PR until that work has been done.

That would be great. If the performance is similar I'd much rather use your code; I'll be a happier man the day I don't need to deal with syevr again.

@araujoms araujoms mentioned this pull request Apr 26, 2025