
Stop pirating eigen and svd functions from LinearAlgebra #147


Open: andreasnoack wants to merge 2 commits into master from an/nopirates

Conversation

andreasnoack
Member

Instead, define separate unexported functions in the GenericLinearAlgebra
module. Hence, users would have to explicitly opt in to using
GenericLinearAlgebra. A consequence is that the implementations
in this module will be used even for the element types supported by
LAPACK.

Also, delete unused functionality.

Finally, also run the formatter.


codecov bot commented Apr 18, 2025

Codecov Report

Attention: Patch coverage is 92.96875% with 9 lines in your changes missing coverage. Please review.

Project coverage is 91.83%. Comparing base (55a38e6) to head (e037d77).

Files with missing lines   Patch %   Lines
src/eigenGeneral.jl        76.92%    3 Missing ⚠️
src/eigenSelfAdjoint.jl    96.42%    3 Missing ⚠️
src/householder.jl         62.50%    3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #147      +/-   ##
==========================================
- Coverage   91.94%   91.83%   -0.11%     
==========================================
  Files          10        9       -1     
  Lines        1577     1556      -21     
==========================================
- Hits         1450     1429      -21     
  Misses        127      127              


@andreasnoack andreasnoack force-pushed the an/nopirates branch 2 times, most recently from 6ccaab3 to 98ca476 Compare April 18, 2025 16:16
@araujoms
Contributor

Please don't do this. My package Ket relies heavily on eigen and friends dispatching to LAPACK for BLAS floats and calling your code for general types. This will make it a nightmare to write generic code. And I'm not the only one in this situation; it will also break Hypatia and Clarabel, for example.

If your concern is avoiding type piracy, why don't you instead contribute the generic code to LinearAlgebra? I've recently done that for syrk/herk and they accepted it: JuliaLang/LinearAlgebra.jl#1249

@andreasnoack
Member Author

Please don't do this. My package Ket relies heavily on eigen and friends dispatching to LAPACK for BLASFloats and calling your code for general types. This will make it a nightmare to write generic code.

I don't think it will be a nightmare. Packages that intend to work for a broad range of element types should just call GenericLinearAlgebra.svdvals instead of LinearAlgebra.svdvals (see the small example after the list below). Right now, I see two possible limitations:

  1. If using this package for e.g. svdvals, you'd not use LAPACK even for the element types it supports. I'm not sure how important that is. The actual bi- and tridiagonal solvers in this package are pretty fast. There are places where this package will be noticeably slower. One is the general eigenvalue problem, where LAPACK uses large shifts (not without issues), which make better use of BLAS3. The other case is MKL's reductions to bi- and tridiagonal form, which are really, really fast, see Improve reductions to bi- and tridiagonal matrices #66. However, if you want your package to work for generic types, do you then also care about maximal performance for BLAS floats? Maybe you do, but my current assumption is that it is less likely.
  2. I'd have to reimplement some linear algebra functions that rely on the generic implementations. I already reimplemented cond, and there will be others, but it should be fairly easy.
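
As a small example of the opt-in usage (assuming the unexported GenericLinearAlgebra.svdvals discussed in this PR; the variable names are just for illustration):

using LinearAlgebra
import GenericLinearAlgebra

# Explicitly opt in by qualifying the call instead of relying on piracy.
A = big.(randn(4, 4))                   # BigFloat matrix, no LAPACK support
vals = GenericLinearAlgebra.svdvals(A)  # generic implementation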

There are two main reasons why I have prepared this PR.

  1. The piracy just causes all kinds of downstream issues, see Fix compat with GenericSchur and GenericSVD #71. There was also an issue where compilation of DiffEq methods became much slower when loading GenericLinearAlgebra because there was a call to norm or cond, which then required the compiler to analyze the svdvals implementation here for ForwardDiff.Dual elements.
  2. This package breaks whenever LinearAlgebra.jl makes changes to eigen and svd. It has happened a few times and will probably happen again. It is possible to patch up these breakages here but it is just an indication that 1. is a real issue.

why don't you contribute instead the generic code to LinearAlgebra?

I don't have an issue with contributing code to LinearAlgebra.jl. I've contributed quite a bit, actually. However, a while back when I started working on generic implementations of the linear algebra functions, we decided to limit the scope of the generic implementations in LinearAlgebra.jl to the linear solvers and delegate the spectral solvers to external modules (mostly but not only this package). Now that LinearAlgebra.jl has moved completely out of the main Julia repo, maybe the decision could be revisited. However, I think it might be nicer to opt in to the generic implementations here in the more transparent way of this PR.

As mentioned in the beginning, I don't think it should be too hard to just qualify the calls to svdvals etc. with GenericLinearAlgebra, but if it turns out that I'm wrong, then an option could also be that we introduce a GenericLinearAlgebraPirates package that simply overloads the definitions in LinearAlgebra with the implementations of this package.
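
For concreteness, such a hypothetical GenericLinearAlgebraPirates package might look roughly like the sketch below. The module name, the single method, and its signature are illustrative assumptions only; the actual piracy being removed covers more methods than this.

module GenericLinearAlgebraPirates

using LinearAlgebra
import GenericLinearAlgebra

# Re-commit the piracy this PR removes: overload LinearAlgebra.svdvals for
# strided real matrices with the generic implementation from this package.
# (A real version would need to be careful not to shadow the LAPACK paths for
# BLAS element types; this one-liner only illustrates the shape.)
LinearAlgebra.svdvals(A::StridedMatrix{<:Real}) = GenericLinearAlgebra.svdvals(A)

end # module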

@araujoms
Contributor

I just published a paper where I needed both generic types and maximal performance for BLAS floats. The point is that when I'm using BLAS floats I want to be faster than any other implementation, and I also need the capability of solving the problem with higher precision. There it's fine to be slower, because there's pretty much nobody who can do it. The situation is similar with the conic solvers I mentioned. The papers that introduce them both benchmark their performance against other solvers, and I don't think they would accept losing performance on BLAS floats for no reason.

The solution then wouldn't be to call GenericLinearAlgebra.svd instead of LinearAlgebra.svd, but to create a wrapper that dispatches to the proper implementation depending on the element type. It wouldn't be sensible to ask all 29 dependents of GenericLinearAlgebra to do so, though.
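
A minimal sketch of that kind of wrapper, assuming the unexported GenericLinearAlgebra.eigen! used in the benchmarks below stays available (myeigen! is a made-up name, not part of either package):

using LinearAlgebra
import GenericLinearAlgebra

# BLAS/LAPACK element types keep going through LinearAlgebra/LAPACK ...
myeigen!(A::Hermitian{<:LinearAlgebra.BlasFloat}) = LinearAlgebra.eigen!(A)
# ... while all other element types go to the generic implementation.
myeigen!(A::Hermitian) = GenericLinearAlgebra.eigen!(A)

Every dependent package would have to carry a shim like this, which is the duplication being objected to here.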

Sure, GenericLinearAlgebraPirates would be a workable solution, although inelegant. If piracy is so pervasive and convenient I think this means that the methods really just belong in LinearAlgebra itself. I urge you to reconsider the decision about splitting. Now that LinearAlgebra is excised from the base language I don't think making it slightly larger is a problem.

Do you mind if I ask people on Slack for their opinion?

@andreasnoack
Member Author

I just browsed the paper but it wasn't clear to me

  1. Which part of GenericLinearAlgebra is used (is it the svd you mentioned, and where is it being called?)
  2. Where the time is spent. Did you profile the run? Is a good chunk spent in a linear algebra routine?

Would it be easy for you to make a run based on this PR? It would be an interesting data point.

Do you mind if I ask people on Slack for their opinion?

I don't. It is fine if more people chime in. I can't promise that it will change the end result, but I'm open to input.

@araujoms
Contributor

  1. The source code of the paper is here. It doesn't use svd, it uses eigen. Specifically here.
  2. I did profile it, most of the time is spent doing matrix multiplications, but the time spent in eigen is significant.

I'm sorry, but I'm not going to benchmark it again; it is a pain in the ass. Furthermore, I don't think there's a point: I don't think generic code can ever be competitive with LAPACK optimized for specific architectures.

@andreasnoack
Member Author

andreasnoack commented Apr 20, 2025

How large, roughly, are the matrices passed to eigen?

I don't think generic code can ever be competitive with LAPACK optimized for specific architectures.

LAPACK isn't optimized for specific architectures beyond some blocking sizes that are rarely tuned. It is actually a central point of LAPACK that it doesn't have to be tuned per architecture. The architecture-specific optimizations are delegated to BLAS(3).

As mentioned, the implementations here are comparable to the ones from LAPACK wrt performance, e.g.

QL algorithm from GenericLinearAlgebra

julia> @btime GenericLinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  64.398 μs (3 allocations: 78.61 KiB)

QR/QL from LAPACK

julia> @btime GenericLinearAlgebra.LAPACK2.steqr!('V', S.dv, S.ev, Matrix(1.0*I, size(S)...)) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  105.929 μs (7 allocations: 79.89 KiB)

LinearAlgebra's default, Inderjit Dhillon's algorithm, which is fast for large matrices but has some overhead for smaller matrices

julia> @btime LinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  355.125 μs (34 allocations: 104.19 KiB)

This is just for the symmetric tridiagonal eigensolver. For full Symmetric/Hermitian matrices, the reduction to tridiagonal form will need a bit of work to make proper use of BLAS3, but that shouldn't be too hard.

@araujoms
Contributor

The size of the matrices varies a lot according to the protocol, but 100 x 100 is a good one to benchmark.

I got very different results on my machine, though:

julia> @btime GenericLinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  444.443 μs (14 allocations: 159.20 KiB)

julia> @btime GenericLinearAlgebra.LAPACK2.steqr!('V', S.dv, S.ev, Matrix(1.0*I, size(S)...)) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  792.057 μs (7 allocations: 79.90 KiB)

julia> @btime LinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  328.312 μs (34 allocations: 104.09 KiB)

In any case, what matters to me is the complex Hermitian case:

julia> @btime GenericLinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.Hermitian(randn(ComplexF64, 100, 100)));
  2.423 ms (28 allocations: 323.31 KiB)

julia> @btime LinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.Hermitian(randn(ComplexF64, 100, 100)));
  1.359 ms (23 allocations: 394.52 KiB)

(I did install the an/nopirates branch. This is Julia 1.11.5 on an Intel chip, if it makes any difference).

@andreasnoack
Member Author

I got very different results on my machine, though:

When preparing the previous comment, I noticed two annoying sources of overhead, which are fixed in #150 and #151. It would be great if you could rerun the timings with the latest master and let me know if your results are more in line with the ones I shared.

In any case, what matters to me is the complex Hermitian case:

I just profiled this one and the majority of the time is spent in

Base.Array(Q::EigenQ) = lmul!(Q, Matrix{eltype(Q)}(I, size(Q, 1), size(Q, 1)))
and
function symtriUpper!(
    AS::StridedMatrix{T},
    τ = zeros(T, size(AS, 1) - 1),
    u = Vector{T}(undef, size(AS, 1)),
) where {T}
    n = LinearAlgebra.checksquare(AS)
    # We ignore any non-real components of the diagonal
    @inbounds for i = 1:n
        AS[i, i] = real(AS[i, i])
    end
    @inbounds for k = 1:(n-2+!(T <: Real))
        # This is a bit convoluted method to get the conjugated vector but conjugation is required for correctness of arrays of quaternions. Eventually, it should be sufficient to write vec(x') but it currently (July 10, 2018) hits a bug in LinearAlgebra
        τk = LinearAlgebra.reflector!(vec(transpose(view(AS, k, (k+1):n)')))
        τ[k] = τk'
        for j = (k+1):n
            tmp = AS[k+1, j]
            for i = (k+2):j
                tmp += AS[k, i] * AS[i, j]
            end
            for i = (j+1):n
                tmp += AS[k, i] * AS[j, i]'
            end
            u[j] = τk' * tmp
        end
        vcAv = u[k+1]
        for i = (k+2):n
            vcAv += u[i] * AS[k, i]'
        end
        ξ = real(vcAv * τk)
        for j = (k+1):n
            ujt = u[j]
            hjt = j > (k + 1) ? AS[k, j] : one(ujt)
            ξhjt = ξ * hjt
            for i = (k+1):(j-1)
                hit = i > (k + 1) ? AS[k, i] : one(ujt)
                AS[i, j] -= hit' * ujt + u[i]' * hjt - hit' * ξhjt
            end
            AS[j, j] -= 2 * real(hjt' * ujt) - abs2(hjt) * ξ
        end
    end
    SymmetricTridiagonalFactorization(
        EigenQ(
            'U',
            AS,
            τ,
        ),
        SymTridiagonal(real(diag(AS)), real(diag(AS, 1))),
    )
end
Those methods would have to be improved by using a blocked approach. I can at least promise that I won't move forward with this PR until that work has been done.
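
To make concrete what a blocked approach would mean here: instead of applying each Householder reflector as its own rank-1/rank-2 update, a panel of b reflectors can be accumulated in the compact WY form I - V*T*V' and applied with a couple of matrix-matrix (BLAS3) products. The sketch below only illustrates that principle; apply_block_reflectors! is a made-up name, and this is not the implementation planned for the PR.

using LinearAlgebra

function apply_block_reflectors!(C::AbstractMatrix, V::AbstractMatrix, τ::AbstractVector)
    # Build the triangular factor T of the compact WY representation
    # H_1*H_2*...*H_b = I - V*T*V' (essentially the recurrence of LAPACK's larft).
    b = length(τ)
    T = zeros(eltype(C), b, b)
    for j = 1:b
        T[j, j] = τ[j]
        T[1:j-1, j] = -τ[j] * (T[1:j-1, 1:j-1] * (V[:, 1:j-1]' * V[:, j]))
    end
    # Apply all b reflectors at once: C ← (I - V*T*V')*C, i.e. two matrix-matrix
    # products instead of b separate rank-1 updates.
    C .-= V * (T * (V' * C))
    return C
end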

@araujoms
Contributor

With the latest master:

julia> @btime GenericLinearAlgebra._eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  328.754 μs (7 allocations: 80.02 KiB)

julia> @btime LinearAlgebra.eigen!(S) setup = (S = LinearAlgebra.SymTridiagonal(ones(100), 0.2*ones(99)));
  329.976 μs (34 allocations: 104.09 KiB)

Clearly we're not running the same code, since even the number of allocations is different.

Those methods would have to be improved by using a blocked approach. I can at least promise that I won't move forward with this PR until that work has been done.

That would be great. If the performance is similar I'd much rather use your code; I'll be a happier man the day I don't need to deal with syevr again.

@araujoms araujoms mentioned this pull request Apr 26, 2025