sparse matrix * sparse vector is slow when the vector is relatively dense #58
Hello Augustin! Out of curiosity, what is your application? The extreme aspect ratio matrix with which you benchmark (…). First, a less extreme, square case:

```julia
julia> using BenchmarkTools

julia> A = sprand(10^3, 10^3, 10.0^(-2)); # square, about ten entries per row/column

julia> x = sprand(10^3, 10.0^(-2)); # about ten entries

julia> @benchmark $A*$x
BenchmarkTools.Trial:
  samples:          10000
  evals/sample:     9
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  23.84 kb
  allocs estimate:  4
  minimum time:     2.37 μs (0.00% GC)
  median time:      3.29 μs (0.00% GC)
  mean time:        6.84 μs (31.31% GC)
  maximum time:     369.23 μs (96.68% GC)

julia> @benchmark my_mul($A, $x)
BenchmarkTools.Trial:
  samples:          10000
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  15.38 kb
  allocs estimate:  25
  minimum time:     21.09 μs (0.00% GC)
  median time:      21.94 μs (0.00% GC)
  mean time:        23.96 μs (2.99% GC)
  maximum time:     1.84 ms (96.69% GC)
```

To check whether the existing code has room for (simple) improvement, I sketched a set of naive gather-scatter sparse-sparse matvec methods from scratch (code below, not performance tuned, and could probably avoid the …):

```julia
julia> @benchmark spmatspvec($A, $x, spmatspvec_alloc($A, $x))
BenchmarkTools.Trial:
  samples:          10000
  evals/sample:     9
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  10.41 kb
  allocs estimate:  4
  minimum time:     2.18 μs (0.00% GC)
  median time:      2.72 μs (0.00% GC)
  mean time:        4.11 μs (19.71% GC)
  maximum time:     302.21 μs (97.62% GC)
```

The story is similar for the case you benchmarked, though with a far more modest allocation reduction (the scatter-gather workspace allocation dominates):

```julia
julia> @benchmark $A*$x
BenchmarkTools.Trial:
  samples:          11
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  778.20 mb
  allocs estimate:  7
  minimum time:     421.87 ms (2.97% GC)
  median time:      466.40 ms (11.56% GC)
  mean time:        470.39 ms (12.19% GC)
  maximum time:     551.85 ms (24.54% GC)

julia> @benchmark my_mul($A, $x)
BenchmarkTools.Trial:
  samples:          89
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  21.23 mb
  allocs estimate:  52
  minimum time:     52.13 ms (0.00% GC)
  median time:      56.15 ms (3.34% GC)
  mean time:        56.29 ms (3.29% GC)
  maximum time:     62.77 ms (3.10% GC)

julia> @benchmark spmatspvec($A, $x, spmatspvec_alloc($A, $x))
BenchmarkTools.Trial:
  samples:          11
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  767.42 mb
  allocs estimate:  7
  minimum time:     428.40 ms (3.12% GC)
  median time:      460.44 ms (11.58% GC)
  mean time:        466.54 ms (12.11% GC)
  maximum time:     554.29 ms (23.98% GC)
```

But with the scatter-gather workspace `w` preallocated:

```julia
julia> @benchmark spmatspvec($A, $x, $w)
BenchmarkTools.Trial:
  samples:          47
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  4.48 mb
  allocs estimate:  5
  minimum time:     101.96 ms (0.00% GC)
  median time:      106.70 ms (0.00% GC)
  mean time:        107.08 ms (0.35% GC)
  maximum time:     113.97 ms (0.00% GC)
```

Much closer to the specialized `my_mul`. The code:

```julia
_checkshapecompat(A::SparseMatrixCSC, x::SparseVector) =
    size(A, 2) == length(x) || throw(DimensionMismatch(string("the number of ",
        "columns of the sparse matrix must match the sparse vector's length")))
_checkshapecompat(A::SparseMatrixCSC, sgw::Vector) =
    size(A, 1) <= length(sgw) || throw(DimensionMismatch(string("the scatter-gather ",
        "workspace vector's length must be at least the number of rows of the sparse matrix")))

"""
Given sparse matrix `A` and sparse vector `x` that you would like to multiply (or
surrogates with the shape of `A` and `x`), checks shape compatibility of `A` and `x`
and allocates an appropriately sized scatter-gather workspace.
"""
spmatspvec_alloc(A::SparseMatrixCSC, x::SparseVector) =
    (_checkshapecompat(A, x); zeros(promote_type(eltype(A), eltype(x)), size(A, 1)))

"""
Given sparse matrix `A` and sparse vector `x` that you would like to multiply (or
surrogates with the shape of `A` and `x`), and also a scatter-gather workspace `sgw`,
checks shape compatibility of `A`, `x`, and `sgw`, and then zeros `sgw`.
"""
function spmatspvec_init!(A::SparseMatrixCSC, x::SparseVector, sgw::Vector)
    _checkshapecompat(A, x)
    _checkshapecompat(A, sgw)
    fill!(sgw, 0)
end

"""
Returns `A*x` using scatter-gather workspace `sgw`. Assumes shape compatibility of
`A`, `x`, and `sgw`. Also assumes `sgw` begins zeroed, and returns with `sgw` zeroed.
"""
function spmatspvec(A::SparseMatrixCSC, x::SparseVector, sgw::Vector)
    # Calculate the result's entries, scattering into sgw
    resnnz::Int = 0
    @inbounds for xk in 1:nnz(x)
        xv = x.nzval[xk]
        iszero(xv) && continue
        xi = x.nzind[xk]
        # for Ak in nzrange(A, xi)
        for Ak in A.colptr[xi]:(A.colptr[xi + 1] - 1)
            Ai = A.rowval[Ak]
            Av = A.nzval[Ak]
            sgwv = sgw[Ai]
            iszero(sgwv) && (resnnz += 1)
            sgw[Ai] = sgwv + Av * xv
        end
    end
    # Gather the result's entries, zeroing sgw
    resnzind = Vector{eltype(x.nzind)}(resnnz)
    resnzval = Vector{eltype(sgw)}(resnnz)
    resk::Int = 1
    # @inbounds for (sgwi, sgwv) in enumerate(sgw)
    @inbounds for sgwi in 1:length(sgw)
        sgwv = sgw[sgwi]
        if !iszero(sgwv)
            resnzind[resk] = sgwi
            resnzval[resk] = sgwv
            resk += 1
            sgw[sgwi] = 0
        end
    end
    return SparseVector(size(A, 1), resnzind, resnzval)
end

using Base.Test
let N = 100, M = 101, p = 0.1
    A = sprand(N, M, p)
    x = sprand(M, p)
    # test allocation
    @test_throws DimensionMismatch spmatspvec_alloc(spzeros(N, M), spzeros(N))
    @test isa(spmatspvec_alloc(spzeros(Float64, N, M), spzeros(Float32, M)), Vector{Float64})
    @test length(spmatspvec_alloc(spzeros(N, M), spzeros(M))) == N
    @test iszero(spmatspvec_alloc(A, x))
    # test initialization
    w = spmatspvec_alloc(A, x)
    @test_throws DimensionMismatch spmatspvec_init!(A, spzeros(N), w)
    @test_throws DimensionMismatch spmatspvec_init!(A, x, zeros(1))
    @test (rand!(w); spmatspvec_init!(A, x, w); iszero(w))
    # test kernel
    @test spmatspvec(A, x, w) == A*x
    @test iszero(w)
end
```
Hi, thanks for your response. My application is pricing/managing/reporting reinsurance contracts. We essentially have a simulation with a very large number of events which can potentially take losses. My matrix has a row for each event and a column for each contract. In practice a relatively small number of events actually take losses. Happy to talk more, but perhaps this is not the right forum. The actual length of the sparse vector has not been an issue up to now, because all the operations I have used scaled with the nonzeros; I don't actually store the full vector anywhere. Unfortunately, `A*x` allocates memory to store the full vector, which cannot fit in memory. I can see the merit of the scatter-gather method, but it doesn't help in my 'pathological' case. As a matter of principle, sparse operations should scale with nnz and never have to allocate the full vector/matrix, which something like `my_mul` does. Pragmatically this might not always be possible or necessary, especially if there are no use cases which would benefit. I am happy to have this issue closed. Thanks again, Gus
Couldn't this operation use a polyalgorithm that looks at the actual size/shape to decide the best approach? As long as the result type is predictable, that should be fine.
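One shape-based polyalgorithm could, for instance, switch between a dense accumulator (cheap when `size(A, 1)` is modest) and a `Dict`-based accumulator (memory proportional to the rows actually touched when the result is hypersparse). A rough sketch on Julia 1.x, not the library's actual implementation; the kernel names and the `dense_threshold` cutoff are made up for illustration:

```julia
using SparseArrays

# Dense-accumulator kernel: O(size(A, 1)) workspace, like the stdlib approach.
function mul_dense_acc(A::SparseMatrixCSC, x::SparseVector)
    w = zeros(promote_type(eltype(A), eltype(x)), size(A, 1))
    rv, nzv = rowvals(A), nonzeros(A)
    for (xi, xv) in zip(x.nzind, x.nzval)
        for k in nzrange(A, xi)
            w[rv[k]] += nzv[k] * xv
        end
    end
    return sparse(w)  # prune zeros back down to a SparseVector
end

# Dict-accumulator kernel: memory scales with the number of rows touched.
function mul_dict_acc(A::SparseMatrixCSC, x::SparseVector)
    T = promote_type(eltype(A), eltype(x))
    acc = Dict{Int,T}()
    rv, nzv = rowvals(A), nonzeros(A)
    for (xi, xv) in zip(x.nzind, x.nzval)
        for k in nzrange(A, xi)
            acc[rv[k]] = get(acc, rv[k], zero(T)) + nzv[k] * xv
        end
    end
    is = sort!(collect(keys(acc)))
    return sparsevec(is, T[acc[i] for i in is], size(A, 1))
end

# Polyalgorithm: pick the kernel from the input shape; the cutoff is illustrative.
poly_mul(A::SparseMatrixCSC, x::SparseVector; dense_threshold = 2^22) =
    size(A, 1) <= dense_threshold ? mul_dense_acc(A, x) : mul_dict_acc(A, x)
```

Both kernels compute the same result, so the dispatch choice only affects time/memory, which is what makes the result type predictable.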
Chewed a bit more on this, and here is a better solution for my specific use case: I pre-calculate a condensed matrix … .
A common approach to efficient sparse-sparse matvecs in cases like yours: for each nonzero value in `x`, … . The approach you take above to exploiting the additional information that …:

```julia
julia> using BenchmarkTools

julia> m, n, p = 10^8, 10^3, 10.0^(-5);

julia> A = sprand(m, n, p);

julia> x = sprand(n, 0.3);

julia> Ac, nzrows, Am = pre_mul(A);

julia> @benchmark y = c_mul($Ac, $nzrows, $Am, $x)
BenchmarkTools.Trial:
  samples:          381
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  15.19 mb
  allocs estimate:  7
  minimum time:     10.77 ms (0.00% GC)
  median time:      13.07 ms (13.23% GC)
  mean time:        13.14 ms (12.22% GC)
  maximum time:     17.13 ms (17.60% GC)

julia> aux = spmatspvec_init(A, promote_type(eltype(A), eltype(x)));

julia> @benchmark y = spmatspvec($A, $x, $aux)
BenchmarkTools.Trial:
  samples:          581
  evals/sample:     1
  time tolerance:   5.00%
  memory tolerance: 1.00%
  memory estimate:  4.30 mb
  allocs estimate:  5
  minimum time:     7.05 ms (0.00% GC)
  median time:      8.10 ms (0.00% GC)
  mean time:        8.61 ms (5.09% GC)
  maximum time:     11.94 ms (23.38% GC)
```

You could avoid the remaining scatter-gather workspace scan by using e.g. a singly linked list to track which entries you populate in the scatter-gather workspace. Chances are that approach would only reduce runtime for significantly less populated … . The code:

```julia
using Base.SparseArrays: indtype

struct SpSpMatvecAux{Tv,Ti<:Integer}
    condrowval::Vector{Ti}
    nonzerorows::Vector{Ti}
    sgw::Vector{Tv}
end

function spmatspvec_init{reseltype}(A::SparseMatrixCSC, ::Type{reseltype})
    TiA = indtype(A)
    nzrows = sort!(unique(A.rowval[1:nnz(A)]))
    indexmap = Dict{TiA,TiA}(zip(nzrows, 1:length(nzrows)))
    condrowval = TiA[ indexmap[A.rowval[Ak]] for Ak in 1:nnz(A) ]
    sgworkspace = zeros(reseltype, length(nzrows))
    return SpSpMatvecAux(condrowval, nzrows, sgworkspace)
end

"""
Returns `A*x` using auxiliary `aux`. Assumes shape compatibility of
`A` and `x`, and assumes `aux` was generated for `A` and `x` and not
modified outside this routine.
"""
function spmatspvec(A::SparseMatrixCSC, x::SparseVector, aux::SpSpMatvecAux)
    # Calculate the result's entries, scattering into aux.sgw
    resnnz::Int = 0
    @inbounds for xk in 1:nnz(x)
        xv = x.nzval[xk]
        iszero(xv) && continue
        xi = x.nzind[xk]
        # for Ak in nzrange(A, xi)
        for Ak in A.colptr[xi]:(A.colptr[xi + 1] - 1)
            Ai = aux.condrowval[Ak]
            Av = A.nzval[Ak]
            sgwv = aux.sgw[Ai]
            iszero(sgwv) && (resnnz += 1)
            aux.sgw[Ai] = sgwv + Av * xv
        end
    end
    # Gather the result's entries, zeroing aux.sgw, and mapping back to actual row indices
    resnzind = Vector{indtype(x)}(resnnz)
    resnzval = Vector{eltype(aux.sgw)}(resnnz)
    resk::Int = 1
    sgwi, sgwimax = 1, length(aux.sgw)
    @inbounds while sgwi <= sgwimax && resk <= resnnz
        sgwi += 1
        sgwv = aux.sgw[sgwi]
        if !iszero(sgwv)
            resnzind[resk] = aux.nonzerorows[sgwi]
            resnzval[resk] = sgwv
            resk += 1
            aux.sgw[sgwi] = 0
        end
    end
    return SparseVector(size(A, 1), resnzind, resnzval)
end
```

Best!
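As a concrete variant of the "track which entries you populate" idea, here is a sketch that records touched workspace slots in a plain index stack (rather than a linked list), so the gather pass visits only those slots instead of scanning the whole workspace. Written against Julia 1.x stdlib `SparseArrays`, not performance tuned, and not the code from the comment above:

```julia
using SparseArrays

function spmatspvec_tracked(A::SparseMatrixCSC, x::SparseVector)
    T = promote_type(eltype(A), eltype(x))
    sgw = zeros(T, size(A, 1))
    touched = Int[]  # workspace slots written to, in first-touch order
    rv, nzv = rowvals(A), nonzeros(A)
    for (xi, xv) in zip(x.nzind, x.nzval)
        for k in nzrange(A, xi)
            i = rv[k]
            # A slot whose partial sum cancels back to zero may be pushed
            # twice; the sort!/unique! below makes that harmless.
            iszero(sgw[i]) && push!(touched, i)
            sgw[i] += nzv[k] * xv
        end
    end
    sort!(touched)
    unique!(touched)
    resnzind = Int[]
    resnzval = T[]
    for i in touched
        v = sgw[i]
        iszero(v) || (push!(resnzind, i); push!(resnzval, v))
        sgw[i] = 0
    end
    return SparseVector(size(A, 1), resnzind, resnzval)
end
```

The sort over the touched slots is what restores ascending index order, so this pays off only when far fewer slots are touched than `size(A, 1)`.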
Hi @Sacha0, I think it should be: …

Good catch! Looks like either: (1) …
Hi @Sacha0, … gives … .
Also a good catch! There are a number of ways you can fix that issue without adding another sweep (…):

```julia
aux.sgw[Ai] = newval = sgwv + Av * xv
if iszero(sgwv) && !iszero(newval)
    resnnz += 1
elseif !iszero(sgwv) && iszero(newval)
    resnnz -= 1
end
```

should do the trick. The second approach should be more efficient than the first in some cases. Best!
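To make the miscount concrete: with the original `iszero(sgwv) && (resnnz += 1)` counting, a `+1` and a `-1` landing in the same row leave the count at 1 even though the accumulated value is zero. A small self-contained check of the corrected counting (a sketch over stdlib `SparseArrays`, not the thread's exact code):

```julia
using SparseArrays

# Count the result's structural nonzeros with the corrected update, which
# decrements the count when an accumulated entry cancels back to exactly zero.
function count_result_nnz(A::SparseMatrixCSC, x::SparseVector)
    sgw = zeros(promote_type(eltype(A), eltype(x)), size(A, 1))
    resnnz = 0
    rv, nzv = rowvals(A), nonzeros(A)
    for (xi, xv) in zip(x.nzind, x.nzval)
        for k in nzrange(A, xi)
            i = rv[k]
            oldval = sgw[i]
            sgw[i] = newval = oldval + nzv[k] * xv
            if iszero(oldval) && !iszero(newval)
                resnnz += 1
            elseif !iszero(oldval) && iszero(newval)
                resnnz -= 1
            end
        end
    end
    return resnnz
end
```

On `A = sparse([1, 1], [1, 2], [1.0, -1.0], 1, 2)` and `x = sparsevec([1, 2], [1.0, 1.0], 2)` the corrected count is 0, where the original counting would report 1.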
Was this actually resolved? I got bitten by this today (on 0.6), and ended up hacking my own together (https://gist.github.com/simonbyrne/43a4041c76e18f79837b06dec459b820).

No official fix.

Okay, in that case I think we should reopen it. Sounds like some sort of polyalgorithm is the way forward, as @StefanKarpinski suggested.

Would it make sense to have a "HyperSparse" matrix type (probably in a package) for situations like this?
I’m not sure what you mean: care to expand?
Hypersparse typically means sparse in both rows and columns, i.e. efficient even in the case when nnz << min(nrows, ncols). There are a few data structures that are typically used for this kind of thing. @ViralBShah would know more about it. |
Ah, thanks. Well, that didn't apply in my case: I had nrows = ncols = O(1e6), and nnz = O(1e8).
The sparse data structure we use (CSC) is a dense vector of sparse columns. It is good for linear algebra type stuff and the traditional finite element style problems, but not good for hypersparse problems that have less than 1 nonzero per row/column on average (if you use CSC for these, you are just spinning your wheels, iterating through empty columns 90% of the time). You can do significantly better, as discussed here. I think auto-detecting all this is a bit too magical. Perhaps a …
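For reference, one standard structure for this regime is doubly compressed CSC (DCSC), which also compresses away the empty columns so iteration costs O(#nonempty columns + nnz) rather than O(n + nnz). A minimal illustrative conversion on Julia 1.x; the `DCSC` name and fields are made up for this sketch, not an existing package API:

```julia
using SparseArrays

# Doubly compressed CSC: store only the nonempty columns.
struct DCSC{Tv,Ti<:Integer}
    m::Int
    n::Int
    cols::Vector{Ti}    # indices of nonempty columns, ascending
    colptr::Vector{Ti}  # ranges into rowval/nzval; length == length(cols) + 1
    rowval::Vector{Ti}
    nzval::Vector{Tv}
end

function DCSC(A::SparseMatrixCSC{Tv,Ti}) where {Tv,Ti}
    cols = Ti[j for j in 1:size(A, 2) if !isempty(nzrange(A, j))]
    colptr = Ti[1]
    rowval = Ti[]
    nzval = Tv[]
    rv, nzv = rowvals(A), nonzeros(A)
    for j in cols
        append!(rowval, rv[nzrange(A, j)])
        append!(nzval, nzv[nzrange(A, j)])
        push!(colptr, Ti(length(rowval) + 1))
    end
    return DCSC{Tv,Ti}(size(A, 1), size(A, 2), cols, colptr, rowval, nzval)
end
```

A matvec over this structure would loop over `cols` directly, never visiting the empty columns that make plain CSC wasteful in the hypersparse case.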
Ah, seeing @simonbyrne's comment above, maybe we have general inefficiencies in here. His case is certainly not hypersparse.
Since we have a sparse vector now, we should stop converting this one to dense in the sparse-sparse matvec.
I have tried many things to address this issue and must conclude that the current solution (writing the results of the multiplication into a dense vector, then pruning it) is probably the best general-purpose solution. The problem is that the number of nonzero values in the result of a multiplication … . The suggested solutions work well for certain cases where …:

```julia
A = sprandn(10_000, 10_000, 0.01)
v = sprand(10_000, 0.05)
```

Another idea was to adopt the algorithm for sparse matrix times sparse matrix multiplication. A quick benchmark (using the … above) gives: …

So it consumes slightly less memory than the current solution, but it is also slower... If someone knows an algorithm that is both fast and does not blow up the memory usage for general cases, I'm happy to try it!
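One way to read the "adopt the sparse matrix-matrix algorithm" idea is a two-pass (symbolic, then numeric) kernel: a first sweep computes only the result's sparsity pattern with a cheap `Bool` workspace, and a second sweep accumulates values directly into an exactly sized buffer, locating each entry by binary search in the sorted pattern. A sketch on Julia 1.x (an assumption-laden illustration, not the benchmarked code; note it still needs an O(m) `Bool` scratch, just not one of the value eltype):

```julia
using SparseArrays

# Two-pass sparse matvec: symbolic pass marks rows receiving contributions,
# numeric pass fills an exactly sized value buffer.
function two_pass_mul(A::SparseMatrixCSC, x::SparseVector)
    m = size(A, 1)
    rv, nzv = rowvals(A), nonzeros(A)
    # Symbolic pass: mark every row that receives a contribution.
    seen = falses(m)
    for xi in x.nzind, k in nzrange(A, xi)
        seen[rv[k]] = true
    end
    resnzind = findall(seen)
    # Numeric pass: accumulate in place, binary-searching the sorted pattern.
    resnzval = zeros(promote_type(eltype(A), eltype(x)), length(resnzind))
    for (xi, xv) in zip(x.nzind, x.nzval)
        for k in nzrange(A, xi)
            p = searchsortedfirst(resnzind, rv[k])
            resnzval[p] += nzv[k] * xv
        end
    end
    return SparseVector(m, resnzind, resnzval)
end
```

Like pattern-based sparse matrix-matrix multiplication, this can store explicit zeros when contributions cancel, and the binary search adds a log factor to the numeric pass, which is consistent with the "slightly less memory, but slower" observation above.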
Hi,

The current implementation seems to produce a dense result vector which is then converted to sparse before returning. I ran into memory scaling issues. Perhaps something along the lines of `my_mul` below would be worth having?