You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Jotting down some thoughts for future work... — With the introduction of bitwise operations in the upcoming GraphBLAS release, I was thinking about whether BFS could be sped up using bitwise operations. Here's a sketch of the idea:
We can use a compressed representation of the adjacency matrix A: instead of a boolean n × n matrix, use an UINT64 n/8 x n/8 matrix (bit matrix) with each UINT64 value treated as an 8 x 8 submatrix. In order to meaningfully compress the matrix, it makes sense to reorder its vertices according to their degree so we have values in the bit matrix which correspond to multiple edges.
The frontier/seen (q/v) vectors in the BFS would be of length n/8.
The difficult part is computing the multiplication of matrix elements, i.e. A[i, j] * frontier[j]. During this, we need to treat the UINT64 values as 8 x 8 matrices and multiply them as such (we probably need to transpose the second operand as well).
Based on some quick research, this bit matrix multiplication operation is called BMM.n or BMMT.n (whether or not the second operand is transposed). Cray machines had hardware support up to n = 64 [Hilewitz et al., 2008]. For regular consumer/server CPUs, the "Four Russians method" can be adapted as discussed in [Albretch et al., 2010]. The algorithm described in this paper is available as the m4ri library.
Jotting down some thoughts for future work... — With the introduction of bitwise operations in the upcoming GraphBLAS release, I was thinking about whether BFS could be sped up using bitwise operations. Here's a sketch of the idea:
We can use a compressed representation of the adjacency matrix A: instead of a boolean
n × n
matrix, use an UINT64n/8 x n/8
matrix (bit matrix) with each UINT64 value treated as an8 x 8
submatrix. In order to meaningfully compress the matrix, it makes sense to reorder its vertices according to their degree so we have values in the bit matrix which correspond to multiple edges.The frontier/seen (q/v) vectors in the BFS would be of length
n/8
.The difficult part is computing the multiplication of matrix elements, i.e.
A[i, j] * frontier[j]
. During this, we need to treat the UINT64 values as8 x 8
matrices and multiply them as such (we probably need to transpose the second operand as well).Based on some quick research, this bit matrix multiplication operation is called
BMM.n
orBMMT.n
(whether or not the second operand is transposed). Cray machines had hardware support up to n = 64 [Hilewitz et al., 2008]. For regular consumer/server CPUs, the "Four Russians method" can be adapted as discussed in [Albretch et al., 2010]. The algorithm described in this paper is available as them4ri
library.m4ri
library on BitBucketThe text was updated successfully, but these errors were encountered: