See [...]
CUDA (tested on 12.4), cub (part of CCCL) (included in the CUDA Toolkit), a c++ compiler.
To compile the code use (and match your target architecture in the Makefile
, the default is sm_86
):
$ make
This creates one executable named tc
which includes both the preprocessing and the GPU code.
To clean use:
$ make clean
To run the code:
$ ./tc -m <Matrix Market graph> -s <spread> -a <adjacency matrix length>
or:
$ ./tc -e <edge list graph> -s <spread> -a <adjacency matrix length>
Run ./tc
for more usage information.
$ ./tc -m Amazon0302.mtx -s 5 -a 8192 -l 10
gives the following output:
$ graph n m s a triangles prepro (s) GPU copy (s) GPU exec (s) GPU total (s) CPU+GPU (s)
$ Amazon0302.mtx 262111 899792 5 8192 717719 0.056998 0.002199 0.000407 0.002606 0.003735
$ Amazon0302.mtx 262111 899792 5 8192 717719 0.056998 0.002165 0.000406 0.002570 0.004149
$ Amazon0302.mtx 262111 899792 5 8192 717719 0.056998 0.002393 0.000438 0.002831 0.004080
$ ...
The experiments folder contains all implementations of all additional algorithms used in the paper. This includes:
- implementations for different edge-retrieval strategies.
- an implementation for a wedge-parallel arrow-wedge-style algorithm.
- implementations for the vertex- and edge-parallel algorithms (with outgoing, arrow, and mixed wedge styles).
- implementations for WeTriC without optimizations, WeTriC with reordering, and WeTriC with reordering and spreading (note: for WeTriC with reordering, spreading, and cooperation use the main implementation with
-a
set to0
). - implementations for versions of most of the above algorithms able to handle very large graphs (m > 2^32), see the
tc_big_graphs...
files.
[...]