FlashX tools

Interact with SAFS

utils/SAFS-util is a command-line tool shipped with SAFS for managing SAFS files. It provides the following commands:

create file_name size: create an SAFS file with the specified size
delete file_name: delete an SAFS file
help: print this help info
list: list existing files in SAFS
load safs_file [linux_file]: load data into the SAFS file safs_file. linux_file is a file in the Linux filesystem. If linux_file is provided, its data is loaded into safs_file. Otherwise, safs_file is filled with zeros.
verify safs_file [linux_file]: verify the data in the SAFS file safs_file. If linux_file is provided, the tool verifies the data in the SAFS file against linux_file. Otherwise, the tool checks whether safs_file is filled with zeros.
export safs_file linux_file: export an SAFS file to the Linux filesystem
info file_name: show the information of an SAFS file
rename file_name new_name: rename an SAFS file
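
The two modes of verify can be illustrated on ordinary local files. The sketch below is not how SAFS-util is implemented and does not touch SAFS; the function name verify and the chunk size are hypothetical, and it assumes that "matching" means the two files are byte-for-byte identical, which the description above does not spell out.

```python
CHUNK = 1 << 20  # read 1 MiB at a time (arbitrary choice for this sketch)

def verify(data_path, reference_path=None):
    """Return True if data_path equals reference_path byte for byte,
    or, when no reference is given, if data_path contains only zero bytes."""
    with open(data_path, "rb") as data:
        if reference_path is None:
            # No reference file: every byte must be zero.
            while chunk := data.read(CHUNK):
                if chunk.count(0) != len(chunk):
                    return False
            return True
        with open(reference_path, "rb") as ref:
            # Compare both files chunk by chunk until both reach EOF.
            while True:
                a, b = data.read(CHUNK), ref.read(CHUNK)
                if a != b:
                    return False
                if not a:
                    return True
```

Calling verify("copy.bin", "original.bin") mirrors the form with linux_file, and verify("copy.bin") mirrors the form without it.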

Construct FlashGraph graphs

el2fg takes an edge list file in text format and converts it to the FlashGraph format, which stores adjacency lists in binary form. The edge list file(s) can be gzip-compressed; a compressed file must have the filename extension .gz to be recognized as such. el2fg generates two binary files: a graph file that contains the adjacency lists of the graph and an index file that contains the locations of vertices in the graph file. By default, el2fg keeps all intermediate data in memory.
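
To make the relationship between the graph file and the index file concrete, here is a minimal in-memory sketch. It does not reproduce the FlashGraph binary layout; the function names read_edges and build_adjacency are hypothetical, and it assumes whitespace-separated vertex IDs, lines starting with # as comments, and vertex IDs in the range 0..num_vertices-1.

```python
import gzip

def read_edges(path):
    """Yield (src, dst) pairs from a text edge list.
    Files whose name ends in .gz are decompressed on the fly."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):  # assumed comment syntax
                continue
            src, dst = line.split()[:2]
            yield int(src), int(dst)

def build_adjacency(edges, num_vertices):
    """Return (adj, index). adj holds all neighbor lists back to back;
    index[v] is the offset in adj where the list of vertex v starts."""
    neighbors = [[] for _ in range(num_vertices)]
    for src, dst in edges:
        neighbors[src].append(dst)
    adj, index = [], []
    for nbrs in neighbors:
        index.append(len(adj))
        adj.extend(sorted(nbrs))
    index.append(len(adj))  # sentinel marking the end of the last list
    return adj, index

adj, index = build_adjacency([(0, 1), (0, 2), (2, 0)], num_vertices=3)
# Neighbors of vertex v are adj[index[v]:index[v + 1]].
```

Here adj plays the role of the graph file and index the role of the index file: looking up a vertex means reading its offset from the index and then reading its neighbor list from that position.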

matrix/utils/el2fg [options] conf_file edge_file graph_name
-u: undirected graph
-U: unique edges
-e: use external memory
-s size: sort buffer size
-g size: groupby buffer size
-t type: the edge attribute type

el2fg takes a FlashX configuration file, a text edge list file and a graph name, and outputs two files: graph_name.adj and graph_name.index.

-u: To construct an undirected graph, users have to explicitly enable this flag.
-U: enable this flag to remove redundant edges in a graph.
-e: enable this flag to use disks to construct a graph. It requires users to configure SAFS correctly.
-t type: specify the edge attribute type. When this option is given, the input edge list must have three columns, with the third column providing the edge attribute. Four attribute types are currently supported: "I" for 32-bit integers, "L" for 64-bit integers, "F" for single-precision floating-point values, and "D" for double-precision floating-point values.
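
The four attribute type codes can be pictured as a mapping from a code to a value parser and a fixed-width encoding. The sketch below is only an illustration of three-column input, not el2fg's parser: the names ATTR_TYPES, parse_attr_edge and attr_size are hypothetical, and it assumes whitespace-separated columns, signed integers, and little-endian encoding, none of which is stated above.

```python
import struct

# Type code -> (text parser, struct format with a fixed size).
# Signedness and byte order are assumptions made for this sketch.
ATTR_TYPES = {
    "I": (int, "<i"),    # 32-bit integer
    "L": (int, "<q"),    # 64-bit integer
    "F": (float, "<f"),  # single-precision floating point
    "D": (float, "<d"),  # double-precision floating point
}

def parse_attr_edge(line, type_code):
    """Parse one 'src dst attr' line using the given attribute type code."""
    src, dst, attr = line.split()
    parse, _ = ATTR_TYPES[type_code]
    return int(src), int(dst), parse(attr)

def attr_size(type_code):
    """Return the number of bytes one attribute of this type occupies."""
    return struct.calcsize(ATTR_TYPES[type_code][1])

edge = parse_attr_edge("3 7 0.5", "F")
```

With these definitions, an "I" or "F" attribute takes 4 bytes per edge and an "L" or "D" attribute takes 8, which is worth keeping in mind when sizing buffers for large graphs.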

An example:

To convert a directed graph in the edge list format to the FlashGraph format, we can run the following command:

flash-graph/tools/el2al -w wiki-Vote.adj wiki-Vote.index wiki-Vote.txt

This converts the edge list in wiki-Vote.txt to the FlashGraph format and generates two files: wiki-Vote.adj-v4 and wiki-Vote.index-v4. wiki-Vote.adj-v4 contains the graph data and wiki-Vote.index-v4 contains the index to the graph data.

To convert an undirected graph in the edge list format to the FlashGraph format, we run:

flash-graph/tools/el2al -w -u facebook.adj facebook.index facebook_combined.txt

When converting a large graph, el2al uses a lot of memory if it keeps all intermediate data in memory, and the conversion takes a long time. To address this, el2al can keep intermediate data on disks and run in parallel. When the -d flag is used, el2al uses stxxl to store intermediate data on disks. Note: users may need to create a .stxxl file in the current directory to configure the stxxl library when running el2al with the -d flag. The stxxl documentation describes how to configure stxxl. The -T flag specifies the number of threads used for graph format conversion.

For example, the following command converts a Twitter graph with around 60 million vertices from the text edge list format to the adjacency list format, out of core and in parallel:

el2al -w -T 32 -d twitter.adj twitter.index twitter_rv.net.gz
