Skip to content

Graph: Computation and Data Types

Lifeng Nai edited this page Mar 7, 2015 · 2 revisions

Real-world graph computing contains a broad scope of use cases, from cognitive analytics to data exploration. The wide range of use cases introduce diverse features of graph computing, which are far more comprehensive than simple graph traversals. To understand the diversity and uniqueness of graph computing, we analyze the use cases from IBM System G and summarize the graph computing behaviors by computation types and data sources.

Computation type is the key factor when classifying the numerous existing graph applications. Despite the variance of implementation details, graph computing applications can be classified into a few computation types according to their computation behaviors. As shown in Table 1, we summarize the applications into three categories according to their computation targets: graph structures, graph properties, and dynamic graphs. They have different features in terms of read/write/numeric intensity. Computation on the graph structure performs graph traversal mostly. It incorporates a large number of memory accesses and limited numeric operations. Their irregular memory access pattern leads to extremely poor spatial locality. On the contrary, the computation on graphs with rich properties introduces lots of numeric computations on properties, which leads to hybrid workload behaviors. For computation on dynamic graphs, it also shows an irregular access pattern. However, the dynamic updates of graph structures lead to high write intensity and dynamic memory footprint.

Graph Computation Type (CompType) Example
1 Computation on graph structures (CompStruct) BFS traversal
2 Computation on graphs with rich properties (CompProp) Belief propagation
3 Computation on dynamic graphs (CompDyn) Streaming graph
Table 1. Graph Computation Type Summary

Computation Types Figure 1. Illustration of Computation Types

Graph data also has crucial impact because of the data-centric nature of graph computing. The diversity of graph data sources will eventually lead to diverse behaviors even for the same graph applications. To understand the graph data diversity, we summarize graph data into four sources as shown in Table 2. Each data type contains different topological features, which can bring various impact on different graph applications. The social network represents the interactions between individuals/organizations. It has high degree assortativity, small shortest path lengths, and large connected components. On the contrary, an information network is a structure, in which the dominant interaction is the dissemination of information along edges. It usually shows large vertex degrees, and large two-hop neighbourhoods. The nature network is used for learning and interacting naturally with people to extend both sides. Examples include deep belief network (DBN) and biological network. They typically incorporate structured topologies and rich properties addressing different targets. Man-made technology networks are formed by specific man-made technologies. A typical example is a road network, which usually maintains small vertex degrees and a regular topology.

Graph Data Type Example
1 Social(/economic/political) network Twitter graph
2 Information(/knowledge) network Knowledge graph
3 Nature(/bio/cognitive) network Gene network
4 Man-made technology network Road network
Table 2. Graph Data Type Summary
Clone this wiki locally