mutual
The common neighbor algorithm counts the common neighbors of two nodes in the network and can optionally output the list of those neighbors.
Parameter Name | Description | Comments |
---|---|---|
--thread | Number of threads | |
--input_edges | Path to the input data. | Supports HDFS. Input is in CSV format and describes an undirected graph; gzip-compressed input is supported. |
--output | Path to the output data. | Supports HDFS. Output is written in CSV format with gzip compression. |
--ouput_list | Whether to output the common neighbor list | Default is false: only the common neighbor count is output, in the format 'src,dst,common_cnt'. When true, the common neighbor list is output, in the format 'src,dst,item1,item2,item3...'. |
--common | Whether to calculate common neighbors of heterogeneous nodes | Default is false: common neighbors of homogeneous nodes are calculated using only input_edges, and parameters such as input_vertices / separator / vdata_bits are ignored. When true, common neighbors of heterogeneous nodes are calculated using the lists provided by input_vertices, so input_vertices must be non-empty. |
--input_vertices | Path to the nodes' neighbor lists | Effective when common is true. The format is 'user,item1:item2:item3:...'. A user may appear more than once; its items are appended. |
--separator | Separator used in the nodes' neighbor lists | Effective when common is true. Default is ':'. If the value is '/', the input format for the input_vertices path is 'user,item1/item2/item3/...'. |
--vdata_bits | Bit width of the input vertex state data | Effective when common is true. Allowed values: 16/32/64. Choose the smallest width that does not overflow. |
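The 'user,item1:item2:item3:...' format and the append behaviour for repeated users can be illustrated with a small Python sketch. This is not Plato's implementation: `parse_vertex_lists` is a hypothetical helper, and treating users and items as integers is an assumption made for the example.

```python
from collections import defaultdict


def parse_vertex_lists(lines, separator=":"):
    """Parse 'user,item1<sep>item2<sep>...' lines into {user: [items]}.

    Hypothetical helper for illustration only. Items of a user that
    appears on several lines are appended to the same list.
    """
    items_by_user = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user, _, packed = line.partition(",")
        # Ignore empty tokens, e.g. from a trailing separator.
        items = [int(tok) for tok in packed.split(separator) if tok]
        items_by_user[int(user)].extend(items)
    return dict(items_by_user)


# Synthetic data: user 1 appears twice, so its items are appended.
sample = ["1,10:20:30", "2,20:30:40", "1,50"]
lists = parse_vertex_lists(sample)

# Common items of users 1 and 2 (the heterogeneous case).
common = sorted(set(lists[1]) & set(lists[2]))
print(lists[1])  # [10, 20, 30, 50]
print(common)    # [20, 30]
```

Passing `separator="/"` parses lines of the form 'user,item1/item2/item3/...' in the same way.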
Input files should be formatted as follows:
<src>,<dst>
where <src> and <dst> are integers of type uint32_t, representing the end nodes of an edge.
Note that Plato treats every input graph as undirected by default. For a directed graph, please ensure both <A, B> and <B, A> appear in the input file if they exist. Edges that appear more than once will be considered as multiple edges between the same pair of nodes.
Input example (the following numbers are synthetic and for demonstration purposes only):
4564,823192
...
1996,973033
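To make the homogeneous case concrete, here is a minimal Python sketch that reads '<src>,<dst>' lines as an undirected graph and counts the common neighbors of each input edge. It is not Plato's implementation: `common_neighbor_counts` is a hypothetical name, and the sketch keeps neighbors in sets, so repeated edges collapse into one; how Plato itself handles multiple edges while counting is not modelled here.

```python
from collections import defaultdict


def common_neighbor_counts(edge_lines):
    """Return (src, dst, common_cnt) for every '<src>,<dst>' input line.

    Hypothetical helper for illustration only. The graph is treated as
    undirected: each edge adds both endpoints to each other's set.
    """
    neighbors = defaultdict(set)
    edges = []
    for line in edge_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        src_text, dst_text = line.split(",")
        src, dst = int(src_text), int(dst_text)
        neighbors[src].add(dst)
        neighbors[dst].add(src)
        edges.append((src, dst))
    # Common neighbors of an edge are the intersection of both endpoint sets.
    return [(s, d, len(neighbors[s] & neighbors[d])) for s, d in edges]


# Synthetic graph with two triangles: 1-2-3 and 2-3-4.
result = common_neighbor_counts(["1,2", "1,3", "2,3", "2,4", "3,4"])
for src, dst, cnt in result:
    print(f"{src},{dst},{cnt}")
# 1,2,1
# 1,3,1
# 2,3,2
# 2,4,1
# 3,4,1
```

Edge 2,3 has two common neighbors (nodes 1 and 4) because it is shared by both triangles.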
Output files are formatted as follows:
<src>,<dst>,<common_cnt>
where <src>,<dst> represents an edge and <common_cnt> is the number of common neighbors of <src> and <dst>.
Output example (the following numbers are synthetic and for demonstration purposes only):
4564,823192,12
...
1996,973033,3
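The two output layouts described in the parameter table ('src,dst,common_cnt' and 'src,dst,item1,item2,item3...') can be sketched as follows. `format_row` and its `as_list` argument are hypothetical names chosen for this example, not part of Plato, and the gzip round trip uses an in-memory buffer purely for demonstration.

```python
import gzip
import io


def format_row(src, dst, common, as_list=False):
    """Format one output line for an edge and its common neighbors.

    Hypothetical helper for illustration only. With as_list=False the
    row ends with the count; with as_list=True it ends with the
    neighbors themselves.
    """
    if as_list:
        fields = [src, dst, *common]
    else:
        fields = [src, dst, len(common)]
    return ",".join(str(field) for field in fields)


count_row = format_row(4564, 823192, [7, 9, 11])
list_row = format_row(4564, 823192, [7, 9, 11], as_list=True)
print(count_row)  # 4564,823192,3
print(list_row)   # 4564,823192,7,9,11

# Write the rows as gzip-compressed CSV and read them back.
buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as gz:
    gz.write((count_row + "\n" + list_row + "\n").encode("utf-8"))
restored = gzip.decompress(buffer.getvalue()).decode("utf-8").splitlines()
```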
https://github.com/Tencent/plato/blob/master/example/mutual.cc