Authors: GEMS Lab Team @ University of Michigan (Mark Jin, Ruowang Zhang, Mark Heimann)
The SEMB library makes it quick to compute and evaluate structural node embeddings. With a unified API and a modular codebase, it enables easy integration of third-party methods and datasets.
The library already includes a set of popular methods and datasets ready for immediate use.
- Built-in methods: node2vec, struc2vec, GraphWave, xNetMF, role2vec, DRNE, MultiLENS, RiWalk, SEGK (more methods to be added in the near future)
- Built-in datasets:

  | Dataset | # Nodes | # Edges |
  | --- | --- | --- |
  | BlogCatalog | 10,312 | 333,983 |
  | Facebook | 4,039 | 88,234 |
  | ICEWS | 1,255 | 1,414 |
  | PPI | 56,944 | 818,786 |
  | BR air-traffic | 131 | 1,038 |
  | EU air-traffic | 399 | 5,995 |
  | US air-traffic | 1,190 | 13,599 |
  | DD6 | 4,152 | 20,640 |
  | Synthetic Datasets | | |

The library works best with Python 3.6.2. On Python 3.8, the TensorFlow 1.14.0 dependency used by DRNE might fail to install.
Make sure you are using Python 3.6+ for all of the steps below!
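As a quick sanity check before installing, the version requirement can be tested from Python itself. This is a minimal sketch, not part of SEMB: the helper name `check_python_version` is hypothetical, and the only facts it encodes are the ones stated above (3.6+ is required, and TensorFlow 1.14.0 may not install on 3.8).

```python
import sys
import warnings


def check_python_version(version_info=None):
    """Return True if the given (or running) interpreter is Python 3.6+.

    Hypothetical helper for illustration; it is not shipped with SEMB.
    """
    # Default to the running interpreter; a tuple can be passed for testing.
    major_minor = tuple((version_info or sys.version_info)[:2])
    if major_minor < (3, 6):
        return False
    if major_minor >= (3, 8):
        # TensorFlow 1.14.0 (used by DRNE) might not install on Python 3.8.
        warnings.warn("Python 3.8+: TensorFlow 1.14.0 for DRNE might fail to install.")
    return True


print("Python version OK:", check_python_version())
```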
- First, create a virtual environment and activate it using conda:

      conda create -n "<VENV_NAME>" python=3.6.2 ipython
      conda activate <VENV_NAME>

- Change directory to `StrucEmbeddingLibrary` and install the dependencies:

      (<VENV_NAME>) cd StrucEmbeddingLibrary
      (<VENV_NAME>) python3 -m pip install -r requirements.txt --no-cache-dir

- Install the `SEMB` package:

      (<VENV_NAME>) cd StrucEmbeddingLibrary
      (<VENV_NAME>) python3 setup.py install

After installation, we highly recommend going through our Tutorial to see how the SEMB library works.
- To enable the Jupyter notebook, do the following:

      (<VENV_NAME>) python3 -m pip install ipykernel --no-cache-dir
      (<VENV_NAME>) python3 -m ipykernel install --name=<VENV_NAME>
      (<VENV_NAME>) jupyter notebook

Choose `<VENV_NAME>` at the top right corner of the page when creating a new Jupyter notebook or running the tutorial notebook.
First make sure the `semb` library is installed.
Currently, SEMB only supports embedding and evaluation on undirected and unweighted graphs.
- Create a Python 3.6+ package with a name in the form of `semb/datasets/[$YOUR_CHOSEN_DATASET_ID]`.
- Within the package root directory, make sure `__init__.py` is present.
- Create a `dataset.py` and make a `Dataset` class that inherits from `BaseDataset` (`from semb.datasets import BaseDataset`) and implements the required methods. See `semb/datasets/airports/dataset.py` for more details.
  - To use the built-in `load_dataset()` method, we accept the graph edgelist with the following format: `<Node1_id (int)> <Blank> <Node2_id (int)> <\n>`
    - Otherwise, you can override and implement your own `load_dataset()` function. Please make sure that the returned graph is of the `networkx.classes.graph.Graph` datatype.
  - If the dataset is accompanied by a label file, to use the built-in `load_label()` function, we accept the label file with the following format: `<Node_id (int)> <delimiter> <Node_label (int)>`
    - Otherwise, you can override and implement your own `load_label()` function. Please make sure that the returned type is a Python built-in `dict()` with `<Node_id (int)>` as the key and `<Node_label (int)>` as the value.
- Install the package via `setup.py` or pip.
- Now the dataset is loadable by the main client program that uses `semb`!
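
The two file formats described above can be illustrated with a small, dependency-free sketch. It is not SEMB's own loader: the function names `read_edgelist` and `read_labels`, the file names, and the default space delimiter for labels are assumptions made for this example. Collapsing `(1, 2)` and `(2, 1)` into one edge is also a choice made here, reflecting that only undirected, unweighted graphs are supported.

```python
import os
import tempfile


def read_edgelist(path):
    """Parse '<Node1_id> <Node2_id>' lines into a sorted list of undirected edges.

    Hypothetical helper, not SEMB's built-in load_dataset().
    """
    edges = set()
    with open(path) as handle:
        for line in handle:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            u, v = int(parts[0]), int(parts[1])
            # Canonical ordering so that (1, 2) and (2, 1) count as one edge.
            edges.add((min(u, v), max(u, v)))
    return sorted(edges)


def read_labels(path, delimiter=" "):
    """Parse '<Node_id><delimiter><Node_label>' lines into a dict of int -> int.

    The default delimiter is an assumption; the source only says '<delimiter>'.
    """
    labels = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            node_id, node_label = line.split(delimiter)
            labels[int(node_id)] = int(node_label)
    return labels


# Write two tiny example files and read them back.
with tempfile.TemporaryDirectory() as tmp:
    edge_path = os.path.join(tmp, "toy.edgelist")
    label_path = os.path.join(tmp, "toy.labels")
    with open(edge_path, "w") as handle:
        handle.write("0 1\n1 2\n2 1\n")
    with open(label_path, "w") as handle:
        handle.write("0 0\n1 1\n2 1\n")
    edges = read_edgelist(edge_path)
    labels = read_labels(label_path)

print(edges)   # [(0, 1), (1, 2)]
print(labels)  # {0: 0, 1: 1, 2: 1}
```

In a custom `load_dataset()`, an edge list like this could be turned into the required graph type with `networkx.Graph()` followed by `add_edges_from(edges)`.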
- Create a Python 3.6+ package with a name in the form of `semb/methods/[$YOUR_CHOSEN_METHOD_ID]`.
- Within the package root directory, make sure `__init__.py` is present.
- Create a `method.py` and make a `Method` class that inherits from `BaseMethod` (`from semb.methods import BaseMethod`) and implements the required methods. See `semb/methods/node2vec/method.py` for more details.
  - Please make sure that your implemented method accepts `networkx.classes.graph.Graph` as input.
  - Please make sure that after `train()` is called, `self.embeddings` is a Python built-in `dict()` with `<Node_id (int)>` as the key and the embedding, a `<List (float)>`, as the value.
- Install the package via `setup.py` or pip.
- Now the method is loadable by the main client program that uses `semb`!
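
The shape that `self.embeddings` is expected to have after `train()` can be shown with a toy example. This sketch is an assumption-laden illustration, not one of the built-in methods and not SEMB's API: `degree_embeddings` and `check_embeddings` are hypothetical names, the two-dimensional "degree and mean neighbour degree" vector is a stand-in for a real structural embedding, and a plain edge list is used instead of a `networkx` graph to keep the example free of dependencies.

```python
from collections import defaultdict


def degree_embeddings(edges):
    """Toy embedding: [degree, mean neighbour degree] for every node, as floats."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    embeddings = {}
    for node, nbrs in neighbours.items():
        degree = float(len(nbrs))
        mean_nbr_degree = sum(len(neighbours[n]) for n in nbrs) / len(nbrs)
        # Key is an int node id, value is a list of floats.
        embeddings[int(node)] = [degree, float(mean_nbr_degree)]
    return embeddings


def check_embeddings(embeddings):
    """Raise TypeError unless embeddings is a dict of int -> list of float."""
    if not isinstance(embeddings, dict):
        raise TypeError("embeddings must be a built-in dict")
    for node_id, vector in embeddings.items():
        if not isinstance(node_id, int):
            raise TypeError("node id %r is not an int" % (node_id,))
        if not isinstance(vector, list) or not all(isinstance(x, float) for x in vector):
            raise TypeError("embedding of node %r is not a list of floats" % (node_id,))
    return True


# A star graph: node 1 is connected to nodes 0, 2 and 3.
embeddings = degree_embeddings([(0, 1), (1, 2), (1, 3)])
print(embeddings[1])                 # [3.0, 1.0]
print(check_embeddings(embeddings))  # True
```

A check like `check_embeddings(self.embeddings)` at the end of a custom `train()` is one way to catch a format mismatch, such as NumPy arrays in place of lists, before evaluation.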
For both `dataset` and `method` extensions, make sure that `get_id()` is overridden and returns the same id as the one you chose in your package name.
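
The id rule can be checked mechanically. The sketch below is illustrative only: `id_matches_package` is a hypothetical helper and `ToyDataset` is a stand-in class that does not inherit from SEMB's real base classes; only the `get_id()` method name and the package path layout come from the text above.

```python
import os


def id_matches_package(extension, package_dir):
    """Return True if extension.get_id() equals the last component of package_dir."""
    package_name = os.path.basename(os.path.normpath(package_dir))
    return extension.get_id() == package_name


class ToyDataset:
    """Stand-in for a Dataset class; a real one would inherit from BaseDataset."""

    def get_id(self):
        return "toy_dataset"


print(id_matches_package(ToyDataset(), "semb/datasets/toy_dataset"))
print(id_matches_package(ToyDataset(), "semb/datasets/other_name"))
```

The first call prints `True` because the id and the directory name agree; the second prints `False`.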
If you have any questions about using the SEMB library, feel free to raise an issue or send an email to [email protected]. Go Blue!