-
Notifications
You must be signed in to change notification settings - Fork 23
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
0 parents
commit 44fd820
Showing
17,230 changed files
with
335,122 additions
and
0 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
Copyright (c) 2020 Jiaxuan You, Rex Ying, Jonathan Gomes Selman | ||
Copyright (c) Facebook, Inc. and its affiliates. | ||
Additional copyrights are specified in relevant subdirectories. | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all | ||
copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
SOFTWARE. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,324 @@ | ||
# ROLAND: Graph Learning Framework for Dynamic Graphs | ||
This repository contains implementations for the ROLAND paper. | ||
|
||
## Replication Materials | ||
`./run/roland_publication` | ||
|
||
## Getting Table Representation of Experiment Results | ||
We first upload logs to the TensorBoard Dev following tutorial [here](https://www.tensorflow.org/tensorboard/dataframe_api). This would allow you to obtain a `experiment_id`. | ||
|
||
Dependencies file: `roland_environment.yml`. | ||
|
||
To generate table in pandas format, you need to run the following command: | ||
|
||
```bash | ||
cd ./run | ||
python3 make_table_tensorboard_dev.py <The Experiment ID You Obtained> | ||
``` | ||
|
||
### Configuration Fils for the Best Model | ||
The configuration for the model corresponding to each entry in Table 2-4 are located at `./run/replication_configs/` | ||
|
||
### Experiment Records (Tensorboard Logs) For Tables in the Paper. | ||
Please see `./roland_raw_records` for tensorbaord records for generating Table 2-4. | ||
|
||
### Testing environment: | ||
June 2021, Camera Ready Version: Ubuntu 20.04 + RTX3090 (CUDA 11.4) | ||
|
||
## Citing the ROLAND Paper | ||
TODO: | ||
|
||
# Backend: the `GraphGym` Platform | ||
GraphGym is a platform for designing and evaluating Graph Neural Networks (GNN). | ||
### Highlights | ||
**1. Highly modularized pipeline for GNN** | ||
- **Data:** Data loading, data splitting | ||
- **Model:** Modularized GNN implementation | ||
- **Tasks:** Node / edge / graph level GNN tasks | ||
- **Evaluation:** Accuracy, ROC AUC, ... | ||
|
||
**2. Reproducible experiment configuration** | ||
- Each experiment is *fully described by a configuration file* | ||
|
||
**3. Scalable experiment management** | ||
- Easily launch *thousands of GNN experiments in parallel* | ||
- *Auto-generate* experiment analyses and figures across random seeds and experiments. | ||
|
||
**4. Flexible user customization** | ||
- Easily *register your own modules* in [`graphgym/contrib/`](graphgym/contrib), such as data loaders, GNN layers, loss functions, etc. | ||
|
||
## Why GraphGym? | ||
**TL;DR:** GraphGym is great for GNN beginners, domain experts and GNN researchers. | ||
|
||
**Scenario 1:** You are a beginner to GNN, who wants to understand how GNN works. | ||
|
||
You probably have read many exciting papers on GNN, and try to write your own GNN implementation. | ||
Using existing packages for GNN, you still have to code up the essential pipeline on your own. | ||
GraphGym is a perfect place for your to start learning *standardized GNN implementation and evaluation*. | ||
|
||
<div align="center"> | ||
<img align="center" src="docs/design_space.png" width="400px" /> | ||
<figcaption><b><br>Figure 1: Modularized GNN implementation.</b></figcaption> | ||
</div> | ||
|
||
<br> | ||
|
||
**Scenario 2:** You want to apply GNN to your exciting applications. | ||
|
||
You probably know that there are hundreds of possible GNN models, and selecting the best model is notoriously hard. | ||
Even worse, we have shown in our [paper](https://arxiv.org/abs/2011.08843) that the best GNN designs for different tasks differ drastically. | ||
GraphGym provides a *simple interface to try out thousands of GNNs in parallel* and understand the best designs for your specific task. | ||
GraphGym also recommends a "go-to" GNN design space, after investigating 10 million GNN model-task combinations. | ||
|
||
<div align="center"> | ||
<img align="center" src="docs/rank.png" width="1000px" /> | ||
<figcaption><b><br>Figure 2: A guideline for desirable GNN design choices. <br>(Sampling from 10 million GNN model-task combinations.) </b></figcaption> | ||
</div> | ||
|
||
<br> | ||
|
||
|
||
**Scenario 3:** You are a GNN researcher, who want to innovate GNN models / propose new GNN tasks. | ||
|
||
Say you have proposed a new GNN layer `ExampleConv`. | ||
GraphGym can help you convincingly argue that `ExampleConv` is better than say `GCNConv`: | ||
when randomly sample from 10 millions possible model-task combinations, how often `ExampleConv` will outperform `GCNConv`, when everything else is fixed (including the computational cost). | ||
Moreover, GraphGym can help you easily do hyper-parameter search, and *visualize* what design choices are better. | ||
In sum, GraphGym can greatly facilitate your GNN research. | ||
|
||
<div align="center"> | ||
<img align="center" src="docs/evaluation.png" width="1000px" /> | ||
<figcaption><b><br>Figure 3: Evaluation of a given GNN design dimension (BatchNorm here).</b></figcaption> | ||
</div> | ||
|
||
<br> | ||
|
||
## Installation | ||
|
||
**Requirements** | ||
|
||
- CPU or NVIDIA GPU, Linux, Python3 | ||
- PyTorch, various Python packages; Instructions for installing these dependencies are found below | ||
|
||
|
||
**1. Python environment** | ||
We recommend using Conda package manager | ||
|
||
```bash | ||
conda create -n graphgym python=3.7 | ||
source activate graphgym | ||
``` | ||
|
||
**2. Pytorch:** | ||
Install [PyTorch](https://pytorch.org/). | ||
We have verified under PyTorch 1.4.0 and torchvision 0.5.0. For example: | ||
```bash | ||
pip install torch==1.4.0 torchvision==0.5.0 | ||
``` | ||
|
||
**3. Pytorch Geometric:** | ||
Install [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html), | ||
follow their instructions. For example: | ||
```bash | ||
# CUDA versions: cpu, cu92, cu101, cu102, cu110 | ||
# TORCH versions: 1.4.0, 1.5.0, 1.6.0, 1.7.0 | ||
CUDA=cu101 | ||
TORCH=1.4.0 | ||
pip install torch-scatter==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html | ||
pip install torch-sparse==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html | ||
pip install torch-cluster==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html | ||
pip install torch-spline-conv==latest+${CUDA} -f https://pytorch-geometric.com/whl/torch-${TORCH}.html | ||
pip install torch-geometric | ||
``` | ||
|
||
**4. Clone GraphGym and install other dependencies:** | ||
|
||
```bash | ||
git clone https://github.com/snap-stanford/GraphGym | ||
cd GraphGym | ||
pip install -r requirements.txt | ||
python setup.py develop | ||
``` | ||
|
||
|
||
**5. Test the installation** | ||
|
||
Run a test GNN experiment using GraphGym, specified in [`run/configs/example.yaml`](run/configs/example.yaml). | ||
The experiment is about node classification on Cora dataset (random 80/20 train/val split). | ||
```bash | ||
cd run | ||
bash run_single.sh | ||
``` | ||
|
||
## GraphGym Usage | ||
|
||
### 1 Run a single GNN experiment | ||
A full example is specified in [`run/run_single.sh`](run/run_single.sh). | ||
|
||
**1.1 Specify a configuration file.** | ||
In GraphGym, an experiment is fully specified by a `.yaml` file. | ||
Unspecified configurations in the `.yaml` file will be populated by the default values in [`graphgym/config.py`](graphgym/config.py). | ||
For example, in [`run/configs/example.yaml`](run/configs/example.yaml), there are configurations on dataset, training, model, GNN, etc. | ||
Descriptions for each configuration is described also in [`graphgym/config.py`](graphgym/config.py). | ||
|
||
**1.2 Launch an experiment.** | ||
For example, in [`run/run_single.sh`](run/run_single.sh): | ||
```bash | ||
python main.py --cfg configs/example.yaml --repeat 3 | ||
``` | ||
You can specify the number of different random seeds to repeat via `--repeat`. | ||
|
||
**1.3 Understand the results.** | ||
Experimental results will be automatically saved in directory `run/results/${CONFIG_NAME}/`; | ||
in the example above, it is `run/results/example/`. | ||
Results for different random seeds will be saved in different subdirectories, such as `run/results/example/2`. | ||
The aggregated results over all the random seeds are *automatically* generated into `run/results/example/agg`, | ||
including the mean and standard deviation `_std` for each metric. | ||
Train/val/test results are further saved into subdirectories, such as `run/results/example/agg/val`; here, | ||
`stats.json` stores the results after each epoch aggregated across random seeds, | ||
`best.json` stores the results at *the epoch with the highest validation accuracy*. | ||
|
||
### 2 Run a batch of GNN experiments | ||
A full example is specified in [`run/run_batch.sh`](run/run_batch.sh). | ||
|
||
**2.1 Specify a base file.** | ||
GraphGym supports running a batch of experiments. | ||
To start, a user needs to select a base architecture `--config`. | ||
The batch of experiments will be created by perturbing certain configurations of the base architecture. | ||
|
||
**2.2 (Optional) Specify a base file for computational budget.** | ||
Additionally, GraphGym allows a user to select a base architecture to *control the computational budget* for the grid search, `--config_budget`. | ||
The computational budget is currently measured by the number of trainable parameters; the control is achieved by auto-adjust | ||
the hidden dimension size for GNN. | ||
If no `--config_budget` is provided, GraphGym will not control the computational budget. | ||
|
||
**2.3 Specify a grid file.** | ||
A grid file describes how to perturb the base file, in order to generate the batch of the experiments. | ||
For example, the base file could specify an experiment of 3-layer GCN for Cora node classification. | ||
Then, the grid file specifies how to perturb the experiment along different dimension, such as number of layers, | ||
model architecture, dataset, level of task, etc. | ||
|
||
|
||
**2.4 Generate config files for the batch of experiments,** based on the information specified above. | ||
For example, in [`run/run_batch.sh`](run/run_batch.sh): | ||
```bash | ||
python configs_gen.py --config configs/${DIR}/${CONFIG}.yaml \ | ||
--config_budget configs/${DIR}/${CONFIG}.yaml \ | ||
--grid grids/${DIR}/${GRID}.txt \ | ||
--out_dir configs | ||
``` | ||
|
||
**2.5 Launch the batch of experiments.** | ||
For example, in [`run/run_batch.sh`](run/run_batch.sh): | ||
```bash | ||
bash parallel.sh configs/${CONFIG}_grid_${GRID} $REPEAT $MAX_JOBS $SLEEP | ||
``` | ||
Each experiment will be repeated for `$REPEAT` times. | ||
We implemented a queue system to sequentially launch all the jobs, with `$MAX_JOBS` concurrent jobs running at the same time. | ||
In practice, our system works great when handling thousands of jobs. | ||
|
||
**2.6 Understand the results.** | ||
Experimental results will be automatically saved in directory `run/results/${CONFIG_NAME}_grid_${GRID_NAME}/`; | ||
in the example above, it is `run/results/example_grid_example/`. | ||
After running each experiment, GraphGym additionally automatically averages across different models, saved in | ||
`run/results/example_grid_example/agg`. | ||
There, `val.csv` represents validation accuracy for each model configuration at the *final* epoch, | ||
and `val_best.csv` represents the results at the epoch with the highest validation error. | ||
When test set split is provided, `test.csv` represents test accuracy for each model configuration at the *final* epoch, | ||
and `test_best.csv` represents the results at the epoch with the highest validation error. | ||
|
||
|
||
|
||
|
||
|
||
### 3 Analyze the results | ||
We provides a handy tool to automatically provide an overview of a batch of experiments in | ||
[`analysis/example.ipynb`](analysis/example.ipynb). | ||
```bash | ||
cd analysis | ||
jupyter notebook | ||
example.ipynb # automatically provide an overview of a batch of experiments | ||
``` | ||
|
||
|
||
|
||
### 4 User customization | ||
A highlight of GraphGym is that it allows users to easily register their customized modules. | ||
The supported customized modules are provided in directory [`graphgym/contrib/`](graphgym/contrib/), including: | ||
- Activation [`graphgym/contrib/act/`](graphgym/contrib/act/), | ||
- Configuration [`graphgym/contrib/config/`](graphgym/contrib/config/), | ||
- Feature augmentation [`graphgym/contrib/feature_augment/`](graphgym/contrib/feature_augment/), | ||
- Feature encoder [`graphgym/contrib/feature_encoder/`](graphgym/contrib/feature_encoder/), | ||
- GNN head [`graphgym/contrib/head/`](graphgym/contrib/head/), | ||
- GNN layer [`graphgym/contrib/layer/`](graphgym/contrib/layer/), | ||
- Data loader [`graphgym/contrib/loader/`](graphgym/contrib/loader/), | ||
- Loss function [`graphgym/contrib/loss/`](graphgym/contrib/loss/), | ||
- GNN network [`graphgym/contrib/network/`](graphgym/contrib/network/), | ||
- Optimizer [`graphgym/contrib/optimizer/`](graphgym/contrib/optimizer/), | ||
- GNN global pooling (graph classification only) [`graphgym/contrib/pooling/`](graphgym/contrib/pooling/), | ||
- GNN stage [`graphgym/contrib/stage/`](graphgym/contrib/stage/), | ||
- Data transformations [`graphgym/contrib/transform/`](graphgym/contrib/transform/). | ||
|
||
Within each directory, (at least) an example is provided, showing how to register user customized modules. | ||
Note that new user customized modules may result in new configurations; in these cases, new configuration fields | ||
should be registered at [`graphgym/contrib/config/`](graphgym/contrib/config/). | ||
|
||
**Note: Applying to your own datasets.** | ||
A common use case will be applying GraphGym to your favorite datasets. | ||
To do so, you may follow our example in [`graphgym/contrib/loader/example.py`](graphgym/contrib/loader/example.py). | ||
To provide more flexibility, GraphGym currently accepts a list of [NetworkX](https://networkx.org/documentation/stable/index.html) graphs | ||
or [DeepSNAP](https://github.com/snap-stanford/deepsnap) graphs as the input; | ||
the following attributes in the graphs will be auto_loaded and parsed `node_feature`, `node_label`, `edge_feature`, `edge_label`, | ||
`graph_feature`, `graph_label`. | ||
Additionally, we have provided examples on how to transform [PyG](https://pytorch-geometric.readthedocs.io/en/latest/) datasets into the accepted format. | ||
Further details on the data representation is described in [DeepSNAP documentation](https://snap.stanford.edu/deepsnap/notes/introduction.html#graph-in-deepsnap). | ||
|
||
|
||
|
||
## Use case: Design Space for Graph Neural Networks (NeurIPS 2020 Spotlight) | ||
|
||
Reproducing experiments in *[Design Space for Graph Neural Networks](https://arxiv.org/abs/2011.08843)*, Jiaxuan You, Rex Ying, Jure Leskovec, **NeurIPS 2020 Spotlight**. | ||
You may refer to the [paper](https://arxiv.org/abs/2011.08843) or [project webpage](http://snap.stanford.edu/gnn-design/) for more details. | ||
|
||
```bash | ||
# NOTE: We include the raw results with GraphGym | ||
# If you run the following code, the results will be overridden. | ||
cd run | ||
bash run_design_round1.sh # first round experiments, on a design space of 315K GNN designs | ||
bash run_design_round2.sh # second round experiments, on a design space of 96 GNN designs | ||
cd ../analysis | ||
jupyter notebook | ||
design_space.ipynb # reproducing all the analyses in the paper | ||
``` | ||
|
||
<div align="center"> | ||
<img align="center" src="docs/overview.png" width="900px" /> | ||
<figcaption><b><br>Figure 4: Overview of the proposed GNN design space and task space.</b></figcaption> | ||
</div> | ||
|
||
|
||
## Contributors | ||
[Jiaxuan You](https://cs.stanford.edu/~jiaxuan/) initiates the project and majorly contributes to the entire GraphGym platform. | ||
[Rex Ying](https://cs.stanford.edu/people/rexy/) contributes to the feature augmentation modules. | ||
Jonathan Gomes Selman enables GraphGym to have OGB support. | ||
|
||
GraphGym is inspired by the framework of [pycls](https://github.com/facebookresearch/pycls). | ||
GraphGym adopt [DeepSNAP](https://github.com/snap-stanford/deepsnap) as the data representation, which is a Python library that assists efficient deep learning on graphs. | ||
Part of GraphGym relies on [Pytorch Geometric](https://github.com/rusty1s/pytorch_geometric) functionalities. | ||
|
||
## Contributing | ||
|
||
We warmly welcome the community to contribute to GraphGym. | ||
GraphGym is particularly designed to enable contribution / customization in a simple way. | ||
For example, you may contribute your modules to [`graphgym/contrib/`](graphgym/contrib/) by creating pull requests. | ||
|
||
## Citing our paper | ||
If you find GraphGym or our paper useful, please cite our paper: | ||
``` | ||
@InProceedings{you2020design, | ||
title = {Design Space for Graph Neural Networks}, | ||
author = {You, Jiaxuan and Ying, Rex and Leskovec, Jure}, | ||
booktitle = {NeurIPS}, | ||
year = {2020} | ||
} | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
Copyright (c) 2020 Jiaxuan You | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy | ||
of this software and associated documentation files (the "Software"), to deal | ||
in the Software without restriction, including without limitation the rights | ||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
copies of the Software, and to permit persons to whom the Software is | ||
furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in | ||
all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
THE SOFTWARE. |
Oops, something went wrong.