
CS224W - Bag of Tricks for Node Classification with GNN - LogE Loss #6

Closed
wants to merge 11 commits into from

Conversation

mattjhayes3
Owner

@mattjhayes3 mattjhayes3 commented Dec 8, 2024

Implement $\text{log-}\epsilon$ loss functions

Part of #4 (TODO edit this). As described in “Bag of Tricks for Node Classification with Graph Neural Networks”, this non-convex loss is thought to be less sensitive to outliers: it provides a maximal gradient at the decision boundary while still giving significant signal for all misclassified examples.
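For reference, our reading of the paper's loge loss is roughly the following form (sketch; $\epsilon$ is a small shift, with $\epsilon = 1 - \log 2$ placing the maximal gradient at a predicted probability of $0.5$):

$$\mathcal{L}_{\text{loge}} = \frac{1}{n}\sum_{i=1}^{n}\log\!\left(\epsilon + \ell_{\mathrm{CE}}^{(i)}\right) - \log\epsilon$$

where $\ell_{\mathrm{CE}}^{(i)}$ is the per-example cross-entropy; the $-\log\epsilon$ shift just makes the loss zero when $\ell_{\mathrm{CE}} = 0$.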

Details

  • Implements drop-in module and functional variants corresponding to nll, cross_entropy, binary_cross_entropy, and binary_cross_entropy_with_logits
  • Longer term, PyTorch might be the best place for these to live, but we think PyG is a good home in the meantime, as the paper shows the loss is more effective on GNNs than on MLPs
  • PyG does not yet have any loss functions as far as we can tell, but torch_geometric.nn.functional seemed like a reasonable place for them to live. Happy to move them to contrib as well.
  • Implemented as simple wrappers around the PyTorch losses for easy maintainability

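To make the wrapper idea concrete, here is a minimal sketch of the functional NLL variant (names and exact form are illustrative, not the final API; it assumes the $\epsilon = 1 - \log 2$ default from the paper):

```python
import math

import torch
import torch.nn.functional as F

# With epsilon = 1 - log(2), the gradient magnitude of the loge loss peaks
# at predicted probability 0.5, i.e. at the decision boundary.
EPSILON = 1 - math.log(2)


def loge_nll_loss(log_probs: torch.Tensor, target: torch.Tensor,
                  epsilon: float = EPSILON) -> torch.Tensor:
    # Per-example NLL, kept unreduced so the log transform is applied
    # element-wise before averaging.
    nll = F.nll_loss(log_probs, target, reduction="none")
    # Shift by -log(epsilon) so the loss is zero when NLL is zero.
    return (torch.log(epsilon + nll) - math.log(epsilon)).mean()
```

Used as a drop-in replacement, e.g. `loss = loge_nll_loss(model(data.x, data.edge_index)[data.train_mask], data.y[data.train_mask])` in place of `F.nll_loss(...)`.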
Benchmarks

  • From benchmarks/citation using Colab's T4s, we see it can bring small but statistically significant gains
    • It seems to work more consistently with GAT than with other models
  • It can, however, cause large regressions in some settings; e.g. it seems to work very poorly with SGC, so we recommend users also try more traditional losses to see what works best
    • Validation delta is not always correlated with test delta, but in those cases it usually does not cause too large a regression
  • Typically slower due to the extra computation, but it can be faster thanks to earlier stopping
  • A selection of deltas that were statistically significant on test accuracy with at least 95% confidence is included below, expressed in the direction loge - nll
    • For Arxiv, default GAT parameters from benchmarks/citation were used with a batch norm inserted, though these are surely suboptimal settings
    • Full results are available here
| nll_command | val_acc_abs_delta | test_acc_abs_delta | duration_rel_delta |
| --- | ---: | ---: | ---: |
| `gcn.py --dataset=CiteSeer` | 0.73% | 0.76% | 2.62% |
| `gcn.py --dataset=PubMed` | 0.42% | -0.47% | 7.38% |
| `gat.py --dataset=Cora` | 0.19% | -0.21% | 2.81% |
| `gat.py --dataset=CiteSeer` | 0.86% | 0.70% | -6.07% |
| `gat.py --dataset=PubMed --lr=0.01 --output_heads=8 --weight_decay=0.001` | 0.40% | 0.18% | 2.32% |
| `gat.py --batch_norm --dataset=Arxiv --no_normalize_features --runs=20` | 0.73% | 0.67% | -1.54% |
| `cheb.py --dataset=Cora --num_hops=3` | 0.65% | 0.66% | -4.62% |
| `arma.py --dataset=Cora --num_layers=1 --num_stacks=2 --shared_weights` | -0.10% | 0.19% | -1.23% |
| `arma.py --dataset=CiteSeer --num_layers=1 --num_stacks=3 --shared_weights` | 0.50% | 0.61% | 2.76% |
| `sgc.py --K=3 --dataset=Cora --weight_decay=0.0005` | -13.69% | -12.37% | -27.50% |
| `sgc.py --K=2 --dataset=PubMed --weight_decay=0.0005` | -13.00% | -15.80% | 10.01% |
| `appnp.py --alpha=0.1 --dataset=Cora` | -0.38% | -0.28% | -3.66% |
| `appnp.py --alpha=0.1 --dataset=CiteSeer` | 0.50% | 0.53% | 0.69% |

@liuvince
Collaborator

liuvince commented Dec 8, 2024

lgtm!
Two remarks:

  • The metrics table is too detailed; we should help the reader by writing the relevant numbers in bold. The command column is hard to read with so many numbers; we could also split the table to make reading easier.
  • I believe it would be good to have a few examples (not mandatory imo tho).

@mattjhayes3
Owner Author

mattjhayes3 commented Dec 9, 2024

Fair point! I cut down the table quite a bit and linked to the rest. I still wanted to include at least a couple of results for most methods to show where they are consistent or inconsistent across datasets. I could move the commands below the table, but I think it is fairly readable now, and doing so would make the description longer.

@liuvince
Collaborator

liuvince commented Dec 9, 2024

lgtm

@mattjhayes3
Owner Author

Closing because I can't figure out how to unlink this from the PyG issue; it was just a draft for our internal review.
