something about loading `sort_feature.pt` in Paper100MDataset #138

zhaoyin214 · 2022-12-01T08:55:42Z

It seems strange that the `Paper100MDataset` class in `benchmarks/ogbn-papers100M/dist_sampling_ogb_paper100M_quiver.py` loads the feature from `sort_feature.pt`.

I found that `sort_feature` has already been sorted by in-degree, via the statement `feature = feature[prev_order]` in `preprocess.py`. In the papers100M benchmark, this already-sorted feature is then sorted again. In my opinion, `Paper100MDataset` should load the feature from `feature.pt` rather than `sort_feature.pt`.

Is that correct?
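To make the concern concrete, here is a minimal pure-Python sketch of what a gather like `feature = feature[prev_order]` does when `prev_order` lists node ids by descending in-degree. The toy graph, the helper names `in_degree_order`, `gather`, and `invert`, and the tie-breaking rule are assumptions made for illustration; this is not the code in `preprocess.py`.

```python
def in_degree_order(num_nodes, edges):
    """Return node ids sorted by descending in-degree (ties broken by node id)."""
    in_degree = [0] * num_nodes
    for _src, dst in edges:
        in_degree[dst] += 1
    return sorted(range(num_nodes), key=lambda n: (-in_degree[n], n))


def gather(rows, order):
    """Plain-list equivalent of rows[order] on a tensor."""
    return [rows[i] for i in order]


def invert(order):
    """Map each original node id to its row position after the gather."""
    position = [0] * len(order)
    for new_pos, old_id in enumerate(order):
        position[old_id] = new_pos
    return position


# Hypothetical 4-node graph; row i of `feature` belongs to node i.
edges = [(0, 2), (1, 2), (3, 2), (0, 1), (3, 1), (2, 3)]
feature = [[0.0], [1.0], [2.0], [3.0]]

prev_order = in_degree_order(4, edges)       # hottest node first
sort_feature = gather(feature, prev_order)   # rows now in hotness order
position = invert(prev_order)                # needed to find node n's row again
```

After the gather, row `i` of `sort_feature` belongs to original node `prev_order[i]`, so any further reordering has to be composed with this mapping (or node ids have to be relabelled) for lookups to stay correct.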


Anlarry · 2022-12-06T11:01:23Z

I also recently learned how Quiver accelerates training. I think the purpose of reordering the already-sorted feature is load balancing: randomly reordering the sorted feature lets the hottest features be stored evenly across the GPUs.
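The load-balancing argument can be illustrated with a small pure-Python sketch. It assumes, purely for illustration, that the hot rows are split into contiguous chunks across two GPUs and that the access counts in `hits` are made up; `split_contiguous` and `load_per_gpu` are hypothetical helpers, not part of Quiver.

```python
import random


def split_contiguous(order, num_gpus):
    """Assign consecutive chunks of `order` to GPUs 0..num_gpus-1."""
    chunk = (len(order) + num_gpus - 1) // num_gpus
    return [order[g * chunk:(g + 1) * chunk] for g in range(num_gpus)]


def load_per_gpu(parts, hits):
    """Sum the access counts of the rows held by each GPU."""
    return [sum(hits[n] for n in part) for part in parts]


# Made-up access counts, already sorted hottest-first (like a degree-sorted feature).
hits = [100, 90, 80, 70, 4, 3, 2, 1]
hot_order = list(range(len(hits)))

# Randomly permute the sorted rows before splitting them across GPUs.
rng = random.Random(0)
shuffled = hot_order[:]
rng.shuffle(shuffled)

sorted_load = load_per_gpu(split_contiguous(hot_order, 2), hits)
shuffled_load = load_per_gpu(split_contiguous(shuffled, 2), hits)
print("contiguous split of sorted rows:", sorted_load)
print("contiguous split after shuffle: ", shuffled_load)
```

With the sorted order, the first GPU receives all four hottest rows and almost all of the traffic. A random permutation does not guarantee a perfect split, but on average it spreads the hot rows across the GPUs.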
