something about loading sort_feature.pt in Paper100MDataset #138

Open
zhaoyin214 opened this issue Dec 1, 2022 · 1 comment

Comments

@zhaoyin214

Something seems strange about how the feature is loaded from `sort_feature.pt` in the `Paper100MDataset` class in `benchmarks/ogbn-papers100M/dist_sampling_ogb_paper100M_quiver.py`.


I found that `sort_feature` has already been sorted in in-degree order by the statement `feature = feature[prev_order]` in `preprocess.py`. In the papers100M benchmark, this already-sorted feature is then sorted again. In my opinion, `Paper100MDataset` should load the feature from `feature.pt` rather than `sort_feature.pt`.

Is that correct?
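To make the question concrete, here is a minimal sketch of what sorting a feature matrix by in-degree looks like. It uses plain Python lists instead of tensors, and the toy graph, the function name `sort_by_in_degree`, and the `new_order` mapping are assumptions for illustration only. Only `prev_order` and the indexing `feature[prev_order]` come from the statement quoted above; whether `preprocess.py` sorts in descending order is also assumed.

```python
def sort_by_in_degree(edges, num_nodes, features):
    """Reorder feature rows so the highest in-degree nodes come first."""
    in_degree = [0] * num_nodes
    for _, dst in edges:
        in_degree[dst] += 1

    # prev_order[k] is the original node id whose row ends up at position k.
    prev_order = sorted(range(num_nodes), key=lambda n: -in_degree[n])

    # Equivalent of `feature = feature[prev_order]` on a tensor.
    sorted_features = [features[n] for n in prev_order]

    # new_order[n] is the row of original node n inside sorted_features.
    new_order = [0] * num_nodes
    for row, node in enumerate(prev_order):
        new_order[node] = row
    return sorted_features, prev_order, new_order


# Toy graph as (src, dst) pairs; row i of `features` belongs to node i.
edges = [(0, 2), (1, 2), (3, 2), (0, 1), (3, 1), (2, 3)]
features = [[0.0], [1.0], [2.0], [3.0]]

sorted_features, prev_order, new_order = sort_by_in_degree(edges, 4, features)
print(prev_order)       # [2, 1, 3, 0]
print(sorted_features)  # [[2.0], [1.0], [3.0], [0.0]]
```

Any later reordering of the sorted rows has to be composed with `new_order`, otherwise lookups by original node id return the wrong row.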


Anlarry commented Dec 6, 2022

I also only recently learned how Quiver accelerates training. I think the purpose of reordering the already-sorted feature is load balancing. Randomly reordering the sorted feature allows the hottest features to be spread evenly across the GPUs.
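A rough sketch of that load-balancing idea, under the assumption that each GPU caches one contiguous slice of the rows. The access counts, `split_contiguous`, and `gpu_load` are made up for illustration and are not Quiver's API.

```python
import random


def split_contiguous(rows, num_gpus):
    """Give each GPU one contiguous slice of the row ids."""
    per_gpu = len(rows) // num_gpus
    return [rows[g * per_gpu:(g + 1) * per_gpu] for g in range(num_gpus)]


def gpu_load(parts, access_count):
    """Total number of accesses served by each GPU."""
    return [sum(access_count[r] for r in part) for part in parts]


# Rows are already sorted hottest-first; the counts are hypothetical.
access_count = [100, 90, 80, 70, 4, 3, 2, 1]
rows = list(range(len(access_count)))

# Splitting the sorted rows directly puts every hot row on GPU 0.
contiguous = split_contiguous(rows, num_gpus=2)
print(gpu_load(contiguous, access_count))  # [340, 10]

# Shuffling first tends to spread the hot rows across both GPUs.
shuffled = rows[:]
random.Random(0).shuffle(shuffled)
balanced = split_contiguous(shuffled, num_gpus=2)
print(gpu_load(balanced, access_count))
```

On an example this small a single shuffle can still come out uneven. The balancing effect is statistical and becomes reliable only when there are many hot rows.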
