shuffle-bn has no effect on single-GPU #39

Open
sorenrasmussenai opened this issue Feb 6, 2020 · 1 comment

Comments

@sorenrasmussenai

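It appears that shuffle-bn has no effect when run on a single GPU: batch norm computes its statistics over the whole batch, so permuting the samples within that batch leaves the statistics unchanged.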

Example:

import torch
import torch.nn as nn

(B,C,H,W) = 4,3,2,2

model1 = nn.Sequential(nn.BatchNorm2d(C))
model2 = nn.Sequential(nn.BatchNorm2d(C))
print("Before:")
print("  model1 stats: ", model1[0].running_mean, model1[0].running_var)
print("  model2 stats: ", model2[0].running_mean, model2[0].running_var)
shuffle_ids = torch.randperm(B).long()
x1 = torch.randn(B,C,H,W)*3+1
x2 = x1[shuffle_ids]
model1(x1)
model2(x2)
print("After:")
print("  model1 stats: ", model1[0].running_mean, model1[0].running_var)
print("  model2 stats: ", model2[0].running_mean, model2[0].running_var)

Output:

Before:
  model1 stats:  tensor([0., 0., 0.]) tensor([1., 1., 1.])
  model2 stats:  tensor([0., 0., 0.]) tensor([1., 1., 1.])
After:
  model1 stats:  tensor([0.2285, 0.1523, 0.1447]) tensor([1.6193, 1.4863, 1.6332])
  model2 stats:  tensor([0.2285, 0.1523, 0.1447]) tensor([1.6193, 1.4863, 1.6332])

I guess a different approach is needed on a single GPU. Any thoughts?

Thanks for releasing this code.

@sorenrasmussenai

The simplest solution would probably be to emulate the multi-GPU implementation on a single GPU:

  1. Shuffle batch
  2. Split batch in N parts
  3. Do N independent batchnorms
  4. Gather parts
  5. Unshuffle
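
The five steps above could be sketched roughly as follows. This is only an illustration under assumptions, not code from the repository: the helper name `shuffled_split_bn` and its `num_groups` parameter are hypothetical, and the model is assumed to be in training mode so that batch norm uses per-call batch statistics.

```python
import torch
import torch.nn as nn


def shuffled_split_bn(model, x, num_groups):
    """Hypothetical helper: emulate multi-GPU shuffle-bn on one device."""
    batch_size = x.shape[0]
    # 1. Shuffle the batch.
    shuffle_ids = torch.randperm(batch_size, device=x.device)
    # Inverse permutation, used in step 5 to restore the original order.
    unshuffle_ids = torch.argsort(shuffle_ids)
    x_shuffled = x[shuffle_ids]
    # 2. Split the batch into N parts.
    chunks = torch.chunk(x_shuffled, num_groups, dim=0)
    # 3. Run each part through the model on its own, so every batch norm
    #    layer computes its statistics from that part only.
    parts = [model(chunk) for chunk in chunks]
    # 4. Gather the parts, then 5. unshuffle.
    return torch.cat(parts, dim=0)[unshuffle_ids]


torch.manual_seed(0)
(B, C, H, W) = 8, 3, 2, 2
model = nn.Sequential(nn.BatchNorm2d(C))  # training mode by default
x = torch.randn(B, C, H, W) * 3 + 1
out = shuffled_split_bn(model, x, num_groups=2)
print(out.shape)
```

One caveat with this sketch: each call to the model updates `running_mean` and `running_var`, so the running statistics are updated N times per step rather than once. Whether that matters depends on how the running statistics are used afterwards.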
