shuffle-bn has no effect on single-GPU #39

sorenrasmussenai · 2020-02-06T10:58:24Z

It appears that shuffle-bn has no effect when run on a single GPU.

Example:

import torch
import torch.nn as nn

B, C, H, W = 4, 3, 2, 2

# Two identical BatchNorm layers (training mode by default)
model1 = nn.Sequential(nn.BatchNorm2d(C))
model2 = nn.Sequential(nn.BatchNorm2d(C))
print("Before:")
print("  model1 stats: ", model1[0].running_mean, model1[0].running_var)
print("  model2 stats: ", model2[0].running_mean, model2[0].running_var)

# x2 holds the same samples as x1, in shuffled order
shuffle_ids = torch.randperm(B)
x1 = torch.randn(B, C, H, W) * 3 + 1
x2 = x1[shuffle_ids]

# The forward pass updates the running statistics
model1(x1)
model2(x2)
print("After:")
print("  model1 stats: ", model1[0].running_mean, model1[0].running_var)
print("  model2 stats: ", model2[0].running_mean, model2[0].running_var)

Before:
  model1 stats:  tensor([0., 0., 0.]) tensor([1., 1., 1.])
  model2 stats:  tensor([0., 0., 0.]) tensor([1., 1., 1.])
After:
  model1 stats:  tensor([0.2285, 0.1523, 0.1447]) tensor([1.6193, 1.4863, 1.6332])
  model2 stats:  tensor([0.2285, 0.1523, 0.1447]) tensor([1.6193, 1.4863, 1.6332])
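
The identical statistics are what one would expect: in training mode, BatchNorm2d reduces over the batch and spatial dimensions, so the order of samples within the batch should not matter. A minimal check of that reasoning, computing the per-channel statistics by hand (the variable names here are only illustrative):

```python
import torch

B, C, H, W = 4, 3, 2, 2
x1 = torch.randn(B, C, H, W) * 3 + 1
x2 = x1[torch.randperm(B)]  # same samples, shuffled order

# Per-channel mean and (biased) variance over the batch, height and width dims,
# which is the reduction BatchNorm2d uses to normalize in training mode.
mean1, var1 = x1.mean(dim=(0, 2, 3)), x1.var(dim=(0, 2, 3), unbiased=False)
mean2, var2 = x2.mean(dim=(0, 2, 3)), x2.var(dim=(0, 2, 3), unbiased=False)

# Both comparisons should hold up to floating-point rounding.
print(torch.allclose(mean1, mean2, atol=1e-5))
print(torch.allclose(var1, var2, atol=1e-5))
```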

I guess another approach is necessary on a single GPU. Any thoughts?

Thanks for releasing this code.


sorenrasmussenai · 2020-02-06T12:00:47Z

The simplest solution would probably be to emulate the multi-GPU implementation on a single GPU:

1. Shuffle the batch
2. Split the batch into N parts
3. Do N independent batchnorms, one per part
4. Gather the parts
5. Unshuffle
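
A rough sketch of those five steps, assuming a single `nn.BatchNorm2d` layer in training mode. The helper name `shuffled_split_bn` and its `num_splits` parameter are made up for illustration and are not part of any existing codebase. Note that reusing one layer for all chunks means its running statistics are updated once per chunk, which may differ from how a real multi-GPU run handles buffers.

```python
import torch
import torch.nn as nn


def shuffled_split_bn(bn: nn.BatchNorm2d, x: torch.Tensor, num_splits: int) -> torch.Tensor:
    """Hypothetical helper: emulate per-GPU batch norm on one device."""
    # 1. Shuffle the batch.
    perm = torch.randperm(x.shape[0], device=x.device)
    x_shuffled = x[perm]

    # 2. Split the shuffled batch into num_splits parts.
    chunks = torch.chunk(x_shuffled, num_splits, dim=0)

    # 3. Normalize each part with its own batch statistics.
    #    Assumption: the affine parameters are shared across parts, and the
    #    running statistics get updated once per part.
    normalized = [bn(chunk) for chunk in chunks]

    # 4. Gather the parts back into one batch (still in shuffled order).
    out_shuffled = torch.cat(normalized, dim=0)

    # 5. Unshuffle: argsort of a permutation gives its inverse.
    inverse_perm = torch.argsort(perm)
    return out_shuffled[inverse_perm]


bn = nn.BatchNorm2d(3)
x = torch.randn(8, 3, 2, 2) * 3 + 1
out = shuffled_split_bn(bn, x, num_splits=2)
print(out.shape)
```

With `num_splits=1` this should reduce to ordinary batch norm. With larger values, each sample is normalized using only the statistics of the chunk the shuffle placed it in, so the result changes from one shuffle to the next.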
