Hi @liuzhuang13,
I'm not sure whether the bias in the BN layers can be masked out as well (v.bias:cmul(mask)), since what you minimize and prune is actually the weight, not the bias.
For a BN layer, y = γ·x̂ + β (where x̂ is the normalized input).
You prune the channels with small γ, but what about β? It may still be large or important.
In my case, after masking out β I got an enormous accuracy drop.
If I am misunderstanding the work, please tell me.
Thank you.
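Just to make it concrete, here is a minimal sketch of what I mean (the 4-channel layer and the 0/1 mask are made up for illustration):

```lua
require 'nn'

-- hypothetical example: a BN layer with 4 channels and a 0/1 channel mask
local bn   = nn.SpatialBatchNormalization(4)
local mask = torch.Tensor({1, 0, 1, 1})  -- say channel 2 has a small gamma and is pruned

-- what the pruning code does: zero out gamma for the pruned channels
bn.weight:cmul(mask)

-- my question: should beta be zeroed in the same way?
bn.bias:cmul(mask)
```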
In my experiments, masking out the bias did not seem to change the accuracy much. My explanation is that if γ is zero, the output of that channel is the same (always β) for every input, so the channel carries no information and the network tends to learn a small β for it. Even if β is large, that channel still outputs the same activation for all inputs, so I think it is not that important. If there is an accuracy drop in your experiment, I think fine-tuning can recover it.
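As a quick sanity check (just a sketch, with a made-up single-channel BN layer), you can verify that once γ is zero the channel outputs the constant β no matter what the input is:

```lua
require 'nn'

-- hypothetical single-channel BN layer with gamma forced to zero and a large beta
local bn = nn.SpatialBatchNormalization(1)
bn:evaluate()          -- use running statistics instead of batch statistics
bn.weight:zero()       -- gamma = 0, i.e. the channel is pruned
bn.bias:fill(5.0)      -- beta is large

-- two very different inputs give exactly the same constant output
local y1 = bn:forward(torch.randn(1, 1, 4, 4)):clone()
local y2 = bn:forward(torch.randn(1, 1, 4, 4) * 100):clone()
print(y1)                       -- every element equals 5, regardless of the input
print((y1 - y2):abs():max())    -- 0: the two outputs are identical
```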