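BatchNormalization

$$y = \frac{x - x_{mean}}{\sqrt{x_{var} + \varepsilon}} \cdot \gamma + \beta$$

Here $$x_{mean}$$ and $$x_{var}$$ are the mean and variance computed over the batch, and $$\varepsilon$$ is a small constant for numerical stability. $$\gamma$$ and $$\beta$$ are learnable.

BN can cancel out the bias in a CNN: subtracting $$x_{mean}$$ removes any constant per-channel offset, and $$\beta$$ takes over the role of the shift, so a bias in the conv layer directly before BN is redundant.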
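A minimal sketch of the bias cancellation, in plain Python for a single channel. The helper `batch_norm`, the values in `conv_out`, and the `bias` value are all hypothetical and chosen only for illustration. The sketch assumes training-mode statistics (mean and biased variance taken over the current batch).

```python
import math

def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel's batch of scalars, then scale and shift."""
    n = len(xs)
    mean = sum(xs) / n
    # Biased (population) variance over the batch.
    var = sum((x - mean) ** 2 for x in xs) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in xs]

# Hypothetical conv outputs for one channel across a batch of 4.
conv_out = [0.5, -1.2, 3.0, 2.2]
bias = 0.7  # hypothetical conv bias added to every element of the channel

without_bias = batch_norm(conv_out, gamma=1.5, beta=0.1)
with_bias = batch_norm([x + bias for x in conv_out], gamma=1.5, beta=0.1)

# The constant offset shifts the mean by the same amount and leaves the
# variance unchanged, so the two outputs agree up to floating-point error.
max_diff = max(abs(a - b) for a, b in zip(without_bias, with_bias))
print(max_diff)
```

Because the two results match, frameworks commonly let you disable the conv bias when a BN layer follows directly, which saves a few parameters without changing what the layer can represent.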