Hi guys,
When reading the BatchEnsemble paper, I get the impression that each model of the ensemble is trained on a different part of the mini-batch:
Section 3.1: "To match the input and the ensemble weight, we can divide the input mini-batch into M sub-batches and each sub-batch receives ensemble weight"
Appendix B: "Also note that the scheme that each ensemble member is trained with different sub-batch of input can encourage diversity as well"
However, in the current implementation, each model of the ensemble is trained on the same mini-batch, because the mini-batch is replicated before it is sent to the BatchEnsemble model:
uncertainty-baselines/baselines/cifar/batchensemble.py
Lines 215 to 216 in adc2d41
Could you please clarify? Have you tried both approaches?
Thanks,
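For concreteness, the replication the question points at amounts to tiling the mini-batch along the batch axis before it reaches the model. The sketch below is illustrative only, not a quote of the repository: `images`, `labels`, and `ensemble_size` are assumed names, and `split_batch` shows what the paper's sub-batch scheme would amount to instead.

```python
import tensorflow as tf

def replicate_batch(images, labels, ensemble_size):
  # What the baseline script reportedly does: every ensemble member sees the
  # *same* mini-batch, obtained by tiling it along the batch axis.
  # Assumes NHWC images and 1-D integer labels; names are illustrative.
  images = tf.tile(images, [ensemble_size, 1, 1, 1])
  labels = tf.tile(labels, [ensemble_size])
  return images, labels

def split_batch(images, labels, ensemble_size):
  # The Section 3.1 / Appendix B scheme: leave the mini-batch as-is, so the
  # BatchEnsemble layers carve it into ensemble_size contiguous sub-batches
  # and each member trains on *different* examples.
  tf.debugging.assert_equal(tf.shape(images)[0] % ensemble_size, 0)
  return images, labels
```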
@ywen666 can clarify. TL;DR: we initially tried different sub-batches per ensemble member, but later found that tiling to duplicate the same sub-batch typically worked a tad better.
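To make the trade-off concrete, here is a minimal sketch (not the uncertainty-baselines code) of a rank-1 BatchEnsemble dense layer. It reshapes the incoming batch into `[ensemble_size, examples_per_model, ...]` and applies each member's fast weights to its slice, so whether those slices are copies of the same examples (tiled input) or disjoint sub-batches is decided entirely by the input pipeline.

```python
import tensorflow as tf

class BatchEnsembleDense(tf.keras.layers.Layer):
  """Sketch of a rank-1 BatchEnsemble dense layer (illustrative only)."""

  def __init__(self, units, ensemble_size):
    super().__init__()
    self.units = units
    self.ensemble_size = ensemble_size

  def build(self, input_shape):
    d_in = int(input_shape[-1])
    # One shared "slow" weight matrix plus per-member rank-1 "fast" weights.
    self.w = self.add_weight(name='w', shape=[d_in, self.units],
                             initializer='glorot_uniform')
    # The paper initializes fast weights with random signs to encourage
    # diversity; 'ones' is used here only to keep the sketch short.
    self.r = self.add_weight(name='r', shape=[self.ensemble_size, self.units],
                             initializer='ones')
    self.s = self.add_weight(name='s', shape=[self.ensemble_size, d_in],
                             initializer='ones')
    self.b = self.add_weight(name='b', shape=[self.ensemble_size, self.units],
                             initializer='zeros')

  def call(self, inputs):
    # inputs: [ensemble_size * examples_per_model, d_in], ensemble-major.
    # Whether the examples_per_model slices are identical (tiled mini-batch)
    # or disjoint (sub-batches) depends only on how the batch was prepared.
    x = tf.reshape(inputs, [self.ensemble_size, -1, inputs.shape[-1]])
    s = tf.expand_dims(self.s, 1)   # [M, 1, d_in]
    r = tf.expand_dims(self.r, 1)   # [M, 1, units]
    b = tf.expand_dims(self.b, 1)   # [M, 1, units]
    # W_i = W o (r_i s_i^T), so y_i = ((x_i o s_i) W) o r_i + b_i.
    y = tf.einsum('mbi,io->mbo', x * s, self.w)
    y = y * r + b
    return tf.reshape(y, [-1, self.units])
```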