Hi guys,
When reading the BatchEnsemble paper, I get the impression that each model of the ensemble is trained on a different part of the mini-batch:
Section 3.1: "To match the input and the ensemble weight, we can divide the input mini-batch into M sub-batches and each sub-batch receives ensemble weight"
Appendix B: "Also note that the scheme that each ensemble member is trained with different sub-batch of input can encourage diversity as well"
However, in the current implementation, each model of the ensemble is trained on the same mini-batch, because the mini-batch is replicated before it is sent to the BatchEnsemble model:
uncertainty-baselines/baselines/cifar/batchensemble.py
Lines 215 to 216 in adc2d41
Could you please clarify? Have you tried both approaches?
Thanks,
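For concreteness, the replication the question points at amounts to tiling the mini-batch along the batch axis before it reaches the model. The sketch below is illustrative only, not a quote of the repository: `images`, `labels`, and `ensemble_size` are assumed names, and `split_batch` shows what the paper's sub-batch scheme would amount to instead.

```python
import tensorflow as tf

def replicate_batch(images, labels, ensemble_size):
  # What the baseline script reportedly does: every ensemble member sees the
  # *same* mini-batch, obtained by tiling it along the batch axis.
  # Assumes NHWC images and 1-D integer labels; names are illustrative.
  images = tf.tile(images, [ensemble_size, 1, 1, 1])
  labels = tf.tile(labels, [ensemble_size])
  return images, labels

def split_batch(images, labels, ensemble_size):
  # The Section 3.1 / Appendix B scheme: leave the mini-batch as-is, so the
  # BatchEnsemble layers carve it into ensemble_size contiguous sub-batches
  # and each member trains on *different* examples.
  tf.debugging.assert_equal(tf.shape(images)[0] % ensemble_size, 0)
  return images, labels
```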
@ywen666 can clarify. TL;DR: we initially tried different sub-batches per ensemble member, but later found that tiling to duplicate the same sub-batch typically worked a tad better.
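To make the trade-off concrete, here is a minimal sketch (not the uncertainty-baselines code) of a rank-1 BatchEnsemble dense layer. It reshapes the incoming batch into `[ensemble_size, examples_per_model, ...]` and applies each member's fast weights to its slice, so whether those slices are copies of the same examples (tiled input) or disjoint sub-batches is decided entirely by the input pipeline.

```python
import tensorflow as tf

class BatchEnsembleDense(tf.keras.layers.Layer):
  """Sketch of a rank-1 BatchEnsemble dense layer (illustrative only)."""

  def __init__(self, units, ensemble_size):
    super().__init__()
    self.units = units
    self.ensemble_size = ensemble_size

  def build(self, input_shape):
    d_in = int(input_shape[-1])
    # One shared "slow" weight matrix plus per-member rank-1 "fast" weights.
    self.w = self.add_weight(name='w', shape=[d_in, self.units],
                             initializer='glorot_uniform')
    # The paper initializes fast weights with random signs to encourage
    # diversity; 'ones' is used here only to keep the sketch short.
    self.r = self.add_weight(name='r', shape=[self.ensemble_size, self.units],
                             initializer='ones')
    self.s = self.add_weight(name='s', shape=[self.ensemble_size, d_in],
                             initializer='ones')
    self.b = self.add_weight(name='b', shape=[self.ensemble_size, self.units],
                             initializer='zeros')

  def call(self, inputs):
    # inputs: [ensemble_size * examples_per_model, d_in], ensemble-major.
    # Whether the examples_per_model slices are identical (tiled mini-batch)
    # or disjoint (sub-batches) depends only on how the batch was prepared.
    x = tf.reshape(inputs, [self.ensemble_size, -1, inputs.shape[-1]])
    s = tf.expand_dims(self.s, 1)   # [M, 1, d_in]
    r = tf.expand_dims(self.r, 1)   # [M, 1, units]
    b = tf.expand_dims(self.b, 1)   # [M, 1, units]
    # W_i = W o (r_i s_i^T), so y_i = ((x_i o s_i) W) o r_i + b_i.
    y = tf.einsum('mbi,io->mbo', x * s, self.w)
    y = y * r + b
    return tf.reshape(y, [-1, self.units])
```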