The MNIST example is too nice insofar as both the train and test sets have a number of samples divisible by the batch size of 100. In general we should not assume this is the case. Our abstract model API should support some form of dynamic batch size.
I am not sure what the best approach for this is yet. However, if we assume that the model executes on a fixed batch size and receives a differing batch size at most once per epoch (the final partial batch), then when we average the loss we could multiply by a mask along the batch axis (1s where there are samples, 0s where there are not) and divide by the sum of the mask; a small sketch follows below. If we do it like this, maybe it should be opt-in, since it does introduce a few extra floating point operations.
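A minimal sketch of the masked-average idea, written with NumPy just to illustrate the arithmetic. The helpers `pad_batch` and `masked_mean` are hypothetical names, not part of any existing API here; the point is only that the padded rows contribute nothing to the loss because the mask zeroes them out and the divisor is the count of real samples.

```python
import numpy as np

BATCH_SIZE = 100  # the fixed batch size the model is compiled/traced for


def pad_batch(x, batch_size=BATCH_SIZE):
    """Pad a partial batch up to `batch_size` and return (padded, mask)."""
    n = x.shape[0]
    mask = np.zeros(batch_size, dtype=x.dtype)
    mask[:n] = 1.0  # 1s where there are real samples, 0s where there are not
    if n < batch_size:
        pad_width = [(0, batch_size - n)] + [(0, 0)] * (x.ndim - 1)
        x = np.pad(x, pad_width)
    return x, mask


def masked_mean(per_sample_loss, mask):
    """Average the loss over real samples only."""
    return np.sum(per_sample_loss * mask) / np.sum(mask)


# Example: the last batch of an epoch has only 37 samples.
partial = np.random.randn(37, 784).astype(np.float32)
padded, mask = pad_batch(partial)
per_sample_loss = np.square(padded).mean(axis=-1)  # stand-in for a real loss
print(masked_mean(per_sample_loss, mask))  # the 63 padded rows are ignored
```

The cost is one elementwise multiply and one extra sum per step, which is why making it opt-in seems reasonable.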