
Dynamic batch sizes #71

Open
Ebanflo42 opened this issue Apr 4, 2024 · 1 comment
Comments

@Ebanflo42
Collaborator

The MNIST example is deceptively convenient in that both the train and test sets have a number of samples divisible by the batch size of 100. In general we should not assume this is the case. Our abstract model API should support some form of dynamic batch size.

I am not sure what the best approach is yet. However, if we assume that the model executes on a fixed batch size and receives a differing (smaller) batch at most once per epoch, then when we average the loss we could multiply by a mask along the batch axis (1s where there are real samples, 0s where there are not) and divide by the sum of the mask. If we do it this way, it should perhaps be opt-in, since it introduces a few extra floating-point operations.
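A minimal sketch of the masking idea described above, written with NumPy for illustration (the function name and shapes here are hypothetical, not part of this project's API): the final short batch is padded up to the fixed batch size, and the mask ensures the padded entries contribute nothing to the average.

```python
import numpy as np

def masked_mean_loss(per_sample_loss, mask):
    """Average per-sample losses over only the real (unpadded) samples.

    per_sample_loss: shape (batch_size,), losses for a possibly padded batch
    mask: shape (batch_size,), 1.0 for real samples, 0.0 for padding
    """
    return np.sum(per_sample_loss * mask) / np.sum(mask)

# A final batch of 3 real samples padded up to a fixed batch size of 5;
# the padded slots may hold arbitrary values, which the mask zeroes out.
losses = np.array([0.2, 0.4, 0.6, 9.9, 9.9])
mask = np.array([1.0, 1.0, 1.0, 0.0, 0.0])
print(masked_mean_loss(losses, mask))  # 0.4 = (0.2 + 0.4 + 0.6) / 3
```

The extra cost is one elementwise multiply and one reduction over the batch axis, which is why making it opt-in seems reasonable.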

@Ebanflo42
Collaborator Author

Still need to do layer construction for dynamic batch normalization
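For batch normalization, the same mask would also have to enter the batch statistics themselves, not just the loss average. A hypothetical NumPy sketch (names and shapes assumed, not this project's API) of computing mean and variance over only the real samples:

```python
import numpy as np

def masked_batch_stats(x, mask):
    """Batch-norm statistics computed over only the real samples.

    x: shape (batch_size, features), possibly padded along the batch axis
    mask: shape (batch_size,), 1.0 for real samples, 0.0 for padding
    """
    m = mask[:, None]                          # broadcast mask over features
    n = np.sum(mask)                           # number of real samples
    mean = np.sum(x * m, axis=0) / n
    var = np.sum(((x - mean) ** 2) * m, axis=0) / n
    return mean, var

x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [50.0, 60.0]])                   # last row is padding
mask = np.array([1.0, 1.0, 0.0])
mean, var = masked_batch_stats(x, mask)
print(mean)  # [2. 3.]
print(var)   # [1. 1.]
```

Without the mask, the padded row would corrupt both the normalization of the current batch and any running statistics kept for inference.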
