When using multiple GPUs (n > 1) with a batch size that isn't divisible by the number of GPUs, you'll encounter a `ZeroDivisionError`. I understand that the library needs to split the batch evenly across devices. The tf.data API has something called `drop_remainder` for this. My concern is that during evaluation or inference we can't drop any samples, since every sample has to count toward the measured performance. That means we'd need some sort of padding instead.
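For reference, this is how `drop_remainder` behaves in the tf.data API: it guarantees fixed-size batches by discarding the final partial batch, which is exactly why it works for training but not for evaluation.

```python
import tensorflow as tf

# 10 samples with batch size 4: the final batch only has 2 samples.
dataset = tf.data.Dataset.range(10)

# drop_remainder=True discards that final partial batch, so every batch
# is exactly `batch_size` and splits evenly across devices -- but it
# silently drops samples, which is unacceptable for evaluation.
train_ds = dataset.batch(4, drop_remainder=True)   # batches: [0..3], [4..7]
eval_ds = dataset.batch(4, drop_remainder=False)   # batches: [0..3], [4..7], [8, 9]

for batch in eval_ds:
    print(batch.numpy())
```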
In short:
The library should handle this automatically when dispatching data, right? The remaining samples would simply be allocated to one of the available GPUs.
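A minimal sketch of the padding workaround, assuming plain NumPy batches; the helper `pad_batch_for_devices` is hypothetical and not part of any library's API:

```python
import numpy as np

def pad_batch_for_devices(batch, num_devices):
    """Pad `batch` along axis 0 so its length is divisible by `num_devices`.

    Returns the padded batch and a boolean mask marking real samples (True)
    versus padding (False). Illustrative sketch only, not an existing API.
    """
    n = len(batch)
    remainder = n % num_devices
    if remainder == 0:
        return batch, np.ones(n, dtype=bool)
    pad = num_devices - remainder
    # Repeat the last sample as padding; its predictions get masked out later.
    padded = np.concatenate([batch, np.repeat(batch[-1:], pad, axis=0)], axis=0)
    mask = np.concatenate([np.ones(n, dtype=bool), np.zeros(pad, dtype=bool)])
    return padded, mask

# Example: a final batch of 3 samples on 2 GPUs becomes 4 samples plus a mask.
batch = np.arange(3 * 5, dtype=np.float32).reshape(3, 5)
padded, mask = pad_batch_for_devices(batch, num_devices=2)
# predictions = model(padded); metric.update(predictions[mask], labels)
```

Padding with a repeated sample keeps the per-device split even, while the mask ensures the duplicated predictions never enter the metric.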