multi-GPU support still WIP #17

Open
anttiryt opened this issue Apr 25, 2023 · 0 comments

anttiryt commented Apr 25, 2023

According to this post, the model needs to be created inside a distribution strategy scope:

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():

That alone is still not enough. Training then fails in the custom training loop with:

File "/home/rac/slomo/multifix/super-slomo/train.py", line 207, in train_step  *
    loss_values = loss_obj.compute_losses(
File "/home/rac/slomo/multifix/super-slomo/models/losses.py", line 122, in compute_losses  *
    p_rec_loss += self.reconstruction_loss(true, pred)
File "/home/rac/slomo/multifix/super-slomo/models/losses.py", line 26, in reconstruction_loss  *
    return self.mae(y_true, y_pred)
File "/home/rac/slomo/tf-cuda.env/lib/python3.10/site-packages/keras/losses.py", line 166, in __call__  **
    reduction = self._get_reduction()
File "/home/rac/slomo/tf-cuda.env/lib/python3.10/site-packages/keras/losses.py", line 217, in _get_reduction
    raise ValueError(

ValueError: Please use `tf.keras.losses.Reduction.SUM` or `tf.keras.losses.Reduction.NONE` for loss reduction when losses are used with `tf.distribute.Strategy` outside of the built-in training loops. You can implement `tf.keras.losses.Reduction.SUM_OVER_BATCH_SIZE` using global batch size like:
```
with strategy.scope():
    loss_obj = tf.keras.losses.CategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE)
....
    loss = tf.reduce_sum(loss_obj(labels, predictions)) * (1. / global_batch_size)
```
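
A minimal sketch of how the suggestion in the error message might be applied to the `reconstruction_loss` from the traceback. It assumes `self.mae` is a `tf.keras.losses.MeanAbsoluteError`, which the traceback does not show. `GLOBAL_BATCH_SIZE` and the standalone function are hypothetical names for illustration, not the repository's actual code.

```python
import tensorflow as tf

# Hypothetical value: per-replica batch size times the number of replicas.
GLOBAL_BATCH_SIZE = 8

# reduction="none" is the string value of tf.keras.losses.Reduction.NONE.
# The loss object then returns unreduced values, so the check that raises
# the ValueError under tf.distribute.Strategy is not triggered.
mae = tf.keras.losses.MeanAbsoluteError(reduction="none")


def reconstruction_loss(y_true, y_pred, global_batch_size=GLOBAL_BATCH_SIZE):
    # MAE averages over the last axis only, so image batches of shape
    # [batch, H, W, C] come back as [batch, H, W].
    unreduced = mae(y_true, y_pred)
    # Average the remaining non-batch axes to get one value per example.
    extra_axes = list(range(1, unreduced.shape.rank))
    per_example = tf.reduce_mean(unreduced, axis=extra_axes)
    # Sum over the local batch and divide by the GLOBAL batch size, as the
    # error message recommends.
    return tf.nn.compute_average_loss(
        per_example, global_batch_size=global_batch_size
    )
```

Because each replica divides by the global batch size, adding up the per-replica results (for example with `strategy.reduce(tf.distribute.ReduceOp.SUM, ...)`) gives the mean over the whole global batch. The other loss terms in `compute_losses` would presumably need the same treatment if they also use Keras loss objects with the default reduction.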