Use GPU for SGD weight update calculations #88
Comments
I've hacked in
I've tried to add those scalars to
I've tried to move initialization from
Any ideas?
Edit: there is
Before patch:
After patch:
2x speedup and lower CPU usage, nice!
Currently weight updates are calculated on the `Native` backend. Profiling shows that about 40% of CPU time is spent doing the corresponding BLAS operations. Another 40% is in an area without debug info, quite likely the NVIDIA driver doing I/O. At the same time, according to `nvidia-smi`, GPU load is about 20% even on my relatively slow GTX 960.

I think it's possible to get a 3x-5x speedup if weight updates are implemented on the GPU. It should be quite easy, since the update is a simple BLAS operation
y = a * x + b * y
where `a` and `b` are scalars, and `x` and `y` are tensors of equal dimensions.
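For reference, here is a minimal CPU sketch of the update `y = a * x + b * y`, with tensors flattened to plain Python lists. The names `axpby` and `sgd_step` are hypothetical and are not part of this project's API. The mapping to plain SGD (`a = -lr`, `b = 1`) is my assumption about how the scalars would be chosen. On the GPU the same operation could presumably be expressed as a scale of `y` followed by an axpy (for example cuBLAS `cublasSscal` then `cublasSaxpy`), but that is a suggestion, not what the code currently does.

```python
def axpby(a, x, b, y):
    """Update y in place so that y[i] = a * x[i] + b * y[i]; returns y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of elements")
    for i in range(len(y)):
        y[i] = a * x[i] + b * y[i]
    return y


def sgd_step(weights, grads, lr):
    """Plain SGD written as axpby (assumed mapping): w = (-lr) * g + 1 * w."""
    return axpby(-lr, grads, 1.0, weights)
```

Calling `sgd_step(weights, grads, lr)` modifies `weights` in place, which mirrors how a BLAS-style routine overwrites `y` rather than allocating a new tensor.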