CUDA error during PyTorch distributed training... #63
Comments
Hi, what hardware are you using? See an interesting PR about it here
Hi, I have 3 NVIDIA 1080 Ti cards, and I am sure they have the same compute capabilities... Here is my GPU info:
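One way to confirm that every visible GPU reports the same compute capability is sketched below. It is a minimal check, not part of this repo: the helper names `collect_capabilities` and `mismatched_devices` are made up for illustration, and the script assumes PyTorch is installed (it returns an empty result otherwise). A mismatch between the architecture a CUDA extension was compiled for and the architecture of the device is a common cause of `invalid device function`, so this is a cheap thing to rule out first.

```python
from collections import Counter


def collect_capabilities():
    """Return {device_index: (major, minor)} for all visible CUDA devices.

    Returns an empty dict when PyTorch is not installed or CUDA is unavailable.
    """
    try:
        import torch
    except ImportError:
        return {}
    if not torch.cuda.is_available():
        return {}
    return {
        i: torch.cuda.get_device_capability(i)
        for i in range(torch.cuda.device_count())
    }


def mismatched_devices(caps):
    """Return indices of devices whose capability differs from the most common one."""
    if not caps:
        return []
    most_common, _ = Counter(caps.values()).most_common(1)[0]
    return sorted(i for i, cap in caps.items() if cap != most_common)


if __name__ == "__main__":
    caps = collect_capabilities()
    for index, (major, minor) in sorted(caps.items()):
        # A GTX 1080 Ti is expected to report (6, 1), i.e. sm_61.
        print(f"cuda:{index} -> sm_{major}{minor}")
    print("mismatched devices:", mismatched_devices(caps))
```

If the list of mismatched devices is empty, the cards agree with each other, and the next thing to check is which architectures the extension itself was built for.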
OK, so this is not the problem. I just tested your code on my computer, which has one 1080 Ti, and I didn't get the "segmentation fault" at the end of your script. How did you install the correlation module? From pip? From source? It might not be the root cause, but I can only advise you to upgrade to PyTorch 1.7 for now and try to install from this repo with setup.py
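After reinstalling, a quick way to see whether a module imports at all, and what error it raises if not, is sketched below. This is only an illustrative helper: `try_import` is a made-up name, and `my_extension` is a placeholder to be replaced with the actual import name of the correlation module.

```python
import importlib


def try_import(module_name):
    """Try to import module_name and return (ok, detail).

    On success, detail is the module's __version__ if it defines one.
    On failure, detail is the exception type and message.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:
        # Compiled extensions can fail with ImportError (for example an
        # undefined symbol) or with other errors raised at load time.
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__version__", "unknown version")


if __name__ == "__main__":
    # "my_extension" is a placeholder, not the real module name.
    for name in ("torch", "my_extension"):
        ok, detail = try_import(name)
        status = "ok" if ok else "FAILED"
        print(f"{name}: {status} ({detail})")
```

Posting the exact exception text from a failed import usually makes it easier to tell a build mismatch apart from a missing installation.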
Thanks a lot! I installed the correlation module from pip. I will upgrade PyTorch to 1.7 tomorrow and reply to you!
Hi @ClementPinard, I installed PyTorch 1.7.1 and then used pip to install the tool. There were no warnings or errors during installation, but I cannot import this repo:
When I install the module via
and for single-GPU training:
When I use PyTorch 1.1 and install via
It is really odd, and I do not know how to deal with it. Could you provide some suggestions?
Hi, thanks for your contribution. When I use distributed training, there is always a RuntimeError:
RuntimeError: CUDA error: invalid device function
Here is my test code:
My environment is
The full error output is:
For non-distributed training, there is no error, but there is still some strange output: