I have two .h/.cu pairs, each implementing a similarly structured C++ object with member device vectors and methods that use Thrust functions. Those Thrust calls take functors defined as structs locally in the respective .cu files (not mentioned in the .h files, so no cross-#include issues). Both sides use the same namespace but different C++ class names. There are five functors in each case: three with different names, signatures and bodies; one with the same name, signature and body (which I will factor out); and one with the same name but a different signature and body.
One of these code paths runs correctly, the second falls over with Thrust/CUDA memory allocation corruption after the step which invokes that functor which has the same name on both sides.
Renaming that functor in one side fixes the problem, as does putting an anonymous namespace around all the functors on both sides.
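For illustration, here is a minimal host-only sketch of the anonymous-namespace workaround. It is not the real code: the names `sim`, `Transform` and `scale_all` are hypothetical, and `std::transform` stands in for the Thrust call so the snippet builds without nvcc. In the actual .cu files the call operator would presumably be marked `__host__ __device__` and the call would be `thrust::transform` over device vectors.

```cpp
#include <algorithm>
#include <vector>

namespace sim {   // hypothetical namespace shared by both .cu files
namespace {       // anonymous namespace: everything inside is local to this translation unit

// Hypothetical functor. Another file may define its own sim::Transform with a
// different signature and body; inside an anonymous namespace the two are
// distinct types and do not clash.
struct Transform {
    int factor;
    int operator()(int x) const { return x * factor; }
};

} // anonymous namespace

// Hypothetical method body: applies the file-local functor to every element.
std::vector<int> scale_all(const std::vector<int>& in, int factor) {
    std::vector<int> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(), Transform{factor});
    return out;
}

} // namespace sim
```

Calling `sim::scale_all({1, 2, 3}, 2)` applies the file-local functor to each element. Without the anonymous namespace, two different `sim::Transform` definitions in separate files would share one externally visible name.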
My first suspicion was that, because the namespacing was the same and it was a struct (so no additional C++ mangling), the Thrust run-time was somehow calling the same version of that functor from both code paths. I tried putting a printf in each functor with different messages, but those messages came out as expected, so it does not appear to be this simple.
I can think of any number of reasons why spotting and rejecting this situation at CPU compile time would be hard, as they are separate files and separate compilation units, and the pre-compiled CUDA kernel isn't "linked" until run-time. Still, I would have expected normal C/C++ compilation-unit-scope rules to apply, and therefore for what I did not to be a problem.
This is all on Ubuntu 18.04 with gcc 7.3.0 and CUDA/nvcc/Thrust 10.0.130.
Actual code supplied on request.
Please provide the source code. I'm not 100% sure of this, but your description smells like an ODR violation (specifically a violation of [basic.def.odr]/10). ODR violations are extremely hard to diagnose and tend to result in behavior that is completely unpredictable ("ill-formed, no diagnostic required" is pretty much the compile-time equivalent of undefined behavior: all bets are off).