Hotfix/ccomplex #1485
The code change looks good to me, but I'm curious: why would the typename I matter here?
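A likely explanation, offered as a sketch rather than a confirmed diagnosis: C's complex.h defines a macro named I for the imaginary unit, so if that macro is visible in a C++ translation unit, the preprocessor rewrites any template parameter called I before the compiler ever parses it. The snippet below imitates this with a hypothetical stand-in macro (the real header expands I to _Complex_I) and shows two ways around it; the names twice and thrice are illustrative and do not come from QUDA.

```cpp
// Stand-in for the macro that C's <complex.h> provides.
// ASSUMPTION: the value 1.0f is hypothetical; the real header expands I to _Complex_I.
#define I 1.0f

// While the macro is active, `template <typename I>` would be preprocessed
// into `template <typename 1.0f>` and fail to parse.

// Option 1: choose a parameter name that no macro claims.
template <typename Int> Int twice(Int x) { return x + x; }

// Option 2: drop the macro before declaring the template
// (or avoid including the header at all).
#ifdef I
#undef I
#endif
template <typename I> I thrice(I x) { return x + x + x; }
```

Whether the macro leaks into C++ code depends on the compiler and its standard library headers, which would be consistent with the failure showing up only for some compilers.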
I am compiling this branch on Vista at TACC. Things seemed to be going well, getting past the point where there was an error before. However, there was a long pause here:
[ 78%] Building CUDA object lib/CMakeFiles/quda.dir/dslash_domain_wall_4d_m5pre.cu.o
[ 78%] Building CUDA object lib/CMakeFiles/quda.dir/dslash_domain_wall_4d_m5pre_m5inv.cu.o
[ 79%] Building CUDA object lib/CMakeFiles/quda.dir/dslash_domain_wall_4d_m5inv_m5pre.cu.o
[ 79%] Building CUDA object lib/CMakeFiles/quda.dir/dslash_domain_wall_4d_m5inv_m5inv.cu.o
[ 79%] Building CUDA object lib/CMakeFiles/quda.dir/dslash_domain_wall_4d_m5mob.cu.o
[ 79%] Building CUDA object lib/CMakeFiles/quda.dir/dslash_domain_wall_4d_m5pre_m5mob.cu.o
[ 80%] Building CUDA object lib/CMakeFiles/quda.dir/dslash_pack2.cu.o
[ 80%] Building CUDA object lib/CMakeFiles/quda.dir/laplace.cu.o
[ 80%] Building CUDA object lib/CMakeFiles/quda.dir/covariant_derivative.cu.o
[ 80%] Building CUDA object lib/CMakeFiles/quda.dir/staggered_quark_smearing.cu.o
Then compilation failed on a host reference function:
[ 84%] Building CXX object tests/CMakeFiles/quda_test.dir/host_reference/gauge_force_reference.cpp.o
[ 84%] Building CXX object tests/CMakeFiles/quda_test.dir/utils/misc.cpp.o
"/home1/00282/tg455536/from_frontera/compile_vista/build/_deps/eigen-src/Eigen/src/Core/arch/NEON/Complex.h", line 397: error: statement expressions are only allowed in block scope
static uint64x2_t p2ul_CONJ_XOR = vld1q_u64( p2ul_conj_XOR_DATA );
^
1 error detected in the compilation of "/home1/00282/tg455536/from_frontera/compile_vista/quda/tests/host_reference/clover_force_reference.cpp".
make[2]: *** [tests/CMakeFiles/quda_test.dir/build.make:296: tests/CMakeFiles/quda_test.dir/host_reference/clover_force_reference.cpp.o] Error 2
Is this something specific to ARM? I am not sure what NEON refers to.
Thanks,
Steve
On Aug 13, 2024, at 4:39 PM, maddyscientist ***@***.***> wrote:
Fixes a bug observed with some compilers (nvc++, rocm clang) where the compiler fails to compile if:
* The source code includes the complex.h / complex header
* A template class name is given as I
I've also fixed a warning with nvc++ and reduced the argument size for the dilution kernel.
Commit Summary
* c76ae44 Remove complex.h inclusion
* 7ef6d87 Fix nvc++ warning
* 1ec0607 Reduce arg size for spinor dilution
File Changes
(4 files<https://github.com/lattice/quda/pull/1485/files>)
* M include/kernels/spinor_dilute.cuh<https://github.com/lattice/quda/pull/1485/files#diff-c8996253331dd5b951565183eb4c563b377e0a494ea3bf460f1d9941ed2a872f> (2)
* M lib/interface_quda.cpp<https://github.com/lattice/quda/pull/1485/files#diff-0cea12be36de2a7423cca02391b81d66534bda503b565735dc5b4000f4fcda10> (1)
* M lib/targets/cuda/blas_lapack_cublas.cpp<https://github.com/lattice/quda/pull/1485/files#diff-38727af1272c2f4e15d727785175dd75e678fd5edc476cf1d66992d4960382b7> (1)
* M tests/staggered_eigensolve_test_gtest.hpp<https://github.com/lattice/quda/pull/1485/files#diff-f3931e42a3ceb9e7422ad9c94eac8ecaa9ad74dca3e3c2e6b1897c737ff39898> (7)
Patch Links:
* https://github.com/lattice/quda/pull/1485.patch
* https://github.com/lattice/quda/pull/1485.diff
Thanks @stevengottlieb for testing this. This is indeed an Arm issue: NEON is Arm's vector instruction set, comparable to SSE on Intel. Moreover, I see the failure occurs while compiling the Eigen headers. I'll investigate and report back.
Thanks, Kate! I appreciate your help.
@stevengottlieb this seems to be a bug in Eigen. The patch to apply is to replace
#if EIGEN_COMP_CLANG || EIGEN_COMP_CASTXML
with
#if EIGEN_COMP_CLANG || EIGEN_COMP_CASTXML || __NVCOMPILER_LLVM__
in $PATH_TO_QUDA_BUILD_DIR/_deps/eigen-src/Eigen/src/Core/arch/NEON/Complex.h. Can you verify this fixes the issue for you?
I'm trying to work out the best way to fix this issue in the short term (until Eigen is fixed at source).
@maddyscientist Thanks, Kate.
I found two such lines in the Complex.h file and applied the fix to both. I then returned to the build directory and typed
make -j 32
Everything seems fine now.
Thanks again!
Steve
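For readers wondering what a guard like the one in the patch can buy, here is a minimal sketch of the general pattern, not Eigen's actual code. It assumes the error arises because the intrinsic expands to a statement expression, which is only legal inside a function body, so the workaround branch performs the load at block scope instead of in a namespace-scope initializer. The names Vec2, load2, conj_xor_data and conj_xor, and the data values, are hypothetical stand-ins; only __NVCOMPILER_LLVM__ is taken from the thread.

```cpp
#include <cstdint>

// Hypothetical stand-in for a two-lane NEON vector type.
struct Vec2 { std::uint64_t lane[2]; };

// Hypothetical stand-in for a load intrinsic such as vld1q_u64.
inline Vec2 load2(const std::uint64_t* p) { return Vec2{{p[0], p[1]}}; }

// Illustrative data: a mask that flips the sign bit of the second lane.
static const std::uint64_t conj_xor_data[2] = {0x0ULL, 0x8000000000000000ULL};

#if defined(__clang__) || defined(__NVCOMPILER_LLVM__)
// Workaround branch: the load runs inside a function body (block scope).
inline Vec2 conj_xor() { return load2(conj_xor_data); }
#else
// Default branch: the load runs in a namespace-scope initializer.
static const Vec2 conj_xor_value = load2(conj_xor_data);
inline Vec2 conj_xor() { return conj_xor_value; }
#endif
```

Both branches expose the same conj_xor() accessor, so callers do not need to know which one the compiler selected.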
…e found (which will cause infinite recursion)
Thanks for confirming the fix works @stevengottlieb. I've updated the build system to now apply the patch automatically if using the NVHPC compiler, so this should now work out of the box for you.
cscs-ci run |
Thanks @maddyscientist. I started a fresh compile yesterday on Vista and noticed that the build completed without my having to edit the Complex.h. I was wondering how that came about.
cscs-ci run |
Fixes a bug observed with some compilers (nvc++, rocm clang) where the compiler fails to compile if:
* The source code includes the complex.h / complex header
* A template class name is given as I
I've also fixed a warning with nvc++ and reduced the argument size for the dilution kernel.