Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OMP target implementation #2

Open
wants to merge 19 commits into
base: master
Choose a base branch
from
Open

OMP target implementation #2

wants to merge 19 commits into from

Conversation

tom91136
Copy link
Member

This PR contains the full OMP target implementation, it has been tested on the following configurations with correct results:

  • LLVM10 + GTX2080Ti
  • GCC10 + GTX2080Ti
  • GCC10 + GT210
  • ICC + IrisPro580

GCC 10.1 with RadeonVII fails at link time with the following:

/usr/bin/cmake3 -E cmake_link_script CMakeFiles/clover_leaf.dir/link.txt --verbose=1
/nfs/software/x86_64/gcc/10.1.0/bin/g++  -O2 -DNDEBUG  -fopenmp -foffload=amdgcn-amdhsa=-march=gfx906 -foffload=-lm -fno-fast-math -fno-associative-math -Wl,-rpath -Wl,/nfs/software/x86_64/openmpi/4.0.1/gcc-8.3.0/lib -Wl,--enable-new-dtags -pthread CMakeFiles/clover_leaf.dir/sr
c/accelerate.cpp.o CMakeFiles/clover_leaf.dir/src/advec_cell.cpp.o CMakeFiles/clover_leaf.dir/src/advec_mom.cpp.o CMakeFiles/clover_leaf.dir/src/advection.cpp.o CMakeFiles/clover_leaf.dir/src/build_field.cpp.o CMakeFiles/clover_leaf.dir/src/finalise_field.cpp.o CMakeFiles/clove
r_leaf.dir/src/calc_dt.cpp.o CMakeFiles/clover_leaf.dir/src/clover_leaf.cpp.o CMakeFiles/clover_leaf.dir/src/comms.cpp.o CMakeFiles/clover_leaf.dir/src/field_summary.cpp.o CMakeFiles/clover_leaf.dir/src/flux_calc.cpp.o CMakeFiles/clover_leaf.dir/src/generate_chunk.cpp.o CMakeFi
les/clover_leaf.dir/src/hydro.cpp.o CMakeFiles/clover_leaf.dir/src/ideal_gas.cpp.o CMakeFiles/clover_leaf.dir/src/initialise_chunk.cpp.o CMakeFiles/clover_leaf.dir/src/initialise.cpp.o CMakeFiles/clover_leaf.dir/src/pack_kernel.cpp.o CMakeFiles/clover_leaf.dir/src/PdV.cpp.o CMa
keFiles/clover_leaf.dir/src/read_input.cpp.o CMakeFiles/clover_leaf.dir/src/report.cpp.o CMakeFiles/clover_leaf.dir/src/reset_field.cpp.o CMakeFiles/clover_leaf.dir/src/revert.cpp.o CMakeFiles/clover_leaf.dir/src/start.cpp.o CMakeFiles/clover_leaf.dir/src/timer.cpp.o CMakeFiles
/clover_leaf.dir/src/timestep.cpp.o CMakeFiles/clover_leaf.dir/src/update_halo.cpp.o CMakeFiles/clover_leaf.dir/src/update_tile_halo.cpp.o CMakeFiles/clover_leaf.dir/src/update_tile_halo_kernel.cpp.o CMakeFiles/clover_leaf.dir/src/viscosity.cpp.o CMakeFiles/clover_leaf.dir/src/
visit.cpp.o  -o clover_leaf  /nfs/software/x86_64/openmpi/4.0.1/gcc-8.3.0/lib/libmpi.so /nfs/software/x86_64/gcc/10.1.0/lib64/libgomp.so -lpthread 
unhandled relocation type
UNREACHABLE executed at /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp:83!
Stack dump:
0.      Program arguments: /nfs/software/x86_64/gcc/10.1.0/lib/gcc/x86_64-pc-linux-gnu/10.1.0/accel/amdgcn-amdhsa/../../../../../../amdgcn-amdhsa/bin/as -triple=amdgcn--amdhsa -mattr=-code-object-v3 -mcpu=gfx906 -filetype=obj -o /tmp/ccvYqwBZ.o /tmp/ccux38O6.mkoffload.2.s 
 #0 0x0000000000759eeb llvm::sys::PrintStackTrace(llvm::raw_ostream&) /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/Support/Unix/Signals.inc:533:0
 #1 0x0000000000759f7c PrintStackTraceSignalHandler(void*) /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/Support/Unix/Signals.inc:594:0
 #2 0x0000000000758181 llvm::sys::RunSignalHandlers() /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/Support/Signals.cpp:68:0
 #3 0x000000000075996d SignalHandler(int) /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/Support/Unix/Signals.inc:385:0
 #4 0x00007f4c76cac5d0 __restore_rt (/lib64/libpthread.so.0+0xf5d0)
 #5 0x00007f4c757d32c7 raise (/lib64/libc.so.6+0x362c7)
 #6 0x00007f4c757d49b8 abort (/lib64/libc.so.6+0x379b8)
 #7 0x000000000070070e bindingsErrorHandler(void*, std::string const&, bool) /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/Support/ErrorHandling.cpp:218:0
 #8 0x000000000051feaa (anonymous namespace)::AMDGPUELFObjectWriter::getRelocType(llvm::MCContext&, llvm::MCValue const&, llvm::MCFixup const&, bool) const /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp:84:0
 #9 0x000000000065f4b0 (anonymous namespace)::ELFObjectWriter::recordRelocation(llvm::MCAssembler&, llvm::MCAsmLayout const&, llvm::MCFragment const*, llvm::MCFixup const&, llvm::MCValue, unsigned long&) /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/MC/ELFObject
Writer.cpp:1492:0
#10 0x00000000005a9522 llvm::MCAssembler::handleFixup(llvm::MCAsmLayout const&, llvm::MCFragment&, llvm::MCFixup const&) /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/MC/MCAssembler.cpp:754:0
#11 0x00000000005a9f48 llvm::MCAssembler::layout(llvm::MCAsmLayout&) /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/MC/MCAssembler.cpp:849:0
#12 0x00000000005aa0bc llvm::MCAssembler::Finish() /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/MC/MCAssembler.cpp:864:0
#13 0x00000000006089a1 llvm::MCObjectStreamer::FinishImpl() /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/MC/MCObjectStreamer.cpp:731:0
#14 0x00000000005ed2c4 llvm::MCELFStreamer::FinishImpl() /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/MC/MCELFStreamer.cpp:677:0
#15 0x0000000000612d85 llvm::MCStreamer::Finish() /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/MC/MCStreamer.cpp:914:0
#16 0x0000000000675ba0 (anonymous namespace)::AsmParser::Run(bool, bool) /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/lib/MC/MCParser/AsmParser.cpp:981:0
#17 0x0000000000408533 AssembleInput(char const*, llvm::Target const*, llvm::SourceMgr&, llvm::MCContext&, llvm::MCStreamer&, llvm::MCAsmInfo&, llvm::MCSubtargetInfo&, llvm::MCInstrInfo&, llvm::MCTargetOptions&) /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/tools/ll
vm-mc/llvm-mc.cpp:301:0
#18 0x00000000004099fb main /nfs/home/tom/software/gcc-10.1-offload/llvm-project/llvm/tools/llvm-mc/llvm-mc.cpp:505:0
#19 0x00007f4c757bf495 __libc_start_main (/lib64/libc.so.6+0x22495)
#20 0x0000000000407b29 _start (/nfs/software/x86_64/gcc/10.1.0/lib/gcc/x86_64-pc-linux-gnu/10.1.0/accel/amdgcn-amdhsa/../../../../../../amdgcn-amdhsa/bin/as+0x407b29)

Note that GCC+RadeonVII worked correctly at 6456a30, it was then broken in d7857cd.
The changes between the two commits are non trivial so investigating this will take some time.

This PR will try to address the error if time allows, but it is good for review as is.

Copy link
Contributor

@tomdeakin tomdeakin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of great stuff here. The kernels all look good to me.
Some specific comments throughout that it would be good to look over.
Thanks for the good work here, and the great documentation in the README about building it with different compilers.

src/definitions.h Outdated Show resolved Hide resolved
src/advec_cell.cpp Outdated Show resolved Hide resolved
src/viscosity.cpp Show resolved Hide resolved
src/clover_leaf.cpp Show resolved Hide resolved
src/finalise_field.cpp Outdated Show resolved Hide resolved
src/finalise_field.cpp Outdated Show resolved Hide resolved
src/pack_kernel.cpp Outdated Show resolved Hide resolved
src/comms.cpp Show resolved Hide resolved
src/comms.cpp Outdated Show resolved Hide resolved
@tom91136
Copy link
Member Author

All comments have been addressed. This is ready for merge I think.

Leave out native opt flags (to be added by the user)
Copy link

@andreipoe andreipoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure you update the README with the new names. We should also probably remove CentOS 7 as a prerequisite, since the application should work everywhere.

Copy link
Contributor

@tomdeakin tomdeakin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the justification for changing the binary name? It's clover_leaf in all other implementations, so we should just leave that.

@tom91136
Copy link
Member Author

Ah, so for cloverleaf_kokkos, it uses cloverleaf, and I thought I have aligned all of them to use that.
I can revert this and change the kokkos one to use clover_leaf, or I can rename everything else to use cloverleaf, although that would break all existing build scripts.

@tom91136
Copy link
Member Author

tom91136 commented Feb 16, 2021

I think I'll go with the easy path now, which is to rename the kokkos' one to clover_leaf.

@tomdeakin
Copy link
Contributor

Ah, so for cloverleaf_kokkos, it uses cloverleaf, and I thought I have aligned all of them to use that.
I can revert this and change the kokkos one to use clover_leaf, or I can rename everything else to use cloverleaf, although that would break all existing build scripts.

We won't change the reference versions, so all ports should match that. I guess the Kokkos one slipped through.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants