
SYCL Academy

Exercise 9: Synchronization


In this exercise you will learn how to use different techniques for synchronizing commands and data.


1.) Waiting on events

Take a look at the vector add applications using the buffer/accessor model in exercise 6 and the USM model in exercise 8, and familiarize yourself with how they call wait on the returned events to synchronize with the completion of the work.
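For reference, here is a minimal sketch of the USM pattern, not the exercise source: the names devA, devB, devR and the size dataSize are illustrative, and the exercise code also initializes the device data from the host first.

#include <sycl/sycl.hpp>

int main() {
  constexpr std::size_t dataSize = 1024;
  sycl::queue q;

  // Illustrative USM device allocations.
  float *devA = sycl::malloc_device<float>(dataSize, q);
  float *devB = sycl::malloc_device<float>(dataSize, q);
  float *devR = sycl::malloc_device<float>(dataSize, q);

  // Each submission returns a sycl::event; calling wait() on it blocks
  // the host until that particular command has completed.
  sycl::event e = q.parallel_for(
      sycl::range{dataSize},
      [=](sycl::id<1> idx) { devR[idx] = devA[idx] + devB[idx]; });
  e.wait();

  sycl::free(devA, q);
  sycl::free(devB, q);
  sycl::free(devR, q);
}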

2.) Waiting on queues

Now convert those same applications to call wait on the queue instead to synchronize.
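Continuing the sketch above (same assumed names), the returned event can be ignored and the whole queue waited on instead.

// Waiting on the queue blocks the host until every command submitted
// to q so far has completed, so individual events need not be kept.
q.parallel_for(sycl::range{dataSize},
               [=](sycl::id<1> idx) { devR[idx] = devA[idx] + devB[idx]; });
q.wait();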

3.) Buffer destruction

Take a look at the vector add application using the buffer/accessor model in exercise 6 and how it synchronizes on the destruction of the buffers.
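A rough sketch of that pattern is shown below; the vector names a, b and r are assumptions standing in for the host data in the exercise, and q is the queue from the sketch in step 1. The buffers are created in their own scope so that their destructors synchronize.

// a, b and r are std::vector<float> of size dataSize on the host.
{
  sycl::buffer bufA{a.data(), sycl::range{dataSize}};
  sycl::buffer bufB{b.data(), sycl::range{dataSize}};
  sycl::buffer bufR{r.data(), sycl::range{dataSize}};

  q.submit([&](sycl::handler &cgh) {
    sycl::accessor accA{bufA, cgh, sycl::read_only};
    sycl::accessor accB{bufB, cgh, sycl::read_only};
    sycl::accessor accR{bufR, cgh, sycl::write_only};
    cgh.parallel_for(sycl::range{dataSize}, [=](sycl::id<1> idx) {
      accR[idx] = accA[idx] + accB[idx];
    });
  });
}  // leaving the scope destroys the buffers: destruction waits for the
   // kernel to finish and writes the result back to r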

4.) Copy back

Take a look at those two applications again and familiarize yourself with how the result of the computation is copied back to the host.

In the case of the application using the buffer/accessor model note how this occurs implicitly on the destruction of the buffer.

In the case of the application using the USM model note how this occurs explicitly by calling memcpy.
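In the USM sketch from step 1, that explicit copy back might look like the following, with r an assumed host std::vector<float> of size dataSize.

// queue::memcpy returns an event; waiting on it ensures the result has
// actually arrived in host memory before it is read.
q.memcpy(r.data(), devR, sizeof(float) * dataSize).wait();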

5.) Host accessor

Finally, in the application using the buffer/accessor model, introduce a host accessor by calling get_host_access on the buffer. The host accessor can be used to check the result of the computation on the host while the buffer is still alive.

Remember to do this within a scope to ensure the host accessor is destroyed.

Also note that creating a host accessor may copy the data back to the original pointer provided to the buffer, but this is not guaranteed.
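A minimal sketch, building on the buffer example in step 3 and placed inside the scope where bufR is still alive; the assert-based check (from <cassert>) is purely illustrative.

{
  // get_host_access blocks until the device work writing bufR is done,
  // then gives the host read access while the buffer is still alive.
  auto hostAccR = bufR.get_host_access(sycl::read_only);
  for (std::size_t i = 0; i < dataSize; ++i) {
    assert(hostAccR[i] == a[i] + b[i]);
  }
}  // the host accessor is destroyed here, releasing the buffer for any
   // further device work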

Build And Execution Hints

For DPC++: Using CMake to configure then build the exercise:

mkdir build
cd build
cmake .. "-GUnix Makefiles" -DSYCL_ACADEMY_USE_DPCPP=ON -DSYCL_ACADEMY_ENABLE_SOLUTIONS=OFF -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
make exercise_9

Alternatively from a terminal at the command line:

icpx -fsycl -o sycl-ex-9 -I../External/Catch2/single_include ../Code_Exercises/Exercise_09_Synchronization/source.cpp
./sycl-ex-9

On Intel DevCloud, computational applications are run by submitting jobs to a queue for execution on compute nodes; some features, such as longer walltime and multi-node computation, are only available through the job queue. Please refer to the guide.

To do this, wrap the binary in a job script named job_submission and run:

qsub job_submission

For AdaptiveCpp:

# <target specification> is a list of backends and devices to target, for example
# "omp;generic" compiles for CPUs with the OpenMP backend and GPUs using the generic single-pass compiler.
# The simplest target specification is "omp" which compiles for CPUs using the OpenMP backend.
cmake -DSYCL_ACADEMY_USE_ADAPTIVECPP=ON -DSYCL_ACADEMY_INSTALL_ROOT=/insert/path/to/adaptivecpp -DACPP_TARGETS="<target specification>" ..
make exercise_9

Alternatively, without CMake:

cd Code_Exercises/Exercise_09_Synchronization
/path/to/adaptivecpp/bin/acpp -o sycl-ex-9 -I../../External/Catch2/single_include --acpp-targets="<target specification>" source.cpp
./sycl-ex-9