Programming helpers
Some operations need to sum over all elements of an array on the device. Instead of running an `atomicAdd`, which is suboptimal and not binary reproducible, you can run a kernel that parallelises the reduction.
Use the reduction add operation defined in `src/headers/reduction_add.h`. To run a sum on an array on the device, use:

```cpp
double gpu_sum_on_device<BLOCKSIZE>(double *data_d, long length);
```
Arguments:

- `BLOCKSIZE`: template argument for the size of the blocks to work on. Must be a power of two; 1024 is a good value. The function sums over sub-arrays of this size on the device and then sums up the partial results on the host, so bigger is better.
- `data_d`: pointer to the memory on the device to operate on. This memory has to be aligned to memory boundaries; some buffers needed padding to get the correct alignment.
- `length`: size of the array to operate on. Does not need to be a power of two; the function will do the padding itself. Can be smaller than `BLOCKSIZE`.
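For example, a minimal usage sketch (the wrapper function and buffer names here are illustrative, not part of THOR):

```cpp
#include "reduction_add.h"

// Sum a device-resident array of doubles; Ek_d and num_points are
// illustrative names for a device buffer and its element count.
double sum_on_device_example(double *Ek_d, long num_points) {
    // BLOCKSIZE = 1024 satisfies the power-of-two requirement and keeps
    // most of the reduction work on the device.
    return gpu_sum_on_device<1024>(Ek_d, num_points);
}
```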
To work on vector values, CUDA provides some basic N-dimensional data types. THOR can use `double3` for 3D vectors. It defines operators in `src/headers/vector_operations.h` so that this datatype can be used in a more convenient way:
- standard math operators `+`, `-`, adding two vectors or a scalar to all elements of a vector (also as in-place operators `+=`, `-=`)
- multiplication by a scalar `*` and division by a scalar `/` (in place as `*=` and `/=`)
- dot product: `dot(v1, v2)`
- length: `length(v)`
- normalisation: `normalize(v)`
- cross product: `cross(v1, v2)`
Usage examples can be found in `src/grid/grid.cu`. Some versions of CUDA provide their own versions of these operators, but they are not available on all platforms tested.
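As a small sketch using only the operators listed above (the function itself and its `__host__ __device__` qualifiers are illustrative):

```cpp
#include "vector_operations.h"

// Unit normal of the triangle (a, b, c), built from the double3 operators
// defined in vector_operations.h; this function is an illustrative example.
__host__ __device__ double3 triangle_normal(double3 a, double3 b, double3 c) {
    double3 e1 = b - a;               // edge vectors of the triangle
    double3 e2 = c - a;
    return normalize(cross(e1, e2));  // unit vector orthogonal to both edges
}
```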
To load the data for each kernel execution on a rhomboid that needs horizontal data and neighbours, we need to load the data for the grid points in the rhomboid and the data for the nearest neighbours in the neighbouring rhomboids. The indexing to get the neighbours is not linear, so there is a helper function in `kernel_halo_helpers.h` to get the needed data for each kernel. See usage in `thor_adv_cor.h`. For each kernel index, it returns the index of the point to load and, if needed, the index of the halo point to load.
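As a rough, hypothetical sketch of the pattern (the helper name, its signature, and the stub body below are placeholders, not the actual API; see `kernel_halo_helpers.h` and `thor_adv_cor.h` for the real code):

```cpp
// Hypothetical stand-in for the real helper in kernel_halo_helpers.h:
// maps the (thread, block) position to a grid-point index and, for threads
// on the rhomboid edge, a halo-point index. Stubbed with trivial linear
// indexing and no halo so the sketch compiles.
__device__ bool compute_indices(int *maps_d, int nl_region,
                                int x, int y, int ib, int *ig, int *igh) {
    (void)maps_d;                                          // unused in stub
    *ig  = ib * nl_region * nl_region + y * nl_region + x; // grid point
    *igh = -1;                                             // no halo point
    return false;
}

__global__ void example_kernel(double *field_d, int *maps_d, int nl_region) {
    int ig, igh;
    bool has_halo = compute_indices(maps_d, nl_region,
                                    threadIdx.x, threadIdx.y, blockIdx.x,
                                    &ig, &igh);

    double value = field_d[ig];      // point owned by this thread
    if (has_halo) {
        double halo = field_d[igh];  // neighbour from the adjacent rhomboid
        // ... neighbour-dependent operations use value and halo ...
        (void)halo;
    }
    (void)value;
}
```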
To log to the console and to a log file storing the output, there is a helper function in `log_writer.h`. It wraps `printf` to also log to a file, and is used like `printf`. Call it with:

```cpp
log::printf(format, parameters);
```
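For example (the message and variables are illustrative):

```cpp
// Prints to the console and appends the same line to the run's log file.
log::printf("Step %d: mean temperature = %g K\n", nstep, Tmean);
```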
Data logging is done to several text files at each timestep, with functions in `log_writer.h`. Each file stores one line per step:
- `esp_diagnostics_<planetname>.txt`: diagnostics file; timing of the simulation and memory used
- `esp_global_<planetname>.txt`: global conservation values
- `esp_log_<planetname>.log`: simulation console output log
- `esp_write_log_<planetname>.txt`: log of valid files written. Appends the name of each HDF5 data file that has been written to disk, to check the validity of files and where to restart from in batch mode. This helps to know which files come from which run when re-running a simulation and overwriting the output.