Integrate changes from NERSC GPU hackathon. #713

* Disable cmake-format and clang-format checks.
* Disable GitLab CI except for NMODL + GPU.

* Add a hackathon-specific argument for benchmarks.
* Add a reference comparison for channel-benchmark.

* Create the build/benchmark folder before trying to use it.
* Run nrnivmodl-core in parallel rather than serially (serial was too slow).

…umber and update the related documentation (#700)

* Add memory pool for Random123 streams. This speeds up initialisation when running on GPU.
* Make Boost optional.
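The idea of pooling stream allocations can be illustrated with a minimal sketch. It assumes a fixed-size slot pool backed by plain host memory; `StreamState` and `StreamPool` are hypothetical names for illustration and are not the types used in this change, which may rely on a different allocator and on device-accessible memory.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a per-stream Random123 state; not the real type.
struct StreamState {
    unsigned counter[4];
    unsigned key[2];
};

// Hands out StreamState slots from large chunks, so that creating many
// streams costs one allocation per chunk instead of one per stream.
class StreamPool {
  public:
    explicit StreamPool(std::size_t chunk_size)
        : chunk_size_(chunk_size) {
        assert(chunk_size_ > 0);
    }

    StreamState* allocate() {
        // Reuse a previously released slot if one is available.
        if (!free_.empty()) {
            StreamState* s = free_.back();
            free_.pop_back();
            return s;
        }
        // Otherwise take the next slot, growing by one chunk when needed.
        if (chunks_.empty() || used_in_last_ == chunk_size_) {
            chunks_.push_back(std::make_unique<StreamState[]>(chunk_size_));
            used_in_last_ = 0;
        }
        return &chunks_.back()[used_in_last_++];
    }

    // Released slots go on a free list; memory is returned only when the pool dies.
    void deallocate(StreamState* s) {
        free_.push_back(s);
    }

    std::size_t chunk_count() const {
        return chunks_.size();
    }

  private:
    std::size_t chunk_size_;
    std::size_t used_in_last_ = 0;
    std::vector<std::unique_ptr<StreamState[]>> chunks_;
    std::vector<StreamState*> free_;
};
```

With a chunk size of 4, the first four streams share a single allocation and the fifth triggers a second chunk.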

This was a silly bug in #702.

* Simplify unified memory logic.
* Pass -mp=gpu when we pass -acc.
* Pass -gpu=lineinfo for better debug information.
* Pass -Minfo=accel,mp for better compile time diagnostics.
* Add nrn_pragma_{acc,omp} macros for single-source Open{ACC,MP} support.
* Call omp_set_default_device.
* Drop cc60 because of OpenMP offload incompatibility.
* Add --gpu to test.
* Default (BB5-valid) CORENRN_EXTERNAL_BENCHMARK_DATA.
* Remove cuda_add_library.
* Don't print number of GPUs when quiet.
* Set OMP_NUM_THREADS=1 for lfp_test.
* Update NMODL to emit nrn_pragma_{acc,omp} macros.

Co-authored-by: Pramod Kumbhar <[email protected]>
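The single-source approach behind the nrn_pragma_{acc,omp} macros can be sketched roughly as below. This is a minimal illustration built on the standard `_Pragma` operator; the `SKETCH_USE_OPENMP` and `SKETCH_USE_OPENACC` switches and the `scale` function are hypothetical, and the project's real configuration macros and definitions may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stringify helper so the macro argument can be handed to _Pragma.
#define SKETCH_STRINGIFY(x) #x

// Hypothetical build switches; the real project uses its own configuration macros.
#if defined(SKETCH_USE_OPENMP)
#define nrn_pragma_acc(x)
#define nrn_pragma_omp(x) _Pragma(SKETCH_STRINGIFY(omp x))
#elif defined(SKETCH_USE_OPENACC)
#define nrn_pragma_acc(x) _Pragma(SKETCH_STRINGIFY(acc x))
#define nrn_pragma_omp(x)
#else
// CPU-only build: both macros expand to nothing.
#define nrn_pragma_acc(x)
#define nrn_pragma_omp(x)
#endif

// One loop annotated for both programming models.
// At most one of the two pragmas survives preprocessing.
inline void scale(double* data, std::size_t n, double factor) {
    nrn_pragma_acc(parallel loop copy(data[0:n]))
    nrn_pragma_omp(target teams distribute parallel for map(tofrom: data[0:n]))
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

Compiled without either switch, the function is an ordinary serial loop, which is what makes the same source usable for CPU, OpenACC and OpenMP offload builds.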

* Add wrapper functions for using the OpenMP or OpenACC API.
* Add -mp=gpu to link the GPU runtime with tests as well.
* Avoid copying VecPlay members twice, otherwise association fails with OpenMP.
* IvocVect members t_ and y_ were copied twice.
* Only discon_indices_ is a pointer, and hence only that needs to be copied.

…erGrid & threadsPerBlock (#710)

* Use #pragma omp instead of runtime API in `cnrn_target_{copyin,delete}`.
* Fix `VecPlayContinuous::discon_indices_` device transfer.
* Name `cnrn_target_` wrappers more consistently.

Co-authored-by: Olli Lupton <[email protected]>
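A directive-based copy-in/delete wrapper pair might look roughly like the sketch below. The `sketch_target_*` names and the `SKETCH_USE_OPENMP` switch are hypothetical; the real `cnrn_target_` wrappers have their own signatures and also cover OpenACC, and this simplified version returns nothing rather than a device pointer.

```cpp
#include <cassert>
#include <cstddef>

// Copy len elements starting at h_ptr to the device.
// Without SKETCH_USE_OPENMP (a hypothetical switch) this is a host-only no-op.
template <typename T>
void sketch_target_copyin(const T* h_ptr, std::size_t len = 1) {
#if defined(SKETCH_USE_OPENMP)
    // Directive form: allocate on the device and copy host -> device.
    #pragma omp target enter data map(to : h_ptr[:len])
#else
    (void) h_ptr;
    (void) len;
#endif
}

// Release the device copy created by sketch_target_copyin.
template <typename T>
void sketch_target_delete(const T* h_ptr, std::size_t len = 1) {
#if defined(SKETCH_USE_OPENMP)
    // Directive form: drop the device copy without copying back to the host.
    #pragma omp target exit data map(delete : h_ptr[:len])
#else
    (void) h_ptr;
    (void) len;
#endif
}
```

In a host-only build both calls leave the data untouched, so calling code does not need its own `#if` guards.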

We prefer selective host-to-device updates.

Code fixes for XLC and Clang execution without build system changes. This mainly adds missing OpenMP pragmas and makes cnrn_target_ wrappers visible to NMODL.

omp_get_mapped_ptr was added in OpenMP 5.1 and is not widely supported. With this change, calling cnrn_target_deviceptr on a pointer that is not present on the device is a hard error instead of returning nullptr, so avoid calling it for artificial cells.
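One way to look up a device pointer without omp_get_mapped_ptr is the `use_device_ptr` clause, combined with a caller-side guard for data that was never copied to the device. The sketch below assumes that approach; `sketch_target_deviceptr`, `SketchMechanism`, `device_data_or_null` and the `SKETCH_USE_OPENMP` switch are hypothetical names for illustration only.

```cpp
#include <cassert>

// Translate a host pointer to its device counterpart.
// In a host-only build (no SKETCH_USE_OPENMP) the host pointer is returned unchanged.
template <typename T>
T* sketch_target_deviceptr(T* h_ptr) {
    T* d_ptr = h_ptr;
#if defined(SKETCH_USE_OPENMP)
    // Inside this region h_ptr refers to the device address.
    // The pointer must already be present on the device, otherwise this fails.
    #pragma omp target data use_device_ptr(h_ptr)
    { d_ptr = h_ptr; }
#endif
    return d_ptr;
}

// Hypothetical mechanism record: artificial cells are assumed to have no device copy.
struct SketchMechanism {
    double* data;
    bool is_artificial;
};

// Caller-side guard: never ask for a device pointer that cannot exist.
inline double* device_data_or_null(const SketchMechanism& m) {
    if (m.is_artificial) {
        return nullptr;
    }
    return sketch_target_deviceptr(m.data);
}
```

The guard keeps the "not present" case out of the lookup entirely, which matters once that case is no longer reported as nullptr.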

* Set nwarp to a very large number for optimal parallelization and slightly improve the grid configuration of CUDA solve_interleaved2.
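A grid configuration of this kind usually comes down to a rounded-up division of the total thread count by the block size. The host-side sketch below shows that arithmetic only; `LaunchConfig` and `make_launch_config` are hypothetical names, and the actual values chosen for solve_interleaved2 are not stated here.

```cpp
#include <cassert>

// Hypothetical container for a 1D CUDA launch configuration.
struct LaunchConfig {
    int blocks_per_grid;
    int threads_per_block;
};

// One thread per lane of every warp; round the block count up so that
// no warp is left without a block.
inline LaunchConfig make_launch_config(int nwarp, int warp_size, int threads_per_block) {
    assert(nwarp > 0 && warp_size > 0 && threads_per_block > 0);
    const long long total_threads = static_cast<long long>(nwarp) * warp_size;
    const long long blocks = (total_threads + threads_per_block - 1) / threads_per_block;
    return {static_cast<int>(blocks), threads_per_block};
}
```

For example, 10 warps of 32 threads with 128 threads per block need 3 blocks, the last of which is only partly used.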

* Re-enable GitLab CI.
* Add NMODL + OpenACC test.
* Restore {clang,cmake}-format checks.
* Prefer OpenACC with MOD2C.
* Do not enable OpenACC in NMODL + OpenMP mode.
* Convert more #pragma acc to nrn_pragma_acc(...).
* Call cudaSetDevice in OpenMP mode.

Co-authored-by: Ioannis Magkanaris <[email protected]>

Presumably this was working before because our nvhpc localrc files accidentally included CUDA include directories before BlueBrain/spack#1392.

* Compile NVHPC+Open{ACC,MP} with -cuda.
* Pull in NMODL+Eigen fixes to make this work.

Commits on Dec 10, 2021

small openacc fixes (#707)

Christos Kotsalos authored Dec 10, 2021

Commit 57f7724

Commits on Nov 23, 2021

Commits on Nov 25, 2021

Commits on Nov 26, 2021

Commits on Nov 29, 2021

Commits on Dec 1, 2021

Commits on Dec 2, 2021

Commits on Dec 7, 2021

Commits on Dec 9, 2021

Commits on Dec 10, 2021

Commits on Dec 13, 2021

Commits on Dec 14, 2021

Commits on Dec 16, 2021

Commits on Dec 17, 2021

Commits on Dec 21, 2021

Commits on Dec 22, 2021