CUDA shared case (bug ?) #24

incardon · 2022-05-03T21:10:57Z

In the CUDA example:

// Get index of first TD
int ix = blockIdx.x*blockDim.x*NUM_TD_PER_THREAD + threadIdx.x;

// Have extra threads do the last member instead of return.
// A return would disable use of barriers, so not using return is better
ix = ix < numTransforms ? ix : numTransforms - NUM_TD_PER_THREAD;

#ifdef USE_SHARED
extern __shared__ FFParams forcefield[];
if(ix < num_atom_types)
{
forcefield[ix] = global_forcefield[ix];
}
#else

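I think the index used in the shared case should be threadIdx.x rather than ix, shouldn't it? Shared memory is per block, but ix is a global index, so as far as I can tell only the threads whose global ix is below num_atom_types (in practice those of block 0) copy anything, and the other blocks read a forcefield[] that was never filled.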
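To make the suggestion concrete, here is a minimal sketch of what I have in mind, not a patch against the actual kernel. The `FFParams` struct below is a hypothetical stand-in (the real fields are not shown here), and `copy_forcefield`, `out`, and `count_mismatches` are names I made up for the test. Each block fills its own shared copy using a loop over `threadIdx.x`, then synchronizes before anyone reads it.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real FFParams; the actual fields are an assumption.
struct FFParams { float a; float b; };

// Each block loads the whole table into its own shared memory, indexed by threadIdx.x.
// `out` exists only so the host can check what every block saw
// (it must hold gridDim.x * num_atom_types entries).
__global__ void copy_forcefield(const FFParams* global_forcefield,
                                int num_atom_types,
                                FFParams* out)
{
    extern __shared__ FFParams forcefield[];

    // Block-local strided loop: also covers num_atom_types > blockDim.x.
    for (int i = threadIdx.x; i < num_atom_types; i += blockDim.x)
        forcefield[i] = global_forcefield[i];

    // Every thread reaches this barrier because no thread returns early.
    __syncthreads();

    // Write back this block's view of the shared table for verification.
    for (int i = threadIdx.x; i < num_atom_types; i += blockDim.x)
        out[blockIdx.x * num_atom_types + i] = forcefield[i];
}

// Host-side check: returns the number of entries that differ from the
// reference table across all blocks, or -1 on a CUDA error.
int count_mismatches(int num_blocks, int threads_per_block, int num_atom_types)
{
    std::vector<FFParams> host(num_atom_types);
    for (int i = 0; i < num_atom_types; ++i)
        host[i] = {float(i), float(2 * i)};

    size_t in_bytes  = num_atom_types * sizeof(FFParams);
    size_t out_bytes = num_blocks * in_bytes;

    FFParams* d_in = nullptr;
    FFParams* d_out = nullptr;
    if (cudaMalloc(&d_in, in_bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, out_bytes) != cudaSuccess) { cudaFree(d_in); return -1; }

    cudaMemcpy(d_in, host.data(), in_bytes, cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0xFF, out_bytes);  // NaN pattern, so unwritten entries count as mismatches

    // Third launch parameter: bytes of dynamic shared memory per block.
    copy_forcefield<<<num_blocks, threads_per_block, in_bytes>>>(d_in, num_atom_types, d_out);

    std::vector<FFParams> result(static_cast<size_t>(num_blocks) * num_atom_types);
    cudaError_t err = cudaMemcpy(result.data(), d_out, out_bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int b = 0; b < num_blocks; ++b)
        for (int i = 0; i < num_atom_types; ++i) {
            const FFParams& r = result[b * num_atom_types + i];
            if (r.a != host[i].a || r.b != host[i].b) ++bad;
        }
    return bad;
}
```

Calling `count_mismatches(4, 32, 10)` from a `main` should report zero mismatches if every block really sees the full table. The strided loop is a design choice on my side: a plain `if (threadIdx.x < num_atom_types)` would also work as long as the block has at least `num_atom_types` threads.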
