GPU issues while running LAMMPS with Allegro #97

kuntalg97 · 2024-08-02T17:46:35Z

Hi,

I'm having some issues with CUDA while running LAMMPS with Allegro. To train the model and deploy it to a .pth file, I used 1 node with 4 GPUs (gres=gpu:4) and 20 cores. That ran successfully and produced a .pth file.

When I run LAMMPS with the .pth file using the same GPU and node specifications, I get the following error message:
.
.
.
reading velocities ...
3916 velocities
read_data CPU = 0.068 seconds
ERROR (Allegro): my rank (4) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (13) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (15) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (16) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (18) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (19) is bigger than the number of visible devices (4)!!!Allegro is using device cuda:0
Allegro is using device cuda:1
Allegro is using device cuda:2
Allegro is using device cuda:3
ERROR (Allegro): my rank (6) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (7) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (8) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (9) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (10) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (11) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (12) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (14) is bigger than the number of visible devices (4)!!!ERROR (Allegro): my rank (17) is bigger than the number of visible devices (4)!!!Allegro: Loading model from water.pth
Allegro: Loading model from water.pth
Allegro: Loading model from water.pth
Allegro: Loading model from water.pth
ERROR (Allegro): my rank (5) is bigger than the number of visible devices (4)!!!Allegro: Freezing TorchScript model...
Allegro: Freezing TorchScript model...
Allegro: Freezing TorchScript model...
Allegro: Freezing TorchScript model...
Type mapping:
Allegro type | Allegro name | LAMMPS type | LAMMPS name
.
.
.
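
To make the pattern in the log easier to see, here is a minimal sketch of what the error message seems to describe. It assumes the job started 20 MPI ranks (one per core) and that each rank needs a GPU whose index equals its rank. The function `assign_devices` is hypothetical and only illustrates that assumption; it is not Allegro's actual code.

```python
def assign_devices(num_ranks, num_visible_devices):
    """Hypothetical illustration: map each MPI rank to cuda:<rank> if that device exists."""
    assigned = {}   # rank -> device name
    rejected = []   # ranks with no matching device
    for rank in range(num_ranks):
        if rank >= num_visible_devices:
            # Corresponds to "my rank (N) is bigger than the number of visible devices"
            rejected.append(rank)
        else:
            assigned[rank] = f"cuda:{rank}"
    return assigned, rejected


# Assumed job layout: 20 ranks (one per core), 4 visible GPUs
assigned, rejected = assign_devices(num_ranks=20, num_visible_devices=4)
print("ranks with a device:", assigned)
print("ranks without a device:", rejected)
```

Under this assumption, ranks 0 through 3 get cuda:0 through cuda:3 and ranks 4 through 19 are left without a device, which matches the ranks reported in the log above.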

How do I fix this?

Thanks!
Kuntal

P.S. I am very new to Allegro and to GPUs in general, so please pardon any rookie mistakes!
