Replies: 1 comment 5 replies
-
Hm... a weird error like this probably indicates an upstream backward/forward compatibility bug in PyTorch; we have seen this at least for the case of training in 1.13 and loading in C++ with 1.11 (though with a different error). Is your libtorch the same version as the PyTorch you used for training? See for example pytorch/pytorch#47917
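If it helps to narrow that down: one quick check is to compare python -c "import torch; print(torch.__version__)" in the training environment against the version of the conda PyTorch/libtorch that LAMMPS was linked with, and then to call torch::jit::load on the deployed file from a tiny standalone program outside LAMMPS. The following is only a sketch of such a check (the model path is simply copied from the log below, and the standalone program itself is not part of pair_allegro):

#include <torch/script.h>  // torch::jit::load, torch::jit::Module
#include <iostream>
#include <string>

int main(int argc, char** argv) {
    // Deployed model path; defaults to the file from the log in the question,
    // override on the command line.
    const std::string model_path = (argc > 1)
        ? argv[1]
        : "../../../../allegro/work/Silicon/results/silicon-tutorial/si/log/silicon-deployed.pth";
    try {
        // Frame #2 of the backtrace shows the crash happens inside this same call,
        // so if it throws "Unrecognized data format" here too, the issue is in the
        // deployed file / libtorch combination rather than in the LAMMPS build.
        torch::jit::Module model = torch::jit::load(model_path, torch::Device(torch::kCPU));
        std::cout << "Loaded " << model_path << " successfully" << std::endl;
    } catch (const c10::Error& e) {
        std::cerr << "torch::jit::load failed: " << e.what() << std::endl;
        return 1;
    }
    return 0;
}

Built against the same libtorch that LAMMPS uses (for example via CMake with find_package(Torch)), this isolates whether that libtorch can read the deployed .pth at all; if the versions differ between deployment and loading, that is the first thing to fix.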
-
Hello,
I'm trying to run a LAMMPS simulation of silicon but I'm running into an error related to the trained Allegro model. I'm using a CPU-only release of PyTorch, LAMMPS is built against libtorch from the conda PyTorch install, lmp built successfully, the model was trained and deployed, and the LAMMPS input script uses pair_style allegro.
However, when I run the LAMMPS script I get the following error:
Created 4032 atoms
using lattice units in orthogonal box = (0.0000000 0.0000000 0.0000000) to (38.014900 43.445600 48.876300)
create_atoms CPU = 0.000 seconds
Allegro is using device cpu
Allegro: Loading model from ../../../../allegro/work/Silicon/results/silicon-tutorial/si/log/silicon-deployed.pth
terminate called after throwing an instance of 'c10::Error'
what(): Unrecognized data format
Exception raised from load at /localdisk/npirooza/Allegro_Source/pytorch/torch/csrc/jit/serialization/import.cpp:449 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x55 (0x7efdc5b622c5 in /localdisk/npirooza/anaconda3/envs/Allegro_Source/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0xb1 (0x7efdc5b5f461 in /localdisk/npirooza/anaconda3/envs/Allegro_Source/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #2: torch::jit::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0x191 (0x7efdbe01e6a1 in /localdisk/npirooza/anaconda3/envs/Allegro_Source/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #3: ../../../build/lmp() [0x7b1c19]
frame #4: ../../../build/lmp() [0x455dd8]
frame #5: ../../../build/lmp() [0x45992b]
frame #6: ../../../build/lmp() [0x459b73]
frame #7: ../../../build/lmp() [0x433828]
frame #8: __libc_start_main + 0x102 (0x7efdb9e27362 in /lib64/libc.so.6)
frame #9: ../../../build/lmp() [0x433e8e]
Aborted (core dumped)
I appreciate your time and apologies if this is a dumb mistake on my part :)