I am trying to run the following script:
accelerate launch train_tokenizer.py \
    --exp_name bair_tokenizer_ft --output_dir log_vqgan --seed 0 --mixed_precision bf16 \
    --model_type ctx_vqgan \
    --train_batch_size 16 --gradient_accumulation_steps 1 --disc_start 1000005 \
    --oxe_data_mixes_type bair --resolution 64 --dataloader_num_workers 16 \
    --rand_select --video_stepsize 1 --segment_horizon 16 --segment_length 8 --context_length 1 \
    --pretrained_model_name_or_path pretrained_models/ivideogpt-oxe-64-act-free/tokenizer
However, an error occurred:
  File "/.local/lib/python3.9/site-packages/accelerate/commands/launch.py", line 1159, in launch_command
    multi_gpu_launcher(args)
  File "/local/lib/python3.9/site-packages/accelerate/commands/launch.py", line 769, in multi_gpu_launcher
    import torch.distributed.run as distrib_run
  File "/.local/lib/python3.9/site-packages/torch/distributed/run.py", line 383, in <module>
    from torch.distributed.elastic.multiprocessing import Std
  File "/.local/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/__init__.py", line 68, in <module>
    from torch.distributed.elastic.multiprocessing.api import ( # noqa: F401
  File "/.local/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 26, in <module>
    from torch.distributed.elastic.multiprocessing.redirects import (
  File "/.local/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/redirects.py", line 35, in <module>
    libc = get_libc()
  File "/.local/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/redirects.py", line 32, in get_libc
    return ctypes.CDLL("libc.so.6")
  File "/usr/local/conda/lib/python3.9/ctypes/__init__.py", line 382, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/local/conda/lib/python3.9/site-packages/amp_C.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
Before running the script, I installed all the requirements as described in the repo (pip install -r requirements.txt).
I don't know why this happens. Could you please tell me your exact Python version (3.9.x?)? Any other suggestions would be deeply appreciated.
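In case it helps with diagnosis: the traceback loads torch and accelerate from /.local/lib/python3.9/site-packages but amp_C from /usr/local/conda/lib/python3.9/site-packages, so the two may not have been built against the same torch version. That is only a guess. The small script below is a sketch for listing which version and file each package resolves to, without importing them. The helper name describe_package is made up for this post, and treating amp_C as the extension module of a distribution named apex is an assumption.

```python
import importlib.util
from importlib import metadata


def describe_package(dist_name, module_name=None):
    """Report the installed version and on-disk location of a package.

    Nothing is imported here, so a broken extension module cannot crash the check.
    Fields are None when the distribution or module cannot be found.
    """
    module_name = module_name or dist_name
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    origin = spec.origin if spec is not None else None
    return {"name": dist_name, "version": version, "origin": origin}


if __name__ == "__main__":
    # Assumption: amp_C belongs to a distribution called "apex".
    for dist, mod in [("torch", "torch"), ("accelerate", "accelerate"), ("apex", "amp_C")]:
        print(describe_package(dist, mod))
```

If the reported locations differ between torch and amp_C, that would be consistent with the undefined-symbol error, but I have not confirmed this is the cause.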