Error executing method determine_num_available_blocks: vLLM multi node fails for both DeepSeek-Coder-V2-Instruct and DeepSeek-Coder-V2-Lite-Instruct
#76
Open
liangfang opened this issue
Jul 28, 2024
· 1 comment
First, has DeepSeek ever tried running these models on vLLM multi-node?
I am running through Ray on 2 nodes x 8 V100 GPUs in half precision (float16).
These are the launch arguments:
CUDA_LAUNCH_BLOCKING=1 OMP_NUM_THREADS=1 vllm serve deepseek-ai/DeepSeek-Coder-V2-Instruct --tensor-parallel-size 16 --dtype half --trust-remote-code --enforce-eager --enable-chunked-prefill=False
DeepSeek-Coder-V2-Lite-Instruct also fails at determine_num_available_blocks, but it reports an NCCL error:
(RayWorkerWrapper pid=23558, ip=10.0.128.18) ERROR 07-28 13:53:40 worker_base.py:382] RuntimeError: NCCL Error 3: internal error - please report this issue to the NCCL developers