hi @krishnanpooja, we don't test the latest TRT-LLM functionality on the T4 platform, as we removed its support in the 0.14 release. So there's no guarantee that your case will run successfully on T4.
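For anyone hitting the same wall: one way to check up front whether a GPU clears the post-0.14 bar is to compare its compute capability against SM 8.0 (Ampere). A minimal sketch with a hand-made lookup table (both the table and the SM 8.0 cutoff are assumptions for illustration, not an official NVIDIA or TRT-LLM API):

```python
# Sketch: compare a GPU's compute capability (SM version) against an
# assumed minimum of SM 8.0 (Ampere) for TensorRT-LLM >= 0.14.
# The table is a small hand-made subset, not queried from hardware.
SM_BY_GPU = {
    "Tesla T4": (7, 5),  # Turing
    "A100": (8, 0),      # Ampere
    "L4": (8, 9),        # Ada Lovelace
    "H100": (9, 0),      # Hopper
}

MIN_SUPPORTED_SM = (8, 0)  # assumption: pre-Ampere dropped since 0.14

def is_supported(gpu_name: str) -> bool:
    """Return True if the GPU's SM version meets the assumed minimum."""
    cap = SM_BY_GPU.get(gpu_name)
    return cap is not None and cap >= MIN_SUPPORTED_SM

print(is_supported("Tesla T4"))  # False: T4 is SM 7.5, below Ampere
print(is_supported("A100"))      # True: SM 8.0
```

On a live machine the capability can instead be read from `nvidia-smi` or `torch.cuda.get_device_capability()`; the lookup table here just keeps the sketch self-contained.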
System Info
- GPU: 4 x NVIDIA Tesla T4
- NVIDIA-SMI 470.256.02, Driver Version: 470.256.02, CUDA Version: 12.1
Who can help?
@byshiue, can you please guide me on how I can resolve this error?
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
python modeling/convert_checkpoint.py --model_dir ${{inputs.model_path}} \
    --output_dir ${{outputs.output_dir}}/tllm_checkpoint_1gpu_mistral \
    --dtype float16 \
    --tp_size 4
echo "converted checkpoint"
export CUDA_MODULE_LOADING=LAZY
echo $CUDA_MODULE_LOADING
trtllm-build --checkpoint_dir ${{outputs.output_dir}}/tllm_checkpoint_1gpu_mistral \
    --output_dir ${{outputs.output_dir}}/tllm_checkpoint_1gpu_mistral/trt_build_output \
    --gemm_plugin auto \
    --max_input_len 8000 \
    --context_fmha enable
Expected behavior
I am able to load the model on the T4s using the vLLM engine. My expectation is that it should work with TensorRT-LLM as well.
Actual behavior
*** Process received signal ***
[37f2f5cd59bd4d738ff60b08585e947b000000:00279] Signal: Aborted (6)
[37f2f5cd59bd4d738ff60b08585e947b000000:00279] Signal code: (-6)
[ 0] /usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420)[0x15299ac49420]
[ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x15299a92c00b]
[ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x15299a90b859]
[ 3] /opt/conda/envs/ptca/bin/../lib/libstdc++.so.6(+0xb135a)[0x15290c29435a]
[ 4] /opt/conda/envs/ptca/bin/../lib/libstdc++.so.6(+0xb03b9)[0x15290c2933b9]
[ 5] /opt/conda/envs/ptca/bin/../lib/libstdc++.so.6(__gxx_personality_v0+0x87)[0x15290c293ae7]
[ 6] /opt/conda/envs/ptca/bin/../lib/libgcc_s.so.1(+0x111e4)[0x1529976221e4]
[ 7] /opt/conda/envs/ptca/bin/../lib/libgcc_s.so.1(_Unwind_Resume+0x12e)[0x152997622c1e]
[ 8] /opt/conda/envs/ptca/lib/python3.10/site-packages/tensorrt_llm/libs/libtensorrt_llm.so(+0x7891f4)[0x1527237051f4]
[ 9] /opt/conda/envs/ptca/lib/python3.10/site-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so(_ZN12tensorrt_llm7plugins24GPTAttentionPluginCommon10initializeEv+0x275)[0x1525ea662375]
[10] /opt/conda/envs/ptca/lib/python3.10/site-packages/tensorrt_llm/libs/libnvinfer_plugin_tensorrt_llm.so(_ZNK12tensorrt_llm7plugins24GPTAttentionPluginCommon9cloneImplINS0_18GPTAttentionPluginEEEPT_v+0x333)[0x1525ea6a9313]
[11] /opt/conda/envs/ptca/lib/python3.10/site-packages/tensorrt_libs/libnvinfer.so.10(+0xb7a5a7)[0x1528fe3865a7]
[12] /opt/conda/envs/ptca/lib/python3.10/site-packages/tensorrt_libs/libnvinfer.so.10(+0xa8188e)[0x1528fe28d88e]
[13] /opt/conda/envs/ptca/lib/python3.10/site-packages/tensorrt_bindings/tensorrt.so(+0xfe062)[0x15284e5d2062]
[14] /opt/conda/envs/ptca/lib/python3.10/site-packages/tensorrt_bindings/tensorrt.so(+0x4c91e)[0x15284e52091e]
[15] /opt/conda/envs/ptca/bin/python[0x4fcaf7]
[16] /opt/conda/envs/ptca/bin/python(_PyObject_MakeTpCall+0x25b)[0x4f657b]
[17] /opt/conda/envs/ptca/bin/python[0x50861f]
[18] /opt/conda/envs/ptca/bin/python(_PyEval_EvalFrameDefault+0x4b2c)[0x4f1e9c]
[19] /opt/conda/envs/ptca/bin/python(_PyFunction_Vectorcall+0x6f)[0x4fcf3f]
[20] /opt/conda/envs/ptca/bin/python(PyObject_Call+0xb8)[0x508cd8]
[21] /opt/conda/envs/ptca/bin/python(_PyEval_EvalFrameDefault+0x2de4)[0x4f0154]
[22] /opt/conda/envs/ptca/bin/python(_PyFunction_Vectorcall+0x6f)[0x4fcf3f]
[23] /opt/conda/envs/ptca/bin/python(PyObject_Call+0xb8)[0x508cd8]
[24] /opt/conda/envs/ptca/bin/python(_PyEval_EvalFrameDefault+0x2de4)[0x4f0154]
[25] /opt/conda/envs/ptca/bin/python[0x50832e]
[26] /opt/conda/envs/ptca/bin/python(PyObject_Call+0xb8)[0x508cd8]
[27] /opt/conda/envs/ptca/bin/python(_PyEval_EvalFrameDefault+0x2de4)[0x4f0154]
[28] /opt/conda/envs/ptca/bin/python(_PyFunction_Vectorcall+0x6f)[0x4fcf3f]
[29] /opt/conda/envs/ptca/bin/python(_PyObject_FastCallDictTstate+0x17d)[0x4f5afd]
*** End of error message ***
/bin/bash: line 5: 279 Aborted (core dumped) trtllm-build --checkpoint_dir output_dir/tllm_checkpoint_1gpu_mistral --output_dir output_dir/tllm_checkpoint_1gpu_mistral/trt_build_output --gemm_plugin auto --max_input_len 8000 --context_fmha enable
I also tried --context_fmha disable; that run failed with a memory error instead.
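For context on the memory error, here is a back-of-envelope estimate of per-GPU memory for this build. The model dimensions (32 layers, 8 KV heads, head_dim 128, ~7.2B params, fp16) are assumed Mistral-7B-like values for illustration, not figures taken from this issue:

```python
# Back-of-envelope memory sketch per T4 (16 GiB card) for this build.
# Assumed Mistral-7B-like shape: 32 layers, 8 KV heads, head_dim 128,
# ~7.2e9 params, fp16 weights -- illustrative numbers only.
params, dtype_bytes, tp_size = 7.2e9, 2, 4
weights_gib_per_gpu = params * dtype_bytes / tp_size / 2**30  # ~3.35 GiB

layers, kv_heads, head_dim = 32, 8, 128
tokens = 8000  # matches --max_input_len in the build command above
# K and V caches, per token: 2 * layers * kv_heads * head_dim * bytes
bytes_per_token = 2 * layers * kv_heads * head_dim * dtype_bytes
kv_cache_gib = tokens * bytes_per_token / 2**30  # one sequence, before TP

print(f"weights/GPU: {weights_gib_per_gpu:.2f} GiB")
print(f"KV cache per 8000-token sequence: {kv_cache_gib:.2f} GiB")
```

Under these assumptions the weights shard and a single full-length KV cache fit comfortably in 16 GiB, so any OOM would likely come from build-time workspace, activation buffers, or batching rather than the KV cache alone; the sketch is only meant to bound the obvious terms.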
Additional notes
I am using convert_checkpoint.py (from the llama example) and then the trtllm-build command.