Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Performance] What is the purpose of compiling a model? #2617

Open
4 tasks
Flynn-Zh opened this issue Dec 24, 2024 · 1 comment
Open
4 tasks

[Performance] What is the purpose of compiling a model? #2617

Flynn-Zh opened this issue Dec 24, 2024 · 1 comment
Labels
triaged Issue has been triaged by maintainers

Comments

@Flynn-Zh
Copy link

System Info

CPU: x86_64
GPU: NVIDIA L40
CUDA: 12.2
OS: ubuntu 22.04
TensorRT-LLM: 0.15.0

Who can help?

@kaiyux

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

convert model: python3 qwen/convert_checkpoint.py --model_dir ./Qwen2.5-32B-Instruct-GPTQ-Int4/ --dtype float16 --use_weight_only --weight_only_precision int4_gptq --output_dir ./trt_engines/Int4/
compile model: trtllm-build --checkpoint_dir ./trt_engines/Int4 --gemm-plugin auto --output_dir ./trt_engines/compiled-model/
run server-1: trtllm-serve ./trt_engines/Int4/ --host 0.0.0.0 --port 8000
run server-2: trtllm-serve ./trt_engines/compiled-model/ --host 0.0.0.0 --port 8000
when i use v1/chat/completions with 9k words prompt to test server-1 and server-2, they need about 12 seconds to return all the answers,so,What is the purpose of compiling a model?Or require certain configurations?

Expected behavior

compile model can improve performance or others

actual behavior

none

additional notes

none

@Flynn-Zh Flynn-Zh added the bug Something isn't working label Dec 24, 2024
@nv-guomingz
Copy link
Collaborator

@nv-guomingz nv-guomingz added triaged Issue has been triaged by maintainers and removed bug Something isn't working labels Dec 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
triaged Issue has been triaged by maintainers
Projects
None yet
Development

No branches or pull requests

2 participants