What are the problems? (screenshots or detailed error messages)
I need to benchmark the time to first token (TTFT) of Llama 2 7B with OpenPPL, and I have to run the benchmark with fixed input and output lengths. However, I cannot find a script for generating a custom dataset. Could you provide one?
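To make the request concrete, here is a minimal sketch of the kind of generator being asked for. It is not taken from ppl.llm.serving: the function name `make_fixed_length_dataset` and the field names `id`, `input_token_ids`, and `max_new_tokens` are hypothetical, and the real benchmark client may expect a different file format. The sketch assumes the Llama 2 vocabulary size of 32000 and treats the first few token IDs as reserved for special tokens.

```python
import json
import random


def make_fixed_length_dataset(num_samples, input_len, output_len,
                              vocab_size=32000, seed=0):
    """Build samples whose prompts all contain exactly `input_len` token IDs.

    All field names below are hypothetical; adapt them to whatever the
    benchmark client actually reads.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    samples = []
    for i in range(num_samples):
        # Skip IDs 0-2, assumed here to be reserved for special tokens.
        token_ids = [rng.randrange(3, vocab_size) for _ in range(input_len)]
        samples.append({
            "id": i,
            "input_token_ids": token_ids,
            "max_new_tokens": output_len,
        })
    return samples


def write_dataset(path, samples):
    """Serialize the samples to a JSON file."""
    with open(path, "w") as f:
        json.dump(samples, f)
```

Calling `write_dataset("fixed_length_dataset.json", make_fixed_length_dataset(8, 128, 64))` would produce eight prompts of 128 tokens each, with generation capped at 64 new tokens. Generating token IDs directly, rather than text, keeps the input length exact without depending on a tokenizer. The output length is only truly fixed if the server is also told to ignore the end-of-sequence token.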
What are the types of GPU/CPU you are using?
4090/A100 40G
What's the operating system ppl.llm.serving runs on?
ubuntu 22.04
What's the compiler and its version?
gcc 11.4
Which version (commit id or tag) of ppl.llm.serving is used?
3abe5d2
What are the commands used to build ppl.llm.serving?
./build.sh -DPPLNN_USE_LLM_CUDA=ON -DPPLNN_CUDA_ENABLE_NCCL=ON -DPPLNN_ENABLE_CUDA_JIT=OFF -DPPLNN_CUDA_ARCHITECTURES="'80;86;87;89'" -DPPLCOMMON_CUDA_ARCHITECTURES="'80;86;87;89'"
What are the execution commands?
minimal code snippets for reproducing these problems (if necessary)
models and inputs for reproducing these problems (send them to [email protected] if necessary)