Checklist

Describe the bug

Benchmarked the quantized w4 model with profile_generation.py, and the following errors were returned:

session 1 stats: [[0.0, 2048, 0.0], [0.0, 2048, 0.0], [0.0, 2048, 0.0], [0.0, 2048, 0.0], [0.0, 2048, 0.0], [0.0, 2048, 0.0], [0.0, 2048, 0.0], [0.0, 2048, 0.0], [0.0, 2048, 0.0], [0.0, 2048, 0.0]]
--------------------------------------------------
profile_generation.py:143: RuntimeWarning: divide by zero encountered in scalar divide
  throughput = np.sum(stats[:, 1], axis=0) / np.sum(stats[:, 2],
--------------------------------------------------
concurrency: 1, input_tokens: 0, output_tokens: 2048
elapsed_time: 0.12s
first_token latency(min, max, ave): 0.00s, 0.00s, 0.00s
token latency(min, max, ave): 0.00s, 0.00s, 0.00s
throughput: inf token/s

concurrency: 8, input_tokens: 0, output_tokens: 2048
elapsed_time: 0.18s
first_token latency(min, max, ave): 0.00s, 0.01s, 0.01s
token latency(min, max, ave): 0.00s, 0.01s, 0.01s
throughput: -3200.00 token/s
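For context on the arithmetic: the warning points at profile_generation.py:143, where throughput is computed as np.sum(stats[:, 1]) / np.sum(stats[:, 2]). Judging from the log, column 1 holds output token counts and column 2 holds measured token latencies, so a run whose timing fields all stay at 0.0 ends up dividing 20480 tokens by zero seconds. Below is a minimal sketch of that failure mode; the column layout is inferred from the log, and the guard is only an illustration, not the fix that actually landed in #507.

import numpy as np

# Ten sessions, each shaped like the rows in the log above:
# [first_token_latency, output_tokens, token_latency] (layout inferred).
stats = np.array([[0.0, 2048, 0.0]] * 10)

total_tokens = np.sum(stats[:, 1], axis=0)  # 20480.0 tokens
total_time = np.sum(stats[:, 2], axis=0)    # 0.0 seconds

# Reproduces "RuntimeWarning: divide by zero encountered in scalar divide"
# and the "throughput: inf token/s" readout.
throughput = total_tokens / total_time

# A hypothetical guard: treat a non-positive time sum as a broken
# measurement rather than reporting inf or negative token/s.
throughput = total_tokens / total_time if total_time > 0 else float("nan")
print(f"throughput: {throughput} token/s")

The -3200.00 token/s at concurrency 8 is presumably the same division applied to a negative time sum, which again suggests the per-token timing was never populated correctly for the w4 model.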
Reproduction

The w4 model was converted with:

python3 -m lmdeploy.serve.turbomind.deploy --model-name llama2 --model-path ./llama2-chat-7b-w4 --model-format awq --group-size 128

Error traceback

No response
lvhan028 commented: #507 has resolved this issue.