xzzWZY/open-framework-measurement

LLM open-framework inference measurement

Frameworks:

  1. vLLM
  2. text-generation-inference
  3. TensorRT-LLM
  4. DeepSpeed-MII

Metrics:

Client side

  1. Token latency
    1. Avg latency
    2. Variance
  2. Pause time
    1. Total pause time
    2. Pause ratio: pause time / end-to-end inference time
  3. Time to first token
    1. Prefilling time
    2. Queuing time
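
The client-side metrics above can be derived from per-token arrival timestamps recorded while streaming a response. The following is a minimal sketch, not this repository's implementation. The function name `client_metrics`, its parameters, and the `pause_threshold` value are hypothetical. In particular, the README does not define a "pause"; the sketch assumes a pause is any inter-token gap longer than the threshold. Prefilling and queuing time are not separated here, since the sketch only sees their sum as time to first token from the client.

```python
from statistics import mean, pvariance


def client_metrics(request_sent, token_times, pause_threshold=0.2):
    """Compute client-side metrics from token arrival timestamps (seconds).

    request_sent:    time the request was issued.
    token_times:     arrival time of each generated token, in order.
    pause_threshold: ASSUMPTION, gaps longer than this count as a pause.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")

    # Time to first token: queuing + prefilling as seen by the client.
    ttft = token_times[0] - request_sent
    end_to_end = token_times[-1] - request_sent

    # Per-token latency: gap between consecutive token arrivals.
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    avg_latency = mean(gaps) if gaps else 0.0
    variance = pvariance(gaps) if gaps else 0.0

    # Total pause time: sum of gaps exceeding the assumed threshold.
    total_pause = sum(g for g in gaps if g > pause_threshold)
    # Pause ratio: pause time / end-to-end inference time.
    pause_ratio = total_pause / end_to_end if end_to_end > 0 else 0.0

    return {
        "ttft": ttft,
        "avg_latency": avg_latency,
        "variance": variance,
        "total_pause": total_pause,
        "pause_ratio": pause_ratio,
    }
```

A typical use would be to record `time.perf_counter()` before sending the request and again on every streamed token, then pass both to `client_metrics`.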

Server side

  1. Memory
  2. Memory IO
  3. Compute
  4. Energy
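
Of the server-side metrics, energy is usually not read directly but derived from power readings sampled over the run. The sketch below shows one way to do that with trapezoidal integration. It is an assumption about how the measurement could be done, not the method used in this repository; the function name `energy_joules` and the sample format are hypothetical, and the power samples are assumed to come from whatever GPU power sampler is available.

```python
def energy_joules(samples):
    """Integrate power samples into energy using the trapezoidal rule.

    samples: list of (timestamp_seconds, power_watts) pairs sorted by time.
             ASSUMPTION: produced by an external power sampler.
    Returns the estimated energy in joules (watt-seconds).
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        # Area of the trapezoid between two consecutive samples.
        total += (p0 + p1) / 2.0 * (t1 - t0)
    return total
```

Sampling more frequently makes the estimate track short power spikes more closely, at the cost of more sampling overhead on the server.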

Model:

  • Llama 2
    • 13B
