update mscale calculation to keep back compatible with previous phi models #20

Closed

zelinms wants to merge 1175 commits into xiaoxiawu-microsoft:main from wenxcs-msft:dev/zelin/msft-phimoe-mscale

This pull request is big! We're only showing the most recent 250 commits

Commits on Jul 31, 2024

[MISC] Introduce pipeline parallelism partition strategies (vllm-project#6920)
comaniac and youkaichao authored

[Bugfix] Support cpu offloading with fp8 quantization (vllm-project#6960)
mgoin authored

[Kernel] Enable FP8 Cutlass for Ada Lovelace (vllm-project#6950)
varun-sundar-rabindranath and Varun Sundar Rabindranath authored

[Kernel] Tuned int8 Cutlass Kernels for SM75 (T4) (vllm-project#6996)
varun-sundar-rabindranath and Varun Sundar Rabindranath authored

[Misc] Add compressed-tensors to optimized quant list (vllm-project#7006)
mgoin authored

Revert "[Frontend] Factor out code for running uvicorn" (vllm-project#7012)
simon-mo and robertgshaw2-redhat authored

Commits on Aug 10, 2024

[Bugfix] Fix phi3v batch inference when images have different aspect ratio (vllm-project#7392)
Isotr0py authored

Commits on Aug 22, 2024

update mscale calculation to keep back compatible with previous phi models
linzeqipku committed