Pull requests: vllm-project/vllm
Pull requests list
[Bugfix] fix moe_wna16 get_quant_method
ready
ONLY add when PR is ready to merge/full CI is needed
#12648
opened Feb 1, 2025 by
jinzhen-lin
Fix: benchmark_prioritization.py has problems constructing requests w…
#12646
opened Feb 1, 2025 by
Accelerator1996
[Core] BatchLLM for better shared prefix utilizing in offline scenarios
frontend
#12641
opened Feb 1, 2025 by
xinji1
Fix get_device_name for cuda platforms that return bytes
#12636
opened Feb 1, 2025 by
mgoin
[Model][Quant] Fix GLM, Fix fused module mappings for quantization
needs-rebase
#12634
opened Jan 31, 2025 by
kylesayrs
[Core] choice-based structured output with xgrammar
ci/build
ready
structured-output
#12632
opened Jan 31, 2025 by
russellb
[Build] update requirements of no-device for plugin usage
ci/build
#12630
opened Jan 31, 2025 by
sducouedic
[Misc] Add SPDX-License-Identifier headers to python source files
ci/build
documentation
Improvements or additions to documentation
frontend
#12628
opened Jan 31, 2025 by
russellb
[Hardware][TPU] Multi-LoRA implementation for the TPU backend
#12623
opened Jan 31, 2025 by
Akshat-Tripathi
[Core] Improve hash collision avoidance in prefix caching
needs-rebase
ready
v1
#12621
opened Jan 31, 2025 by
russellb
[Core] Silence unnecessary deprecation warnings
ready
#12620
opened Jan 31, 2025 by
russellb
Adding cpu inference with VXE ISA for s390x architecture
ci/build
#12613
opened Jan 31, 2025 by
dilipgb
[Core][v1] Unify allocating slots in prefill and decode in KV cache manager
ready
v1
#12608
opened Jan 31, 2025 by
ShawnD200
[ROCm][Quantization][Kernel] Using HIP FP8 header
ci/build
#12593
opened Jan 30, 2025 by
gshtras