Pull requests: ROCm/vllm

#246: Fix kernel cache miss and add RDNA configs (opened Oct 25, 2024 by hyoon1)
#229: Fused ROPE and reshape cache kernel (opened Oct 11, 2024 by maleksan85)
#192: Update run-amd-test.sh (opened Sep 17, 2024 by Alexei-V-Ivanov-AMD)
#143: multi-gpu fused_moe tuning support (opened Aug 16, 2024 by divakar-amd)
#127: [DO NOT MERGE] Vinayak/moe final hashem (opened Aug 11, 2024 by carlushuang)
#122: Add max-batch-size to benchmark_throughput.py (opened Aug 7, 2024 by dllehr-amd)
#117: Add truncate to all files after json dump (opened Aug 2, 2024 by jpvillam-amd)
#115: [Misc] Use main triton branch (opened Aug 1, 2024 by binarman)
#113: Adding SHM broadcast to ROCm/vllm (opened Jul 31, 2024 by Lzy17)
#104: optimizations for process output step (opened Jul 25, 2024 by sanyalington)
#97: Update QueueLLM (opened Jul 22, 2024 by gyulaz-htec)
#96: Add benchmark_latency_batched.py (opened Jul 22, 2024 by dllehr-amd)
#94: New LLM for MLPerf Server scenario serving (opened Jul 19, 2024 by gyulaz-htec)
#89: Add VLLM_SCHED_PREFILL_KVC_FREEPCT (opened Jul 18, 2024 by sanyalington)
#71: Torchrun api server (opened Jun 27, 2024 by gshtras)
#21: Update on naive_attn module (opened May 28, 2024 by seungrokj)