Commit

With chunked prefill, for large prompts, the sampler can encounter a zero-sized tensor, on which skinny gemm fails
gshtras committed Sep 23, 2024
1 parent 57ea101 commit ff4e478
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion vllm/model_executor/layers/tuned_gemm.py
@@ -62,7 +62,7 @@ def apply_skinny(self, m, n, k, inp_view, weights):
             return None
         if inp_view.dtype != torch.float16 or k % 8 != 0:
             return None
-        if m > 8 and n <= 4:
+        if m > 8 and 0 < n <= 4:
             out = torch.empty(inp_view.shape[0],
                               weights.shape[0],
                               dtype=inp_view.dtype,
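
To illustrate why the extra lower bound matters, here is a minimal sketch of the two guard conditions as plain predicates. The helper names `old_guard` and `new_guard` are hypothetical and not part of vLLM. The sketch assumes that `n` is the number of rows in `inp_view`, which the `torch.empty(inp_view.shape[0], ...)` call in the diff suggests but does not confirm, and it leaves out the earlier check that is not visible in the hunk.

```python
def old_guard(m: int, n: int, k: int, is_fp16: bool = True) -> bool:
    """Hypothetical mirror of the pre-commit condition for the skinny path."""
    if not is_fp16 or k % 8 != 0:
        return False
    # n == 0 satisfies n <= 4, so an empty input is routed to the skinny kernel.
    return m > 8 and n <= 4


def new_guard(m: int, n: int, k: int, is_fp16: bool = True) -> bool:
    """Hypothetical mirror of the post-commit condition for the skinny path."""
    if not is_fp16 or k % 8 != 0:
        return False
    # The added lower bound excludes n == 0, so empty inputs skip the skinny path.
    return m > 8 and 0 < n <= 4


if __name__ == "__main__":
    # Assumed sizes for illustration only.
    for n in (0, 1, 4, 5):
        print(n, old_guard(4096, n, 4096), new_guard(4096, n, 4096))
```

The two predicates differ only at `n == 0`: the old condition accepts it and the new one rejects it, which matches the zero-sized tensor case named in the commit message.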
