Fix f8f8bf16_lite quantize op input in quantize_and_compute (pytorch#3667)

Summary:
Pull Request resolved: pytorch#3667

X-link: facebookresearch/FBGEMM#745

A minor fix for the trt-llm cudaCoreGemm `cuda_lite` op in the quantize_bench script.

Testing with `--bench_quantize` surfaced the following failure:

```
...
tree/deeplearning/fbgemm/fbgemm_gpu/experimental/gen_ai/bench/quantize_ops.py", line 797, in quantize_and_compute
    return self.compute(xq, wq, x_scale * w_scale)
TypeError: FP8LiteGemm.compute() missing 1 required positional argument: 'w_scale'
```
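
For illustration, a minimal self-contained sketch of the failure mode, using a stub class (the real `FP8LiteGemm.compute` launches the trt-llm cudaCoreGemm kernel; the stub body below is a placeholder):

```python
class FP8LiteGemm:
    # Signature mirrors the one in quantize_ops.py: the two scales are
    # separate positional parameters.
    def compute(self, xq, wq, x_scale, w_scale):
        return xq * wq * x_scale * w_scale  # placeholder body

gemm = FP8LiteGemm()
xq, wq, x_scale, w_scale = 2.0, 3.0, 0.5, 0.25

try:
    # Buggy call: x_scale * w_scale collapses the two scales into a single
    # argument, leaving compute() one positional argument short.
    gemm.compute(xq, wq, x_scale * w_scale)
except TypeError as e:
    print(e)  # ... missing 1 required positional argument: 'w_scale'

# Fixed call: pass the scales separately, matching the signature.
gemm.compute(xq, wq, x_scale, w_scale)
```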

Reviewed By: jwfromm

Differential Revision: D69272912

fbshipit-source-id: c184954b4d2d1543277a9e56ac899534597a56e6
YUNQIUGUO authored and facebook-github-bot committed Feb 7, 2025
1 parent 2cef43a commit a914871
Showing 1 changed file with 1 addition and 1 deletion.
fbgemm_gpu/experimental/gen_ai/bench/quantize_ops.py

```diff
@@ -719,7 +719,7 @@ def compute(self, xq, wq, x_scale, w_scale):

     def quantize_and_compute(self, x, w):
         xq, wq, x_scale, w_scale = self.quantize(x, w)
-        return self.compute(xq, wq, x_scale * w_scale)
+        return self.compute(xq, wq, x_scale, w_scale)

     @property
     def name(self) -> str:
```
