When I tested with the following snippet:
y and C produce somewhat similar results, yet y is always closer to the torch.mm result computed on inputs that were not cast to torch.float8_e4m3fn (C is numerically more precise on only 17-24% of the elements, not the expected 50%, compared to torch._scaled_mm).
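For readers who want to reproduce the percentage above, here is a minimal sketch of one way to compute it. The helper name fraction_closer and its argument names are hypothetical, not part of PyTorch or TK, and it assumes "better" means a strictly smaller absolute error against the reference, so exact ties count for neither side.

```python
import torch


def fraction_closer(candidate: torch.Tensor,
                    baseline: torch.Tensor,
                    reference: torch.Tensor) -> float:
    """Fraction of elements where `candidate` is strictly closer to
    `reference` than `baseline` is. Hypothetical helper for illustration."""
    # Compare in float32 so BF16/FP16 outputs are not rounded again.
    ref = reference.float()
    err_candidate = (candidate.float() - ref).abs()
    err_baseline = (baseline.float() - ref).abs()
    # Ties (equal error) are counted as "not closer".
    return (err_candidate < err_baseline).float().mean().item()
```

Under these assumptions, fraction_closer(C, y, ref), with ref being the torch.mm result on the uncast inputs, would be the 17-24% figure; two kernels that differ only by unbiased rounding noise should land near 0.5, minus whatever share of elements tie exactly.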
However, if I set
scale_a_inv_s.fill_(1)
scale_b_inv_s.fill_(2)
right after I cast A and B to float8_e4m3fn, the numerical accuracy matches that of torch._scaled_mm. To further illustrate, I used
and the results differ wildly between torch._scaled_mm and TK's version, while torch._scaled_mm still closely follows the BF16 computation results (obtained by casting the scale-adjusted FP8 tensors back to BF16).