While trying out INT8 mixed precision pretraining (#748) with torchtitan, I came across an issue: if the model is FSDP-sharded, `quantize_()` won't work. The fix would be to add extra logic to handle DTensor, similar to what FP8 does in `torchao/float8/float8_tensor.py`, lines 161 to 183 at `f5703b0`.
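For illustration, here is a minimal sketch of the DTensor-handling pattern the FP8 code uses, adapted to an INT8 quantization path. `quantize_weight` and `to_int8_quantized` are hypothetical stand-ins, not the actual torchao API; the point is only the unwrap/re-wrap structure around the local shard:

```python
# Sketch only: mirrors the "unwrap local tensor, quantize, re-wrap as
# DTensor" pattern from the FP8 code path. Names below are illustrative.
import torch
from torch.distributed.tensor import DTensor


def to_int8_quantized(t: torch.Tensor) -> torch.Tensor:
    # Toy symmetric per-tensor INT8 quantization, for illustration only
    # (a real implementation would keep the scale alongside the values).
    scale = t.abs().amax().clamp(min=1e-12) / 127.0
    return torch.round(t / scale).clamp(-128, 127).to(torch.int8)


def quantize_weight(hp_tensor: torch.Tensor) -> torch.Tensor:
    if isinstance(hp_tensor, DTensor):
        # Quantize the local shard, then re-wrap it with the same mesh and
        # placements so FSDP's sharding metadata is preserved.
        local_int8 = to_int8_quantized(hp_tensor.to_local())
        return DTensor.from_local(
            local_int8,
            hp_tensor.device_mesh,
            hp_tensor.placements,
            run_check=False,
            shape=hp_tensor.size(),
            stride=hp_tensor.stride(),
        )
    return to_int8_quantized(hp_tensor)
```

Keeping the global `shape`/`stride` on the re-wrapped DTensor matters when the quantized local shard's metadata differs from the high-precision original.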