
[Quantization + FSDP] Support quantize_() for DTensor #803

Open
gau-nernst opened this issue Sep 4, 2024 · 1 comment

@gau-nernst
Collaborator

While trying out INT8 mixed precision pretraining (#748) with torchtitan, I came across an issue: if the model is FSDP-sharded, quantize_() won't work. The fix would be to add extra logic to handle DTensor, similar to what the Float8 code already does (snippet below, with a rough sketch of the analogous quantize_() handling after it):

```python
if isinstance(bits_fp8, DTensor):
    assert isinstance(
        scale, DTensor
    ), "Expected Float8 scale to be a DTensor if bits_fp8 is a DTensor"
    bits_mesh = bits_fp8.device_mesh
    bits_placements = bits_fp8.placements
    local_bits = bits_fp8.to_local()
    local_scale = scale.to_local()
    inner_float8_tensor = Float8Tensor(
        local_bits,
        local_scale,
        tensor.dtype,
        linear_mm_config=linear_mm_config,
        gemm_input_role=gemm_input_role,
    )
    return DTensor.from_local(
        inner_float8_tensor,
        bits_mesh,
        bits_placements,
        run_check=False,
        shape=bits_fp8.size(),
        stride=bits_fp8.stride(),
    )
```

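For illustration only, here is a minimal sketch of what the analogous handling inside a quantize_() tensor transform could look like, assuming the same unwrap/rewrap pattern as the Float8 code above. `quantize_plain_tensor` and `quantize_maybe_dtensor` are hypothetical names standing in for whatever conversion quantize_() actually applies to a regular tensor, not torchao's API:

```python
# Hypothetical sketch, not torchao's implementation.
import torch
from torch.distributed._tensor import DTensor


def quantize_plain_tensor(t: torch.Tensor) -> torch.Tensor:
    # Placeholder for the real per-tensor quantization transform
    # (e.g. constructing the quantized tensor subclass).
    return t


def quantize_maybe_dtensor(param: torch.Tensor) -> torch.Tensor:
    if isinstance(param, DTensor):
        # Quantize only the local shard, then rewrap so FSDP still sees a
        # DTensor with the original mesh, placements, shape and stride.
        local_q = quantize_plain_tensor(param.to_local())
        return DTensor.from_local(
            local_q,
            param.device_mesh,
            param.placements,
            run_check=False,
            shape=param.size(),
            stride=param.stride(),
        )
    return quantize_plain_tensor(param)
```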
@msaroufim
Member

Yeah, this came up in some discussions with inference providers like SGLang as well.
