Update on "[2/x] clean up casting functions: delayed scaling"

Summary: Removes delayed scaling from `float8_tensor.py`. After this PR, the invariant is that everything in `float8_tensor.py` requires the scale to be calculated elsewhere. This moves the codebase towards a separation of concerns: calculating the scale (via various scaling strategies) is kept separate from creating an instance of `Float8Tensor`. Stateful delayed scaling is the reason this separation is needed.

Test Plan:

```
./test/test_everything.sh
```

Differential Revision: [D60291447](https://our.internmc.facebook.com/intern/diff/D60291447)

[ghstack-poisoned]
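To illustrate the separation described in the summary, here is a minimal pure-Python sketch. It is not the `float8_experimental` implementation: `DelayedScaler`, `amax_to_scale`, and `cast_with_scale` are hypothetical names, the fallback scale of 1.0 for an empty history is an assumption, and the cast only scales and saturates (a real cast would also round to float8). The point is only that the stateful part (the amax history) lives in the scaling strategy, while the cast takes a precomputed scale.

```python
from collections import deque

E4M3_MAX = 448.0  # largest finite value representable in float8 e4m3fn


def amax_to_scale(amax, fp8_max=E4M3_MAX, eps=1e-12):
    """Map an absolute-max value to a scale that fills the float8 range."""
    return fp8_max / max(amax, eps)


class DelayedScaler:
    """Hypothetical stateful strategy: scale comes from *past* amax values."""

    def __init__(self, history_len=4):
        self.amax_history = deque(maxlen=history_len)

    def next_scale(self, values):
        # Use the max over the recorded history; assume 1.0 before any history exists.
        scale = amax_to_scale(max(self.amax_history)) if self.amax_history else 1.0
        # Record the current amax so it influences later calls, not this one.
        self.amax_history.append(max(abs(v) for v in values))
        return scale


def cast_with_scale(values, scale, fp8_max=E4M3_MAX):
    """Stateless cast stand-in: requires the scale to be computed elsewhere."""
    return [min(max(v * scale, -fp8_max), fp8_max) for v in values]


scaler = DelayedScaler()
first_scale = scaler.next_scale([1.0, -2.0])   # no history yet
second_scale = scaler.next_scale([0.5])        # derived from the previous amax (2.0)
casted = cast_with_scale([1.0, -3.0], second_scale)
```

Because `cast_with_scale` holds no state, swapping in a different scaling strategy would only change how the scale is produced, not how the cast is called.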
pytorch-labs · Jul 26, 2024 · e7c0463
1 parent 09d770e
commit e7c0463
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/test/test_compile.py b/test/test_compile.py
@@ -20,9 +20,9 @@
     get_float8_layers,
     sync_float8_amax_and_scale_history,
 )
+from float8_experimental.float8_scaling_utils import cast_to_float8_delayed
 from float8_experimental.float8_tensor import LinearMMConfig
 from float8_experimental.float8_utils import e4m3_dtype
-from float8_experimental.float8_scaling_utils import cast_to_float8_delayed
 
 from torch._dynamo.test_case import TestCase as DynamoTestCase
 from torch._dynamo.testing import CompileCounterWithBackend