
[CPU] Add Clamp for FakeConvertDecomposition #28651

Open · wants to merge 1 commit into base: master
Conversation

xuchen-intel
Contributor

Details:

  • The ngraph FakeConvert layer applies a clamp for f8 types (f8e4m3 always clamps; f8e5m2 clamps only partially), while the ngraph Convert layer does not clamp at all. The idea is therefore to add a Clamp layer in FakeConvertDecomposition to ensure that the clamp behavior of FakeConvert is preserved for plugins after decomposition.
  • The ngraph reference emulate_f8e4m3_on_fp16 clamps both overflowed values and NaN (f8e4m3 has no INF in its specification). However, emulate_f8e5m2_on_fp16 appears to clamp only overflowed values (via the can_round flag), not INF. To align the behavior of f8e5m2 with f8e4m3, a clamp for INF is added to emulate_f8e5m2_on_fp16.
  • Test cases are added to reproduce the issue beforehand.
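For illustration, the intended decomposition can be sketched as a scalar function. This is a minimal sketch with assumed f8 maxima (448 for f8e4m3, 57344 for f8e5m2); the real transformation builds ov::op nodes, and `to_f8` here stands in for the down/up-convert pair:

```cpp
#include <algorithm>

// Scalar sketch of FakeConvertDecomposition with the Clamp this PR adds.
// Illustrative only: `to_f8` abstracts the f8 round-trip conversion.
float fake_convert_decomposed(float x, float scale, float shift,
                              float f8_max,            // e.g. 448.f (e4m3) or 57344.f (e5m2)
                              float (*to_f8)(float)) { // rounds onto the f8 grid
    float v = x * scale - shift;
    v = std::clamp(v, -f8_max, f8_max); // the Clamp layer added by this PR
    v = to_f8(v);                       // Convert itself does not clamp
    return (v + shift) / scale;         // inverse scale/shift
}
```

With an identity `to_f8`, an out-of-range input such as 100000 is clamped to the f8e5m2 maximum 57344 instead of overflowing to INF.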

Tickets:

@xuchen-intel xuchen-intel added the category: CPU OpenVINO CPU plugin label Jan 24, 2025
@xuchen-intel xuchen-intel added this to the 2025.1 milestone Jan 24, 2025
@xuchen-intel xuchen-intel requested review from a team as code owners January 24, 2025 03:22
@xuchen-intel xuchen-intel requested review from itikhono and removed request for a team January 24, 2025 03:22
@github-actions github-actions bot added category: Core OpenVINO Core (aka ngraph) category: IE Tests OpenVINO Test: plugins and common category: transformations OpenVINO Runtime library - Transformations labels Jan 24, 2025
```cpp
@@ -175,6 +175,9 @@ std::vector<std::string> disabledTestPatterns() {
    R"(.*ConvertCPULayerTest.*outFmts=(nhwc|nChw8c|nChw16c).*)",
    // Issue: MFDNN-12917. The oneDNN emitter of conversion from fp32 to fp8 has rounding issue.
    R"(.*ConvertCPULayerTest.*(\[1.1.1080.1920\]|\(2.17.5.4\))_.*_inputPRC=f32_targetPRC=f8e4m3_.*)",
    // Issue: 123320
```

@xuchen-intel xuchen-intel Jan 24, 2025


These bf16-related test cases for FakeConvert are skipped because of the following code block, which is already tracked by ticket 123320.

The code block here

```cpp
// todo: issue: 123320
if (!((inf_prc != config.end() && inf_prc->second == element::undefined)
      || (inf_prc == config.end() && exec_mode != config.end() && exec_mode->second == hint::ExecutionMode::ACCURACY))) {
    test->convert_precisions.insert({ov::element::bf16, ov::element::f32});
    test->convert_precisions.insert({ov::element::f16, ov::element::f32});
}
```

prevents the input precision of FakeConvert from being bf16 in the expected ngraph reference test. So the expected FakeConvert reference runs with fp32 input, while the actual FakeConvert (before decomposition) in the plugin runs with the bf16 input we set. For values at the edge of rounding, the gap between the expected and actual results is significant, even though both are correct.

Here is a case for the FakeConvert layer: data=3504, scale=0.546875, shift=0, destination_type=f8e5m2. The expected result is 3264 and the actual result is 3744, but both are correct.

Expected: bf16(3504) -> fp32(3504) -> *scale(0.546875) -> fp32(1916.25) -> fp16(1916) -> fp16(constrained_in_f8_range)(1792) -> fp32(1792) -> /scale(0.546875) -> fp32(3276.8) -> bf16(3264)

Actual: bf16(3504) -> *scale(0.546875) -> bf16(1920) -> fp16(1920) -> fp16(constrained_in_f8_range)(2048) -> bf16(2048) -> /scale(0.546875) -> bf16(3744)

```cpp
@@ -48,6 +48,7 @@ void emulate_f8e5m2_on_fp16(const float16* const arg_f, float16* out_f, size_t c
        val_bit_repr += (((rnmask > 0x0080) || (rnmask_tie == rne_tie)) << lshift);
    }
    val_bit_repr &= mask_mant; /* truncation */
    val_bit_repr -= (((val_bit_repr & 0x7F00) == fp16_inf) << lshift); /* clamp */
```
Contributor Author


AFAIK, the behavior of the master branch for converting different f32 values to f8e5m2 is as follows:

| Value type | Convert f32 to f16 | Convert f16 to f8e5m2 |
| --- | --- | --- |
| f16 overflowed value | INF | INF (this PR changes it to not-INF) |
| f8e5m2 overflowed value | not-INF | not-INF |
| f8e5m2 other value | not-INF | not-INF |

For overflowed values, if we apply a clamp in FakeConvertDecomposition we get not-INF; otherwise we get INF. We cannot get different behavior for "f16 overflowed value" and "f8e5m2 overflowed value".

So here I apply this revision to make "f16 overflowed value" also produce a not-INF result. The other reason is to align with the behavior of f8e4m3.
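A bit-level sketch of what the added line does. The constants are my reconstruction of the surrounding emulation code, not copied from it: f8e5m2 keeps the top two of fp16's ten mantissa bits, so lshift is 8 and one f8e5m2 ulp at the top of the range is 0x0100 in fp16 bits:

```cpp
#include <cstdint>

// Sketch of the INF clamp this PR adds to emulate_f8e5m2_on_fp16.
// Constants assumed to mirror the real code, not taken from it.
constexpr uint16_t fp16_inf  = 0x7C00; // fp16 INF: exponent all ones, mantissa 0
constexpr uint16_t mask_mant = 0xFF00; // sign + 5 exponent + 2 mantissa bits survive
constexpr int      lshift    = 8;      // fp16 mantissa bits dropped by f8e5m2

// Truncate an fp16 bit pattern onto the f8e5m2 grid, then clamp INF
// down by one ulp to 0x7B00, which is 57344 — the largest finite
// f8e5m2 value — preserving the sign bit.
uint16_t truncate_and_clamp(uint16_t val_bit_repr) {
    val_bit_repr &= mask_mant;                                         /* truncation */
    val_bit_repr -= (((val_bit_repr & 0x7F00) == fp16_inf) << lshift); /* clamp */
    return val_bit_repr;
}
```

The `& 0x7F00` strips the sign bit before comparing against the INF pattern, so +INF (0x7C00) becomes 0x7B00 (+57344) and -INF (0xFC00) becomes 0xFB00 (-57344), while already-finite patterns pass through unchanged.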

@dmitry-gorokhov @mitruska Please feel free to comment. Thanks a lot!

@xuchen-intel xuchen-intel force-pushed the fix/fake_convert_clamp branch from 876d289 to a6b2559 Compare January 24, 2025 05:26