Add ONNX parsing for SimplifiedLayerNormalization #3129
Conversation
Codecov Report
Attention: Patch coverage is

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           develop    #3129    +/-   ##
==========================================
  Coverage    92.26%   92.26%
==========================================
  Files          499      500       +1
  Lines        20020    20048      +28
==========================================
+ Hits         18471    18497      +26
- Misses        1549     1551       +2
```

☔ View full report in Codecov by Sentry.
Check results before merge 🔆
🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output
Where is the spec for this operator?
Looks fine, minor comments. I would like to see the equation it's supposed to be computing.
```cpp
auto rms  = info.add_instruction(make_op("reduce_mean", {{"axes", {axis}}}), x_sq);
auto mean = rms;
epsilon =
    (x_dtype == migraphx::shape::half_type and std::abs(epsilon) < 1e-7) ? 1e-7 : epsilon;
```
Why are we limiting the epsilon for the half type? It looks like a user input.
That is how we handle epsilon in our regular LayerNorm parser, so I did the same here.
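For context, fp16's smallest positive subnormal is $2^{-24} \approx 5.96 \times 10^{-8}$, so an epsilon below roughly 1e-7 underflows to zero once stored in half and no longer guards the rsqrt. A minimal sketch illustrating the underflow, assuming `migraphx::half` follows half_float rounding semantics:

```cpp
#include <iostream>
#include <migraphx/half.hpp>

int main()
{
    migraphx::half tiny{1e-8f}; // below 2^-24: rounds to 0 in fp16
    migraphx::half eps{1e-7f};  // survives, as an fp16 subnormal, after rounding
    std::cout << float(tiny) << " " << float(eps) << "\n"; // prints: 0 1.19209e-07
}
```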
There isn't actually one for SimplifiedLayerNormalization, but this is the spec for SkipSimplifiedLayerNormalization, which is just Add + SLN. That spec does include an optional bias input, but neither of the ORT implementations utilizes it, so I omitted it from ours.
The equation is the same as RMS LayerNorm.
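For reference, the standard RMSNorm formula, with $\gamma$ denoting the `scale` input and $n$ the size of the normalization axis:

$$y = \frac{x}{\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2 + \epsilon}} \cdot \gamma$$

which corresponds to the `reduce_mean` over $x^2$, the `rsqrt`, and the two `mul` instructions in the diff.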
```cpp
std::vector<half> x{half{0.8},  half{-0.5}, half{0.0},  half{1.0},
                    half{0.5},  half{0.2},  half{0.3},  half{-0.6},
                    half{10.0}, half{-1.0}, half{0.0},  half{1.0},
                    half{1.2},  half{3.2},  half{-4.1}, half{5.3}};
```
You shouldn't require casting all the elements to half. Same applies to all other places.
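One way to follow this suggestion (a sketch, not the merged code): write the literals once as `float` and let the vector's range constructor do the conversion.

```cpp
// Hypothetical alternative: no per-element half{...} wrapping.
std::vector<float> data{0.8f,  -0.5f, 0.0f, 1.0f, 0.5f, 0.2f, 0.3f,  -0.6f,
                        10.0f, -1.0f, 0.0f, 1.0f, 1.2f, 3.2f, -4.1f, 5.3f};
std::vector<half> x(data.begin(), data.end()); // each element direct-initialized as half
```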
```cpp
auto result = info.add_common_op("mul", x, rrms);
result      = info.add_common_op("mul", result, scale);

return {result, mean, rrms};
```
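Pieced together, the lowering implied by these snippets is roughly the following sketch; the `x_sq` and `eps` steps and the literal construction are assumptions filled in from the equation above, not quoted from the PR:

```cpp
// Sketch of the SimplifiedLayerNormalization lowering:
// y = x * rsqrt(mean(x^2, axis) + epsilon) * scale
auto x_sq = info.add_common_op("mul", x, x);               // x^2
auto rms  = info.add_instruction(
    make_op("reduce_mean", {{"axes", {axis}}}), x_sq);     // mean(x^2)
auto mean = rms;
auto eps  = info.add_literal(
    migraphx::literal{migraphx::shape{x_dtype}, {epsilon}}); // assumed literal form
rms       = info.add_common_op("add", rms, eps);           // mean(x^2) + eps
auto rrms = info.add_instruction(make_op("rsqrt"), rms);   // 1 / sqrt(...)
auto result = info.add_common_op("mul", x, rrms);
result      = info.add_common_op("mul", result, scale);
return {result, mean, rrms};
```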
Is this being matched with the LayerNorm kernel on the GPU target?
* Add simplified_layer_normalization
* Add simplified_layer_normalization
No description provided.