-
Notifications
You must be signed in to change notification settings - Fork 90
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update finalize function for hipBLASLt to find tuned solution only when solution index is 0. #3593
Conversation
Windows build failures seem unrelated. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #3593 +/- ##
========================================
Coverage 92.13% 92.13%
========================================
Files 512 512
Lines 21424 21424
========================================
Hits 19740 19740
Misses 1684 1684 ☔ View full report in Codecov by Sentry. |
src/targets/gpu/hip_gemm_impl.cpp
Outdated
auto gemm_item = hip_gemm_impl(output_shape, input_shapes, alpha, beta); | ||
int32_t solution = gemm_item.tune(ctx, input_shapes); | ||
hip_gemm_save_solution(ctx, output_shape, input_shapes, solution_idx); | ||
int solution; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this variable and use solution_idx
variable instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
4f8d530
to
100da97
Compare
@@ -76,14 +76,17 @@ inline auto hipblaslt_invoke(F f, Ts... xs) | |||
} | |||
|
|||
template <class F, class Pack, class... Ts> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
template <class F, class Pack, class... Ts> | |
// Invoke a Hipblaslt call. If used to validate a call, set fatal_error = false to prevent | |
// throwing an exception on failure. | |
template <class F, class Pack, class... Ts> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, I have added the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One suggested comment add, but looks OK to me now.
…s in addition to throwing an exception
…owing an exception
eec5c0c
to
26a8d3d
Compare
This build is not recommended to merge 🔴 |
🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good. I'm assuming this is faster to compile because we're not running tune every time? Also, does a solution index of zero always mean it needs tuning?
Yes, this should be faster because previously we were running tuning regardless of the solution index provided. Now we only do tuning when the solution index provided is zero.
We do tuning if the solution index provided is zero. However, we do not try to tune if the non-zero solution index is invalidated by the |
Currently, finalize function runs tuning for all solution indices.
This PR modifies the function to run tuning only when
solution index is '0'. For all other solution indices, it runs validate.
This PR also fixes a bug in functionality of hipBLASLt validate function.