[AMD] Fix uniform offset computation #4678
Merged
There was a bug in how we were splitting the uniform/non-uniform offset contribution for addptr.
Consider this IR (where U is a uniform value, e.g., coming from a `splat`, and NU is a non-uniform value, e.g., coming from a `make_range`).

It would have been rewritten to
The main issue here is that `%b`'s operand #0 has changed, i.e., the scalar contribution has been removed. This is fine if `addptr` is the only operation that uses `%b`. If any other operation uses `%b`, it needs the "old" `%b`.

The solution is to accumulate both the uniform and non-uniform contributions in separate IR and leave the original `%b` untouched. Possible duplications will be removed by the canonicalizer.

Doing things this way, I could also generalize the pass to all expressions of the form `(U+NU)*(U+NU)`.

I tried enabling this pass and running the whole suite, and it is working fine.
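
To illustrate the idea outside of MLIR, here is a minimal Python sketch of a non-destructive uniform/non-uniform split. It is not the pass's actual code: `Leaf`, `BinOp`, `split`, and the other names are hypothetical, and the toy expression tree stands in for the real IR. The point it shows is that the two contributions are accumulated as new expressions while the original nodes (the analogue of `%b`) stay untouched, and that a product of the form `(U+NU)*(U+NU)` expands into one purely uniform term plus the remaining terms.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    """Hypothetical stand-in for an IR value."""
    name: str
    uniform: bool  # True: splat-like scalar; False: make_range-like tensor


@dataclass(frozen=True)
class BinOp:
    """Hypothetical stand-in for an add or mul operation."""
    op: str  # "+" or "*"
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Leaf, BinOp]
Part = Optional[Expr]  # None means "no contribution" (i.e., zero)


def add(a: Part, b: Part) -> Part:
    """Build a + b, dropping empty contributions."""
    if a is None:
        return b
    if b is None:
        return a
    return BinOp("+", a, b)


def mul(a: Part, b: Part) -> Part:
    """Build a * b; a product with an empty contribution is empty."""
    if a is None or b is None:
        return None
    return BinOp("*", a, b)


def split(expr: Expr) -> Tuple[Part, Part]:
    """Return (uniform, non_uniform) with expr == uniform + non_uniform.

    New nodes are built for both parts; `expr` itself is never modified.
    """
    if isinstance(expr, Leaf):
        return (expr, None) if expr.uniform else (None, expr)
    lu, ln = split(expr.lhs)
    ru, rn = split(expr.rhs)
    if expr.op == "+":
        return add(lu, ru), add(ln, rn)
    if expr.op == "*":
        # (lu + ln) * (ru + rn) = lu*ru + (lu*rn + ln*ru + ln*rn)
        cross = add(mul(lu, rn), mul(ln, ru))
        return mul(lu, ru), add(cross, mul(ln, rn))
    raise ValueError(f"unsupported op: {expr.op}")


def evaluate(expr: Part, env: Dict[str, int]) -> int:
    """Evaluate an expression with integer leaf values (None evaluates to 0)."""
    if expr is None:
        return 0
    if isinstance(expr, Leaf):
        return env[expr.name]
    left, right = evaluate(expr.lhs, env), evaluate(expr.rhs, env)
    return left + right if expr.op == "+" else left * right


def is_uniform(expr: Part) -> bool:
    """True if every leaf is uniform (an empty part is trivially uniform)."""
    if expr is None:
        return True
    if isinstance(expr, Leaf):
        return expr.uniform
    return is_uniform(expr.lhs) and is_uniform(expr.rhs)


# b plays the role of %b: it may have users other than the offset.
b = BinOp("+", Leaf("u1", True), Leaf("nu1", False))
c = BinOp("+", Leaf("u2", True), Leaf("nu2", False))
offset = BinOp("*", b, c)

uniform_part, non_uniform_part = split(offset)
env = {"u1": 3, "nu1": 5, "u2": 7, "nu2": 11}
```

Because the nodes are frozen and `split` only builds new ones, any other user of `b` still sees the full `u1 + nu1`. In the real pass, the duplicated subexpressions this creates would be left for the canonicalizer to clean up, as described above.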