-
Notifications
You must be signed in to change notification settings - Fork 19
[AIE2P] Fix align requirement for wide vector load/store and split 32 aligned load/store instead of scalarizing them #321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Nice, I think we need the same for AIE2. |
AIE2 doesn't have 512-bit load/store instructions, so those will be broken down into 256-bit accesses which only require 32-byte alignment. I think AIE2 is fine |
Nice, I think we need the same for AIE2.
The problem there is a bit different. For AIE2 we are considering alignments of 32 bits (probably a misunderstanding of how alignment is expressed during legalization.). |
1f59bfd
to
eae60df
Compare
This now splits the large loads/stores when the alignment is 256 bits. |
eae60df
to
4c82b3f
Compare
4c82b3f
to
d9810c1
Compare
; AIE2P: liveins: $p0 | ||
; AIE2P-NEXT: {{ $}} | ||
; AIE2P-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $p0 | ||
; AIE2P-NEXT: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p0) :: (load (<32 x s8>)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume we don't need the alignment anymore, but it seems wrong that it disappears. It's still 32 byte aligned, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it always disappears post legalizer, even if we just use .legal()
, not sure why
const uint64_t Alignment = Query.MMODescrs[0].AlignInBits; | ||
const unsigned SplitFactor = MemoryType.getSizeInBits() / Alignment; | ||
return std::make_pair( | ||
0, LLT::fixed_vector(MemoryType.getNumElements() / SplitFactor, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: return std::make_pair(0, MemoryType.divide(SplitFactor));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that, if some unaligned scalar load appears (let's say s512), divide will give us a valid scalar type, not replacing the type by a vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point but I don't think fewerElements
works with scalars. I can assert that we have a vector here. Otherwise, we will simply scalarize as we already do and currently, it will fail during the process because we can't legalize the resulting merge and unmerge opcodes.
[=](const LegalityQuery &Query) { | ||
const LLT &MemoryType = Query.MMODescrs[0].MemoryTy; | ||
const uint64_t Alignment = Query.MMODescrs[0].AlignInBits; | ||
if (MemoryType.getSizeInBits() >= 512 && Alignment == 256) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: return (MemoryType.getSizeInBits() >= 512 && Alignment == 256);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also consider MemoryType.getSizeInBits() % 256 == 0
, or at least assert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't assert because we have other types above and it will break for them but we can return false instead
d9810c1
to
a4bb6fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
No description provided.