-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gpu: misc fixes #5176
gpu: misc fixes #5176
Conversation
test:mpich/ch4/most |
test:mpich/ch4/ofi |
src/include/mpir_gpu_util.h
Outdated
MPI_Aint extent, true_lb, true_extent; | ||
MPIR_Datatype_get_extent_macro(datatype, extent); | ||
MPIR_Type_get_true_extent_impl(datatype, &true_lb, &true_extent); | ||
// extent = MPL_MAX(extent, true_extent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat surprised this works without this line. In any case, if it's not needed, we should remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I meant to remove it but have forgotten. The data is packed by extent
not true_extent
, so I believe take true_extent * count
is wrong or at least imprecise. true_extent
should only be used to adjust the final boundary.
src/include/mpir_gpu_util.h
Outdated
MPI_Aint size = extent * count - extent + true_extent; | ||
if (true_lb > 0) { | ||
size += true_lb; | ||
} else if (true_lb < 0) { | ||
size += (-true_lb); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. I didn't look closely at these manipulations down here. Now I see where the size is adjusted for true_extent
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, the adjustment based on true_lb
is not necessary -- I meant to delete them. The new code returns a pointer shifted by the true_lb
so the exact size should work now.
ddc82bb
to
ab9af0c
Compare
MPIR_Assert(host_buf); | ||
|
||
host_buf = (char *) host_buf - shift; | ||
return host_buf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@raffenet Thought about it a bit more. The previous fix didn't account for the case when extent
is negative, which will put the extra count data before the first data, thus we may need to allocate extra size to account for.
Could you review this commit again? Meanwhile, I'll rebase and test it in PR #5000 -- although I don't think DTPool will generate any datatype with negative extent.
test:mpich/ch4/ofi |
We can't directly rdma into a gpu buffer yet, force it to go through unpacking.
Provide higher level common utilities. For one, the code is more readable with higher-level names focusing on the semantics rather than details. For two, it allows easier modification for alternate mechanism.
The old allocates host buffer without consider potential non-zero lower bound, resulting out-of-bound memory overwriting. Use the higher level wrappers provided in the last commit to fix the issue and provider cleaner semantics.
void *host_buf = MPL_malloc(size, MPL_MEM_OTHER); | ||
MPIR_Assert(host_buf); | ||
|
||
host_buf = (char *) host_buf - shift; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So true_lb
is always <= 0 when count == 1
or extent >= 0
? Am I reading this right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. true_lb
works whether it's positive or negative, it is an adjustment from the buffer pointer and the arithmetic will work for either positive or negative. However, the true_lb
returned from MPIR_Type_get_true_extent_impl
is the true_lb
for a single count datatype. When extent
is positive, the true_lb
remain the same for multiple count. so no adjustment is needed. However, when extent
is negative, the true_lb
with multiple count will be further extended into the negative direction, thus requires further adjustment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm trying to understand is we are returning an allocated buffer, but with the address potentially shifted outside of the malloc'd region in the case of positive true_lb
. Wouldn't writing to or dereferencing that address lead to a potential crash? How do we guard against these scenarios? Is it because MPIR_Localcopy
will only be used with a datatype that would not do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't writing to or dereferencing that address lead to a potential crash?
The dataype packing/unpacking routine should only access the actual data location specified by the MPI datatype. Only a bug will result in a wrong access.
PS: think about MPI_BOTTOM
, it is okay as long as the datatype
is correctly constructed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, only accessing this with a datatype-aware copy routine makes sense. I think it should be OK. I'd still like to see it go thru Jenkins again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code has already passed Jenkins both here #5176 (comment) and in the gpu testing #5000 (comment) . The last push is only a rebase with PR #5005
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per my previous comments, I am going to go ahead merge this, then get #5000 rebased. Any issues should pop up in that PR anyway.
Pull Request Description
Some fixes that breaks the GPU path.
It is tested in #5000.
Author Checklist
Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
Commits are self-contained and do not do two things at once.
Commit message is of the form:
module: short description
Commit message explains what's in the commit.
Whitespace checker. Warnings test. Additional tests via comments.
For non-Argonne authors, check contribution agreement.
If necessary, request an explicit comment from your companies PR approval manager.