gpu: misc fixes #5176

hzhou · 2021-03-26T21:41:38Z

Pull Request Description

Miscellaneous fixes for bugs that break the GPU path.

It is tested in #5000.

Author Checklist

Provide Description
Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
Commits Follow Good Practice
Commits are self-contained and do not do two things at once.
Commit message is of the form: module: short description
Commit message explains what's in the commit.
Passes All Tests
Whitespace checker. Warnings test. Additional tests via comments.
Contribution Agreement
For non-Argonne authors, check contribution agreement.
If necessary, request an explicit comment from your company's PR approval manager.

hzhou · 2021-03-26T21:50:04Z

test:mpich/ch4/most

hzhou · 2021-03-27T03:59:54Z

test:mpich/ch4/ofi

raffenet · 2021-03-29T14:24:21Z

src/include/mpir_gpu_util.h

+        MPI_Aint extent, true_lb, true_extent;
+        MPIR_Datatype_get_extent_macro(datatype, extent);
+        MPIR_Type_get_true_extent_impl(datatype, &true_lb, &true_extent);
+        // extent = MPL_MAX(extent, true_extent);


I'm somewhat surprised this works without this line. In any case, if it's not needed, we should remove it.

Yeah, I meant to remove it but forgot. The data is packed by extent, not true_extent, so I believe taking true_extent * count is wrong, or at least imprecise. true_extent should only be used to adjust the final boundary.

raffenet · 2021-03-29T14:43:44Z

src/include/mpir_gpu_util.h

+        MPI_Aint size = extent * count - extent + true_extent;
+        if (true_lb > 0) {
+            size += true_lb;
+        } else if (true_lb < 0) {
+            size += (-true_lb);
+        }


Oh, I see. I didn't look closely at these manipulations down here. Now I see where the size is adjusted for true_extent.

Oops, the adjustment based on true_lb is not necessary -- I meant to delete it. The new code returns a pointer shifted by true_lb, so the exact size should work now.

hzhou · 2021-03-29T16:09:03Z

src/include/mpir_gpu_util.h

+        MPIR_Assert(host_buf);
+
+        host_buf = (char *) host_buf - shift;
+        return host_buf;


@raffenet I thought about it a bit more. The previous fix didn't account for the case where extent is negative, which places the data for the additional counts before the first element, so we may need to allocate extra space to account for that.

Could you review this commit again? Meanwhile, I'll rebase and test it in PR #5000 -- although I don't think DTPool will generate any datatype with negative extent.

hzhou · 2021-03-29T16:18:20Z

test:mpich/ch4/ofi
✔️

raffenet · 2021-03-29T18:44:47Z

src/include/mpir_gpu_util.h

+        void *host_buf = MPL_malloc(size, MPL_MEM_OTHER);
+        MPIR_Assert(host_buf);
+
+        host_buf = (char *) host_buf - shift;


So true_lb is always <= 0 when count == 1 or extent >= 0? Am I reading this right?

No. true_lb works whether it is positive or negative: it is an offset from the buffer pointer, and the arithmetic works in either direction. However, the true_lb returned from MPIR_Type_get_true_extent_impl is the true_lb for a single count of the datatype. When extent is positive, true_lb remains the same for multiple counts, so no adjustment is needed. When extent is negative, the true_lb for multiple counts extends further in the negative direction and thus requires further adjustment.

What I'm trying to understand is we are returning an allocated buffer, but with the address potentially shifted outside of the malloc'd region in the case of positive true_lb. Wouldn't writing to or dereferencing that address lead to a potential crash? How do we guard against these scenarios? Is it because MPIR_Localcopy will only be used with a datatype that would not do this?

Wouldn't writing to or dereferencing that address lead to a potential crash?

The datatype packing/unpacking routines should only access the actual data locations specified by the MPI datatype. Only a bug would result in a wrong access.

PS: think about MPI_BOTTOM; it is okay as long as the datatype is correctly constructed.

OK, only accessing this with a datatype-aware copy routine makes sense. I think it should be OK. I'd still like to see it go thru Jenkins again.

The code has already passed Jenkins both here, #5176 (comment), and in the GPU testing, #5000 (comment). The last push is only a rebase onto PR #5005.

Per my previous comments, I am going to go ahead and merge this, then get #5000 rebased. Any issues should pop up in that PR anyway.

hzhou requested a review from raffenet March 26, 2021 21:49

hzhou mentioned this pull request Mar 27, 2021

bug/jenkins: timeout with rma/win_dynamic_rma_flush_get_collattach #4976

Closed

raffenet approved these changes Mar 29, 2021

raffenet reviewed Mar 29, 2021

hzhou force-pushed the 2103_gpu_fix branch 2 times, most recently from ddc82bb to ab9af0c Compare March 29, 2021 16:05

hzhou commented Mar 29, 2021

hzhou added 4 commits March 29, 2021 13:26

ch4/ofi: use do_long_am_recv_unpack for gpu recv buffer

f35b82a

We can't RDMA directly into a GPU buffer yet, so force the receive to go through unpacking.

mpir/gpu: add gpu/host buffer swapping utilities

5e07de3

Provide higher-level common utilities. First, the code is more readable with higher-level names that focus on the semantics rather than the details. Second, it makes it easier to switch to an alternate mechanism.

ch4: use MPIR_gpu_host_swap and fix memory bug

d5357df

The old code allocated the host buffer without considering a potential non-zero lower bound, resulting in out-of-bounds memory writes. Use the higher-level wrappers provided in the last commit to fix the issue and provide cleaner semantics.

coll: use MPIR_gpu_host_swap and fix memory bug

90f2dbe

hzhou force-pushed the 2103_gpu_fix branch from ab9af0c to 90f2dbe Compare March 29, 2021 18:27

hzhou requested a review from raffenet March 29, 2021 18:27

raffenet reviewed Mar 29, 2021

raffenet approved these changes Mar 29, 2021

hzhou merged commit 4c1b448 into pmodels:main Mar 29, 2021

hzhou deleted the 2103_gpu_fix branch March 29, 2021 19:24
