Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCT/IB/MD: retry memory registration with reduced access flags on failure #10341

Open
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

amastbaum
Copy link
Collaborator

@amastbaum amastbaum commented Dec 1, 2024

What?

Enable memory registration for read-only addresses.

Why?

A customer reported an issue when attempting to send read-only memory using RDMA. Details can be found in the related discussion: https://github.com/openucx/ucx/issues/10085#issuecomment-2377368540.

@amastbaum amastbaum added the WIP-DNM Work in progress / Do not review label Dec 1, 2024
@amastbaum amastbaum added Ready for Review and removed WIP-DNM Work in progress / Do not review labels Dec 2, 2024
@amastbaum amastbaum requested a review from gleon99 December 2, 2024 11:23
@amastbaum amastbaum marked this pull request as ready for review December 2, 2024 11:24
@@ -49,7 +49,8 @@ ucp_proto_rndv_rts_request_init(ucp_request_t *req)
status = ucp_datatype_iter_mem_reg(ep->worker->context,
&req->send.state.dt_iter,
rpriv->md_map,
UCT_MD_MEM_ACCESS_RMA |
UCT_MD_MEM_ACCESS_REMOTE_GET |
UCT_MD_MEM_ACCESS_LOCAL_READ |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just the most basic option so I thought I can keep mention it as well, but on the other hand it doesn't matter as local read is enabled by default in ibv_reg_mr (there is no such flag in verbs.h because it's always enabled)

md->super.relaxed_order,
1, flags) &
access_mask;
int attempt_cnt = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid abbreviations (cnt)

@@ -626,9 +627,39 @@ ucs_status_t uct_ib_memh_alloc(uct_ib_md_t *md, size_t length,
return UCS_OK;
}

uint64_t uct_ib_memh_access_flags(uct_ib_mem_t *memh, int relaxed_order)
uint64_t uct_flags_to_ibv_mem_access_flags(uint64_t uct_flags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uct_ib_..

@@ -460,7 +460,7 @@ uct_ib_md_handle_mr_list_mt(uct_ib_md_t *md, void *address, size_t length,
ucs_status_t uct_ib_reg_mr(uct_ib_md_t *md, void *address, size_t length,
const uct_md_mem_reg_params_t *params,
uint64_t access_flags, struct ibv_dm *dm,
struct ibv_mr **mr_p)
struct ibv_mr **mr_p, int first_attempt)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first_attempt -> ignore_errors or smth similar.
This func doesn't care about "attempts" and their number, the meaning of this flag here is only to "force" ignore errors.

}

uint64_t uct_ib_memh_access_flags(uct_ib_mem_t *memh, int relaxed_order,
int first_attempt, uint64_t uct_flags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well, I'd rename first_attempt to its real meaning in this context.
Maybe smth like use_uct_flags?

access_flags = uct_ib_memh_access_flags(&memh->super, md->relaxed_order,
attempt_cnt == 0, uct_flags);

status = uct_ib_reg_mr(md, address, length, params, access_flags, NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Align =

NULL, &memh->mrs[mr_type].super.ib,
attempt_cnt == 0);
attempt_cnt++;
access_flags = uct_ib_memh_access_flags(&memh->super,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First reg_mr, then access_flags - is that intentional?
This line always has attempt_cnt >= 1...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's intentional, though it might be written better..
access_flags is first calculated at the top of the function because it's used for another function, and only then it is used in the do while loop

@amastbaum amastbaum requested a review from gleon99 December 3, 2024 14:02
@@ -626,9 +627,35 @@ ucs_status_t uct_ib_memh_alloc(uct_ib_md_t *md, size_t length,
return UCS_OK;
}

uint64_t uct_ib_memh_access_flags(uct_ib_mem_t *memh, int relaxed_order)
uint64_t uct_ib_flags_to_ibv_mem_access_flags(uint64_t uct_flags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static

@@ -309,7 +309,7 @@ void *uct_ib_md_mem_handle_thread_func(void *arg)
while (ctx->length > 0) {
if (ctx->params != NULL) {
status = uct_ib_reg_mr(ctx->md, ctx->address, length, ctx->params,
ctx->access_flags, NULL, &ctx->mrs[mr_idx]);
ctx->access_flags, NULL, &ctx->mrs[mr_idx], 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why hide_errors is true here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yosefe Oh, I actually wanted to put here a value that doesn't change the previous behavior and put 1 by mistake. I'll also change it in the other places.

Comment on lines 691 to 699
memh = ucs_derived_of(ib_memh, uct_ib_verbs_mem_t);

do {
access_flags = uct_ib_memh_access_flags(&memh->super, md->relaxed_order,
attempt_count > 0, uct_flags);

status = uct_ib_reg_mr(md, address, length, params, access_flags,
NULL, &mr_default, attempt_count == 0);
attempt_count++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This retry should happen in UCP layer, otherwise IMO it can cause issues with UCP rcache region merges

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yosefe Where are the rcache region merges triggered from?
Do you think we should still try to register with full permissions and only then with specific ones?

@amastbaum amastbaum force-pushed the support_mem_reg_of_a_read_only_address branch from 364f3cf to 251cd9a Compare January 2, 2025 13:05
@@ -462,7 +462,7 @@ static void ucp_memh_cleanup(ucp_context_h context, ucp_mem_h memh)
}

static ucs_status_t ucp_memh_register_gva(ucp_context_h context, ucp_mem_h memh,
ucp_md_map_t md_map)
ucp_md_map_t md_map, unsigned uct_flags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand it's temp, but, just in case, it would be good to replace gva_mr with some structure and keep registrations for different access flags there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants