Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCT/IB/MD: retry memory registration with reduced access flags on failure #10341

Open
wants to merge 26 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 4 commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
e47f0ff
UCT/IB/MD: retry memory registration with reduced access flags on fai…
amastbaum Dec 1, 2024
0ed3576
UCT/IB/MD: add myself to AUTHORS file
amastbaum Dec 2, 2024
5b5a4ba
UCT/IB/MD: added some changes
amastbaum Dec 2, 2024
ebd6386
UCT/IB/MD: set mem reg flags to read-only in proto rndv rts
amastbaum Dec 2, 2024
e2564b6
UCT/IB/MD: CR fixes
amastbaum Dec 3, 2024
bcf4d11
UCT/IB/MD: CR fixes 2
amastbaum Dec 3, 2024
7f0ad47
UCT/IB/MD: add ucp_mem_map params->prot to uct_flags translation
amastbaum Dec 8, 2024
0be06fc
Merge branch 'master' into support_mem_reg_of_a_read_only_address
amastbaum Dec 10, 2024
547432e
UCT/IB/MD: CR fixes
amastbaum Dec 11, 2024
f45911c
UCT/IB/MD: use only ucp defined flags in ib mem reg
amastbaum Dec 16, 2024
fdaa23a
UCT/IB/MD: revert changes in UCP
amastbaum Dec 16, 2024
a25d39b
Merge branch 'master' into support_mem_reg_of_a_read_only_address
amastbaum Dec 30, 2024
251cd9a
UCT/IB/MD: pass uct_flags to gva mem_reg
amastbaum Jan 1, 2025
2dc3f68
UCT/IB/MD: set uct_flags to ALL when calling ucp_datatype_iter_mem_reg
amastbaum Jan 6, 2025
26e991e
UCT/IB/MD: change mem_reg permissions in rndv_rtr to ALL
amastbaum Jan 6, 2025
bc3cf3c
UCT/IB/MD: change libperf memory registration permissions to ALL
amastbaum Jan 6, 2025
d2b1b60
UCT/IB/MD: add ucp_mem_map params->prot to uct_flags translation
amastbaum Jan 7, 2025
8aa1b4e
UCT/IB/MD: add full permissions to ucp_test mem_map
amastbaum Jan 7, 2025
0ad3eb4
UCT/IB/MD: decoupled atomic permission flags assignment from rma flags
amastbaum Jan 9, 2025
22ec2c2
UCT/IB/MD: added memory protection flags to map_buffer test function
amastbaum Jan 9, 2025
256bd62
UCT/IB/MD: specified memory protection flags in test_ucp_mem_type and…
amastbaum Jan 9, 2025
c6f8cbe
Merge branch 'master' into support_mem_reg_of_a_read_only_address
amastbaum Jan 9, 2025
e3aa7ed
UCT/IB/MD: don't fail in dereg_invalidate_rkey_check if the requested…
amastbaum Jan 13, 2025
a6a700e
UCT/IB/MD: revert invalidate_rkey_check change - return 'unsupported'…
amastbaum Jan 15, 2025
bae0a11
UCT/IB/MD: don't include atomic mem access flags in rndv memory regis…
amastbaum Jan 15, 2025
399eb3a
UCT/IB/MD: support both read-only and r/w instances of gva mr
amastbaum Jan 26, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions AUTHORS
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ Alex Margolin <[email protected]>
Alex Mikheev <[email protected]>
Alexey Rivkin <[email protected]>
Alina Sklarevich <[email protected]>
Alma Mastbaum <[email protected]>
Anatoly Vildemanov <[email protected]>
Andrey Maslennikov <[email protected]>
Artem Polyakov <[email protected]>
Expand Down
3 changes: 2 additions & 1 deletion src/ucp/rndv/proto_rndv.inl
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,8 @@ ucp_proto_rndv_rts_request_init(ucp_request_t *req)
status = ucp_datatype_iter_mem_reg(ep->worker->context,
&req->send.state.dt_iter,
rpriv->md_map,
UCT_MD_MEM_ACCESS_RMA |
UCT_MD_MEM_ACCESS_REMOTE_GET |
UCT_MD_MEM_ACCESS_LOCAL_READ |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just the most basic option so I thought I can keep mention it as well, but on the other hand it doesn't matter as local read is enabled by default in ibv_reg_mr (there is no such flag in verbs.h because it's always enabled)

UCT_MD_MEM_FLAG_HIDE_ERRORS,
UCP_DT_MASK_ALL);
if (status != UCS_OK) {
Expand Down
66 changes: 52 additions & 14 deletions src/uct/ib/base/ib_md.c
Original file line number Diff line number Diff line change
Expand Up @@ -309,7 +309,7 @@ void *uct_ib_md_mem_handle_thread_func(void *arg)
while (ctx->length > 0) {
if (ctx->params != NULL) {
status = uct_ib_reg_mr(ctx->md, ctx->address, length, ctx->params,
ctx->access_flags, NULL, &ctx->mrs[mr_idx]);
ctx->access_flags, NULL, &ctx->mrs[mr_idx], 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why hide_errors is true here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yosefe Oh, I actually wanted to put here a value that doesn't change the previous behavior and put 1 by mistake. I'll also change it in the other places.

if (status != UCS_OK) {
goto err_dereg;
}
Expand Down Expand Up @@ -460,7 +460,7 @@ uct_ib_md_handle_mr_list_mt(uct_ib_md_t *md, void *address, size_t length,
ucs_status_t uct_ib_reg_mr(uct_ib_md_t *md, void *address, size_t length,
const uct_md_mem_reg_params_t *params,
uint64_t access_flags, struct ibv_dm *dm,
struct ibv_mr **mr_p)
struct ibv_mr **mr_p, int first_attempt)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first_attempt -> ignore_errors or smth similar.
This func doesn't care about "attempts" and their number, the meaning of this flag here is only to "force" ignore errors.

{
ucs_time_t UCS_V_UNUSED start_time = ucs_get_time();
unsigned long retry = 0;
Expand Down Expand Up @@ -507,7 +507,8 @@ ucs_status_t uct_ib_reg_mr(uct_ib_md_t *md, void *address, size_t length,
if (mr == NULL) {
uct_ib_md_print_mem_reg_err_msg(title, address, length, access_flags,
errno,
flags & UCT_MD_MEM_FLAG_HIDE_ERRORS);
(flags & UCT_MD_MEM_FLAG_HIDE_ERRORS) ||
first_attempt);
return UCS_ERR_IO_ERROR;
}

Expand Down Expand Up @@ -626,9 +627,39 @@ ucs_status_t uct_ib_memh_alloc(uct_ib_md_t *md, size_t length,
return UCS_OK;
}

uint64_t uct_ib_memh_access_flags(uct_ib_mem_t *memh, int relaxed_order)
uint64_t uct_flags_to_ibv_mem_access_flags(uint64_t uct_flags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uct_ib_..

{
uint64_t access_flags = UCT_IB_MEM_ACCESS_FLAGS;
uint64_t ibv_flags = 0;

if (uct_flags & UCT_MD_MEM_ACCESS_LOCAL_WRITE) {
ibv_flags |= IBV_ACCESS_LOCAL_WRITE;
}

if (uct_flags & UCT_MD_MEM_ACCESS_REMOTE_GET) {
ibv_flags |= IBV_ACCESS_REMOTE_READ;
}

if (uct_flags & UCT_MD_MEM_ACCESS_REMOTE_PUT) {
ibv_flags |= IBV_ACCESS_REMOTE_WRITE;
}

if (uct_flags & UCT_MD_MEM_ACCESS_REMOTE_ATOMIC) {
ibv_flags |= IBV_ACCESS_REMOTE_ATOMIC;
}

return ibv_flags;
}

uint64_t uct_ib_memh_access_flags(uct_ib_mem_t *memh, int relaxed_order,
int first_attempt, uint64_t uct_flags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well, I'd rename first_attempt to its real meaning in this context.
Maybe smth like use_uct_flags?

{
uint64_t access_flags;

if (first_attempt) {
access_flags = UCT_IB_MEM_ACCESS_FLAGS;
} else {
access_flags = uct_flags_to_ibv_mem_access_flags(uct_flags);
}

if (memh->flags & UCT_IB_MEM_FLAG_ODP) {
access_flags |= IBV_ACCESS_ON_DEMAND;
Expand All @@ -645,26 +676,33 @@ ucs_status_t uct_ib_verbs_mem_reg(uct_md_h uct_md, void *address, size_t length,
const uct_md_mem_reg_params_t *params,
uct_mem_h *memh_p)
{
uct_ib_md_t *md = ucs_derived_of(uct_md, uct_ib_md_t);
uct_ib_md_t *md = ucs_derived_of(uct_md, uct_ib_md_t);
uint64_t uct_flags = UCT_MD_MEM_REG_FIELD_VALUE(params, flags,
FIELD_FLAGS, 0);
int attempt_cnt = 0;
struct ibv_mr *mr_default;
uct_ib_verbs_mem_t *memh;
uct_ib_mem_t *ib_memh;
uint64_t access_flags;
ucs_status_t status;

status = uct_ib_memh_alloc(md, length,
UCT_MD_MEM_REG_FIELD_VALUE(params, flags,
FIELD_FLAGS, 0),
status = uct_ib_memh_alloc(md, length, uct_flags,
sizeof(*memh), sizeof(memh->mrs[0]), &ib_memh);
if (status != UCS_OK) {
goto err;
}

memh = ucs_derived_of(ib_memh, uct_ib_verbs_mem_t);
access_flags = uct_ib_memh_access_flags(&memh->super, md->relaxed_order);
memh = ucs_derived_of(ib_memh, uct_ib_verbs_mem_t);

do {
access_flags = uct_ib_memh_access_flags(&memh->super, md->relaxed_order,
attempt_cnt == 0, uct_flags);

status = uct_ib_reg_mr(md, address, length, params, access_flags, NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Align =

&mr_default, attempt_cnt == 0);
attempt_cnt++;
} while ((status != UCS_OK) && (attempt_cnt < 2));

status = uct_ib_reg_mr(md, address, length, params, access_flags, NULL,
&mr_default);
if (status != UCS_OK) {
goto err_free;
}
Expand All @@ -676,7 +714,7 @@ ucs_status_t uct_ib_verbs_mem_reg(uct_md_h uct_md, void *address, size_t length,
if (md->relaxed_order) {
status = uct_ib_reg_mr(md, address, length, params,
access_flags & ~IBV_ACCESS_RELAXED_ORDERING,
NULL, &memh->mrs[UCT_IB_MR_STRICT_ORDER].ib);
NULL, &memh->mrs[UCT_IB_MR_STRICT_ORDER].ib, 1);
if (status != UCS_OK) {
goto err_dereg_default;
}
Expand Down
5 changes: 3 additions & 2 deletions src/uct/ib/base/ib_md.h
Original file line number Diff line number Diff line change
Expand Up @@ -365,7 +365,7 @@ void uct_ib_md_close(uct_md_h tl_md);
ucs_status_t uct_ib_reg_mr(uct_ib_md_t *md, void *address, size_t length,
const uct_md_mem_reg_params_t *params,
uint64_t access_flags, struct ibv_dm *dm,
struct ibv_mr **mr_p);
struct ibv_mr **mr_p, int first_attempt);

ucs_status_t uct_ib_dereg_mr(struct ibv_mr *mr);

Expand All @@ -388,7 +388,8 @@ uct_ib_md_handle_mr_list_mt(uct_ib_md_t *md, void *address, size_t length,
uint64_t access_flags, size_t mr_num,
struct ibv_mr **mrs);

uint64_t uct_ib_memh_access_flags(uct_ib_mem_t *memh, int relaxed_order);
uint64_t uct_ib_memh_access_flags(uct_ib_mem_t *memh, int relaxed_order,
int first_attempt, uint64_t uct_flags);

ucs_status_t uct_ib_verbs_mem_reg(uct_md_h uct_md, void *address, size_t length,
const uct_md_mem_reg_params_t *params,
Expand Down
41 changes: 30 additions & 11 deletions src/uct/ib/mlx5/dv/ib_mlx5dv_md.c
Original file line number Diff line number Diff line change
Expand Up @@ -703,11 +703,13 @@ uct_ib_mlx5_devx_reg_mr(uct_ib_mlx5_md_t *md, uct_ib_mlx5_devx_mem_t *memh,
uct_ib_mr_type_t mr_type, uint64_t access_mask,
uint32_t *lkey_p, uint32_t *rkey_p)
{
uint64_t access_flags = uct_ib_memh_access_flags(&memh->super,
md->super.relaxed_order) &
access_mask;
unsigned flags = UCT_MD_MEM_REG_FIELD_VALUE(params, flags,
FIELD_FLAGS, 0);
uint64_t access_flags = uct_ib_memh_access_flags(&memh->super,
md->super.relaxed_order,
1, flags) &
access_mask;
int attempt_cnt = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid abbreviations (cnt)

ucs_status_t status;
uint32_t mkey;

Expand All @@ -731,8 +733,16 @@ uct_ib_mlx5_devx_reg_mr(uct_ib_mlx5_md_t *md, uct_ib_mlx5_devx_mem_t *memh,
/* Fallback if multi-thread registration is unsupported */
}

status = uct_ib_reg_mr(&md->super, address, length, params, access_flags,
NULL, &memh->mrs[mr_type].super.ib);
do {
status = uct_ib_reg_mr(&md->super, address, length, params, access_flags,
NULL, &memh->mrs[mr_type].super.ib,
attempt_cnt == 0);
attempt_cnt++;
access_flags = uct_ib_memh_access_flags(&memh->super,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First reg_mr, then access_flags - is that intentional?
This line always has attempt_cnt >= 1...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's intentional, though it might be written better..
access_flags is first calculated at the top of the function because it's used for another function, and only then it is used in the do while loop

md->super.relaxed_order,
attempt_cnt == 0, flags);
} while ((status != UCS_OK) && (attempt_cnt < 2));

if (status != UCS_OK) {
return status;
}
Expand Down Expand Up @@ -792,6 +802,7 @@ uct_ib_mlx5_devx_mem_reg_gva(uct_md_h uct_md, unsigned flags, uct_mem_h *memh_p)
{
uct_ib_mlx5_md_t *md = ucs_derived_of(uct_md, uct_ib_mlx5_md_t);
uct_md_mem_reg_params_t params = {};
int attempt_cnt = 0;
uct_ib_mlx5_devx_mem_t *memh;
uint64_t access_flags;
ucs_status_t status;
Expand All @@ -805,9 +816,17 @@ uct_ib_mlx5_devx_mem_reg_gva(uct_md_h uct_md, unsigned flags, uct_mem_h *memh_p)
}

relaxed_order = md->flags & UCT_IB_MLX5_MD_FLAG_GVA_RO;
access_flags = uct_ib_memh_access_flags(&memh->super, relaxed_order);
status = uct_ib_reg_mr(&md->super, NULL, SIZE_MAX, &params, access_flags,
NULL, &memh->mrs[UCT_IB_MR_DEFAULT].super.ib);

do {
access_flags = uct_ib_memh_access_flags(&memh->super, relaxed_order,
attempt_cnt == 0, flags);
status = uct_ib_reg_mr(&md->super, NULL, SIZE_MAX, &params,
access_flags, NULL,
&memh->mrs[UCT_IB_MR_DEFAULT].super.ib,
attempt_cnt == 0);
attempt_cnt++;
} while ((status != UCS_OK) && (attempt_cnt < 2));

if (status != UCS_OK) {
goto err_reg;
}
Expand All @@ -816,7 +835,7 @@ uct_ib_mlx5_devx_mem_reg_gva(uct_md_h uct_md, unsigned flags, uct_mem_h *memh_p)
status = uct_ib_reg_mr(&md->super, NULL, SIZE_MAX, &params,
access_flags & ~IBV_ACCESS_RELAXED_ORDERING,
NULL,
&memh->mrs[UCT_IB_MR_STRICT_ORDER].super.ib);
&memh->mrs[UCT_IB_MR_STRICT_ORDER].super.ib, 1);
if (status != UCS_OK) {
goto err_dereg_default;
}
Expand Down Expand Up @@ -1826,7 +1845,7 @@ static void uct_ib_mlx5_devx_init_flush_mr(uct_ib_mlx5_md_t *md)

status = uct_ib_reg_mr(&md->super, md->zero_buf,
UCT_IB_MD_FLUSH_REMOTE_LENGTH, &params,
UCT_IB_MEM_ACCESS_FLAGS, NULL, &md->flush_mr);
UCT_IB_MEM_ACCESS_FLAGS, NULL, &md->flush_mr, 1);
if (status != UCS_OK) {
goto err;
}
Expand Down Expand Up @@ -2027,7 +2046,7 @@ uct_ib_mlx5_devx_device_mem_alloc(uct_md_h uct_md, size_t *length_p,
reg_params.field_mask = 0;
status = uct_ib_reg_mr(md, address, dm_attr.length, &reg_params,
UCT_IB_MEM_ACCESS_FLAGS, dm,
&memh->mrs[UCT_IB_MR_DEFAULT].super.ib);
&memh->mrs[UCT_IB_MR_DEFAULT].super.ib, 1);
if (status != UCS_OK) {
goto err_munmap_address;
}
Expand Down
Loading