-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCT/IB/EFA/SRD: Initial interface add #10412
base: master
Are you sure you want to change the base?
Conversation
src/uct/ib/base/ib_iface.h
Outdated
#if HAVE_EFA | ||
UCT_IB_QPT_SRD, | ||
#else | ||
UCT_IB_QPT_SRD = UCT_IB_QPT_UNKNOWN, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Do we really need to define UCT_IB_QPT_SRD is HAVE_EFA is 0?
- Shall we define UCT_IB_QPT_SRD to IBV_QPT_DRIVER? Or unify them somehow? Seems weird to define UCT_IB_QPT_SRD without any reference to an IB macro
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed both
src/uct/ib/efa/Makefile.am
Outdated
@@ -15,9 +15,12 @@ libuct_ib_efa_la_LIBADD = $(top_builddir)/src/ucs/libucs.la \ | |||
$(top_builddir)/src/uct/libuct.la \ | |||
$(top_builddir)/src/uct/ib/libuct_ib.la | |||
|
|||
libuct_ib_efa_la_SOURCES = ib_efa_md.c | |||
libuct_ib_efa_la_SOURCES = ib_efa_md.c \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we move this to efa/base, now that we added the srd subdir?
to align with other transports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_def.h
Outdated
typedef ucs_frag_list_sn_t uct_srd_psn_t; | ||
typedef struct uct_srd_iface uct_srd_iface_t; | ||
typedef struct uct_srd_iface_addr uct_srd_iface_addr_t; | ||
typedef struct uct_srd_iface_peer uct_srd_iface_peer_t; | ||
typedef struct uct_srd_ep_addr uct_srd_ep_addr_t; | ||
typedef struct uct_srd_send_op uct_srd_send_op_t; | ||
typedef struct uct_srd_send_desc uct_srd_send_desc_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really need all these forward declarations (and in this PR)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed for now
src/uct/ib/efa/srd/srd_def.h
Outdated
typedef uint32_t uct_srd_ep_conn_sn_t; | ||
|
||
typedef struct uct_srd_neth { | ||
uint32_t packet_type; | ||
uint8_t fc; | ||
uct_srd_psn_t psn; | ||
} UCS_S_PACKED uct_srd_neth_t; | ||
|
||
typedef struct uct_srd_recv_desc { | ||
uct_ib_iface_recv_desc_t super; | ||
} uct_srd_recv_desc_t; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in order to reduce the PR size to the required LOC count, can we move the mpool and desc related code to next PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we strictly add when used, it will be added all at once, so it thought to include the creation/init of qp in pre posting buffers, init + teardown path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to split it somehow to be <500 LOC in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok fixed
src/uct/ib/efa/srd/srd_iface.c
Outdated
#endif | ||
|
||
#include "srd_iface.h" | ||
#include "../ib_efa.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use non-relative path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_iface.c
Outdated
qp_init_attr.cap.max_send_sge = 1 + ucs_min(config->super.tx.min_sge, | ||
efa_attr->max_sq_sge - 1); | ||
qp_init_attr.cap.max_recv_sge = 1; | ||
qp_init_attr.cap.max_inline_data = efa_attr->inline_buf_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whall we take it from user config for TX_MIN_INLINE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_iface.c
Outdated
|
||
ucs_debug("iface=%p: created %s QP 0x%x on " UCT_IB_IFACE_FMT | ||
" TX wr:%d sge:%d inl:%d resp:%d RX wr:%d sge:%d resp:%d", | ||
iface, uct_ib_qp_type_str(UCT_IB_QPT_SRD), iface->qp->qp_num, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO using uct_ib_qp_type_str(UCT_IB_QPT_SRD) is redundant, it's always SRD here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_iface.c
Outdated
qp_attr.qkey = UCT_IB_KEY; | ||
ret = ibv_modify_qp(iface->qp, &qp_attr, | ||
IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | | ||
IBV_QP_QKEY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_iface.c
Outdated
{"FC_WND_SIZE", "512", | ||
"The size of flow control window per endpoint. limits the number of AM\n" | ||
"which can be sent w/o acknowledgment.", | ||
ucs_offsetof(uct_srd_iface_config_t, fc.wnd_size), UCS_CONFIG_TYPE_UINT}, | ||
|
||
{"FC_SOFT_THRESH", "0.5", | ||
"Threshold for sending soft request for FC credits to the peer. This " | ||
"value\n" | ||
"refers to the percentage of the FC_WND_SIZE value. (must be > " | ||
"HARD_THRESH and < 1)", | ||
ucs_offsetof(uct_srd_iface_config_t, fc.soft_thresh), | ||
UCS_CONFIG_TYPE_DOUBLE}, | ||
|
||
{"FC_HARD_THRESH", "0.25", | ||
"Threshold for sending hard request for FC credits to the peer. This " | ||
"value\n" | ||
"refers to the percentage of the FC_WND_SIZE value. (must be > 0 and < 1)", | ||
ucs_offsetof(uct_srd_iface_config_t, fc.hard_thresh), | ||
UCS_CONFIG_TYPE_DOUBLE}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FC can be moved to next PR to reduce LOC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually could likely be among the last PRs
src/uct/ib/efa/srd/srd_iface.h
Outdated
uct_ib_iface_config_t super; | ||
uct_ud_iface_common_config_t ud_common; | ||
|
||
struct { | ||
size_t max_get_zcopy; | ||
} tx; | ||
|
||
struct { | ||
double soft_thresh; | ||
double hard_thresh; | ||
unsigned wnd_size; | ||
} fc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align fields on column
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
libuct_ib_efa_la_LDFLAGS = $(EFA_LIB) $(IBVERBS_LDFLAGS) \ | ||
-version-info $(SOVERSION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: why would need to move?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
align order with IB Makefile.am
src/uct/ib/efa/srd/srd_iface.c
Outdated
qp_init_attr.cap.max_send_sge = 1 + ucs_min(config->super.tx.min_sge, | ||
efa_attr->max_sq_sge - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems simpler
qp_init_attr.cap.max_send_sge = 1 + ucs_min(config->super.tx.min_sge, | |
efa_attr->max_sq_sge - 1); | |
qp_init_attr.cap.max_send_sge = ucs_min(config->super.tx.min_sge + 1, | |
efa_attr->max_sq_sge); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_iface.c
Outdated
iface->super.config.max_inl_cqe[UCT_IB_DIR_RX]); | ||
|
||
/* Transition the QP to RTS */ | ||
memset(&qp_attr, 0, sizeof(qp_attr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not initializing in place as other attrs in this func?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_iface.c
Outdated
return status; | ||
} | ||
|
||
memset(&init_attr, 0, sizeof(init_attr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe init in place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_iface.c
Outdated
ret = ibv_modify_qp(iface->qp, &qp_attr, | ||
IBV_QP_STATE | IBV_QP_PKEY_INDEX | | ||
IBV_QP_PORT | IBV_QP_QKEY); | ||
if (ret) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (ret) { | |
if (ret != 0) { |
?
or use the same style for all places in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_iface.c
Outdated
/* TAG */ | ||
iface_attr->cap.tag.rndv.max_zcopy = iface->config.max_get_zcopy; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to initialize, because SRD does not support tag offload
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks removed
src/uct/ib/efa/srd/srd_iface.c
Outdated
iface_attr->cap.am.max_short = uct_ib_iface_hdr_size( | ||
iface->config.max_inline, sizeof(uct_srd_neth_t)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can move this part to line 313 (when AM short support is checked)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_iface.c
Outdated
iface_attr->cap.am.max_short = uct_ib_iface_hdr_size( | ||
md->dev.max_inline_data, sizeof(uct_srd_neth_t)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you already initialized on line 300 using different value. What is the correct one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
src/uct/ib/efa/srd/srd_iface.h
Outdated
|
||
#include "srd_def.h" | ||
|
||
/** @file srd_iface.h */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/uct/ib/efa/srd/srd_iface.h
Outdated
#ifndef UCT_SRD_IFACE_H | ||
#define UCT_SRD_IFACE_H | ||
|
||
#include <uct/ib/base/ib_iface.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor - it is already included in srd_def.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed, will rely implicitly on it
What?
Initial add of SRD transport with bare interface instantiation with QP, tested like below.
Depends on #10069 to implement build and testing in CI.
Test