-
Notifications
You must be signed in to change notification settings - Fork 868
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix a bug introduced by PR 11854 #12242
Conversation
opal/mca/pmix/base/pmix_base_fns.c
Outdated
mca_base_parse_paramfile(file, ¶ms); | ||
free(file); | ||
OPAL_LIST_FOREACH (fv, ¶ms, mca_base_var_file_value_t) { | ||
pmix_overlap = (NULL == evar) ? 0 : check_pmix_overlap(fv->mbvfv_var, evar); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see my comment on your mpi4py PR - this is an incorrect fix. The evar
variable is not set - you are using a dangling value from the prior code block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hu? the triad has check for evar being null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay i see what you're saying. we just shouldn't use evar in this section.
opal/mca/pmix/base/pmix_base_fns.c
Outdated
mca_base_parse_paramfile(file, ¶ms); | ||
free(file); | ||
OPAL_LIST_FOREACH (fv, ¶ms, mca_base_var_file_value_t) { | ||
if (check_pmix_param(fv->mbvfv_var)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still isn't quite correct. You do need to check the params from the file for overlap - i.e., legacy ORTE params that need to be converted to PMIx. The problem was that we were passing the wrong things to check_pmix_param
- the params in that function are (the variable name, the variable's value). So here you would pass check_pmix_overlap(fv->mbvfv_var, fv->mbvfv_value)
.
You still need check_pmix_param
in case they put a PMIx var in the file - you just need to check overlap as well.
HTH
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rhc54 check now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
there is a typo in the commit message (PR ... introduced a big) |
c48044e
to
f2ecd67
Compare
i think the nvidia failure is related to #12245 |
@qkoziol please review when you have a chance |
@hppritcha That's correct. The fix is here #12246 I will ping Quincey to take a look at the PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good fixes!
Hmmm not sure how to trigger nvidia CI. The failure should have been fixed. |
PR open-mpi#11854 introduced a big that causes singleton runs to segfault at startup in some cases this bug was rooted out by the github action in PR open-mpi#12217 Signed-off-by: Howard Pritchard <[email protected]>
f2ecd67
to
86a05c1
Compare
PR #11854 introduced a big that causes singleton runs to segfault at startup in some cases
this bug was rooted out by the github action in
PR #12217