Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stokhos: Remove NUM_MPI_PROCS from parallel-enabled tests #11322

Merged

Conversation

etphipp
Copy link
Contributor

@etphipp etphipp commented Dec 1, 2022

Apparently including NUM_MPI_PROCS N with N > 1 prevents the test from running in a serial build, which is not desired behavior. I don't remember this being the case years ago when this option was added, but appears to be now.

Apparely including NUM_MPI_PROCS N with N > 1 prevents the test from
running in a serial build, which is not desired behavior.  I don't
remember this being the case years ago when this option was added, but
appears to be now.
@etphipp etphipp added pkg: Stokhos AT: AUTOMERGE Causes the PR autotester to automatically merge the PR branch once approvals are completed labels Dec 1, 2022
@etphipp etphipp requested a review from rppawlo December 1, 2022 00:05
@etphipp etphipp self-assigned this Dec 1, 2022
@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: Trilinos_PR_gcc-8.3.0

  • Build Num: 1489
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-gnu-8.3.0-openmpi-1.10.1-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL trilinos-any
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Build Information

Test Name: Trilinos_PR_gcc-8.3.0-serial

  • Build Num: 16
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-gnu-8.3.0-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL trilinos-any
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Build Information

Test Name: Trilinos_PR_gcc-8.3.0-debug

  • Build Num: 15
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-gnu-8.3.0-openmpi-1.10.1-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL trilinos-any
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Build Information

Test Name: Trilinos_PR_clang-11.0.1

  • Build Num: 15
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-clang-11.0.1-openmpi-1.10.1-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL trilinos-any
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Build Information

Test Name: Trilinos_PR_python3

  • Build Num: 1271
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-gnu-7.2.0-anaconda3-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_pr-framework
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL ascic
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Build Information

Test Name: Trilinos_PR_cuda-11.4.2-uvm-off

  • Build Num: 1016
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-cuda-11.4.2-sems-gnu-10.1.0-sems-openmpi-4.0.5_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL GPU
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Using Repos:

Repo: TRILINOS (etphipp/Trilinos)
  • Branch: stokhos_test_num_mpi_procs
  • SHA: 2462f69
  • Mode: TEST_REPO

Pull Request Author: etphipp

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: Trilinos_PR_gcc-8.3.0

  • Build Num: 1489
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-gnu-8.3.0-openmpi-1.10.1-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL trilinos-any
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Build Information

Test Name: Trilinos_PR_gcc-8.3.0-serial

  • Build Num: 16
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-gnu-8.3.0-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL trilinos-any
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Build Information

Test Name: Trilinos_PR_gcc-8.3.0-debug

  • Build Num: 15
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-gnu-8.3.0-openmpi-1.10.1-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL trilinos-any
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Build Information

Test Name: Trilinos_PR_clang-11.0.1

  • Build Num: 15
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-clang-11.0.1-openmpi-1.10.1-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL trilinos-any
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Build Information

Test Name: Trilinos_PR_python3

  • Build Num: 1271
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-gnu-7.2.0-anaconda3-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_no-mpi_no-pt_no-rdc_no-uvm_deprecated-on_pr-framework
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL ascic
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454

Build Information

Test Name: Trilinos_PR_cuda-11.4.2-uvm-off

  • Build Num: 1016
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
FORCE_CLEAN true
GENCONFIG_BUILD_NAME rhel7_sems-cuda-11.4.2-sems-gnu-10.1.0-sems-openmpi-4.0.5_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables
PR_LABELS pkg: Stokhos;AT: AUTOMERGE
PULLREQUESTNUM 11322
PULLREQUEST_CDASH_TRACK Pull Request
TEST_REPO_ALIAS TRILINOS
TRILINOS_NODE_LABEL GPU
TRILINOS_SOURCE_BRANCH stokhos_test_num_mpi_procs
TRILINOS_SOURCE_REPO https://github.com/etphipp/Trilinos
TRILINOS_SOURCE_SHA 2462f69
TRILINOS_SRN_CONFIG true
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 9c91454


CDash Test Results for PR# 11322.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ rppawlo ]!

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged

@trilinos-autotester trilinos-autotester merged commit b6ed086 into trilinos:develop Dec 1, 2022
@trilinos-autotester
Copy link
Contributor

Merge on Pull Request# 11322: IS A SUCCESS - Pull Request successfully merged

@trilinos-autotester trilinos-autotester removed the AT: AUTOMERGE Causes the PR autotester to automatically merge the PR branch once approvals are completed label Dec 1, 2022
@rppawlo
Copy link
Contributor

rppawlo commented Dec 1, 2022

Apparently including NUM_MPI_PROCS N with N > 1 prevents the test from running in a serial build, which is not desired behavior. I don't remember this being the case years ago when this option was added, but appears to be now.

@bartlettroscoe - is this intended? I'm pretty sure we have use cases where we want a test to run in both serial and parallel builds, and where we need to force a specific number of mpi processes to be able to check the answer in parallel.

@bartlettroscoe
Copy link
Member

It was a defect in TriBITS and very undesirable behavior if you think about it. (I can provide some concrete examples.) If you want to run a test both in serial and in parallel, just add two tests, one with NUM_MPI_PROCS 1 and one with NUM_MPI_PROCS <n>. That way, the MPI build runs a super-set of the SERIAL tests (so there is no need to have to run both an MPI build and a SERIAL build to run all of the tests).

@rppawlo
Copy link
Contributor

rppawlo commented Dec 1, 2022

So is this something you plan to fix in tribits? Or would it break too many projects? I would probably have to go through all the nox and loca examples. I bet every epetra and tpetra test has this issue. It would be much easier to see it fixed in tribits, but that may lead to issues downstream (test harnesses timing out...).

@bartlettroscoe
Copy link
Member

So is this something you plan to fix in tribits? Or would it break too many projects? I would probably have to go through all the nox and loca examples. I bet every epetra and tpetra test has this issue. It would be much easier to see it fixed in tribits, but that may lead to issues downstream (test harnesses timing out...).

@rppawlo, I was not clear in my response above: The behavior where TriBITS would simply ignore NUM_MPI_PROCS N for N > 1 in a non-MPI build and run the test using one process should be considered a defect which has now been fixed in TriBITS (which is why I said "was a defect" instead of "is a defect"). In other words, if a developer sets NUM_MPI_PROCS N for a test, they should have the expectation that that test will only be run with exactly N MPI processes. So the only harm here is that the NUM_MPI_PROCS 1 version of the test is not being run in a non-MPI SERIAL build.

TL;DR

I suspect that most cases where a developer was setting NUM_MPI_PROCS N for N > 1 they did not expect (or want) the test to run on a single process in a non-MPI build (and they likely fixed that by adding COMM mpi). For example, I was surprised that setting NUM_MPI_PROCS N for N > 1 still added the test in a non-MPI build (and that behavior was not documented and I was the one who wrote the code).

Note that @etphipp's fix in this PR was to simply remove NUM_MPI_PROCS 4. That makes a test's NUM_MPI_PROCS set by ${MPI_EXEC_DEFAULT_NUMPROCS} (with the default of 4 in an MPI build and a default of 1 in a non-MPI build) which basically says that you don't really care how many MPI procs are used to run a test or an example. But the value of MPI_EXEC_DEFAULT_NUMPROCS can be changed in the CMake Cache so any tests that depend on ${MPI_EXEC_DEFAULT_NUMPROCS} will be fragile unless the test is very robust when running with any number of MPI procs (which would be an unusual test).

Note that this behavior of having the default for NUM_MPI_PROCS set by ${MPI_EXEC_DEFAULT_NUMPROCS} was likely a dumb design decision make 14 years ago. This is an example of where more planning and thought should have gone into this behavior then. (But I think I was just copying what the old Trilinos test harness was doing which may not have been a great decision.)

So what do we want to do here? This change was made quite a while ago to Trilinos 'develop' so many new tests have likely been added using the new (arguably correct) behavior where setting NUM_MPI_PROCS N means that the test are run with exactly N MPI processes.

I will go back to my assertion that we want to design our test suites so that doing an MPI build also runs a superset of all same one-proc tests run by a non-MPI build would run, in addition to the multi-process tests. That way, doing an MPI build and passing all of the tests gives great confidence that all of the tests will pass with an non-MPI build. (That was not the case with the old behavior of TriBITS and is not the case with these updated Stokhos tests or any tests that leave off the NUM_MPI_PROCS argument.)

Other than the argument to maintain backwards compatibility for all time (even for confusing and surprising behavior), is there another view here?

Any more discussion on this should likely happen in a short MS Teams meeting (which will speed up the communication).

@bartlettroscoe
Copy link
Member

NOTE: Really, there never should have been a variable MPI_EXEC_DEFAULT_NUMPROCS and the default for leaving off NUM_MPI_PROCS should have been to run the test on just one process. If we are going to do a bunch of work with Trilinos tests, that is what we should do. (But that is a lot of work.)

@etphipp
Copy link
Contributor Author

etphipp commented Dec 1, 2022

I removed NUM_MPI_PROCS because I have no reason to believe it won't pass on any number of processors and I want it to run in serial builds.

If specifying NUM_MPI_PROCS N means only run the test with exactly N procs, why do you allow SERIAL as an option for COMM if N > 1? Seems like if the user includes SERIAL for COMM, they expect it to run in a non-MPI build. If this is the behavior you intend, you should make it a configure error if SERIAL is specified when N > 1.

@bartlettroscoe
Copy link
Member

bartlettroscoe commented Dec 1, 2022

I removed NUM_MPI_PROCS because I have no reason to believe it won't pass on any number of processors and I want it to run in serial builds.

👍

If specifying NUM_MPI_PROCS N means only run the test with exactly N procs, why do you allow SERIAL as an option for COMM if N > 1?Seems like if the user includes SERIAL for COMM, they expect it to run in a non-MPI build. If this is the behavior you intend, you should make it a configure error if SERIAL is specified when N > 1.

@etphipp, good point, should be an error because the test will never be added and never be run. I can update TriBITS to error out in those cases.

Update: I added TriBITSPub/TriBITS#542

@etphipp
Copy link
Contributor Author

etphipp commented Dec 1, 2022

Getting that through the autotester will require updating lots of tests :)

@bartlettroscoe
Copy link
Member

Getting that through the autotester will require updating lots of tests :)

@etphipp you think there are a lot of tests passing in COMM serial and NUM_MPI_PROCS N for N > 1? I would be surprised if there was. Note that currently those tests are never being added or run so worst-case scenario we just comment out those tests and let the individual package teams decide how they want to deal with those.

@etphipp
Copy link
Contributor Author

etphipp commented Dec 1, 2022

I don't know about other packages, but I know there are many NOX/LOCA tests where this is the case. I don't think you should just comment out those tests. The proper solution would be to add a separate SERIAL test.

@bartlettroscoe
Copy link
Member

I don't know about other packages, but I know there are many NOX/LOCA tests where this is the case. I don't think you should just comment out those tests.

@etphipp, can you point out some examples?

The proper solution would be to add a separate SERIAL test.

Right, but some those tests may be failing once you add then back the so someone will have to fix those tests. I guess we just post a PR for each package independently and ask the package teams to take them over and fix the one-proc tests that are failing?

@etphipp
Copy link
Contributor Author

etphipp commented Dec 1, 2022

There are many examples in NOX/LOCA, e.g., https://github.com/trilinos/Trilinos/blob/master/packages/nox/test/epetra/DS6.5.1/CMakeLists.txt. I am not aware of any tests in NOX/LOCA with NUM_MPI_PROCS > 1 that aren't expected to pass in serial builds. So hopefully none of these tests fail in serial as that represents a serious regression.

@bartlettroscoe
Copy link
Member

Those NOX/LOCA tests show the args:

    COMM serial mpi
    NUM_MPI_PROCS 2

One idea is to change the behavior of tribits_add_test() (and tribits_add_executable_and_test()) so that if COMM serial ... is explicitly passed in, then the NUM_MPI_PROCS argument gets ignored for non-MPI serial builds. But that is too subtle for my taste (which is the old behavior).

Another idea is to extend the NUM_MPI_PROCS argument for tribits_add_test() (and tribits_add_executable_and_test()) to take in a list of numbers like NUM_MPI_PROCS 1 2 which will produce a different test for each number of MPI procs. (But how that interacts with multiple sets of arguments for ARGS and POSTFIX_AND_ARGS_<idx> will need to be determined.)

Opinions?

@etphipp
Copy link
Contributor Author

etphipp commented Dec 1, 2022

I would prefer the first choice as that is how I thought it worked all along, and I don't understand what the problems are with it. If the test writer includes serial for COMM, they clearly expect it to run in a serial build, don't they?

@rppawlo
Copy link
Contributor

rppawlo commented Dec 1, 2022

Agree with @etphipp . First choice would be my preference.

@etphipp etphipp deleted the stokhos_test_num_mpi_procs branch December 1, 2022 21:08
@bartlettroscoe
Copy link
Member

I would prefer the first choice as that is how I thought it worked all along

What about the case where COMM serial ... is not explicitly passed in but NUM_MPI_PROCS N for N > 1 is. It seems confusing that the NUM_MPI_PROCS N argument would be ignored in that case and would yield a test with just one process in an non-MPI build.

@bartlettroscoe
Copy link
Member

I would prefer the first choice as that is how I thought it worked all along

But I come back to the problem that this approach does not run the single-process versions of these tests in an MPI build. How about if you set COMM serial mpi and NUM_MPI_PROCS N for N > 1 then that will produce two tests in an MPI build, one with one process and one with N processes; and in a non-MPI SERIAL build will produce just the one test with one process?

But I argue that it should still be the case that if the user passes in COMM serial (but not mpi) and NUM_MPI_PROCS N for N > 1, then that should error out because the NUM_MPI_PROCS N argument would always be ignored.

@etphipp
Copy link
Contributor Author

etphipp commented Dec 1, 2022

How about if you set COMM serial mpi and NUM_MPI_PROCS N for N > 1 then that will produce two tests in an MPI build, one with one process and one with N processes; and in a non-MPI SERIAL build will produce just the one test with one process?

That sounds like a good idea to me.

But I argue that it should still be the case that if the user passes in COMM serial (but not mpi) and NUM_MPI_PROCS N for N > 1, then that should error out because the NUM_MPI_PROCS N argument would always be ignored.

Agreed.

@bartlettroscoe
Copy link
Member

See updated issue TriBITSPub/TriBITS#542

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants