Create a SEMSDevEnv.cmake file to automatically use loaded SEMS dev env #158
Comments
@dmvigi should be CC'ed on this too. |
@bartlettroscoe Do you have a prelim file we can test on shiller? |
@bathmatt, what is a "prelim file"? |
Preliminary file he can test I think. |
From: Bartlett, Roscoe A Hello Trilinos Framework, Is there some machine that has the SEMS Dev Env mounted that is not being constantly hammered where I could get an account? I have some changes that I need to test that might impact a lot of Secondary Tested Trilinos packages such as:
I can’t effectively test all of this and push unless I have most of the important TPLs built for packages like STK and SEACAS. I would not hammer this machine. I could use very few processes. I would only use it for the final configure/build/test/push (unless something breaks before the push then I will need to fix it). This is also a chance for me to put together a standard Trilinos configure for SEMS:
This could lead to a standard checkin-test-sems.sh script for a standard CI build for Trilinos. Cheers, -Ross |
@bartlettroscoe , it's quite easy to mount the sems TPLs, so any COE RHEL machine on the SON or SRN should work. |
Don't you need sudo to mount an NFS drive? I don't have sudo on any SNL machine. The only machines I have accounts on are the CEE server ceesrv02 and several of the ATTB machines. I don't think any of these mount the SEMS dev env partitions, do they? I don't have my own Linux machine yet (and it does not seem trivial to get one). |
@bartlettroscoe we don't mount SEMS on ATTB for reasons we outlined relating to TPLs not working on some architectures and level of tuning/modification and support in the environment (where many things are non-standard until we have had time to work with vendors to find appropriate ways to integrate these into their platforms and optimize them). Have you checked SNL re-app for Linux machines, there are usually workstations there if you just need a builder box. |
Given that
I have not. But I am a little leery of spending a lot of time setting up an old slow workstation that could die at any minute. |
There should be funds to buy a reasonable machine, they aren't that much. On Tue, Apr 5, 2016 at 7:43 AM, Roscoe A. Bartlett <[email protected]
|
@bartlettroscoe , yes, you need root to mount, but you can just ask a sysadmin to do it. |
Yea, I guess I need to just go ahead and pull the trigger on getting a new COE RedHand linux machine just for myself to do this type of stuff on. I have already been given the green light to do so. I was just trying to see if it was possible to create a productive dev env at SNL without purchasing my own Linux box (or purchasing a very expensive CEE machine).
Right. For development work related to advanced architectures, we need to use the ATTB machines. That is the purpose of #172. But the issue is that we desperately need a standard CI env for Trilinos. We can't use the ATTB machines for that. The simplest way to build a uniform CI env for Trilinos is based on the SEMS dev env (see). |
It looks like I have access to the machine muir (thanks for pointing that out Brent!) that NFS mounts the SEMS dev env under
You see the enabled TPLs:
It looks like the X11 and Matio TPLs are required by SEACAS:
Are these TPLs really important for testing Trilinos functionality downstream from SEACAS? There are a bunch of optional TPLs that are not enabled but that I know some people feel are important. Of the remaining 88 TPLs:
how many should really be present and enabled for a solid pre-push test of Trilinos? I know that SuperLUDist, SuperLU, HYPRE and PETSC are important for the IDEAS Productivity project. What about the other sparse-direct TPLs like UMFPACK and CSparse? I, for one, would really like the TPL BinUtils to be present. That is used for creating backtraces with exception handling. Can we get a solid pre-push test of Trilinos without all of these other TPLs? I will ask the Trilinos developers. I should also ask Trilinos customers what TPLs they enable with Trilinos. |
Note that BLAS and LAPACK are not listed in the SEMS modules. From looking at the automated builds on muir on CDash at: http://testing.sandia.gov/cdash/viewConfigure.php?buildid=2408598 it would seem these are found in the base Linux COE packages:
|
Just to clarify, this doesn't mean you want to do only single compiler testing for the nightlies, right? That said for standard push right now: Nightlies need additionally openmpi 1.6 can be retired in my opinion, that is pretty old by now. The superlu version provided by SEMS is broken last I checked, and nobody cared enough to fix it since one can simply not use it in Trilinos. |
@bartlettroscoe the real aim of this work shouldn't be to do this for SEMS environments per se; it should be to ensure we have a high quality, well tested framework for execution on capacity- and capability-class production computing environments. Note that on these production machines SEMS won't be the environment because, instead, it will be supplied by vendors who have to optimize and support it (this is particularly true for machines like Cray and IBM, who very carefully patch and engineer an environment for their machines). The real value of SEMS is that it brings this concept down to the workstation environment for our developers and includes local support. To that end, these tests should be oriented towards configurations that will be on our production environments where we can do that/replicate the combinations to be close enough. This means running multiple of these combinations (unfortunately) to replicate the production computing systems. As a rough wag to get started, we would need an OpenMPI 1.10 series test with GCC 4.8 and 4.9 for POWER systems (perhaps including CUDA) and then an MPICH-based environment with GCC 4.9/Intel 16.X for a machine like Trinity (note Intel relies on GCC for header files so we must be careful to select this appropriately). The SIERRA folks will also need to run on older machines with Intel 15.X and older GCC (I would suggest 4.7 for this purpose). |
No, this is just the standard pre-push CI build. This will not affect what gets tested post-push. And we want to make the pre-push CI build fast and we want to focus on what best protects other developers doing their work. So we could likely get away with just a single MPI build of Trilinos with ETI turned on, no complex or float, etc. We want this to cover a good bit but be as fast as possible. |
Ok, in that case as I said GCC 4.8.4 with OpenMPI 1.8.7 and I would very, very strongly advocate for enabling OpenMP because I believe by default we must exercise the threaded code path. |
We are not looking for a comprehensive set of builds. The post-push builds and the usage of the 'develop'/'master' branch workflow will take care of protecting these customers. What we are going for is the best single (relatively fast) pre-push CI build that we can put together. |
@bartlettroscoe wrote: ParMETIS v4.0.3 or later, built with 32-bit index types |
I got a message back for SEMSHELPD-130 that SEMS is working on a sync script for the SEMS Dev Env. I am excited to try that out to see if that fixes the performance problems that I am seeing. |
@bartlettroscoe, working remotely from SNL/NM, the mounted SEMS environment is unusably slow. |
@tjfulle SEMS has developed a tool that should allow you to use your TPLs locally while syncing the minimal amount of data. The tool is in the final stages of testing and should be available soon. |
I look forward to using it @jgfouca ! |
…linos#482) The SEMS Env does not provide any GCC except for 5.3.0. Also, it does not provide Boost 1.55.0, only 1.58.0 and 1.59.0. Also, it does not provide any build of Scotch at all. Therefore, I have changed the default dev env to be GCC 5.3.0, Boost 1.58.0 and disabled the load of Scotch.
SEMS now supports a script to sync the SEMS env. I documented how I installed it (and what I had to tweak from the SEMS-provided documentation to get it to work) in: In this Issue comment, I document the change in performance of the SEMS-built and now locally installed tools vs. a locally built set of executables. My timing experiments (shown below) still show about a factor of 2x slowdown of the locally synced SEMS env tools vs. locally built tools (built from the CASL VERA dev env build scripts). This might be something that SEMS might want to look into at some point. A factor of 2x slowdown is still pretty large. Based on this analysis, I think that I would use my locally built env for all local development and only use the SEMS env when pushing with the checkin-test-sems.sh script (see #482). Detailed Notes: A) Locally built and installed VERA Dev Env: Using the locally built VERA Dev Env (that supplies CMake, GCC, etc.) I configured TriBITS from scratch and then built and ran tests:
B) Remotely built but locally synced SEMS env: Now with the remotely built but locally synced SEMS env:
C) Analysis: This shows a much improved speed-up of the locally synced SEMS env over the NFS-mounted env reported above. However, we are still seeing about a factor of 2x slowdown of the locally synced SEMS env vs. a locally built env. Why is this? Is it because GCC is built to not run fast (as was suggested by someone in the past)? Is it because CMake is built with -O0 instead of -O3 (we got bit by that in CASL for over a year)? Is it because the locally built tools take better advantage of the processor arch? |
This commit removes all of the enables/disables from this file. Now the SEMSDevEnv.cmake module just sets the locations for compilers and the TPLs. What should get enabled and disabled for CI testing is really orthogonal and should be set somewhere else. Also, I have realized that, for many reasons, we may not be able to demand the SEMS env in order to define a CI build. If someone has a compatible set of compilers, MPI, and TPLs built, then perhaps we should allow them to push? That will be supported when the checkin-test.py script supports the --compare-to-control-build option (see TriBITSPub/TriBITS#152).
@bartlettroscoe the cmake configure slowdown possibly has something to do with the version of cmake (3.6.2 vs 3.5.2). Every SEMS TPL/utility should have been built with -O3, assuming the package doesn't ignore the CFLAGS, CPPFLAGS, etc. env variables. As far as the reduced performance of the built code, that is both a complete surprise and disappointing. The fact that GCC was built on a certain architecture should not prevent it from taking advantage of the current architecture, should it? One possibility is that the stdlib is not tuned to the current architecture. One thing that would be helpful is if you can duplicate this slowdown with a different compiler, like Intel. |
Okay, I could test that. I could use install-cmake.py to locally build and install the exact same version of CMake as provided by SEMS and then compare. But from the contract work with Kitware over the last 1.5 years and lots of performance studies for CMake, we don't expect to see a factor of nearly 2x between CMake versions after 3.3. The speedups that we have funding for are more in the range of 20-50% or so.
That should work. That is what the install-cmake.py script does and we verified that it adds -O3 to the compile lines for the source files. It makes about a 2x difference in cmake runtimes.
The TriBITS test suite does not really test the performance of compiled code. It builds and runs only very tiny compiled programs. What it really tests (in a performance sense) is running cmake, gcc, g++, gfortran, etc. a bunch of times, creating a lot of little directories, etc. Someone can do real performance comparisons of the built code, but this is not that. It would likely be a good idea if someone did that at some point soon. Thanks for all your work with the SEMS env! |
I did a few things here:
* Got rid of Trilinos_ENABLE_CI_TEST_MODE out of cmake/CallbackSetupExtraOptions.cmake and moved all of those options to BasicCiTestingSettings.cmake
* Moved the majority of options for the MPI_RELEASE_DEBUG_SHARED build (changed name to MPI_RELEASE_DEBUG_SHARED_PT) from project-checkin-test-config.py to MpiReleaseDebugSharedPtSettings.cmake and then updated the project-checkin-test-config.py build MPI_RELEASE_DEBUG_SHARED_PT to just point to these *.cmake fragment files
* Moved MPI_RELESE_DEBUG_SHARED_PTR_COMPLEX to project-checkin-test-config.py but don't call it by default in checkin-test-sems.sh (i.e. --default-builds=MPI_DEBUG_RELEASE_SHARED_PT)
* Got rid of commented-out stuff from SEMSDevEnv.cmake (now it only contains system stuff)
This removes a bunch of commented-out statements. This version of the SEMSDevEnv.cmake just sets the compilers and MPI and points to the supported TPLs, nothing else.
@bmpersc, I cleaned up the SEMSDevEnv.cmake file quite a bit. See the version of SEMSDevEnv.cmake for the commit 19afca0; you can see the full file here. Hopefully this will address the majority of the concerns you expressed above (except for perhaps how it gets loaded). This will get pushed soon as part of #482 with the branch better-ci-build-482. |
…linos#482) The SEMS Env does not provide any GCC except for 5.3.0. Also, it does not provide Boost 1.55.0, only 1.58.0 and 1.59.0. Also, it does not provide any build of Scotch at all. Therefore, I have changed the default dev env to be GCC 5.3.0, Boost 1.58.0 and disabled the load of Scotch. Currently need to manually set TRILINOS_DIR. Need to fix this up so that it is more automatic. See comment in the script.
@jwillenbring and @bmpersc, do you have the time and the interest to review: If not, this is committed and is being used automatically by the checkin-test-sems.sh script and the new post-push CI build. The new SEMSDevEnv.cmake file is pretty minimal and should have addressed all of Brent's concerns. The right compilers and TPL paths are still set up automatically but no TPLs are enabled automatically. I don't see that as a bad thing. Therefore, I would like to close this as complete if there are no objections. |
We are no longer going to pursue extending the checkin-test-sems.sh (or checkin-test.py) script for Trilinos (see #482 (comment)) and the SEMS env is already working for this and protected by a CI server. This is working for me and many other people so there is no further need to review this. I will fix any major problems that come up so that I and others can continue to use this. Closing as complete. |
Next Action Status:
Working and providing value as part of new checkin-test-sems.sh script and CI build (see #482)
CC list: @rppawlo, @bathmatt, @jgfouca, @jwillenbring, @gdsjaar, @trilinos/framework
Blocking: #410, #370
Description:
This story will be to create a SEMSDevEnv.cmake file that, once included (with -DTrilinos_CONFIGURE_OPTIONS_FILES=/SEMSDevEnv.cmake), lets TriBITS automatically pick up the right compilers and TPL locations.
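As a rough illustration of the include mechanism described above, the sketch below only prints a configure command rather than running it. The function name `print_sems_configure_cmd` and both path arguments are hypothetical placeholders, not part of Trilinos or SEMS; the only option taken from this issue is `Trilinos_CONFIGURE_OPTIONS_FILES`.

```shell
#!/bin/sh
# Sketch only: prints (does not run) a Trilinos configure command that
# pulls in SEMSDevEnv.cmake through Trilinos_CONFIGURE_OPTIONS_FILES.
# The function name and both arguments are hypothetical placeholders.
print_sems_configure_cmd() {
  options_file="$1"   # assumed: path to a SEMSDevEnv.cmake file
  source_dir="$2"     # assumed: path to the Trilinos source tree
  printf 'cmake -DTrilinos_CONFIGURE_OPTIONS_FILES=%s %s\n' \
    "$options_file" "$source_dir"
}
```

Since the final SEMSDevEnv.cmake only sets compilers and TPL locations, a real configure would presumably still need its own package and TPL enables on the command line or in another options file.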
In addition, it would be desirable for the Trilinos configure to automatically pick up the loaded SEMS env (like is done on the ATTB machines using the ATTB_ENV env var, see #172).
In addition to just providing the SEMSDevEnv.cmake module, this story will also scope out what might be useful for a standard Trilinos dev env. However, a new story will be created to refine what a new expanded Trilinos Primary Tested build of TPLs and Packages will look like.
List of TPLs and other requirements needed for a standard Trilinos CI build:
The following list is the current consensus for the standard Trilinos CI build (this list is updated as consensus changes).
Tasks:
* SEMSDevEnv.cmake module (targeting a standard Trilinos pre-push CI build) [Done]
* SEMSDevEnv.cmake to automatically set up compilers and TPLs for a given loaded SEMS env (using module load <avail>) by reading env vars starting with SEMS_. (see below) [Done]
* load_sems_dev_env.sh module that can be called as source load_sems_dev_env.sh [<compiler-and-version>] [<openmpi-and-version>], for example source load_sems_dev_env.sh gcc/4.9.2 openmpi/1.10.1 (where <compiler-and-version> and <openmpi-and-version> are given defaults if not provided). (see below) [Done]
* SEMSDevEnv.cmake if it is detected that the SEMS dev env is loaded. (see below) [Done]
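The optional-argument behavior and the "is the SEMS dev env loaded" check in the tasks above could be sketched roughly as follows. This is not the content of the real load_sems_dev_env.sh, which is not shown in this issue. The function names are hypothetical, the default versions are simply copied from the example invocation above (the actual defaults may differ), and the sketch prints module names instead of calling `module load`.

```shell
#!/bin/sh
# Hypothetical defaults, copied from the example invocation in the task list.
DEFAULT_COMPILER="gcc/4.9.2"
DEFAULT_MPI="openmpi/1.10.1"

# Print the modules that would be loaded, one per line.
# Missing or empty arguments fall back to the defaults above.
sems_modules_to_load() {
  compiler="${1:-$DEFAULT_COMPILER}"
  mpi="${2:-$DEFAULT_MPI}"
  printf '%s\n%s\n' "$compiler" "$mpi"
}

# Succeed if the environment listing has a line starting with SEMS_,
# which is taken here as a sign that the SEMS dev env is loaded.
# Caveat: a multi-line value with a line starting with SEMS_ would also match.
sems_env_is_loaded() {
  env | grep -q '^SEMS_'
}
```

For example, `sems_modules_to_load gcc/5.3.0` would print `gcc/5.3.0` followed by the default MPI module, and `sems_env_is_loaded && echo "SEMS env detected"` reports whether any `SEMS_` variable is present in the current environment.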