ERROR: Resolution loop on 'bids/freesurfer/latest' detected, RHEL 9 #677
Comments
I'm not great with developing environment modules, but this seems relevant: @marcodelapierre have you seen this before? |
I am testing with Lmod version 8.7.47, and latest Singularity-HPC, and cannot reproduce. Would you be able to provide the output of Thank you |
It does not return any values.
Since we have both TCL and Lmod installed which option should be used? Is it expected to not load modules when we use Lmod? |
It should work with either one, but I've never tested shpc in an environment with both! |
Is there any debugging or comments we can insert to see where the modules are failing to load? |
shpc just uses standard environment modules or Lmod, so you'd want to look in the documentation for those projects. @marcodelapierre is much better at these modules than me, so he might have some advice! I do most of my work in the cloud these days, or in user space Kubernetes. |
Hello @marcodelapierre is there anything I can do to work around this? |
I am surprised that I would like to see the outcome of that command in the same shell environment that also generates the error.
When you say you have both TCL + Lmod installed, I am assuming you mean Tcl as the language plus Lmod modules, and that you do NOT also have Environment Modules installed (the Tcl-based alternative to Lmod). Is this correct?
With that assumption, Lmod can handle modules written in both Lua and Tcl. Lua is Lmod's native language and so preferable, because it offers more functionality, but Lmod can handle Tcl modules too.
As for SHPC, I cannot test at present, but from what I remember of past exploration, Lmod can correctly use both the Lua and the Tcl modules generated by SHPC. |
Yes here's what
These also show up as modules:
Did the above help?
This must be a clue as to why the 4
Attempting to load one of those gets this error:
Using the full path to the module.lua file:
|
Mmh OK. I would discourage having both Env Modules and Lmod active in the same shell environment, it adds unnecessary complexity and sources of issues. You should not be seeing these:
I think they come out of using From your first messages:
To correctly use modules, you need to run I would like to see the outcome of |
What type of privileges do you have on that machine? It would be good to be able to disable either the Lmod or the Env Modules setup in the shell environment, to test in a more robust environment. The corresponding scripts are typically sourced either in the .bashrc/.profile in your home, in which case you can comment out the corresponding line, or in the form of scripts within the /etc/profile.d/ directory. The latter case is trickier, as it would require root privileges to move sourceable files out of it. |
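A quick way to check which module system is initialised in a given shell is to look at the environment variables the init scripts leave behind. The sketch below is only a heuristic: it assumes that Lmod's init script exports `LMOD_CMD` and that Environment Modules' init script exports `MODULES_CMD`, which should be verified on the installation at hand. The function name `detect_module_systems` is made up for this example.

```shell
#!/usr/bin/env bash
# Report which module system(s) appear to be initialised in the current shell.
# Assumption: Lmod's init script exports LMOD_CMD and Environment Modules'
# init script exports MODULES_CMD; check this on your own installation.
detect_module_systems() {
    local found=""
    if [ -n "${LMOD_CMD:-}" ]; then
        found="lmod"
    fi
    if [ -n "${MODULES_CMD:-}" ]; then
        # Append with a "+" separator only if Lmod was already detected.
        found="${found:+$found+}envmodules"
    fi
    echo "${found:-none}"
}

detect_module_systems
```

If this prints `lmod+envmodules`, both init scripts have been sourced in that shell, which is the situation discouraged above.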
So within
root
Sure what would you like me to try? EDIT I went to another node. This time I made some progress:
So if I do
Yep this is a node that has only |
Oh great! Looks like the EnvModules vs Lmod clash was indeed causing your issues. Let us know if you need anything else, or if we can close this one. |
@marcodelapierre 🙌 🙌 ! |
Can you suggest a way to do this in the nodes that have both?
Looks like both Lmod and environment-modules are in use to get specific modules to load only on compute nodes, so when trying to uninstall environment-modules it wants to remove mpich, openmpi, etc. It'd be great to know how to work around this. |
Please start from this comment of mine above:
|
That's where I'm not clear. Are you suggesting to run the source cmd on |
No worries, let me try and clarify.
Having the RPM packages for Lmod and Env Modules installed is not enough to have them actually configured in the shell environment. Both applications rely on an initialisation script that needs to be sourced when the shell starts. We have proof that you have both running (ml -> Lmod; module --version -> suggests Env Modules), which implies that scripts for both are sourced at shell startup.
Now, to have only one running, you need to locate where the scripts are sourced and disable the one you do not want.
Common setups:
LMOD:
ENV MODULES:
So, what I suggest is to have a look at the following:
I hope this can help. |
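As a rough aid for the search described above, the following sketch lists startup files that mention a module-system init script. The function name `find_module_init_refs` and the search pattern are assumptions made for illustration; the pattern is a heuristic, not an official list of file names, so it may miss site-specific setups or report false positives.

```shell
#!/usr/bin/env bash
# List startup files that mention a module-system init script.
# Arguments: files or directories to search (directories are searched recursively).
# The pattern below is a heuristic guess, not an official list of names.
find_module_init_refs() {
    grep -r -l -i -E 'lmod|modules\.(sh|csh)|modulecmd' "$@" 2>/dev/null | sort
}
```

On a login node it could be called with the locations mentioned earlier in the thread, for example `find_module_init_refs /etc/profile.d "$HOME/.bashrc" "$HOME/.profile"`. Each file it prints is a candidate to inspect by hand before disabling anything.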
OK I'm getting closer. I'm assuming I want to disable Environment Modules. I see we have:
There is also the
Which has:
And this file, which has the
So notice that it references
So can I just use FWIW:
And this file is what appears to enable Lmod:
|
Looks like Yes, you want to disable the env modules sourceable files, so for instance having the alternatives file What I don't understand in this setup is where the Lmod script is currently sourced, otherwise you would not be able to use What I would do:
I hope this helps. Please be aware that I will be logging off at the end of today, and get back on my work computer on 6 Jan. I will not be responsive in the meanwhile. All the best, |
OK I removed the 2 suggested files, then created the sym link. However in order for How do I get that module to be available without having to run
Thanks perhaps @vsoch has an idea? |
Hi, To make the modules available from shell login, I suggest to add To customise grouping of modules within the directory structure, please have a look at the Views functionality in the documentation, see if it can be helpful in your case. |
What I mean is all of these show up:
So how would I hide those? Also what causes this?
|
Oh I see, thanks for clarifying. Both issues come down to the fact that the directory If you are not executing it yourself after shell start, then it is executed in some sourced script, such as the usual |
I only run So I'm confused now, as previously you told me to:
After I run the
so indeed it does get added to Looking at my cmd history could one of these cause the
|
Instead of the one you have just mentioned, the correct
Using this one should fix the incorrect |
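Since the fix hinges on which directory ends up in `MODULEPATH`, it can help to print the entries one per line and flag any that do not exist. This is a minimal sketch; the function name `check_modulepath` is hypothetical, and it only checks that each entry is a directory, not that it is the right level of the SHPC module tree.

```shell
#!/usr/bin/env bash
# Print each MODULEPATH entry on its own line, marking entries that are
# not existing directories. Takes a colon-separated list as an optional
# argument and defaults to the current $MODULEPATH.
check_modulepath() {
    local entries="${1:-${MODULEPATH:-}}"
    local IFS=':'
    local dir
    for dir in $entries; do
        # Skip empty fields produced by leading, trailing or doubled colons.
        [ -n "$dir" ] || continue
        if [ -d "$dir" ]; then
            echo "ok $dir"
        else
            echo "missing $dir"
        fi
    done
}

check_modulepath
```

Running it before and after a `module use` call shows exactly which directory was added.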
Ah now that looks much better:
Any ideas on how to fix having both Lmod and environment-modules without having to do the below on all nodes? I suppose this could be added to an Ansible playbook; it just seems clumsy to scale it the way you suggested:
Perhaps this is a feature request, to allow usage/configuration with And thanks for your patience and for responding to my issues; I bet this thread will help others in the future. |
Hi, no worries, these are indeed enjoyable and useful conversations within the community :) I don't have much experience on the cluster sysadmin side, nor with removing one of the module installations. I would suggest looking for a scriptable, effective way of removing it that can be implemented within your cluster's manager; for this second part, consulting with other members of the cluster operations team may help, I reckon. Regarding using both module systems at once, as far as I know this is not a limitation of SHPC per se. I may be wrong, but Lmod and EnvModules themselves were designed as alternative options. As such, their shell setups have inherently overlapping and conflicting aspects. |
Well I uninstalled Lmod and module use
What else could be preventing the module from appearing? I also ran: Running the following results in the below error:
So is an additional step needed for Environment Modules? EDIT: I see that for Environment Modules I need to use However I get this error:
What's interesting is the help cmd works:
so Line 85:
Here's a
Edit: someone at the environment-modules Git wrote this
Edit: Indeed removing in line 85 the |
So on another cluster I get this error:
|
I was advised to share this, via the environment-module Git
In |
Hello SHPC team. For the point mentioned by @SomePersonSomeWhereInTheWorld, I have just opened a pull request: #682. Regards, |
The modules do not appear with Lmod via module available.
Testing freesurfer, using TCL the module appears as: bids/freesurfer/latest/module.tcl
ml bids/freesurfer/latest/module.tcl results in: ERROR: Resolution loop on 'bids/freesurfer/latest' detected
I've switched between TCL and Lmod. We use the Bright Computing provisioning management system but I don't think that is related.
To Reproduce
Steps to reproduce the behavior:
This was installed within
echo $MODULEPATH
Expected behavior
The module should appear.
Anything else?
If this is a question more than a bug, feel free to move it.