Unable to view jobs on Slurm Queue Manager #49
Admittedly I'm not sure what error the screenshot is displaying. So basically you're saying that the backend appears to be working correctly (i.e. the plumbing that bosses Slurm around according to the user's instructions), but the frontend isn't reporting everything that is going on? As others would be quick to point out for me, my knowledge of JupyterLab-Slurm is embarrassingly out-of-date at this point, but I can at the very least try to help you debug based on my (again out-of-date) experience developing it? Versioning:
Minimal Reproducibility: I imagine this part might not be strictly necessary, but that's what I always tried to have in hand when trying to debug issues with the frontend.
Web Browser Error Logs: Because often the errors most useful for debugging frontend issues are found there (I think, although it might have been primarily useful for validating HTTP requests, i.e. the functioning of the backend).
In any case, since this seems to be a frontend/UI issue based on your description, my guess (and again current/active developers will know this 100x+ better than I do either way) is that being able to share any logged JavaScript errors would be one of the most important pieces of information. I know none of this directly helps you at the moment, but at the very least it should help to make the error more precise, and then hopefully identify what is going on and how it could be fixed. Please let me know your thoughts.
|
OK, this appears to be very helpful in terms of additional information. Since I currently work at a new job now, and did not help implement the newest changes to the frontend, I don't really know/understand how the frontend works anymore. In particular I have never written a substantial piece of software using React myself and don't really understand it. So I should admit to you up front that I personally probably won't be able to help you with your issue, although hopefully someone else might be able to. It is good to see that the jobs are actually being submitted, but you're right that it (more than) somewhat defeats the purpose of implementing a GUI if the GUI doesn't confirm for the user that this happened.
@zainul1114 I notice that in the screenshot the "Show my jobs only" checkbox is clicked.
1. Can you describe, as well as show with screenshots, what happens when it's unclicked?
When I worked on this, getting the "show my jobs only" box to work tended to be difficult/problematic. Also when I last worked on this, the structure was somewhat "brittle" with respect to the format in which squeue responded to commands https://slurm.schedmd.com/squeue.html so e.g. if in a newer version of Slurm the format in which squeue returns output has changed, maybe that could cause problems?
2. Did you already say what version of Slurm your group is using? Hopefully a version incompatibility with Slurm is not the issue, but based on my limited knowledge it maybe can't yet be ruled out?
Sorry again for not being able to provide you with any definitive answers. |
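To make the "brittleness" concern above concrete, here is a minimal Python sketch of one way to reduce dependence on squeue's default column layout: request an explicit, delimiter-separated format and fail loudly when a line does not match. This is not the extension's actual code. The field list, the `|` delimiter, and the function names `parse_squeue`, `my_jobs`, and `fetch_queue` are assumptions made for illustration; only the `squeue` options `--noheader` and `--format` and the `%i %P %j %u %T %M %D %R` specifiers come from the squeue documentation linked above.

```python
import subprocess

# Assumed column layout for this sketch; the extension's real format may differ.
FIELDS = ["job_id", "partition", "name", "user", "state", "time", "nodes", "reason"]
SQUEUE_FORMAT = "%i|%P|%j|%u|%T|%M|%D|%R"


def parse_squeue(text):
    """Turn '|'-separated squeue output into a list of dicts.

    Limitation: a job name containing '|' would break this simple split.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        parts = line.split("|")
        if len(parts) != len(FIELDS):
            # Surface a format change instead of silently showing an empty table.
            raise ValueError("unexpected squeue line: %r" % line)
        rows.append(dict(zip(FIELDS, parts)))
    return rows


def my_jobs(rows, user):
    """Hypothetical equivalent of a 'show my jobs only' filter."""
    return [row for row in rows if row["user"] == user]


def fetch_queue():
    """Run squeue with the explicit format (requires Slurm to be installed)."""
    result = subprocess.run(
        ["squeue", "--noheader", "--format=" + SQUEUE_FORMAT],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_squeue(result.stdout)
```

If a check like this raised an error on a given cluster, that would point toward a Slurm output-format incompatibility rather than a purely frontend problem.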
Thanks for the feedback...
Hopefully I will get a response from the JupyterLab-Slurm 2.0 team.
…On Tue, Mar 2, 2021 at 12:02 AM krinsman ***@***.***> wrote:
OK, this appears to be very helpful in terms of additional information.
Since I currently work at a new job now, and did not help implement the
newest changes to the frontend, I don't really know/understand how the
frontend works anymore. In particular I have never written a substantial
piece of software using React myself and don't really understand it. So I
should admit to you up front that I personally probably won't be able to
help you with your issue, although hopefully someone else might be able to.
It is good to see that the jobs are actually being submitted, but you're
right that it (more than) somewhat defeats the purpose of implementing a
GUI if the GUI doesn't confirm for the user that this happened.
@zainul1114 <https://github.com/zainul1114> I notice that in the
screenshot the "Show my jobs only" checkbox is clicked.
*1.* Can you describe, as well as show with screenshots, what happens
when it's unclicked?
When I worked on this, getting the "show my jobs only" box to work tended
to be difficult/problematic. Also when I last worked on this, the structure
was somewhat "brittle" with respect to the format in which squeue responded
to commands https://slurm.schedmd.com/squeue.html so e.g. if in a newer
version of Slurm the format in which squeue returns output has changed,
maybe that could cause problems?
*2.* Did you already say what version of Slurm your group is using?
Hopefully a version incompatibility with Slurm is not the issue, but based
on my limited knowledge it maybe can't yet be ruled out?
Sorry again for not being able to provide you with any definitive answers.
--
*Regards*
*Zain*
|
|
Hi Team, Please help us to resolve this. |
Hi @zainul1114 ! Thank you for your continued interest in this project! It is really gratifying to know that people actually use and appreciate your work. I noticed you're running JupyterLab 2.2? Some questions:
I know the current maintainers have a lot of work maintaining numerous other projects as well, so the more help you can provide for them in advance in narrowing down the scope of the possible sources of the issue, the more likely it is to be resolved. Speaking personally as someone who is not currently actively maintaining this repository, and not for anyone else, I know that I would not want to look into this further without knowing the answers to at least some of the above questions. EDIT: To refresh/check my understanding of these tools and Docker, I've been considering this weekend or the next possibly trying to create a Docker container that could reproduce this issue. Then I might be able to help you identify the source of the problem or even propose a solution (no promises though). But currently I'm afraid to even try without at least knowing the specific version of Slurm involved, and without being able to rule out earlier or later versions of JupyterLab as the fix. |
Thank you, this is already very helpful! I was hoping it would maybe be a resurfacing of an old bug (cf. #23 ) due to the updated dependencies in the new versions, but it appears more complicated than that... |
@zainul1114 FYI don't bother trying to test this with JupyterLab 3.0, since that appears to not be supported yet:
1. Also you said you are using JupyterLab-slurm 1.05 with JupyterLab 2.2?
Not all versions of JupyterLab-Slurm are compatible with JupyterLab 2+ unfortunately -- I don't remember whether 1.05 is one of them.
2. Also have you tried running JupyterLab-Slurm without JupyterHub (you might need SSH port forwarding to do this)? Admittedly that probably isn't the issue, since when it was the issue in the past the reason was that using JupyterHub required the HTTP request command syntax to be modified in a way that the old version of JupyterLab-Slurm didn't yet support. The HTTP server backend for the version (
3. I spent 3+ hours this weekend trying to create a Docker container to reproduce the issue but haven't finished yet. (The container build process is relatively slow, making it difficult to iterate quickly since simple mistakes often take 15-20 minutes to recover from.) You can see my progress so far here and here. I might try to finish this next weekend using the guide here. In all honesty I probably should have made a container doing this a long time ago while working on the project. I believe my rationalization at the time was that I was lazy and able to debug most of the problems using the system's Slurm installation and the like. Anyway hopefully a "Dockerized" minimal working example of the bug will make it easier to identify and fix. |
Hi, I tried this over the weekend. Then I changed jupyterlab-slurm to version 1.0.5 and tried again.
Here I tried to submit a job and I am getting "Unknown error encountered while submitting the script. Try again later." When I tried jupyter labextension [email protected] I got the output below.
Please guide me on the scenarios below, as I am not clear about them. Regards, |
I haven't had time yet to look into the results of the version checks you did -- maybe later this week.
A. I don't know either way to be honest.
B. They should be -- there isn't really any saved/stored environment variable. Under the hood jupyterlab-slurm just calls
C. The server extension/backend hypothetically might work (the python package). (At least it should work with the same Python backend as JupyterLab does, and I think classical notebook and JupyterLab share the same Python backend and/or that situation is in flux.) The NPM part/the lab extension definitely wouldn't work, since JupyterLab and Notebook have completely different frontends. |
Thanks for the Update. |
@zainul1114 I'm a little confused, because it seems like multiple errors are reported
So to confirm,
correct? And you installed both (i) the LabExtension (NPM/Node) part of JupyterLab-Slurm 2.0, and the (ii) ServerExtension (Python3) part of JupyterLab-Slurm 2.0, correct?
This is where I'm most confused. 1. So you did not get the same result as before (although you had expected to). Is that correct? 2. To confirm, JupyterLab 2.1.2 and JupyterLab-Slurm 2.0 loaded, but this combination did not work? 3. Could you rephrase what the error was that happened when you used JupyterLab 2.1.2 and JupyterLab-Slurm 2.0? Also my guess is that it probably is only possible to provide support for the most recent version of JupyterLab-Slurm. (In any case I'm looking into this on my own free time and am not being paid at all for it anymore.) My point being that it would probably work out best for you if we can get JupyterLab-Slurm running for you. 4. Is there anything preventing you from running JupyterLab 2.1.2 on your system? If so we can still try to see if we can debug JupyterLab-Slurm 1.0.5, and maybe submit a patch if possible. But I just want to warn you in advance that it is less likely that the current developers will be able to support that version in the future, even if we do debug it now.
Here is also where I'm confused -- it looks like you've installed the LabExtension part of JupyterLab-Slurm 2.0.0, while reverting the ServerExtension part back to JupyterLab-Slurm 1.0.5? If that is the case, then it makes sense that they might no longer work with one another? I know it's confusing, because both programs are developed in the same repository, and they are designed to work together, but technically it makes sense to think of JupyterLab-Slurm as 2 programs: (1) a "ServerExtension" ( This isn't an ideal situation and I could complain more about it to you, but suffice it to say for now that at least for the near or even indefinite future it seems unavoidable.
To confirm, this happened when you ran both (1) JupyterLab-Slurm ServerExtension/Python3-Pip package ("backend") version 1.0.5 and (2) JupyterLab-Slurm LabExtension/NPM-NodeJS package ("frontend") version 2.0.0 at the same time? This would make sense, since probably the backend was changed along with the frontend and so version 1.0.5 of the backend may no longer be compatible with version 2.0.0 of the frontend.
OK, so here you are trying to install version 1.0.5 of the frontend/NPM-NodeJS package/LabExtension to work along with the version 1.0.5 of the backend/Python3-Pip/ServerExtension? I am not sure why this would fail. To confirm first, though, are you unable to run JupyterLab 2.1.2 on your system? Again, if you are able to do that, probably the ideal situation would be to run that and figure out how we can get JupyterLab-Slurm 2.0.0 to work with that. (Assuming I understood you correctly in saying that JupyterLab-Slurm 2.0.0 does not work with JupyterLab 2.1.2, even when both version 2.0.0 of the frontend of JupyterLab-Slurm and version 2.0.0 of the backend of JupyterLab-Slurm were installed.) All the more so since the most recent version of JupyterLab is 3.0 + , so probably JupyterLab-Slurm will need to be updated even further in the near future to support that (if it hasn't already, I don't remember), making JupyterLab-Slurm version 1.0.5 even more "obsolete" or at least more difficult to support. |
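As a rough illustration of the frontend/backend mismatch discussed above, the following Python sketch compares the two version strings once they have been read off by hand (for example from `jupyter labextension list` for the LabExtension and `pip show jupyterlab-slurm` for the ServerExtension). The rule "the major versions must match" is an assumption made for this sketch, not a documented compatibility policy of JupyterLab-Slurm; the thread only establishes that LabExtension 2.0.0 and ServerExtension 1.0.5 were not designed to work together. The names `major` and `same_major` are hypothetical.

```python
def major(version):
    """Return the leading integer of a version string such as '2.0.0' or 'v2.1.2'."""
    return int(version.strip().lstrip("v").split(".")[0])


def same_major(frontend_version, backend_version):
    """Assumed rule for this sketch: frontend and backend must share a major version."""
    return major(frontend_version) == major(backend_version)


if __name__ == "__main__":
    # Versions mentioned in this thread, typed in by hand.
    for frontend, backend in [("2.0.0", "2.0.0"), ("2.0.0", "1.0.5")]:
        status = "looks consistent" if same_major(frontend, backend) else "MISMATCH"
        print("LabExtension %s + ServerExtension %s: %s" % (frontend, backend, status))
```

A check like this only flags the obvious case. It says nothing about whether a given pair also works with a particular JupyterLab release.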
Yes, that is correct.
Let's hope for the best; in the near future we might get a proper version of jupyterlab-slurm 2.x/3.x from the maintainer.
|
@zainul1114 Can you please confirm more explicitly which parts are correct? Or at the very least could you answer:
1. Are you able, or not able, to run the latest versions of JupyterLab (2.1.2 or 3.0.0) on your system?
2. Can you rephrase what the error was when you ran JupyterLab-Slurm 2.0.0 and JupyterLab 2.1.2 on your system?
3. Did you ever run both the
From what you've described, it sounds like so far you only tried running version 2.0.0 of the LabExtension with version 1.0.5 of the ServerExtension, which were never designed to work together. Based on your descriptions I'm not sure whether there's any bug when the current versions of all components:
are all run together simultaneously. |
So I've spent about two hours this morning trying to run Slurm in a Docker container. What I was able to figure out is that the latest version of Ubuntu (Hirsute, even the 20 LTS only install Slurm 19.X) needs to be installed to run Slurm 20+, and that JupyterLab-Slurm 2.0.0 appears to be incompatible with JupyterLab 2.2.0+, but works (or at least JupyterLab builds) with JupyterLab 2.1.2. cf. here I still have not actually been able to test JupyterLab-Slurm with the container, because although
does not contain @zainul1114 If you have any insight from past experience why |
Hi, https://github.com/giovtorres/slurm-docker-cluster
Later, I tried to set up jupyterlab-slurm on the slurmctld container as follows.
Here, JupyterHub is running, but I am not able to access the JupyterHub container from my physical machine at ipaddress:port, e.g. https://192.168.1.195:8090, and I do not understand how to expose the port for external access. Once we resolve this issue, we will be able to trace our actual problem (unable to view jobs on Slurm Queue Manager). |
@zainul1114 Sorry I haven't had the cycles to look into this over the past two weekends.
Regarding your specific question, are you asking how to view newly submitted jobs without pressing the reload button? I.e. you submit a job, and then it doesn't immediately appear in the table? And you would like to know what to do to have it appear in the table?
By default, the frontend waits 60 seconds/1 minute to send a request to the backend to update the table:
Have you submitted a new job and timed it with a stopwatch to confirm the new job does not appear after 60 seconds or more? If it does appear, but only slowly/after 60 seconds then you could try changing the configuration to update the table more frequently by default. If I recall correctly, basically the idea was to avoid potentially "spamming" the Slurm system by calling
I don't remember whether there was any development towards somehow automatically reloading/sending a new
Anyway if the jobs do show up after 60 seconds (without pressing the reload button yourself) then maybe we should close this issue so you can make a feature request to automatically send an
I'm not sure how that might be done though. |
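The trade-off described above (refresh often enough to show new jobs, but not so often that the Slurm system is flooded with queue queries) can be sketched as a simple throttle. The real frontend is written in TypeScript/React and is not shown here; this Python version only illustrates the idea. The class name `RefreshThrottle`, its methods, and the injectable `clock` are hypothetical, and the 60-second default is taken from the comment above.

```python
import time


class RefreshThrottle:
    """Allow an automatic refresh at most once per `interval` seconds."""

    def __init__(self, interval=60.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock  # injectable so the behaviour can be checked without waiting
        self._last = None   # time of the most recent refresh, if any

    def due(self):
        """Return True (and restart the timer) if a refresh is allowed now."""
        now = self.clock()
        if self._last is None or now - self._last >= self.interval:
            self._last = now
            return True
        return False

    def force(self):
        """Record a manual reload (e.g. the reload button) and restart the timer."""
        self._last = self.clock()
```

With a shorter `interval` new jobs would appear sooner, at the cost of more frequent queue queries; calling `force()` after a job submission would be one hypothetical way to tie a refresh to user actions.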
Hi,
I have tried to submit jobs with Slurm Queue Manager on JupyterHub; the jobs run and produce the expected output. But I am not able to view the jobs while they are running or after they have completed, i.e. the job rows do not appear in Slurm Queue Manager.
My Environment:
Python 3.6.9
jupyterlab-slurm 2.0.0
jupyterhub 1.1.0
jupyterlab 2.2.0
However, I am able to get the details from sacct on the command line. Please find the attached screenshot and help me to resolve this issue.
Regards,
Zain