What is the recommended way to run tasks with an HTCondor workflow that depend on other tasks with HTCondor workflows?
Concretely, I have a task called FTest which has subtasks FTestCategory. The latter must run with HTCondor. FTestCategory has a requirement called Trees2WS, which in turn consists of subtasks Trees2WSSingleProcess that should run on HTCondor as well.
When I execute `law run FTest --workers 4`, law creates the Condor submission for FTestCategory and, on the respective worker node, the Condor submission for Trees2WSSingleProcess. This ultimately fails, since on LXPLUS the Condor nodes themselves cannot access the schedd.
This is the resulting error, which I guess is due to the schedd being inaccessible from the Condor nodes:
Traceback (most recent call last):
  File "/afs/cern.ch/user/n/niharrin/cernbox/PhD/Higgs/CMSSW_14_1_0_pre4/src/flashggFinalFit/law/install_dir/lib/python3.9/site-packages/luigi/worker.py", line 210, in run
    new_deps = self._run_get_new_deps()
  File "/afs/cern.ch/user/n/niharrin/cernbox/PhD/Higgs/CMSSW_14_1_0_pre4/src/flashggFinalFit/law/install_dir/lib/python3.9/site-packages/luigi/worker.py", line 138, in _run_get_new_deps
    task_gen = self.task.run()
  File "/afs/cern.ch/user/n/niharrin/cernbox/PhD/Higgs/CMSSW_14_1_0_pre4/src/flashggFinalFit/law/install_dir/lib/python3.9/site-packages/law/workflow/remote.py", line 628, in run
    return self._run_impl()
  File "/afs/cern.ch/user/n/niharrin/cernbox/PhD/Higgs/CMSSW_14_1_0_pre4/src/flashggFinalFit/law/install_dir/lib/python3.9/site-packages/law/workflow/remote.py", line 700, in _run_impl
    self.submit()
  File "/afs/cern.ch/user/n/niharrin/cernbox/PhD/Higgs/CMSSW_14_1_0_pre4/src/flashggFinalFit/law/install_dir/lib/python3.9/site-packages/law/workflow/remote.py", line 882, in submit
    job_ids, submission_data = self._submit_group(submit_jobs)
  File "/afs/cern.ch/user/n/niharrin/cernbox/PhD/Higgs/CMSSW_14_1_0_pre4/src/flashggFinalFit/law/install_dir/lib/python3.9/site-packages/law/contrib/htcondor/workflow.py", line 190, in _submit_group
    c, p = job_id.split(".")
AttributeError: 'Exception' object has no attribute 'split'
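The last frame indicates that one of the values treated as a job id is an exception object rather than a string. The following is a minimal sketch of how that can happen, not law's actual code: it assumes a hypothetical `submit_all` helper that returns the exception in place of the job id when a submission fails (for example because the schedd is unreachable), and a hypothetical `parse_job_ids` that checks the type before splitting. The job id `"1234.0"` is made up.

```python
def fake_submit(job_file, schedd_reachable):
    # hypothetical stand-in for a condor_submit call
    if not schedd_reachable:
        raise Exception("cannot reach schedd for {}".format(job_file))
    return "1234.0"  # made-up "<cluster>.<process>" job id


def submit_all(job_files, schedd_reachable):
    # assumption: failures are returned in place of job ids, not raised
    results = []
    for job_file in job_files:
        try:
            results.append(fake_submit(job_file, schedd_reachable))
        except Exception as e:
            results.append(e)
    return results


def parse_job_ids(results):
    # separate usable ids from failed submissions before calling split()
    parsed, errors = [], []
    for res in results:
        if isinstance(res, Exception):
            errors.append(res)
        else:
            cluster, process = res.split(".")
            parsed.append((int(cluster), int(process)))
    return parsed, errors


results = submit_all(["job_0.jdl"], schedd_reachable=False)
try:
    # splitting blindly raises the same kind of AttributeError as above
    results[0].split(".")
except AttributeError as e:
    print(e)

parsed, errors = parse_job_ids(results)
print(parsed, [str(e) for e in errors])
```

If this is indeed the mechanism, the AttributeError only masks the underlying submission failure, which would be carried by the exception object itself.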
How do you handle such chained HTCondor workflows?
Thanks a lot!!
Two things before going into the details of the workflow -> task -> workflow pattern.
The error you are seeing is a bug that we also stumbled upon recently. I will hopefully have time late next week to debug this further. It's quite elusive and seems to appear only in edge cases (at least on our end).
To make sure I understand, is this the situation you want to achieve? (workflows have a purple border)
flowchart TD
%% aliases
ftest(FTest)
ftestcat1[FTestCategory]
ftestcat2[FTestCategory]
t2ws1(Trees2WS)
t2ws2(Trees2WS)
t2wss11[Trees2WSSingleProcess]
t2wss12[Trees2WSSingleProcess]
t2wss21[Trees2WSSingleProcess]
t2wss22[Trees2WSSingleProcess]
%% styles
classDef WF stroke: #83b, stroke-width: 3px
%% assign styles
class ftestcat1 WF
class ftestcat2 WF
class t2wss11 WF
class t2wss12 WF
class t2wss21 WF
class t2wss22 WF
%% actual graph
ftest --> ftestcat1
ftest --> ftestcat2
ftestcat1 --> t2ws1
ftestcat2 --> t2ws2
t2ws1 --> t2wss11
t2ws1 --> t2wss12
t2ws2 --> t2wss21
t2ws2 --> t2wss22
If not, feel free to change the graph and paste it here in GH in a ```mermaid code box.
Yes, for now this is the situation I want to achieve. Ideally, Trees2WS should run only once per execution of law (as it produces all the ingredients for FTestCategory).
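On running Trees2WS only once: luigi, on which law builds, identifies a task by its class and parameter values, so requirements that resolve to the same identity are scheduled as a single task. The plain-Python sketch below is not law or luigi code; `run_once` and the `deps` mapping are hypothetical and only illustrate the idea for the case where both FTestCategory branches point to one shared Trees2WS.

```python
# hypothetical dependency map: task name -> names of required tasks
deps = {
    "FTest": ["FTestCategory_0", "FTestCategory_1"],
    "FTestCategory_0": ["Trees2WS"],
    "FTestCategory_1": ["Trees2WS"],
    "Trees2WS": ["Trees2WSSingleProcess_0", "Trees2WSSingleProcess_1"],
    "Trees2WSSingleProcess_0": [],
    "Trees2WSSingleProcess_1": [],
}


def run_once(task, deps, done=None, order=None):
    # depth-first traversal that handles requirements first and skips
    # any task whose name was already handled
    done = set() if done is None else done
    order = [] if order is None else order
    if task in done:
        return order
    for req in deps[task]:
        run_once(req, deps, done, order)
    done.add(task)
    order.append(task)
    return order


order = run_once("FTest", deps)
print(order)
```

In this sketch the shared Trees2WS appears once in the execution order, ahead of both FTestCategory entries, because it is keyed by a single name instead of one name per category.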