[Q&A] Jobs not being submitted to the correct server when hosting two servers on the same machine #2657
@dima1997 thanks for the question!

By default, jobs are stored in "/tmp/nvflare/jobs-storage". This is configured here: https://github.com/NVIDIA/NVFlare/blob/2.3.8/nvflare/lighter/impl/master_template.yml#L200-L220

So for your case, where you want to "run two NVIDIA FLARE servers on the same physical machine with the same FQDN name but with different associated ports", you need to modify the "project.yml" to read in a different version of "master_template.yml" before you provision the second server. Otherwise, the two servers will use the same directory for job management, which is why you see the failure.

So for example, copy "master_template.yml" to "my_template.yml" and change the job storage path in the copy. Then, in "my_project.yml", reference "my_template.yml" instead of "master_template.yml". Finally, run the provision with "my_project.yml".

@IsaacYangSLA @SYangster maybe we want to add this to our documentation.
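
The copy-and-modify step can be sketched in Python as a plain text substitution. This is only an illustration, not part of NVFlare: the function names are hypothetical, the only value taken from the answer is the default "/tmp/nvflare/jobs-storage" path, and it assumes that path appears literally in your copy of "master_template.yml". Check the template for any other shared paths you may also want to change.

```python
from pathlib import Path

# Default job storage location quoted in the answer above.
DEFAULT_JOBS_ROOT = "/tmp/nvflare/jobs-storage"


def retarget_job_storage(template_text: str, new_root: str,
                         old_root: str = DEFAULT_JOBS_ROOT) -> str:
    """Return the template text with every occurrence of old_root replaced by new_root.

    Hypothetical helper: a literal string replacement, not a YAML-aware edit.
    """
    if old_root not in template_text:
        # Fail loudly rather than write an unchanged copy that still shares storage.
        raise ValueError(f"{old_root!r} not found in template text")
    return template_text.replace(old_root, new_root)


def copy_template(src: Path, dst: Path, new_root: str,
                  old_root: str = DEFAULT_JOBS_ROOT) -> None:
    """Read the template at src, retarget its job storage path, and write it to dst."""
    text = src.read_text(encoding="utf-8")
    dst.write_text(retarget_job_storage(text, new_root, old_root), encoding="utf-8")
```

For instance, `copy_template(Path("master_template.yml"), Path("my_template.yml"), "/tmp/nvflare/jobs-storage-server2")` would write a second template whose job storage directory no longer collides with the first server's; the "-server2" path is just an example name.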