flux-proxy on Lassen: service.add: missing sender uuid #2685

Closed
dongahn opened this issue Jan 28, 2020 · 4 comments

@dongahn
Member

dongahn commented Jan 28, 2020

This is another issue that @JaeseungYeom and I hit on Lassen (in addition to Issue #2684).

We launched the flux-0.11.x-20190425 version with sleep -inf:

lassen259{dahn}32: env PMI_LIBRARY=/usr/global/tools/pmi4pmix/blueos_3_ppc64le_ib/20191120/lib/libpmi.so jsrun -a 1 -c ALL_CPUS -g ALL_GPUS --bind=none -n 4 /usr/global/tools/flux/blueos_3_ppc64le_ib/flux-0.11.x-20190425/bin/flux start flux ~/ip.sh
ssh://lassen259/var/tmp/flux-8pfahZ

Then, on a Lassen login node, we used flux-proxy to connect to this FLUX_URI and tried to run a parallel program.

lassen708{dahn}9: /usr/global/tools/flux/blueos_3_ppc64le_ib/flux-0.11.x-20190425/bin/flux proxy ssh://lassen259/var/tmp/flux-8pfahZ
lassen708{dahn}21: flux wreckrun -N 4 -n 4 ~/testcases/parallel_dbg_target/parallel_dbg_target/virtual_ring_mpi

On the compute node, flux printed the following error message, and the job did not appear to run.

2020-01-28T01:18:58.210480Z proxy.err[0]: response_cb: topic service.add: missing sender uuid

This may be a problem that has already been fixed in a newer version, but we couldn't test that because of issue #2684.

@grondo
Contributor

grondo commented Jan 28, 2020

I think this may be a duplicate of flux-framework/flux-core-v0.11#21.

You can't use service.add over flux-proxy.

Just ssh to the host and set FLUX_URI instead.
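
A minimal sketch of that workaround, using the URI from the report above. It assumes the instance's socket on the compute node is reachable through the `local://` form of the same path that appears in the `ssh://` URI; that mapping is an assumption here, so check the `FLUX_URI` the instance itself reports. The variable names are illustrative, and the commands are only printed, not executed.

```shell
#!/bin/sh
# URI printed by the instance in the original report.
remote_uri="ssh://lassen259/var/tmp/flux-8pfahZ"

# Split the ssh:// URI into host and socket path with POSIX parameter expansion.
rest="${remote_uri#ssh://}"      # lassen259/var/tmp/flux-8pfahZ
host="${rest%%/*}"               # lassen259
sock_path="/${rest#*/}"          # /var/tmp/flux-8pfahZ

# Assumed local form of the same URI, for use on the host itself.
local_uri="local://${sock_path}"

# Print the steps to run by hand instead of going through flux proxy.
printf 'ssh %s\n' "$host"
printf 'export FLUX_URI=%s\n' "$local_uri"
printf 'flux wreckrun -N 4 -n 4 <program>\n'
```

With `FLUX_URI` set directly on the host, the job is submitted without going through flux-proxy, which is where the `service.add` failure was reported.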

@dongahn
Member Author

dongahn commented Jan 28, 2020

Thanks @grondo. Of course, a better option would be to fix Issue #2684 and use the new version.

@grondo
Contributor

grondo commented Jan 29, 2020

I'm going to close this issue since it is against flux-core-v0.11, which has its own issue tracker, and this issue has been found to be a duplicate.

@grondo closed this as completed Jan 29, 2020

@dongahn
Member Author

dongahn commented Jan 29, 2020

Yes, this is fine. I will test this once we fix the other problem.
