dyld: Symbol not found: _starpu_mpi_world_rank #21

Closed
barracuda156 opened this issue May 15, 2023 · 12 comments

@barracuda156

I am trying to run tests with a port which depends on starpu. However, that fails with a missing symbol:

dyld: Symbol not found: _starpu_mpi_world_rank
  Referenced from: /opt/local/lib/libstarpu-1.4.1.dylib
  Expected in: dynamic lookup

Here is what starpu itself links to:

10:~ svacchanda$ otool -L /opt/local/lib/libstarpu-1.4.dylib
/opt/local/lib/libstarpu-1.4.dylib:
	/opt/local/lib/libstarpu-1.4.1.dylib (compatibility version 2.0.0, current version 2.0.0)
	/opt/local/lib/libMacportsLegacySupport.dylib (compatibility version 1.0.0, current version 1.0.99)
	/opt/local/lib/libhwloc.15.dylib (compatibility version 22.0.0, current version 22.0.0)
	/opt/local/lib/libglpk.40.dylib (compatibility version 44.0.0, current version 44.1.0)
	/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib (compatibility version 1.0.0, current version 219.0.0)
	/opt/local/lib/libgcc/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.30.0)
	/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 751.63.0)
	/System/Library/Frameworks/IOKit.framework/Versions/A/IOKit (compatibility version 1.0.0, current version 275.0.0)
	/opt/local/lib/libgcc/libgcc_s.1.1.dylib (compatibility version 1.0.0, current version 1.1.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 125.2.11)

We do not build it with MPICH due to errors with it, see: #16 (comment)
Here is the portfile we use at the moment: https://github.com/macports/macports-ports/blob/master/devel/starpu/Portfile

@sthibaul
Collaborator

dyld: Symbol not found: _starpu_mpi_world_rank

Could you send the output of make V=1 so we can make sure it's doing it as we expect it to?

We do not build it with MPICH due to errors with it, see: #16 (comment)

Isn't this already fixed in the 1.4 branch?

@barracuda156
Author

barracuda156 commented May 15, 2023

@sthibaul Let me run the build (and the tests as well); I will update soon. I will try without MPICH as well as with MPICH (which failed the last time I tried, though, when I updated starpu in MacPorts).

UPD. So, tests are broken for the same reason:

dyld: Symbol not found: _starpu_mpi_world_rank
  Referenced from: /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_devel_starpu/starpu/work/starpu-9563a47472940f4be9f199ffba10d40ef327cb44/src/.libs/libstarpu-1.4.1.dylib
  Expected in: dynamic lookup

FAIL fault-tolerance/retry (exit status: 133)

Here is the build log:
starpu_1.4_gcc12_10.6.8.log

Same story on a PowerMac, so it is neither a broken env on a specific machine nor a Rosetta issue.

@barracuda156
Author

@sthibaul Please help with this, 1.3 was working reasonably fine, 1.4 is broken for us :(

@barracuda156
Author

Let me try building with MPICH explicitly.

@sthibaul
Collaborator

Here is the build log: starpu_1.4_gcc12_10.6.8.log

OK, so the -Wl,-U -Wl,_starpu_mpi_world_rank option really is passed, but apparently it is not having the expected effect on macOS (allowing the symbol to be undefined). Do you happen to know what option we can pass to properly allow an undefined symbol on macOS? Here we use a weak reference to detect whether libstarpumpi is loaded or not. We added that -U option precisely to allow this, but apparently in your case it is not working?
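
For readers unfamiliar with the technique, here is a minimal sketch of detecting an optional library through a weak reference. It is not StarPU's actual code: `optional_world_rank` is a hypothetical stand-in for a symbol such as `starpu_mpi_world_rank`, and the helper names are made up for illustration. The sketch assumes an ELF platform (for example Linux with GCC or Clang), where an unresolved weak function reference evaluates to a null address at run time; as this thread shows, Darwin did not behave that way.

```c
#include <stddef.h>

/* Hypothetical optional function that would be provided by a separate
 * library; it stands in for a symbol such as starpu_mpi_world_rank.
 * The weak attribute lets an ELF linker leave the reference unresolved. */
extern int optional_world_rank(void) __attribute__((weak));

/* Returns 1 if something in the process defines the symbol, 0 otherwise. */
int optional_library_loaded(void)
{
    /* An unresolved weak function reference has a null address. */
    return optional_world_rank != NULL;
}

/* Returns the rank reported by the optional library,
 * or the caller-supplied fallback when the library is absent. */
int world_rank_or(int fallback)
{
    if (optional_library_loaded())
        return optional_world_rank();
    return fallback;
}
```

When nothing defines `optional_world_rank`, `world_rank_or(-1)` takes the fallback branch and never calls through the null address.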

@barracuda156
Author

@sthibaul Here is the build with MPICH. It actually works (in the sense of reaching completion), but fails with the same missing-symbol error.
starpu_1.4_mpich_10.6.8.log

@barracuda156
Author

barracuda156 commented May 15, 2023

Here is the build log: starpu_1.4_gcc12_10.6.8.log

OK, so the -Wl,-U -Wl,_starpu_mpi_world_rank option really is passed, but apparently it is not having the expected effect on macOS (allowing the symbol to be undefined). Do you happen to know what option we can pass to properly allow an undefined symbol on macOS? Here we use a weak reference to detect whether libstarpumpi is loaded or not. We added that -U option precisely to allow this, but apparently in your case it is not working?

@sthibaul On macOS the correct flag is -undefined dynamic_lookup. I do not think -Wl,-U is supported (I am not sure, but de facto it does not work anyway).
P. S. No need to specify the symbol with it.

UPD. Let me try it first.

@barracuda156
Author

No, that does not work; the build then breaks with the following:

In file included from ../include/starpu.h:49,
                 from ../include/starpu_sched_component.h:22,
                 from sched_policies/component_worker.c:19:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~
In file included from ../include/starpu.h:49,
                 from ./core/jobs.h:24,
                 from sched_policies/component_sched.c:18:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~
In file included from ../include/starpu.h:49,
                 from ../include/starpu_sched_component.h:22,
                 from sched_policies/component_prio.c:17:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~
In file included from ../include/starpu.h:49,
                 from ../include/schedulers/starpu_scheduler_toolbox.h:21,
                 from sched_policies/prio_deque.c:18:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~
In file included from ../include/starpu.h:49,
                 from ../include/starpu_sched_component.h:22,
                 from sched_policies/component_fifo.c:18:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~
In file included from ../include/starpu.h:49,
                 from ../include/starpu_sched_component.h:22,
                 from sched_policies/component_eager.c:17:
../include/starpu_thread.h:338:9: error: unknown type name 'pthread_barrier_t'
  338 | typedef pthread_barrier_t starpu_pthread_barrier_t;
      |         ^~~~~~~~~~~~~~~~~
../include/starpu_thread.h:339:9: error: unknown type name 'pthread_barrierattr_t'
  339 | typedef pthread_barrierattr_t starpu_pthread_barrierattr_t;
      |         ^~~~~~~~~~~~~~~~~~~~~
../include/starpu_thread.h:378:9: error: unknown type name 'pthread_spinlock_t'
  378 | typedef pthread_spinlock_t starpu_pthread_spinlock_t;
      |         ^~~~~~~~~~~~~~~~~~

Need some other solution, it seems.

@sthibaul
Collaborator

that does not work

What do you mean? What did you try exactly?

@sthibaul
Collaborator

It seems that Darwin doesn't actually support weak references; it only supports weak imports, and -undefined dynamic_lookup won't help. The previous -U option indeed doesn't work any more on Darwin. I have thus pushed a change that simply disables the corresponding code; it should become available on GitHub within a day.
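
As a rough illustration of what "disabling the corresponding code" on one platform can look like, here is a minimal sketch. It is not the actual StarPU patch: the macro `EXAMPLE_HAVE_WEAK_DETECTION` and the function names are hypothetical. The only assumption about toolchains is that compilers targeting Darwin predefine `__APPLE__`.

```c
#include <stddef.h>

/* Hypothetical feature macro: the weak-reference detection is compiled in
 * everywhere except on Darwin, where __APPLE__ is predefined. */
#if defined(__APPLE__)
#define EXAMPLE_HAVE_WEAK_DETECTION 0
#else
#define EXAMPLE_HAVE_WEAK_DETECTION 1
/* Hypothetical optional symbol, declared only where detection is enabled,
 * so that no undefined reference ends up in the Darwin build. */
extern int example_optional_rank(void) __attribute__((weak));
#endif

/* Reports whether this build contains the detection code at all. */
int example_detection_supported(void)
{
    return EXAMPLE_HAVE_WEAK_DETECTION;
}

/* Returns 1 if the optional symbol is present; always 0 where the
 * detection has been compiled out. */
int example_optional_library_loaded(void)
{
#if EXAMPLE_HAVE_WEAK_DETECTION
    return example_optional_rank != NULL;
#else
    return 0;
#endif
}
```

The trade-off is that on Darwin the feature relying on this detection is simply unavailable, which matches the approach described in the comment above.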

@barracuda156
Author

It seems that Darwin doesn't actually support weak references; it only supports weak imports, and -undefined dynamic_lookup won't help. The previous -U option indeed doesn't work any more on Darwin. I have thus pushed a change that simply disables the corresponding code; it should become available on GitHub within a day.

Thank you. Yeah, I tried replacing those -U flags in Makefile.am with -undefined dynamic_lookup and with -flat_namespace -undefined suppress, as well as adding them alongside the existing flags. Nothing worked.

@barracuda156
Author

that should become available on github within a day.

@sthibaul I have made a patch and built with it. Everything works now. We only have two failures: #4 (comment)
And they do not look like real failures but rather like an unsupported function, I guess?

Warning: could not get current CPU binding: Function not implemented

nfurmento pushed a commit that referenced this issue May 16, 2023
Darwin doesn't seem to be actually supporting weak references, to detect
when libstarpumpi is linked in. Let's just disable force_mpi_hostnames
support there.

Fixes #21

(cherry picked from commit 68ad5cd770bc4e1cc079062630be3617977c57b6)