Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

elfdeps: Add full multiarch deps support #1038

Conversation

Conan-Kudo
Copy link
Member

This changes elfdeps to emit dependency strings that contain full architecture names instead of just declaring whether something is 64bit. This means that systems that allow more than two architectures to be installed on the same computer will actually be able to resolve library dependencies correctly.

This means that RPM dependencies would be compatible with system library install schemes like Debian's, where libraries are installed into subdirectories under /usr/lib that are named after the platform triple. It also allows for multiarch installations where foreign architecture packages are automatically relocated to be installed under a system root target location (e.g. /usr/<triple>/lib(64)) as is done in distributions like Exherbo.

As part of this change, the legacy behavior is now encoded behind the --biarch-deps flag, which is passed in for both Provides and Requires by default.

This behavior can be enabled by passing --multiarch-deps or by setting %__multiarch_deps to 1 in the spec or vendor configuration.

When setting %__multiarch_deps to 1 in the spec or vendor configuration, legacy Provides are still generated to allow distributions to support packages generated with the legacy ELF dependency generator behavior.

@Conan-Kudo
Copy link
Member Author

@pmatilai @ffesti I finally revised #360. Phew!

@Conan-Kudo Conan-Kudo force-pushed the elfdeps-support-multiarch branch 6 times, most recently from 7ac477c to 5af676e Compare February 2, 2020 03:39
@mikhailnov
Copy link
Contributor

It would be nice to add e2k here

@Conan-Kudo
Copy link
Member Author

Conan-Kudo commented Feb 2, 2020

@mikhailnov what is e2k? This architecture does not exist in rpm...

@mlschroe
Copy link
Collaborator

mlschroe commented Feb 3, 2020

I don't like that it doesn't output anything if neither --biarch-deps not --multiarch-deps is provided. I think it would be better to either change the option to --no-biarch-deps or at least default to biarch if neither option is given.

@Conan-Kudo Conan-Kudo force-pushed the elfdeps-support-multiarch branch 2 times, most recently from b790560 to f7ce837 Compare February 3, 2020 13:29
@Conan-Kudo
Copy link
Member Author

@mlschroe I've added a default to enable biarch when neither option is set.

@Conan-Kudo Conan-Kudo force-pushed the elfdeps-support-multiarch branch from f7ce837 to 7532c16 Compare February 3, 2020 13:50
@Conan-Kudo
Copy link
Member Author

@mlschroe This should be good to look at again. :)

@Conan-Kudo Conan-Kudo force-pushed the elfdeps-support-multiarch branch from 7532c16 to 17a2747 Compare February 4, 2020 12:20
@ffesti
Copy link
Contributor

ffesti commented Feb 20, 2020

Not a fan of the %__multiarch_deps macro tbh.
First it does not allow to completely switch to the new multiarch-deps whihc is something distributions probably want to do at some point. So it does not remove the need to eventually patch the .attr file.
The other question is whether there should be a macro at all. This invites package(r)s to fiddle with it when they probably shouldn't. Are there any use cases other that switching the distribution policy?

@ffesti
Copy link
Contributor

ffesti commented Feb 20, 2020

Otherwise this looks pretty reasonable.

@Conan-Kudo
Copy link
Member Author

@ffesti If someone is trying to build a cross-distro package, accounting for legacy from a newer-ish system (think Google Chrome or Bluejeans/Zoom Linux RPMs being built on a specific host and shipped for everybody), being able to flip behaviors makes sense.

@Conan-Kudo
Copy link
Member Author

Incidentally, that's why I'm always populating biarch Provides. We're going to be stuck with them for a long time due to RHEL and SLE and proprietary software.

@ffesti
Copy link
Contributor

ffesti commented Mar 11, 2020

Meh. I am still not a big fan of this solution, but we can still change the defaults later on. It indeed seems unlikely someone won't ship the compat Provides anytime soon.

@ffesti
Copy link
Contributor

ffesti commented Mar 11, 2020

Oh, another use case for some new architecture detection and handling library.

@pmatilai
Copy link
Member

This needs to be looked at together as a whole with any arch handling changes we are about to do, so we don't end up doing something now that we'll bitterly regret in six months time. So we need to build up the big picture first.

For example, while the arch marker generation seems pretty obvious in general, starting something new with hardcoding ARM stuff gives me the willies. I want ARM to stop being the freak case that always being worked around, and have the design flexible enough to cover it. We cannot foresee all such cases that might come up in the future so maybe the arch marker thing should allow for "arbitrary" multiple tokens that can be reliably parsed, in addition to the main arch name.

@Conan-Kudo
Copy link
Member Author

For example, while the arch marker generation seems pretty obvious in general, starting something new with hardcoding ARM stuff gives me the willies. I want ARM to stop being the freak case that always being worked around, and have the design flexible enough to cover it.

I'm not looking forward to RISC-V potentially adding another set of freak cases either... 😢

Thankfully, we've so far managed to avoid it, but...

@Conan-Kudo
Copy link
Member Author

I think I did okay in this PR avoiding many freak cases, though I definitely see your point about the broader architecture handling.

@berolinux
Copy link
Contributor

For the package overall, it may be best to just use the Provides/Requires mechanism with multiple things being generated, e.g.
Requires: cpu(arm)
Requires: cpu(neon)
Requires: cpu(thumb)

or
Requires: cpu(x86_64)
Requires: cpu(mmx)
Requires: cpu(sse)
Requires: cpu(sse2)

where cpu() features would be automatically Provide:d by rpm looking at the likes of /proc/cpuinfo, and could be manually provided by stuff like qemu-binfmt packages.

Of course it gets a lot trickier when we're dealing with the per-file libc.so.6()(64bit) bits, not sure we want something huge like Provides: libc.so.6(x86_64)(mmx)(sse)(sse2)(ssse3)(sse4)(sse4_1)(sse4_2)(avx)(avx2) and obviously something requiring that wouldn't necessarily know that libc.so.6(x86_64) or libc.so.6(x86_64)(mmx) would be sufficient to fulfill the dependency.
But obviously the nice thing about it would be that tools like dnf could automatically pick between libc.so.6(x86_64)(mmx)(sse)(sse2)(ssse3)(sse4)(sse4_1)(sse4_2)(avx)(avx2), libc.so.6(x86_64)(mmx)(sse)(sse2) and libc.so.6(x86_64) if multiple options are provided.

Of course going down further that route, it may also be necessary to provide information about CPU timings to allow picking between e.g. an Intel optimized and an AMD optimized build even if both will work on either type (-march=generic -mtune=whatever).
But something like
libc.so.6(x86_64)(mmx)(sse)(sse2)(ssse3)(sse4)(sse4_1)(sse4_2)(avx)(avx2)(tune:amd)
can get pretty long...

@pmatilai
Copy link
Member

@Conan-Kudo , I didn't say this looks bad or anything like that. It's just that we're talking about a change with massive, long-standing effects. With such a thing, we really need to know where we are going and where we want to be in 10 years rather than just "this change doesn't look half bad so why not".

@pmatilai
Copy link
Member

pmatilai commented Sep 9, 2020

Been thinking about this on and off. I'm totally convinced that we should record the dependency architecture, rather than the existing super dumb ()64bit() marker thing. I'm just far less convinced that the architecture should be embedded in each and every dependency string we store.

If we store the extracted dependency strings as-is, and associate the arch via a separate indexed arch table there are several benefits available:

  • The dependency strings are identical across architectures, making data compress better and making comparisons and queries saner, dependency on zlib is always "libz.so.1" and not some associated mumble
  • As the extracted dependency architectures are stored as an array of their own, we gain a kind of an automatic RPMTAG_ARCH replacement that is an array consisting of all the architectures this package depends on/provides something for. A "normal" package would of course only have one arch in there, but it might open up interesting possibilities to handle this generally, just like this PR doesn't care whether something is packageally correct or not, it just records what it finds.
  • Constructing dep(marker) style strings out of these arrays is easy if needed for eg repodata (in the short term)

@Conan-Kudo
Copy link
Member Author

@pmatilai Personally, I'd also like a solution to an annoying case that I have on macOS where I have binaries with multiple architectures in them. What you're saying here could probably be expanded to handle that too, I suppose, since it essentially turns architectures into qualifiers for files/dependencies, and we support having multiple qualifiers per dependency already.

@pmatilai
Copy link
Member

Apparently linking from discussions doesn't behave the same as with tickets, so linking this here manually: #2060

This changes elfdeps to emit dependency strings that contain full
architecture names instead of just declaring whether something is
"64bit". This means that systems that allow more than two architectures
to be installed on the same computer will actually be able to resolve
library dependencies correctly.

This means that RPM dependencies would be compatible with system library
install schemes like Debian's, where libraries are installed into
subdirectories under "/usr/lib" that are named after the platform
triple. It also allows for multiarch installations where foreign
architecture packages are automatically relocated to be installed under
a system root target location (e.g. "/usr/<triple>/lib(64)") as is done in
distributions like Exherbo.

As part of this change, the legacy behavior is now encoded behind the
--biarch-deps flag, which is passed in for both Provides and Requires
by default.

This behavior can be enabled by passing --multiarch-deps or by setting
%__multiarch_deps to 1 in the spec or vendor configuration.

When setting %__multiarch_deps to 1 in the spec or vendor configuration,
biarch Provides are still generated to allow distributions to support
packages generated with the legacy ELF dependency generator behavior.
@Conan-Kudo Conan-Kudo force-pushed the elfdeps-support-multiarch branch from 17a2747 to 4948648 Compare August 11, 2022 09:15

/* First the machine type... */
switch (ehdr->e_machine) {
case EM_SPARC:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most(*) 32-bit Linux binaries are normally of type EM_SPARC32PLUS (that's v8+ or v8a), so it should be added.

(*) very unscientific measurement based on http://archive.debian.org/debian-archive/debian/pool/main/g/glibc/libc6-sparcv9_2.3.6.ds1-13etch10_sparc.deb

@berolinux
Copy link
Contributor

I like the patch, but I'm wondering if it wouldn't be better to go a step further and put the whole target triplet instead of just the arch marker.
On systems that can emulate other systems (think qemu, or Linux emulation on FreeBSD) or even systems with multiple libcs or multiple float ABIs, it can make sense to have libxyz.so.1(x86_64-linux-gnu), libxyz.so.1(x86_64-freebsd), libxyz.so.1(x86_64-linux-musl), libxyz.so.1(i686-pc-linux-gnu) etc. all coexisting (and located in /usr//lib as mentioned in the initial message).

@Conan-Kudo
Copy link
Member Author

I like the patch, but I'm wondering if it wouldn't be better to go a step further and put the whole target triplet instead of just the arch marker.

That's a lot more extra information that can be derived from other dependencies. For example, knowing whether it's musl or glibc is easy enough to derive from the libc dependency that gets added anyway. The rest of the triplet is similarly derivable.

@pmatilai
Copy link
Member

Okay, I think this has sat here long enough.

We basically already came to a decision to add multiarch support to rpm v6 format in 2019 (see #2197) but the details left open. I'm not merging it as I don't think this is exactly how it's going to be, so I'm closing this now, lets move the discussion to the ticket instead. Thanks Neal for the initiative and the patch, this probably will serve as a basis regardless (so I've grabbed a copy of this patch, but please don't delete the branch anyway).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants