Investigate whether it is worth it to enable XXH3_128bit build ids in LLVM, Mold and Binutils. #4603

Open
ermo opened this issue Dec 17, 2024 · 1 comment
Labels
Performance (Related to measuring or improving performance per benchmark numbers)
Type: Feature (Something can be enhanced.)

Comments

@ermo
Contributor

ermo commented Dec 17, 2024

Since #4570 and #4575, BLAKE3 hashes (truncated to 160 bits / 20 bytes) are supported and are already a fair bit faster than SHA-1.

This issue is for documenting the speedup of XXH3_128bit vs. SHA-1 and BLAKE3, and then deciding whether that speedup is large enough to make it worth pursuing.

See #1346 for context.
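
As a starting point for documenting the speedup, here is a minimal timing sketch. It is an assumption-laden stand-in rather than a real linker benchmark: the Python standard library has neither BLAKE3 nor XXH3, so `blake2b` is used purely as a placeholder candidate, and the helper names `best_hash_seconds` and `speedup` are hypothetical. Real numbers would have to come from timing the linkers themselves.

```python
import hashlib
import time


def best_hash_seconds(algorithm: str, data: bytes, repeats: int = 5) -> float:
    """Return the best wall-clock time (seconds) to hash `data` once."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(algorithm, data).digest()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # 16 MiB of zeros as a stand-in for a linked binary (assumption).
    payload = bytes(16 * 1024 * 1024)
    sha1_s = best_hash_seconds("sha1", payload)
    # blake2b is only a stdlib placeholder, not BLAKE3 or XXH3.
    cand_s = best_hash_seconds("blake2b", payload)
    print(f"sha1: {sha1_s:.4f}s  blake2b: {cand_s:.4f}s  "
          f"speedup: {speedup(sha1_s, cand_s):.2f}x")
```

Taking the best of several repeats reduces noise from caching and scheduling, but a raw hash-throughput number like this says nothing about how much of total link time the build-id step accounts for.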

@ermo ermo added Type: Feature Something can be enhanced. Performance Related to measuring or improving performance per benchmark numbers labels Dec 17, 2024
@ermo ermo added this to the Solus 4.8 Epoch milestone Dec 17, 2024
@ermo ermo added this to Solus Dec 17, 2024
@github-project-automation github-project-automation bot moved this to Triage in Solus Dec 17, 2024
@ReillyBrogan
Contributor

Leaving my notes here for the future.

State of XXH3 support:

  • XXH3 should be in binutils 2.44: bminor/binutils-gdb@2299dfd
  • Mold doesn't support XXH3 at all (nor any other XXH variant)
  • LLVM has the code to support XXH3_128, but it isn't used anywhere in the project outside of tests. The build-id code will need to be modified to work with XXH3_128 (which may be trivial; the LLVM code confuses me)

Other Notes:

  • Mold and LLD currently break the input into 1MB chunks, hash each chunk in parallel, and then hash the resulting chunk hashes to produce the final build ID. They are thus unlikely to benefit much from XXH3, given that their build-id calculation is already very fast and they're probably bottlenecked by I/O anyway (needs testing).
  • Binutils (ld specifically), on the other hand, uses only a single thread as far as I can see, so if there's any improvement to be had it will likely show up here. However, I suspect it is also likely to be I/O-bound.
  • I can't imagine that there won't be some benefit to XXH3 with these three, but if the benefit is less than about 2% of linking time, my preference would be to stick with BLAKE3 so that the build-id lengths are consistent.
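
For reference, the chunk-then-combine scheme described in the first note can be sketched roughly as follows. This is an illustration only, not Mold's or LLD's actual code: `blake2b` with a 20-byte digest is a standard-library stand-in for the real hash, and the names `chunked_build_id` and `_hash_chunk` are hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024 * 1024  # 1 MiB, the chunk size mentioned in the note above


def _hash_chunk(chunk) -> bytes:
    # blake2b is a stand-in (assumption); digest_size=20 mirrors a 160-bit build ID.
    return hashlib.blake2b(chunk, digest_size=20).digest()


def chunked_build_id(data: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Hash fixed-size chunks in parallel, then hash the chunk digests."""
    view = memoryview(data)  # slice without copying the input
    chunks = [view[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        chunks = [view]  # empty input still yields one (empty) chunk
    with ThreadPoolExecutor() as pool:
        # pool.map preserves chunk order, so the result is deterministic.
        digests = list(pool.map(_hash_chunk, chunks))
    # Final pass: hash the concatenated per-chunk digests.
    return hashlib.blake2b(b"".join(digests), digest_size=20).digest()


if __name__ == "__main__":
    print(chunked_build_id(bytes(3 * CHUNK_SIZE + 123)).hex())
```

Because the final pass only touches 20 bytes per 1 MiB chunk, almost all of the work is in the parallel stage, which is why a faster per-chunk hash may matter less for these linkers than for a single-threaded one.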
