-
Notifications
You must be signed in to change notification settings - Fork 288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework and parallelize abi hashing #1225
base: main
Are you sure you want to change the base?
Conversation
Do you have numbers how much time was spend there? :) |
It's important to note that the time vcpkg does something is very small. Calculating abis takes around 75 ms or 10% of the total time vcpkg runs (excluding sub processes) when installing one port. I'm not sure if it's worth parallelizing all of it on a per-port level or parallelizing certain tasks like parsing cmake and finding script helpers. The goal is to find out what's most efficient and the code is split up to allow easy combination of the required steps. |
Isn't the main bottleneck just I/O? Also the parallelization is not allow to break existing abi hashes. |
Yes! That's why I pointed at script hashes which is something I'll definitely consider parallelizing. I'd also take a look at saving the abi file (and maybe also that weird sbom file) on disk. My profiling also suggests that parsing the cmake contents of a port for the use of helper functions takes some time - we loop through the cmake contents >100 times!
I'm not sure what you mean by that. ABI can't be broken by calculating hashes of one port in different order because the vector is sorted in the end. I'm also aware that we can't parallelize everything without iterating multiple times over install plan actions. |
The top 3 hottest paths are If we could get the compiler info for host and target triplet in parallel we could be even faster, but I don't think this is possible? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@BillyONeal I have a few questions about what is set in AbiInfo
in certain cases.
return; | ||
} | ||
|
||
abi_info.compiler_info = paths.get_compiler_info(*abi_info.pre_build_info, toolset); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we not need compiler info if either --head
or --editable
were used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't have a abi_info object you don't need the fill the compiler_info entry in it.
#802 adds a new PortAbiCache to speed up port abi creation. But this is only helpful if you have multiple ABIs for the same port, which you normally only have if you have host-triplet != target-triplet. But #802 builds a port multiple times with different abis and needed this (resulted in a multiple minutes speedup when using |
I don't think it's worth optimizing for host-triplet != target-triplet because it applies only to few ports. However, I think I might end up parallelizing |
…cpkg-tool into parallel-abi-hashes
Changes in behavior
Parallelized
get_cmake_script_hashes()
get_private_abi()
Additionally, those 2 parallelized functions are executed asynchronously while
initialize_abi_info()
is running. This function initializesAbiInfo
which includes getting toolset, compiler, and triplet info.The former
populate_abi_tag()
is split up in order to separate concerns and allow parallelization.