Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Binary cache: async push_success #908
base: main
Are you sure you want to change the base?
Binary cache: async push_success #908
Changes from 2 commits
95f0438
9d999d8
163d9cd
2a54205
0912655
5d7288c
10189ac
2567607
ecdd000
8e7ae61
850d7c9
548be38
6dbbf06
74b86fd
5171d3e
d69ed8f
2df42d5
5f1786e
93303c3
8a26c8b
aa7e52f
d46a4d6
5e51718
a9ac558
4faf674
b9be8c6
78ca081
579bfa9
103968e
dd32416
b666f94
15bb503
d995bfd
24cd026
92fc76b
3527227
48305b3
27fa076
bcd459a
f958d36
7a24007
50114f9
ca5f2b1
eccd9ee
e7837e0
969e7fc
2d5586f
809d0b6
455e29b
03fdfea
f4bad8c
26bbbd5
814e434
290e586
3cc3378
978ceae
47b56ce
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we can do this here. This is on the critical path for ctrl-c handling and should only be used for extremely fast, emergency tear-down behavior (like restoring the console).
If there happens to be an exit anywhere in any BinaryCache implementation, this would deadlock. Importantly, this include any sort of assertion we might want to do, like checking pointers for null.
Unfortunately, the only path forward I see is to call this (or appropriately scope the
BinaryCache
itself) at the relevant callers. The consequence of possibly not uploading some set of binary caches in the case of some unhandled program error (such as permissions issue on a directory expected to be writable) is vastly preferable to deadlocks.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed the
BinaryCache::wait_for_async_complete()
implementation so it does not deadlock anymore.I also moved the call to
Checks::exit_with_code
which is not called when crtl+c is handled. (I personally would like to have a way to terminate vcpkg but wait until the binary cache is done so that I don't lose progress.)And I prefer it when build packages are uploaded to the binary caches before vcpkg exits because of an error, otherwise I have to build the already build packages again at a later point when there is no cache entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that it is, desirable to finish uploading on "understood" errors. For example, if a package failed to build or failed to be installed.
I was also wrong about my original assessment of a deadlock. My concern was the call path of the binary upload thread calling
Checks::unreachable()
or.value_or_exit()
, but it seems thatstd::thread::join()
does have a carve-out to handle this specific case: it will throw aresource_deadlock_would_occur
if you try to join yourself.I've put some other concerns below, but I don't want those to distract from my main point: We must make it as trivial / correct-by-construction as possible to guarantee that the binary cache thread NEVER attempts to wait on itself. I think the best approach for vcpkg right now is to add calls from
Install::perform()
etc toBinaryCache::wait_for_async_complete()
before any "user-facing" error, such as the exit guarded byresult.code != BuildResult::SUCCEEDED && keep_going == KeepGoing::NO
. This is motivated by the perspective that it's always safer to terminate than to join and possibly deadlock / race condition / etc.There's still a UB data race if the main thread and binary upload thread attempt to exit at the same time:
There's also a serious "scalability" problem if we ever want a second background thread for whatever reason, because BGThread A would join on BGThread B, while BGThread B tries to join on BGThread A. This might be solvable with ever more complex structures, such as a thread ownership DAG that gets threads to join only on their direct children, but I don't think the benefit is worth the cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The UB and the joining itself could simply be prevented by doing a
if (std::this_thread::get_id() == instance->push_thread.get_id())
. My concern with the explicit approach is that it is easy to forget to call the waiting function of the BinaryCache and every time you want to exit you have to remember to call it. This seems to me to be very prone to human error.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have now implemented your request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ras0219-msft Is there anything left that is preventing this PR from being merged?