Skip to content

Hold the mutate exclusive lock when vendoring #12509

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 20, 2023

Conversation

0xPoe
Copy link
Member

@0xPoe 0xPoe commented Aug 16, 2023

What does this PR try to resolve?

close #12254

Held the mutate exclusive lock when vendoring. This would block all other processes which want to read or change the unpacked files.

See: #12509 (comment)

r? @weihanglo

@rustbot rustbot added Command-vendor S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 16, 2023
@0xPoe 0xPoe marked this pull request as draft August 16, 2023 02:45
@ehuss
Copy link
Contributor

ehuss commented Aug 16, 2023

I assume this is related to #12254?

@weihanglo
Copy link
Member

Thanks and sorry I didn't speak it more clearly. I would expect to see acquiring a lock there first. If somebody can working on digging into the issue deeper, we can perhaps remove this piece of code when we feel safe.

Thinking a bit more on the integrity of cargo-vendor, re-unpacking source looks like a safer solution before we can find a way to resolve #9455. We never know if some other tools changes unpacked registry sources, which is already broken when you run cargo build today, unfortunately 😢.

@ehuss
Copy link
Contributor

ehuss commented Aug 16, 2023

I would expect to see acquiring a lock there first.

Just FYI, I don't think the existing locking would help here, since the problem is with the build process which does not hold a lock. I am currently reworking locking for the garbage collection support, which adds read locks to the build process. It might be usable for this situation.

I'm also curious, the other alternative from the original issue was to read from the .crate files instead of the src cache. I'm wondering how difficult that would be to support?

@0xPoe
Copy link
Member Author

0xPoe commented Aug 17, 2023

I'm also curious, the other alternative from the original issue was to read from the .crate files instead of the src cache. I'm wondering how difficult that would be to support?

Do you mean we need to unpack it again from .cargo/registry/cache/xxx/some-pkg.crate?

@ehuss
Copy link
Contributor

ehuss commented Aug 17, 2023

Do you mean we need to unpack it again from .cargo/registry/cache/xxx/some-pkg.crate?

Yea, to have cargo vendor directly read from the .crate and extract it into the vendor folder instead of reading from the src cache. I'm not sure what the API for that would look like, I'm just curious what kind of options there are or how difficult that would be to do.

@0xPoe
Copy link
Member Author

0xPoe commented Aug 18, 2023

I'm not sure what the API for that would look like, I'm just curious what kind of options there are or how difficult that would be to do.

I think it can be done. The tricky part is that we have to calculate the checksum for each file we want to copy. So we may need another unpack_package function with filters and checksum calculation.

But there is an easy way to do it, we can unpack it to a temp folder first and then execute the original logic. But I think this has a big impact on performance.

@0xPoe
Copy link
Member Author

0xPoe commented Sep 17, 2023

I think it can be done. The tricky part is that we have to calculate the checksum for each file we want to copy. So we may need another unpack_package function with filters and checksum calculation.

@ehuss Do you think this is a reasonable way to move forward? I am happy to give it a try.

@ehuss
Copy link
Contributor

ehuss commented Sep 23, 2023

I'm looking over the vendor code, and I think it looks like it would be quite challenging to directly extract from the .crate files. There would need to be two variants of cp_sources, one that reads from a source directory (like git dependencies) and one from .crate files. It also seems complex, since there are multiple workspaces and multiple sets of Sources (a potentially different set from each workspace). It could possibly build a union of all the sources, and then ask the appropriate Source to handle extracting into the target directory. That seems like it could be a pretty large change to me, though, and quite hard to avoid duplicating some of the code.

#12706 is introducing the cache locking support. That might be a much easier option, since it would be a roughly one line change to acquire an exclusive lock. It's still unpleasant, since this remove_dir_all stuff has a fairly significant performance penalty, and it would be nice to remove it.

Unfortunately I don't think I have the time to dig into exactly how to architect a design that would support extracting .crate files directly. Perhaps @weihanglo can help if you want to continue down that path? It would be nice (due to the significant performance improvement), but it seems like it would require a lot of changes.

@weihanglo
Copy link
Member

r? weihanglo

Would find a time looking into it and discuss with hi-rustin :)

@0xPoe 0xPoe force-pushed the rustin-patch-revert branch from 8618a82 to 2e77dc1 Compare December 18, 2023 01:47
@0xPoe 0xPoe marked this pull request as ready for review December 18, 2023 01:52
@0xPoe 0xPoe changed the title Revert alexcrichton/cargo-vendor#131 Hold the mutate exclusive lock when vendoring Dec 18, 2023
@rustbot
Copy link
Collaborator

rustbot commented Dec 18, 2023

Could not assign reviewer from: weihanglo.
User(s) weihanglo are either the PR author, already assigned, or on vacation, and there are no other candidates.
Use r? to specify someone else to assign.

@0xPoe
Copy link
Member Author

0xPoe commented Dec 18, 2023

#12706 is introducing the cache locking support. That might be a much easier option, since it would be a roughly one line change to acquire an exclusive lock. It's still unpleasant, since this remove_dir_all stuff has a fairly significant performance penalty, and it would be nice to remove it.

I've added the exclusive lock now. But it is still very difficult to remove remove_dir_all stuff because we don't know the reason of #5956. I don't think it's safe to remove this code unless we force users not to change the unpacked file in the src directory.

@weihanglo @ehuss Should we move forward with this PR? Or do you have any other thoughts?

@ehuss
Copy link
Contributor

ehuss commented Dec 18, 2023

I'm fine with moving forward with this. I agree that it seems like removing remove_dir_all will be quite challenging.

Copy link
Member

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works as expected. Thank you.

Still not pleasant, as now cargo hold the mutate exclusive lock for the entire duration of cargo vendor.

I wonder if there is a way to “downgrade” from MutateExclusive to DownloadExclusive/Shared so that cargo doesn't need to hold the lock that long.

@weihanglo
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Dec 20, 2023

📌 Commit 2e77dc1 has been approved by weihanglo

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 20, 2023
@bors
Copy link
Contributor

bors commented Dec 20, 2023

⌛ Testing commit 2e77dc1 with merge 9a77459...

@bors
Copy link
Contributor

bors commented Dec 20, 2023

☀️ Test successful - checks-actions
Approved by: weihanglo
Pushing 9a77459 to master...

@bors bors merged commit 9a77459 into rust-lang:master Dec 20, 2023
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 22, 2023
Update cargo

10 commits in 1a2666ddd14cf0a255d4ddb61c63531c259a7b39..363a2d11320faf531f6aacd1ea067c6bc08343b9
2023-12-17 17:53:53 +0000 to 2023-12-22 03:12:42 +0000
- refactor: centralize git checkouts and db paths (rust-lang/cargo#13187)
- Bump to 0.78.0; update changelog (rust-lang/cargo#13192)
- refactor: custom error types for `cargo-util-schemas` (rust-lang/cargo#13186)
- chore(deps): update rust crate handlebars to `v4.5.0` (rust-lang/cargo#13168)
- Hold the mutate exclusive lock when vendoring (rust-lang/cargo#12509)
- refactor: clean up package metadata (rust-lang/cargo#13184)
- ci: check SemVer for cargo-util-schemas on CI (rust-lang/cargo#13185)
- refactor(schemas): Pull out as `cargo-util-schemas` (rust-lang/cargo#13178)
- chore(rustfix): rename Readme.md to README.md (rust-lang/cargo#13181)
- chore(rustfix): remove useless clippy rules and fix a typo (rust-lang/cargo#13182)

r? ghost
@rustbot rustbot added this to the 1.77.0 milestone Dec 23, 2023
weihanglo added a commit to weihanglo/cargo that referenced this pull request Mar 4, 2025
cargo-vendor has a workaround that to mitigate rust-lang#5956,
it removes all cached sources in order to trigger a re-unpack.
It was meant for dealing with registry sources only, but accidentally
applied to directory source kind.

While directory source kind was invented for vendoring,
and vendoring from on vendored directory to the other seems unusual,
Cargo IMO should not delete any real sources.

It does not mean that registry sources are cached so is okay to delete,
In long term, we should explore a way that unpacks `.crate` files
directly, without any removal. See
rust-lang#12509 (comment)
weihanglo added a commit to weihanglo/cargo that referenced this pull request Mar 4, 2025
cargo-vendor has a workaround that to mitigate rust-lang#5956,
it removes all cached sources in order to trigger a re-unpack.
It was meant for dealing with registry sources only, but accidentally
applied to directory source kind.

While directory source kind was invented for vendoring,
and vendoring from on vendored directory to the other seems unusual,
Cargo IMO should not delete any real sources.

It does not mean that registry sources are cached so is okay to delete,
In long term, we should explore a way that unpacks `.crate` files
directly, without any removal. See
rust-lang#12509 (comment)
weihanglo added a commit to weihanglo/cargo that referenced this pull request Mar 4, 2025
cargo-vendor has a workaround that to mitigate rust-lang#5956,
it removes all cached sources in order to trigger a re-unpack.
It was meant for dealing with registry sources only, but accidentally
applied to directory source kind.

While directory source kind was invented for vendoring,
and vendoring from one vendored directory to the other seems unusual,
Cargo IMO should not delete any real sources.

It does not mean that registry sources are okay to delete,
In long term, we should explore a way that unpacks `.crate` files
directly, without any removal. See
rust-lang#12509 (comment)
github-merge-queue bot pushed a commit that referenced this pull request Mar 4, 2025
### What does this PR try to resolve?

Fixes #15244

With this fix,
`cargo vendor` will not delete original sources,
if you want to vendor things from one directory sources to the other

#### Background

cargo-vendor has a workaround that to mitigate #5956:
it removes all cached sources in order to trigger a re-unpack.
It was meant for dealing with registry sources only,
but accidentally applied to directory source kind.

While directory source kind was invented for vendoring,
and vendoring from one vendored directory to the other seems unusual,
Cargo IMO should not delete any real sources.

It does not mean that registry sources are okay to delete,
In long term, we should explore a way that unpacks `.crate` files
directly, without any removal. See
#12509 (comment)

### How should we test and review this PR?

The added test should suffice.

Also, although this is for fixing #15244,
`cargo vendor` still doesn't support vendor from and to the same
location.
Unless we figure out an `rsync`-like solutin to update vendor sources,
it is not going to support in short term.
(And I also doubt the real world use case of it)

### Additional information
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Command-vendor S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cargo vendor modifies the package cache without locking
5 participants