Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplify SourceID Hash #14800

Merged
merged 2 commits into from
Dec 23, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 10 additions & 9 deletions src/cargo/core/source_id.rs
Original file line number Diff line number Diff line change
Expand Up @@ -532,7 +532,10 @@ impl SourceId {
url == CRATES_IO_INDEX || url == CRATES_IO_HTTP_INDEX || is_overridden_crates_io_url(url)
}

/// Hashes `self`.
/// Hashes `self` to be used in the name of some Cargo folders, so shouldn't vary.
///
/// For git and url, `as_str` gives the serialisation of a url (which has a spec) and so
/// insulates against possible changes in how the url crate does hashing.
///
/// For paths, remove the workspace prefix so the same source will give the
/// same hash in different locations, helping reproducible builds.
Expand All @@ -550,7 +553,11 @@ impl SourceId {
return;
}
}
self.hash(into)
self.inner.kind.hash(into);
match self.inner.kind {
SourceKind::Git(_) => (&self).inner.canonical_url.hash(into),
_ => (&self).inner.url.as_str().hash(into),
}
}

pub fn full_eq(self, other: SourceId) -> bool {
Expand Down Expand Up @@ -665,16 +672,10 @@ impl fmt::Display for SourceId {
}
}

/// The hash of `SourceId` is used in the name of some Cargo folders, so shouldn't
/// vary. `as_str` gives the serialisation of a url (which has a spec) and so
/// insulates against possible changes in how the url crate does hashing.
impl Hash for SourceId {
fn hash<S: hash::Hasher>(&self, into: &mut S) {
self.inner.kind.hash(into);
match self.inner.kind {
SourceKind::Git(_) => self.inner.canonical_url.hash(into),
_ => self.inner.url.as_str().hash(into),
}
self.inner.canonical_url.hash(into);
}
Comment on lines -676 to 679
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most CanonicalUrl differences are related to git somehow. I'm assuming that means we can't hit them in any other way. There is the stripping of a trailing /. However, we probably should have been respecting that in the hash in the first place I assume. The question is if this will change anything significant enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps if you have a index hosted on git, then GITHUB.com/me/my-index.git/ could change a lot. I think the biggest concern here is having to recheck out dependencies. So even if this does affect people we might just eat that cost. On the other hand there's a lot of complexity here and maybe there are bigger problems I forgot about.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't yet looked closer. I wonder if this would affect hacks like this one: #5478 (comment).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that would be affected by this change. But it is something we have to watch for as I "clean up" this code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, since [patch.<url>] uses URLs directly, not SourceId.

This change may affect anywhere we let people define URL and parse it as SourceId, like the [registries] table? We could also deem this as a bug and fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would hope that we rely on Eq for those cases and not Hash. So that the problem you identified is not a problem with this PR, rather one to investigate on the next PR that changes Eq.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Have checked other usages of it, none of them seems user-facing like the path to index files. It should be fine to move over.

}

Expand Down
Loading