Strange behavior in CI and in interaction with cargo build #149

Open
TimJentzsch opened this issue Jan 20, 2023 · 10 comments
Labels
question Further information is requested

Comments

@TimJentzsch

I'm currently trying to reduce CI times and cargo udeps is now the slowest job by a factor of five.
This is mainly caused by the job recompiling all dependencies on every run.

For the other cargo commands (e.g. cargo build), this can be avoided by caching (part of) the ~/.cargo and the target folder.
However, this approach doesn't appear to be working for cargo udeps.
Strangely enough, if you put two cargo udeps steps after one another, the second one does not recompile -- so it seems to depend on something outside of ~/.cargo and target.

While trying to debug this issue, I also found that the compilation part of cargo udeps seems to be incompatible with cargo build:
When you run them after one another, they each do a full recompile.
Looking into this in more detail, we can use an env variable to get more information on why cargo build recompiles:

$ CARGO_LOG=cargo::core::compiler::fingerprint=info cargo build

This gives very strange messages such as:

[2023-01-20T20:24:20Z INFO  cargo::core::compiler::fingerprint] dependency on `cfg_if` is newer than we are 1674246198.529277589s > 1674245814.324250255s "/home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/crossbeam-utils-0.8.14"

You can see an example here.

I'd like to be able to cache the progress of cargo udeps somehow, i.e. know which files it needs access to in order to not recompile all the dependencies.
This would significantly decrease the CI times.

I'm also curious about what is causing the incompatibility with cargo build, as cargo seems to be used directly in the code:

cargo::ops::compile_with_exec(&ws, &compile_opts, &exec)?;

This doesn't need to be fixed though, as I won't do cargo build before cargo udeps in the CI workflow.

Any help would be much appreciated! I hit a dead end in my own research here.

est31 added the question label Jan 21, 2023
@est31
Owner

est31 commented Jan 21, 2023

@TimJentzsch cargo-udeps has to pass special flags to some of the crates in the build graph. It therefore deliberately makes its cache behaviour differ from cargo build; otherwise it would not work, because the output produced by those flags would be missing. As for ~/.cargo and target, those are the only two directories that should be relevant for cargo-udeps. However, if you share them with cargo build jobs via a cloud cache, that will lead to rebuilds.

Have you tried using a separate cache for cargo build and cargo udeps? That is, either represent ~/.cargo and target via different caches, or override the location of ~/.cargo and target via the CARGO_HOME and CARGO_TARGET_DIR env vars.
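A minimal sketch of the second option, assuming a POSIX shell in the CI step. The function name and the directory names `.cargo-udeps` and `target-udeps` are hypothetical; any paths that no other cargo command in the workflow touches would do.

```shell
#!/bin/sh
# Sketch: point cargo at directories used only by the udeps job.
# The directory names below are hypothetical; pick any paths that
# no other cargo command in the workflow reads or writes.
use_udeps_dirs() {
    export CARGO_HOME="${HOME}/.cargo-udeps"
    export CARGO_TARGET_DIR="${PWD}/target-udeps"
}
```

Calling `use_udeps_dirs` before `cargo +nightly udeps` in the same step, and caching both directories under a key used only by that job, keeps the udeps artifacts apart from those of cargo build. This assumes the cargo-udeps binary is still reachable on PATH after CARGO_HOME is moved.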

@TimJentzsch
Author

@est31 Thanks for the quick reply!
The cache is already unique for the workflow job, so it should not be shared with other cargo commands.
The only thing that I can think of is that I have to run cargo install to get udeps installed; I'm not sure if that could interfere.

Which special flags do you set for the compilation process? Maybe they can be set for the install as well (if that's indeed the cause of the problem) to prevent the recompile.

@est31
Owner

est31 commented Jan 21, 2023

I don't really know what's going on here. It might factor in the age of the cargo udeps binary, and cargo install might touch that binary, but I doubt that. There can be quite a few reasons why cargo decides that something needs to be recompiled.

@TimJentzsch
Author

I tried to set CARGO_HOME and CARGO_TARGET_DIR differently for cargo udeps than for cargo install and then caching all of that, but for some reason this also doesn't work (see this workflow run).
Very strange: when nothing else touches those directories, I would expect it not to need a recompile.

@est31
Owner

est31 commented Jan 21, 2023

It seems to download stuff from scratch in the rerun. That means the cache isn't working at all?

@stevenh

stevenh commented May 17, 2023

Banging my head against this too.

Caching ~/.cargo and workspace/target works fine for cargo build and cargo test, but cargo udeps always rebuilds some packages.

I've tried testing locally, intermixing cargo build and cargo udeps, and after it has run once it doesn't rebuild anything. The rebuild only happens in CI.

Is there something else which might be causing it to rebuild some packages?

@stevenh

stevenh commented May 17, 2023

I think I may have made a breakthrough on this. It seems cargo thinks the build is out of date because rustup installs a toolchain with current timestamps.

Dirty libc v0.2.144: the file /home/runner/.rustup/toolchains/nightly-2023-05-16-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-edb03adabf0b22c8.so has changed (1684349163.672937234s, 1h 33m after last build at 1684343583.107933784s)

@stevenh

stevenh commented May 17, 2023

OK, confirmed. The toolchain seems to use file mtime to check whether the build is up to date. Because rustup always creates files with the current timestamp, each time the CI installs the toolchain the toolchain libraries look newer than the cached build, which triggers a rebuild.

Seems like a fundamental toolchain issue, which rust-lang/cargo#6529 might address.
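One way to check for this condition in a CI run is to list the files that are newer than something from the restored cache. This is only a sketch: the function name is hypothetical, and which reference file to compare against is an assumption you would need to adapt to your own cache layout.

```shell
#!/bin/sh
# Sketch: list files under a directory whose mtime is newer than a
# reference file, e.g. toolchain libraries vs. a cached build artifact.
# Usage: newer_than <reference-file> <directory>
newer_than() {
    find "$2" -type f -newer "$1"
}
```

Run with a file from the restored target directory as the reference and the toolchain directory under ~/.rustup/toolchains as the directory; any output means the toolchain files carry later timestamps than the cached build.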

@stevenh

stevenh commented May 17, 2023

The workaround in our flow was to add the following to a step. It uses the date of the nightly we're using as the date for all the toolchain files, ensuring they are in the past.

find ~/.rustup/toolchains/nightly-2023-05-16-x86_64-unknown-linux-gnu -print0 | xargs -0 touch -d '2023-05-16'

You could get the date from rustc itself with something like:

rustup run nightly rustc --version --verbose | grep ^commit-date |awk '{print $2}'
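The two commands above could be combined along these lines. This is a sketch rather than a tested CI recipe: the function names are hypothetical, `touch -d` assumes GNU coreutils, and using `rustc --print sysroot` to locate the toolchain directory is an assumption in place of the hard-coded path.

```shell
#!/bin/sh
# Sketch: backdate a toolchain to the commit date its rustc reports.
# Function names are hypothetical. `touch -d` needs GNU coreutils.

# Read `rustc --version --verbose` output on stdin, print the commit date.
commit_date() {
    awk '/^commit-date:/ {print $2}'
}

# Set the mtime of everything under directory $1 to date $2.
backdate_tree() {
    find "$1" -print0 | xargs -0 touch -d "$2"
}

# Example use in a CI step (assumes the sysroot is the toolchain directory):
#   date=$(rustup run nightly rustc --version --verbose | commit_date)
#   backdate_tree "$(rustup run nightly rustc --print sysroot)" "$date"
```

Deriving the date from rustc avoids having to edit the step whenever the pinned nightly changes.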

@est31
Owner

est31 commented May 18, 2023

@stevenh wow that's a pretty cool trick! It's a workaround, yes, but very useful.

3 participants