
Commit 513d230

Auto merge of #6477 - Eh2406:add-a-timestamp-file, r=ehuss
touch some files when we use them

This is a small change to improve the ability of a third-party subcommand to clean up a target folder. I consider this part of the push to experiment with out-of-tree GC, as discussed in #6229.

how it works?
--------

This updates the modification time of a file in each fingerprint folder and the modification time of the intermediate outputs every time cargo checks that they are up to date. This allows a third-party subcommand to look at the modification time of the timestamp file to determine the last time a cargo invocation required that file. This is far more reliable than the current practice of looking at the `accessed` time. `accessed` time is not available or is disabled on many operating systems, and is routinely set by arbitrary other programs.

is this enough to be useful?
--------

The current implementation of cargo sweep on master will automatically use this data with no change to the code. With this PR, it will work even on systems that do not update `accessed` time. This also allows a crude script to clean some of the largest subfolders based on each file's modification time.

is this worth adding, or should we just build `clean --outdated` into cargo?
------

I would love to see a `clean --outdated` in cargo! However, I think there is a lot of design work before we can make something good enough to deserve the cargo team's stamp of approval, especially as an in-tree version will have to work with many use cases, some of which are yet to be designed (like distributed builds). Even just including `cargo-sweep`'s existing functionality opens a full bike shop about what arguments to take, and in what form (`cargo-sweep` takes a days argument, but maybe we should have a minutes or an ISO standard time or ...). This PR, or equivalent, allows out-of-tree experimentation with all different interfaces, and is basically required for any `LRU` based system. (For example [Crater](rust-lang/crater#346) wants a GC that cleans files in an `LRU` manner to maintain a target folder below a target size. This is not a use case that is widely enough needed to be worth adding to cargo but one supported by this PR.)

what are the downsides?
----

1. There are legitimate performance concerns about writing so many small files during a NOP build.
2. There are legitimate concerns about unnecessary writes on read-only filesystems.
3. If we add this, and it starts seeing widespread use, we may be de facto stabilizing the folder structure we use. (This is probably true of any system that allows out-of-tree experimentation.)
4. This may not be an efficient way to store the data. (It does have the advantage of not needing different cargos to manipulate the same file. But if you have a better idea please make a suggestion.)
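To make the `LRU` use case concrete, here is a minimal sketch of a size-bounded cleaner that an out-of-tree tool might build on top of these mtimes. It is not part of this commit and not how `cargo-sweep` or Crater work; the function name `trim_to_size` and the idea of treating one flat folder (such as `deps`) at a time are assumptions made for illustration. It uses only the standard library.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Hypothetical helper: deletes the least recently used files in `dir`
/// (oldest mtime first) until the remaining files total at most `max_bytes`.
/// Returns the paths that were removed, in removal order.
fn trim_to_size(dir: &Path, max_bytes: u64) -> io::Result<Vec<PathBuf>> {
    // Collect (mtime, size, path) for every regular file directly in `dir`.
    let mut files: Vec<(SystemTime, u64, PathBuf)> = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_file() {
            files.push((meta.modified()?, meta.len(), entry.path()));
        }
    }
    // Oldest mtime first: with this change, mtime approximates
    // "last time a cargo invocation needed this file".
    files.sort_by_key(|(mtime, _, _)| *mtime);

    let mut total: u64 = files.iter().map(|(_, size, _)| *size).sum();
    let mut removed = Vec::new();
    for (_, size, path) in files {
        if total <= max_bytes {
            break;
        }
        fs::remove_file(&path)?;
        total -= size;
        removed.push(path);
    }
    Ok(removed)
}
```

A caller might run something like `trim_to_size(Path::new("target/debug/deps"), 5_000_000_000)` after a batch of builds. A real tool would also need to remove the matching fingerprint folders, which this sketch ignores.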
2 parents 9b5d4b7 + 97363ca commit 513d230

3 files changed, +191 -2 lines changed


src/cargo/core/compiler/fingerprint.rs

+16 -2
@@ -3,6 +3,7 @@ use std::fs;
 use std::hash::{self, Hasher};
 use std::path::{Path, PathBuf};
 use std::sync::{Arc, Mutex};
+use std::time::SystemTime;

 use filetime::FileTime;
 use log::{debug, info};
@@ -88,6 +89,7 @@ pub fn prepare_target<'a, 'cfg>(

     let root = cx.files().out_dir(unit);
     let missing_outputs = {
+        let t = FileTime::from_system_time(SystemTime::now());
         if unit.mode.is_doc() {
             !root
                 .join(unit.target.crate_name())
@@ -98,8 +100,15 @@ pub fn prepare_target<'a, 'cfg>(
                 .outputs(unit)?
                 .iter()
                 .filter(|output| output.flavor != FileFlavor::DebugInfo)
-                .find(|output| !output.path.exists())
-            {
+                .find(|output| {
+                    if output.path.exists() {
+                        // update the mtime so other cleaners know we used it
+                        let _ = filetime::set_file_times(&output.path, t, t);
+                        false
+                    } else {
+                        true
+                    }
+                }) {
                 None => false,
                 Some(output) => {
                     info!("missing output path {:?}", output.path);
@@ -681,6 +690,11 @@ pub fn dep_info_loc<'a, 'cfg>(cx: &mut Context<'a, 'cfg>, unit: &Unit<'a>) -> Pa

 fn compare_old_fingerprint(loc: &Path, new_fingerprint: &Fingerprint) -> CargoResult<()> {
     let old_fingerprint_short = paths::read(loc)?;
+
+    // update the mtime so other cleaners know we used it
+    let t = FileTime::from_system_time(SystemTime::now());
+    filetime::set_file_times(loc, t, t)?;
+
     let new_hash = new_fingerprint.hash();

     if util::to_hex(new_hash) == old_fingerprint_short {
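The "touch on use" pattern in the hunks above can be tried in isolation. The following is a rough sketch, not the code from this commit: it replaces the `filetime` crate with `File::set_modified` from the standard library (available since Rust 1.75), and the names `touch` and `first_missing_output` are made up for illustration.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Sets the mtime of `path` to `now` without changing its contents.
/// Stand-in for `filetime::set_file_times`; only the mtime is updated here.
fn touch(path: &Path, now: SystemTime) -> io::Result<()> {
    let file = OpenOptions::new().write(true).open(path)?;
    file.set_modified(now)
}

/// Returns the first path that does not exist. Every existing path visited
/// before that point is touched, so an mtime-based cleaner sees it as used.
fn first_missing_output<'a>(outputs: &[&'a Path]) -> Option<&'a Path> {
    let now = SystemTime::now();
    outputs.iter().copied().find(|path| {
        if path.exists() {
            // Best effort, mirroring `let _ = ...` in the diff: a failed
            // touch (for example on a read-only filesystem) is ignored.
            let _ = touch(path, now);
            false
        } else {
            true
        }
    })
}
```

Note the asymmetry that the diff also has: the output files are touched on a best-effort basis, while in `compare_old_fingerprint` a failure to update the fingerprint file's mtime is propagated with `?`.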

tests/testsuite/freshness.rs

+172
@@ -1,7 +1,9 @@
 use std::fs::{self, File, OpenOptions};
 use std::io::prelude::*;
 use std::net::TcpListener;
+use std::path::PathBuf;
 use std::thread;
+use std::time::SystemTime;

 use crate::support::paths::CargoPathExt;
 use crate::support::registry::Package;
@@ -1178,6 +1180,176 @@ fn changing_rustflags_is_cached() {
         .run();
 }

+fn simple_deps_cleaner(mut dir: PathBuf, timestamp: filetime::FileTime) {
+    // Cargo is experimenting with letting outside projects develop some
+    // limited forms of GC for target_dir. This is one of the forms.
+    // Specifically, Cargo is updating the mtime of files in
+    // target/profile/deps each time it uses the file.
+    // So a cleaner can remove files older than a time stamp without
+    // affecting any builds that happened since that time stamp.
+    let mut cleand = false;
+    dir.push("deps");
+    for dep in fs::read_dir(&dir).unwrap() {
+        let dep = dep.unwrap();
+        if filetime::FileTime::from_last_modification_time(&dep.metadata().unwrap()) <= timestamp {
+            fs::remove_file(dep.path()).unwrap();
+            println!("remove: {:?}", dep.path());
+            cleand = true;
+        }
+    }
+    assert!(
+        cleand,
+        "called simple_deps_cleaner, but there was nothing to remove"
+    );
+}
+
+#[test]
+fn simple_deps_cleaner_dose_not_rebuild() {
+    let p = project()
+        .file(
+            "Cargo.toml",
+            r#"
+            [package]
+            name = "foo"
+            version = "0.0.1"
+
+            [dependencies]
+            bar = { path = "bar" }
+        "#,
+        )
+        .file("src/lib.rs", "")
+        .file("bar/Cargo.toml", &basic_manifest("bar", "0.0.1"))
+        .file("bar/src/lib.rs", "")
+        .build();
+
+    p.cargo("build").run();
+    p.cargo("build")
+        .env("RUSTFLAGS", "-C target-cpu=native")
+        .with_stderr(
+            "\
+[COMPILING] bar v0.0.1 ([..])
+[COMPILING] foo v0.0.1 ([..])
+[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]",
+        )
+        .run();
+    if is_coarse_mtime() {
+        sleep_ms(1000);
+    }
+    let timestamp = filetime::FileTime::from_system_time(SystemTime::now());
+    if is_coarse_mtime() {
+        sleep_ms(1000);
+    }
+    // This does not make new files, but it does update the mtime.
+    p.cargo("build")
+        .env("RUSTFLAGS", "-C target-cpu=native")
+        .with_stderr("[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]")
+        .run();
+    simple_deps_cleaner(p.target_debug_dir(), timestamp);
+    // This should not recompile!
+    p.cargo("build")
+        .env("RUSTFLAGS", "-C target-cpu=native")
+        .with_stderr("[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]")
+        .run();
+    // But this should have been cleaned and so needs a rebuild
+    p.cargo("build")
+        .with_stderr(
+            "\
+[COMPILING] bar v0.0.1 ([..])
+[COMPILING] foo v0.0.1 ([..])
+[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]",
+        )
+        .run();
+}
+
+fn fingerprint_cleaner(mut dir: PathBuf, timestamp: filetime::FileTime) {
+    // Cargo is experimenting with letting outside projects develop some
+    // limited forms of GC for target_dir. This is one of the forms.
+    // Specifically, Cargo is updating the mtime of a file in
+    // target/profile/.fingerprint each time it uses the fingerprint.
+    // So a cleaner can remove files associated with a fingerprint
+    // if all the files in the fingerprint's folder are older than a time stamp without
+    // affecting any builds that happened since that time stamp.
+    let mut cleand = false;
+    dir.push(".fingerprint");
+    for fing in fs::read_dir(&dir).unwrap() {
+        let fing = fing.unwrap();
+
+        if fs::read_dir(fing.path()).unwrap().all(|f| {
+            filetime::FileTime::from_last_modification_time(&f.unwrap().metadata().unwrap())
+                <= timestamp
+        }) {
+            fs::remove_dir_all(fing.path()).unwrap();
+            println!("remove: {:?}", fing.path());
+            // a real cleaner would remove the big files in deps and build as well
+            // but fingerprint is sufficient for our tests
+            cleand = true;
+        } else {
+        }
+    }
+    assert!(
+        cleand,
+        "called fingerprint_cleaner, but there was nothing to remove"
+    );
+}
+
+#[test]
+fn fingerprint_cleaner_dose_not_rebuild() {
+    let p = project()
+        .file(
+            "Cargo.toml",
+            r#"
+            [package]
+            name = "foo"
+            version = "0.0.1"
+
+            [dependencies]
+            bar = { path = "bar" }
+        "#,
+        )
+        .file("src/lib.rs", "")
+        .file("bar/Cargo.toml", &basic_manifest("bar", "0.0.1"))
+        .file("bar/src/lib.rs", "")
+        .build();
+
+    p.cargo("build").run();
+    p.cargo("build")
+        .env("RUSTFLAGS", "-C target-cpu=native")
+        .with_stderr(
+            "\
+[COMPILING] bar v0.0.1 ([..])
+[COMPILING] foo v0.0.1 ([..])
+[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]",
+        )
+        .run();
+    if is_coarse_mtime() {
+        sleep_ms(1000);
+    }
+    let timestamp = filetime::FileTime::from_system_time(SystemTime::now());
+    if is_coarse_mtime() {
+        sleep_ms(1000);
+    }
+    // This does not make new files, but it does update the mtime.
+    p.cargo("build")
+        .env("RUSTFLAGS", "-C target-cpu=native")
+        .with_stderr("[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]")
+        .run();
+    fingerprint_cleaner(p.target_debug_dir(), timestamp);
+    // This should not recompile!
+    p.cargo("build")
+        .env("RUSTFLAGS", "-C target-cpu=native")
+        .with_stderr("[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]")
+        .run();
+    // But this should have been cleaned and so needs a rebuild
+    p.cargo("build")
+        .with_stderr(
+            "\
+[COMPILING] bar v0.0.1 ([..])
+[COMPILING] foo v0.0.1 ([..])
+[FINISHED] dev [unoptimized + debuginfo] target(s) in [..]",
+        )
+        .run();
+}
+
 #[test]
 fn reuse_panic_build_dep_test() {
     let p = project()
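The `fingerprint_cleaner` test helper above panics on any I/O error and depends on the `filetime` crate. As a sketch of how the same rule (remove a fingerprint folder only if every file in it is at or before the cutoff) might look in a standalone tool, here is a standard-library-only variant that propagates errors instead. The names `all_older_than` and `clean_fingerprints` are hypothetical, and, like the test helper, it leaves the corresponding files in `deps` and `build` alone.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// True if every entry directly inside `dir` has an mtime at or before `cutoff`.
/// An empty folder counts as old, matching `Iterator::all` in the test helper.
fn all_older_than(dir: &Path, cutoff: SystemTime) -> io::Result<bool> {
    for entry in fs::read_dir(dir)? {
        if entry?.metadata()?.modified()? > cutoff {
            return Ok(false);
        }
    }
    Ok(true)
}

/// Removes each subfolder of `fingerprint_dir` whose files were all last
/// touched at or before `cutoff`, and returns the removed folders.
fn clean_fingerprints(fingerprint_dir: &Path, cutoff: SystemTime) -> io::Result<Vec<PathBuf>> {
    let mut removed = Vec::new();
    for entry in fs::read_dir(fingerprint_dir)? {
        let path = entry?.path();
        if path.is_dir() && all_older_than(&path, cutoff)? {
            fs::remove_dir_all(&path)?;
            removed.push(path);
        }
    }
    Ok(removed)
}
```

Requiring all files in a folder to be old is the conservative choice: a single recently touched file is enough to keep the whole fingerprint folder.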

tests/testsuite/support/mod.rs

+3
@@ -1603,6 +1603,9 @@ pub fn sleep_ms(ms: u64) {

 /// Returns true if the local filesystem has low-resolution mtimes.
 pub fn is_coarse_mtime() -> bool {
+    // If the filetime crate is being used to emulate HFS then
+    // return true, without looking at the actual hardware.
+    cfg!(emulate_second_only_system) ||
     // This should actually be a test that $CARGO_TARGET_DIR is on an HFS
     // filesystem, (or any filesystem with low-resolution mtimes). However,
     // that's tricky to detect, so for now just deal with CI.
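One plausible reason the new tests sleep for a second around the cutoff when `is_coarse_mtime()` is true can be shown with plain arithmetic on timestamps. The sketch below is an illustration under the assumption that a coarse filesystem rounds mtimes down to whole seconds; `truncate_to_secs` and `looks_stale` are hypothetical names, and nothing here touches the filesystem.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Rounds `t` down to a whole second, mimicking a filesystem that stores
/// mtimes with one-second resolution (an assumption for this sketch).
fn truncate_to_secs(t: SystemTime) -> SystemTime {
    let since_epoch = t.duration_since(UNIX_EPOCH).expect("time before 1970");
    UNIX_EPOCH + Duration::from_secs(since_epoch.as_secs())
}

/// The comparison the test cleaners use: a file is considered stale
/// if its stored mtime is at or before the cutoff.
fn looks_stale(stored_mtime: SystemTime, cutoff: SystemTime) -> bool {
    stored_mtime <= cutoff
}
```

With a cutoff at 0.2 s into some second and a file touched at 0.7 s into the same second, the truncated mtime lands before the cutoff, so the file looks stale even though it was used afterwards. Touching it at least one second after the cutoff avoids this.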
