Reproducible test data #225

HexDecimal · 2024-08-29T21:45:47Z

It'd be nice to make the test data reproducible, so that the CI workflow rebuilds these files and verifies that the results exactly match what's already been committed to the repo.
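As a rough sketch of what that verification step could look like, the snippet below hashes every file in a committed directory and a freshly rebuilt directory and reports anything that differs. The function names and the two-directory layout are assumptions made for illustration, not something taken from this repo's CI.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_mismatches(committed_dir: Path, rebuilt_dir: Path) -> list[str]:
    """Return relative paths that differ or exist on only one side.

    Hypothetical helper: the directory arguments stand in for wherever
    the committed and rebuilt test data actually live.
    """

    def digests(root: Path) -> dict[str, str]:
        # Map each file's path (relative to root) to its digest.
        return {
            p.relative_to(root).as_posix(): sha256_of(p)
            for p in root.rglob("*")
            if p.is_file()
        }

    committed = digests(committed_dir)
    rebuilt = digests(rebuilt_dir)
    # A file missing on one side yields None from .get(), so it is
    # reported as a mismatch as well.
    return sorted(
        name
        for name in committed.keys() | rebuilt.keys()
        if committed.get(name) != rebuilt.get(name)
    )
```

A CI job could fail whenever `find_mismatches` returns a non-empty list. Because files present on only one side are reported too, this would also flag committed files that the rebuild never regenerates.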

I made some progress on this in #218, but I didn't commit any test files since it was incomplete. I've been able to reproduce everything currently being built except for the .pyd files in the test wheels. I couldn't figure out what's causing their nondeterminism.
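One generic way to narrow down nondeterminism like this is to build the same file twice and list the byte ranges that differ. The helper below is only a sketch of that idea. It does not identify the cause, and `differing_ranges` is a hypothetical name rather than an existing tool.

```python
def differing_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) offset ranges where a and b differ.

    If one input is longer, the extra tail counts as differing.
    """
    ranges: list[tuple[int, int]] = []
    start = None
    longest = max(len(a), len(b))
    for i in range(longest):
        byte_a = a[i] if i < len(a) else None
        byte_b = b[i] if i < len(b) else None
        if byte_a != byte_b:
            if start is None:
                start = i  # a new differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, longest))
    return ranges
```

Running this over the bytes of two builds of the same .pyd gives offsets that can then be compared against the file's header and section layout.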

In theory this is a security risk. Anyone cloning the repo and immediately running tests will execute most of these test library files. I don't suspect any of the current files are malicious, but the only way to actually verify that is to build the files myself. These non-reproducible files cannot be easily verified during PRs.

Right now CI always rebuilds the test files before bundling them in sdist and wheels, but this does not account for committed files that the rebuild never overwrites.

CI and Makefiles should delete all non-human-readable test data as part of their clean step.
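A minimal sketch of such a clean step follows, assuming "non-human-readable" can be approximated as "contains a NUL byte or is not valid UTF-8". That heuristic and both function names are assumptions for illustration. A real clean step might instead use an explicit list of generated files.

```python
from pathlib import Path


def is_human_readable(path: Path) -> bool:
    """Heuristic: treat a file as text if it has no NUL bytes and is valid UTF-8."""
    data = path.read_bytes()
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def clean_binary_test_data(root: Path, dry_run: bool = True) -> list[Path]:
    """Find (and, unless dry_run is set, delete) binary files under root.

    Returns the list of affected paths so the caller can log them.
    """
    removed: list[Path] = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and not is_human_readable(path):
            removed.append(path)
            if not dry_run:
                path.unlink()
    return removed
```

Defaulting to `dry_run=True` lets the list be reviewed before anything is deleted. Once the clean step has removed the binaries, a file that the build fails to regenerate shows up as missing instead of silently reusing a stale copy.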

HexDecimal added enhancement security labels Aug 29, 2024
