Test passes locally but fails on CI with identical environments #84
Comments
Hey @oeway, this job https://github.com/bioimage-io/collection/actions/runs/9942967885 has been queued for a while even though no other jobs are running. Could you have a look?
With the same conda environment on my EMBL disk, running I believe it's a problem caused by environment variables and/or CPU architecture.
Alright, now we know that AVX2 and AVX512 on Xeon give similar but slightly different results (Mismatched elements:
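One plausible reason for small AVX2 vs AVX512 differences is that wider vector units accumulate sums in a different order, and float32 addition is not associative. The sketch below is only an illustration of that effect, not the actual kernel used by PyTorch or MKL: `lanewise_sum` and `to_f32` are hypothetical helpers that emulate a reduction with a given number of independent accumulators.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lanewise_sum(values, lanes: int) -> float:
    """Sum values in float32 using `lanes` independent accumulators.

    Loosely emulates a vectorized reduction: element i goes to
    accumulator i % lanes, and the accumulators are combined at the end.
    This is an illustrative model, not a real SIMD implementation.
    """
    acc = [0.0] * lanes
    for i, v in enumerate(values):
        acc[i % lanes] = to_f32(acc[i % lanes] + to_f32(v))
    total = 0.0
    for a in acc:
        total = to_f32(total + a)
    return total


# Same inputs, different accumulation order.
values = [1e8, 1.0, -1e8, 1.0]
narrow = lanewise_sum(values, lanes=1)  # strictly sequential
wide = lanewise_sum(values, lanes=2)    # two interleaved accumulators
print("lanes=1:", narrow, "lanes=2:", wide)
```

The two results differ because `1e8 + 1.0` rounds back to `1e8` in float32, so the sequential order loses one of the small terms while the interleaved order does not. Real models show much smaller discrepancies, but the mechanism would be the same.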
Another test I ran: I opened a Xeon Jupyter Hub instance, and the output matches my Xeon kreshuk-gpu1. This rules out user-set environment variables as the cause.
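To compare two machines along the lines above, it may help to dump the SIMD flags and the threading/BLAS-related environment variables on each one and diff the results. This is a minimal sketch assuming a Linux host with `/proc/cpuinfo`; the helper names and the list of variable prefixes are my own guesses at what is relevant, not something bioimageio.core provides.

```python
import os


def parse_cpu_flags(cpuinfo_text: str) -> set:
    """Return the CPU feature flags from /proc/cpuinfo-style text (Linux x86)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def simd_summary(flags: set) -> dict:
    """Report whether AVX2 and any AVX-512 extension are advertised."""
    return {
        "avx2": "avx2" in flags,
        "avx512": any(f.startswith("avx512") for f in flags),
    }


def select_env(environ, prefixes=("MKL_", "OMP_", "ATEN_")) -> dict:
    """Keep only variables whose names start with one of the (assumed) prefixes."""
    return {k: v for k, v in sorted(environ.items()) if k.startswith(prefixes)}


# Only read /proc/cpuinfo where it exists; elsewhere the flag set stays empty.
cpuinfo_path = "/proc/cpuinfo"
cpu_flags = set()
if os.path.exists(cpuinfo_path):
    with open(cpuinfo_path, encoding="utf-8", errors="replace") as fh:
        cpu_flags = parse_cpu_flags(fh.read())

print("SIMD:", simd_summary(cpu_flags))
print("ENV:", select_env(dict(os.environ)))
```

Running this locally, on the Jupyter Hub instance, and in a CI step would show whether the runner advertises a different instruction set than the Xeon machines.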
TorchScript is deterministic for the same input, given the same model state and environment. In theory, the same input should therefore always produce the same output as long as no external factors change. Both
bioimageio test rdf.yml pytorch_state_dict
and
bioimageio test rdf.yml torchscript
pass in my env created from
mamba create -n bioimageio.core -c conda-forge -c pytorch bioimageio.core pytorch
, or
mamba create -n bioimageio.core -c conda-forge -c pytorch bioimageio.core pytorch torchvision torchaudio cpuonly
, or
mamba create -n bioimageio.core.online -c pytorch -c conda-forge "bioimageio.core==0.6.7" "pytorch==2.3.1" "blas==1.0" "mkl==2022.2.1" "numpy==1.26.4" torchvision torchaudio cpuonly
The last one produces a list of packages and versions identical to the one on CI.
Anyway, CI still fails, for example here: https://github.com/bioimage-io/collection/actions/runs/9892771006/job/27326341115
conda list output
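To quantify how far the CI output is from the local one, a small element-wise comparison can help. The sketch below assumes the test compares outputs against a tolerance using the rule `|actual - expected| <= atol + rtol * |expected|` (the rule NumPy's `isclose` uses); `compare_outputs` is a hypothetical helper, and the default tolerances are illustrative, not the ones bioimageio.core applies.

```python
def compare_outputs(actual, expected, rtol=1e-5, atol=1e-8) -> dict:
    """Count elements that differ by more than atol + rtol * |expected|.

    `actual` and `expected` are flat sequences of floats of equal length.
    """
    if len(actual) != len(expected):
        raise ValueError("actual and expected must have the same length")
    diffs = [abs(a - e) for a, e in zip(actual, expected)]
    mismatched = sum(
        1 for d, e in zip(diffs, expected) if d > atol + rtol * abs(e)
    )
    return {
        "mismatched": mismatched,
        "total": len(diffs),
        "max_abs_diff": max(diffs, default=0.0),
    }


# Made-up numbers, only to show the shape of the report.
report = compare_outputs([1.0, 2.0, 3.0], [1.0, 2.0, 3.1], rtol=1e-3, atol=0.0)
print(report)
```

If the mismatches vanish with a slightly looser `rtol`, that would point to floating-point reordering rather than a genuinely different model or input.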