Input image transforms for some of the models may be improved #441

Open
DrLex0 opened this issue May 31, 2023 · 1 comment

Comments

DrLex0 commented May 31, 2023

Is your feature request related to a problem? Please describe.
The input image transforms in some of the models may not be configured to offer optimal input to the model during training, validation, and inference.

For instance, the endoscopic_inbody_classification example:

  1. uses a Resized transform that shrinks images to 256×256 pixels, but does not enable anti_aliasing. Especially when downscaling large video frames with sharp details, patterns, or lines in them, this can produce aliasing artifacts that may cause the model to learn the wrong things or to recognize structures that aren't really in the original image.
  2. uses NormalizeIntensityd with nonzero set to true. If there are zero-valued pixels in the image, they will not be scaled and offset together with the rest, which may cause discontinuities and again make it seem as if there are structures that aren't really there.
  3. uses NormalizeIntensityd without specifying a fixed subtrahend and divisor. This means the intensities in each image will be normalized according to the mean and standard deviation of that specific image. If the image only contains a narrow range of intensities, for instance a dark image with sensor noise, this will be blown up to a big noisy mess in which the model might recognize random things. The usual practice for ImageNet and similar datasets is to calculate the mean and stddev across the entire training set and use those values everywhere (a sketch of computing such statistics follows this list).
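
For reference, here is a minimal sketch of computing such dataset-wide statistics, assuming the training frames come from a hypothetical PyTorch `train_loader` that yields channel-first image batches under an "image" key; it is not part of the bundle itself:

```python
# Accumulate per-channel mean/std across the whole training set.
import torch

n_pixels = 0
channel_sum = torch.zeros(3)
channel_sq_sum = torch.zeros(3)
for batch in train_loader:  # hypothetical DataLoader yielding {"image": BxCxHxW}
    img = batch["image"].float()
    n_pixels += img.shape[0] * img.shape[2] * img.shape[3]
    channel_sum += img.sum(dim=(0, 2, 3))
    channel_sq_sum += (img ** 2).sum(dim=(0, 2, 3))

mean = channel_sum / n_pixels
std = (channel_sq_sum / n_pixels - mean ** 2).sqrt()
print(mean, std)  # candidate fixed subtrahend/divisor for NormalizeIntensityd
```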

Describe the solution you'd like

  1. Set "anti_aliasing": true in the Resized transforms.
  2. Consider leaving nonzero at false in the NormalizeIntensityd transform unless there really is a good reason for it.
  3. Consider setting a fixed subtrahend and divisor in the NormalizeIntensityd transform (see the transform sketch after this list).
  4. Re-train the models with whatever parameters were changed.
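
A minimal sketch of the adjusted preprocessing chain, assuming channel-first RGB frames under an "image" key; the subtrahend/divisor numbers are placeholders that would have to be replaced by statistics computed over the actual training set:

```python
from monai.transforms import Compose, NormalizeIntensityd, Resized

preprocess = Compose([
    # Item 1: smooth before downscaling so sharp detail doesn't alias.
    Resized(keys="image", spatial_size=(256, 256), anti_aliasing=True),
    # Items 2 and 3: normalize all pixels (nonzero=False) with fixed,
    # dataset-wide per-channel statistics instead of per-image mean/std.
    NormalizeIntensityd(
        keys="image",
        subtrahend=[0.485, 0.456, 0.406],  # placeholder per-channel means
        divisor=[0.229, 0.224, 0.225],     # placeholder per-channel stds
        nonzero=False,
        channel_wise=True,
    ),
])
```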

Additional context
A paper that discusses the impact of aliasing in convolutional networks

I have tested the impact of enabling anti_aliasing on processing time (a rough timing sketch follows). On a GPU (RTX A2000), the impact is tiny: transforming a 720p video frame and adding it to a batch takes 11 ms instead of 9 ms. On CPU the impact is much larger (55 ms instead of 9 ms).
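
A rough CPU timing sketch using a synthetic 720p RGB frame (not my exact benchmark); run the same loop on GPU tensors to reproduce the GPU numbers:

```python
import time

import torch
from monai.transforms import Resized

frame = {"image": torch.rand(3, 720, 1280)}
for aa in (False, True):
    resize = Resized(keys="image", spatial_size=(256, 256), anti_aliasing=aa)
    start = time.perf_counter()
    for _ in range(100):
        resize(dict(frame))
    per_frame_ms = (time.perf_counter() - start) * 1000 / 100
    print(f"anti_aliasing={aa}: {per_frame_ms:.1f} ms/frame")
```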

Nic-Ma (Collaborator) commented May 31, 2023

Hi @DrLex0 ,

Thanks for your feedback. The bundle config files show the exact config content we used to train the model. If you want to change any config content at runtime, you can override the args in your CLI; refer to:
https://github.com/Project-MONAI/MONAI/blob/dev/monai/bundle/scripts.py#L655
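
For example, a hypothetical override sketch assuming the Resized transform sits at index 1 of the bundle's "preprocessing#transforms" list (adjust the id path to the actual config layout):

```python
# Equivalent CLI form:
#   python -m monai.bundle run --config_file configs/inference.json \
#       "--preprocessing#transforms#1#anti_aliasing" True
from monai.bundle import run

run(
    config_file="configs/inference.json",
    **{"preprocessing#transforms#1#anti_aliasing": True},  # nested-key override
)
```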

Thanks.
