MONAI_Preprocessors_and_Transforms_Design
This page is a work in progress. For now, please look at the design discussion page for this topic.
MONAI's preprocessor design is centered around a few key principles:
- transforms should be as 'unopinionated' as possible
- transforms should be simple to call outside of a `Compose` scenario
- transforms should be callable within a `Compose` context
- `Compose` should work with transforms from many different libraries
Transforms should look as much as possible like vanilla Python functions. This means the following:
- Transforms should list their parameters, with or without default values
- Transforms shouldn't be responsible for passing parameters that they don't care about
  - If they do, it should be achieved through use of `**kwargs`
- Transforms should return multiple values by tuple
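As a rough illustration of these guidelines, the sketch below shows two plain functions written in that style. The names `scale_intensity` and `flip_horizontal` are hypothetical and are not part of MONAI; images are represented as nested lists only to keep the example dependency-free.

```python
def scale_intensity(image, factor=1.0, offset=0.0):
    """Hypothetical transform: every parameter is listed, some with defaults."""
    return [[pixel * factor + offset for pixel in row] for row in image]


def flip_horizontal(image, label):
    """Hypothetical transform: returns its two results as a tuple."""
    flipped_image = [row[::-1] for row in image]
    flipped_label = [row[::-1] for row in label]
    return flipped_image, flipped_label


image = [[1, 2], [3, 4]]
label = [[0, 1], [1, 0]]

# Both transforms are called like any other Python function.
scaled = scale_intensity(image, factor=2.0)
flipped_image, flipped_label = flip_horizontal(image, label)
```

Neither function knows anything about pipelines or dictionaries, which is what makes them easy to call on their own.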
The torchvision `Compose` class is a convenience for chaining together a number of transforms to make a pre-processing pipeline. torchvision's `Compose` assumes that every transform takes a single argument and returns a single value. All of the torchvision transforms adhere to this requirement, but transforms from other libraries do not always do so.
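The single-argument assumption can be sketched with a small stand-in class. `SimpleCompose` below is an illustrative approximation written for this page, not torchvision's actual implementation, and the three transforms are hypothetical.

```python
class SimpleCompose:
    """Illustrative stand-in for a torchvision-style Compose."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, data):
        # Each transform receives exactly one value and must return one value.
        for transform in self.transforms:
            data = transform(data)
        return data


def add_one(value):
    return value + 1


def double(value):
    return value * 2


def scale(value, factor):
    # Needs two arguments, so it does not fit the single-argument chain.
    return value * factor


result = SimpleCompose([add_one, double])(3)  # (3 + 1) * 2

# A transform with a second required parameter breaks the chain.
try:
    SimpleCompose([add_one, scale])(3)
except TypeError as error:
    failure = str(error)
```

The `TypeError` raised for `scale` is the kind of mismatch that transforms from other libraries run into under this style of chaining.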
One approach libraries take is to impose 'pass-through' semantics on their transforms and provide a `**kwargs` parameter for doing so. This typically requires a `Compose` function that is `**kwargs`-compatible, but then torchvision-style transforms cannot be called using this style of `Compose`.
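A minimal sketch of the pass-through style follows, assuming a hypothetical `compose_kwargs` helper and two hypothetical transforms. It is not taken from any particular library.

```python
def compose_kwargs(transforms):
    """Hypothetical **kwargs-compatible Compose."""

    def run(**kwargs):
        for transform in transforms:
            # Each transform is called with keyword arguments and returns a dict.
            kwargs = transform(**kwargs)
        return kwargs

    return run


def brighten(image, amount=1, **kwargs):
    # Uses 'image' and passes every other entry through untouched.
    return dict(kwargs, image=[pixel + amount for pixel in image])


def binarize_label(label, **kwargs):
    # Uses 'label' and passes every other entry through untouched.
    return dict(kwargs, label=[1 if value > 0 else 0 for value in label])


pipeline = compose_kwargs([brighten, binarize_label])
output = pipeline(image=[1, 2, 3], label=[0, 5, 2])
```

A torchvision-style transform that takes one positional argument and returns a bare value cannot be dropped into this pipeline, because `run` calls every transform with keyword arguments and expects a dictionary back.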
Other libraries take a different approach again, which is to have a single `dict` parameter and return a single `dict` value. They then enforce conventions on keywords in the `dict` so that entries in the `dict` effectively act as parameters.
Again, this approach doesn't allow simple usage of transforms from different libraries.
MONAI's approach is to treat `Compose` as an anti-pattern, but make it easy to use if the user insists on doing so. This is achieved through use of the `adaptor` function.
`adaptor` can be used to wrap transforms from various different libraries for use with torchvision's `Compose` function (or MONAI's equivalent `Compose` implementation for those who don't want the torchvision requirement).
`adaptor` assumes that `Compose` is being called in such a way that each transform receives a dictionary and each transform returns a dictionary. You can wrap transforms that don't work this way in an `adaptor` call.
    # get_images, get_labels, work_on and path are placeholders in this example

    def load_data(path):
        def _inner():
            dictionary = {'image': get_images(path), 'labels': get_labels(path)}
            return dictionary
        return _inner

    def simple_tx(image):
        # do something to image and return a copy
        return image

    def complex_tx(dictionary):
        # do something to the 'image' entry in the dictionary and return a copy of the
        # modified dictionary
        new_image = work_on(dictionary['image'])
        dictionary = dict(dictionary)
        dictionary['image'] = new_image
        return dictionary

    Compose([
        load_data(path),
        adaptor(simple_tx, 'image'),
        complex_tx
    ])
When wrapping a simple transform, you typically only need to specify the output name, so that the adaptor knows under which key to store the returned element in the dictionary that is passed from transform to transform.
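To make the wrapping idea concrete, here is a simplified sketch of how such a wrapper could behave. `simple_adaptor` is a hypothetical helper written for this page and is not MONAI's `adaptor` implementation. It assumes that the wrapped function's parameter names match keys in the dictionary, and that the output name (or names) says where the result is stored.

```python
import inspect


def simple_adaptor(function, outputs):
    """Hypothetical, simplified adaptor: dict in, dict out."""
    if isinstance(outputs, str):
        outputs = (outputs,)
    parameter_names = list(inspect.signature(function).parameters)

    def _inner(dictionary):
        # Assumption: parameter names are looked up as dictionary keys.
        arguments = {name: dictionary[name]
                     for name in parameter_names if name in dictionary}
        result = function(**arguments)
        if len(outputs) == 1:
            result = (result,)
        # Store the results under the output names in a copy of the dictionary.
        updated = dict(dictionary)
        updated.update(zip(outputs, result))
        return updated

    return _inner


def invert(image):
    # A plain single-argument transform that knows nothing about dictionaries.
    return [255 - pixel for pixel in image]


wrapped = simple_adaptor(invert, 'image')
sample = {'image': [0, 100, 255], 'labels': [0, 1, 1]}
output = wrapped(sample)
```

The `'labels'` entry is carried along unchanged, and only the key named as the output is overwritten with the transform's return value.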