
MONAI Preprocessors and Transforms Design


Pre-processor design

This page is a work in progress. For now, please look at the design discussion page for this topic.

Introduction

MONAI's preprocessor design is centered around a few key principles:

  • transforms should be as 'unopinionated' as possible
  • transforms should be simple to call outside of a Compose scenario
  • transforms should be callable within a Compose context
  • Compose should work with transforms from many different libraries

Canonical transforms

Transforms should look as much as possible like vanilla Python functions. This means the following (a short sketch is given after the list):

  • Transforms should list their parameters explicitly, with or without default values
  • Transforms shouldn't be responsible for passing on parameters that they don't care about
    • If they must, this should be achieved through use of **kwargs
  • Transforms should return multiple values as a tuple
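
As a minimal sketch of these principles (the function names and parameters below are purely illustrative, not part of MONAI's API), a canonical transform is just a plain function:

def scale_intensity(image, factor=1.0):
    # parameters are listed explicitly, with a default value where appropriate
    return image * factor

def normalise(image, epsilon=1e-8):
    # multiple values are returned as a tuple rather than via a dict or
    # output arguments; image is assumed to be a numpy array or tensor
    mean, std = image.mean(), image.std()
    return (image - mean) / (std + epsilon), mean, std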

torchvision.Compose and the problems it creates

The torchvision Compose function is a convenience method for chaining together a number of transforms to make a pre-processing pipeline. torchvision's Compose assumes that each transform takes a single argument and returns a single value. All of the torchvision transforms adhere to this requirement, but transforms from other libraries do not always do so.
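
For example, a torchvision pipeline chains single-argument transforms, each one's return value becoming the next one's input (a minimal sketch using standard torchvision transforms):

import torchvision.transforms as transforms

pipeline = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5]),
])

# each stage receives exactly the single value produced by the previous stage
# result = pipeline(pil_image)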

One approach that some libraries take is to impose 'pass-through' semantics on their transforms, providing a **kwargs parameter through which unused arguments are forwarded. This typically requires a Compose function that is **kwargs compatible, but then torchvision-style transforms cannot be called using this style of Compose.
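
A sketch of that pass-through pattern (illustrative only, not any particular library's API) might look like this:

def flip(image=None, **kwargs):
    # operate only on the parameter this transform cares about
    # (image is assumed to be a numpy array here)...
    flipped = image[..., ::-1]
    # ...and pass every other entry through untouched so that a **kwargs-aware
    # Compose can forward it to subsequent transforms
    return dict(kwargs, image=flipped)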

Other libraries take a different approach again: each transform has a single dict parameter and returns a single dict value. They then enforce conventions on the keys of the dict so that entries in the dict effectively act as parameters. Again, this approach doesn't allow transforms from different libraries to be mixed easily.
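
A sketch of that dict-based convention (the key names here are hypothetical) might look like this:

def crop(sample):
    # entries in the dict effectively act as parameters: 'image' is the data
    # being transformed, while 'crop_size' configures the transform
    size = sample.get('crop_size', 64)
    sample = dict(sample)
    sample['image'] = sample['image'][:size, :size]
    return sample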

MONAI's approach is to treat Compose as an anti-pattern, but to make it easy to use if the user insists on doing so. This is achieved through use of the adaptor function.

adaptor can be used to wrap transforms from various libraries so that they can be chained with torchvision's Compose function (or MONAI's equivalent Compose implementation, for those who don't want the torchvision dependency).

How to use adaptor

adaptor assumes that Compose is being called in such a way that each transform receives a dictionary and returns a dictionary. Transforms that don't work this way can be wrapped in an adaptor call, as in the following example.


# Compose and adaptor are assumed here to come from MONAI; the exact import
# path may differ between MONAI versions
from monai.transforms import Compose, adaptor

def load_data(path):
    # returns a zero-argument callable that builds the initial dictionary;
    # get_images and get_labels are placeholder loading functions
    def _inner():
        dictionary = {'image': get_images(path), 'labels': get_labels(path)}
        return dictionary
    return _inner

def simple_tx(image):
    # do something to image and return a copy
    return image

def complex_tx(dictionary):
    # do something to the 'image' entry in the dictionary and return a copy of the
    # modified dictionary
    new_image = work_on(dictionary['image'])
    dictionary = dict(dictionary)
    dictionary['image'] = new_image
    return dictionary

Compose([
    load_data(path),
    adaptor(simple_tx, 'image'),
    complex_tx
])

When wrapping a simple transform, you typically only need to specify the output name, so that adaptor knows under which key to store the returned value in the dictionary that is passed from transform to transform.
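
Because the adaptor-wrapped transform is itself an ordinary dict-in, dict-out callable, it can also be used directly, outside of Compose (continuing the example above):

d = {'image': get_images(path), 'labels': get_labels(path)}
d = adaptor(simple_tx, 'image')(d)  # calls simple_tx(d['image']) and stores the result under 'image'
d = complex_tx(d)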
