-
These are all very good questions, @Zrrr1997. Yes, MONAI Label now supports multimodality/multichannel input. As a demo, we've used the BRATS dataset, which has 4 modalities stacked in a single Nifti file.
Assuming all image channels/modalities are registered, you should stack them into a single Nifti file.
Great question! If using MONAI Label, you should use a single file containing all the modalities. For BRATS I've used a single Nifti file with all 4 modalities stacked together. The issue is that 3D Slicer can visualize 4D volumes, but loading of 4D Nifti files is currently not supported. You can check the recent discussion about this here: #22, and an ugly/temporary "solution" here: #729 (comment). I hope this helps.
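Here is a rough sketch of the stacking step with nibabel (file names are hypothetical; it assumes all modalities are already co-registered and share the same grid and affine):

```python
import nibabel as nib
import numpy as np

# Hypothetical, already co-registered BRATS modality files (same grid and affine).
paths = ["t1.nii.gz", "t1ce.nii.gz", "t2.nii.gz", "flair.nii.gz"]

volumes = [nib.load(p) for p in paths]
affine = volumes[0].affine  # all volumes share the same geometry by assumption

# Stack along a 4th axis -> shape (H, W, D, 4), the layout NIfTI uses for 4D data.
data = np.stack([v.get_fdata(dtype=np.float32) for v in volumes], axis=-1)

nib.save(nib.Nifti1Image(data, affine), "brats_stacked.nii.gz")
```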
-
Let's separate the discussion of 4D image support and 4D Nifti support in Slicer.

Slicer supports 4D images. Loading of 4D images from DICOM and NRRD (and from MetaImage with the SlicerIGT extension) works well. 4D data can be edited and replayed (in slice views, as volume rendering, etc.), in sync with time sequences of other information (transforms, segmentations, annotations, etc.). There are 4D processing (cropping, registration, etc.) and plotting tools. 4D images can be saved as a single 4D NRRD file if the geometry (origin, spacing, axis directions, and extent) of all 3D images in it is the same; otherwise each volume is saved in a separate file, metadata is written to an XML file, and these files are zipped into a single file.

Slicer just does not support 4D Nifti files, because Nifti has already caused us so much trouble with its ambiguities, and it is so limited and so neuroimaging-focused, that it ends up being a very poor file format for generic medical image computing tasks. For example, while it is technically possible to store multichannel data in a Nifti file, there is no way to specify what each channel contains. The "best practice" for such needs among Nifti users is to add a similarly named .json file to the same folder and store the additional metadata there. Storing metadata in such loosely attached additional files makes workflows more complex and error-prone.

In contrast, NRRD files can store arbitrary metadata fields, so we can easily store all necessary metadata in a single file. Slicer already supports this: for example, if you stack 4 3D volumes into a 4D file (as is done in some BRATS dataset examples), Slicer stores it in a single NRRD file whose header records the channel axis and the shared geometry (see the illustrative sketch below).
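As a rough illustration (not the exact header Slicer writes; the per-channel key names below are made up), a multi-channel 4D NRRD with arbitrary metadata fields could be written with pynrrd like this:

```python
import numpy as np
import nrrd  # pynrrd

# Hypothetical: four co-registered 3D modalities stacked on a leading "channel" axis.
t1, t1ce, t2, flair = (np.zeros((240, 240, 155), dtype=np.int16) for _ in range(4))
data = np.stack([t1, t1ce, t2, flair], axis=0)  # shape (4, 240, 240, 155)

header = {
    "space": "left-posterior-superior",
    # First axis is a channel list; the remaining three are spatial.
    "kinds": ["list", "domain", "domain", "domain"],
    # NaN row = "none" (no spatial direction for the channel axis).
    "space directions": np.array(
        [[np.nan, np.nan, np.nan], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
    ),
    "space origin": np.array([0.0, 0.0, 0.0]),
    # Custom key/value pairs; the names are illustrative, not a fixed convention.
    "channel_0_name": "T1",
    "channel_1_name": "T1CE",
    "channel_2_name": "T2",
    "channel_3_name": "FLAIR",
}

nrrd.write("brats_stacked.nrrd", data, header)
```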
I would recommend using NRRD for general-purpose medical image storage, and a 4D NRRD file with the above fields for storing multiple independent channels in a single file. We can discuss adding 4D Nifti support to Slicer, but multi-modality image support would not be a valid driver for it, because Nifti is ill-suited for multi-modality image storage (too complex, yet too limited), and it does not officially support multi-modality image storage in multiple channels (the BRATS dataset example that I saw was a misuse of the Nifti standard). If we add 4D Nifti support, then a valid motivation would be fMRI or DCE-MRI support, which can be properly stored in Nifti.
-
Hi all,

Merging multiple 3D volumes into one 4D NRRD

To merge multiple 3D volumes into one 4D NRRD file, you can just follow @lassoan's comment above and add the header field shown there. However, in order to make it possible to switch between views in Slicer, I had to save it in a different format.

Annotation of 4D NRRD files?

My idea would be that a clinician could annotate one of the "modalities" with a paintbrush, as in the video, then switch to the other modality to check whether any structures are not annotated correctly (this would be even more relevant for PET-CT scans, where the visual difference is extreme). However, it seems like MONAILabel cannot load such files.

Another question would be: how should user annotations be integrated into the input of the segmentation model? For example, if the input has the shape (2, H, W, D), where 2 refers to the two modalities, should the scribbles also be (2, H, W, D), i.e. two identical (1, H, W, D) scribbles (just copying the annotations for each modality)? Or should one design an entirely new multi-modal segmentation model to handle this (something along the lines of middle/late fusion)?

I hope that these questions open a fruitful discussion! @diazandr3s, please let me know if I have misunderstood anything.

two-modality-annotation.mp4
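To make the shape question concrete, here is a small illustrative sketch of the two options I have in mind (purely shapes, not MONAI Label's actual input format):

```python
import numpy as np

H, W, D = 240, 240, 155
image = np.zeros((2, H, W, D), dtype=np.float32)      # two modalities, channel-first
scribbles = np.zeros((1, H, W, D), dtype=np.float32)  # one user annotation mask

# Option A: repeat the same scribbles for each modality.
scribbles_per_modality = np.repeat(scribbles, 2, axis=0)  # (2, H, W, D)

# Option B: keep the scribbles as one extra input channel next to the modalities
# (an early-fusion style input).
fused_input = np.concatenate([image, scribbles], axis=0)  # (3, H, W, D)
```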
-
Hi there MONAILabel team,
I already read the FAQ and saw that the multimodality functionality is currently underway (PR #753).
I was wondering what the workflow for multimodal datasets will look like, e.g. PET-CT scans. Should all the PET and CT images be channel-stacked so that the model processes them in an early-fusion way? Or will there also be late-fusion support?
Another interesting question is how 3D Slicer or any other annotation tool would manage to visualize multiple modalities. Would you have to specify which channels correspond to which modality? Or would you just split them into two separate directories with corresponding filenames?
It would be great to have some insight into how this workflow would look, since multimodality would be beneficial for gathering quality labels. I hope these questions are not too vague, and as always, keep up the great work!
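For context, here is roughly what I imagine the early-fusion loading could look like with MONAI transforms (file names are hypothetical, and I'm assuming LoadImaged stacks a list of files into channels; the exact behavior may depend on the MONAI version):

```python
from monai.transforms import Compose, EnsureChannelFirstd, LoadImaged

# Hypothetical file names; passing a list of files for one key should let
# LoadImaged stack them into a single multi-channel image (early fusion at the input).
data = {"image": ["patient01_ct.nii.gz", "patient01_pet.nii.gz"]}

transforms = Compose([
    LoadImaged(keys="image"),
    EnsureChannelFirstd(keys="image"),  # keep the stacked files as the channel axis
])

sample = transforms(data)
print(sample["image"].shape)  # expected: (2, H, W, D), CT and PET as channels
```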