Merge pull request #62 from will-moore/roi_table_on_dataset

Support ROI table on Dataset, for table with ROI ID column
ome · Mar 21, 2022 · cd7edb5
2 parents 148b6a4 + 536a76b
commit cd7edb5

Showing 3 changed files with 278 additions and 42 deletions.
diff --git a/README.rst b/README.rst
@@ -65,8 +65,8 @@ populate
 
 This command creates an ``OMERO.table`` (bulk annotation) from a ``CSV`` file and links 
 the table as a ``File Annotation`` to a parent container such as Screen, Plate, Project
-or Dataset. It also attempts to convert Image or Well names from the ``CSV`` into
-Image or Well IDs in the ``OMERO.table``.
+Dataset or Image. It also attempts to convert Image, Well or ROI names from the ``CSV`` into
+object IDs in the ``OMERO.table``.
 
 The ``CSV`` file must be provided as local file with ``--file path/to/file.csv``.
 
@@ -86,10 +86,10 @@ The ``# header`` row is optional. Default column type is ``String``.
 NB: Column names should not contain spaces if you want to be able to query
 by these columns.
 
-Examples:
+**Project / Dataset**
 
 To add a table to a Project, the ``CSV`` file needs to specify ``Dataset Name``
-and ``Image Name``::
+and ``Image Name`` or ``Image ID``::
 
     $ omero metadata populate Project:1 --file path/to/project.csv
 
@@ -102,7 +102,8 @@ project.csv::
     img-03.png,dataset01,0.093,3,TRITC
     img-04.png,dataset01,0.429,4,Cy5
 
-This will create an OMERO.table linked to the Project like this:
+This will create an OMERO.table linked to the Project like this with
+a new ``Image`` column with IDs:
 
 ========== ============ ======== ============= ============ =====
 Image Name Dataset Name ROI_Area Channel_Index Channel_Name Image
@@ -115,6 +116,9 @@ img-04.png dataset01    0.429    4             Cy5          36641
 
 If the target is a Dataset instead of a Project, the ``Dataset Name`` column is not needed.
 
+
+**Screen / Plate**
+
 To add a table to a Screen, the ``CSV`` file needs to specify ``Plate`` name and ``Well``.
 If a ``# header`` is specified, column types must be ``well`` and ``plate``.
 
@@ -142,36 +146,45 @@ Well  Plate  Drug   Concentration  Cell_Count  Percent_Mitotic  Well Name   Plat
 
 If the target is a Plate instead of a Screen, the ``Plate`` column is not needed.
 
-If the target is an Image, a csv with ROI-level and object-level data can be used to create an
-``OMERO.table`` (bulk annotation) as a ``File Annotation`` on an Image.
-The ROI identifying column can be an ``roi`` type column containing ROI ID, and ``Roi Name``
-column will be appended automatically (see example below). Alternatively, the input column can be
+**ROIs**
+
+If the target is an Image or a Dataset, a ``CSV`` with ROI-level or Shape-level data can be used to create an
+``OMERO.table`` (bulk annotation) as a ``File Annotation`` linked to the target object.
+If there is an ``roi`` column (header type ``roi``) containing ROI IDs, an ``Roi Name``
+column will be appended automatically (see example below). If a column of Shape IDs named ``shape``
+of type ``l`` is included, the Shape IDs will be validated (and set to -1 if invalid).
+Also if an ``image`` column of Image IDs is included, an ``Image Name`` column will be added.
+NB: Columns of type ``shape`` aren't yet supported on the OMERO.server.
+
+Alternatively, if the target is an Image, the ROI input column can be
 ``Roi Name`` (with type ``s``), and an ``roi`` type column will be appended containing ROI IDs.
 In this case, it is required that ROIs on the Image in OMERO have the ``Name`` attribute set.
 
 image.csv::
 
-    # header roi,l,d,l
-    Roi,object,probability,area
-    501,1,0.8,250
-    502,1,0.9,500
-    503,1,0.2,25
-    503,2,0.8,400
-    503,3,0.5,200
+    # header roi,l,l,d,l
+    Roi,shape,object,probability,area
+    501,1066,1,0.8,250
+    502,1067,2,0.9,500
+    503,1068,3,0.2,25
+    503,1069,4,0.8,400
+    503,1070,5,0.5,200
 
 This will create an OMERO.table linked to the Image like this:
 
-=== ====== =========== ==== ========
-Roi object probability area Roi Name
-=== ====== =========== ==== ========
-501 1      0.8         250  Sample1
-502 1      0.9         500  Sample2
-503 1      0.2         25   Sample3
-503 2      0.8         400  Sample3
-503 3      0.5         200  Sample3
-=== ====== =========== ==== ========
-
-Note that the ROI-level ``OMERO.table`` is not visible in the OMERO.web UI right-hand panel, but can be visualized by clicking the "eye" on the bulk annotation attachment on the Image.
+=== ===== ====== =========== ==== ========
+Roi shape object probability area Roi Name
+=== ===== ====== =========== ==== ========
+501 1066  1      0.8         250  Sample1
+502 1067  2      0.9         500  Sample2
+503 1068  3      0.2         25   Sample3
+503 1069  4      0.8         400  Sample3
+503 1070  5      0.5         200  Sample3
+=== ===== ====== =========== ==== ========
+
+Note that the ROI-level data from an ``OMERO.table`` is not visible
+in the OMERO.web UI right-hand panel under the ``Tables`` tab,
+but the table can be visualized by clicking the "eye" on the bulk annotation attachment on the Image.
 
 Developer install
 =================
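
The README changes above describe a ``CSV`` whose optional ``# header`` row gives one type code per column (``roi``, ``l``, ``d``, ``s``, ...). As a rough illustration only, the sketch below pairs those type codes with the column names from the ``image.csv`` example. The helper name ``parse_typed_csv`` is hypothetical and is not part of ``omero-metadata``; the real parsing lives in ``populate.py`` and may differ in detail.

```python
import csv
import io

HEADER_PREFIX = "# header "

# Example taken from the README diff above (image.csv).
IMAGE_CSV = """\
# header roi,l,l,d,l
Roi,shape,object,probability,area
501,1066,1,0.8,250
502,1067,2,0.9,500
"""


def parse_typed_csv(text):
    """Hypothetical helper: return (types, names, rows) for a typed CSV.

    If the first line starts with '# header ', it is read as a
    comma-separated list of column type codes. Otherwise every column
    defaults to 's' (String), matching the default stated in the README.
    """
    lines = text.strip().splitlines()
    types = None
    if lines and lines[0].startswith(HEADER_PREFIX):
        types = lines[0][len(HEADER_PREFIX):].split(",")
        lines = lines[1:]
    reader = csv.reader(io.StringIO("\n".join(lines)))
    names = next(reader)
    rows = list(reader)
    if types is None:
        types = ["s"] * len(names)
    if len(types) != len(names):
        # Same condition that _create_columns() checks in populate.py.
        raise ValueError("Number of columns and column types not equal.")
    return types, names, rows


types, names, rows = parse_typed_csv(IMAGE_CSV)
for type_code, name in zip(types, names):
    print("%s -> %s" % (name, type_code))
```

Here the ``shape`` column is declared as ``l`` (long), which is the case the README says is validated by name rather than by a dedicated column type.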

diff --git a/src/omero_metadata/populate.py b/src/omero_metadata/populate.py
@@ -253,6 +253,7 @@ def create_columns_image(self):
         return self._create_columns("image")
 
     def _create_columns(self, klass):
+        target_class = self.target_object.__class__
         if self.types is not None and len(self.types) != len(self.headers):
             message = "Number of columns and column types not equal."
             raise MetadataError(message)
@@ -308,7 +309,7 @@ def _create_columns(self, klass):
                               self.DEFAULT_COLUMN_SIZE, list()))
                 # Ensure ImageColumn is named "Image"
                 column.name = "Image"
-            if column.__class__ is RoiColumn:
+            if column.__class__ is RoiColumn and target_class != DatasetI:
                 append.append(StringColumn(ROI_NAME_COLUMN, '',
                               self.DEFAULT_COLUMN_SIZE, list()))
                 # Ensure RoiColumn is named 'Roi'
@@ -446,7 +447,7 @@ def resolve(self, column, value, row):
             try:
                 return images_by_id[int(value)].id.val
             except KeyError:
-                log.debug('Image Id: %i not found!' % (value))
+                log.debug('Image Id: %s not found!' % (value))
                 return -1
             return
         if WellColumn is column_class:
@@ -458,6 +459,8 @@ def resolve(self, column, value, row):
             return self.wrapper.resolve_dataset(column, row, value)
         if RoiColumn is column_class:
             return self.wrapper.resolve_roi(column, row, value)
+        if column_as_lower == 'shape':
+            return self.wrapper.resolve_shape(value)
         if column_as_lower in ('row', 'column') \
            and column_class is LongColumn:
             try:
@@ -757,8 +760,36 @@ def __init__(self, value_resolver):
         super(DatasetWrapper, self).__init__(value_resolver)
         self.images_by_id = dict()
         self.images_by_name = dict()
+        self.rois_by_id = None
+        self.shapes_by_id = None
         self._load()
 
+    def resolve_roi(self, column, row, value):
+        # Support Dataset table with known ROI IDs
+        if self.rois_by_id is None:
+            self._load_rois()
+        try:
+            return self.rois_by_id[int(value)].id.val
+        except KeyError:
+            log.warn('Dataset is missing ROI: %s' % value)
+            return -1
+        except ValueError:
+            log.warn('Wrong input type for ROI ID: %s' % value)
+            return -1
+
+    def resolve_shape(self, value):
+        # Support Dataset table with known Shape IDs
+        if self.rois_by_id is None:
+            self._load_rois()
+        try:
+            return self.shapes_by_id[int(value)].id.val
+        except KeyError:
+            log.warn('Dataset is missing Shape: %s' % value)
+            return -1
+        except ValueError:
+            log.warn('Wrong input type for Shape ID: %s' % value)
+            return -1
+
     def get_image_id_by_name(self, iname, dname=None):
         return self.images_by_name[iname].id.val
 
@@ -800,12 +831,48 @@ def _load(self):
             images_by_id[iid] = image
             if iname in self.images_by_name:
                 raise Exception("Image named %s(id=%d) present. (id=%s)" % (
-                    iname, self.images_by_name[iname], iid
+                    iname, self.images_by_name[iname].id.val, iid
                 ))
             self.images_by_name[iname] = image
         self.images_by_id[self.target_object.id.val] = images_by_id
         log.debug('Completed parsing dataset: %s' % self.target_name)
 
+    def _load_rois(self):
+        log.debug('Loading ROIs in Dataset:%d' % self.target_object.id.val)
+        self.rois_by_id = {}
+        self.shapes_by_id = {}
+        query_service = self.client.getSession().getQueryService()
+        parameters = omero.sys.ParametersI()
+        parameters.addId(self.target_object.id.val)
+        data = list()
+        while True:
+            parameters.page(len(data), 1000)
+            rv = unwrap(query_service.projection((
+                'select distinct i, r, s '
+                'from Shape s '
+                'join s.roi as r '
+                'join r.image as i '
+                'join i.datasetLinks as dil '
+                'join dil.parent as d '
+                'where d.id = :id order by s.id desc'),
+                parameters, {'omero.group': '-1'}))
+            if len(rv) == 0:
+                break
+            else:
+                data.extend(rv)
+        if not data:
+            raise MetadataError("No ROIs on images in target Dataset")
+
+        for image, roi, shape in data:
+            # we only care about *IDs* of ROIs and Shapes in the Dataset
+            rid = roi.id.val
+            sid = shape.id.val
+            self.rois_by_id[rid] = roi
+            self.shapes_by_id[sid] = shape
+
+        log.debug('Completed loading ROIs and Shapes in Dataset: %s'
+                  % self.target_object.id.val)
+
 
 class ProjectWrapper(PDIWrapper):
 
@@ -894,6 +961,7 @@ class ImageWrapper(ValueWrapper):
     def __init__(self, value_resolver):
         super(ImageWrapper, self).__init__(value_resolver)
         self.rois_by_id = dict()
+        self.shapes_by_id = dict()
         self.rois_by_name = dict()
         self.ambiguous_naming = False
         self._load()
@@ -904,15 +972,25 @@ def get_roi_id_by_name(self, rname):
     def get_roi_name_by_id(self, rid):
         return unwrap(self.rois_by_id[rid].name)
 
+    def resolve_shape(self, value):
+        try:
+            return self.shapes_by_id[int(value)].id.val
+        except KeyError:
+            log.warn('Image is missing Shape: %s' % value)
+            return -1
+        except ValueError:
+            log.warn('Wrong input type for Shape ID: %s' % value)
+            return -1
+
     def resolve_roi(self, column, row, value):
         try:
             return self.rois_by_id[int(value)].id.val
         except KeyError:
             log.warn('Image is missing ROI: %s' % value)
-            return Skip()
+            return -1
         except ValueError:
             log.warn('Wrong input type for ROI ID: %s' % value)
-            return Skip()
+            return -1
 
     def _load(self):
         query_service = self.client.getSession().getQueryService()
@@ -930,9 +1008,10 @@ def _load(self):
         while True:
             parameters.page(len(data), 1000)
             rv = query_service.findAllByQuery((
-                'select distinct r from Image as i '
-                'join i.rois as r '
-                'where i.id = :id order by r.id desc'),
+                'select distinct s from Shape as s '
+                'join s.roi as r '
+                'join r.image as i '
+                'where i.id = :id order by s.id desc'),
                 parameters, {'omero.group': '-1'})
             if len(rv) == 0:
                 break
@@ -943,15 +1022,19 @@ def _load(self):
 
         rois_by_id = dict()
         rois_by_name = dict()
-        for roi in data:
+        shapes_by_id = dict()
+        for shape in data:
+            roi = shape.roi
             rid = roi.id.val
             rois_by_id[rid] = roi
+            shapes_by_id[shape.id.val] = shape
             if unwrap(roi.name) in rois_by_name.keys():
                 log.warn('Conflicting ROI names.')
                 self.ambiguous_naming = True
             rois_by_name[unwrap(roi.name)] = roi
         self.rois_by_id = rois_by_id
         self.rois_by_name = rois_by_name
+        self.shapes_by_id = shapes_by_id
         log.debug('Completed parsing image: %s' % self.target_name)
 
 
@@ -1148,8 +1231,8 @@ def preprocess_data(self, reader):
                     if isinstance(value, basestring):
                         column.size = max(
                             column.size, len(value.encode('utf-8')))
-                    # The following are needed for
-                    # getting post process column sizes
+                    # The following IDs are needed for
+                    # post_process() to get column sizes for names
                     if column.__class__ is WellColumn:
                         column.values.append(value)
                     elif column.__class__ is ImageColumn:
@@ -1164,6 +1247,8 @@ def preprocess_data(self, reader):
                     log.error('Original value "%s" now "%s" of bad type!' % (
                         original_value, value))
                     raise
+            # we call post_process on each single (mostly empty) row
+            # to get ids -> names
             self.post_process()
             for column in self.columns:
                 column.values = []
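
The ``DatasetWrapper`` changes in this diff follow a recognisable pattern: IDs are loaded lazily on first use, fetched page by page until an empty page comes back, and any value that is unknown or not an integer resolves to ``-1``. The sketch below imitates that pattern without an OMERO server. The names ``load_all``, ``LazyIdResolver`` and ``fetch_page`` are assumptions made for illustration, and the in-memory list stands in for the query service; it is not the project's implementation.

```python
import logging

log = logging.getLogger(__name__)

PAGE_SIZE = 1000  # same page size as the queries in the diff


def load_all(fetch_page, page_size=PAGE_SIZE):
    """Collect results page by page until an empty page is returned.

    fetch_page(offset, limit) is a hypothetical callable standing in for
    a paged server query.
    """
    data = []
    while True:
        page = fetch_page(len(data), page_size)
        if not page:
            break
        data.extend(page)
    return data


class LazyIdResolver:
    """Validate IDs against a set that is only loaded on first use."""

    def __init__(self, fetch_page):
        self._fetch_page = fetch_page
        self._known_ids = None  # None means "not loaded yet"

    def resolve(self, value):
        if self._known_ids is None:
            self._known_ids = set(load_all(self._fetch_page))
        try:
            object_id = int(value)
        except ValueError:
            log.warning("Wrong input type for ID: %s", value)
            return -1
        if object_id not in self._known_ids:
            log.warning("Target is missing object: %s", value)
            return -1
        return object_id


# Stand-in for the server: three known ROI IDs served in pages.
FAKE_ROI_IDS = [501, 502, 503]


def fake_fetch_page(offset, limit):
    return FAKE_ROI_IDS[offset:offset + limit]


resolver = LazyIdResolver(fake_fetch_page)
print(resolver.resolve("502"))
print(resolver.resolve("999"))
print(resolver.resolve("abc"))
```

Returning ``-1`` instead of skipping the row keeps every ``CSV`` row in the resulting table, which appears to be the intent of replacing ``Skip()`` with ``-1`` in ``ImageWrapper.resolve_roi``.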