Merge pull request #62 from will-moore/roi_table_on_dataset
Support ROI table on Dataset, for table with ROI ID column
jburel authored Mar 21, 2022
2 parents 148b6a4 + 536a76b commit cd7edb5
Showing 3 changed files with 278 additions and 42 deletions.
67 changes: 40 additions & 27 deletions README.rst
@@ -65,8 +65,8 @@ populate

 This command creates an ``OMERO.table`` (bulk annotation) from a ``CSV`` file and links
 the table as a ``File Annotation`` to a parent container such as Screen, Plate, Project
-or Dataset. It also attempts to convert Image or Well names from the ``CSV`` into
-Image or Well IDs in the ``OMERO.table``.
+Dataset or Image. It also attempts to convert Image, Well or ROI names from the ``CSV`` into
+object IDs in the ``OMERO.table``.

 The ``CSV`` file must be provided as local file with ``--file path/to/file.csv``.

@@ -86,10 +86,10 @@ The ``# header`` row is optional. Default column type is ``String``.
 NB: Column names should not contain spaces if you want to be able to query
 by these columns.

-Examples:
+**Project / Dataset**

 To add a table to a Project, the ``CSV`` file needs to specify ``Dataset Name``
-and ``Image Name``::
+and ``Image Name`` or ``Image ID``::

     $ omero metadata populate Project:1 --file path/to/project.csv

@@ -102,7 +102,8 @@ project.csv::
     img-03.png,dataset01,0.093,3,TRITC
     img-04.png,dataset01,0.429,4,Cy5

-This will create an OMERO.table linked to the Project like this:
+This will create an OMERO.table linked to the Project like this with
+a new ``Image`` column with IDs:

 ========== ============ ======== ============= ============ =====
 Image Name Dataset Name ROI_Area Channel_Index Channel_Name Image
@@ -115,6 +116,9 @@ img-04.png dataset01 0.429 4 Cy5 36641

 If the target is a Dataset instead of a Project, the ``Dataset Name`` column is not needed.

+
+**Screen / Plate**
+
 To add a table to a Screen, the ``CSV`` file needs to specify ``Plate`` name and ``Well``.
 If a ``# header`` is specified, column types must be ``well`` and ``plate``.

@@ -142,36 +146,45 @@ Well Plate Drug Concentration Cell_Count Percent_Mitotic Well Name Plat

 If the target is a Plate instead of a Screen, the ``Plate`` column is not needed.

-If the target is an Image, a csv with ROI-level and object-level data can be used to create an
-``OMERO.table`` (bulk annotation) as a ``File Annotation`` on an Image.
-The ROI identifying column can be an ``roi`` type column containing ROI ID, and ``Roi Name``
-column will be appended automatically (see example below). Alternatively, the input column can be
+**ROIs**
+
+If the target is an Image or a Dataset, a ``CSV`` with ROI-level or Shape-level data can be used to create an
+``OMERO.table`` (bulk annotation) as a ``File Annotation`` linked to the target object.
+If there is an ``roi`` column (header type ``roi``) containing ROI IDs, an ``Roi Name``
+column will be appended automatically (see example below). If a column of Shape IDs named ``shape``
+of type ``l`` is included, the Shape IDs will be validated (and set to -1 if invalid).
+Also if an ``image`` column of Image IDs is included, an ``Image Name`` column will be added.
+NB: Columns of type ``shape`` aren't yet supported on the OMERO.server.
+
+Alternatively, if the target is an Image, the ROI input column can be
 ``Roi Name`` (with type ``s``), and an ``roi`` type column will be appended containing ROI IDs.
 In this case, it is required that ROIs on the Image in OMERO have the ``Name`` attribute set.

 image.csv::

-    # header roi,l,d,l
-    Roi,object,probability,area
-    501,1,0.8,250
-    502,1,0.9,500
-    503,1,0.2,25
-    503,2,0.8,400
-    503,3,0.5,200
+    # header roi,l,l,d,l
+    Roi,shape,object,probability,area
+    501,1066,1,0.8,250
+    502,1067,2,0.9,500
+    503,1068,3,0.2,25
+    503,1069,4,0.8,400
+    503,1070,5,0.5,200

 This will create an OMERO.table linked to the Image like this:

-=== ====== =========== ==== ========
-Roi object probability area Roi Name
-=== ====== =========== ==== ========
-501 1      0.8         250  Sample1
-502 1      0.9         500  Sample2
-503 1      0.2         25   Sample3
-503 2      0.8         400  Sample3
-503 3      0.5         200  Sample3
-=== ====== =========== ==== ========
-
-Note that the ROI-level ``OMERO.table`` is not visible in the OMERO.web UI right-hand panel, but can be visualized by clicking the "eye" on the bulk annotation attachment on the Image.
+=== ===== ====== =========== ==== ========
+Roi shape object probability area Roi Name
+=== ===== ====== =========== ==== ========
+501 1066  1      0.8         250  Sample1
+502 1067  2      0.9         500  Sample2
+503 1068  3      0.2         25   Sample3
+503 1069  4      0.8         400  Sample3
+503 1070  5      0.5         200  Sample3
+=== ===== ====== =========== ==== ========
+
+Note that the ROI-level data from an ``OMERO.table`` is not visible
+in the OMERO.web UI right-hand panel under the ``Tables`` tab,
+but the table can be visualized by clicking the "eye" on the bulk annotation attachment on the Image.

 Developer install
 =================
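
The ROI-level ``CSV`` layout described in the README change above (a ``# header`` type row, then a row of column names, then data) can be produced with Python's standard ``csv`` module. This is only a minimal sketch: the function name ``build_roi_csv`` is hypothetical and not part of omero-metadata, and the rows are the illustrative values from the README example.

```python
import csv
import io

# Illustrative rows taken from the README example:
# (ROI ID, Shape ID, object index, probability, area)
rows = [
    (501, 1066, 1, 0.8, 250),
    (502, 1067, 2, 0.9, 500),
    (503, 1068, 3, 0.2, 25),
    (503, 1069, 4, 0.8, 400),
    (503, 1070, 5, 0.5, 200),
]


def build_roi_csv(rows):
    """Return CSV text: a '# header' type row, column names, then data rows.

    Hypothetical helper for illustration only.
    """
    buf = io.StringIO()
    # Column types: roi, long (shape), long (object), double, long
    buf.write("# header roi,l,l,d,l\n")
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Roi", "shape", "object", "probability", "area"])
    writer.writerows(rows)
    return buf.getvalue()


csv_text = build_roi_csv(rows)
print(csv_text)
```

The resulting text could be saved to a file and passed to ``omero metadata populate`` with ``--file``, as in the README examples.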
107 changes: 96 additions & 11 deletions src/omero_metadata/populate.py
@@ -253,6 +253,7 @@ def create_columns_image(self):
         return self._create_columns("image")

     def _create_columns(self, klass):
+        target_class = self.target_object.__class__
         if self.types is not None and len(self.types) != len(self.headers):
             message = "Number of columns and column types not equal."
             raise MetadataError(message)
@@ -308,7 +309,7 @@ def _create_columns(self, klass):
                               self.DEFAULT_COLUMN_SIZE, list()))
                 # Ensure ImageColumn is named "Image"
                 column.name = "Image"
-            if column.__class__ is RoiColumn:
+            if column.__class__ is RoiColumn and target_class != DatasetI:
                 append.append(StringColumn(ROI_NAME_COLUMN, '',
                               self.DEFAULT_COLUMN_SIZE, list()))
                 # Ensure RoiColumn is named 'Roi'
@@ -446,7 +447,7 @@ def resolve(self, column, value, row):
             try:
                 return images_by_id[int(value)].id.val
             except KeyError:
-                log.debug('Image Id: %i not found!' % (value))
+                log.debug('Image Id: %s not found!' % (value))
                 return -1
             return
         if WellColumn is column_class:
@@ -458,6 +459,8 @@ def resolve(self, column, value, row):
             return self.wrapper.resolve_dataset(column, row, value)
         if RoiColumn is column_class:
             return self.wrapper.resolve_roi(column, row, value)
+        if column_as_lower == 'shape':
+            return self.wrapper.resolve_shape(value)
         if column_as_lower in ('row', 'column') \
                 and column_class is LongColumn:
             try:
@@ -757,8 +760,36 @@ def __init__(self, value_resolver):
         super(DatasetWrapper, self).__init__(value_resolver)
         self.images_by_id = dict()
         self.images_by_name = dict()
+        self.rois_by_id = None
+        self.shapes_by_id = None
         self._load()

+    def resolve_roi(self, column, row, value):
+        # Support Dataset table with known ROI IDs
+        if self.rois_by_id is None:
+            self._load_rois()
+        try:
+            return self.rois_by_id[int(value)].id.val
+        except KeyError:
+            log.warn('Dataset is missing ROI: %s' % value)
+            return -1
+        except ValueError:
+            log.warn('Wrong input type for ROI ID: %s' % value)
+            return -1
+
+    def resolve_shape(self, value):
+        # Support Dataset table with known Shape IDs
+        if self.rois_by_id is None:
+            self._load_rois()
+        try:
+            return self.shapes_by_id[int(value)].id.val
+        except KeyError:
+            log.warn('Dataset is missing Shape: %s' % value)
+            return -1
+        except ValueError:
+            log.warn('Wrong input type for Shape ID: %s' % value)
+            return -1
+
     def get_image_id_by_name(self, iname, dname=None):
         return self.images_by_name[iname].id.val
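
The `resolve_roi` and `resolve_shape` methods added in this hunk share one pattern: convert the CSV value to an integer, look it up among the IDs loaded from the server, and fall back to `-1` with a warning when either step fails. A standalone sketch of that pattern follows, assuming a plain set of known IDs in place of the OMERO objects; the name `resolve_id` is hypothetical and does not exist in omero-metadata.

```python
import logging

log = logging.getLogger("populate_sketch")


def resolve_id(known_ids, value, label="ROI"):
    """Return int(value) if it is a known ID, otherwise -1.

    Hypothetical helper: `known_ids` stands in for the keys of
    rois_by_id / shapes_by_id in the wrappers above.
    """
    try:
        obj_id = int(value)
    except ValueError:
        # e.g. a non-numeric string in the CSV cell
        log.warning("Wrong input type for %s ID: %s", label, value)
        return -1
    if obj_id not in known_ids:
        # Valid integer, but no such object under the target
        log.warning("Target is missing %s: %s", label, value)
        return -1
    return obj_id


known = {501, 502, 503}
resolved = [resolve_id(known, v) for v in ["501", "999", "abc"]]
```

Returning `-1` rather than skipping the row keeps every CSV row in the table, with invalid IDs marked, which matches the README wording that invalid Shape IDs are "set to -1".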

@@ -800,12 +831,48 @@ def _load(self):
             images_by_id[iid] = image
             if iname in self.images_by_name:
                 raise Exception("Image named %s(id=%d) present. (id=%s)" % (
-                    iname, self.images_by_name[iname], iid
+                    iname, self.images_by_name[iname].id.val, iid
                 ))
             self.images_by_name[iname] = image
         self.images_by_id[self.target_object.id.val] = images_by_id
         log.debug('Completed parsing dataset: %s' % self.target_name)

+    def _load_rois(self):
+        log.debug('Loading ROIs in Dataset:%d' % self.target_object.id.val)
+        self.rois_by_id = {}
+        self.shapes_by_id = {}
+        query_service = self.client.getSession().getQueryService()
+        parameters = omero.sys.ParametersI()
+        parameters.addId(self.target_object.id.val)
+        data = list()
+        while True:
+            parameters.page(len(data), 1000)
+            rv = unwrap(query_service.projection((
+                'select distinct i, r, s '
+                'from Shape s '
+                'join s.roi as r '
+                'join r.image as i '
+                'join i.datasetLinks as dil '
+                'join dil.parent as d '
+                'where d.id = :id order by s.id desc'),
+                parameters, {'omero.group': '-1'}))
+            if len(rv) == 0:
+                break
+            else:
+                data.extend(rv)
+        if not data:
+            raise MetadataError("No ROIs on images in target Dataset")
+
+        for image, roi, shape in data:
+            # we only care about *IDs* of ROIs and Shapes in the Dataset
+            rid = roi.id.val
+            sid = shape.id.val
+            self.rois_by_id[rid] = roi
+            self.shapes_by_id[sid] = shape
+
+        log.debug('Completed loading ROIs and Shapes in Dataset: %s'
+                  % self.target_object.id.val)
+

 class ProjectWrapper(PDIWrapper):
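
The `_load_rois` method in the hunk above pages through query results 1000 rows at a time, using the number of rows collected so far as the next offset and stopping at the first empty page. The same loop can be sketched without a server. Here `fetch_all` and `fake_fetch_page` are hypothetical names, and the in-memory list stands in for the query service.

```python
def fetch_all(fetch_page, page_size=1000):
    """Collect rows page by page until an empty page is returned.

    `fetch_page(offset, limit)` is a hypothetical callable standing in
    for a paged query; the offset is the count of rows already fetched.
    """
    data = []
    while True:
        page = fetch_page(len(data), page_size)
        if not page:
            break
        data.extend(page)
    return data


# In-memory stand-in for a result set of 2500 rows
records = list(range(2500))


def fake_fetch_page(offset, limit):
    return records[offset:offset + limit]


all_rows = fetch_all(fake_fetch_page)
```

Offset-based paging of this kind relies on a stable ordering between requests, which is presumably why the query in the hunk ends with `order by s.id desc`.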

@@ -894,6 +961,7 @@ class ImageWrapper(ValueWrapper):
     def __init__(self, value_resolver):
         super(ImageWrapper, self).__init__(value_resolver)
         self.rois_by_id = dict()
+        self.shapes_by_id = dict()
         self.rois_by_name = dict()
         self.ambiguous_naming = False
         self._load()
@@ -904,15 +972,25 @@ def get_roi_id_by_name(self, rname):
     def get_roi_name_by_id(self, rid):
         return unwrap(self.rois_by_id[rid].name)

+    def resolve_shape(self, value):
+        try:
+            return self.shapes_by_id[int(value)].id.val
+        except KeyError:
+            log.warn('Image is missing Shape: %s' % value)
+            return -1
+        except ValueError:
+            log.warn('Wrong input type for Shape ID: %s' % value)
+            return -1
+
     def resolve_roi(self, column, row, value):
         try:
             return self.rois_by_id[int(value)].id.val
         except KeyError:
             log.warn('Image is missing ROI: %s' % value)
-            return Skip()
+            return -1
         except ValueError:
             log.warn('Wrong input type for ROI ID: %s' % value)
-            return Skip()
+            return -1

     def _load(self):
         query_service = self.client.getSession().getQueryService()
@@ -930,9 +1008,10 @@ def _load(self):
         while True:
             parameters.page(len(data), 1000)
             rv = query_service.findAllByQuery((
-                'select distinct r from Image as i '
-                'join i.rois as r '
-                'where i.id = :id order by r.id desc'),
+                'select distinct s from Shape as s '
+                'join s.roi as r '
+                'join r.image as i '
+                'where i.id = :id order by s.id desc'),
                 parameters, {'omero.group': '-1'})
             if len(rv) == 0:
                 break
@@ -943,15 +1022,19 @@ def _load(self):

         rois_by_id = dict()
         rois_by_name = dict()
-        for roi in data:
+        shapes_by_id = dict()
+        for shape in data:
+            roi = shape.roi
             rid = roi.id.val
             rois_by_id[rid] = roi
+            shapes_by_id[shape.id.val] = shape
             if unwrap(roi.name) in rois_by_name.keys():
                 log.warn('Conflicting ROI names.')
                 self.ambiguous_naming = True
             rois_by_name[unwrap(roi.name)] = roi
         self.rois_by_id = rois_by_id
         self.rois_by_name = rois_by_name
+        self.shapes_by_id = shapes_by_id
         log.debug('Completed parsing image: %s' % self.target_name)
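
Because the loop in the hunk above now iterates over Shapes rather than ROIs, an ROI with several Shapes is visited once per Shape. As written, the name check therefore appears to report "Conflicting ROI names" for a single ROI that has two or more Shapes, since its name is already in `rois_by_name` on the second visit. One possible way to avoid that, sketched here with hypothetical `Roi` and `Shape` namedtuples in place of the OMERO model objects, is to run the name check only the first time an ROI ID is seen. This is an illustration, not the change made in this commit.

```python
from collections import namedtuple

# Hypothetical stand-ins for the OMERO Roi and Shape model objects
Roi = namedtuple("Roi", ["id", "name"])
Shape = namedtuple("Shape", ["id", "roi"])


def index_shapes(shapes):
    """Build ID/name lookups from Shapes, flagging only real name clashes."""
    rois_by_id = {}
    rois_by_name = {}
    shapes_by_id = {}
    ambiguous_naming = False
    for shape in shapes:
        roi = shape.roi
        shapes_by_id[shape.id] = shape
        if roi.id in rois_by_id:
            # Same ROI reached through another Shape: not a naming conflict
            continue
        rois_by_id[roi.id] = roi
        if roi.name in rois_by_name:
            # A different ROI already uses this name
            ambiguous_naming = True
        rois_by_name[roi.name] = roi
    return rois_by_id, rois_by_name, shapes_by_id, ambiguous_naming


roi_a = Roi(503, "Sample3")
one_roi = index_shapes([Shape(1068, roi_a), Shape(1069, roi_a)])
two_rois = index_shapes([Shape(1068, roi_a), Shape(1071, Roi(504, "Sample3"))])
```

In the first call a single ROI with two Shapes is indexed without being flagged; in the second, two distinct ROIs sharing a name are flagged as ambiguous.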


@@ -1148,8 +1231,8 @@ def preprocess_data(self, reader):
                     if isinstance(value, basestring):
                         column.size = max(
                             column.size, len(value.encode('utf-8')))
-                    # The following are needed for
-                    # getting post process column sizes
+                    # The following IDs are needed for
+                    # post_process() to get column sizes for names
                     if column.__class__ is WellColumn:
                         column.values.append(value)
                     elif column.__class__ is ImageColumn:
@@ -1164,6 +1247,8 @@ def preprocess_data(self, reader):
                     log.error('Original value "%s" now "%s" of bad type!' % (
                         original_value, value))
                     raise
+            # we call post_process on each single (mostly empty) row
+            # to get ids -> names
             self.post_process()
             for column in self.columns:
                 column.values = []