Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use 'Roi' for name of RoiColumn #56

Merged
merged 5 commits into from
Jun 14, 2021
Merged

Conversation

will-moore
Copy link
Member

@will-moore will-moore commented Mar 5, 2021

This uses the name "Roi" for the RoiColumn, using the same convention we have adopted of "Image" for the ImageColumn.

I noticed @sbesson had 'Roi' column on his screenshot at ome/omero-web#264 (comment)

I was looking at loading a row for a particular ROI using:

/webgateway/table/Image/2567/query/?query=Roi-225611

To test:

# header roi,d,l,s
roi,ROI_Area,Channel_Index,Channel_Name
418817,0.0469,1,DAPI
418818,0.142,2,GFP
418819,0.093,3,TRITC
418820,0.093,3,ACA
418821,0.093,3,foo
  • Run the Populate_Metadata.py script on this image (can use the webclient UI and this also tests Populate metadata Image dtype omero-scripts#184).
  • Open the Table in webclient and check that the first column is named Roi (not roi) - and that rest of the table is correct.

@sbesson
Copy link
Member

sbesson commented Mar 8, 2021

In terms of API,

if column_type.lower() in COLUMN_TYPES:
will ensure the detection of any lowercase/uppercase variant of a known column types. In other terms, ROI, Roi or roi should all be valid headers of a RoiColumn and the downstream code should not make any assumption on the capitalization of these column names.

I think the primary case where the population code needs to make a decision is when the input CSV specifies a ROI name and the matching ROI ID is populated as an extra column. In this case, this PR adjusts the code to use a convention similar to the one we adopted in #10 which is completely fine.

No objection to having the example snippets of the README unified to use capitals. Should the other headings be adjusted accordingly?

@will-moore
Copy link
Member Author

I think it will be important to know that ROI columns are named Roi, so that in the future we can query like /webgateway/table/Image/2567/query/?query=Roi-225611 to get the row for the ROI and we don't also have to query with /webgateway/table/Image/2567/query/?query=roi-225611. For Image column, the webclient always has to do 2 queries for every lookup (Image and image) since we have supported both in the past.
I really want to avoid the same mistake for Roi.
So, even when the column naming may not be required for the population code (when we have an ROI column defined by the header) I wanted the example csv in the README to use Roi.
Maybe the populate functionality should enforce this?

@will-moore
Copy link
Member Author

I wanted to check with @erindiel that using Roi doesn't break any API that Glencoe are using?

@sbesson
Copy link
Member

sbesson commented Mar 17, 2021

For complete reference, the PR that @will-moore is referring to in #56 (comment) is ome/openmicroscopy#5832.

No objection to a PR that would aim the column names from my side but I think will need to make a few decisions:

  • should such a standard only apply to ROI/Roi/roi column or other primary column types as well?
  • will we need some form of upgrade script for existing OMERO.tables created with a different variant column name?
  • should a column names roi in an input CSV be renamed or duplicated as a new Roi column?

Alternatively, would a case-insensitive bulk annotation column query be an option i.e. loosen the assumptions made by the client to meet the ones made in this plugin?

@erindiel
Copy link
Member

Thanks @will-moore. It will be an easy switch to ignore case or assume uppercase, based on what you decide here.

@will-moore
Copy link
Member Author

With those commits:

  • I have enforced that Plate, Well, Image and Roi columns are named "Plate", "Well", "Image" and "Roi", regardless of the name of the column in csv. Column can be set as RoiColumn if the # header has roi or the column is named e.g. roi.
  • I don't see a need to duplicate columns to preserve names. Having e.g. multiple RoiColumns seems very confusing.
  • We could consider an upgrade script to rename columns in existing tables. I don't see a great need for this for roi -> Roi since I don't know that there are many tables already created with roi. There might be a case for updating image -> Image, in order to remove the duplicate queries we are currently performing.

In some ways I like the idea that e.g. /webgateway/table/Dataset.imageLinks.child/2567/query/?query=image-2567 would query the table for Image=2567 AND image=2567, so you didn't have to worry about it. This would simplify the current webclient code. But which columns would we apply this logic to? Certainly 'WellandImage. But do we also add it to RoiorPlate` if we know we don't need it?

Also, if it applies in some API calls, e.g.
http://idr.openmicroscopy.org/webgateway/table/Project.datasetLinks.child.imageLinks.child/1884807/query/?query=image-1884807 should it also apply in other API calls (e.g. Java etc) or
http://idr.openmicroscopy.org/webclient/omero_table/14209193/?query=image==1884807
In the second case there, the query is not parsed at-all. In the first case image-1884807 is converted to image==1884807, although ?query=image==1884807 works too.

For all the webclient changes discussed, I'll open a PR...

@sbesson
Copy link
Member

sbesson commented Mar 23, 2021

I need to look more into ome/omero-web#277 but I think the combination of being strict on writing and lenient on reading as proposed is generally the safest approach to these problems as long as it does not result into performance degradation.

As we are using both Roi and ROI across our code base, the former version is used here because it is the closest to how it's defined in the OMERO model?

@will-moore
Copy link
Member Author

I don't think we're using ROI anywhere. Just roi and Roi.
I propose we use Roi because it's consistent with using Image and Well elsewhere.

@sbesson
Copy link
Member

sbesson commented Mar 23, 2021

I think the difference is

  • ROI comes from the OME Data Model and it is the spelling used both in the web client (ROI count) and OMERO.iviewer (ROIs)

  • Roi comes from the OMERO model and is the capitalization used for all code-generated API e.g. RoiColumn

I think it's equally fine to use either form, I was just wondering whether we should handle /?query=ROI-100233 like the other forms.

@sbesson
Copy link
Member

sbesson commented Mar 25, 2021

Tested a few HCS and non HCS use cases with a combinations of headers/no-headers.
For a screen with 2 plates and 4 wells each

# header plate,well,s,d
plate_name,well_name,Drug,Concentration
Plate Name 0,A1,DMSO,0.0
Plate Name 0,A2,DMSO,0.5
Plate Name 0,B1,DMSO,1.0
Plate Name 0,B2,DMSO,1.5
Plate Name 1,A1,H2O,0.0
Plate Name 1,A2,H2O,0.5
Plate Name 1,B1,H2O,1.0
Plate Name 1,B2,H20,1.5
plate,well,Drug,Concentration
Plate Name 0,A1,DMSO,0.0
Plate Name 0,A2,DMSO,0.5
Plate Name 0,B1,DMSO,1.0
Plate Name 0,B2,DMSO,1.5
Plate Name 1,A1,H2O,0.0
Plate Name 1,A2,H2O,0.5
Plate Name 1,B1,H2O,1.0
Plate Name 1,B2,H20,1.5

For a single plate with 4 wells:

# header well,s,d
well_name,Drug,Concentration
A1,DMSO,0.0
A2,DMSO,0.5
B1,DMSO,1.0
B2,DMSO,1.5
well,Drug,Concentration
A1,DMSO,0.0
A2,DMSO,0.5
B1,DMSO,1.0
B2,DMSO,1.5

For an image with 2 ROIs

# header roi,d
id,area
421612,10
421613,20
roi,area
421612,10
421613,20
Roi Name,area
test1,10
test2,20

In all cases, the column names populated by either versions of these CSV were identical with the only variation being the ordering of the columns. All the Plate, Well, Roi column names were capitalized as expected.

Overall, I think normalizing the name of our specialized columns makes complete sense. The only scenario I can think of where this might lead to confusion is if another StringColumn was defined with a reserved name but I think the code should throw an error in that case e.g.

# header plate,well,s,d
plate_name,well_name,Plate,Concentration

The only remaining place that might be worth updating is the ADDED_COLUMN_TYPES list although I am unsure of where this is used.

In terms of test coverage, can a few asserts be added to the integration tests similarly to d511b69 to check columns with the correct names are created?

Then I think this should be good to merge.

@will-moore
Copy link
Member Author

@sbesson Apologies for dropping this for a while.
I added checks to assert the OMERO.table column names are as expected.

I'm not quite sure what you mean by ADDED_COLUMN_TYPES as I don't see that anywhere in this repo.

@will-moore
Copy link
Member Author

Hmmm - The build/tests don't seem to be running on that last commit.... ?

@sbesson
Copy link
Member

sbesson commented May 27, 2021

I'm not quite sure what you mean by ADDED_COLUMN_TYPES as I don't see that anywhere in this repo.

Sorry I meant ADDED_COLUMN_NAMES

Hmmm - The build/tests don't seem to be running on that last commit.... ?

Yes the GitHub workflow auto-disabled itself after 60 days of inactivity. I re-enabled it manually. Still considering how to automate this across the board as this is causing periodic issues

@sbesson sbesson closed this Jun 14, 2021
@sbesson sbesson reopened this Jun 14, 2021
@sbesson sbesson merged commit c697e97 into ome:master Jun 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants