-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bulk annotations and streaming #5
Conversation
@@ -24,12 +24,16 @@ | |||
|
|||
""" | |||
|
|||
import csv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This import causes the Python code compliance step to fail in Travis
@kkoz thanks for your first contribution to this plugin. I commented on one import which causes the Travis build to fail. Once CI passes, we can look into testing these changes functionally. A couple of immediate questions:
|
@sbesson Thanks for the notes. I will be making a commit shortly with the unnecessary import removed. For future reference, is there any way to check if my code will pass a travis-ci check without doing a commit? I've responded to your questions/comments below.
|
@@ -29,7 +29,9 @@ | |||
|
|||
from omero.model import PlateI, WellI, WellSampleI, OriginalFileI | |||
from omero.rtypes import rint, rstring, unwrap | |||
from omero.util.populate_metadata import ParsingContext | |||
from omero.util.populate_metadata import (ParsingContext, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realizing that this still imports the legacy OMERO.py classes rather than the class in the package. Largely, something I missed in #1 as I did not activate these tests. Can you try to update this line to
from populate_metadata import (ParsingContext,
...
``
and similarly in `test/integration/metadata/test_populate.py` so that your new factory class is picked up?
NB: in a follow-up PR, we'll probably want to look into properly namespacing these packages.
@kkoz for pre-testing, assuming you have Docker installed, you should be able to test your changes locally by running the same calls as the ones executed in
This is going to consume the omero-test-infra infrastructure to 1- create a Docker-based OMERO environment (DB, server, web), 2- install the plugin and run Python compliance checks, 3- execute the integration tests. |
@sbesson Thanks for the info. I ran |
@kkoz looks great now. Re The version of |
src/populate_metadata.py
Outdated
class ParsingContext(object): | ||
"""Generic parsing context for CSV files.""" | ||
|
||
def __init__(self, client, target_object, file=None, fileid=None, | ||
def __init__(self, client, target_object, | ||
parsing_util_factory, file=None, fileid=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is parsing_util_factory
a new mandatory argument in ParsingContext
as proposed here or an optional key/value pair as suggested in the test updates. Trying to consume this branch using the CLI wrapper against a project, I get the following error:
bash-4.2$ pip install git+git://github.com/kkoz/omero-metadata.git@bulk-annotations-and-streaming#egg=omero-metadata --user
Collecting omero-metadata from git+git://github.com/kkoz/omero-metadata.git@bulk-annotations-and-streaming#egg=omero-metadata
...
bash-4.2$ OMERO-server/OMERO.server/bin/omero metadata populate --report --batch=10000 --file IDR-import/idr0021-lawo-pericentriolarmaterial/experimentA/idr0021-experimentA-annotation.csv Project:1
Using session for root@localhost:4064. Idle timeout: 10 min. Current group: system
Traceback (most recent call last):
File "OMERO-server/OMERO.server/bin/omero", line 125, in <module>
rv = omero.cli.argv()
File "/home/omero/workspace/OMERO-server/OMERO.server/lib/python/omero/cli.py", line 1620, in argv
cli.invoke(args[1:])
File "/home/omero/workspace/OMERO-server/OMERO.server/lib/python/omero/cli.py", line 1095, in invoke
stop = self.onecmd(line, previous_args)
File "/home/omero/workspace/OMERO-server/OMERO.server/lib/python/omero/cli.py", line 1172, in onecmd
self.execute(line, previous_args)
File "/home/omero/workspace/OMERO-server/OMERO.server/lib/python/omero/cli.py", line 1254, in execute
args.func(args)
File "/home/omero/.local/lib/python2.7/site-packages/omero_cli_metadata.py", line 482, in populate
options=localcfg)
TypeError: __init__() takes at least 4 arguments (9 given)
NB: this also reveals the absence of integration test for the CLI metadata populate
subcommand which would have picked up this error. This is independent of this PR but I will try and produce an extra commit to cover this use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention was to have it be mandatory. This was probably a bit of premature refactoring on my part. I will see if I can change the code to restore the signature of __init__
in ParsingContext. My eventual goal was to refactor the code such that it allowed for unit testing. But that is outside the scope of this PR so I'll try to change it.
…ignature the same as before
A couple of notes from the discussion with @emilroz @kkoz and @joshmoore. Generally, in the absence of any other outstanding requirement, we are starting to define the roadmap for a
In terms of functional testing perspective, while the latest Testing the last commit, the default
As discussed, no major objection to removing the |
…() for all contexts. Updated omero_cli_metadata.py and integration tests to no longer call write_to_omero.
@sbesson I made changes to solve the problem, but I'm having trouble testing them. When I run: |
@kkoz about the invalid choice issue, immediate question: does
For both of these, the raw data is also available if needed. |
@sbesson I successfully imported the idr0021datasets into omero. I added them to a project I created and ran |
@kkoz excellent. I'll have a go at retesting the annotation workflow (might be tomorrow). |
Tested these changes against the
For this type of annotation i.e. P/D/I with |
Executed a similar workflow to #5 (comment) against a screen ( I will perform a final round of code review on Monday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few inline comments. Proposing to merge this as it is especially as further refactoring is planned down the line /cc @joshmoore
options=localcfg) | ||
ctx.parse() | ||
loops = 0 | ||
ms = 0 | ||
if not args.dry_run: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly the --dry-run
argument is now largely a no-op, similarly to the -info
command. Primary question is whether we might want to resurrect it at some point via some constructor option of equivalent.
WELL_NAME_COLUMN, | ||
DATASET_NAME_COLUMN, | ||
IMAGE_NAME_COLUMN, | ||
'image'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update this line in my #10 once merged
|
||
original_file = table.getOriginalFile() | ||
file_annotation = FileAnnotationI() | ||
file_annotation.ns = rstring( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated to this PR, it should be possible to consume omero.constants.namespaces.NSBULKANNOTATIONS
Dataset Project only supported for omero-metadata external plugin. Deal with breaking change in omero-metadata: ome/omero-metadata#5
Added changes to support bulk annotations for projects and datasets, and to change csv reading to process the data line by line instead of buffering the data in memory and writing to the table all at once.