Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Tracking Layer #286

Merged
merged 23 commits into from
Dec 15, 2023
Merged

Fix Tracking Layer #286

merged 23 commits into from
Dec 15, 2023

Conversation

stefanhahmann
Copy link
Contributor

The cluster layer was not generated for layers of type tracking layer in function generate_cluster_tracks()

This was caused by bugs in 2 functions():

  • The label ids were wrongly taken from the cluster id column instead of the "labels" column in generate_cluster_tracks()
  • The generate_cluster_image() function did not map the label ids to the cluster ids in the case of tracking layers

This PR adds a bugfix for this. In order to do so also some recent changes regarding performance optimization had to be reverted (1c2ec0f, dcd83c1, 5fee912, 0800f48)

@thorstenwagner there is now a unit test for the cluster plotter in combination with tracking layers. Perhaps you could try, if you could make your performance optimizations with this as well?

The PR additionally adds some unit tests to check for bugs that had recently been introduced.

* The generate_cluster_tracks() function did not work with the version without map_array
* A unit test is added to proof this
Copy link

codecov bot commented Dec 13, 2023

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (73ab55f) 76.39% compared to head (68751a7) 77.17%.

Files Patch % Lines
napari_clusters_plotter/_utilities.py 33.33% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #286      +/-   ##
==========================================
+ Coverage   76.39%   77.17%   +0.78%     
==========================================
  Files          16       16              
  Lines        1860     1893      +33     
==========================================
+ Hits         1421     1461      +40     
+ Misses        439      432       -7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Cryaaa
Copy link
Collaborator

Cryaaa commented Dec 14, 2023

Hey @stefanhahmann ,
thanks for catching this but before we revert all of the changes I have some questions: what exact errors are you getting when using tracking data? because I was testing the improved version and it actually worked on timelapse data which is almost identical to tracking data from the computation so I wonder what the exact problem is. Maybe it is just the problem with the labels that are not extracted properly as you mentioned:

  • The label ids were wrongly taken from the cluster id column instead of the "labels" column in generate_cluster_tracks()

Will do some testing now and report back.

@Cryaaa
Copy link
Collaborator

Cryaaa commented Dec 14, 2023

Okay so some testing on the very simple tracking example you implemented in the unit test showed me that actually the problem did not lie with the generate_cluster_immage function but indeed was caused by the wrong extraction of labels in the generate_cluster_tracks function.

  • The generate_cluster_image() function did not map the label ids to the cluster ids in the case of tracking layers

Regarding this: do you maybe have a simple tracking example dataset you might want to share which I can use for testing? After I have tested this in detail I will adjust the code and merge. Thanks for the unit tests again!

@thorstenwagner
Copy link
Collaborator

thorstenwagner commented Dec 14, 2023

Hm, I'm a bit confused. I just checkout your branch, put my new generate_cluster_image back in and all tests passed.

collecting ... collected 4 items

../../../../../../../../../home/twagner/Projects/napari-clusters-plotter/src/napari-clusters-plotter/napari_clusters_plotter/_tests/test_utilities.py::test_map_array_3d 
../../../../../../../../../home/twagner/Projects/napari-clusters-plotter/src/napari-clusters-plotter/napari_clusters_plotter/_tests/test_utilities.py::test_map_array_4d 
../../../../../../../../../home/twagner/Projects/napari-clusters-plotter/src/napari-clusters-plotter/napari_clusters_plotter/_tests/test_utilities.py::test_generate_cluster_image 
../../../../../../../../../home/twagner/Projects/napari-clusters-plotter/src/napari-clusters-plotter/napari_clusters_plotter/_tests/test_utilities.py::test_generate_cluster_tracks 

======================== 4 passed, 3 warnings in 1.48s =========================

Edit: Ah, @Cryaaa found the reason.

@Cryaaa
Copy link
Collaborator

Cryaaa commented Dec 14, 2023

@stefanhahmann,
I have changed this PR a bit now, the generate cluster image function will be kept as the updated one from @thorstenwagner but the other bug fix will be included. I added your new tests to the test_utils.py file to have everything in one place and have removed duplicate tests and test cases that do not apply to the plotter (e.g. just testing the generate cluster image function on 4D data). I would love to hear if this now works for your data!

* The unit test before had two time frames containing each a 3d image. Both 3d images contained *all* labels that occur in the full timelapse
* The unit test now has two time frames containing each a 3d image. The 3d image of the first timeframe now does not contain all labels that occur in the full timelapse.
…_cluster_image function

* int(np.max(label_image)) could lead to an index out of bounds error, in case where a single timeframe does not contain all labels of the complete timelapse
@stefanhahmann
Copy link
Contributor Author

Hm, I'm a bit confused. I just checkout your branch, put my new generate_cluster_image back in and all tests passed.

Sorry for the confusion. I updated the respective unit test so that it actually fails, cf.: 1d952d7

The unit test did pass, since both timeframes included in it had all labels of the complete timelapse. If a single timeframe does not have all timeframes of the complete timelapse, it causes an array index out of bounds error.

@Cryaaa and @thorstenwagner in b6e5f60 I have committed a suggestion how to fix this error with the new generate_cluster_image() version, but this breaks other tests: https://github.com/BiAPoL/napari-clusters-plotter/actions/runs/7212283328/job/19649645462 (test_cluster_image_generation_unsorted_non_sequential_labels() and test_generate_cluster_4d_labels())

Only with generate_cluster_image_old() all tests pass at the moment, cf.: 5957977 and https://github.com/BiAPoL/napari-clusters-plotter/actions/runs/7212550935/job/19650463473

Currently, I have no good idea how to make the new version of generate_cluster_image() pass all tests.

@thorstenwagner
Copy link
Collaborator

With my fix, all tests pass.

The problem is that my method assumes that label_list only contains labels that are actually part of the label_image, which is not the case for your track data. I think it should, and the implementation of generate_cluster_tracks may be buggy as it produces the same label_list for each time point, regardless of the actual labels at that time point. However, I have not had a close look yet.

@thorstenwagner
Copy link
Collaborator

Regarding my last commit, its indeed almost twice as fast:

image

@thorstenwagner
Copy link
Collaborator

thorstenwagner commented Dec 14, 2023

Regarding my very last commit of today: With that we reduce the runtime to

4.42 µs ± 16 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

However, this is probably not relevant for the total runtime of that method. Anyway ... 🤓

@thorstenwagner
Copy link
Collaborator

I just want to mention: As soon this is merged, we can probably go back to the 'old' map_array function :-)

@stefanhahmann
Copy link
Contributor Author

With my fix, all tests pass.

Thanks for the fix.

The problem is that my method assumes that label_list only contains labels that are actually part of the label_image, which is not the case for your track data.

Thanks for clarifying the assumptions of the new generate_cluster_image() method.

I think it should, and the implementation of generate_cluster_tracks may be buggy as it produces the same label_list for each time point, regardless of the actual labels at that time point. However, I have not had a close look yet.

With the old implementation it did not matter, which labels were contained at which timepoint. However, I agree, that including only those labels that exist in the considered timepoint should be included. generate_cluster_tracks() can be optimized in this regard. Thanks to you fix, functionality does not currently depend on this optimization.

@Cryaaa
Copy link
Collaborator

Cryaaa commented Dec 15, 2023

@thorstenwagner, @stefanhahmann: I think we can merge this now that you have fixed some of the more specific details?

@thorstenwagner
Copy link
Collaborator

Lgtm, from my side you can merge.

@Cryaaa Cryaaa merged commit e195b69 into main Dec 15, 2023
8 checks passed
@stefanhahmann stefanhahmann deleted the tracking_layer branch December 18, 2023 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants