
Edge effects #2

Open · vaxenburg opened this issue Dec 26, 2024 · 11 comments

@vaxenburg

Hi Will, I'm seeing some strange edge effects depending on which side of the affinities volume is padded with zeros:

  • First row: top and left edges of the input affinities are padded with zeros (colored black), segmentation looks good, no edge effects.
  • Second row: same affinities but rotated by 180 degrees so the padding is on the opposite side now. Edge effects appear.

What do you think?

[image: padded affinities and resulting segmentations for the two cases]
@pattonw (Owner) commented Dec 30, 2024

I expect this is because rotation is not a trivial operation on affinities.
Each channel in your affinities represents a vector offset; we call a set of these a neighborhood. I'm assuming you're using [[0,1],[1,0]] as your neighborhood here.
If you simply rotate the array representing the affinities and don't rotate the vectors in your neighborhood at the same time, you get inconsistent affinities.

For example:
Given labels:

1 1 1
2 2 2
2 2 2

You would get affinities:

offset: [0,1]
1 1 0
1 1 0
1 1 0

offset: [1, 0]
0 0 0
1 1 1
0 0 0

Rotating the affinities would give:

offset: [0,1]
0 1 1
0 1 1
0 1 1
Note that the 0s in column 0 now indicate that there are no shared objects between columns 0 and 1. These were only 0 in the first place due to the "padding", since we don't know whether the labels continue out of bounds of the image.
The 1s in column 2 are now ignored: they would indicate that the object in column 2 is the same as the object in column 3, but column 3 is out of bounds of our image, so they aren't used.


offset: [1, 0]
0 0 0
1 1 1
0 0 0

If we now try to convert these back into labels we would get something like:

1 2 2
3 4 4
3 4 4

You could either rotate the associated offsets: [[0,1],[1,0]] -> [[0,-1],[-1,0]], or what I usually do for simplicity is rotate the labels before generating the affinities.
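
A minimal numpy sketch of this check (affinities_from_labels is a hypothetical helper for illustration, not part of mwatershed, and assumes non-negative 2D offsets):

```python
import numpy as np

def affinities_from_labels(labels, offsets):
    """Affinity is 1 where a pixel and its neighbor at `offset` share a label.

    Hypothetical helper for illustration only; assumes non-negative 2D
    offsets. Out-of-bounds pairs stay 0, matching the worked example above.
    """
    h, w = labels.shape
    affs = np.zeros((len(offsets), h, w), dtype=int)
    for c, (dy, dx) in enumerate(offsets):
        affs[c, : h - dy, : w - dx] = labels[: h - dy, : w - dx] == labels[dy:, dx:]
    return affs

labels = np.array([[1, 1, 1],
                   [2, 2, 2],
                   [2, 2, 2]])
offsets = [(0, 1), (1, 0)]
affs = affinities_from_labels(labels, offsets)

# Rotating the affinity arrays alone does not match the affinities generated
# from the rotated labels: the offset vectors no longer point the right way.
rotated_arrays_only = affs[:, ::-1, ::-1]
consistent = affinities_from_labels(labels[::-1, ::-1], offsets)
print(np.array_equal(rotated_arrays_only, consistent))  # False
```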

@vaxenburg (Author) commented Dec 30, 2024

Right, of course the offsets have directions and should be rotated as well. Thanks for the correction.

Actually, I was trying to use the rotation example to show a different problem and I realize it was a bit misleading. Here is the same problem without rotations:

If I pad affinities with zeros and then do watershed with a bias (0.5 in this example), I get edge effects (at the right and bottom edges only):

```python
# First row in the image.
affs = dataset.to_ndarray(roi=dataset.roi.grow(48, 48), fill_value=0)  # Pad with 0
segmentation = mwatershed.agglom(
    affinities=affs - 0.5,  # Bias 0.5
    offsets=[(1, 0, 0), (0, 1, 0), (0, 0, 1)],
)
```

but I realized that if I set fill_value = bias, the problem goes away:

```python
# Second row in the image.
affs = dataset.to_ndarray(roi=dataset.roi.grow(48, 48), fill_value=0.5)  # Pad with 0.5
segmentation = mwatershed.agglom(
    affinities=affs - 0.5,  # Bias 0.5
    offsets=[(1, 0, 0), (0, 1, 0), (0, 0, 1)],
)
```
[image: segmentation with fill_value=0 (first row) vs fill_value=0.5 (second row)]

@pattonw (Owner) commented Dec 30, 2024

Ah, OK. Now this looks like it is due to splitting and merging with equal weight at the edges.

If you pad with zeros, use a bias of 0.5, and your affs are very confident at 1, then you might be merging into the padded region along one axis with weight 0.5 but splitting along a second axis with weight -0.5. Since the absolute values are the same, the processing order is random and you get annoying artifacts.

If you pad with your bias term instead, the padded affs will have absolute weight near 0 and be processed last, avoiding any false splits they might cause.
Assuming you don't change the bias from 0.5, I expect any positive padding value (e.g. 0.1) would give basically the same result, because it also leads to the padded affs having a smaller absolute weight after subtracting the bias, so any split edges in the padded region are processed after your objects have been merged together. On the other hand, if you used a negative value in the padding (e.g. -0.1), I'd expect the entire boundary to be covered in little fragments.
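
A minimal sketch of this comparison (assuming affs is a (3, z, y, x) float array of predicted affinities in [0, 1]; agglom is called exactly as in the snippets above):

```python
import numpy as np
import mwatershed

bias = 0.5
offsets = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
pad = ((0, 0), (48, 48), (48, 48), (48, 48))  # pad spatial axes only

# Zero padding: padded voxels get weight 0 - 0.5 = -0.5, the same magnitude
# as a fully confident merge (1 - 0.5 = +0.5), so tie order at the border
# is effectively random.
affs_zero = np.pad(affs, pad, constant_values=0.0)
seg_zero = mwatershed.agglom(affinities=affs_zero - bias, offsets=offsets)

# Bias padding: padded voxels get weight 0.5 - 0.5 = 0, so those edges are
# processed last and cannot cause false splits.
affs_bias = np.pad(affs, pad, constant_values=bias)
seg_bias = mwatershed.agglom(affinities=affs_bias - bias, offsets=offsets)
```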

@pattonw (Owner) commented Dec 30, 2024

Unfortunately this is a recurring problem with mutex watershed. Because it's such a greedy algorithm, one that simply processes edges in order of greatest absolute weight and does not account for object size at all, it is very sensitive to tiny differences that can change the edge-processing order.
There are follow-up papers that use things like the mean or total affinity between two objects to determine merge/split order; these seem to perform better, especially in cases like this, but they lose a lot of the nice computational efficiency of mutex watershed.
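
For reference, a toy sketch of that greedy core (heavily simplified relative to the paper and to this repo; real implementations track mutex constraints far more efficiently):

```python
def mutex_watershed(n_nodes, edges):
    """edges: list of (weight, u, v); weight > 0 attractive, < 0 repulsive."""
    parent = list(range(n_nodes))

    def find(x):  # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mutex = {i: set() for i in range(n_nodes)}  # roots that may never merge
    # Greedy core: visit edges by decreasing absolute weight.
    for w, u, v in sorted(edges, key=lambda e: -abs(e[0])):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        if w > 0:  # attractive: merge unless a mutex constraint forbids it
            if rv in mutex[ru]:
                continue
            parent[rv] = ru
            mutex[ru] |= mutex[rv]
            for r in mutex[rv]:  # redirect constraints that pointed at rv
                mutex[r].discard(rv)
                mutex[r].add(ru)
            mutex[rv] = set()
        elif w < 0:  # repulsive: record that these clusters may never merge
            mutex[ru].add(rv)
            mutex[rv].add(ru)
    return [find(i) for i in range(n_nodes)]

# Three nodes in a line: merge 0-1 (0.9), then the mutex 0-2 (-0.6) blocks
# the weaker merge 1-2 (0.4).
print(mutex_watershed(3, [(0.9, 0, 1), (0.4, 1, 2), (-0.6, 0, 2)]))  # [0, 0, 2]
```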

@vaxenburg (Author)

Ah thanks, I think I get it. I'm reading the original mutex watershed paper to better understand how it works.

I was also wondering why these artifacts seem to appear only at the padded edges of the volume rather than at every background boundary (if I pad with 0, the padding is just the same as background). I can't really see any artifacts at the edges of those three blobs closer to the center of the image above, for example.

@vaxenburg (Author)

It's interesting: I tried your idea and it looks like this. I'll have to think about it now...

[image: resulting segmentation]

@pattonw (Owner) commented Dec 31, 2024

I can definitely see a few small artifacts on the edges of the 3 blobs in the middle. They aren't as large and problematic as the ones at the border though.
I think this is because predicted affinities will tend to be more consistent. I made a quick example in draw.io:
[image: affs_padding, affinities with padding (left) vs predicted affinities (right)]

In these examples I am using red for 0 and green for 1. In the left example, all 0s come from padding and the rest of the edges are 1. If you manually choose the order in which edges are processed, you can segment the left image however you want: imagine you merge all the rows first, then split each row by processing the vertical edges from right to left. You would segment the blue object into rows.
On the right there is no padding, just predicted affinities, and they are consistent (there are no green paths connecting two nodes that share a red edge). This means that the order of edge processing doesn't matter; you cannot connect any blue and orange nodes.

As you've noticed, these artifacts can be mitigated at the boundary by choosing which values to pad with. If you can guarantee that the predicted edges get processed before the padded edges (by assigning the padded edges an absolute weight of 0 after accounting for your bias), then they won't have an impact on your final segmentation. But this kind of artifact also shows up at the boundaries between predicted objects and background, especially when you also include long-range affinities.
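
The order sensitivity is easy to reproduce with the toy mutex_watershed sketch from a few comments up: three mutually inconsistent edges of equal magnitude segment differently depending purely on tie order.

```python
# All three edges have |weight| = 0.5, so their relative order is arbitrary
# (Python's stable sort just preserves input order here).
edges = [(0.5, 0, 1), (0.5, 1, 2), (-0.5, 0, 2)]
print(mutex_watershed(3, edges))        # merges happen first -> [0, 0, 0]
print(mutex_watershed(3, edges[::-1]))  # mutex happens first -> [0, 1, 1]
```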

@pattonw (Owner) commented Dec 31, 2024

When these artifacts show up inside predictions, they normally don't propagate as far. At the boundary of predicted objects you may have inconsistent affinities leading to small artifacts, but they will normally not be saturated at 0 and 1; at the boundary you may have values of 0.3, 0.7, etc. This means inconsistencies due to edge-processing order cannot propagate far from the low-confidence boundary region into the high-confidence region in the center of an object.

@vaxenburg (Author)

Right, I also thought that a practical solution would be to make sure that the padded edges are processed as late as possible (by having their weights be ~zero after accounting for the bias).

I was thinking about your drawn examples and trying to see why the artifacts only appear at the right and bottom boundaries of the volume (in fact, since it's a 3D volume, the defects appear at 3 out of 6 boundaries, though that isn't clear from the 2D slices I attached above). I think this directional preference comes from how the offsets are defined: [(1, 0, 0), (0, 1, 0), (0, 0, 1)]. With this choice of offsets in mind, I made a plot similar to yours comparing the top-left and bottom-right padded corners, and I think it's clear that for the top-left corner the affinities are consistent (no defects) while for the bottom-right corner they are not (there are defects).

I also think we could figure out a "clever" padding for the bottom/right boundaries to restore consistency there. In the simplest case of a uniform boundary of a single large object, like the one around the bottom-right corner in my images above, we'd just have to pad with both 1s and 0s in a certain way to make the connectivity look like the "good" top-left corner.
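
For what it's worth, a small numpy sketch of that directional preference: with a purely positive neighborhood, every affinity edge points toward larger indices, so the voxels whose edges leave the volume all sit on 3 of the 6 faces.

```python
import numpy as np

shape = (5, 5, 5)
offsets = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Flag voxels that have at least one affinity edge leaving the volume.
coords = np.indices(shape)
touches_oob = np.zeros(shape, dtype=bool)
for offset in offsets:
    for axis, step in enumerate(offset):
        if step > 0:
            touches_oob |= coords[axis] + step >= shape[axis]
        elif step < 0:
            touches_oob |= coords[axis] + step < 0

# Only the max-index face along each axis is flagged; negating the offsets
# would flag the three opposite faces instead.
print(touches_oob[0].astype(int))  # first slice: 1s only on last row/column
```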

@pattonw (Owner) commented Jan 2, 2025

Exactly, for the top-left corner you don't run into inconsistent affinities due to the padding. As you noted, that's because the neighborhood consists of positive offsets; if your neighborhood were defined with (-1, 0, 0), etc., you would get the same defects on the opposite sides.

The inconsistencies are essentially caused by predictions going "out of bounds". If you use a network to predict the affinities, it will use shape priors and some context, and it probably knows with high confidence whether or not objects continue past the edge of the image. This is why you get high-confidence 1s predicted between the true pixels and the padded pixels in my little diagram, and that causes the inconsistency. Since we can't pad with the true context, I think there will almost always be errors.

But why do you need to pad? You could predict a larger region and crop to get the appropriate context, avoiding the padding artifacts entirely.

@vaxenburg (Author)

Right, in the end I likely won't have to pad. I think I will be predicting a proper external boundary instead, and the issue will disappear automatically. I just wanted to understand better where and why the algorithm might fail.

By the way, what do the strides and randomized_strides arguments do?
