Add function to make nx Graph from Skeleton #224

ns-rse · 2024-01-10T14:38:24Z

Further to initial work by @jni to implement NetworkX based pruning I've finally worked out how to reconstruct Skeleton from the NetworkX graph by storing the path_coordinates() in the edge properties.

We save the .path_coordinates() to the edge properties of a NetworkX graph, using the row.Index and the node_id_src / node_id_dst to define the nodes. This is required because in some of the test instances there are loops with the same src and dst and if only these are used we would lose information. Early work I was recreating the summarize() Pandas dataframes and the number of rows were differing which made me scratch my head until I realised the information was being over-written because the src and dst were the same. Including the row.Index as a proxy for the graph segment avoids this.

As the .path_coordinates() are saved as edge properties in NetworkX object we can now reconstruct the original Numpy array based on this information and convert to a Skeleton.

Tests are included for a range of sample graphs and exceptions are raised (and tested) when...

Wrong type of object is passed into nx_to_skeleton().
The NetworkX object is missing edge properties completely.
The edge properties are co-ordinates outside of the original images dimensions.

There may be more edge cases that we could/should test for.

I found problems in reconstructing tinyline as it was a 1-D "image" and so have made this a 2-D "image" with a single row and updated test_line() appropriately. This may be inappropriate as the intention is to have a 1-D skeleton but most skeletons will be 2-D, can be reverted if necessary.

Also on reviewing this still includes WIP on iterative pruning. It desirable this could be removed to keep the PR more focused on going from and to NetworkX graph objects.

Need to merge changes in though as have conflicting variable names which are already on `jni:main`.

Saves the `.path_coordinates()` to the edge properties of a NetworkX graph, using the `row.Index` _and_ the `node_id_src` / `node_id_dst` to define the nodes. This is required because in some instances there are loops with the same `src` and `dst` and if only these are used we would lose information, including the `row.Index` as a proxy for the graph segment avoids this. As the `.path_coordinates()` are saved as `edge` properties in NetworkX object we can now reconstruct the original Numpy array based on this information and convert to a `Skeleton`. Tests are included for a range of sample graphs and exceptions are raised (and tested) when... + Wrong type of object is passed into `nx_to_skeleton()`. + The NetworkX object is missing `edge` properties completely. + The `edge` properties are co-ordinates outside of the original images dimensions. There may be more edge cases that we could/should test for. I found problems with `tinyline` as it was a 1-D "image" and so have made this a 2-D "image" with a single row and updated `test_line()` appropriately.

jni · 2024-01-11T05:44:28Z

This is required because in some of the test instances there are loops with the same src and dst and if only these are used we would lose information.

Storing multiple IDs in one edge is an interesting solution @ns-rse, but would it be simpler to just use a networkx.MultiGraph?

Also on reviewing this still includes WIP on iterative pruning. It desirable this could be removed to keep the PR more focused on going from and to NetworkX graph objects.

I think that's ok. It's often useful to include a motivating example for a new function. 😊

ns-rse · 2024-01-11T10:06:51Z

networkx.MultiGraph() does indeed look like the way forward. I'd just not found it yet whilst reading through docs, hence the brute force solution.

I'll work on substituting that when I'm back on this project next week.

Use [networkx.MultiGraph()](https://networkx.org/documentation/stable/reference/classes/multigraph.html). Greatly simplifies things, we can... + Simply add edges with path coordinates as values, if nodes aren't defined they are added automatically. + If there are multiple edges between two nodes the respective path coordinates are automatically handled as additional edges are stored using unique keys to identify the edge. + Node co-ordinates are no longer explicitly required (they are implicitly stored in path co-ordinates) + Reconstruction of original Skeleton is simplified as we can access the data (path co-ordinates) directly. Yapf has done some reformatting (checked it wasn't Black interfering this time).

ns-rse · 2024-01-12T16:25:51Z

Was in the groove so went ahead and switched to MultiGraph() which was thankfully relatively painless. Thanks for the pointer @jni

ns-rse · 2024-01-29T16:28:28Z

Hi @jni,

Using Multigraph() was an excellent solution, let me know if there is anything else that can be improved.

jni · 2024-01-29T22:47:51Z

Thanks for the ping @ns-rse! I will review this today!

jni · 2024-01-30T07:41:55Z

pyproject.toml

@@ -106,5 +106,9 @@ skan = [
 [tool.setuptools_scm]
 write_to = 'src/skan/_version.py'

+[tool.pytest.ini_options]
+minversion = "7.4.2"
+addopts = "-W ignore"


Why did you add this? (ie what warnings are we ignoring here and is that wise 😅)

Currently its our own deprecation warnings that were recently added for the removal of _.

src/skan/test/test_csr.py:407 /home/neil/work/git/hub/ns-rse/skan/src/skan/test/test_csr.py:407: VisibleDeprecationWarning: separator in column name will change to _ in version 0.13; to silence this warning, use `separator='-'` to maintain current behavior and use `separator='_'` to switch to the new default behavior. csr.summarize(csr.Skeleton(skeleton1)),

Happy to remove this though I only added it so it was easier to read test output whilst writing them.

I'll leave it for now and make an issue to either wrap those in pytest.warns or pass the separator arg.

jni · 2024-01-30T07:51:41Z

src/skan/csr.py

+    np_img = np.zeros(orig_dim)
+    x, y = [], []
+    try:
+        for edge in g.edges.data(data='path', default=1):


Why default=1 here? There shouldn't be any edges without data in the skeleton? If there is one, then it's not a valid input skeleton (?) and we should probably just raise an error? (And this is equally easy to do with None as with 1?)

Honestly can't remember now, sorry.

Have removed it and left it as default, tests pass ok.

The defensive guards were imho *overly* defensive and made the function harder to understand. At any rate I was not convinced that they were exhaustively diagnostic, and in some cases caught errors that should be caught by type checkers and following the documented inputs to the function.

We can save necessary attributes in the graph attributes of the multigraph, and they can be used later. We also store the pixel *values* so that we can restore not only the positions of the skeleton but also the float values (if any).

jni · 2024-01-30T09:07:50Z

@ns-rse I've pushed a few changes:

I changed the code to use n-dimensional coordinates rather than 2D-only x/y (which wouldn't work for 3D in addition to not working for the admittedly-pretty-absurd 1D)
But since I could keep 1D now I reverted those changes 😂
I added the shape and dtype of the image to the graph attributes, so they don't need to be passed in to the function
I kept the path values in addition to the coordinates — this means that we can restore even float images
I removed all of the defensive guards, which I think mostly caught errors that should be caught by the type checker / reading the docs. I'm willing to bring some back if we find some common pitfalls. The one I'm most worried about is people passing nx graphs without the necessary attributes — but I think those should be assertion errors rather than inferring the errors raised by the calls, as it's easier to trivially understand what the code is testing.

I'm seeing now that all the 3.8 tests are failing. I'd be happy to drop 3.8 since most of the scipy ecosystem has dropped it, but I have no earthly idea why they are failing:

______________ test_skeleton_to_nx[skeleton_linear1 (no summary)] ______________

np_skeleton = array([[False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False],
      ...alse],
       [False, False, False, ..., False, False, False],
       [False, False, False, ..., False, False, False]])
summary = None, edges = 24, nodes = 24

    @pytest.mark.parametrize(
            ('np_skeleton', 'summary', 'nodes', 'edges'),
            [
                    pytest.param(
                            tinycycle, None, 1, 1, id='tinycircle (no summary)'
                            ),
                    pytest.param(tinyline, None, 2, 1, id='tinyline (no summary)'),
                    pytest.param(
                            skeleton0,
                            csr.summarize(csr.Skeleton(skeleton0), separator='_'),
                            4,
                            3,
                            id='skeleton0 (with summary)',
                            ),
                    pytest.param(
                            skeleton1, None, 4, 4, id='skeleton1 (no summary)'
                            ),
                    pytest.param(
                            skeleton1,
                            csr.summarize(csr.Skeleton(skeleton1)),
                            4,
                            4,
                            id='skeleton1 (with summary)'
                            ),
                    pytest.param(
                            skeleton2,
                            csr.summarize(csr.Skeleton(skeleton2)),
                            8,
                            8,
                            id='skeleton2 (with summary)'
                            ),
                    # pytest.param(skeleton3d, None, 7, 7, id='skeleton3d (no summary)'),
                    pytest.param(
                            skeleton_loop1,
                            None,
                            10,
                            10,
                            id='skeleton_loop1 (no summary)'
                            ),
                    pytest.param(
                            skeleton_loop2,
                            None,
                            10,
                            10,
                            id='skeleton_loop2 (no summary)'
                            ),
                    pytest.param(
                            skeleton_linear1,
                            None,
                            24,
                            24,
                            id='skeleton_linear1 (no summary)'
                            ),
                    pytest.param(
                            skeleton_linear2,
                            None,
                            4,
                            3,
                            id='skeleton_linear2 (no summary)'
                            ),
                    pytest.param(
                            skeleton_linear3,
                            None,
                            20,
                            17,
                            id='skeleton_linear3 (no summary)'
                            ),
                    ],
            )
    def test_skeleton_to_nx(
            np_skeleton: npt.NDArray, summary: pd.DataFrame | None, edges: int,
            nodes: int
            ) -> None:
        """Test creation of NetworkX Graph from skeletons arrays and summary."""
        skeleton = csr.Skeleton(np_skeleton)
        skan_nx = csr.skeleton_to_nx(skeleton, summary)
>       assert skan_nx.number_of_nodes() == nodes
E       assert 22 == 24
E        +  where 22 = <bound method Graph.number_of_nodes of <networkx.classes.multigraph.MultiGraph object at 0x1304ca370>>()
E        +    where <bound method Graph.number_of_nodes of <networkx.classes.multigraph.MultiGraph object at 0x1304ca370>> = <networkx.classes.multigraph.MultiGraph object at 0x1304ca370>.number_of_nodes

../../../hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/skan/test/test_csr.py:464: AssertionError
=========================== short test summary info ============================
FAILED test/test_csr.py::test_skeleton_to_nx[skeleton_linear1 (no summary)] - assert 22 == 24
 +  where 22 = <bound method Graph.number_of_nodes of <networkx.classes.multigraph.MultiGraph object at 0x1304ca370>>()
 +    where <bound method Graph.number_of_nodes of <networkx.classes.multigraph.MultiGraph object at 0x1304ca370>> = <networkx.classes.multigraph.MultiGraph object at 0x1304ca370>.number_of_nodes
======================== 1 failed, 98 passed in 20.62s =========================

That is a strange way to fail for a Python version change... 🤔 Any thoughts about what could be causing this?

Anyway, is the pruning working with this? If not, we should probably indeed remove the pruning code from this PR, or bring it up to speed.

Thanks for getting this as far as you have! Looking at it now, I think it's actually feasible to make a Skeleton directly rather than going via an image — we'd need to rebuild the sparse graph structures, the coordinates, and the path ids. Since those are usually a factor of >10 smaller compared to the image, it's probably worth thinking about. But we should probably leave that optimisation for a later PR — first let's get this over the finish line, as I think it's very close. I just wanted to flag that I'm excited about this finally being an efficient operation!

PS in your last commit removing the default value (which I overwrote, sorry!), black formatting had again reformatted the whole file... Might need to check your editor settings again... 😅

So, to do:

either finish the pruning or remove it from this PR.
figure out why on earth 3.8 is failing
(I just saw) uncomment the param testing 3D images

I'm signing off for the day, but let me know if you want me to tackle these in the next few days or if you want to have a crack at them!

ns-rse · 2024-01-30T12:06:42Z

Thanks for the rapid improvements @jni

Of the three tasks...

(I just saw) uncomment the param testing 3D images

Done, these pass.

figure out why on earth 3.8 is failing

Agree its a very strange error.

I've tested locally in a fresh 3.8 environment for sanity and can recreate the problem so started looking at installed versions of packages between Python 3.8 and Python 3.11. There are quite a few differences (e.g. Numpy differences of 1.24.4 under Python 3.8 compared to 1.26.3 under Python 3.11) but the NetworkX versions are identical (both 3.1 under both versions of Python).

The test is failing under test_skeleton_to_nx() so I checked the skeleton that underpins this (by converting the NetworkX object back to Skeleton using csr.nx_to_skeleton()) and its identical to the original skeleton so its baffling that the number of nodes counted differ.

I did some trawling through the NetworkX issues and found a possibly related issue in their own test suite from 2019 test_numpy_type fails under Python 3.8 · Issue #3720 · networkx/networkx. It seems related to the order in which attributes are added under 3.8 and was purportedly addressed in Fix test_numpy_type to pass under Python 3.8 by s-t-e-v-e-n-k · Pull Request #3724 · networkx/networkx (see also detailed info here). This was merged and included in NetworkX 2.5 so I'd be surprised if its the same underlying issue though.

Worth noting that the current version of NetworkX has requires-python = '>=3.10' and so if Skan is to depend on NetworkX going forward it may be easiet to align versions.

If you are happy to do that let me know and I'll adjust Skan's pyproject.toml.

either finish the pruning or remove it from this PR.

I started looking at this copying some tests from #214 over but think it might take me a little time to work through this.

In #214 the aim of csr.iteratively_prune_paths() was geared towards the TopoStats use case which for linear skeletons was to prune side-branches leaving the longest branch and for looped skeletons remove side branches and retain the largest loop.

In this PR csr.iteratevly_prune_paths() now works differently, in so much as...

It prunes NetworkX graph objects.
It does so until there is just one object left.

I copied over the tests from #214 and (and tweaked to pass NetworkX representations tocsr.iteratively_prune_paths()) unsurprisingly these failed . I think it would be sensible to remove the pruning from this PR and I'll work on developing tests later this week/next.

If you are happy with this let me know and I shall remove the iterative pruning functions from this PR.

jni · 2024-01-31T05:45:38Z

If you are happy with this let me know and I shall remove the iterative pruning functions from this PR.

I think that's a good plan, as it will make this PR easier to review and get in quickly.

If you are happy to do that let me know and I'll adjust Skan's pyproject.toml.

Yeah I don't intend to keep supporting 3.8 but I am simply worried that it's actually exposing some kind of numerical instability in our approach that is simply masked by later versions. So it would be good before ripping it out to understand why it's failing. The approach I would take is:

write out the versions of the packages installed in 3.8 vs 3.9
make a new env in 3.9 with the default packages
downgrade the packages one by one until it breaks

I might also step through the relevant code with a debugger in 3.8 and in 3.10 simultaneously to see where the data starts to change.

Work is saved on `ns-rse/networkx-pruning` and will form a separate PR.

Discovered that single skeleton > networkx failing under Python 3.8 is due to spurious edge detection . See jni#225 for further details.

ns-rse · 2024-04-19T19:55:08Z

See #225 for details of why tests were failing under Python 3.8.

jni and others added 6 commits November 27, 2023 13:50

Add function to make nx Graph from Skeleton

bb57ff0

Add function to compute main branches in nx directly

d0b35bb

Initial work on networkx-based pruning

b6cb88e

wip - adding path co-ordinates to networkx edge attributes

7105248

Need to merge changes in though as have conflicting variable names which are already on `jni:main`.

Merge branch 'main' into ns-rse/networkx

2a5a6d4

jni reviewed Jan 30, 2024

View reviewed changes

jni added 6 commits January 30, 2024 19:07

Fix errors test

8a6dd17

Remove unnecessary parens

9dbc809

Store shape and dtype->no need to pass them later

ec84487

We can save necessary attributes in the graph attributes of the multigraph, and they can be used later. We also store the pixel *values* so that we can restore not only the positions of the skeleton but also the float values (if any).

Update tests to no longer need to pass shape

54b8e1e

Remove no-longer-erroring test case

45afec7

jni force-pushed the ns-rse/networkx branch from f434cba to 45afec7 Compare January 30, 2024 08:49

Enable skeleton <--> nx tests for 3D skeleton

e36253c

ns-rse added 3 commits February 1, 2024 11:26

Remove iterative pruning from this PR

636c330

Work is saved on `ns-rse/networkx-pruning` and will form a separate PR.

Reverting mistaken Black formatting

56b162e

Mark test as failing under 3.8

5d50eb6

Discovered that single skeleton > networkx failing under Python 3.8 is due to spurious edge detection . See jni#225 for further details.

jni marked this pull request as ready for review April 20, 2024 04:38

jni added 5 commits April 29, 2024 12:28

Avoid modifying function inputs

bc5a585

Add complete docstring to skeleton_to_nx

ace2c66

Add ouptut type annotation to skeleton_to_nx

fe3611b

Update nx_to_skeleton docstring

ff8e1bd

Refactor random test skeleton generation

f4d0a98

jni merged commit b703cf6 into jni:main May 1, 2024
11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add function to make nx Graph from Skeleton #224

Add function to make nx Graph from Skeleton #224

ns-rse commented Jan 10, 2024 •

edited

Loading

jni commented Jan 11, 2024

ns-rse commented Jan 11, 2024 •

edited

Loading

ns-rse commented Jan 12, 2024

ns-rse commented Jan 29, 2024

jni commented Jan 29, 2024

jni Jan 30, 2024

ns-rse Jan 30, 2024

jni Apr 29, 2024

jni Jan 30, 2024

ns-rse Jan 30, 2024

jni commented Jan 30, 2024 •

edited

Loading

ns-rse commented Jan 30, 2024 •

edited

Loading

jni commented Jan 31, 2024

ns-rse commented Apr 19, 2024

Add function to make nx Graph from Skeleton #224

Add function to make nx Graph from Skeleton #224

Conversation

ns-rse commented Jan 10, 2024 • edited Loading

jni commented Jan 11, 2024

ns-rse commented Jan 11, 2024 • edited Loading

ns-rse commented Jan 12, 2024

ns-rse commented Jan 29, 2024

jni commented Jan 29, 2024

jni Jan 30, 2024

Choose a reason for hiding this comment

ns-rse Jan 30, 2024

Choose a reason for hiding this comment

jni Apr 29, 2024

Choose a reason for hiding this comment

jni Jan 30, 2024

Choose a reason for hiding this comment

ns-rse Jan 30, 2024

Choose a reason for hiding this comment

jni commented Jan 30, 2024 • edited Loading

ns-rse commented Jan 30, 2024 • edited Loading

jni commented Jan 31, 2024

ns-rse commented Apr 19, 2024

ns-rse commented Jan 10, 2024 •

edited

Loading

ns-rse commented Jan 11, 2024 •

edited

Loading

jni commented Jan 30, 2024 •

edited

Loading

ns-rse commented Jan 30, 2024 •

edited

Loading