-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add function to make nx Graph from Skeleton #224
Conversation
Need to merge changes in though as have conflicting variable names which are already on `jni:main`.
Saves the `.path_coordinates()` to the edge properties of a NetworkX graph, using the `row.Index` _and_ the `node_id_src` / `node_id_dst` to define the nodes. This is required because in some instances there are loops with the same `src` and `dst` and if only these are used we would lose information, including the `row.Index` as a proxy for the graph segment avoids this. As the `.path_coordinates()` are saved as `edge` properties in NetworkX object we can now reconstruct the original Numpy array based on this information and convert to a `Skeleton`. Tests are included for a range of sample graphs and exceptions are raised (and tested) when... + Wrong type of object is passed into `nx_to_skeleton()`. + The NetworkX object is missing `edge` properties completely. + The `edge` properties are co-ordinates outside of the original images dimensions. There may be more edge cases that we could/should test for. I found problems with `tinyline` as it was a 1-D "image" and so have made this a 2-D "image" with a single row and updated `test_line()` appropriately.
Storing multiple IDs in one edge is an interesting solution @ns-rse, but would it be simpler to just use a networkx.MultiGraph?
I think that's ok. It's often useful to include a motivating example for a new function. 😊 |
I'll work on substituting that when I'm back on this project next week. |
Use [networkx.MultiGraph()](https://networkx.org/documentation/stable/reference/classes/multigraph.html). Greatly simplifies things, we can... + Simply add edges with path coordinates as values, if nodes aren't defined they are added automatically. + If there are multiple edges between two nodes the respective path coordinates are automatically handled as additional edges are stored using unique keys to identify the edge. + Node co-ordinates are no longer explicitly required (they are implicitly stored in path co-ordinates) + Reconstruction of original Skeleton is simplified as we can access the data (path co-ordinates) directly. Yapf has done some reformatting (checked it wasn't Black interfering this time).
Was in the groove so went ahead and switched to |
Hi @jni, Using |
Thanks for the ping @ns-rse! I will review this today! |
@@ -106,5 +106,9 @@ skan = [ | |||
[tool.setuptools_scm] | |||
write_to = 'src/skan/_version.py' | |||
|
|||
[tool.pytest.ini_options] | |||
minversion = "7.4.2" | |||
addopts = "-W ignore" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you add this? (ie what warnings are we ignoring here and is that wise 😅)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently its our own deprecation warnings that were recently added for the removal of _
.
src/skan/test/test_csr.py:407
/home/neil/work/git/hub/ns-rse/skan/src/skan/test/test_csr.py:407: VisibleDeprecationWarning: separator in column name will change to _ in version 0.13; to silence this warning, use `separator='-'` to maintain current behavior and use `separator='_'` to switch to the new default behavior.
csr.summarize(csr.Skeleton(skeleton1)),
Happy to remove this though I only added it so it was easier to read test output whilst writing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave it for now and make an issue to either wrap those in pytest.warns or pass the separator arg.
src/skan/csr.py
Outdated
np_img = np.zeros(orig_dim) | ||
x, y = [], [] | ||
try: | ||
for edge in g.edges.data(data='path', default=1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why default=1
here? There shouldn't be any edges without data in the skeleton? If there is one, then it's not a valid input skeleton (?) and we should probably just raise an error? (And this is equally easy to do with None as with 1?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly can't remember now, sorry.
Have removed it and left it as default, tests pass ok.
The defensive guards were imho *overly* defensive and made the function harder to understand. At any rate I was not convinced that they were exhaustively diagnostic, and in some cases caught errors that should be caught by type checkers and following the documented inputs to the function.
We can save necessary attributes in the graph attributes of the multigraph, and they can be used later. We also store the pixel *values* so that we can restore not only the positions of the skeleton but also the float values (if any).
@ns-rse I've pushed a few changes:
I'm seeing now that all the 3.8 tests are failing. I'd be happy to drop 3.8 since most of the scipy ecosystem has dropped it, but I have no earthly idea why they are failing:
That is a strange way to fail for a Python version change... 🤔 Any thoughts about what could be causing this? Anyway, is the pruning working with this? If not, we should probably indeed remove the pruning code from this PR, or bring it up to speed. Thanks for getting this as far as you have! Looking at it now, I think it's actually feasible to make a Skeleton directly rather than going via an image — we'd need to rebuild the sparse graph structures, the coordinates, and the path ids. Since those are usually a factor of >10 smaller compared to the image, it's probably worth thinking about. But we should probably leave that optimisation for a later PR — first let's get this over the finish line, as I think it's very close. I just wanted to flag that I'm excited about this finally being an efficient operation! PS in your last commit removing the default value (which I overwrote, sorry!), black formatting had again reformatted the whole file... Might need to check your editor settings again... 😅 So, to do:
I'm signing off for the day, but let me know if you want me to tackle these in the next few days or if you want to have a crack at them! |
Thanks for the rapid improvements @jni Of the three tasks...
Done, these pass.
Agree its a very strange error. I've tested locally in a fresh 3.8 environment for sanity and can recreate the problem so started looking at installed versions of packages between Python 3.8 and Python 3.11. There are quite a few differences (e.g. Numpy differences of 1.24.4 under Python 3.8 compared to 1.26.3 under Python 3.11) but the NetworkX versions are identical (both 3.1 under both versions of Python). The test is failing under I did some trawling through the NetworkX issues and found a possibly related issue in their own test suite from 2019 test_numpy_type fails under Python 3.8 · Issue #3720 · networkx/networkx. It seems related to the order in which attributes are added under 3.8 and was purportedly addressed in Fix test_numpy_type to pass under Python 3.8 by s-t-e-v-e-n-k · Pull Request #3724 · networkx/networkx (see also detailed info here). This was merged and included in NetworkX 2.5 so I'd be surprised if its the same underlying issue though. Worth noting that the current version of NetworkX has If you are happy to do that let me know and I'll adjust Skan's
I started looking at this copying some tests from #214 over but think it might take me a little time to work through this. In #214 the aim of In this PR
I copied over the tests from #214 and (and tweaked to pass NetworkX representations to If you are happy with this let me know and I shall remove the iterative pruning functions from this PR. |
I think that's a good plan, as it will make this PR easier to review and get in quickly.
Yeah I don't intend to keep supporting 3.8 but I am simply worried that it's actually exposing some kind of numerical instability in our approach that is simply masked by later versions. So it would be good before ripping it out to understand why it's failing. The approach I would take is:
I might also step through the relevant code with a debugger in 3.8 and in 3.10 simultaneously to see where the data starts to change. |
Work is saved on `ns-rse/networkx-pruning` and will form a separate PR.
Discovered that single skeleton > networkx failing under Python 3.8 is due to spurious edge detection . See jni#225 for further details.
See #225 for details of why tests were failing under Python 3.8. |
Further to initial work by @jni to implement NetworkX based pruning I've finally worked out how to reconstruct
Skeleton
from the NetworkX graph by storing thepath_coordinates()
in theedge properties
.We save the
.path_coordinates()
to the edge properties of a NetworkX graph, using therow.Index
and thenode_id_src
/node_id_dst
to define the nodes. This is required because in some of the test instances there are loops with the samesrc
anddst
and if only these are used we would lose information. Early work I was recreating thesummarize()
Pandas dataframes and the number of rows were differing which made me scratch my head until I realised the information was being over-written because thesrc
anddst
were the same. Including therow.Index
as a proxy for the graph segment avoids this.As the
.path_coordinates()
are saved asedge
properties in NetworkX object we can now reconstruct the original Numpy array based on this information and convert to aSkeleton
.Tests are included for a range of sample graphs and exceptions are raised (and tested) when...
nx_to_skeleton()
.edge
properties completely.edge
properties are co-ordinates outside of the original images dimensions.There may be more edge cases that we could/should test for.
I found problems in reconstructing
tinyline
as it was a 1-D "image" and so have made this a 2-D "image" with a single row and updatedtest_line()
appropriately. This may be inappropriate as the intention is to have a 1-D skeleton but most skeletons will be 2-D, can be reverted if necessary.Also on reviewing this still includes WIP on iterative pruning. It desirable this could be removed to keep the PR more focused on going from and to NetworkX graph objects.