feat ✨ : Add closeness centrality and floyd warshall feature #72

Open
wants to merge 23 commits into base: main

Conversation

Fre0Grella

Hi, I'm Marco, a student from the University of Bologna.
For my final project I want to contribute to this repository by adding the closeness centrality and Floyd-Warshall algorithms for numpy.
The closeness centrality uses the newly implemented Floyd-Warshall.
The parallelized Floyd-Warshall is a version of the tiled FW referenced in the code.

@Schefflera-Arboricola added the type: Enhancement (New feature or request) label on Jul 31, 2024
@Schefflera-Arboricola
Member

@Fre0Grella nx-parallel algorithms receive a ParallelGraph object, so you need to convert it back to an nx.Graph object to run the usual graph methods on it. That's why you'll see the following lines in all the nx-parallel algorithms:

if hasattr(G, "graph_object"):
        G = G.graph_object

Hopefully this will make the tests pass.

Thanks!

@Fre0Grella
Author

@Schefflera-Arboricola Hi, thanks for the earlier advice; I hope you can help with this one too.
Running the tests I get an error I don't understand:
in closeness.py, when I use joblib's Parallel to call the _closeness_measure function (which, from a row of the adjacency matrix, computes the closeness value for the node represented by that row), joblib raises TypeError: 'numpy.float64' object is not callable.
I tried changing the data format, thinking that joblib doesn't support numpy data structures, but then it keeps saying TypeError: 'float' object is not callable.
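
For context, this exact TypeError usually appears when the worker function is not wrapped in joblib's delayed(): calling the worker directly evaluates it eagerly and hands its float return value to Parallel, which then tries to call it. A minimal, self-contained sketch of the expected pattern (the worker name _closeness_row and the small matrix below are placeholders, not the code in this PR):

from joblib import Parallel, delayed
import numpy as np

def _closeness_row(k, row):
    # Placeholder worker: sum the finite distances in one row of the matrix.
    finite = row[np.isfinite(row)]
    return k, finite.sum()

dist = np.array([[0.0, 1.0, np.inf], [1.0, 0.0, 2.0], [np.inf, 2.0, 0.0]])

# delayed() is what keeps the worker callable inside Parallel; passing
# _closeness_row(k, row) directly would evaluate it eagerly and hand Parallel
# a float, which reproduces "TypeError: ... object is not callable".
results = Parallel(n_jobs=2)(
    delayed(_closeness_row)(k, row) for k, row in enumerate(dist)
)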

Fix weight problem for optional unweighted graphs but it's painfully slow
@Schefflera-Arboricola
Member

Hi @Fre0Grella, thank you for your contribution :)

Is this PR ready for review? I will aim to get to it as soon as possible, though I cannot provide a specific timeline at the moment. I'm also tagging @dschult in case he would like to review it.

Additionally, could you let us know if there is a deadline for your final project and if this PR needs to be merged before then?

Thank you :)

@Fre0Grella
Author

Hi @Schefflera-Arboricola, the PR is ready for review, although I aim to improve the performance further. The deadline for the project is the 18th of September.
I don't need the PR to be merged before then.

@Schefflera-Arboricola
Member

Thanks @Fre0Grella! I think using nxp.get_n_jobs instead of nxp.cpu_count would have fixed the failing CI tests, because we renamed it recently. If you still have the bandwidth, please feel free to merge the main branch again. Really sorry for the delayed review!

@Fre0Grella
Author

> Thanks @Fre0Grella! I think using nxp.get_n_jobs instead of nxp.cpu_count would have fixed the failing CI tests, because we renamed it recently. If you still have the bandwidth, please feel free to merge the main branch again. Really sorry for the delayed review!

Not sure if I'm doing something wrong, but replacing nxp.cpu_count with nxp.get_n_jobs isn't working. In the tests I get AttributeError: module 'nx_parallel' has no attribute 'get_n_jobs'

if block_j == no_of_primary - 1:
    j_range = (block_j * blocking_factor, n - 1)
params.append((i_range, j_range))
Parallel(n_jobs=(no_of_primary - 1) ** 2, require="sharedmem")(
Member

@Schefflera-Arboricola Sep 27, 2024

Recently we have made it so that all the parameters of the Parallel call are controlled by the user, and we don't pass anything to it. I would highly encourage you (@Fre0Grella) to join our weekly nx-parallel meetings: https://scientific-python.org/calendars/networkx.ics , if you can. I think your insights and feedback would really help us shape the nx-parallel project in a better way. Thank you :)
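
To illustrate what "controlled by the user" means in practice, here is a minimal sketch using joblib's parallel_config context manager (joblib >= 1.3); the process function is a stand-in for illustration, not nx-parallel's internals:

from joblib import Parallel, delayed, parallel_config

def process(chunks):
    # Inside the algorithm no n_jobs, backend, or other settings are
    # hard-coded; the bare Parallel() call picks them up from the caller.
    return Parallel()(delayed(sum)(chunk) for chunk in chunks)

# The caller decides how the work is parallelized.
with parallel_config(backend="loky", n_jobs=4):
    print(process([[1, 2], [3, 4], [5, 6]]))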

@Schefflera-Arboricola
Member

All the tests are passing now and the style test should also pass once you update this PR branch with the recent fix that got merged. Thank you :)

@Fre0Grella
Author

It would be awesome if we could merge this PR before the 1st of October so I can present the project directly from the NetworkX repository to the exam commission. Thanks a lot for the support throughout the project. I'm also interested in further contributions to the NetworkX project.

@Fre0Grella
Author

@Schefflera-Arboricola hey, the PR is ready to be merged; it only needs a review.

Comment on lines +150 to +180
def _find_nearest_divisor(x, y):
    """
    Find the optimal value for the blocking factor parameter.

    Parameters
    ----------
    x : int
        number of nodes

    y : int
        number of CPU cores available
    """
    if x < y:
        return 1, False
    # Find the square root of x
    sqrt_x = int(math.sqrt(x)) + 1

    # Execute the calculation in parallel
    results = Parallel(n_jobs=-1)(
        delayed(_calculate_divisor)(i, x, y) for i in range(2, sqrt_x)
    )

    # Filter out None results
    results = [r for r in results if r is not None]

    if len(results) <= 0:
        # No divisor was found, so x is prime; repeat the search with the
        # non-prime value x - 1.
        best_divisor, _ = _find_nearest_divisor(x - 1, y)
        return best_divisor, True
    # Find the best divisor
    best_divisor, _, _ = min(results, key=lambda x: x[2])

    return best_divisor, False
Contributor

Curious-- for large graphs with prime numbers of nodes, it seems like the final block could be disproportionately larger, leading to load imbalance across cores?

Author

@Fre0Grella Oct 18, 2024

Yes, it could lead to load imbalance, but I think the disproportion is somewhat minimized.
I don't know if there's a better way of doing it.

Contributor

Hm, what about just swapping:

best_divisor, _ = _find_nearest_divisor(x - 1, y)

with something as simple as:

best_divisor = max(1, x // y)

to distribute load approximately evenly if no divisor is found?

Comment on lines +46 to +73
def _closeness_measure(k, v, wf_improved, len_G):
    """Calculate the closeness centrality measure of one node using its row of distances.

    Parameters
    ----------
    v : mapping
        the distances from node k to every other node

    Returns
    -------
    (k, closeness_value) : tuple
        the node key and its closeness value
    """
    n = v.values()
    # print(n)
    n_reachable = [x for x in n if x != float("inf")]
    # print(n_reachable, len(n_reachable))
    totsp = sum(n_reachable)
    # print(totsp)
    closeness_value = 0.0
    if totsp > 0.0 and len_G > 1:
        closeness_value = (len(n_reachable) - 1.0) / totsp
        # normalize to number of nodes-1 in connected part
        if wf_improved:
            s = (len(n_reachable) - 1.0) / (len_G - 1)
            closeness_value *= s

    return k, closeness_value
Contributor

I wonder if it could make sense to include an initial check in this function for graph disconnectedness, the goal being to "fail fast" to prevent edge-case scenarios where disconnected nodes lead to infinite geodesic distances. WDYT?

Author

Yeah, I think a transitive closure test could be added in the future.
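
For reference, a minimal sketch of such a fail-fast check, assuming the per-node distances are available as a numpy row (the names below are illustrative, not the PR's code):

import numpy as np

def _all_reachable(row):
    # A single infinite distance means this node cannot reach every other
    # node, i.e. the graph is disconnected from this node's point of view.
    return bool(np.all(np.isfinite(row)))

print(_all_reachable(np.array([0.0, 1.0, 2.0])))     # True
print(_all_reachable(np.array([0.0, 1.0, np.inf])))  # False -> bail out early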

Comment on lines +139 to +143
difference1 = abs(result1 - y)
# difference2 = abs((result2 - 1) ** 2 - y)
difference2 = abs(result2 - y)

if difference1 < difference2:
Contributor

Could be missing something obvious, but aren't divisor1 and divisor2 actually going to be the same thing here (i)? i.e. do we even need to compare difference1 with difference2?

I wonder whether this could be simplified to something like:

def _calculate_divisor(i, x, y):
    if x % i == 0:
        quotient = x // i
        return (i, quotient, abs(i - y))
    return None

?

Author

Not really the same thing; the idea is to find the divisor of x (the size of the matrix) most similar to y (the cores available).
To find it, I try dividing x by every number from 1 to sqrt(x), so when I do x // i, divisor1 is i and the result of x // i is divisor2.

An example to clarify:
x=10
y=4
i=2
x // i = 5
So divisor1 = 2, divisor2 = 5

divisor2 is the closest to 4, so I choose 5 as the divisor.
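
To make the comparison concrete, here is a small sketch of the logic described above (an illustrative helper, not the PR's exact _calculate_divisor): for each i that divides x, both i and x // i are divisors, and whichever is closer to y is kept.

def _closest_divisor_pair(i, x, y):
    # Illustrative version of the described comparison, not the PR's function.
    if x % i != 0:
        return None
    divisor1, divisor2 = i, x // i
    difference1, difference2 = abs(divisor1 - y), abs(divisor2 - y)
    if difference1 < difference2:
        return (divisor1, divisor2, difference1)
    return (divisor2, divisor1, difference2)

print(_closest_divisor_pair(2, 10, 4))  # (5, 2, 1): 5 is the divisor closest to 4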

Contributor

Thanks for the clarification on this @Fre0Grella

I think what I'm still struggling with is the need for two separate comparisons. Since divisor1 is i and divisor2 is x // i, aren't you always comparing i and x // i? The current implementation is functionally correct, but this seems like it could be greatly simplified without losing the intent of matching divisor2 more closely to y.

For example, maybe we could directly compute the abs difference of both i and x // i with y, then just return whichever is closest? And this would avoid the extra complexity in checking two sets of variables. So, building on my (incomplete) alternative suggestion previously, this might look something like:

def _calculate_divisor(i, x, y):
    if x % i == 0:
        divisor = x // i
        return (i, divisor, min(abs(i - y), abs(divisor - y)))
    return None

This keeps the logic intact, but then we don't need to assign and compare divisor1 and divisor2 explicitly. Let me know what you think / if I'm missing something!

@dPys
Contributor

dPys commented Oct 23, 2024

@Fre0Grella -- I should mention (since I haven't yet) that this is excellent work. @Schefflera-Arboricola -- hopefully after the remaining polishing we can get this thing merged?

Labels
type: Enhancement New feature or request