-
Notifications
You must be signed in to change notification settings - Fork 4
2. Identify connected components
Most of the parameters to be calculated are defined only on the (strongly) connected components of the graph. In this section we identify the largest connected component and construct the corresponding nodes
and edges
tables.
The following statement produces a table named connected_components
consisting of two columns:
-
id
The node id -
connected_component
An integer specifying to which connected component the node belongs.
CREATE TABLE connected_components AS SELECT * FROM ST_ConnectedComponents(routes.edges, 'weight', 3);
Please see the Javadoc for ST_ConnectedComponents
for an explanation of the parameters.
Connected components are numbered sequentially starting from 1.
If we want to identify the number of nodes in each connected component, we can do the following:
CREATE TABLE connected_component_totals AS SELECT connected_component, COUNT() AS number_of_vertices FROM connected_components GROUP BY connected_component ORDER BY number_of_vertices DESC;
This creates a table named connected_component_totals
listing each connected component and the number of nodes contained in that component, ordered from largest to smallest.
The following may be repeated for each connected component (here we consider connected component 1):
CREATE TABLE routes_1.nodes AS SELECT a.the_geom, b.id FROM routes.nodes a, connected_components b WHERE a.id=b.id and b.connected_component=1;
Recover the edges having a start node in connected component 1:
CREATE TABLE routes_1_start.edges AS SELECT a.* FROM routes.edges a, routes_1.nodes b WHERE a.start_node=b.id;
Recover the edges having an end node in connected component 1:
CREATE TABLE routes_1_end.edges AS SELECT a.* FROM routes.edges a, routes_1.nodes b WHERE a.end_node=b.id;
Put the two together:
CREATE TABLE routes_1.edges AS SELECT DISTINCT a.* FROM routes_1_start.edges a, routes_1_end.edges b WHERE a.id=b.id;
Note: The reader may notice that the above may have been accomplished by the single query:
CREATE TABLE routes_1.edges AS SELECT DISTINCT a.* FROM routes.edges a, routes_1.nodes b WHERE a.start_node=b.id OR a.end_node=b.id;
In practice, however, this technique is much slower.