Skip to content
This repository has been archived by the owner on Mar 2, 2021. It is now read-only.

2. Identify connected components

agouge edited this page Mar 25, 2013 · 9 revisions

Most of the parameters to be calculated are defined only on the (strongly) connected components of the graph. In this section we identify the largest connected component and construct the corresponding nodes and edges tables.

Identify all connected components

A. Label each node by its connected component

The following statement produces a table named connected_components consisting of two columns:

  • id The node id
  • connected_component An integer specifying to which connected component the node belongs.
CREATE TABLE connected_components AS SELECT * FROM ST_ConnectedComponents(routes.edges, 'weight', 3);

Please see the Javadoc for ST_ConnectedComponents for an explanation of the parameters.

Connected components are numbered sequentially starting from 1.

B. Count the number of nodes in each connected component

If we want to identify the number of nodes in each connected component, we can do the following:

CREATE TABLE connected_component_totals AS SELECT connected_component, COUNT() AS number_of_vertices FROM connected_components GROUP BY connected_component ORDER BY number_of_vertices DESC;

This creates a table named connected_component_totals listing each connected component and the number of nodes contained in that component, ordered from largest to smallest.

Select one connected component

C. Recover the nodes of a connected component

The following may be repeated for each connected component (here we consider connected component 1):

CREATE TABLE routes_1.nodes AS SELECT a.the_geom, b.id FROM routes.nodes a, connected_components b WHERE a.id=b.id and b.connected_component=1;

D. Recover the edges of a connected component

Recover the edges having a start node in connected component 1:

CREATE TABLE routes_1_start.edges AS SELECT a.* FROM routes.edges a, routes_1.nodes b WHERE a.start_node=b.id;

Recover the edges having an end node in connected component 1:

CREATE TABLE routes_1_end.edges AS SELECT a.* FROM routes.edges a, routes_1.nodes b WHERE a.end_node=b.id;

Put the two together:

CREATE TABLE routes_1.edges AS SELECT DISTINCT a.* FROM routes_1_start.edges a, routes_1_end.edges b WHERE a.id=b.id;

Note: The reader may notice that the above may have been accomplished by the single query:

CREATE TABLE routes_1.edges AS SELECT DISTINCT a.* FROM routes.edges a, routes_1.nodes b WHERE a.start_node=b.id OR a.end_node=b.id;

In practice, however, this technique is much slower.