paper.qmd

---
title: "Route network simplification for transport planning"
bibliography: references.bib
author: 
  - name: Robin Lovelace
    affiliation: Leeds Institute for Transport Studies, University of Leeds, UK
    orcid: 0000-0001-5679-6536
  - name: Zhao Wang
    affiliation: Leeds Institute for Transport Studies, University of Leeds, UK
    orcid: 0000-0002-4054-0533
  - name: Will Deakin
    affiliation: Digital, Data and Technology services, Network Rail, UK
    orcid: 0009-0008-5656-4469
  - name: Josiah Parry
    affiliation: Environmental Systems Research Institute, Redlands, CA, USA 
    orcid: 0000-0001-9910-865X
format:
  # pdf: default
  html: default
number-sections: true
execute: 
  echo: false
  message: false
  warning: false
editor: 
  markdown: 
    wrap: sentence
# # Uncomment to run with Jupyter:
# jupyter: python3
---

```{r}
# Engine: knitr
```

<!-- # Reproducibility {.unnumbered} -->

<!-- <details>

```{r}
#| name: r-setup
#| include: false
library(sf)
library(tmap)
library(dplyr)
library(ggplot2)
library(stplanr)

tmap_mode("plot")

rnet_x = sf::read_sf("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_x_ed.geojson")
rnet_y = sf::read_sf("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_y_ed.geojson")
```

To contribute to the papers written as quarto documents (with `.qmd` extensions) like this one, we recommend using the Quarto extension for VS Code.
You can go into the visual editor with the following shortcut:

```         
Ctrl+Shift+F4
```

You can then add citations with Ctrl+Shift+F11 and benefit from Quarto's other features for academic writing.

</details> -->

# Abstract {.unnumbered}

Route network datasets are central to transport models as key inputs and outputs.
The complexity of route network inputs from sources such as OpenStreetMap has increased over time, enabling more precise modelling of sustainable modes such as walking and cycling.
However, this complexity can affect the visualisation of model results.
A common issue is the presence of multiple parallel ways on the same corridor, which can lead to visual artefacts and misinterpretation of model outputs.
To address this challenges, we present and compare two methods for *simplifying* route network datasets: 1) image skeletonization and 2) Voronoi diagram-centreline identification.
These methods have real-world applications, as illustrated in the "Simplified network" layer in the Transport for Scotland funded Network Planning Tool, which is publicly available at [www.npt.scot](https://www.npt.scot).
Based on open access data and the open source `parenx` Python package, available on the Python Package Index, these methods are not only reproducible but also adaptable, enbabling their use in new contexts.

# Introduction

```{=html}
<!-- A wide range of data types can be used as inputs and outputs transport models, including route network datasets, origin-destination data, movement patterns captured by global positioning systems (GPS), and information on surface characteristics derived from remote sensing imagery.[^1]
Of these, route network datasets are unusual because they are commonly used as both inputs and outputs of transport models.

[^1]: See the [online documentation](https://transportgeography.org/contents/methods/network-data-models/).

As highlighted in the field of transport geography, network data models serve as digital representations of transportation networks, crucial for planning and operational purposes.
These models, enriched through graph theory, enclosed the complex, multi-modal nature of transportation data across various jurisdictions.
This complexity is particularly relevant when considering the enhancement of route network datasets for different use cases, offering widely applicable benefits.

This raises questions about what transport network datasets are, and how they can be optimized for more effective decision-making.
An intuitive definition is that route network datasets are digital representations of footpaths, cycleways, highways and other *ways* (to use the OpenStreetMap terminology) along which people and goods can travel. -->
```

Datasets representing route networks are important in every stage of transport planning.
Transport network datasets are *spatial networks* composed of nodes and edges, in which each edge has an associated edge cost (its weighted or unweighted length) [@barthélemy2011].
Nodes (points connecting edges) and the edges that connect them (lines between nodes representing ways) are located in space, with the edges representing the physical infrastructure of the transport network, often with attributes such as the type of way, its physical characteristics, and usage data, e.g. daily traffic for each way.
Route network simplification can be applied to both the input and output networks of transport models, with the aim of reducing complexity while preserving the spatial structure of the network and relevant attributes.

<!--
File formats for representing route networks include Transportation Network Test Problem files (TNTP and stored as a series of `.tntp` plain text files, examples of which can be found in [github.com/bstabler/TransportationNetworks](https://github.com/bstabler/TransportationNetworks)), `.DAT` files used by the proprietary SATURN transport modelling system and XML-based `.osm` or `.pbf` files that encode OpenStreetMap data.
Geographic file formats implementing the 'Simple Features Access' standard developed by the Open Geospatial Consortium ([OGC](https://www.ogc.org/standard/sfa/)), such as `.shp` and `.geojson`, can also be used to represent route networks.
-->

Growing availability of high resolution geographic datasets and performant hardware and software has enabled people (e.g. via OpenStreetMap) and mapping agencies to generate increasingly detailed maps, a trend that is set to continue.
Sustainable transport planning benefits from this trend, but the complexity and intricacy of street network geometries can create problems.
A clear and intuitive visual representation is crucial for identifying issues such as bottlenecks, congestion hotspots, and areas of poor accessibility.
Consequently, the necessity for network simplification becomes evident, aligning with wider 'map generalization' methods for pre-processing datasets depending on the scale of analysis [@sutton1998].
25 years since Sutton's paper on the topic, simplification of networks for transport planning and other applications remains an unsolved challenge. -->

Vector geometry simplification methods include Douglas-Peucker and Visvalingam-Whyatt algorithms [@liu2020vector; @de2014efficient].
While preserving the overall shape and geographical accuracy, this method struggles to reduce network complexity.
Vector smoothing approaches, including the use of Bezier curves [@pradhan2023modified] and Kernel-based smoothing [@duong2022statistical], can create more visually appealing lines or polygons, but do not reduce the number of vertices.

A common approach is simplification through converstion to an intermediate polygon (buffer) layer.
Numerous algorithms for such 'medial axis' calculationsjave been published open source software repositories [e.g. @smogavec2012; @centerli2023].
Recent implementations of network simplification methods include the under-development `neatnet` Python package [@fleischmann2024], and the `parenx` package which is available on the Python Package Index [@deakin2024].
The latter is used in this paper.

Other network simplification methods for transport planning include automatic detection of 'face artifacts' [@fleischmann] and removal of 'slivers' to generate simplified representations of 'street blocks' [@grippa2018].
However, these methods tend to be 'all or nothing' and do not provide flexibility in terms of the level of simplification or which features are removed.

Network simplification can be useful for a wide range of applications.
Riverine research and flood mapping applications, for example, require estimates of many river characteristics, including simplified centerlines derived from datasets representing river banks, as implemented in software including the R package `cmgo` [@golly2017] and an approach implemented in Google Earth Engine called RivWidthCloud [@yang2020].
The `riverdist` R package, to provide another example, provides functionality for checking whether datasets representing river channels are braided and for removing braids by returning only the shortest paths along the lengths of river centerlines, for example [@tyers2016].

```{=html}
<!-- Flexibility is important due to the wide range of transport planning use cases.
Some transport planning use cases require zoomed-in, geographically accurate and complex representations of transport networks, e.g. maps for junction design may have scales of 1:1000.
Other use cases, such as strategic network planning tools and 'planning support systems' [@page2020], typically involve zoomed-out representations of street networks at scales of 1:10,000+ beyond which dual carriageways are not relevant and can distort visualisation of networks. -->
```

The aim of this paper is to articulate the problem of complex route networks, present solutions with implementations in open source software for reproducible research, and describe applications of the methods to support more effective transport planning.
@sec-problem outlines the problem of complex route networks.
@sec-methods presents methods for route network simplification alongside results based on the example datasets.
@sec-application demonstrates the methods applied to a real transport network (Edinburgh, Scotland) and @sec-discussion concludes with a discussion of the results and future work.

# Problem definition {#sec-problem}

The problem tackled in this paper is the simplification of complex route networks.
This can be illustrated with reference to the Propensity to Cycle Tool for England (PCT) [@lovelace2017], the route networks of which are on methods for aggregating multiple overlapping routes into a route network with non-overlapping linestrings @morgan2020.
Implemented in the function `overline()` in the `stplanr` R package [@lovelace2017], the methods enable visualization of large transport networks and inform investment decisions in transport planning internationally [@lovelace2024; @félix2025].
However, this 'overline' approach, without further processing, has limitations:

-   Functionally redundant vertices are kept, leading to large file sizes and slow rendering.
-   Parallel ways that are not merged.

The latter issue is particularly problematic for visualisation of transport networks, as shown in @fig-pct [@lovelace2017].
The left panel shows Otley Road with a flow value of 818 (@fig-otley-road).
The right panel, by contrast, shows three parallel ways parallel to Armley Road with flow values of 515 (shown), 288 and 47 (values not shown) (@fig-armley-road).
Although this section of Armley road has a higher cycling potential than the section of Otley Road shown (515 + 288 + 47 \> 818), this is not clear from the visualisation.

::: {#fig-pct layout-ncol="2" width=50%}

![](images/otley-road-narrow.png){#fig-otley-road width=80%}

![](images/armley-road-narrow.png){#fig-armley-road width=80%}

![](images/otley-road-raster.png){#fig-otley-road-raster width=80%}

![](images/armley-road-raster.png){#fig-armley-road-raster width=80%}

Vector (top) and raster (bottom) visualisations of route network results in the Propensity to Cycle Tool.
Note that it is not clear from the visualisation that the corridor shown in the right hand figures (Armley Road corridor) has greater flow than the corridor shown in the left (Otley Road).
Note also the visual artefacts such as 'staircase' effects and overlapping values resulting from parallel lines along Armley Road (right panel).
Source: open access Propensity to Cycle Tool results available at www.pct.bike.

:::

<!-- A subsequent step described by @morgan2020 is to post-process the geographic representation of the transport network into a raster image, which can be used to visualise the network.
The 'rasterisation' stage can tackle some of the issues associated with multiple parallel ways, but introduces new issues, as shown in @fig-rasterisation. -->

<!-- ::: {#fig-rasterisation layout-ncol="2"}


Rasterised network results for the same corridors shown in @fig-pct.
Source: open access Propensity to Cycle Tool results available at www.pct.bike.
::: -->

<!-- The methods presented in this paper are designed to take a complex network as an input and output a simplified network, while preserving the spatial structure of the network and relevant attributes.
By reducing duplicated parallel lines and other intricacies, the outputs can enable easier-to-interpret visualisations of transport behaviour on the network patterns and behaviors. -->

# Data and Methods {#sec-methods}

```{r}
rnet_otley = sf::read_sf("./data/rnet_otley.geojson")
rnet_armley = sf::read_sf("./data/rnet_armley.geojson")
```

In this paper we use the two street networks discussed in the previous section to illustrate the methods.
See the ['`parenex` cookbook'](https://nptscot.github.io/networkmerge/cookbook.html) and [Methods](https://nptscot.github.io/networkmerge/methods.html) appendices for further details on the methods used in this paper and their application to alternative (railway based) datasets.

```{r}
#| include: false
#| label: tbl-input-data
input_data = tibble::tribble(
  ~Network, ~`N. segments`, ~Description, ~Source,
  "Otley Road", nrow(rnet_otley), "A corridor in Leeds represented by a single centreline", "Propensity to Cycle Tool (derived from OSM)",
  "Armley Road", nrow(rnet_armley), "A road in Leeds represented by multiple parallel 'braided' linestrings", "Propensity to Cycle Tool (derived from OSM)"
  )
knitr::kable(input_data)
```

There are two main challenges that need to be overcome to simplify transport networks, in a way that preserves their value:

1.  Simplifying the *geometry*
2.  Assigning attributes to the simplified network

The key contributions of the paper are the novel methods of image skeletonization, presented in @sec-simplification-via-skeletonization, and simplification with Voronoi diagrams to identify central lines, covered in @sec-simplification-via-voronoi-polygons.
<!-- To make use of simplified networks in transport planning, it is also necessary to assign attributes to the simplified network.
This is covered in @sec-joining-route-networks. -->

```{=html}
<!-- TODO: shouldn't the following topics be stand-alone subsections rather than existing within the skeletonization section?
Additionally, the section tackles challenges associated with knots at intersections, offering solutions for their removal to simplify the network's appearance.
The concept of a primal network that represents a high level of simplification is explored as well. -->
```

<!-- ## Topology-preserving simplification {#sec-topology-preserving-simplification}

Topology-preserving simplification reduces the number of vertices in a linestring while preserving (or at least attempting to preserve) the topology of the network, i.e. not merging parallel lines. -->

## Simplification via skeletonization {#sec-simplification-via-skeletonization}

The skeletonization approach generates a simplified network by buffering the network, applying an image skeletonization algorithm, and extracting lines segements from a raster of this buffer.

<!--- ### Create a projected combined buffered geometry --->

In both the skeletonization and Voronoi approaches, the network simplification process starts by applying a buffer to linear geometry in a projected, rather than coordinate system.
<!-- , achieved using the `get_geometry_buffer` function. --> We use a buffer size of 8 meters in this paper.

```{python}
#| name: python-setup-packages
#| include: false
import warnings
from functools import partial
from pathlib import Path

import geopandas as gp
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from parenx.shared import combine_line, get_geometry_buffer, get_nx, get_source_target
from parenx.skeletonize import (
    get_affine_transform,
    get_raster_line,
    get_skeleton,
    split_centres,
)
from parenx.voronoi import (
    filter_buffer,
    filter_distance,
    get_linestring,
    get_geometry_line,
    get_voronoi_line,
    get_voronoi,
    get_segment_nx,
    set_geometry,
)
from shapely import box, get_coordinates, set_precision, unary_union
from shapely.affinity import affine_transform
from shapely.geometry import MultiPoint, Point
from shapely.ops import voronoi_diagram
```

```{python}
#| include: false
plt.rcParams["figure.figsize"] = (12, 12)

CRS = "EPSG:27700"
buffer_size = 8.0
radius = buffer_size
set_precision_pointone = partial(set_precision, grid_size=0.1)

base_otley = gp.read_file("./data/rnet_otley.geojson").to_crs(CRS)
base_otley["geometry"] = base_otley["geometry"].map(set_precision_pointone)
base_otley = combine_line(base_otley["geometry"]).to_frame("geometry")
otley_geometry = get_geometry_buffer(base_otley["geometry"], radius=buffer_size)

# Same for Armley:

base_armley = gp.read_file("./data/rnet_armley.geojson").to_crs(CRS)
base_armley["geometry"] = base_armley["geometry"].map(set_precision_pointone)
base_armley = combine_line(base_armley["geometry"]).to_frame("geometry")
armley_geometry = get_geometry_buffer(base_armley["geometry"], radius=buffer_size)
```

<!-- ### Network skeletonization -->

In skeletonization, overlapping lines are identified, buffered, transformed into a raster image, the image processed through a thinning algorithm, and a skeletal representation of the original network produced (see [Methods](https://nptscot.github.io/networkmerge/methods.html) appendix for details).
This skeletal structure preserves the overall extent and connectivity of the network, with a central line that follows the centre-line of the combined buffered area.

In detail, skeletonization is only applied where more than line-segment buffer overlaps.
To identify overlapping line-segments, the buffer is split at the end of each line-segment.
The overlapping line-segments are then buffered while retaining the remaining disjoint lines.

```{=html}
<!---This visualization demonstrates the process of cutting and segmenting the buffer geometries. It highlights the transformations from the initial buffered geometries to a more segmented and manageable form, preparing them for further analysis and simplification steps​.
It effectively highlights the contrast between the more intricate and the simpler sections within these networks. --->
```

<!-- ::: {#fig-split-ends layout-ncol="2"} -->
```{python}
split_centre = partial(split_centres, offset=np.sqrt(1.5) * radius)

## overlapping
otley_centre = gp.GeoSeries(base_otley["geometry"].map(split_centre), crs=CRS)
otley_centre = otley_centre.buffer(radius, 0, join_style="round", cap_style="round")
combined_otley = gp.GeoSeries(unary_union(otley_centre.values).geoms, crs=CRS)
# combined_otley.plot()
```

```{python}
## overlapping
armley_centre = gp.GeoSeries(base_armley["geometry"].map(split_centre), crs=CRS)
armley_centre = armley_centre.buffer(radius, 0, join_style="round", cap_style="round")
combined_armley = gp.GeoSeries(unary_union(armley_centre.values).geoms, crs=CRS)
# combined_armley.plot()
```

```{=html}
<!--- Truncated and segmented buffer geometries of the Otley Road (left) and Armley Road (right) networks.
::: --->
```

```{python}
i, j = base_otley.sindex.query(combined_otley, predicate="intersects")

base_otley["class"] = -1
base_otley.loc[j, "class"] = combined_otley.index[i]
count = base_otley.groupby("class").count()
base_otley = base_otley.join(count["geometry"].rename("count"), on="class")
ix = base_otley["class"] == -1
base_otley.loc[ix, "count"] = 0

i, j = base_armley.sindex.query(combined_armley, predicate="intersects")

base_armley["class"] = -1
base_armley.loc[j, "class"] = combined_armley.index[i]
count = base_armley.groupby("class").count()
base_armley = base_armley.join(count["geometry"].rename("count"), on="class")
ix = base_armley["class"] == -1
base_armley.loc[ix, "count"] = 0
```

```{python}
ix = base_otley["count"].isin([0, 1])
p = base_otley.loc[~ix, "geometry"].copy()
p = p.buffer(radius, 512, join_style="round", cap_style="round")
try:
    p = gp.GeoSeries(list(unary_union(p.values).geoms), crs=CRS)
except AttributeError:
    p = gp.GeoSeries(unary_union(p.values), crs=CRS)

q = base_otley.loc[ix, "geometry"].buffer(0.612, 64, join_style="mitre", cap_style="round")
otley_segment = pd.concat([p, q])
try:
    otley_segment = gp.GeoSeries(list(unary_union(otley_segment.values).geoms), crs=CRS)
except AttributeError:
    otley_segment = gp.GeoSeries(unary_union(otley_segment.values), crs=CRS)
# otley_segment.plot()
```

```{python}
#| include: false
ix = base_armley["count"].isin([0, 1])
p = base_armley.loc[~ix, "geometry"].copy()
p = p.buffer(radius, 512, join_style="round", cap_style="round")
try:
    p = gp.GeoSeries(list(unary_union(p.values).geoms), crs=CRS)
except AttributeError:
    p = gp.GeoSeries(unary_union(p.values), crs=CRS)

q = base_armley.loc[ix, "geometry"].buffer(0.612, 64, join_style="mitre", cap_style="round")
armley_segment = pd.concat([p, q])
try:
    armley_segment = gp.GeoSeries(list(unary_union(armley_geometry.values).geoms), crs=CRS)
except AttributeError:
    armley_segment = gp.GeoSeries(unary_union(armley_segment.values), crs=CRS)
# armley_segment.plot()
```

<!-- Segmented line-buffer geometries (top) and geometries to be skeletonized (bottom) for the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->

As detail is lost in transforming of the geometry to an image buffer or raster, more detail can be retained by using an affine transformation to increase the number of points in the buffer prior to skeletonization and reducing scale when creating the simplified linear geometric representation.
This transformation is scaled to ensure that the projected coordinate geometry of the network aligns accurately with the corresponding dimensions of the scaled raster image.
The raster image also requires pre-processing to eliminate small holes that appear where buffered lines run parallel or intersect at shallow angles.
The skeltonisation algorithm is then appled to the raster image yielding a skeletal raster image and converted back into a linear vector geometry, completing the vector-to-raster-to-vector geometry transformation (see [Methods](https://nptscot.github.io/networkmerge/methods.html) appendix for details).

```{python}
import rasterio.features as rif

r_matrix_otley, s_matrix_otley, out_shape_otley = get_affine_transform(otley_geometry, scale=2.0)
# For Armle
r_matrix_armley, s_matrix_armley, out_shape_armley = get_affine_transform(armley_geometry, scale=2.0)
```


```{python}
from skimage.morphology import remove_small_holes, skeletonize
import rasterio.plot as rip
```

<!-- ::: {#fig-rasterize layout-ncol="2"}
```{python}
otley_im = rif.rasterize(otley_segment.values, transform=r_matrix_otley, out_shape=out_shape_otley)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    otley_im = remove_small_holes(otley_im, 20).astype(np.uint8)

# rip.show(otley_im, cmap="Greys", title="buffer geometry")
```

```{python}
armley_im = rif.rasterize(armley_segment.values, transform=r_matrix_armley, out_shape=out_shape_otley)
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    armley_im = remove_small_holes(armley_im, 20).astype(np.uint8)
# rip.show(armley_im, cmap="Greys", title="buffer geometry")
```

Rasterized versions of the Otley Road (left) and Armley Road (right) networks, with post processing to remove small holes.
::: -->

<!---The image undergoes a thinning process, yielding a skeletal raster image as the result.
This skeletonized image effectively captures the essential structure and layout of the original network, as illustrated in @fig-thin-skeleton. --->

<!-- ::: {#fig-thin-skeleton layout-ncol="2"} -->
```{python}
otley_skeleton = skeletonize(otley_im).astype(np.uint8)
# rip.show(otley_skeleton, cmap="Greys", title="skeleton geometry")
otley_p = np.stack(np.where(otley_skeleton >= 1))
otley_point = gp.GeoSeries(map(Point, otley_p.T), crs=CRS)
```

```{python}
armley_skeleton = skeletonize(armley_im).astype(np.uint8)
# rip.show(armley_skeleton, cmap="Greys", title="skeleton geometry")
armley_p = np.stack(np.where(armley_skeleton >= 1))
armley_point = gp.GeoSeries(map(Point, armley_p.T), crs=CRS)
```

<!-- Skeletonized versions of the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->

<!-- ### Recreating a linear geometry -->

<!-- Figure commented out as not necessary and similar to the subsequent figure: -->

<!-- ::: {#fig-skeleton-vector layout-ncol="2"} -->

```{python}
shapely_transform = partial(affine_transform, matrix=s_matrix_otley)
otley_transform = otley_point.map(shapely_transform).map(set_precision_pointone)
# otley_transform.plot(edgecolor="black", color="blue").grid()
# plt.show()
```

```{python}
armley_transform = armley_point.map(shapely_transform).map(set_precision_pointone)
# armley_transform.plot(edgecolor="black", color="blue").grid()
# plt.show()
```

```{=html}
<!-- Skeletonized versions of the Otley Road (left) and Armley Road (right) networks, transformed back into point geometry.

::: -->
```

Adjacent points are identifed, typically within a 1x1 pixel square, based on proximity within the raster image coordinate system.
Line segments are then created by connecting these adjacent points.
These points are combined, giving a continuous line geometry representing the simplified network.
Finally, a reverse scaling affine transformation is applied to return to the original coordinate system.
<!-- This conversion from point to line geometry is a pivotal aspect of network simplification. -->

Noting that creating a line geometry from the set of points in the raster buffer is arguable the most complex step.

<!-- The resulting simplified network is illustrated in @fig-skeleton-line. -->

```{python}
otley_line = get_raster_line(otley_point, True)
armley_line = get_raster_line(armley_point, True)
```

<!-- ::: {#fig-skeleton-line layout-ncol="2"} -->
```{python}
shapely_transform = partial(affine_transform, matrix=s_matrix_otley)
otley_sk = otley_line.map(shapely_transform).map(set_precision_pointone)
otley_sk = otley_sk.set_crs(CRS)
# otley_sk.plot()
```

```{python}
shapely_transform = partial(affine_transform, matrix=s_matrix_armley)
armley_sk = armley_line.map(shapely_transform).map(set_precision_pointone)
armley_sk = armley_sk.set_crs(CRS)
# armley_sk.plot()
```

<!-- Simplified versions of the Otley Road (left) and Armley Road (right) networks, transformed back into line geometry. -->
<!-- ::: -->

## Simplification via Voronoi polygons {#sec-simplification-via-voronoi-polygons}

Voronoi simplification takes the buffered network segments and converts them into a set of points.
The edges of these buffers are then segmented into sequences of points.
From these sequences, a centre-line is derived based on a set of Voronoi polygons that cover these points.
This approach facilitates the creation of a simplified network representation by focusing on the central alignment of the buffered lines.


```{python}
scale = 10.0
tolerance = 0.1

otley_clip = box(426800, 437400, 427000, 437600)
armley_clip = box(427200, 433500, 427400, 433700)
```

<!-- ::: {#fig-boundary layout-ncol="2"} -->
```{python}
otley_boundary = get_geometry_line(otley_geometry)
# otley_boundary.plot()
```

```{python}
armley_boundary = get_geometry_line(armley_geometry)
# armley_boundary.plot()
```

<!-- Simplified boundaries of the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->

<!-- @fig-segment showcase the conversion of segmented LineString geometries into point geometries. -->
<!-- This essential transformation forms the basis for constructing Voronoi diagrams. -->

<!-- ::: {#fig-segment layout-ncol="2"} -->
```{python}
otley_segment = get_segment_nx(otley_boundary, scale).reset_index(drop=True)
# ax = otley_segment.clip(otley_clip).plot(edgecolor="red", linestyle='--', linewidth=1)
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```

```{python}
armley_segment = get_segment_nx(armley_boundary, scale).reset_index(drop=True)
# ax = armley_segment.clip(armley_clip).plot(edgecolor="red", linestyle='--', linewidth=1)
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```

<!-- Detail segmented boundaries of the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->

<!-- @fig-voronoi-point, the process of converting the segmented LineString geometries into point geometries is illustrated. -->
<!-- This transformation is essential for the creation of Voronoi diagrams. -->

<!-- ::: {#fig-voronoi-point layout-ncol="2"} -->
```{python}
otley_point = otley_segment.loc[:, "geometry"].map(get_coordinates).explode()
otley_point = MultiPoint(otley_point[::2].map(Point).values)
nx_output = gp.GeoSeries(otley_point, crs=CRS)
# ax = nx_output.clip(otley_clip).plot(edgecolor="blue", color="white")
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```

```{python}
armley_point = armley_segment.loc[:, "geometry"].map(get_coordinates).explode()
armley_point = MultiPoint(armley_point[::2].map(Point).values)
nx_output = gp.GeoSeries(armley_point, crs=CRS)
# ax = nx_output.clip(armley_clip).plot(edgecolor="blue", color="white")
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```

<!-- Detail point segement of the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->

```{python}
shapely_transform = partial(affine_transform, matrix=s_matrix_armley)
armley_sk = armley_line.map(shapely_transform).map(set_precision_pointone)
armley_sk = armley_sk.set_crs(CRS)
shapely_transform = partial(affine_transform, matrix=s_matrix_otley)
otley_sk = otley_line.map(shapely_transform).map(set_precision_pointone)
otley_sk = otley_sk.set_crs(CRS)
```

<!-- we probably want to pick Voronoi #1 or Voronoi #2. I'd marginally favour #2 -->

<!-- In @fig-voronoi-2, the generation and clipping of the corresponding Voronoi diagrams to the bounds of the input geometry is depicted. -->

<!-- ::: {#fig-voronoi layout-ncol="2"} -->

```{python}
otley_envelope = box(*otley_point.bounds)
otley_voronoi = voronoi_diagram(otley_point, envelope=otley_envelope, tolerance=tolerance, edges=True)
otley_voronoi = gp.GeoSeries(map(set_precision_pointone, otley_voronoi.geoms), crs=CRS)
#ax = otley_voronoi.plot()
#ax.xaxis.set_ticklabels([])
#ax.yaxis.set_ticklabels([])
```

```{python}
armley_envelope = box(*armley_point.bounds)
armley_voronoi = voronoi_diagram(armley_point, envelope=armley_envelope, tolerance=tolerance, edges=True)
armley_voronoi = gp.GeoSeries(map(set_precision_pointone, armley_voronoi.geoms), crs=CRS)
#ax = armley_voronoi.plot()
#ax.xaxis.set_ticklabels([])
#ax.yaxis.set_ticklabels([])
```

```{=html}
<!-- Clipped Voronoi diagrams of the Otley Road (left) and Armley Road (right) networks.

<!-- ::: -->
```

<!-- ### Voronoi 2 -->

<!-- ::: {#fig-voronoi-2 layout-ncol="2"} -->
```{python}
otley_voronoi = otley_voronoi.explode(index_parts=False).clip(otley_envelope)
ix = ~otley_voronoi.is_empty & (otley_voronoi.type == "LineString")
otley_voronoi = otley_voronoi[ix].reset_index(drop=True)
# ax = otley_voronoi.plot()
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```

```{python}
armley_voronoi = armley_voronoi.explode(index_parts=False).clip(armley_envelope)
ix = ~armley_voronoi.is_empty & (armley_voronoi.type == "LineString")
armley_voronoi = armley_voronoi[ix].reset_index(drop=True)
# ax = armley_voronoi.plot()
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```

<!-- Clipped Voronoi diagrams of the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->

<!-- ### Voronoi simplified network -->

Voronoi lines that are completely enclosed within the buffer geometry and are situated at a distance of less than half the buffer's width from the buffer edge are selected.
This selective visualization of Voronoi lines effectively demonstrates the method precision in capturing and representing the central alignment of the transport network within its buffered confines.
The final center-line network is created through a clean up process that involves the removal of knot-like features from the simplified network.
While knots common to both skeletonization and Voronoi simplification, these artefacts are more prevalent on Voronoi simplified networks.

<!-- ::: {#fig-voronoi-simplified layout-ncol="2"} -->
```{python}
offset = buffer_size / 2.0

otley_line = filter_distance(otley_voronoi, otley_boundary, offset)
# otley_line.plot()
```

```{python}
armley_line = filter_distance(armley_voronoi, armley_boundary, offset)
# armley_line.plot()
```

<!-- Voronoi diagram lines with lines that are completely within the buffer geometry and less than half-a-buffer-width from the buffer edge. -->
<!-- ::: -->

<!--- ### Voronoi simplified network --->


::: {#fig-skeleton-line layout-ncol="3"}

```{python}
base_otley.plot(edgecolor="blue", color="blue")
```

```{python}

otley_sk.plot()
```

```{python}
otley_line = filter_buffer(otley_line, otley_geometry)
otley_edge, otley_node = get_source_target(otley_line.to_frame("geometry"))
ix = otley_node["count"] < 4
otley_square = otley_node[ix].buffer(offset, cap_style="square", mitre_limit=offset)
otley_square = gp.GeoSeries(unary_union(otley_square.values).geoms, crs=CRS)
otley_line = otley_edge["geometry"].map(get_linestring).explode().to_frame("geometry")
otley_line = set_geometry(otley_line, otley_square)
otley_line = combine_line(otley_line)
otley_line.plot()
```

```{python}
base_armley.plot(edgecolor="blue", color="blue")
```


```{python}

armley_sk.plot()
```

```{python}
armley_line = filter_buffer(armley_line, armley_geometry)
armley_edge, armley_node = get_source_target(armley_line.to_frame("geometry"))
ix = armley_node["count"] < 4
armley_square = armley_node[ix].buffer(offset, cap_style="square", mitre_limit=offset)
armley_square = gp.GeoSeries(unary_union(armley_square.values).geoms, crs=CRS)
armley_line = armley_edge["geometry"].map(get_linestring).explode().to_frame("geometry")
armley_line = set_geometry(armley_line, armley_square)
armley_line = combine_line(armley_line)
armley_line.plot()
```

Original and simplified versions of the Otley Road (top) and Armley Road (bottom) networks.
From left to right: original network, skeletonized network, Voronoi simplified network.

:::

## Post-Processing

Both skeletonization and Voronoi simplified networks require post-processing to remove unwanted knots, short segments at intersections, resembling tangled knots.
To remove these features, short segments are clustered together, and a central point for each cluster is determined.
The end-points of longer lines that connect to these segment clusters are then realigned to the cluster's central point, as illustrated in the [Methods](https://nptscot.github.io/networkmerge/methods.html) appendix.

An additional optional stage is to simplify the network further by removing vertices that are not essential for the network's connectivity, resulting in a primal network that captures the essential connectivity and layout of transport routes.
The primal network is thus composed of direct lines connecting start and end points, representing a high level of simplification that prioritises the network's structure and compression.

<!-- The primal networks for the Otley Road and Armley Road skeletonized networks are illustrated in @fig-primal-skeleton. -->

<!-- ::: {#fig-primal-skeleton layout-ncol="2"} -->
```{python}
otley_edge_sk = get_nx(otley_sk)
# otley_edge_sk.plot()
```

```{python}
armley_edge_sk = get_nx(armley_sk)
# armley_edge_sk.plot()
```

<!-- Primal skeletonized networks for the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->

<!-- ### Primal network -->

<!-- @fig-primal-voronoi illustrates the primal network derived from the Voronoi approach. -->

<!-- ::: {#fig-primal-voronoi layout-ncol="2"} -->
```{python}
otley_edge = get_nx(otley_line)
# otley_edge.plot()
```

```{python}
armley_edge = get_nx(armley_line)
# armley_edge.plot()
```

<!-- Primal Voronoi networks for the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->

<!-- ![](images/paste-1.png) -->

See Section 4 in [Methods](https://nptscot.github.io/networkmerge/methods.html) appendix for details.


## Joining route networks {#sec-joining-route-networks}

After generating a simplified network using the methods, it is often useful to translate attributes from the original network to the simplified network.
 <!-- described in the previous sections or through an alternative approach, the next crucial step involves transferring attribute values from the detailed network to the simplified one.
This process is commonly referred to as 'conflation' and 'integration'.
Conflation is essential because while the source file (detailed network) might be rich in attributes like street names, address ranges, and zip codes, it may lack positional accuracy.
Conversely, the target file (simplified network) is likely to be positionally precise but deficient in detailed attributes. -->
<!-- As noted by [@sutton1998], this process involves two key aspects: the geometric integration, involving the link and node feature elements, and the integration of attributes such as highway data. -->
In the context of network planning tools, the purpose of the joining stage is to join traffic estimates from the source network onto the geometrically simplified target network [@sutton1998].
<!-- This 'joining' step is vital for using simplified networks as the basis for presenting model outputs generated on a complex network in a easy-to-interpret form. -->
<!-- In instances where a simplified version of the network is readily available, such as the Ordnance Survey's Open Roads dataset in the UK, the steps for network simplification can be bypassed to save time.
We have implemented the steps presented in this section in the `rnet_merge()` function in the `stplanr` R package. -->

<!-- The process is analogous to joining two datasets based on a common 'key' variable. -->
In this case there is no definitive key, meaning that network joining can be regarded as a 'fuzzy' or 'keyless' join process [@suri; @wachowicz2019]: as with the network simplification steps outlined above, the user must select joining parameters to maximise the accuracy of the join.
There are at least a couple of implementations of network joining methods in open source software: the `rnet_merge()` function in the `stplanr` R package [@lovelace2019], and the [`rnetmatch` Rust crate](https://github.com/nptscot/rnetmatch/tree/main/rust) for which has bindings to R and Python are planned.
The details of network joining methods, algorithms and implementations are outside the scope of this paper, see the documentation associated with the projects mentioned above for more information.

<!-- **Data Preparation:** -->

<!-- The main inputs of the network merging function are two route networks: `rnet_x` (the simple network geometry) and `rnet_y` (the original network with a more complex geometry and attributes that need to be translated onto the new network). -->

<!-- `rnet_y` in this case represents the detailed network with attributes such as model outputs representing transport flows to be translated onto the new network. -->

<!-- **Step 1: Coordinate Reference System Alignment** -->

```{=html}
<!-- Transform the spatial data of both `rnet_x` (the simplified network) and `rnet_y` (the detailed network) to the same coordinate reference system.
For this project, we have selected EPSG:27700. -->
```

```{r}
#| echo: false
#| eval: false
rnet_xp = sf::st_transform(rnet_x, "EPSG:27700")
rnet_yp = sf::st_transform(rnet_y, "EPSG:27700")
```

<!-- **Step 2: Defining the Function List** -->

```{=html}
<!-- Create a function list that dictates how each attribute is to be processed:

-   **Exclude**: "geometry" from processing.

-   **Mean Function**: Applied to attributes like "Gradient" and "Quietness".
    TODO expline the funtion of sum and mean

-   **Sum Function**: Used for aggregating values in columns such as "all_bicycle".

**Step 3: Using `stplanr::rnet_merge` for Integration**

Employ the `rnet_merge` function from the `stplanr` package.
This function is designed to merge route network data, taking into account the predefined functions and alignment in the coordinate system. -->
```

```{r}
#| eval: false
#| echo: false
rnet_merged = stplanr::rnet_merge(rnet_xp, rnet_yp, dist = 20, segment_length = 10, funs = funs, max_angle_diff = 30) 
```

```{=html}
<!-- The network joining function can also take a number of arguments that define how the attributes are translated to the simplified target network:

-   **`dist`** (Buffer Distance): This parameter defines the buffer zone around `rnet_xp` in meters, for determining the proximity at which features from both networks are considered for merging.
    Typically, this value is refined to approximate the width of streets, ensuring a realistic spatial correlation between the network elements.

-   **`segment_length`** (Maximum Segment Length): This optional argument (with 0 being the default meaning no splitting) specifies the maximum length of segments in `rnet_y` before they are split.
    Segmenting long segments in the detailed network reduces number of source geometries that do not fit within target geometries, which can be key for achieving higher accuracy in attribute integration. -->
```

```{=html}
<!-- -   **`funs`** (Function List): Comprises a series of key-value pairs representing variable names and the function to apply to each.
    Any function can be used, with sum and mean being typical values.

-   **`max_angle_diff`** (Maximum Angular Difference): This argument specifies the maximum angular difference between segments and target lines for values in matching source geometries to be translated to the target simplified geometries.
    A low value, such as 20 degrees, ensures that values are translated only to segments with similar orientations, preventing overestimation of values on side roads.
    @fig-max_angle_diff demonstrates the impact of this parameter in preventing the overestimation of values on side roads. -->
```

```{=html}
<!-- ::: {#fig-max_angle_diff layout-ncol="1"}
![The effect of setting `max_angle_diff` to 20 degrees compared to a null value, illustrating the reduction in value overestimation on side roads](pics/Prevent%20overestimate%20of%20values%20on%20sideroads.png)

The effect of setting `max_angle_diff` to 20 degrees compared to a null value, illustrating the reduction in value overestimation on side Roads.
::: -->
```

```{=html}
<!-- An optional `sum_flows` argument can be used to ensure that more influential lines have a proportionate impact on the final aggregated value for each geometry in the simplified network.
The formula for the weighted sum is:

$$
\text{Weighted Sum} = \sum (x_i \times w_i)
$$

Where `xi` represents the attribute value for the i-th line in the complex network, and `wi` is the length weight for the i-th line. -->
```

```{=html}
<!-- An optional normalization step adjusts the values associated with each feature in the simplified network.
This adjustment ensures that the total flow values of both the complex and simplified networks are equivalent.
The need for normalization arises from the fact that simplified networks typically have a reduced total length and are less circuitous compared to their complex counterparts. -->
```

```{=html}
<!-- The normalization process is governed by the following formula:

$$
\text{Normalized Value} = \frac{\text{Weighted Sum}}{\sum w_i}
$$

Additionally, to account for potential overestimation of values in the simplified network, the following formula is applied:

$$
\text{over\_estimate} = \frac{\sum(\text{Aggregated Value} \times \text{Length of Single Line})}{\sum(\text{Original Value} \times \text{Length of Lines in Complex Network})}
$$

The final step involves adjusting the normalized value to correct any overestimation, as calculated above:

$$
\text{Adjusted Value} = \frac{\text{Normalized Value}}{\text{over\_estimate}}
$$

Through these steps, the normalized and adjusted values more accurately reflect the true distribution and intensity of flows within the simplified network, preserving the overall data integrity. -->
```

<!-- Content for long form appendix?: -->

```{=html}
<!-- To calculate max_angle_diff, it's necessary to accurately calculate the angles (bearings) of lines from two networks, rnet_x$angle_x and rnet_y$angle_y.
The difference in these angles is then computed as rnetj$angle_diff <- rnetj$angle_y - rnetj\$angle_x.
In this study, the line_bearing function is defined to calculate the angle.
As shown below,
       
rnet$angle = line_bearing(rnet_y, bidirectional = TRUE)

The process of calculating the bidirectional bearing of a line segment in geographical coordinates involves normalizing the standard bearing so that it is the same regardless of the direction of travel along the line.
Below is the detailed formula:

1.  **Convert Latitudes and Longitudes to Radians**:

$$
\text{lat}_1 = \text{lat}_1 \times \frac{\pi}{180}
$$

$$
\text{lat}_2 = \text{lat}_2 \times \frac{\pi}{180}
$$

$$
\text{lon}_1 = \text{lon}_1 \times \frac{\pi}{180}
$$

$$
\text{lon}_2 = \text{lon}_2 \times \frac{\pi}{180}
$$

2.  **Calculate Bearing**:

$$
\Delta lon = lon_2 - lon_1
$$

$$
\theta = atan2(\sin(\Delta lon) \times \cos(lat_2), \cos(lat_1) \times \sin(lat_2) - \sin(lat_1) \times \cos(lat_2) \times \cos(\Delta lon))
$$

3.  **Convert Bearing from Radians to Degrees and Normalize Bidirectional Bearing to \[0, 180) Degrees**:

$$
\theta = \theta \times \frac{180}{\pi}
$$

$$
\theta = \theta \mod 180
$$

In this bidirectional adjustment, the bearing is normalized to a range of \[0, 180) degrees.
This means that opposite directions along a line yield the same bearing value.
This approach is particularly useful in scenarios where the directionality of the line is not a primary concern, ensuring consistency in bearing values regardless of the line's travel direction. 
-->
```

<!-- ![image](pics/Prevent_overestimate_of_values_on_sideroads.png) -->

```{r, eval = FALSE}

rnet_xp = st_transform(rnet_x, "EPSG:27700")
rnet_yp = st_transform(rnet_y, "EPSG:27700")

# Extract column names from the rnet_yp
name_list = names(rnet_yp)

# Initialize an empty list
funs = list()

# Loop through each name and assign it a function based on specific conditions
for (name in name_list) {
  if (name == "geometry") {
    next  # Skip the current iteration
  } else if (name %in% c("Gradient", "Quietness")) {
    funs[[name]] = mean
  } else {
    funs[[name]] = sum
  }
}

dist = 20
rnet_merged_with_angle = stplanr::rnet_merge(rnet_xp, rnet_yp, dist = dist, segment_length = 10, funs = funs, max_angle_diff = 20)
rnet_merged_without_angle= stplanr::rnet_merge(rnet_xp, rnet_yp, dist = dist, segment_length = 10, funs = funs)

# Define breaks for the color scale
brks = c(0, 100, 500, 1000, 5000)

# Set global tmap options for tighter margins
tmap_options(
  outer.margins = c(0, 0, 0, 0),
  inner.margins = c(0, 0, 0, 0)
)

# Create the first map with scale bar
m1 = tm_shape(rnet_merged_with_angle) + 
     tm_lines("value", palette = "viridis", lwd = 5, breaks = brks) +
     tm_scale_bar() +
     tm_layout(frame = FALSE, inner.margins = 0, outer.margins = 0, asp = 0)

# Create the second map
m2 = tm_shape(rnet_merged_without_angle) + 
     tm_lines("value", palette = "viridis", lwd = 5, breaks = brks) +
     tm_layout(frame = FALSE, inner.margins = 0, outer.margins = 0, asp = 0)

# Arrange the two maps vertically with a tight layout and synchronization
tmap_arrange(m1, m2, nrow = 2, sync = TRUE)
```

```{=html}
<!-- After the joining process has completed, a series of post-processing steps can refine the results:

-   **Cleaning**: The `rnet_merged` dataset undergoes an initial cleaning phase where superfluous columns are removed.
    This step enhances dataset manageability and focuses on relevant data by eliminating columns such as 'identifier' and 'length_x'.

-   **Dimensionality Reduction**: Z and M dimensions are removed from the dataset, as they are redundant for our analysis and contribute to unnecessary increases in file size.
    This reduction simplifies the dataset and optimizes it for storage and processing efficiency.

-   **NA Handling**: A thorough examination is conducted for the presence of NA values across the columns.
    Any rows where all the targeted columns contain NA values are filtered out.

-   **Spatial Subsetting**: Following the removal of linestrings that contain NA attributes from the `rnet_merged` dataset, we proceed to create a geometric buffer zone.
    This zone serves as a spatial criterion for identifying and retaining only those geometries that lie outside of this buffer.
    A subset of `rnet_yp`, termed `rnet_yp_rest`, is then delineated based on this spatial relationship, effectively omitting any geometries that intersect with or fall within the buffered area.
    This step ensures that only those components of `rnet_y` which are spatially distinct from `rnet_merged` are retained for further consideration.

-   **Network Simplification**: In the final stage of simplification, the datasets `rnet_yp_rest` and `rnet_merged_all` are combined to form the final network, which retains the simplicity of the original simplified structure while concurrently encompassing the detailed attributes, thereby providing a comprehensive yet efficient foundation for further analysis.

These post-processing steps are adaptable and can be customized according to the specific requirements of the analysis. -->
```

```{python}
#| eval: false
rnet_x = gp.read_file("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_x_ed.geojson")
rnet_y = gp.read_file("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_y_ed.geojson")
# # Plotting both GeoJSON files side by side
# fig, axes = plt.subplots(1, 2, figsize=(15, 8))

# rnet_x.plot(ax=axes[0], color='blue')
# axes[0].set_title('RNet X')

# rnet_y.plot(ax=axes[1], color='green')
# axes[1].set_title('RNet Y')

# plt.tight_layout()
# plt.show()
```

<!-- TODO: add content to this section. -->

<!-- TODO: Is this possible? -->

<!-- ## Combined network simplification and attribute merging -->

# Application to Edinburgh City Centre {#sec-application}

<!-- In this section, we delve into the practical application of our proposed route network simplification techniques, focusing on a complex urban road network situated in the heart of Edinburgh.
The key characteristics of the Edinburgh city center are detailed in @tbl-new-datasets. -->

```{r}
#| label: tbl-new-datasets
#| include: false
dataset_table = tribble(
  ~Dataset,               ~GeometryType, ~NumberOfSegments, ~NumberOfNodes, ~Complexity,
  "Edinburgh City Centre", "LineString", 1144,              4603,           5265832
  # "Doncaster Rail Network","LineString", 693,               11728,          8127504
)

knitr::kable(dataset_table)
```

<!--- Characteristics of Datasets: Edinburgh city centre and the Doncaster rail network. -->

````{=html}
<!-- ```{python}
import geopandas as gpd

# Load the geojson files
princes_street_geojson = gpd.read_file('data/rnet_princes_street.geojson')
doncaster_rail_geojson = gpd.read_file('data/rnet_doncaster_rail.geojson')

# Function to calculate the number of nodes in a LineString/MultiLineString geometry
def count_nodes(geometry):
    # For LineString, count start and end points
    if geometry.type == 'LineString':
        return len(geometry.coords)
    # For MultiLineString, count start and end points of each component LineString
    elif geometry.type == 'MultiLineString':
        return sum(len(line.coords) for line in geometry)
    else:
        return 0

# Calculate the characteristics for each dataset
princes_street_info = {
    "Geometry Type": princes_street_geojson.geometry.iloc[0].type,
    "Number of Segments": princes_street_geojson.shape[0],
    "Number of Nodes": princes_street_geojson.geometry.apply(count_nodes).sum(),
    # Complexity could be defined in many ways, here we use a simple measure: segments * nodes
    "Complexity": princes_street_geojson.shape[0] * princes_street_geojson.geometry.apply(count_nodes).sum()
}

doncaster_rail_info = {
    "Geometry Type": doncaster_rail_geojson.geometry.iloc[0].type,
    "Number of Segments": doncaster_rail_geojson.shape[0],
    "Number of Nodes": doncaster_rail_geojson.geometry.apply(count_nodes).sum(),
    # Complexity could be defined in many ways, here we use a simple measure: segments * nodes
    "Complexity": doncaster_rail_geojson.shape[0] * doncaster_rail_geojson.geometry.apply(count_nodes).sum()
}

# princes_street_info, doncaster_rail_info
``` -->
````

@fig-results illustrates the methods presented in this paper, providing a comparative view of the impact of our methodology.
The skeletonized network (top right) is a simplified version of the input dataset, with the Voronoi diagram (bottom left) providing an alternative simplification.
Finally, the primal network (bottom right) is the most simplified version of the input dataset, with only the most essential connections retained.

```{python}
PLOT_DATA = ("base_armley,base_edinburgh,base_doncaster,armley_sk,edinburgh_sk,doncaster_sk,"
             "armley_line,edinburgh_line,doncaster_line,armley_edge_sk,edinburgh_edge_sk,"
             "doncaster_edge_sk,armley_edge,edinburgh_edge,doncaster_edge").split(",")

for variable in PLOT_DATA:
    filepath = Path(f"data/rnet_{variable}.geojson")
    if variable in globals():
        continue
    if filepath.exists():
        data = gp.read_file(filepath)
        globals()[variable] = data
```

```{python}
SCALE=2.0
if "edinburgh_sk" not in globals():
    base_edinburgh = gp.read_file("./data/rnet_princes_street.geojson").to_crs(CRS)
    edinburgh_geometry = get_geometry_buffer(base_edinburgh["geometry"], radius)
    r_matrix_edinburgh, s_matrix_edinburgh, out_shape_edinburgh = get_affine_transform(
        edinburgh_geometry, scale=SCALE
    )
    shapely_transform = partial(affine_transform, matrix=s_matrix_edinburgh)
    edinburgh_skeleton = get_skeleton(
        edinburgh_geometry, r_matrix_edinburgh, out_shape_edinburgh, SCALE
    )
    edinburgh_p = np.stack(np.where(edinburgh_skeleton >= 1))
    edinburgh_point = gp.GeoSeries(map(Point, edinburgh_p.T), crs=CRS)
    edinburgh_line = get_raster_line(edinburgh_point)
    edinburgh_sk = edinburgh_line.map(shapely_transform).map(set_precision_pointone)
    edinburgh_sk = gp.GeoSeries(edinburgh_sk, crs=CRS)
    edinburgh_edge_sk = get_nx(edinburgh_sk)

if "doncaster_sk" not in globals():
    base_doncaster = gp.read_file("./data/rnet_doncaster_rail.geojson").to_crs(CRS)
    doncaster_geometry = get_geometry_buffer(base_doncaster["geometry"], radius)
    r_matrix_doncaster, s_matrix_doncaster, out_shape_doncaster = get_affine_transform(
        doncaster_geometry, scale=SCALE
    )
    shapely_transform = partial(affine_transform, matrix=s_matrix_doncaster)
    doncaster_skeleton = get_skeleton(
        doncaster_geometry, r_matrix_doncaster, out_shape_doncaster, SCALE
    )
    doncaster_p = np.stack(np.where(doncaster_skeleton >= 1))
    doncaster_point = gp.GeoSeries(map(Point, doncaster_p.T), crs=CRS)
    doncaster_line = get_raster_line(doncaster_point)
    doncaster_sk = doncaster_line.map(shapely_transform).map(set_precision_pointone)
    doncaster_sk = gp.GeoSeries(doncaster_sk, crs=CRS)
    doncaster_edge_sk = get_nx(doncaster_sk)
```

```{python}
if "edinburgh_edge" not in globals():
    edinburgh_boundary = get_geometry_line(edinburgh_geometry)
    edinburgh_voronoi = get_voronoi(edinburgh_boundary, tolerance, SCALE)
    edinburgh_line = get_voronoi_line(
        edinburgh_voronoi, edinburgh_boundary, edinburgh_geometry, radius
    )
    edinburgh_line = edinburgh_line.simplify(1.0)
    edinburgh_edge = get_nx(edinburgh_line)

if "doncaster_edge" not in globals():
    doncaster_boundary = get_geometry_line(doncaster_geometry)
    doncaster_voronoi = get_voronoi(doncaster_boundary, tolerance, SCALE)
    doncaster_line = get_voronoi_line(
        doncaster_voronoi, doncaster_boundary, doncaster_geometry, radius
    )
    doncaster_line = doncaster_line.simplify(1.0)
    doncaster_edge = get_nx(doncaster_line)
```

```{python}
for variable in PLOT_DATA:
    filepath = Path(f"data/rnet_{variable}.geojson")
    if filepath.exists():
        continue
    if variable in globals():
        data = globals()[variable]
        data.to_file(filepath)
```

<!-- TODO: improve the figure layout. -->

::: {#fig-results layout-ncol="1"}
```{python}
# Create a figure with 2 x 2 rows

# Plot 1: Input
fig = plt.figure(figsize=(8, 5))
# ax1 = plt.subplot2grid((3, 2), (0, 0), colspan=2)
ax1 = plt.subplot2grid((2, 2), (0, 0))
ax2 = plt.subplot2grid((2, 2), (0, 1))
ax3 = plt.subplot2grid((2, 2), (1, 0))
ax4 = plt.subplot2grid((2, 2), (1, 1))

ax1 = base_edinburgh.plot(ax=ax1, color='blue')
ax1.grid(True, color='grey', linestyle='--', linewidth=0.5)
ax1.xaxis.set_ticklabels([])
ax1.yaxis.set_ticklabels([])
ax1.set_title('Input')

# Plot 2: Skeletonized
ax2 = edinburgh_sk.plot(ax=ax2, color='green')
ax2.grid(True, color='grey', linestyle='--', linewidth=0.5)
ax2.xaxis.set_ticklabels([])
ax2.yaxis.set_ticklabels([])
ax2.set_title('Skeletonized')

# Plot 3: Voronoi
ax3 = edinburgh_line.plot(ax=ax3, color='purple')
ax3.grid(True, color='grey', linestyle='--', linewidth=0.5)
ax3.xaxis.set_ticklabels([])
ax3.yaxis.set_ticklabels([])
ax3.set_title('Voronoi')

# Plot 4: Primal Skeletonized
ax4 = edinburgh_edge_sk.plot(ax=ax4, color='green')
ax4.grid(True, color='grey', linestyle='--', linewidth=0.5)
ax4.xaxis.set_ticklabels([])
ax4.yaxis.set_ticklabels([])
ax4.set_title('Primal')

plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.show()
```

<!--- Matrix of Route Network Simplification Stages across Armley, Edinburgh, and Doncaster Rail. --->

Results of the route network simplification methods applied to Edinburgh city center, comparing the input dataset (top left), skeletonized network (top right), Voronoi diagram (bottom left), and primal network (bottom right).
:::

<!-- @fig-apply_results presents the comparison results, showcasing the Edinburgh road network simplified by the proposed method.
It is observed that the attribute `value` from the original network has been correctly transferred to the simplified network via a `sum` calculation process.
This comparison between the original and simplified Edinburgh Road Networks demonstrates the effectiveness of the proposed simplification techniques in maintaining the essential characteristics of the network while reducing its complexity. -->

<!-- ::: {#fig-apply_results} -->
```{r}
#| eval: false
rnet_x = sf::read_sf("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_x_ed.geojson")
rnet_y = sf::read_sf("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_y_ed.geojson")

rnet_xp = st_transform(rnet_x, "EPSG:27700")
rnet_yp = st_transform(rnet_y, "EPSG:27700")

# Extract column names from the rnet_yp
name_list = names(rnet_yp)

# Initialize an empty list
funs = list()

# Loop through each name and assign it a function based on specific conditions
for (name in name_list) {
  if (name == "geometry") {
    next  # Skip the current iteration
  } else if (name %in% c("Gradient", "Quietness")) {
    funs[[name]] = mean
  } else {
    funs[[name]] = sum
  }
}

dist = 20
rnet_merged = stplanr::rnet_merge(rnet_xp, rnet_yp, dist = dist, segment_length = 10, funs = funs, max_angle_diff = 10)

# Define breaks for the color scale
brks = c(0, 100, 500, 1000, 2000,5000,8000,10000)

# Set global tmap options for tighter margins
tmap_options(
  outer.margins = c(0, 0, 0, 0),
  inner.margins = c(0, 0, 0, 0)
)

# Create the first map with scale bar
m1 = tm_shape(rnet_yp) + 
     tm_lines("value", palette = "viridis", lwd = 5, breaks = brks) +
     tm_layout(frame = FALSE, inner.margins = 0, outer.margins = 0, asp = 0)
# Create the second map
m2 = tm_shape(rnet_merged) + 
     tm_lines("value", palette = "viridis", lwd = 5, breaks = brks) +
     tm_scale_bar() +
     tm_layout(frame = FALSE, inner.margins = 0, outer.margins = 0, asp = 0)


# Arrange the two maps vertically with a tight layout and synchronization
tmap_arrange(m1, m2, nrow = 2, sync = TRUE)
```

<!-- Comparison of Original Edinburgh Road Network (Up) and Simplied Edinburgh Road Network (Down). -->
<!-- ::: -->

# Discussion and Conclusion {#sec-discussion}

The two main methods developed in this paper are implemented in the open source software for reproducible research (see the reproducible source code of this paper at [github.com/nptscot](https://github.com/nptscot)).
Both Skeletonization and Voronoi-based methods result in geometries that are substantially simpler and less resource-intensive for subsequent network analysis steps, while preserving the essential connectivity and spatial characteristics of the original network.
Which to use, and the appropriate balance between simplification and preservation of network geometries will vary depending on the specific requirements of the analysis.
An advantage of the implementations we have developed, in the Python package `parenx`, is that they are adaptable to the requirements of the user, from low levels of simplification to the aggressive simplification of a 'primal network' (see the [`parenx` cookbook](https://nptscot.github.io/networkmerge/cookbook.html)).

There are several potential directions of travel building on this work.
One is to explore alternative algorithms for thinning buffered networks, recursively, for example the grassfire algorithm [@leymarie1992], or with medial axis algorithms [e.g. @smogavec2012].
Another is to speed up the process, with options including parallelisation, implementation in lower-level languages, and optimisation of the algorithms themselves.

We developed the methods to support the Network Planning Tool for Scotland, which provides a strong evidence base enabling more data-driven prioritisation of investment in active travel infrastructure.
However, we believe it has applications beyond this use case, for example in strategic rail network modelling and transport network redesign to support more sustainable transport systems.

# Data Availability

All data and code used to generate the results presented in this paper are available in the GitHub repository associated with the paper, to be made available upon acceptance.
To ensure that the paper is fully reproducible, we use GitHub actions to rebuild the paper whenever changes are made to the repository.

# References

```{python}
# osmnx test:
# import osmnx as ox
# import geopandas as gpd
# import momepy
# import networkx as nx
# import topojson as tp

# gdf = gpd.read_file("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_y_ed.geojson")
# # Convert to EPSG:27700
# gdf = gdf.to_crs('EPSG:27700')
# gdf.plot()
# gdf_topo = tp.Topology(gdf)
# gdf_simple = gdf_topo.toposimplify(10).to_gdf()
# gdf_simple.plot()
# # Convert gdf linestrings to nodes:
# gdf_nx = momepy.gdf_to_nx(gdf, approach='dual')
# nx.draw(gdf_nx)
# momepy.roundabout_simplification(gdf)
```

```{python}
# Mompy test:
# import osmnx as ox
# import geopandas as gpd
# import momepy
# import networkx as nx
# import topojson as tp

# gdf = gpd.read_file('data/minimal-input.geojson')
# # Convert to EPSG:27700
# gdf = gdf.to_crs('EPSG:27700')
# gdf.plot()
# gdf_topo = tp.Topology(gdf)
# gdf_simple = gdf_topo.toposimplify(10).to_gdf()
# gdf_simple.plot()
# # Convert gdf linestrings to nodes:
# gdf_nx = momepy.gdf_to_nx(gdf, approach='dual')
# nx.draw(gdf_nx)
# momepy.roundabout_simplification(gdf)
```