-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathpaper.qmd
1364 lines (1062 loc) · 62.8 KB
/
paper.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
---
title: "Route network simplification for transport planning"
bibliography: references.bib
author:
- name: Robin Lovelace
affiliation: Leeds Institute for Transport Studies, University of Leeds, UK
orcid: 0000-0001-5679-6536
- name: Zhao Wang
affiliation: Leeds Institute for Transport Studies, University of Leeds, UK
orcid: 0000-0002-4054-0533
- name: Will Deakin
affiliation: Digital, Data and Technology services, Network Rail, UK
orcid: 0009-0008-5656-4469
- name: Josiah Parry
affiliation: Environmental Systems Research Institute, Redlands, CA, USA
orcid: 0000-0001-9910-865X
format:
# pdf: default
html: default
number-sections: true
execute:
echo: false
message: false
warning: false
editor:
markdown:
wrap: sentence
# # Uncomment to run with Jupyter:
# jupyter: python3
---
```{r}
# Engine: knitr
```
<!-- # Reproducibility {.unnumbered} -->
<!-- <details>
```{r}
#| name: r-setup
#| include: false
library(sf)
library(tmap)
library(dplyr)
library(ggplot2)
library(stplanr)
tmap_mode("plot")
rnet_x = sf::read_sf("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_x_ed.geojson")
rnet_y = sf::read_sf("https://github.com/ropensci/stplanr/releases/download/v1.0.2/rnet_y_ed.geojson")
```
To contribute to the papers written as quarto documents (with `.qmd` extensions) like this one, we recommend using the Quarto extension for VS Code.
You can go into the visual editor with the following shortcut:
```
Ctrl+Shift+F4
```
You can then add citations with Ctrl+Shift+F11 and benefit from Quarto's other features for academic writing.
</details> -->
# Abstract {.unnumbered}
Route network datasets are central to transport models as key inputs and outputs.
The complexity of route network inputs from sources such as OpenStreetMap has increased over time, enabling more precise modelling of sustainable modes such as walking and cycling.
However, this complexity can affect the visualisation of model results.
A common issue is the presence of multiple parallel ways on the same corridor, which can lead to visual artefacts and misinterpretation of model outputs.
To address this challenges, we present and compare two methods for *simplifying* route network datasets: 1) image skeletonization and 2) Voronoi diagram-centreline identification.
These methods have real-world applications, as illustrated in the "Simplified network" layer in the Transport for Scotland funded Network Planning Tool, which is publicly available at [www.npt.scot](https://www.npt.scot).
Based on open access data and the open source `parenx` Python package, available on the Python Package Index, these methods are not only reproducible but also adaptable, enbabling their use in new contexts.
# Introduction
```{=html}
<!-- A wide range of data types can be used as inputs and outputs transport models, including route network datasets, origin-destination data, movement patterns captured by global positioning systems (GPS), and information on surface characteristics derived from remote sensing imagery.[^1]
Of these, route network datasets are unusual because they are commonly used as both inputs and outputs of transport models.
[^1]: See the [online documentation](https://transportgeography.org/contents/methods/network-data-models/).
As highlighted in the field of transport geography, network data models serve as digital representations of transportation networks, crucial for planning and operational purposes.
These models, enriched through graph theory, enclosed the complex, multi-modal nature of transportation data across various jurisdictions.
This complexity is particularly relevant when considering the enhancement of route network datasets for different use cases, offering widely applicable benefits.
This raises questions about what transport network datasets are, and how they can be optimized for more effective decision-making.
An intuitive definition is that route network datasets are digital representations of footpaths, cycleways, highways and other *ways* (to use the OpenStreetMap terminology) along which people and goods can travel. -->
```
Datasets representing route networks are important in every stage of transport planning.
Transport network datasets are *spatial networks* composed of nodes and edges, in which each edge has an associated edge cost (its weighted or unweighted length) [@barthélemy2011].
Nodes (points connecting edges) and the edges that connect them (lines between nodes representing ways) are located in space, with the edges representing the physical infrastructure of the transport network, often with attributes such as the type of way, its physical characteristics, and usage data, e.g. daily traffic for each way.
Route network simplification can be applied to both the input and output networks of transport models, with the aim of reducing complexity while preserving the spatial structure of the network and relevant attributes.
<!--
File formats for representing route networks include Transportation Network Test Problem files (TNTP and stored as a series of `.tntp` plain text files, examples of which can be found in [github.com/bstabler/TransportationNetworks](https://github.com/bstabler/TransportationNetworks)), `.DAT` files used by the proprietary SATURN transport modelling system and XML-based `.osm` or `.pbf` files that encode OpenStreetMap data.
Geographic file formats implementing the 'Simple Features Access' standard developed by the Open Geospatial Consortium ([OGC](https://www.ogc.org/standard/sfa/)), such as `.shp` and `.geojson`, can also be used to represent route networks.
-->
Growing availability of high resolution geographic datasets and performant hardware and software has enabled people (e.g. via OpenStreetMap) and mapping agencies to generate increasingly detailed maps, a trend that is set to continue.
Sustainable transport planning benefits from this trend, but the complexity and intricacy of street network geometries can create problems.
A clear and intuitive visual representation is crucial for identifying issues such as bottlenecks, congestion hotspots, and areas of poor accessibility.
Consequently, the necessity for network simplification becomes evident, aligning with wider 'map generalization' methods for pre-processing datasets depending on the scale of analysis [@sutton1998].
25 years since Sutton's paper on the topic, simplification of networks for transport planning and other applications remains an unsolved challenge. -->
Vector geometry simplification methods include Douglas-Peucker and Visvalingam-Whyatt algorithms [@liu2020vector; @de2014efficient].
While preserving the overall shape and geographical accuracy, this method struggles to reduce network complexity.
Vector smoothing approaches, including the use of Bezier curves [@pradhan2023modified] and Kernel-based smoothing [@duong2022statistical], can create more visually appealing lines or polygons, but do not reduce the number of vertices.
A common approach is simplification through converstion to an intermediate polygon (buffer) layer.
Numerous algorithms for such 'medial axis' calculationsjave been published open source software repositories [e.g. @smogavec2012; @centerli2023].
Recent implementations of network simplification methods include the under-development `neatnet` Python package [@fleischmann2024], and the `parenx` package which is available on the Python Package Index [@deakin2024].
The latter is used in this paper.
Other network simplification methods for transport planning include automatic detection of 'face artifacts' [@fleischmann] and removal of 'slivers' to generate simplified representations of 'street blocks' [@grippa2018].
However, these methods tend to be 'all or nothing' and do not provide flexibility in terms of the level of simplification or which features are removed.
Network simplification can be useful for a wide range of applications.
Riverine research and flood mapping applications, for example, require estimates of many river characteristics, including simplified centerlines derived from datasets representing river banks, as implemented in software including the R package `cmgo` [@golly2017] and an approach implemented in Google Earth Engine called RivWidthCloud [@yang2020].
The `riverdist` R package, to provide another example, provides functionality for checking whether datasets representing river channels are braided and for removing braids by returning only the shortest paths along the lengths of river centerlines, for example [@tyers2016].
```{=html}
<!-- Flexibility is important due to the wide range of transport planning use cases.
Some transport planning use cases require zoomed-in, geographically accurate and complex representations of transport networks, e.g. maps for junction design may have scales of 1:1000.
Other use cases, such as strategic network planning tools and 'planning support systems' [@page2020], typically involve zoomed-out representations of street networks at scales of 1:10,000+ beyond which dual carriageways are not relevant and can distort visualisation of networks. -->
```
The aim of this paper is to articulate the problem of complex route networks, present solutions with implementations in open source software for reproducible research, and describe applications of the methods to support more effective transport planning.
@sec-problem outlines the problem of complex route networks.
@sec-methods presents methods for route network simplification alongside results based on the example datasets.
@sec-application demonstrates the methods applied to a real transport network (Edinburgh, Scotland) and @sec-discussion concludes with a discussion of the results and future work.
# Problem definition {#sec-problem}
The problem tackled in this paper is the simplification of complex route networks.
This can be illustrated with reference to the Propensity to Cycle Tool for England (PCT) [@lovelace2017], the route networks of which are on methods for aggregating multiple overlapping routes into a route network with non-overlapping linestrings @morgan2020.
Implemented in the function `overline()` in the `stplanr` R package [@lovelace2017], the methods enable visualization of large transport networks and inform investment decisions in transport planning internationally [@lovelace2024; @félix2025].
However, this 'overline' approach, without further processing, has limitations:
- Functionally redundant vertices are kept, leading to large file sizes and slow rendering.
- Parallel ways that are not merged.
The latter issue is particularly problematic for visualisation of transport networks, as shown in @fig-pct [@lovelace2017].
The left panel shows Otley Road with a flow value of 818 (@fig-otley-road).
The right panel, by contrast, shows three parallel ways parallel to Armley Road with flow values of 515 (shown), 288 and 47 (values not shown) (@fig-armley-road).
Although this section of Armley road has a higher cycling potential than the section of Otley Road shown (515 + 288 + 47 \> 818), this is not clear from the visualisation.
::: {#fig-pct layout-ncol="2" width=50%}
{#fig-otley-road width=80%}
{#fig-armley-road width=80%}
{#fig-otley-road-raster width=80%}
{#fig-armley-road-raster width=80%}
Vector (top) and raster (bottom) visualisations of route network results in the Propensity to Cycle Tool.
Note that it is not clear from the visualisation that the corridor shown in the right hand figures (Armley Road corridor) has greater flow than the corridor shown in the left (Otley Road).
Note also the visual artefacts such as 'staircase' effects and overlapping values resulting from parallel lines along Armley Road (right panel).
Source: open access Propensity to Cycle Tool results available at www.pct.bike.
:::
<!-- A subsequent step described by @morgan2020 is to post-process the geographic representation of the transport network into a raster image, which can be used to visualise the network.
The 'rasterisation' stage can tackle some of the issues associated with multiple parallel ways, but introduces new issues, as shown in @fig-rasterisation. -->
<!-- ::: {#fig-rasterisation layout-ncol="2"}
Rasterised network results for the same corridors shown in @fig-pct.
Source: open access Propensity to Cycle Tool results available at www.pct.bike.
::: -->
<!-- The methods presented in this paper are designed to take a complex network as an input and output a simplified network, while preserving the spatial structure of the network and relevant attributes.
By reducing duplicated parallel lines and other intricacies, the outputs can enable easier-to-interpret visualisations of transport behaviour on the network patterns and behaviors. -->
# Data and Methods {#sec-methods}
```{r}
rnet_otley = sf::read_sf("./data/rnet_otley.geojson")
rnet_armley = sf::read_sf("./data/rnet_armley.geojson")
```
In this paper we use the two street networks discussed in the previous section to illustrate the methods.
See the ['`parenex` cookbook'](https://nptscot.github.io/networkmerge/cookbook.html) and [Methods](https://nptscot.github.io/networkmerge/methods.html) appendices for further details on the methods used in this paper and their application to alternative (railway based) datasets.
```{r}
#| include: false
#| label: tbl-input-data
input_data = tibble::tribble(
~Network, ~`N. segments`, ~Description, ~Source,
"Otley Road", nrow(rnet_otley), "A corridor in Leeds represented by a single centreline", "Propensity to Cycle Tool (derived from OSM)",
"Armley Road", nrow(rnet_armley), "A road in Leeds represented by multiple parallel 'braided' linestrings", "Propensity to Cycle Tool (derived from OSM)"
)
knitr::kable(input_data)
```
There are two main challenges that need to be overcome to simplify transport networks, in a way that preserves their value:
1. Simplifying the *geometry*
2. Assigning attributes to the simplified network
The key contributions of the paper are the novel methods of image skeletonization, presented in @sec-simplification-via-skeletonization, and simplification with Voronoi diagrams to identify central lines, covered in @sec-simplification-via-voronoi-polygons.
<!-- To make use of simplified networks in transport planning, it is also necessary to assign attributes to the simplified network.
This is covered in @sec-joining-route-networks. -->
```{=html}
<!-- TODO: shouldn't the following topics be stand-alone subsections rather than existing within the skeletonization section?
Additionally, the section tackles challenges associated with knots at intersections, offering solutions for their removal to simplify the network's appearance.
The concept of a primal network that represents a high level of simplification is explored as well. -->
```
<!-- ## Topology-preserving simplification {#sec-topology-preserving-simplification}
Topology-preserving simplification reduces the number of vertices in a linestring while preserving (or at least attempting to preserve) the topology of the network, i.e. not merging parallel lines. -->
## Simplification via skeletonization {#sec-simplification-via-skeletonization}
The skeletonization approach generates a simplified network by buffering the network, applying an image skeletonization algorithm, and extracting lines segements from a raster of this buffer.
<!--- ### Create a projected combined buffered geometry --->
In both the skeletonization and Voronoi approaches, the network simplification process starts by applying a buffer to linear geometry in a projected, rather than coordinate system.
<!-- , achieved using the `get_geometry_buffer` function. --> We use a buffer size of 8 meters in this paper.
```{python}
#| name: python-setup-packages
#| include: false
import warnings
from functools import partial
from pathlib import Path
import geopandas as gp
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from parenx.shared import combine_line, get_geometry_buffer, get_nx, get_source_target
from parenx.skeletonize import (
get_affine_transform,
get_raster_line,
get_skeleton,
split_centres,
)
from parenx.voronoi import (
filter_buffer,
filter_distance,
get_linestring,
get_geometry_line,
get_voronoi_line,
get_voronoi,
get_segment_nx,
set_geometry,
)
from shapely import box, get_coordinates, set_precision, unary_union
from shapely.affinity import affine_transform
from shapely.geometry import MultiPoint, Point
from shapely.ops import voronoi_diagram
```
```{python}
#| include: false
plt.rcParams["figure.figsize"] = (12, 12)
CRS = "EPSG:27700"
buffer_size = 8.0
radius = buffer_size
set_precision_pointone = partial(set_precision, grid_size=0.1)
base_otley = gp.read_file("./data/rnet_otley.geojson").to_crs(CRS)
base_otley["geometry"] = base_otley["geometry"].map(set_precision_pointone)
base_otley = combine_line(base_otley["geometry"]).to_frame("geometry")
otley_geometry = get_geometry_buffer(base_otley["geometry"], radius=buffer_size)
# Same for Armley:
base_armley = gp.read_file("./data/rnet_armley.geojson").to_crs(CRS)
base_armley["geometry"] = base_armley["geometry"].map(set_precision_pointone)
base_armley = combine_line(base_armley["geometry"]).to_frame("geometry")
armley_geometry = get_geometry_buffer(base_armley["geometry"], radius=buffer_size)
```
<!-- ### Network skeletonization -->
In skeletonization, overlapping lines are identified, buffered, transformed into a raster image, the image processed through a thinning algorithm, and a skeletal representation of the original network produced (see [Methods](https://nptscot.github.io/networkmerge/methods.html) appendix for details).
This skeletal structure preserves the overall extent and connectivity of the network, with a central line that follows the centre-line of the combined buffered area.
In detail, skeletonization is only applied where more than line-segment buffer overlaps.
To identify overlapping line-segments, the buffer is split at the end of each line-segment.
The overlapping line-segments are then buffered while retaining the remaining disjoint lines.
```{=html}
<!---This visualization demonstrates the process of cutting and segmenting the buffer geometries. It highlights the transformations from the initial buffered geometries to a more segmented and manageable form, preparing them for further analysis and simplification steps.
It effectively highlights the contrast between the more intricate and the simpler sections within these networks. --->
```
<!-- ::: {#fig-split-ends layout-ncol="2"} -->
```{python}
split_centre = partial(split_centres, offset=np.sqrt(1.5) * radius)
## overlapping
otley_centre = gp.GeoSeries(base_otley["geometry"].map(split_centre), crs=CRS)
otley_centre = otley_centre.buffer(radius, 0, join_style="round", cap_style="round")
combined_otley = gp.GeoSeries(unary_union(otley_centre.values).geoms, crs=CRS)
# combined_otley.plot()
```
```{python}
## overlapping
armley_centre = gp.GeoSeries(base_armley["geometry"].map(split_centre), crs=CRS)
armley_centre = armley_centre.buffer(radius, 0, join_style="round", cap_style="round")
combined_armley = gp.GeoSeries(unary_union(armley_centre.values).geoms, crs=CRS)
# combined_armley.plot()
```
```{=html}
<!--- Truncated and segmented buffer geometries of the Otley Road (left) and Armley Road (right) networks.
::: --->
```
```{python}
i, j = base_otley.sindex.query(combined_otley, predicate="intersects")
base_otley["class"] = -1
base_otley.loc[j, "class"] = combined_otley.index[i]
count = base_otley.groupby("class").count()
base_otley = base_otley.join(count["geometry"].rename("count"), on="class")
ix = base_otley["class"] == -1
base_otley.loc[ix, "count"] = 0
i, j = base_armley.sindex.query(combined_armley, predicate="intersects")
base_armley["class"] = -1
base_armley.loc[j, "class"] = combined_armley.index[i]
count = base_armley.groupby("class").count()
base_armley = base_armley.join(count["geometry"].rename("count"), on="class")
ix = base_armley["class"] == -1
base_armley.loc[ix, "count"] = 0
```
```{python}
ix = base_otley["count"].isin([0, 1])
p = base_otley.loc[~ix, "geometry"].copy()
p = p.buffer(radius, 512, join_style="round", cap_style="round")
try:
p = gp.GeoSeries(list(unary_union(p.values).geoms), crs=CRS)
except AttributeError:
p = gp.GeoSeries(unary_union(p.values), crs=CRS)
q = base_otley.loc[ix, "geometry"].buffer(0.612, 64, join_style="mitre", cap_style="round")
otley_segment = pd.concat([p, q])
try:
otley_segment = gp.GeoSeries(list(unary_union(otley_segment.values).geoms), crs=CRS)
except AttributeError:
otley_segment = gp.GeoSeries(unary_union(otley_segment.values), crs=CRS)
# otley_segment.plot()
```
```{python}
#| include: false
ix = base_armley["count"].isin([0, 1])
p = base_armley.loc[~ix, "geometry"].copy()
p = p.buffer(radius, 512, join_style="round", cap_style="round")
try:
p = gp.GeoSeries(list(unary_union(p.values).geoms), crs=CRS)
except AttributeError:
p = gp.GeoSeries(unary_union(p.values), crs=CRS)
q = base_armley.loc[ix, "geometry"].buffer(0.612, 64, join_style="mitre", cap_style="round")
armley_segment = pd.concat([p, q])
try:
armley_segment = gp.GeoSeries(list(unary_union(armley_geometry.values).geoms), crs=CRS)
except AttributeError:
armley_segment = gp.GeoSeries(unary_union(armley_segment.values), crs=CRS)
# armley_segment.plot()
```
<!-- Segmented line-buffer geometries (top) and geometries to be skeletonized (bottom) for the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->
As detail is lost in transforming of the geometry to an image buffer or raster, more detail can be retained by using an affine transformation to increase the number of points in the buffer prior to skeletonization and reducing scale when creating the simplified linear geometric representation.
This transformation is scaled to ensure that the projected coordinate geometry of the network aligns accurately with the corresponding dimensions of the scaled raster image.
The raster image also requires pre-processing to eliminate small holes that appear where buffered lines run parallel or intersect at shallow angles.
The skeltonisation algorithm is then appled to the raster image yielding a skeletal raster image and converted back into a linear vector geometry, completing the vector-to-raster-to-vector geometry transformation (see [Methods](https://nptscot.github.io/networkmerge/methods.html) appendix for details).
```{python}
import rasterio.features as rif
r_matrix_otley, s_matrix_otley, out_shape_otley = get_affine_transform(otley_geometry, scale=2.0)
# For Armle
r_matrix_armley, s_matrix_armley, out_shape_armley = get_affine_transform(armley_geometry, scale=2.0)
```
```{python}
from skimage.morphology import remove_small_holes, skeletonize
import rasterio.plot as rip
```
<!-- ::: {#fig-rasterize layout-ncol="2"}
```{python}
otley_im = rif.rasterize(otley_segment.values, transform=r_matrix_otley, out_shape=out_shape_otley)
with warnings.catch_warnings():
warnings.simplefilter("ignore")
otley_im = remove_small_holes(otley_im, 20).astype(np.uint8)
# rip.show(otley_im, cmap="Greys", title="buffer geometry")
```
```{python}
armley_im = rif.rasterize(armley_segment.values, transform=r_matrix_armley, out_shape=out_shape_otley)
with warnings.catch_warnings():
warnings.simplefilter("ignore")
armley_im = remove_small_holes(armley_im, 20).astype(np.uint8)
# rip.show(armley_im, cmap="Greys", title="buffer geometry")
```
Rasterized versions of the Otley Road (left) and Armley Road (right) networks, with post processing to remove small holes.
::: -->
<!---The image undergoes a thinning process, yielding a skeletal raster image as the result.
This skeletonized image effectively captures the essential structure and layout of the original network, as illustrated in @fig-thin-skeleton. --->
<!-- ::: {#fig-thin-skeleton layout-ncol="2"} -->
```{python}
otley_skeleton = skeletonize(otley_im).astype(np.uint8)
# rip.show(otley_skeleton, cmap="Greys", title="skeleton geometry")
otley_p = np.stack(np.where(otley_skeleton >= 1))
otley_point = gp.GeoSeries(map(Point, otley_p.T), crs=CRS)
```
```{python}
armley_skeleton = skeletonize(armley_im).astype(np.uint8)
# rip.show(armley_skeleton, cmap="Greys", title="skeleton geometry")
armley_p = np.stack(np.where(armley_skeleton >= 1))
armley_point = gp.GeoSeries(map(Point, armley_p.T), crs=CRS)
```
<!-- Skeletonized versions of the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->
<!-- ### Recreating a linear geometry -->
<!-- Figure commented out as not necessary and similar to the subsequent figure: -->
<!-- ::: {#fig-skeleton-vector layout-ncol="2"} -->
```{python}
shapely_transform = partial(affine_transform, matrix=s_matrix_otley)
otley_transform = otley_point.map(shapely_transform).map(set_precision_pointone)
# otley_transform.plot(edgecolor="black", color="blue").grid()
# plt.show()
```
```{python}
armley_transform = armley_point.map(shapely_transform).map(set_precision_pointone)
# armley_transform.plot(edgecolor="black", color="blue").grid()
# plt.show()
```
```{=html}
<!-- Skeletonized versions of the Otley Road (left) and Armley Road (right) networks, transformed back into point geometry.
::: -->
```
Adjacent points are identifed, typically within a 1x1 pixel square, based on proximity within the raster image coordinate system.
Line segments are then created by connecting these adjacent points.
These points are combined, giving a continuous line geometry representing the simplified network.
Finally, a reverse scaling affine transformation is applied to return to the original coordinate system.
<!-- This conversion from point to line geometry is a pivotal aspect of network simplification. -->
Noting that creating a line geometry from the set of points in the raster buffer is arguable the most complex step.
<!-- The resulting simplified network is illustrated in @fig-skeleton-line. -->
```{python}
otley_line = get_raster_line(otley_point, True)
armley_line = get_raster_line(armley_point, True)
```
<!-- ::: {#fig-skeleton-line layout-ncol="2"} -->
```{python}
shapely_transform = partial(affine_transform, matrix=s_matrix_otley)
otley_sk = otley_line.map(shapely_transform).map(set_precision_pointone)
otley_sk = otley_sk.set_crs(CRS)
# otley_sk.plot()
```
```{python}
shapely_transform = partial(affine_transform, matrix=s_matrix_armley)
armley_sk = armley_line.map(shapely_transform).map(set_precision_pointone)
armley_sk = armley_sk.set_crs(CRS)
# armley_sk.plot()
```
<!-- Simplified versions of the Otley Road (left) and Armley Road (right) networks, transformed back into line geometry. -->
<!-- ::: -->
## Simplification via Voronoi polygons {#sec-simplification-via-voronoi-polygons}
Voronoi simplification takes the buffered network segments and converts them into a set of points.
The edges of these buffers are then segmented into sequences of points.
From these sequences, a centre-line is derived based on a set of Voronoi polygons that cover these points.
This approach facilitates the creation of a simplified network representation by focusing on the central alignment of the buffered lines.
```{python}
scale = 10.0
tolerance = 0.1
otley_clip = box(426800, 437400, 427000, 437600)
armley_clip = box(427200, 433500, 427400, 433700)
```
<!-- ::: {#fig-boundary layout-ncol="2"} -->
```{python}
otley_boundary = get_geometry_line(otley_geometry)
# otley_boundary.plot()
```
```{python}
armley_boundary = get_geometry_line(armley_geometry)
# armley_boundary.plot()
```
<!-- Simplified boundaries of the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->
<!-- @fig-segment showcase the conversion of segmented LineString geometries into point geometries. -->
<!-- This essential transformation forms the basis for constructing Voronoi diagrams. -->
<!-- ::: {#fig-segment layout-ncol="2"} -->
```{python}
otley_segment = get_segment_nx(otley_boundary, scale).reset_index(drop=True)
# ax = otley_segment.clip(otley_clip).plot(edgecolor="red", linestyle='--', linewidth=1)
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```
```{python}
armley_segment = get_segment_nx(armley_boundary, scale).reset_index(drop=True)
# ax = armley_segment.clip(armley_clip).plot(edgecolor="red", linestyle='--', linewidth=1)
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```
<!-- Detail segmented boundaries of the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->
<!-- @fig-voronoi-point, the process of converting the segmented LineString geometries into point geometries is illustrated. -->
<!-- This transformation is essential for the creation of Voronoi diagrams. -->
<!-- ::: {#fig-voronoi-point layout-ncol="2"} -->
```{python}
otley_point = otley_segment.loc[:, "geometry"].map(get_coordinates).explode()
otley_point = MultiPoint(otley_point[::2].map(Point).values)
nx_output = gp.GeoSeries(otley_point, crs=CRS)
# ax = nx_output.clip(otley_clip).plot(edgecolor="blue", color="white")
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```
```{python}
armley_point = armley_segment.loc[:, "geometry"].map(get_coordinates).explode()
armley_point = MultiPoint(armley_point[::2].map(Point).values)
nx_output = gp.GeoSeries(armley_point, crs=CRS)
# ax = nx_output.clip(armley_clip).plot(edgecolor="blue", color="white")
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```
<!-- Detail point segement of the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->
```{python}
shapely_transform = partial(affine_transform, matrix=s_matrix_armley)
armley_sk = armley_line.map(shapely_transform).map(set_precision_pointone)
armley_sk = armley_sk.set_crs(CRS)
shapely_transform = partial(affine_transform, matrix=s_matrix_otley)
otley_sk = otley_line.map(shapely_transform).map(set_precision_pointone)
otley_sk = otley_sk.set_crs(CRS)
```
<!-- we probably want to pick Voronoi #1 or Voronoi #2. I'd marginally favour #2 -->
<!-- In @fig-voronoi-2, the generation and clipping of the corresponding Voronoi diagrams to the bounds of the input geometry is depicted. -->
<!-- ::: {#fig-voronoi layout-ncol="2"} -->
```{python}
otley_envelope = box(*otley_point.bounds)
otley_voronoi = voronoi_diagram(otley_point, envelope=otley_envelope, tolerance=tolerance, edges=True)
otley_voronoi = gp.GeoSeries(map(set_precision_pointone, otley_voronoi.geoms), crs=CRS)
#ax = otley_voronoi.plot()
#ax.xaxis.set_ticklabels([])
#ax.yaxis.set_ticklabels([])
```
```{python}
armley_envelope = box(*armley_point.bounds)
armley_voronoi = voronoi_diagram(armley_point, envelope=armley_envelope, tolerance=tolerance, edges=True)
armley_voronoi = gp.GeoSeries(map(set_precision_pointone, armley_voronoi.geoms), crs=CRS)
#ax = armley_voronoi.plot()
#ax.xaxis.set_ticklabels([])
#ax.yaxis.set_ticklabels([])
```
```{=html}
<!-- Clipped Voronoi diagrams of the Otley Road (left) and Armley Road (right) networks.
<!-- ::: -->
```
<!-- ### Voronoi 2 -->
<!-- ::: {#fig-voronoi-2 layout-ncol="2"} -->
```{python}
otley_voronoi = otley_voronoi.explode(index_parts=False).clip(otley_envelope)
ix = ~otley_voronoi.is_empty & (otley_voronoi.type == "LineString")
otley_voronoi = otley_voronoi[ix].reset_index(drop=True)
# ax = otley_voronoi.plot()
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```
```{python}
armley_voronoi = armley_voronoi.explode(index_parts=False).clip(armley_envelope)
ix = ~armley_voronoi.is_empty & (armley_voronoi.type == "LineString")
armley_voronoi = armley_voronoi[ix].reset_index(drop=True)
# ax = armley_voronoi.plot()
# ax.xaxis.set_ticklabels([])
# ax.yaxis.set_ticklabels([])
```
<!-- Clipped Voronoi diagrams of the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->
<!-- ### Voronoi simplified network -->
Voronoi lines that are completely enclosed within the buffer geometry and are situated at a distance of less than half the buffer's width from the buffer edge are selected.
This selective visualization of Voronoi lines effectively demonstrates the method precision in capturing and representing the central alignment of the transport network within its buffered confines.
The final center-line network is created through a clean up process that involves the removal of knot-like features from the simplified network.
While knots common to both skeletonization and Voronoi simplification, these artefacts are more prevalent on Voronoi simplified networks.
<!-- ::: {#fig-voronoi-simplified layout-ncol="2"} -->
```{python}
offset = buffer_size / 2.0
otley_line = filter_distance(otley_voronoi, otley_boundary, offset)
# otley_line.plot()
```
```{python}
armley_line = filter_distance(armley_voronoi, armley_boundary, offset)
# armley_line.plot()
```
<!-- Voronoi diagram lines with lines that are completely within the buffer geometry and less than half-a-buffer-width from the buffer edge. -->
<!-- ::: -->
<!--- ### Voronoi simplified network --->
::: {#fig-skeleton-line layout-ncol="3"}
```{python}
base_otley.plot(edgecolor="blue", color="blue")
```
```{python}
otley_sk.plot()
```
```{python}
otley_line = filter_buffer(otley_line, otley_geometry)
otley_edge, otley_node = get_source_target(otley_line.to_frame("geometry"))
ix = otley_node["count"] < 4
otley_square = otley_node[ix].buffer(offset, cap_style="square", mitre_limit=offset)
otley_square = gp.GeoSeries(unary_union(otley_square.values).geoms, crs=CRS)
otley_line = otley_edge["geometry"].map(get_linestring).explode().to_frame("geometry")
otley_line = set_geometry(otley_line, otley_square)
otley_line = combine_line(otley_line)
otley_line.plot()
```
```{python}
base_armley.plot(edgecolor="blue", color="blue")
```
```{python}
armley_sk.plot()
```
```{python}
armley_line = filter_buffer(armley_line, armley_geometry)
armley_edge, armley_node = get_source_target(armley_line.to_frame("geometry"))
ix = armley_node["count"] < 4
armley_square = armley_node[ix].buffer(offset, cap_style="square", mitre_limit=offset)
armley_square = gp.GeoSeries(unary_union(armley_square.values).geoms, crs=CRS)
armley_line = armley_edge["geometry"].map(get_linestring).explode().to_frame("geometry")
armley_line = set_geometry(armley_line, armley_square)
armley_line = combine_line(armley_line)
armley_line.plot()
```
Original and simplified versions of the Otley Road (top) and Armley Road (bottom) networks.
From left to right: original network, skeletonized network, Voronoi simplified network.
:::
## Post-Processing
Both skeletonization and Voronoi simplified networks require post-processing to remove unwanted knots, short segments at intersections, resembling tangled knots.
To remove these features, short segments are clustered together, and a central point for each cluster is determined.
The end-points of longer lines that connect to these segment clusters are then realigned to the cluster's central point, as illustrated in the [Methods](https://nptscot.github.io/networkmerge/methods.html) appendix.
An additional optional stage is to simplify the network further by removing vertices that are not essential for the network's connectivity, resulting in a primal network that captures the essential connectivity and layout of transport routes.
The primal network is thus composed of direct lines connecting start and end points, representing a high level of simplification that prioritises the network's structure and compression.
<!-- The primal networks for the Otley Road and Armley Road skeletonized networks are illustrated in @fig-primal-skeleton. -->
<!-- ::: {#fig-primal-skeleton layout-ncol="2"} -->
```{python}
otley_edge_sk = get_nx(otley_sk)
# otley_edge_sk.plot()
```
```{python}
armley_edge_sk = get_nx(armley_sk)
# armley_edge_sk.plot()
```
<!-- Primal skeletonized networks for the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->
<!-- ### Primal network -->
<!-- @fig-primal-voronoi illustrates the primal network derived from the Voronoi approach. -->
<!-- ::: {#fig-primal-voronoi layout-ncol="2"} -->
```{python}
otley_edge = get_nx(otley_line)
# otley_edge.plot()
```
```{python}
armley_edge = get_nx(armley_line)
# armley_edge.plot()
```
<!-- Primal Voronoi networks for the Otley Road (left) and Armley Road (right) networks. -->
<!-- ::: -->
<!--  -->
See Section 4 in [Methods](https://nptscot.github.io/networkmerge/methods.html) appendix for details.
## Joining route networks {#sec-joining-route-networks}
After generating a simplified network using the methods, it is often useful to translate attributes from the original network to the simplified network.
<!-- described in the previous sections or through an alternative approach, the next crucial step involves transferring attribute values from the detailed network to the simplified one.
This process is commonly referred to as 'conflation' and 'integration'.
Conflation is essential because while the source file (detailed network) might be rich in attributes like street names, address ranges, and zip codes, it may lack positional accuracy.
Conversely, the target file (simplified network) is likely to be positionally precise but deficient in detailed attributes. -->
<!-- As noted by [@sutton1998], this process involves two key aspects: the geometric integration, involving the link and node feature elements, and the integration of attributes such as highway data. -->
In the context of network planning tools, the purpose of the joining stage is to join traffic estimates from the source network onto the geometrically simplified target network [@sutton1998].
<!-- This 'joining' step is vital for using simplified networks as the basis for presenting model outputs generated on a complex network in a easy-to-interpret form. -->
<!-- In instances where a simplified version of the network is readily available, such as the Ordnance Survey's Open Roads dataset in the UK, the steps for network simplification can be bypassed to save time.
We have implemented the steps presented in this section in the `rnet_merge()` function in the `stplanr` R package. -->
<!-- The process is analogous to joining two datasets based on a common 'key' variable. -->
In this case there is no definitive key, meaning that network joining can be regarded as a 'fuzzy' or 'keyless' join process [@suri; @wachowicz2019]: as with the network simplification steps outlined above, the user must select joining parameters to maximise the accuracy of the join.
There are at least a couple of implementations of network joining methods in open source software: the `rnet_merge()` function in the `stplanr` R package [@lovelace2019], and the [`rnetmatch` Rust crate](https://github.com/nptscot/rnetmatch/tree/main/rust) for which has bindings to R and Python are planned.
The details of network joining methods, algorithms and implementations are outside the scope of this paper, see the documentation associated with the projects mentioned above for more information.
<!-- **Data Preparation:** -->
<!-- The main inputs of the network merging function are two route networks: `rnet_x` (the simple network geometry) and `rnet_y` (the original network with a more complex geometry and attributes that need to be translated onto the new network). -->
<!-- `rnet_y` in this case represents the detailed network with attributes such as model outputs representing transport flows to be translated onto the new network. -->
<!-- **Step 1: Coordinate Reference System Alignment** -->
```{=html}
<!-- Transform the spatial data of both `rnet_x` (the simplified network) and `rnet_y` (the detailed network) to the same coordinate reference system.
For this project, we have selected EPSG:27700. -->
```
```{r}
#| echo: false
#| eval: false
rnet_xp = sf::st_transform(rnet_x, "EPSG:27700")
rnet_yp = sf::st_transform(rnet_y, "EPSG:27700")
```
<!-- **Step 2: Defining the Function List** -->
```{=html}
<!-- Create a function list that dictates how each attribute is to be processed:
- **Exclude**: "geometry" from processing.
- **Mean Function**: Applied to attributes like "Gradient" and "Quietness".
TODO expline the funtion of sum and mean
- **Sum Function**: Used for aggregating values in columns such as "all_bicycle".
**Step 3: Using `stplanr::rnet_merge` for Integration**
Employ the `rnet_merge` function from the `stplanr` package.
This function is designed to merge route network data, taking into account the predefined functions and alignment in the coordinate system. -->
```
```{r}
#| eval: false
#| echo: false
rnet_merged = stplanr::rnet_merge(rnet_xp, rnet_yp, dist = 20, segment_length = 10, funs = funs, max_angle_diff = 30)
```
```{=html}
<!-- The network joining function can also take a number of arguments that define how the attributes are translated to the simplified target network:
- **`dist`** (Buffer Distance): This parameter defines the buffer zone around `rnet_xp` in meters, for determining the proximity at which features from both networks are considered for merging.
Typically, this value is refined to approximate the width of streets, ensuring a realistic spatial correlation between the network elements.
- **`segment_length`** (Maximum Segment Length): This optional argument (with 0 being the default meaning no splitting) specifies the maximum length of segments in `rnet_y` before they are split.
Segmenting long segments in the detailed network reduces number of source geometries that do not fit within target geometries, which can be key for achieving higher accuracy in attribute integration. -->
```
```{=html}
<!-- - **`funs`** (Function List): Comprises a series of key-value pairs representing variable names and the function to apply to each.
Any function can be used, with sum and mean being typical values.
- **`max_angle_diff`** (Maximum Angular Difference): This argument specifies the maximum angular difference between segments and target lines for values in matching source geometries to be translated to the target simplified geometries.
A low value, such as 20 degrees, ensures that values are translated only to segments with similar orientations, preventing overestimation of values on side roads.
@fig-max_angle_diff demonstrates the impact of this parameter in preventing the overestimation of values on side roads. -->
```
```{=html}
<!-- ::: {#fig-max_angle_diff layout-ncol="1"}

The effect of setting `max_angle_diff` to 20 degrees compared to a null value, illustrating the reduction in value overestimation on side Roads.
::: -->
```
```{=html}
<!-- An optional `sum_flows` argument can be used to ensure that more influential lines have a proportionate impact on the final aggregated value for each geometry in the simplified network.
The formula for the weighted sum is:
$$
\text{Weighted Sum} = \sum (x_i \times w_i)
$$
Where `xi` represents the attribute value for the i-th line in the complex network, and `wi` is the length weight for the i-th line. -->
```
```{=html}
<!-- An optional normalization step adjusts the values associated with each feature in the simplified network.
This adjustment ensures that the total flow values of both the complex and simplified networks are equivalent.
The need for normalization arises from the fact that simplified networks typically have a reduced total length and are less circuitous compared to their complex counterparts. -->
```
```{=html}
<!-- The normalization process is governed by the following formula:
$$
\text{Normalized Value} = \frac{\text{Weighted Sum}}{\sum w_i}
$$
Additionally, to account for potential overestimation of values in the simplified network, the following formula is applied:
$$
\text{over\_estimate} = \frac{\sum(\text{Aggregated Value} \times \text{Length of Single Line})}{\sum(\text{Original Value} \times \text{Length of Lines in Complex Network})}
$$
The final step involves adjusting the normalized value to correct any overestimation, as calculated above:
$$
\text{Adjusted Value} = \frac{\text{Normalized Value}}{\text{over\_estimate}}
$$
Through these steps, the normalized and adjusted values more accurately reflect the true distribution and intensity of flows within the simplified network, preserving the overall data integrity. -->
```
<!-- Content for long form appendix?: -->
```{=html}
<!-- To calculate max_angle_diff, it's necessary to accurately calculate the angles (bearings) of lines from two networks, rnet_x$angle_x and rnet_y$angle_y.
The difference in these angles is then computed as rnetj$angle_diff <- rnetj$angle_y - rnetj\$angle_x.
In this study, the line_bearing function is defined to calculate the angle.
As shown below,
rnet$angle = line_bearing(rnet_y, bidirectional = TRUE)
The process of calculating the bidirectional bearing of a line segment in geographical coordinates involves normalizing the standard bearing so that it is the same regardless of the direction of travel along the line.
Below is the detailed formula:
1. **Convert Latitudes and Longitudes to Radians**:
$$
\text{lat}_1 = \text{lat}_1 \times \frac{\pi}{180}
$$
$$
\text{lat}_2 = \text{lat}_2 \times \frac{\pi}{180}
$$
$$
\text{lon}_1 = \text{lon}_1 \times \frac{\pi}{180}
$$
$$
\text{lon}_2 = \text{lon}_2 \times \frac{\pi}{180}
$$
2. **Calculate Bearing**:
$$
\Delta lon = lon_2 - lon_1
$$
$$
\theta = atan2(\sin(\Delta lon) \times \cos(lat_2), \cos(lat_1) \times \sin(lat_2) - \sin(lat_1) \times \cos(lat_2) \times \cos(\Delta lon))
$$
3. **Convert Bearing from Radians to Degrees and Normalize Bidirectional Bearing to \[0, 180) Degrees**:
$$
\theta = \theta \times \frac{180}{\pi}
$$
$$
\theta = \theta \mod 180
$$
In this bidirectional adjustment, the bearing is normalized to a range of \[0, 180) degrees.
This means that opposite directions along a line yield the same bearing value.
This approach is particularly useful in scenarios where the directionality of the line is not a primary concern, ensuring consistency in bearing values regardless of the line's travel direction.
-->
```
<!--  -->
```{r, eval = FALSE}
rnet_xp = st_transform(rnet_x, "EPSG:27700")
rnet_yp = st_transform(rnet_y, "EPSG:27700")
# Extract column names from the rnet_yp
name_list = names(rnet_yp)
# Initialize an empty list
funs = list()
# Loop through each name and assign it a function based on specific conditions
for (name in name_list) {
if (name == "geometry") {
next # Skip the current iteration
} else if (name %in% c("Gradient", "Quietness")) {
funs[[name]] = mean
} else {
funs[[name]] = sum
}
}
dist = 20
rnet_merged_with_angle = stplanr::rnet_merge(rnet_xp, rnet_yp, dist = dist, segment_length = 10, funs = funs, max_angle_diff = 20)
rnet_merged_without_angle= stplanr::rnet_merge(rnet_xp, rnet_yp, dist = dist, segment_length = 10, funs = funs)
# Define breaks for the color scale
brks = c(0, 100, 500, 1000, 5000)
# Set global tmap options for tighter margins
tmap_options(
outer.margins = c(0, 0, 0, 0),
inner.margins = c(0, 0, 0, 0)
)
# Create the first map with scale bar
m1 = tm_shape(rnet_merged_with_angle) +
tm_lines("value", palette = "viridis", lwd = 5, breaks = brks) +
tm_scale_bar() +
tm_layout(frame = FALSE, inner.margins = 0, outer.margins = 0, asp = 0)
# Create the second map
m2 = tm_shape(rnet_merged_without_angle) +
tm_lines("value", palette = "viridis", lwd = 5, breaks = brks) +
tm_layout(frame = FALSE, inner.margins = 0, outer.margins = 0, asp = 0)
# Arrange the two maps vertically with a tight layout and synchronization