Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

287 improve osm maps with tag metadata #294

Merged
merged 21 commits into from
Jun 25, 2024

Conversation

r-leyshon
Copy link
Contributor

@r-leyshon r-leyshon commented Jun 19, 2024

Description

  • FindTags now issues a PerformanceWarning if the OSM file it is being instantiated with is larger than 500 KB.
  • FindLocations.plot_ids() now has an include_tags parameter, False by default. Setting this to True adds all the tag metadata to the folium map tooltip.
  • FindLocations.plot_ids() now exposes selected folium map styling parameters.
  • An example implementation is included in the quarto OSM tutorial.
  • Finally, all the runners broke when raising PR, as breaking changes with numpy 2.0.0 that had nothing to do with this diff. My solution has been to pin numpy version to that which passed in the previous successful run.

Fixes #287

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

How Has This Been Tested?

Test configuration details:

  • OS: macos
  • Python version: 3.9.13
  • Java version: openjdk 11.0.19 2023-04-18 LTS
  • Python management system: pip

Advice for reviewer

Checklist:

  • My code follows the intended structure of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Additional comments

@r-leyshon r-leyshon added documentation Improvements or additions to documentation enhancement New feature or request OSM labels Jun 19, 2024
@r-leyshon r-leyshon added this to the Version 1 Release milestone Jun 19, 2024
@r-leyshon r-leyshon requested a review from CBROWN-ONS June 19, 2024 11:19
@r-leyshon r-leyshon linked an issue Jun 19, 2024 that may be closed by this pull request
@CBROWN-ONS CBROWN-ONS self-assigned this Jun 19, 2024
Copy link
Collaborator

@CBROWN-ONS CBROWN-ONS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff Rich. Just a few small comments on this

Comment on lines +813 to +819
style_kwds: dict = {
"color": "#3f5277",
"fill": True,
"fillOpacity": 0.3,
"fillColor": "#3f5277",
"weight": 4,
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the correct approach? it assumes that a user wants to specify all of these features.
In theory, a user might only want to change e.g., the colour, however to do this they would have to specify all the other keywords themselves since style_kwds would be overwritten.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one is ok:
image

@@ -719,6 +873,7 @@ def plot_ids(
_type_defence(ids, "ids", list)
_type_defence(feature_type, "feature_type", str)
_type_defence(crs, "crs", (str, int))
_type_defence(include_tags, "include_tags", bool)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add defences for the remainder of the new params?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree. good shout.

@@ -37,6 +39,12 @@
# ---------utilities-----------


class PerformanceWarning(UserWarning):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think inheriting from Warning would be better than UserWarning here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

@@ -541,6 +549,16 @@ def __init__(
_is_expected_filetype(
osm_pth, "osm_pth", check_existing=True, exp_ext=".pbf"
)
self.large_file_thresh = 50000 # 50 KB
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very small thing that doesn't really matter.
Do you think this needs to be defined as an attribute of the class? This should be a one time use in the initial size check therefore it can simply be assigned to a local variable.
The only downfall of having it as a class attribute is that a user could potentially view it as something they can alter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with your point. I used name mangling and updated the docstring.

Comment on lines 554 to 561
osm_size = os.path.getsize(osm_pth)
if osm_size > self.large_file_thresh:
warnings.warn(
f"PBF file is {osm_size} bytes. Tag operations are expensive."
" Consider filtering the pbf file smaller than"
f" {self.large_file_thresh} bytes",
PerformanceWarning,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very small technicality, but the error message provokes the user to reduce file size < 50000, however an error isn't raised if file size is 50000.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll still work for sizes greater than 50 KB, it just gets slow on init. It's hoped this would nudge the user to think about the size of the files prior to using FindTags as it's the slowest of all the osm flavoured classes.

Comment on lines +727 to +737
for d in [dict1, dict2]:
if not isinstance(d, dict):
raise TypeError(f"Expected dict but found {type(d)}: {d}")
for id_, tags in dict1.items(): # child_tags is nested
# find duplicated keys and prepend parent keys
if dupes := set(tags.keys()).intersection(dict2.keys()):
for key in dupes:
dict2[f"{prepend_pattern}{key}"] = dict2.pop(key)
# merge parent and child tag collections
tags_out[id_] = tags | dict2
return tags_out
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very good way of doing this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's neat now, but it took a lot of wiggling to get there!

@@ -353,6 +353,14 @@ To read more on `osmosis` filtering strategies, refer to the `completeWays` and
`completeRelations` flag descriptions in the
[Osmosis detailed usage documentation](https://wiki.openstreetmap.org/wiki/Osmosis/Detailed_Usage_0.48).


Note that additional metadata can be added to the map by setting `include_tags=True`. Adding this rich contextual data to the map can be useful but is also expensive. This operation should be avoided for large osm files, for example anything over 500 KB.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'computationally expensive' rather than 'expensive' here would be a better description, in my opinion, as it provides greater context to the reader.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

@r-leyshon r-leyshon force-pushed the 287-improve-osm-maps-with-tag-metadata branch from f3491a1 to 3004a81 Compare June 25, 2024 09:23
@r-leyshon
Copy link
Contributor Author

Hey @CBROWN-ONS , I've just implemented your suggestions and pushed the diff to the PR. Thanks for your review on this, some good catches in there.

@codecov-commenter
Copy link

codecov-commenter commented Jun 25, 2024

Codecov Report

Attention: Patch coverage is 98.24561% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.13%. Comparing base (3fa154e) to head (3004a81).

Files Patch % Lines
src/transport_performance/osm/validate_osm.py 98.24% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##              dev     #294   +/-   ##
=======================================
  Coverage   98.13%   98.13%           
=======================================
  Files          21       21           
  Lines        1927     1983   +56     
=======================================
+ Hits         1891     1946   +55     
- Misses         36       37    +1     
Flag Coverage Δ
unittests 98.13% <98.24%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@CBROWN-ONS
Copy link
Collaborator

Thanks for the changes @r-leyshon . All good, will merge this now.

@CBROWN-ONS CBROWN-ONS merged commit cef299b into dev Jun 25, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request OSM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

improve OSM maps with tag metadata
3 participants