This Python script and included binaries will download OSM data from the Geofabrik servers, filter the data to include features that are tagged as a highway, and saves the highways as shapefiles for use with any GIS software.
This directory contains data from the OpenStreetMap project: https://openstreetmap.org
OpenStreetMap is a GIS datasource that is constructed by a "community of mappers that contribute and maintain data about roads, trails, cafes, railway stations, and much more" (Source: About OSM). The data is "Open", in that anyone can edit or download the data for their use.
This directory contains a weekly extract of OpenStreetMap data for New York State, which is sourced from here: https://download.geofabrik.de/north-america/us/new-york.html
The new-york-latest.osm.pbf
from the above link contains all of New York State's OpenStreetMap data, which
includes everything you see on the map. To limit the size of the data in these directories, the data is filtered
to the most commonly used GIS data source at DOTs - the roads.
OSM uses a key/value pair to tag attributes to each roadway. The OSM download is filtered to include all data that are tagged with a "highway" key. From there, the following values are removed to help focus the data on the public roadway network:
- footway
- pedestrian
- path
- track
- steps
- cycleway
- bridleway
- private
- raceway
- abandoned
The OSM data comes with multiple "layers", which include different shape types. The most relevant data will be the data that includes the suffix "_lines.shp" on the shapefiles, as these are where most of the roadways are. All of the layers are unpacked and saved to the directory, so you can get a taste of what else is potentially available.
OpenStreetMap data changes frequently, and often times the accuracy can be questionable. Please be very considerate when attempting to draw conclusions from this data. OpenStreetMap and the data providers are not responsible for errors or omissions. We think of this mostly as a cartographic data source, but many individuals and companies (including the Fortune 500) have had great success extracting value from OSM data.
The script is written in Python 3.6+ syntax. The easiest way to run this tool is using the Anaconda package distributor: https://anaconda.org/anaconda The original script was authored using miniconda. Conda installs Python packages and their Operating System dependencies, simplifying the process of getting things like GDAL, NumPy, and Pandas running on your machine. The conda environment can be frozen so that it's easily replicated on other machines.
The environment.yml
file in this directory can be used to reconstruct the Python execution environment of this tool.
To recreate the environment, install Miniconda or Anaconda with Python 3.6+ and run the following two command:
conda upgrade conda
conda env create -f environment.yml
Conda should proceed to install the required packages. You can find more information regarding conda environments here: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
You will find the OSM Convert and OSM Filter binaries predownloaded for windows 64-bit in the ./bin
directory.
If you leave them in place (an you're using 64-bit Windows!!) and do not
move the download_and_extract_osm.py
script from the root directory, the binaries should be discovered by the script
without additional work.
This script relies on two open source tools distributed by OSM contributors. The tools are OSM Convert and OSM Filter. There are binary files that can be downloaded from the web for 64 bit windows, which are the easiest way to get things running. OSM Convert is a utility that can convert OpenStreetMap data to and from various file formats. The data is downloaded in Protocol Buffer Format, which is a compressed version of the OSM XML file. The download_and_extract_osm.py script will use this utility to convert the download from .pbf to the .o5m format. This step is required for OSM Filter to execute. OSM Filter is optimized for the o5m format.
To install OSM Convert, go to this link:
https://wiki.openstreetmap.org/wiki/Osmconvert (or use the version included in the ./bin
directory of this repository).
Once you have downloaded the file, update the download_and_extract_osm.py
file to point to the osmconvert.exe
binary.
Python will then manage the execution of the tool.
OSM Filter is a utility that can be used to extract data of interest from the full OSM file. OSM data includes every single thing that's visible on the map, from the traffic lights to the building polygons. Using OSM Filter, we can read the data from NYS and find the data that is tagged with transportation related key/value pairs. See the ABOUT THIS PROJECT section of the README for more information on what data are excluded by this script.
To install OSM Filter, go to this link:
https://wiki.openstreetmap.org/wiki/Osmfilter (or use the version included in the ./bin
directory of this repository).
Once you have downloaded the file, update the download_and_extract_osm.py
file to point to the osmfilter.exe
binary.
Python will then manage the execution of the tool.
This directory contains a Windows batch file, which can be used in the Windows Task Scheduler to schedule
reoccuring updates of the OpenStreetMap data. The file is called download_and_extract_osm_schedule.bat
The .bat file simply activates the conda environment, runs the python script, and deactivates the conda env.
Once you have replicated the conda environment, you'll need to update the download_and_extract_osm_schedule.bat
script to point to the correct conda environment. If you navigate to the folder where you installed conda,
you'll find a folder called condabin
which includes a file called activate.bat
. Replace any call to conda
you may find in the batch file with the full filepath to the activate script. If you use the full path, you
can get around adding conda to your system path variable. It's best to leave Python 3 off of the conda path at this
time, since we are still using ArcGIS Desktop which relies on the deprecated Python 2. YMMV.
Once you have installed Conda and updated the path in the batch file, ensure you change line 2 of
downlaod_and_extract_osm_scheduled.bat
to match the location of the download_and_extract_osm.py
on the system
you'll be using to run the script.
To actually schedule the job, simply go to the Windoze task Scheuduler and "Add a Task". You can set the frequency
to whatever you desire. Make sure the "Action" is to execute the download_and_extract_osm_schedule.bat
script.
NOTE: I'd highly recommend ensuring that the script can be executed from a terminal window prior to scheduling the job.
The Windows Task Scheduler can be a bit difficult to debug, as the terminal window will disappear quickly. You can
set the timeout 5
line in the download_and_extract_osm_schedule.bat
script to a bigger number (e.g. 3600 seconds)
to work around the quick-to-close problem.
You may find issues with the conda environment file. When pulling the file from this directory then trying to install on a Windows 10
machine, I sometimes encounter an Unexpected Error
when executing conda env create -f environment.yml
. The only workaround for this
I have found involves VS Code. I open the ./environment.yml
file in VS code, change the encoding to UTF-8 (bottom right corner of VS Code),
then change it back to the original encoding (i just use ctrl+z
to undo the change), and save the file. This seems to fix the following error:
Traceback (most recent call last):
File "D:\Program_Files\miniconda3\lib\site-packages\conda\exceptions.py", line 1079, in __call__
return func(*args, **kwargs)
File "D:\Program_Files\miniconda3\lib\site-packages\conda_env\cli\main.py", line 80, in do_call
exit_code = getattr(module, func_name)(args, parser)
File "D:\Program_Files\miniconda3\lib\site-packages\conda_env\cli\main_create.py", line 80, in execute
directory=os.getcwd())
File "D:\Program_Files\miniconda3\lib\site-packages\conda_env\specs\__init__.py", line 40, in detect
if spec.can_handle():
File "D:\Program_Files\miniconda3\lib\site-packages\conda_env\specs\yaml_file.py", line 18, in can_handle
self._environment = env.from_file(self.filename)
File "D:\Program_Files\miniconda3\lib\site-packages\conda_env\env.py", line 151, in from_file
return from_yaml(yamlstr, filename=filename)
File "D:\Program_Files\miniconda3\lib\site-packages\conda_env\env.py", line 136, in from_yaml
data = yaml_load_standard(yamlstr)
File "D:\Program_Files\miniconda3\lib\site-packages\conda\common\serialize.py", line 76, in yaml_load_standard
return yaml.load(string, Loader=yaml.Loader, version="1.2")
File "D:\Program_Files\miniconda3\lib\site-packages\ruamel_yaml\main.py", line 933, in load
loader = Loader(stream, version, preserve_quotes=preserve_quotes)
File "D:\Program_Files\miniconda3\lib\site-packages\ruamel_yaml\loader.py", line 50, in __init__
Reader.__init__(self, stream, loader=self)
File "D:\Program_Files\miniconda3\lib\site-packages\ruamel_yaml\reader.py", line 85, in __init__
self.stream = stream # type: Any # as .read is called
File "D:\Program_Files\miniconda3\lib\site-packages\ruamel_yaml\reader.py", line 117, in stream
self.check_printable(val)
File "D:\Program_Files\miniconda3\lib\site-packages\ruamel_yaml\reader.py", line 255, in check_printable
'special characters are not allowed',
ruamel_yaml.reader.ReaderError: unacceptable character #x0000: special characters are not allowed
in "<unicode string>", position 3
If anyone knows what could be causing this issue, please open an issue in this repository!
Some managed IT environments will lead to errors in anaconda and the python requests
library. This is often related to IT organizations
that manage TLS certificates. A brute force workaround exists, which involves turning off encryption for https requests.
This is a last resort option, and not recommended!
To enforce this workaround in conda, run the following command:
conda config --set ssl_verify False
To enforce this workaround in the Python script, set verify_tls = False
.
If using VS Code, ensure that you update the settings.json file (found in ./.vscode/settings.json
) with the Anaconda
Python executable for your system. It's currently set to:
D:\Program_Files\miniconda3\envs\py3\python.exe
The VS Code settings are currently using Bandit linting. You can easily change that if you'd like.