Basic Usage

The basic usage consists of two parts:

1. Downloading data dumps (osm.pbf files) containing all map data of an area
   * Countries
   * Regions
   * Planet
2. Extracting specific data from these data dumps into tabular format (geopandas GeoDataFrames)
   * With pre-written wrappers for some infrastructure classes
   * Using any tag syntax from OSM

[1]:
# Load the necessary packages
import matplotlib.pyplot as plt

import sys
sys.path.append('')  # if osm-flex is not installed, use 'your-path-to/osm-flex/src' instead of ''

import osm_flex
import osm_flex.download as dl
import osm_flex.extract as ex
import osm_flex.config
import osm_flex.clip as cp

osm_flex.enable_logs()

Step 1: Downloading data dumps

By default, an osm/osm_bpf folder is created in your home directory, and all data dumps are downloaded to it. This location can be changed by modifying the variable OSM_DATA_DIR in the osm_flex.config module.

[2]:
# Download the Switzerland country file from download.geofabrik.de.
# The file is downloaded only if it does not exist yet; the save path is returned.
iso3 = 'CHE'
path_che_dump = dl.get_country_geofabrik(iso3)
print(f'Saved as {path_che_dump}')
INFO:osm_flex.download:Skip existing file: /Users/evelynm/osm/osm_bpf/switzerland-latest.osm.pbf
Saved as /Users/evelynm/osm/osm_bpf/switzerland-latest.osm.pbf
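The "Skip existing file" message above reflects a download-only-if-missing pattern. The following is a minimal sketch of that pattern using only the standard library. It is not the osm-flex implementation: the helper `download_if_missing`, its `fetch` and `overwrite` parameters, and the example URL are assumptions made for illustration.

```python
from pathlib import Path
import tempfile


def download_if_missing(url, save_dir, fetch, overwrite=False):
    """Hypothetical helper: return the local path for `url`,
    calling `fetch(url, path)` only if the file is not there yet."""
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)
    # The local file name is the last component of the URL
    local_path = save_dir / url.rsplit("/", 1)[-1]
    if local_path.exists() and not overwrite:
        print(f"Skip existing file: {local_path}")
    else:
        fetch(url, local_path)
    return local_path


# Stand-in for a real downloader, so the sketch runs without network access
calls = []


def fake_fetch(url, path):
    calls.append(url)
    Path(path).write_bytes(b"")


# Illustrative URL (assumption), following the file naming shown in the output above
url = "https://download.geofabrik.de/europe/switzerland-latest.osm.pbf"
with tempfile.TemporaryDirectory() as tmp:
    first = download_if_missing(url, tmp, fake_fetch)   # fetches
    second = download_if_missing(url, tmp, fake_fetch)  # skips
```

For a real download, a callable such as `urllib.request.urlretrieve` could be passed as `fetch`. Keeping the fetch step injectable makes the skip logic easy to check without touching the network.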

Download the Central America regional file from Geofabrik to ~/osm/osm_bpf/central-america-latest.osm.pbf:

[3]:
region = 'central-america'
path_ca_dump = dl.get_region_geofabrik(region)
print(f'Saved as {path_ca_dump}')
INFO:osm_flex.download:Skip existing file: /Users/evelynm/osm/osm_bpf/central-america-latest.osm.pbf
Saved as /Users/evelynm/osm/osm_bpf/central-america-latest.osm.pbf

Step 2: Extracting geospatial data

Option 1:

Use the method extract_cis, which provides pre-written wrappers for certain critical infrastructure types (the available categories are listed in the next cell).

The wrappers can be further configured in the DICT_CIS_OSM dictionary located in the config module.

[4]:
# available wrapper categories:
osm_flex.config.DICT_CIS_OSM.keys()
[4]:
dict_keys(['education', 'healthcare', 'water', 'telecom', 'road', 'main_road', 'rail', 'air', 'gas', 'oil', 'power', 'wastewater', 'food', 'buildings'])
[5]:
# check the signature of the extraction function
? ex.extract_cis
[6]:
gdf_che_mainroad = ex.extract_cis(path_che_dump, 'main_road')
extract points: 0it [00:07, ?it/s]
extract multipolygons: 100%|██████████| 2/2 [00:42<00:00, 21.47s/it]
extract lines: 100%|██████████| 117861/117861 [00:16<00:00, 7217.83it/s]
[12]:
fig, ax = plt.subplots()
gdf_che_mainroad.plot(ax=ax, linewidth=0.5)
fig.suptitle('Main roads in Switzerland')
plt.show()
[Figure: Main roads in Switzerland]

Option 2:

Use key (and value) tags from OSM (see https://taginfo.openstreetmap.org/) with the method extract.

[5]:
# Example using keys and value constraints
gdf_ca_forest = ex.extract(path_ca_dump, 'multipolygons',
                           ['landuse', 'name'],
                           "landuse='forest'")
Warning 1: Non closed ring detected. To avoid accepting it, set the OGR_GEOMETRY_ACCEPT_UNCLOSED_RING configuration option to NO
extract multipolygons:   6%|▌         | 446/7578 [00:58<10:53, 10.91it/s]
Warning 1: Non closed ring detected. To avoid accepting it, set the OGR_GEOMETRY_ACCEPT_UNCLOSED_RING configuration option to NO
extract multipolygons: 100%|██████████| 7578/7578 [01:39<00:00, 76.31it/s]
[16]:
fig, ax = plt.subplots()
gdf_ca_forest.plot(ax=ax)
fig.suptitle('Forests in Central America')
plt.show()
[Figure: Forests in Central America]
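The constraint in the forest example is a plain string of the form key='value'. When several values of the same key are of interest, such a string can be assembled programmatically. The sketch below uses a hypothetical helper, `build_constraint`, which is not part of osm-flex. It also assumes that extract accepts several clauses joined with `or`, which this page does not confirm.

```python
def build_constraint(key, values):
    """Hypothetical helper: join key='value' clauses with 'or'."""
    if not values:
        raise ValueError("need at least one value")
    return " or ".join(f"{key}='{value}'" for value in values)


# Same constraint as in the forest example
single = build_constraint("landuse", ["forest"])
# Assumed syntax for matching more than one value
multi = build_constraint("landuse", ["forest", "meadow"])
print(single)
print(multi)
```

The resulting string would take the place of "landuse='forest'" in the call to ex.extract.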
[5]:
# Example using only a key (i.e., parse all items which have a non-null entry for the first key)
gdf_che_bldgs = ex.extract(path_che_dump, 'multipolygons',
                           ['building', 'name'])
extract multipolygons: 100%|██████████| 2608882/2608882 [02:02<00:00, 21261.05it/s]
[6]:
fig, ax = plt.subplots()
gdf_che_bldgs.plot(ax=ax)
fig.suptitle('All buildings in Switzerland')
plt.show()
[Figure: All buildings in Switzerland]
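The key-only selection rule used for the buildings (keep every item with a non-null entry for the first key, and return the listed keys as columns) can be illustrated on plain dictionaries. This is a toy sketch with made-up features and a hypothetical helper, `select_by_first_key`. It illustrates the rule only and does not show how osm-flex parses a data dump.

```python
# Toy stand-ins for OSM features: each is a dict of tags (made-up data)
features = [
    {"building": "yes", "name": "Town hall"},
    {"building": "school", "name": None},
    {"landuse": "forest", "name": "Some forest"},
    {"highway": "primary"},
]


def select_by_first_key(features, keys):
    """Hypothetical helper: keep features with a non-null value for keys[0]
    and return only the requested keys as columns."""
    first = keys[0]
    return [
        {key: feature.get(key) for key in keys}
        for feature in features
        if feature.get(first) is not None
    ]


buildings = select_by_first_key(features, ["building", "name"])
print(buildings)
```

Only the first key decides whether a feature is kept; the remaining keys become columns that may be empty, as for the unnamed school in this toy data.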

Optional Step: Clipping data

For custom data dumps, the planet, regional or country files can be clipped (i.e., cut) to a user-defined geographical extent and saved as a new osm.pbf file using the clip module. Instructions can be found in the tutorial 1_clipping_shapes.

Optional Step: Simplifying data

Parsing data can result in duplicates or near-duplicates, or yield too many results. A few simple methods are provided in the simplify module to handle this. Instructions can be found in the tutorial 2_simplifications.
