This page was generated from source/notebooks/L6/retrieve_osm_data.ipynb.

Retrieving OpenStreetMap data

What is OpenStreetMap?

OpenStreetMap (OSM) is a global, collaborative (crowd-sourced) dataset and project that aims to create a free, editable map of the world containing a lot of information about our environment. It contains data about, for example, streets, buildings, different services, and land use. You can view the map at www.openstreetmap.org. You can also sign up as a contributor if you want to edit the map.

OSM has a large user base, with more than 4 million users and over a million contributors who actively update the OSM database with 3 million changesets per day. In total, OSM contains 5 billion nodes that form the basis of the digitally mapped world that OSM provides (stats from November 2019).

OpenStreetMap is used not only as a background map in visualizations and online maps, but also for many other purposes such as routing, geocoding, education, and research. OSM is also widely used for humanitarian response, for example in crisis areas after natural disasters, and for fostering economic development (see more on the Humanitarian OpenStreetMap Team (HOTOSM) website).

OSMnx

This week we will explore a Python module called OSMnx that can be used to retrieve, construct, analyze, and visualize street networks from OpenStreetMap, and also to retrieve data about points of interest such as restaurants, schools, and many other kinds of services. It is also easy to conduct network routing based on walking, cycling, or driving by combining OSMnx functionalities with a package called NetworkX.

To get an overview of the capabilities of the package, see an introductory video given by the lead developer of the package, Prof. Geoff Boeing: “Meet the developer: Introduction to OSMnx package by Geoff Boeing”.

There is also a scientific article available describing the package:

Download and visualize OpenStreetMap data with OSMnx

One of the most useful features that OSMnx provides is an easy-to-use way of retrieving OpenStreetMap data (using the Overpass API).

In this tutorial, we will learn how to download and visualize OSM data covering a specified area of interest: the district of Kamppi in Helsinki, Finland.

Street network

OSMnx makes it easy to download a street network, as it allows you to specify an address or a place name and retrieves the OpenStreetMap data for that area. In fact, OSMnx uses the same Nominatim geocoding API for this that we tested in Lesson 2.

  • Let’s retrieve OpenStreetMap (OSM) data by specifying "Kamppi, Helsinki, Finland" as the place from where the data should be downloaded.
[1]:
import osmnx as ox
import matplotlib.pyplot as plt
%matplotlib inline

# Specify the name that is used to search for the data
place_name = "Kamppi, Helsinki, Finland"

# Fetch OSM street network from the location
graph = ox.graph_from_place(place_name)
  • Check the data type of the graph:
[2]:
type(graph)
[2]:
networkx.classes.multidigraph.MultiDiGraph

Okay, as we can see, the data that we retrieved is a special data object called networkx.classes.multidigraph.MultiDiGraph. A DiGraph is a directed graph: a data type that stores nodes and directed edges with optional data, or attributes. A MultiDiGraph additionally allows several parallel edges between the same pair of nodes. What we can see here is that this data type belongs to a Python module called networkx that can be used to create, manipulate, and study the structure, dynamics, and functions of complex networks. The networkx module contains algorithms that can be used to calculate shortest paths along road networks using e.g. Dijkstra’s or the A* algorithm.
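
To make the idea of a directed multigraph and a shortest-path search more concrete, here is a minimal sketch in plain Python. It does not use networkx and it is not how networkx implements its algorithms; the node names (A, B, C), the edge lengths, and the shortest_path helper are all made up for illustration.

```python
import heapq

# A tiny directed multigraph: each edge is (start node, end node, key, attributes).
# The key tells parallel edges apart. All names and lengths are made up.
edges = [
    ("A", "B", 0, {"length": 120.0}),
    ("A", "B", 1, {"length": 90.0}),   # a second, shorter edge from A to B
    ("B", "C", 0, {"length": 60.0}),
    ("A", "C", 0, {"length": 200.0}),
    ("C", "A", 0, {"length": 200.0}),  # direction matters: C -> A is its own edge
]


def shortest_path(edges, source, target, weight="length"):
    """Return (total weight, list of nodes) using Dijkstra's algorithm."""
    # Build an adjacency list that keeps every parallel edge.
    adjacency = {}
    for u, v, _key, attrs in edges:
        adjacency.setdefault(u, []).append((v, attrs[weight]))

    queue = [(0.0, source, [source])]  # (distance so far, node, path so far)
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == target:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, w in adjacency.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (dist + w, neighbor, path + [neighbor]))
    return float("inf"), []  # the target cannot be reached


distance, route = shortest_path(edges, "A", "C")
print(distance, route)
```

With a real street network you would rely on the shortest-path functions that networkx already provides instead of writing your own.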

  • Let’s see what our street network looks like. It is easy to visualize the graph with the OSMnx plot_graph() function. The function uses Matplotlib for visualizing the data, so it returns a Matplotlib figure and axis object:
[3]:
# Plot the streets
fig, ax = ox.plot_graph(graph)
../../_images/notebooks_L6_retrieve_osm_data_7_0.png

Great! Now we can see that our graph contains the nodes (blue circles) and the edges (gray lines) that connect those nodes to each other.

Place polygon

Let’s also plot the Polygon that represents our area of interest (Kamppi, Helsinki). We can retrieve the Polygon geometry using the gdf_from_place() function.

  • Retrieve the extent of our location:
[4]:
area = ox.gdf_from_place(place_name)

As the name of the function already tells us, gdf_from_place() returns a GeoDataFrame based on the specified place name query.

  • Check the data type:
[5]:
type(area)
[5]:
geopandas.geodataframe.GeoDataFrame
  • Check the data:
[6]:
area
[6]:
   geometry                                           place_name                                         bbox_north  bbox_south  bbox_east  bbox_west
0  POLYGON ((24.92074 60.16690, 24.92075 60.16687...  Kamppi, Southern major district, Helsinki, Hel...  60.172109   60.160474   24.943453  24.920742
  • Plot the area:
[7]:
area.plot()
[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x15aa208cc50>
../../_images/notebooks_L6_retrieve_osm_data_16_1.png

Building footprints

It is also possible to retrieve other types of OSM features with OSMnx, such as buildings or points of interest (POIs). Let’s download the buildings with the OSMnx footprints_from_place() function (the same as the buildings_from_place() function in OSMnx < 0.9) and plot them on top of our street network in Kamppi.

  • Retrieve buildings from the area:
[8]:
buildings = ox.footprints_from_place(place_name)

Note that you can also get other types of footprints using the footprint_type parameter (the default is “building”).

  • Check how many building footprints we received:
[9]:
len(buildings)
[9]:
427

The buildings GeoDataFrame contains several polygons.

  • Check the first rows:
[10]:
buildings.head(3)
[10]:
nodes geometry addr:city addr:country addr:housenumber addr:street building name name:fi name:ko ... outdoor_seating addr:floor access covered type brand building:part ele electrified addr:unit
8035238 [60069605, 60069615, 60275530, 1036979252, 105... POLYGON ((24.93563 60.17045, 24.93557 60.17054... Helsinki FI 22-24 Mannerheimintie public Lasipalatsi Lasipalatsi 라시팔라치 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
8042297 [1378950415, 1378950417, 1378950418, 319515866... POLYGON ((24.92938 60.16795, 24.92933 60.16797... Helsinki FI 2 Runeberginkatu yes Radisson Blu Royal NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
14797170 [146125363, 3203698292, 3203698293, 3203698294... POLYGON ((24.92427 60.16648, 24.92427 60.16650... Helsinki FI 10 Lapinlahdenkatu school NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

3 rows × 93 columns

As you can see, there are several columns in the buildings layer. Each column contains information about a specific tag that OpenStreetMap contributors have added. Each tag consists of a key (the column name) and one of several potential values (for example building=yes or building=school). Read more about tags and tagging practices in the OpenStreetMap wiki.
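
The following plain-Python sketch illustrates why a table of OSM features ends up with so many columns and so many missing values: every tag key used by at least one feature becomes a column, and features without that tag get an empty cell. The three features are simplified from the rows shown above, and the tags_to_table helper is a hypothetical name used only for this illustration (OSMnx and pandas do this work for you and mark the gaps as NaN).

```python
# Three features, each with its own set of tags (key=value pairs),
# simplified from the first rows of the buildings table.
features = [
    {"building": "public", "name": "Lasipalatsi", "name:fi": "Lasipalatsi",
     "addr:street": "Mannerheimintie"},
    {"building": "yes", "name": "Radisson Blu Royal", "addr:street": "Runeberginkatu"},
    {"building": "school", "addr:street": "Lapinlahdenkatu"},
]


def tags_to_table(features):
    """Turn a list of tag dictionaries into (columns, rows), filling gaps with None."""
    columns = []
    for tags in features:
        for key in tags:
            if key not in columns:
                columns.append(key)  # keep the order in which keys first appear
    rows = [[tags.get(key) for key in columns] for tags in features]
    return columns, rows


columns, rows = tags_to_table(features)
print(columns)
for row in rows:
    print(row)
```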

[11]:
buildings.columns
[11]:
Index(['nodes', 'geometry', 'addr:city', 'addr:country', 'addr:housenumber',
       'addr:street', 'building', 'name', 'name:fi', 'name:ko', 'name:sv',
       'start_date', 'url', 'wikidata', 'wikipedia', 'addr:postcode', 'bar',
       'email', 'fax', 'internet_access', 'internet_access:fee', 'phone',
       'smoking', 'tourism', 'website', 'operator', 'source', 'architect',
       'building:levels', 'landuse', 'suojelumerkintä', 'layer', 'ref',
       'fixme', 'last_full_renovation', 'roof:levels',
       'building:maintenance:operator', 'name:local', 'levels', 'old_name',
       'created_by', 'omistusasuntoja', 'building:material', 'roof:shape',
       'building:colour', 'roof:colour', 'name:en', 'name:fr', 'alt_name',
       'note', 'short_name', 'was:building', 'was:guard:operator',
       'wheelchair', 'building:min_level', 'amenity', 'cuisine', 'historic',
       'inscription', 'tomb', 'addr:housename', 'opening_hours', 'shop',
       'toilets:wheelchair', 'name:da', 'name:nn', 'wheelchair:description',
       'denomination', 'religion', 'height', 'name:ru', 'official_name',
       'last_pipe_renovation', 'contact:website', 'guard:operator', 'loc_name',
       'name:no', 'alt_name:en', 'name:zh', 'drive_through', 'ice_cream',
       'lippakioski', 'takeaway', 'outdoor_seating', 'addr:floor', 'access',
       'covered', 'type', 'brand', 'building:part', 'ele', 'electrified',
       'addr:unit'],
      dtype='object')

Points-of-interest

OSMnx has a nice function called ox.pois_from_place() that can be used to retrieve specific points of interest (POIs) from OpenStreetMap based on their amenity tag. We can, for example, retrieve all points with the tag amenity=restaurant by passing an argument to the amenities parameter. We could also retrieve several POI categories by passing a list of OSM amenity tag values to the function.
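
Behind the scenes, a request like this ends up as a query to the Overpass API. The sketch below builds a comparable Overpass QL query string by hand, just to show roughly what filtering by amenity values looks like. It is an assumption-laden illustration: build_amenity_query is a hypothetical helper, the query is not the exact one OSMnx sends, and it uses the bounding box of Kamppi from the area table instead of the place polygon.

```python
def build_amenity_query(amenities, south, west, north, east, timeout=60):
    """Build an Overpass QL query for nodes and ways with the given amenity values."""
    # Overpass QL expects bounding boxes in the order (south, west, north, east).
    bbox = f"({south},{west},{north},{east})"
    # A regular-expression filter matches any of the listed amenity values.
    # Plain values such as "restaurant" or "fast_food" need no escaping.
    pattern = "^(" + "|".join(amenities) + ")$"
    tag_filter = f'["amenity"~"{pattern}"]'
    return (
        f"[out:json][timeout:{timeout}];"
        f"(node{tag_filter}{bbox};way{tag_filter}{bbox};);"
        "out center;"
    )


# Bounding box values taken from the area GeoDataFrame of Kamppi
query = build_amenity_query(
    ["restaurant", "cafe"],
    south=60.160474, west=24.920742, north=60.172109, east=24.943453,
)
print(query)
```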

  • Let’s retrieve restaurants that are located in our area of interest:
[12]:
# Retrieve restaurants
restaurants = ox.pois_from_place(place_name, amenities=['restaurant'])

# How many restaurants do we have?
len(restaurants)
[12]:
212

As we can see, there are quite a lot of restaurants in the area.

  • Let’s explore what kind of attributes we have in our restaurants GeoDataFrame:
[13]:
# Available columns
restaurants.columns
[13]:
Index(['osmid', 'geometry', 'addr:city', 'addr:country', 'addr:housenumber',
       'addr:postcode', 'addr:street', 'amenity', 'cuisine', 'name', 'phone',
       'website', 'wheelchair', 'element_type', 'toilets:wheelchair',
       'created_by', 'outdoor_seating', 'fixme', 'opening_hours', 'email',
       'internet_access', 'internet_access:fee', 'opening_hours:brunch',
       'diet:vegetarian', 'name:fi', 'name:zh', 'short_name', 'takeaway',
       'contact:website', 'diet:vegan', 'name:ru', 'operator', 'smoking',
       'wheelchair:description', 'level', 'contact:phone', 'source', 'name:en',
       'building', 'addr:housename', 'note', 'address', 'brunch',
       'contact:foursquare', 'contact:yelp', 'ref:vatin', 'delivery', 'url',
       'lunch:menu', 'reservation', 'room', 'toilets', 'capacity',
       'access:dog', 'shop', 'opening_hours:lunch_buffet', 'is_in', 'wikidata',
       'alt_name', 'contact:email', 'established', 'description', 'name:sv',
       'lunch', 'description:en', 'old_name', 'highchair', 'was:name',
       'website:en', 'lunch:buffet', 'office', 'addr:place', 'entrance',
       'addr:floor', 'layer', 'image', 'payment:mastercard', 'payment:visa',
       'nodes'],
      dtype='object')

Wow, there is quite a lot of information related to the POIs. Useful attributes include, for example, the name, the address information, and the opening hours (opening_hours):

[14]:
# Select some useful cols and print
cols = ['name', 'opening_hours', 'addr:city', 'addr:country',
        'addr:housenumber', 'addr:postcode', 'addr:street']
# Print only selected cols
restaurants[cols].head(10)
[14]:
          name                 opening_hours                                      addr:city  addr:country  addr:housenumber  addr:postcode  addr:street
60062502  Kabuki               NaN                                                Helsinki   FI            12                00180          Lapinlahdenkatu
60133792  Ateljé Finne         NaN                                                Helsinki   FI            NaN               NaN            NaN
62965963  Empire Plaza         NaN                                                NaN        NaN           NaN               NaN            NaN
62967659  Ravintola Pääposti   NaN                                                Helsinki   NaN           1 B               00100          Mannerheiminaukio
68734026  Hampton Bay          NaN                                                Helsinki   FI            6                 00120          Hietalahdenranta
76617692  Johan Ludvig         NaN                                                Helsinki   FI            NaN               NaN            NaN
76624339  Ravintola Rivoletto  Mo-Th 11:00-23:00; Fr 11:00-24:00; Sa 15:00-24...  Helsinki   FI            38                00120          Albertinkatu
76624351  Pueblo               NaN                                                Helsinki   FI            NaN               NaN            Eerikinkatu
76627823  Atabar               NaN                                                Helsinki   FI            NaN               NaN            Eerikinkatu
77642757  Southpark            Mo-Sa 11:00-15:00; Su 10:30-17:00                  Helsinki   NaN           40                00120          Sinebrychoffin puisto, Bulevardi

As we can see, there is a lot of useful information about restaurants that can be retrieved easily with OSMnx. Also, if some of the information needs updating, you can go over to www.openstreetmap.org and edit the source data! :)

Graph to GeoDataFrame

We can now plot all these different OSM layers by using the familiar plot() function of Geopandas. As you might remember, the street network data is not a GeoDataFrame but a networkx.MultiDiGraph. Luckily, OSMnx provides a convenient function, graph_to_gdfs(), that can convert the graph into two separate GeoDataFrames: the first one contains the information about the nodes and the second one about the edges.

  • Let’s extract the nodes and edges from the graph as GeoDataFrames:
[15]:
# Retrieve nodes and edges
nodes, edges = ox.graph_to_gdfs(graph)
[16]:
nodes.head()
[16]:
            y          x          osmid       highway         ref  geometry
3216400385  60.167552  24.934005  3216400385  turning_circle  NaN  POINT (24.93400 60.16755)
1372233731  60.162290  24.929274  1372233731  crossing        NaN  POINT (24.92927 60.16229)
319885318   60.165072  24.925487  319885318   NaN             NaN  POINT (24.92549 60.16507)
1005744134  60.161622  24.924423  1005744134  NaN             NaN  POINT (24.92442 60.16162)
3216400394  60.167662  24.933920  3216400394  NaN             NaN  POINT (24.93392 60.16766)
[17]:
edges.head()
[17]:
u v key osmid name highway maxspeed oneway length geometry lanes service tunnel junction access bridge ref
0 3216400385 301360890 0 15240373 Kansakoulukuja residential 30 False 13.177 LINESTRING (24.93400 60.16755, 24.93393 60.167... NaN NaN NaN NaN NaN NaN NaN
1 1372233731 298367080 0 86533507 NaN footway NaN False 6.925 LINESTRING (24.92927 60.16229, 24.92917 60.16225) NaN NaN NaN NaN NaN NaN NaN
2 1372233731 292859610 0 15103120 NaN primary_link 30 True 33.874 LINESTRING (24.92927 60.16229, 24.92930 60.162... 2 NaN NaN NaN NaN NaN NaN
3 1372233731 4430643601 0 [154412960, 86533507] NaN footway NaN False 12.489 LINESTRING (24.92927 60.16229, 24.92941 60.162... NaN NaN NaN NaN NaN NaN NaN
4 1372233731 311043714 0 86533509 Hietalahdenkatu primary 30 True 38.768 LINESTRING (24.92927 60.16229, 24.92938 60.162... 2 NaN NaN NaN NaN NaN NaN

Nice! Now, as we can see, we have our graph as GeoDataFrames and we can plot them using the same functions and tools as we have used before.
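
The u, v, and key columns of the edges table can look cryptic at first. The plain-Python sketch below shows the idea of flattening a graph into one table of nodes and one table of edges: u is the start node of an edge, v is the end node, and key separates parallel edges between the same two nodes. The graph, its ids and attribute values, and the graph_to_tables helper are made up for illustration; the real graph_to_gdfs() function also builds the geometry column, which is left out here.

```python
# A made-up street graph: node attributes by node id, and edges stored as
# (u, v, key, attributes). Ids, coordinates, and lengths are illustrative only.
graph_nodes = {
    1: {"x": 24.9340, "y": 60.1676},
    2: {"x": 24.9339, "y": 60.1677},
}
graph_edges = [
    (1, 2, 0, {"highway": "residential", "length": 13.2}),
    (1, 2, 1, {"highway": "footway", "length": 15.0}),  # parallel edge, key=1
    (2, 1, 0, {"highway": "residential", "length": 13.2}),  # reverse direction
]


def graph_to_tables(nodes, edges):
    """Flatten a graph into a list of node rows and a list of edge rows."""
    node_rows = [{"osmid": node_id, **attrs} for node_id, attrs in nodes.items()]
    edge_rows = [{"u": u, "v": v, "key": key, **attrs} for u, v, key, attrs in edges]
    return node_rows, edge_rows


node_rows, edge_rows = graph_to_tables(graph_nodes, graph_edges)
for row in edge_rows:
    print(row)
```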

Note

There are also other ways of retrieving data from OpenStreetMap with OSMnx, such as passing a Polygon to extract the data from that area, or passing Point coordinates and retrieving data around that location within a specific radius. Take a look at this tutorial to find out how to use those features of OSMnx.

Plotting the data

  • Let’s create a map out of the streets, buildings, restaurants, and the area Polygon but let’s exclude the nodes (to keep the figure clearer).
[18]:
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(12,8))

# Plot the footprint
area.plot(ax=ax, facecolor='black')

# Plot street edges
edges.plot(ax=ax, linewidth=1, edgecolor='#BC8F8F')

# Plot buildings
buildings.plot(ax=ax, facecolor='khaki', alpha=0.7)

# Plot restaurants
restaurants.plot(ax=ax, color='green', alpha=0.7, markersize=10)
plt.tight_layout()
../../_images/notebooks_L6_retrieve_osm_data_39_0.png

Cool! Now we have a map where we have plotted the restaurants, buildings, streets and the boundaries of the selected region of ‘Kamppi’ in Helsinki. And all of this required only a few lines of code. Pretty neat!

As a final step, we might want to re-project the layers to a local projection for plotting. Here, we will use the tools we already know, namely pyproj CRS. In the latter part of this lesson we will learn how to use OSMnx to re-project our data to UTM coordinates.
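
Why bother re-projecting? Our layers are stored in decimal degrees, and at the latitude of Helsinki one degree of longitude covers only about half the distance of one degree of latitude, so a map drawn directly in degrees looks stretched. The sketch below estimates the size of the Kamppi bounding box (values from the area table) in meters. It assumes a spherical Earth, so it is only a rough approximation and not a substitute for a proper re-projection; bbox_size_in_meters is a hypothetical helper name.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation

# Bounding box of Kamppi, taken from the area GeoDataFrame
north, south = 60.172109, 60.160474
east, west = 24.943453, 24.920742


def bbox_size_in_meters(north, south, east, west):
    """Approximate the width and height of a small bounding box in meters."""
    meters_per_degree = math.pi / 180 * EARTH_RADIUS_M
    mid_latitude = math.radians((north + south) / 2)
    height = (north - south) * meters_per_degree
    # Meridians converge toward the poles, so a degree of longitude
    # shrinks by the cosine of the latitude.
    width = (east - west) * meters_per_degree * math.cos(mid_latitude)
    return width, height


width_m, height_m = bbox_size_in_meters(north, south, east, west)
print(f"In degrees: {east - west:.4f} wide, {north - south:.4f} high")
print(f"In meters:  {width_m:.0f} wide, {height_m:.0f} high")
```

In degrees the box is roughly twice as wide as it is high, while in meters it is close to a square. A metric projection such as EPSG:3067 removes this distortion from the plot.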

  • Re-project the layers to epsg:3067
[19]:
from pyproj import CRS

# Set projection
projection = CRS.from_epsg(3067)

# Re-project layers
area = area.to_crs(projection)
edges = edges.to_crs(projection)
buildings = buildings.to_crs(projection)
restaurants = restaurants.to_crs(projection)
  • Create a new plot with the re-projected layers:
[20]:
fig, ax = plt.subplots(figsize=(12,8))

# Plot the footprint
area.plot(ax=ax, facecolor='black')

# Plot street edges
edges.plot(ax=ax, linewidth=1, edgecolor='dimgray')

# Plot buildings
buildings.plot(ax=ax, facecolor='silver', alpha=0.7)

# Plot restaurants
restaurants.plot(ax=ax, color='yellow', alpha=0.7, markersize=10)
plt.tight_layout()
../../_images/notebooks_L6_retrieve_osm_data_44_0.png

Task

Retrieve OpenStreetMap data from some other area! Download these elements using OSMnx functions from your area of interest:

  • Extent of the area using gdf_from_place()
  • Street network using graph_from_place(), and convert to gdf using ox.graph_to_gdfs()
  • Building footprints using ox.footprints_from_place()

Note that the larger the area you choose, the longer it takes to retrieve data from the API! Use the parameter network_type="drive" to limit the graph query to drivable roads.
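
To get a feeling for what limiting the network to drivable roads means, here is a simplified plain-Python sketch that drops edges based on their highway tag. The NON_DRIVABLE set is a shortened, hypothetical list chosen for this example, and keep_drivable is a made-up helper; OSMnx applies its own, longer filter. The edges are simplified from the edges table shown earlier.

```python
# Highway values treated as not drivable in this sketch.
# This is a simplified, hypothetical list and not the filter OSMnx uses.
NON_DRIVABLE = {"footway", "cycleway", "path", "pedestrian", "steps"}

# Edges simplified from the first rows of the edges table
edges = [
    {"name": "Kansakoulukuja", "highway": "residential"},
    {"name": None, "highway": "footway"},
    {"name": None, "highway": "primary_link"},
    {"name": "Hietalahdenkatu", "highway": "primary"},
]


def keep_drivable(edges):
    """Return only the edges whose highway tag is not in NON_DRIVABLE."""
    return [edge for edge in edges if edge["highway"] not in NON_DRIVABLE]


drivable = keep_drivable(edges)
print([edge["highway"] for edge in drivable])
```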

[ ]:

Extra: Park polygons

Notice that we can also retrieve other types of footprints from OpenStreetMap by specifying the footprint_type parameter when using functions from the OSMnx footprints module. The default value for this parameter is building, but we can also pass other OpenStreetMap tag keys.

Let’s try to fetch all public parks in the Kamppi area. In OpenStreetMap, parks are often tagged as leisure=park (other tags might also be used, such as landuse=recreation_ground or landuse=grass; see OpenStreetMap and the OSM wiki for more details).

  • We need to start by fetching all footprints from the tag leisure:
[21]:
leisure = ox.footprints_from_place(place_name, footprint_type="leisure")
  • Let’s check the data:
[22]:
leisure.head(3)
[22]:
nodes geometry leisure name name:fi name:sv hoitoluokitus_viheralue source wikidata wikipedia access alt_name loc_name barrier sport colour fixme mooring short_name short_name:sv
8042256 [292719496, 1001543836, 1037987967, 1001544060... POLYGON ((24.93566 60.17132, 24.93566 60.17130... park NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
8042613 [552965718, 293390264, 295056669, 256264975, 1... POLYGON ((24.93701 60.16947, 24.93627 60.16919... park Simonpuistikko Simonpuistikko Simonsskvären NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
15218362 [144181223, 150532964, 150532958, 150532966, 1... POLYGON ((24.92330 60.16499, 24.92323 60.16500... park Työmiehenpuistikko Työmiehenpuistikko Arbetarparken A2 survey NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
  • Check all values for the column leisure:
[23]:
leisure["leisure"].value_counts()
[23]:
park          15
pitch          8
playground     6
dog_park       2
flowerbed      1
marina         1
Name: leisure, dtype: int64
  • Select all park polygons (here, selecting both “park” and “playground”):
[24]:
parks = leisure[leisure["leisure"].isin(["park","playground"])]
  • Plot the parks:
[25]:
parks.plot(color="green")
[25]:
<matplotlib.axes._subplots.AxesSubplot at 0x15aa2ef95c0>
../../_images/notebooks_L6_retrieve_osm_data_56_1.png
  • Finally, we can re-project the park polygons and add them to our map:
[26]:
parks = parks.to_crs(projection)
[27]:
fig, ax = plt.subplots(figsize=(12,8))

# Plot the footprint
area.plot(ax=ax, facecolor='black')

# Plot the parks
parks.plot(ax=ax, facecolor="green")

# Plot street edges
edges.plot(ax=ax, linewidth=1, edgecolor='dimgray')

# Plot buildings
buildings.plot(ax=ax, facecolor='silver', alpha=0.7)

# Plot restaurants
restaurants.plot(ax=ax, color='yellow', alpha=0.7, markersize=10)
plt.tight_layout()
../../_images/notebooks_L6_retrieve_osm_data_59_0.png