Retrieving OpenStreetMap data
What is OpenStreetMap?
OpenStreetMap (OSM) is a global, collaborative (crowd-sourced) dataset and project that aims to create a free, editable map of the world containing a lot of information about our environment. It contains data about, for example, streets, buildings, different services, and land use.
OSM has a large user base: more than 4 million users contribute actively by updating the OSM database with 3 million changesets per day. In total, OSM contains more than 4 billion nodes that form the basis of the digitally mapped world that OSM provides (stats from November 2017).
OpenStreetMap is used not only as a background map for visualizations and online maps, but also for many other purposes such as routing, geocoding, education, and research. OSM is also widely used for humanitarian response, e.g. in crisis areas after natural disasters, and for fostering economic development (see the Humanitarian OpenStreetMap Team (HOTOSM) website for more).
OSMnx
This week we will explore a Python module called osmnx that can be used to retrieve, construct, analyze, and visualize street networks from OpenStreetMap. In short, it offers handy functions to download data from OpenStreetMap, analyze the properties of OSM street networks, and conduct network routing based on walking, cycling, or driving.
There is also a scientific article available describing the package:
- Boeing, G. 2017. “OSMnx: New Methods for Acquiring, Constructing, Analyzing, and Visualizing Complex Street Networks.” Computers, Environment and Urban Systems 65, 126-139. doi:10.1016/j.compenvurbsys.2017.05.004
Download and visualize OpenStreetMap data with OSMnx
As said, one of the most useful features that osmnx provides is an easy-to-use way of retrieving OpenStreetMap data (using the Overpass API).
Let’s see how we can download and visualize street network data from the district of Kamppi in Helsinki, Finland. Osmnx makes this easy, as it lets you specify an address and retrieves the OpenStreetMap data around that area. In fact, osmnx uses the same Nominatim Geocoding API for this that we tested during the previous tutorial.
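To make the two web services mentioned above a bit more concrete, here is a minimal sketch of the kind of requests involved: a Nominatim search URL for the place name, and an Overpass QL query for all ways tagged highway inside a bounding box. This is not the exact request that osmnx sends. The helper names, the bounding box values, and the query layout are assumptions chosen for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlencode

# Public endpoint of the Nominatim search API
NOMINATIM_SEARCH_URL = "https://nominatim.openstreetmap.org/search"


def build_nominatim_url(place_name):
    """Build a Nominatim search URL that asks for the place boundary as GeoJSON."""
    params = {"q": place_name, "format": "json", "polygon_geojson": 1, "limit": 1}
    return NOMINATIM_SEARCH_URL + "?" + urlencode(params)


def build_overpass_query(south, west, north, east, timeout=25):
    """Build an Overpass QL query for all 'highway' ways inside a bounding box.

    Overpass expects the bounding box in (south, west, north, east) order.
    """
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout}];"
        f'(way["highway"]({bbox}););'
        "out body;>;out skel qt;"
    )


place_name = "Kamppi, Helsinki, Finland"
geocode_url = build_nominatim_url(place_name)

# Hypothetical bounding box roughly covering Kamppi (illustrative values only)
overpass_query = build_overpass_query(60.160, 24.920, 60.172, 24.943)

print(geocode_url)
print(overpass_query)
```

In practice osmnx takes care of both steps for you, so you rarely need to write such requests by hand.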
- Let’s retrieve OpenStreetMap (OSM) data by specifying "Kamppi, Helsinki, Finland" as the address where the data should be downloaded:
In [2]:
import osmnx as ox
import matplotlib.pyplot as plt
%matplotlib inline
# Specify the name that is used to search for the data
place_name = "Kamppi, Helsinki, Finland"
# Fetch OSM street network from the location
graph = ox.graph_from_place(place_name)
type(graph)
Out[2]:
networkx.classes.multidigraph.MultiDiGraph
Okay, as we can see, the data that we retrieved is a special data object called networkx.classes.multidigraph.MultiDiGraph. A DiGraph is a data type that stores nodes and edges with optional data, or attributes. A MultiDiGraph additionally allows several parallel edges between the same pair of nodes. What we can see here is that this data type belongs to a Python module called networkx that can be used to create, manipulate, and study the structure, dynamics, and functions of complex networks. The networkx module contains algorithms that can be used to calculate shortest paths along road networks using e.g. Dijkstra’s or the A* algorithm.
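To give an idea of what such a shortest path calculation does, below is a minimal pure-Python sketch of Dijkstra’s algorithm on a tiny directed graph. It is not how networkx implements it, and the node names and edge lengths are made up for illustration.

```python
import heapq

# Hypothetical directed street graph: node -> {neighbor: edge length in meters}
toy_graph = {
    "A": {"B": 50, "C": 120},
    "B": {"C": 40, "D": 150},
    "C": {"D": 60},
    "D": {},
}


def dijkstra(graph, source, target):
    """Return (total length, list of nodes) of the shortest path, or (inf, [])."""
    queue = [(0, source)]  # priority queue of (distance so far, node)
    best = {source: 0}     # shortest known distance to each node
    previous = {}          # predecessor of each node on its best route
    visited = set()

    while queue:
        dist, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, length in graph[node].items():
            new_dist = dist + length
            if new_dist < best.get(neighbor, float("inf")):
                best[neighbor] = new_dist
                previous[neighbor] = node
                heapq.heappush(queue, (new_dist, neighbor))

    if target not in best:
        return float("inf"), []

    # Walk backwards from the target to reconstruct the route
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


shortest_length, shortest_route = dijkstra(toy_graph, "A", "D")
print(shortest_length, shortest_route)
```

The direct edge from A to C is longer than the detour through B, so the cheapest route visits every node in this toy example.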
- Let’s see what our street network looks like. It is easy to visualize the graph with osmnx using the plot_graph() function. The function uses Matplotlib for visualizing the data, so it returns a Matplotlib figure and axis object:
In [4]:
# Plot the streets
fig, ax = ox.plot_graph(graph)
Great! Now we can see that our graph contains the nodes (blue circles) and the edges (gray lines) that connect those nodes to each other.
It is also possible to retrieve other types of OSM data features with osmnx such as buildings or points of interest (POIs).
- Let’s download the buildings with the buildings_from_place() function and plot them on top of our street network in Kamppi. Let’s also plot the Polygon that represents the area of Kamppi, Helsinki, which can be retrieved with the gdf_from_place() function.
In [6]:
# Retrieve the footprint of our location
area = ox.gdf_from_place(place_name)
# Retrieve buildings from the area
buildings = ox.buildings_from_place(place_name)
# What types are those?
print(type(area))
print(type(buildings))
<class 'geopandas.geodataframe.GeoDataFrame'>
<class 'geopandas.geodataframe.GeoDataFrame'>
As a result we got the data as GeoDataFrames.
OSMnx has a nice function called ox.pois_from_place() that can be used in a similar manner as the previous function to retrieve specific POI data from OpenStreetMap, such as restaurants. With the amenities parameter we can pass a list of OSM amenity categories that we are interested in retrieving.
- Let’s also retrieve restaurants that are located in the area:
In [8]:
# Retrieve restaurants
restaurants = ox.pois_from_place(place_name, amenities=['restaurant'])
# How many restaurants do we have?
len(restaurants)
Out[8]:
199
As we can see, there are quite a few restaurants in the area.
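Conceptually, the amenities parameter works like a tag filter: only elements whose amenity tag matches one of the listed values are kept. The sketch below mimics that idea on a few hand-made records. The records and the helper function are hypothetical and do not reflect how osmnx stores or queries the data internally.

```python
# Hand-made stand-ins for OSM elements (not real query results)
elements = [
    {"osmid": 1, "tags": {"amenity": "restaurant", "name": "Example Bistro"}},
    {"osmid": 2, "tags": {"amenity": "cafe", "name": "Example Cafe"}},
    {"osmid": 3, "tags": {"shop": "bakery", "name": "Example Bakery"}},
    {"osmid": 4, "tags": {"amenity": "restaurant"}},
]


def filter_by_amenity(elements, amenities):
    """Keep the elements whose 'amenity' tag is one of the requested values."""
    wanted = set(amenities)
    return [e for e in elements if e["tags"].get("amenity") in wanted]


restaurants_only = filter_by_amenity(elements, ["restaurant"])
print(len(restaurants_only))
```

Note that an element without a name tag is still kept, which is one reason why many attribute values in real OSM data turn out to be missing.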
- Let’s explore what kind of attributes we have in our restaurants GeoDataFrame:
In [9]:
# Available columns
restaurants.columns
Out[9]:
Index(['access:dog', 'addr:city', 'addr:country', 'addr:floor',
'addr:housename', 'addr:housenumber', 'addr:place', 'addr:postcode',
'addr:street', 'address', 'alt_name', 'amenity', 'brunch', 'capacity',
'contact:email', 'contact:foursquare', 'contact:phone',
'contact:website', 'contact:yelp', 'created_by', 'cuisine', 'delivery',
'description', 'description:en', 'diet:vegan', 'diet:vegetarian',
'element_type', 'email', 'entrance', 'established', 'geometry',
'highchair', 'internet_access', 'is_in', 'layer', 'level', 'lunch',
'name', 'name:en', 'name:fi', 'name:sv', 'name:zh', 'note', 'office',
'opening_hours', 'opening_hours:brunch', 'opening_hours:lunch',
'opening_hours:lunch_buffet', 'operator', 'osmid', 'outdoor_seating',
'phone', 'ref:vatin', 'shop', 'smoking', 'source', 'takeaway',
'toilets:wheelchair', 'url', 'was:name', 'website', 'website:en',
'wheelchair', 'wheelchair:description', 'wikidata', 'building',
'nodes'],
dtype='object')
Wow, there is quite a lot of information related to the POIs. Some of the most useful attributes might be, for example, the name, the address information, and the opening_hours information:
In [16]:
# Select some useful cols and print
cols = ['name', 'opening_hours', 'addr:city', 'addr:country',
'addr:housenumber', 'addr:postcode', 'addr:street']
# Print only selected cols
restaurants[cols].head(10)
Out[16]:
| | name | opening_hours | addr:city | addr:country | addr:housenumber | addr:postcode | addr:street |
---|---|---|---|---|---|---|---|
60062502 | Kabuki | NaN | Helsinki | FI | 12 | 00180 | Lapinlahdenkatu |
60133792 | Ateljé Finne | NaN | Helsinki | FI | NaN | NaN | NaN |
62965963 | Empire Plaza | NaN | NaN | NaN | NaN | NaN | NaN |
62967659 | Ravintola Pääposti | NaN | Helsinki | NaN | 1 B | 00100 | Mannerheiminaukio |
68734026 | Hampton Bay | NaN | Helsinki | FI | 6 | 00120 | Hietalahdenranta |
76617692 | Johan Ludvig | NaN | Helsinki | FI | NaN | NaN | NaN |
76624339 | Ravintola Rivoletto | Mo-Th 11:00-23:00; Fr 11:00-24:00; Sa 15:00-24... | Helsinki | FI | 38 | 00120 | Albertinkatu |
76624351 | Pueblo | NaN | Helsinki | FI | NaN | NaN | Eerikinkatu |
76627823 | Atabar | NaN | Helsinki | FI | NaN | NaN | Eerikinkatu |
89074039 | Papa Albert | Mo-Th 10:00-14:00, 17:30-22:00; Fr 11:00-23:00... | Helsinki | FI | 30 | 00120 | Albertinkatu |
As we can see, there is a lot of useful information about restaurants that can be retrieved easily with OSMnx.
We can now plot all these different OSM layers by using the familiar plot() function of Geopandas. As you might remember, the street network data was not in GeoDataFrame format (it was a networkx.MultiDiGraph). Luckily, osmnx provides a convenient function, graph_to_gdfs(), that converts the graph into two separate GeoDataFrames, where the first one contains the information about the nodes and the second one about the edges.
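Conceptually, this conversion flattens the graph into one table of nodes and one table of edges. The sketch below shows the idea with plain dictionaries. The toy data and the helper function are hypothetical, and the real function additionally builds the geometries and returns GeoDataFrames.

```python
# Toy stand-in for a street graph (hypothetical values, not real OSM data)
toy_nodes = {
    1: {"x": 24.93, "y": 60.16},
    2: {"x": 24.94, "y": 60.17},
}
# Edges are keyed by (u, v, key); the key tells parallel edges apart
toy_edges = {
    (1, 2, 0): {"highway": "residential", "length": 120.0},
    (1, 2, 1): {"highway": "footway", "length": 95.0},
    (2, 1, 0): {"highway": "residential", "length": 120.0},
}


def graph_to_tables(nodes, edges):
    """Flatten node and edge dictionaries into two lists of row dictionaries."""
    node_rows = [{"osmid": node_id, **attrs} for node_id, attrs in nodes.items()]
    edge_rows = [
        {"u": u, "v": v, "key": key, **attrs} for (u, v, key), attrs in edges.items()
    ]
    return node_rows, edge_rows


node_rows, edge_rows = graph_to_tables(toy_nodes, toy_edges)
print(len(node_rows), len(edge_rows))
```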
- Let’s extract the nodes and edges from the graph as GeoDataFrames:
In [20]:
# Retrieve nodes and edges
nodes, edges = ox.graph_to_gdfs(graph)
print("Nodes:\n", nodes.head(), '\n')
print("Edges:\n", edges.head(), '\n')
print("Type:", type(edges))
Nodes:
highway osmid ref x y \
3216400385 turning_circle 3216400385 NaN 24.934 60.1676
1372233731 crossing 1372233731 NaN 24.9293 60.1623
3216400394 NaN 3216400394 NaN 24.9339 60.1676
1372233748 NaN 1372233748 NaN 24.9299 60.1632
3216400409 NaN 3216400409 NaN 24.9337 60.1678
geometry
3216400385 POINT (24.9340047 60.1675525)
1372233731 POINT (24.9292726 60.1622912)
3216400394 POINT (24.9339275 60.1676411)
1372233748 POINT (24.9299486 60.1632382)
3216400409 POINT (24.9337467 60.1678235)
Edges:
access bridge geometry \
0 NaN NaN LINESTRING (24.9340047 60.1675525, 24.9339332 ...
1 NaN NaN LINESTRING (24.9292726 60.1622912, 24.9294092 ...
2 NaN NaN LINESTRING (24.9292726 60.1622912, 24.9291752 ...
3 NaN NaN LINESTRING (24.9292726 60.1622912, 24.9293 60....
4 NaN NaN LINESTRING (24.9292726 60.1622912, 24.9293795 ...
highway junction key lanes length maxspeed name oneway \
0 residential NaN 0 NaN 11.431 30 Kansakoulukuja False
1 footway NaN 0 NaN 9.323 NaN NaN False
2 footway NaN 0 NaN 6.855 NaN NaN False
3 primary_link NaN 0 2 34.508 40 NaN True
4 primary NaN 0 2 38.872 40 Hietalahdenkatu True
osmid ref service tunnel u v
0 15240373 NaN NaN NaN 3216400385 301360890
1 86533507 NaN NaN NaN 1372233731 1005727584
2 86533507 NaN NaN NaN 1372233731 298367080
3 15103120 NaN NaN NaN 1372233731 292859610
4 86533509 NaN NaN NaN 1372233731 311043714
Type: <class 'geopandas.geodataframe.GeoDataFrame'>
Nice! Now, as we can see, we have our graph as GeoDataFrames and we can plot them using the same functions and tools as we have used before.
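The length column of the edges is expressed in meters. As a rough illustration of where such a figure comes from, the sketch below computes the great-circle (haversine) distance between two of the node coordinates printed above. The helper function and the Earth radius constant are assumptions made for this example, and the two nodes are not necessarily joined by a single edge.

```python
import math

EARTH_RADIUS_M = 6371009  # mean Earth radius in meters (assumed constant)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Coordinates of nodes 3216400385 and 3216400394 from the output above
# (x is the longitude, y is the latitude)
node_a = (24.9340047, 60.1675525)
node_b = (24.9339275, 60.1676411)

distance_m = haversine_m(*node_a, *node_b)
print(round(distance_m, 1))
```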
Note: There are also other ways of retrieving data from OpenStreetMap with osmnx, such as passing a Polygon to extract the data from that area, or passing Point coordinates and retrieving the data around that location within a specific radius. Take a look at this tutorial to find out how to use those features of osmnx.
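As a rough idea of what retrieving data around a point within a given radius involves, the sketch below turns a centre point and a distance in meters into a bounding box. It uses a simple spherical approximation and a hypothetical helper function. It is not the osmnx implementation, and the centre coordinates are simply taken from one of the nodes in the output above.

```python
import math

EARTH_RADIUS_M = 6371009  # mean Earth radius in meters (assumed constant)


def bbox_around_point(lat, lon, dist_m):
    """Return (south, west, north, east) of a box reaching dist_m from the point."""
    # Angular distance along a meridian
    delta_lat = math.degrees(dist_m / EARTH_RADIUS_M)
    # Meridians converge towards the poles, so scale by cos(latitude)
    delta_lon = math.degrees(dist_m / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return lat - delta_lat, lon - delta_lon, lat + delta_lat, lon + delta_lon


south, west, north, east = bbox_around_point(60.1675525, 24.9340047, 500)
print(south, west, north, east)
```

At the latitude of Helsinki the box spans roughly twice as many degrees of longitude as of latitude, because one degree of longitude covers a shorter distance there.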
- Let’s create a map out of the streets, buildings, restaurants, and the area Polygon but let’s exclude the nodes (to keep the figure clearer).
In [34]:
# Plot the footprint
ax = area.plot(facecolor='black')
# Plot street edges
edges.plot(ax=ax, linewidth=1, edgecolor='#BC8F8F')
# Plot buildings
buildings.plot(ax=ax, facecolor='khaki', alpha=0.7)
# Plot restaurants
restaurants.plot(ax=ax, color='green', alpha=0.7, markersize=10)
plt.tight_layout()
Cool! Now we have a map where we have plotted the restaurants, buildings, streets and the boundaries of the selected region of ‘Kamppi’ in Helsinki. And all of this required only a few lines of code. Pretty neat!