Re-projecting data

A map projection is a systematic transformation of the latitudes and longitudes into a plain surface. As map projections of gis-layers are fairly often defined differently (i.e. they do not match), it is a common procedure to redefine the map projections to be identical in both layers. It is important that the layers have the same projection as it makes it possible to analyze the spatial relationships between layer, such as conduct the Point in Polygon spatial query (which we will try next).

Defining a projection and changing it is easy in Geopandas. Let’s continue working with our address points, and change the Coordinate Reference System (CRS) from WGS84 into a projection called ETRS GK-25 (EPSG:3879) which uses a Gauss-Krüger projection that is (sometimes) used in Finland.

In [1]: import geopandas as gpd

# Filepath to the addresses Shapefile
In [2]: fp = "/home/geo/addresses.shp"

# Read data
In [3]: data = gpd.read_file(fp)
  • Let’s check what is the current CRS of our layer
In [4]: data.crs
Out[4]: {'init': 'epsg:4326'}

Okey, so it is WGS84 (i.e. EPSG: 4326).

  • Let’s also check the values in our geometry column
In [5]: data['geometry'].head()
Out[5]: 
0           POINT (24.9301701 60.1683731)
1           POINT (24.9418933 60.1698665)
2    POINT (24.9774004 60.18735880000001)
3    POINT (25.0919641 60.21448089999999)
4           POINT (24.9214846 60.1565781)
Name: geometry, dtype: object

Okey, so they indeed look like lat-lon values.

  • Let’s convert those geometries into ETRS GK-25 projection (EPSG: 3879). Changing the projection is really easy to do in Geopandas with .to_crs() -function. As an input for the function, you should define the column containing the geometries, i.e. geometry in this case, and a epgs value of the projection that you want to use.
  • Note: there is also possibility to pass the projection information as proj4 strings or dictionaries, see more here
# Let's take a copy of our layer
In [6]: data_proj = data.copy()

# Reproject the geometries by replacing the values with projected ones
In [7]: data_proj['geometry'] = data_proj['geometry'].to_crs(epsg=3879)
  • Let’s see how they look now
In [8]: data_proj['geometry'].head()
Out[8]: 
0    POINT (25496123.30852197 6672833.941567578)
1    POINT (25496774.28242895 6672999.698581985)
2     POINT (25498746.0795546 6674947.404346379)
3    POINT (25505098.34340289 6677972.568484426)
4    POINT (25495639.56049686 6671520.343245601)
Name: geometry, dtype: object

And here we go, the numbers have changed! Now we have successfully changed the projection of our layer into a new one.

  • Let’s still compare the layers visually
import matplotlib.pyplot as plt

# Plot the WGS84
data.plot(markersize=6, color="red");

# Add title
plt.title("WGS84 projection");

# Remove empty white space around the plot
plt.tight_layout()

# Plot the one with ETRS GK-25 projection
data_proj.plot(markersize=6, color="blue");

# Add title
plt.title("ETRS GK-25 projection");

# Remove empty white space around the plot
plt.tight_layout()

_images/wgs84.png _images/projected.png

Indeed, they look different and our re-projected one looks much better in Finland (not so stretced as in WGS84).

  • Now we still need to change the crs of our GeoDataFrame into EPSG 3879 as now we only modified the values of the geometry column. We can take use of fiona’s from_epsg -function.
In [9]: from fiona.crs import from_epsg

# Determine the CRS of the GeoDataFrame
In [10]: data_proj.crs = from_epsg(3879)

# Let's see what we have
In [11]: data_proj.crs
Out[11]: {'init': 'epsg:3879', 'no_defs': True}

Note

The above works for most EPSG codes but as ETRS GK-25 projection is a rather rare one, we still need to make sure that .prj file is having correct coordinate system information. We do that by passing a proj4 dictionary (below) into it (otherwise the .prj file of the Shapefile might be empty):

# Pass the coordinate information
In [12]: data_proj.crs = {'y_0': 0, 'no_defs': True, 'x_0': 25500000, 'k': 1, 'lat_0': 0, 'units': 'm', 'lon_0': 25, 'ellps': 'GRS80', 'proj': 'tmerc'}

# Check that it changed
In [13]: data_proj.crs
Out[13]: 
{'ellps': 'GRS80',
 'k': 1,
 'lat_0': 0,
 'lon_0': 25,
 'no_defs': True,
 'proj': 'tmerc',
 'units': 'm',
 'x_0': 25500000,
 'y_0': 0}
  • Finally, let’s save our projected layer into a Shapefile so that we can use it later.
# Ouput file path
outfp = r"/home/geo/addresses_epsg3879.shp"

# Save to disk
data_proj.to_file(outfp)