Masking / clipping rasterΒΆ

One common task in raster processing is to clip raster files based on a Polygon. The following example shows how to clip a large raster based on a bounding box around Helsinki Region.

In [1]: import rasterio

In [2]: from rasterio.plot import show

In [3]: from rasterio.plot import show_hist

In [4]: from rasterio.mask import mask

In [5]: from shapely.geometry import box

In [6]: import geopandas as gpd

In [7]: from fiona.crs import from_epsg

In [8]: import pycrs
  • Specify input and output filepaths
# Filepaths
In [9]: fp = r"C:\HY-DATA\HENTENKA\CSC\Data\p188r018_7t20020529_z34__LV-FIN.tif"

In [10]: out_tif = r"C:\HY-DATA\HENTENKA\CSC\Data\Helsinki_masked_p188r018_7t20020529_z34__LV-FIN.tif"
  • Open the raster in read mode
In [11]: data = rasterio.open(fp)
  • Plot the data
In [12]: show((data, 4), cmap='terrain')
Out[12]: <matplotlib.axes._subplots.AxesSubplot at 0x20c927d76d8>
../../_images/large_raster.png

Okey, as you can see, we have a huge raster file where we can see the coastlines of Finland and Estonia. What we want to do next is to create a bounding box around Helsinki region and clip the raster based on that.

  • Next, we need to create a bounding box with Shapely.
# WGS84 coordinates
In [13]: minx, miny = 24.60, 60.00

In [14]: maxx, maxy = 25.22, 60.35

In [15]: bbox = box(minx, miny, maxx, maxy)
  • Insert the bbox into a GeoDataFrame
In [16]: geo = gpd.GeoDataFrame({'geometry': bbox}, index=[0], crs=from_epsg(4326))
  • Re-project into the same coordinate system as the raster data
In [17]: geo = geo.to_crs(crs=data.crs.data)
  • Next we need to get the coordinates of the geometry in such a format that rasterio wants them. This can be conducted easily with following function
def getFeatures(gdf):
    """Function to parse features from GeoDataFrame in such a manner that rasterio wants them"""
    import json
    return [json.loads(gdf.to_json())['features'][0]['geometry']]
  • Get the geometry coordinates by using the function.
In [18]: coords = getFeatures(geo)

In [19]: print(coords)
[{'type': 'Polygon', 'coordinates': [[[735275.3533476054, 6658919.843253607], [732783.5561207401, 6697846.086795722], [698608.1329965619, 6695816.080575279], [700733.5832412266, 6656875.248540204], [735275.3533476054, 6658919.843253607]]]}]

Okey, so rasterio wants to have the coordinates of the Polygon in this kind of format.

  • Now we are ready to clip the raster with the polygon using the coords variable that we just created. Clipping the raster can be done easily with the mask function that we imported in the beginning from rasterio, and specifying clip=True.
In [20]: out_img, out_transform = mask(raster=data, shapes=coords, crop=True)
  • Next, we need to modify the metadata. Let’s start by copying the metadata from the original data file.
# Copy the metadata
In [21]: out_meta = data.meta.copy()

In [22]: print(out_meta)
{'driver': 'GTiff', 'dtype': 'uint8', 'nodata': None, 'width': 8877, 'height': 8106, 'count': 7, 'crs': CRS({'init': 'epsg:32634'}), 'transform': (600466.5, 28.5, 0.0, 6784966.5, 0.0, -28.5), 'affine': Affine(28.5, 0.0, 600466.5,
       0.0, -28.5, 6784966.5)}
  • Next we need to parse the EPSG value from the CRS so that we can create a Proj4 string using PyCRS library (to ensure that the projection information is saved correctly).
# Parse EPSG code
In [23]: epsg_code = int(data.crs.data['init'][5:])

In [24]: print(epsg_code)
32634
  • Now we need to update the metadata with new dimensions, transform (affine) and CRS (as Proj4 text)
In [25]: out_meta.update({"driver": "GTiff",
   ....:                  "height": out_img.shape[1],
   ....:                  "width": out_img.shape[2],
   ....:                  "transform": out_transform,
   ....:                  "crs": pycrs.parser.from_epsg_code(epsg_code).to_proj4()}
   ....:                          )
   ....: 
  • Finally, we can save the clipped raster to disk with following command.
In [26]: with rasterio.open(out_tif, "w", **out_meta) as dest:
   ....:     dest.write(out_img)
   ....: 
  • Let’s still check that the result is correct by plotting our new clipped raster.
In [27]: clipped = rasterio.open(out_tif)

In [28]: show((clipped, 5), cmap='terrain')
Out[28]: <matplotlib.axes._subplots.AxesSubplot at 0x20c994f0a90>
../../_images/raster_big_clipped.png

Great, it worked! This is how you can easily clip (mask) raster files with rasterio.