Introduction to Python GIS

General overview of the course

During the next three intensive days you will learn how to deal with spatial data and analyze it using “pure” Python.

Learning objectives

At the end of the course you should have a basic idea of how to carry out the following GIS tasks in Python:

  • Read / write spatial data from/to different file formats
  • Deal with different projections
  • Conduct different geometric operations and spatial queries
  • Convert addresses to points and vice versa (i.e. geocoding)
  • Reclassify your data based on different criteria
  • Know how to fetch data from OpenStreetMap easily with Python
  • Know the basics of raster processing in Python
  • Visualize data and create (interactive) maps, such as the following:
[Figure: texas_unemployment.py example]
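
To give a rough feel for what "dealing with projections" involves, here is a minimal sketch that converts longitude/latitude in degrees to Web Mercator (EPSG:3857) metres using only the standard library. The function name and the sample coordinates are illustrative assumptions, not course material; in practice a library such as pyproj would handle this, along with the many projections that have no simple closed-form formula.

```python
import math

# WGS84 semi-major axis in metres; Web Mercator treats the Earth as a sphere of this radius
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project lon/lat in degrees to Web Mercator (EPSG:3857) x/y in metres.

    Hypothetical helper for illustration only; valid for latitudes
    strictly between -90 and 90 degrees.
    """
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = EARTH_RADIUS_M * lon
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# Approximate coordinates of central Helsinki, chosen only as sample input
x, y = lonlat_to_web_mercator(24.94, 60.17)
print(round(x), round(y))
```

The same latitude difference gives a larger y difference the further you move from the equator, which is why distances and areas measured directly in this projection are distorted at high latitudes.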

Why Python for GIS?

Python is an extremely useful language to learn for GIS, since many (or most) GIS software packages (such as ArcGIS, QGIS, PostGIS etc.) provide an interface for doing analysis with Python scripting. During this course, we will mostly focus on doing GIS without any third-party software such as ArcGIS. Why? There are several reasons for doing GIS using Python without any additional software:

  • Everything is free: you don’t need to buy an expensive license for ArcGIS (for example)
  • You will learn and understand much more deeply how different geoprocessing operations work
  • Python is highly efficient: it is widely used for analysing Big Data
  • Python is highly flexible: it supports practically any data format you can imagine
  • Using Python (or any other open-source programming language) supports open-source software and code, as well as open science, by making it possible for everyone to reproduce your work free of charge.
  • You can plug in and chain together different third-party software to build, for example, a fancy web-GIS application (using e.g. GeoDjango with PostGIS as a back end)
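
As a small illustration of the point about understanding how geoprocessing operations work, the sketch below implements a point-in-polygon query in plain Python with the even-odd ray casting rule. It is a simplified, hypothetical helper (the function name and test shapes are assumptions made for this example), not the implementation used by any GIS library; a package such as Shapely offers robust versions of this kind of query.

```python
def point_in_polygon(x, y, polygon):
    """Return True if the point (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertex tuples; the last vertex is
    implicitly connected back to the first. Uses even-odd ray casting:
    a horizontal ray is cast to the right of the point and the number of
    edge crossings is counted. Points exactly on the boundary are not
    handled consistently in this simplified sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon(2, 2, square))  # True
print(point_in_polygon(5, 2, square))  # False
```

An odd number of crossings means the point is inside, an even number means it is outside, which is why the flag is simply toggled at each crossing.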

What tools are available for doing GIS in pure Python?

Short answer: Many!

During the course we will familiarize ourselves with a bunch of Python modules that are useful when doing data analysis and different GIS tasks.

One drawback compared to using a dedicated GIS software package such as ArcGIS is that the GIS tools are spread across different Python modules created by different developers. This means that you need to familiarize yourself with many different modules (and their documentation), whereas e.g. in ArcGIS everything is packaged under a single module called arcpy.

Below is a list of useful libraries (and links to their docs) that help you get going when doing data analysis or GIS in Python. When you start using these modules in your own work, it is highly recommended that you read the documentation on the web pages of the modules you use:

  • Data analysis & visualization:
    • Numpy –> Fundamental package for scientific computing with Python
    • Pandas –> High-performance, easy-to-use data structures and data analysis tools
    • Scipy –> A collection of numerical algorithms and domain-specific toolboxes, including signal processing, optimization and statistics
    • Statsmodels –> Statistical models for Python
    • Scikit-learn –> Machine learning for Python (classification, regression, clustering, etc.)
    • Matplotlib –> Basic plotting library for Python
    • Seaborn –> Statistical data visualization
    • Bokeh –> Interactive visualizations for the web (also maps)
    • Plotly –> Interactive visualizations (also maps) for the web (commercial - free for educational purposes)
    • Dash –> Building analytical web applications with Python (no Javascript required)
  • GIS:
    • GDAL –> Fundamental package for processing vector and raster data formats (many modules below depend on this). Used for raster processing.
    • Geopandas –> Working with geospatial data in Python made easier, combines the capabilities of pandas and shapely.
    • Shapely –> Python package for manipulation and analysis of planar geometric objects (based on widely deployed GEOS).
    • Fiona –> Reading and writing spatial data (alternative to geopandas).
    • Pyproj –> Performs cartographic transformations and geodetic computations (based on PROJ.4).
    • PyCRS –> Working easily with different CRS specifications (EPSG, ESRI, Proj4)
    • Pysal –> Library of spatial analysis functions written in Python.
    • Geopy –> Geocoding library: coordinates to address <-> address to coordinates.
    • GeoViews –> Interactive Maps for the web.
    • Geoplot –> High-level geospatial data visualization library for Python.
    • GeoNotebook –> Desktop GIS-like environment for visualizing and interacting with spatial data using Python (based on Jupyter Notebooks)
    • OSMnx –> Python for street networks. Retrieve, construct, analyze, and visualize street networks from OpenStreetMap
    • Networkx –> Network analysis and routing in Python (e.g. Dijkstra and A* algorithms), see this post.
    • Cartopy –> Make drawing maps for data analysis and visualisation as easy as possible.
    • Scipy.spatial –> Spatial algorithms and data structures.
    • Rtree –> Spatial indexing for Python for quick spatial lookups.
    • Rasterio –> Clean and fast geospatial raster I/O for Python.
    • Rasterstats –> A module for summarizing geospatial raster datasets based on vector geometries (e.g. conduct zonal statistics).
    • RSGISLib –> Remote Sensing and GIS Software Library for Python.

Install to your own computer!

See the directions for installing these modules on your own computer here.