Chapter 4. Geospatial Python Toolbox

The first three chapters of this book covered the history of geospatial analysis, the types of geospatial data used by analysts, and the major software and libraries found within the geospatial industry. We used some simple Python examples here and there to illustrate certain points, but we focused mainly on the field of geospatial analysis independent of any specific technology.

Starting here, we will be using Python to conquer geospatial analysis and we will continue with that approach for the rest of the book. In this chapter, we'll discover the Python libraries used to access the different types of data found in the Vector data and Raster data sections of Chapter 2, Geospatial Data. Some of these libraries are pure Python and some are bindings to the different software packages found in Chapter 3, The Geospatial Technology Landscape.

We will examine pure Python solutions whenever possible. Python is a very capable programming language, but some operations, particularly in remote sensing, are too computationally intensive and therefore impractical using pure Python or other interpreted languages. Fortunately, nearly every aspect of geospatial analysis can be addressed in some way through Python even if it is binding to a highly efficient C, C++, or other compiled-language library.

We will avoid using broad scientific libraries which cover other domains beyond geospatial analysis to keep solutions as simple as possible. There are many reasons to use Python for geospatial analysis, but one of the strongest arguments is its portability. Python is a ubiquitous programming language officially available as a compiled installation on over 20 platforms according to the python.org website. It comes as standard with most Linux distributions and is available on most major smart phone operating systems as well. The Python source distribution usually compiles on any platform supporting C.

Furthermore, Python has been ported to Java as the Jython distribution and to the .NET Common Language Runtime (CLR) as IronPython. Python also has versions such as Stackless Python for massively concurrent programs, and there are versions designed to run on cluster computers for distributed processing. Python is also available on many hosted application servers that do not allow you to install custom executables, such as the Google App Engine platform, which has a Python API. Modules written in pure Python using the standard library will almost always run on any of the platforms that we just mentioned.

Each time you add a third-party module which relies on bindings to external libraries in other languages, you reduce Python's inherent portability. You also add a layer of complexity by bringing another language into the mix. Pure Python keeps things simple. Also, Python bindings to external libraries tend to be automatically or semi-automatically generated. These generated bindings are very generic and esoteric: they simply connect Python to a C or C++ API using the method names from that API instead of following the best practices for Python. There are, of course, notable exceptions driven by project requirements, which may include speed, unique library features, or frequently updated libraries where an automatically generated interface is preferable.

Installing third-party Python modules

We'll make a distinction between modules which are included as part of Python's standard library and modules which must be installed. In Python, the words module and library are used interchangeably. To install libraries, you either get them from the Python Package Index (PyPI) or, in the case of a lot of geospatial modules, you download a specialized installer. PyPI acts as the official software repository for libraries and offers some easy-to-use setup programs, which simplify installing packages. You can use the easy_install program, which is especially good on Windows, or the pip program, which is more commonly found on Linux and Unix systems. Once easy_install is installed, you can install third-party packages simply by running the following command:

easy_install <package name>

To install a package with pip instead, you run the following command:

pip install <package name>
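
When an installation has to be scripted, a commonly used approach is to invoke pip through the running interpreter so that the package lands in that interpreter's site-packages directory. The following is a minimal sketch rather than a recommended installer; the helper names and the example package name pyshp are chosen purely for illustration.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the command line that installs a package with pip.

    Using sys.executable with -m pip targets the interpreter that is
    running this script, not whichever pip happens to be first on PATH.
    """
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Run pip and raise CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(package))


if __name__ == "__main__":
    # "pyshp" is only an example package name; substitute your own.
    print(" ".join(pip_install_command("pyshp")))
```

Keeping the command construction separate from its execution makes it possible to inspect or log the command before anything is installed.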

Links will be provided to installers and instructions for packages not available on PyPI. You can manually install third-party Python modules by downloading the Python source code and putting it in your current working directory, or you can put it in your Python site-packages directory. These two directories are available in Python's search path when you try to import a module. If you put a module in your current working directory, it'll only be available when you start Python from that directory.

If you put it in your site-packages directory, it'll be available every time you start Python. The site-packages directory is specifically meant for third-party modules. To locate the site-packages directory for your installation, you ask Python's sys module. The sys module has a path attribute that is a list of all directories in Python's search path. The site-packages directory is usually the last entry, which you can retrieve with an index of -1, as shown in the following code:

>>> import sys
>>> sys.path[-1]
'C:\\Python34\\lib\\site-packages'

If that call doesn't return the site-packages path, just look at the entire list to locate it, as shown in the following code:

>>> sys.path
['', 'C:\\WINDOWS\\system32\\python34.zip', 'C:\\Python34\\DLLs', 'C:\\Python34\\lib', 'C:\\Python34\\lib\\plat-win', 'C:\\Python34\\lib\\lib-tk', 'C:\\Python34', 'C:\\Python34\\lib\\site-packages']
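
The position of site-packages in sys.path is not guaranteed, so an index of -1 can miss it. As an alternative sketch, the standard library's sysconfig module reports the directory directly, and a simple filter over sys.path finds every entry whose final component is site-packages. The function name below is illustrative, and on some Linux distributions the directory carries a different name, in which case the filtered list may be empty.

```python
import os
import sys
import sysconfig


def find_site_packages():
    """Return sys.path entries whose last component is site-packages."""
    return [p for p in sys.path
            if os.path.basename(os.path.normpath(p)) == "site-packages"]


# Directory where pure-Python third-party modules are installed
# for the interpreter running this code.
purelib = sysconfig.get_paths()["purelib"]
print(purelib)
print(find_site_packages())
```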

These installation methods will be used for the rest of the book. You can find the latest Python version, source code for your platform installation, and compilation instructions at http://python.org/download/.

Note

The Python virtualenv module allows you to easily create an isolated copy of Python for a specific project without affecting your main Python installation or other projects. Using this module, you can have different projects with different versions of the same library. Once you have a working code base, you can then keep it isolated from changes to the modules you used or even Python itself. The virtualenv module is simple to use and can be used for any example in this book; however, explicit instructions on its use are not included. To get started with virtualenv, follow this simple guide:

http://docs.python-guide.org/en/latest/dev/virtualenvs/
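
Since Python 3.3, the standard library has also included the venv module, which provides similar isolation. It is a different tool from the virtualenv module described in this note, so treat the following as a sketch of the idea rather than a substitute for the guide. The function name and the geo-env directory name are placeholders.

```python
import os
import tempfile
import venv


def create_project_env(env_dir):
    """Create an isolated Python environment in env_dir.

    with_pip=False keeps the example fast and avoids any dependence on
    ensurepip; pass with_pip=True to have pip installed as well.
    """
    venv.create(env_dir, with_pip=False)
    # Every environment made by venv contains a pyvenv.cfg marker file.
    return os.path.join(env_dir, "pyvenv.cfg")


if __name__ == "__main__":
    # A temporary directory stands in for a real project folder.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_project_env(os.path.join(tmp, "geo-env"))
        print(os.path.exists(cfg))
```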

Installing GDAL

The Geospatial Data Abstraction Library (GDAL), which includes OGR, is critical to many of the examples in this book and is also one of the more complicated Python setups. For these reasons, we'll discuss it separately here. The latest GDAL bindings are available on PyPI; however, the installation requires a few more steps because of additional resources needed by the GDAL library.

There are three ways to install GDAL for use with Python:

  • Compile it from source code
  • Install it as part of a larger software package
  • Install a binary distribution and then Python bindings

If you have experience with compiling C libraries as well as the required compiler software, then the first option gives you the most control. However, it is not recommended if you just want to get going as quickly as possible because even experienced software developers can find compiling GDAL and the associated Python bindings challenging. Instructions for compiling GDAL on leading platforms can be found at http://trac.osgeo.org/gdal/wiki/BuildHints. There are also basic build instructions on the PyPI GDAL page. Have a look at https://pypi.python.org/pypi/GDAL.

The second option is by far the quickest and the easiest. The Open Source Geospatial Foundation (OSGeo) distributes an installer called OSGeo4W which installs all of the top open source geospatial packages on Windows at the click of a button. If you are on Linux, another package called FWTools has distributions for both Linux and Windows. OSGeo4W can be found at http://trac.osgeo.org/osgeo4w/.

FWTools is available online at http://fwtools.maptools.org/.

While these packages are the easiest to work with, they come with their own version of Python. If you already have Python installed, then having another Python distribution just to use certain libraries may be problematic. In that case, the third option may be for you.

The third option installs a pre-compiled binary specific to your Python version. This method is the best compromise between ease of installation and customization. The catch is you must make sure the binary distributions and the corresponding Python bindings are compatible with each other, your Python version, and in many cases your operating system configuration.

Windows

The installation of GDAL for Python on Windows becomes easier and easier each year. To install GDAL on Windows, you must first check whether you are running the 32-bit or 64-bit version of Python. To do so, just start your Python interpreter at a command prompt, as shown in the following code:

Python 3.4.2 (v3.4.2:ab2c023a9432, Oct 6 2014, 22:15:05) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
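
The same information can be read from inside the interpreter, which is handy when the startup banner is not visible. One common check, sketched here with an illustrative function name, measures the size of a C pointer with the struct module.

```python
import struct
import sys


def python_bits():
    """Return 32 or 64 depending on how this interpreter was built."""
    # "P" is the struct format code for a C pointer (void *);
    # its size is 4 bytes on 32-bit builds and 8 bytes on 64-bit builds.
    return struct.calcsize("P") * 8


print(sys.version)
print(python_bits())
```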

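In the startup banner shown above, Python reports version 3.4.2, and the text 32 bit (Intel) tells us it is the 32-bit version. The trailing win32 is not a reliable indicator on its own, because 64-bit builds of Python for Windows also report win32 as the platform. Once you have this information, go to the following URL: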

http://www.lfd.uci.edu/~gohlke/pythonlibs/#gdal

This web page contains Python Windows binaries and bindings for nearly every open source scientific library. On that web page, in the GDAL section, find the release that matches your version of Python. The release names use the abbreviation cp for CPython followed by the major and minor Python version numbers (34 for Python 3.4) and either win32 for 32-bit Windows or win_amd64 for 64-bit Windows. In the previous example, we would download the file named GDAL-1.11.3-cp34-none-win32.whl.

This download uses the newer Python wheel format, which pip can install directly. To install it, simply open a command prompt in the folder containing the file and type the following command:

pip install GDAL-1.11.3-cp34-none-win32.whl
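
Because pip rejects a wheel whose tags do not match the interpreter, it can help to compare the file name against the running Python first. The sketch below assumes the simple five-part name used on that page (name, version, Python tag, ABI tag, platform tag) and handles only the two Windows platform tags mentioned above. The function names are illustrative, and pip's own compatibility rules are more thorough than this check.

```python
import struct
import sys


def parse_wheel_name(filename):
    """Split a wheel file name into its five hyphen-separated fields."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {"name": name, "version": version, "python": python_tag,
            "abi": abi_tag, "platform": platform_tag}


def expected_windows_tags():
    """Return the (python, platform) tags for this interpreter on Windows."""
    python_tag = "cp{}{}".format(sys.version_info[0], sys.version_info[1])
    bits = struct.calcsize("P") * 8
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return python_tag, platform_tag


def wheel_matches(filename):
    """Check a wheel's Python and platform tags against this interpreter."""
    info = parse_wheel_name(filename)
    return (info["python"], info["platform"]) == expected_windows_tags()


info = parse_wheel_name("GDAL-1.11.3-cp34-none-win32.whl")
print(info["python"], info["platform"])
print(wheel_matches("GDAL-1.11.3-cp34-none-win32.whl"))
```

The final call prints True only when the script runs under 32-bit Python 3.4; under any other interpreter it prints False, which signals that a different file should be downloaded.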

Once the package is installed, open your Python interpreter and run the following commands to verify that GDAL is installed by checking its version:

Python 3.4.2 (v3.4.2:ab2c023a9432, Oct 6 2014, 22:15:05) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from osgeo import gdal
>>> gdal.__version__
'1.11.3'

GDAL should return its version as 1.11.3.
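
In a script, the same check can be made without assuming that the bindings are present. This sketch, with an illustrative function name, returns None when the osgeo package cannot be found and otherwise reports the version string from gdal.__version__.

```python
import importlib.util


def gdal_version():
    """Return the GDAL bindings version, or None if osgeo is missing."""
    if importlib.util.find_spec("osgeo") is None:
        return None
    # Imported lazily so the function is safe to call without GDAL.
    from osgeo import gdal
    return gdal.__version__


version = gdal_version()
print("GDAL not installed" if version is None else "GDAL " + version)
```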

Tip

If you have trouble installing modules using easy_install or pip and PyPI, try to download and install the wheel package from the same site as the GDAL example.

Linux

GDAL installation on Linux varies widely by distribution. The following gdal.org binaries web page lists installation instructions for several distributions:

http://trac.osgeo.org/gdal/wiki/DownloadingGdalBinaries

Typically, your package manager will install both GDAL and Python bindings. For example, on Ubuntu, to install GDAL you run the following command:

sudo apt-get install gdal-bin

Then to install the Python bindings, you run the following command:

sudo apt-get install python-gdal

Also, most Linux distributions already include the tools needed to compile software, and their build instructions are much simpler than those on Windows. Depending on the installation, you may have to import gdal and ogr as part of the osgeo package, as shown in the following code:

>>> from osgeo import gdal
>>> from osgeo import ogr
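
A script meant to run on installations of either kind can try the osgeo package first and fall back to top-level gdal and ogr modules. This is a sketch of that pattern; the helper name is illustrative.

```python
def import_gdal_ogr():
    """Return the (gdal, ogr) modules from whichever layout is installed.

    Raises ImportError if neither layout is available.
    """
    try:
        # Layout in which the bindings live inside the osgeo package.
        from osgeo import gdal, ogr
    except ImportError:
        # Layout in which gdal and ogr are top-level modules.
        import gdal
        import ogr
    return gdal, ogr


if __name__ == "__main__":
    try:
        gdal, ogr = import_gdal_ogr()
        print("GDAL bindings loaded")
    except ImportError:
        print("GDAL bindings not found")
```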

Mac OS X

To install GDAL on OS X, you can use the Homebrew package management system available at http://brew.sh/.

Alternatively, you can use the MacPorts package management system available at https://www.macports.org/.

Both of these systems are well documented and contain GDAL packages for Python 3. You really only need them for libraries which depend on a properly compiled C binary with many dependencies of its own, a category that includes many of the scientific and geospatial libraries.
