Using the GDAL library to load and query rasters

Now that you have gdal installed, import it using:

from osgeo import gdal

At the time of writing, GDAL 2 was the most recent version. If you have an older version of gdal installed, you may need to import it using the following code:

import gdal
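If you are not sure which import style your installation supports, the two imports above can be combined into a fallback. This is only a minimal sketch: setting gdal to None when neither import works is an assumption made for illustration, not something the library does for you.

```python
# Try the packaged import first, then fall back to the legacy top-level module.
try:
    from osgeo import gdal
except ImportError:
    try:
        import gdal  # older installations only
    except ImportError:
        gdal = None  # illustrative choice: mark GDAL as unavailable

if gdal is None:
    print("GDAL is not installed in this environment")
else:
    # VersionInfo("RELEASE_NAME") returns a string such as "2.2.3"
    print("GDAL version:", gdal.VersionInfo("RELEASE_NAME"))
```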

If this is the case, you may want to look into upgrading your version of gdal. Once you have gdal imported, you can open a raster image. First, let's get an image from the web. The Earth Data Analysis Center at the University of New Mexico maintains the Resource Geographic Information System (RGIS), where you will find New Mexico GIS data. Browse to http://rgis.unm.edu/ and, from the Get Data link, select Shaded Relief, General, and New Mexico. Then, download the Color Shaded Relief of New Mexico (Georeferenced TIFF) file.

When you extract the ZIP file, you will have several files. We are only interested in nm_relief_color.tif. The following code will open the TIF using gdal:

nmtif = gdal.Open(r'C:\Desktop\ColorRelief\nm_relief_color.tif')
print(nmtif.GetMetadata())

The previous code opens the TIF. This is very similar to opening any file in Python, except that you use gdal.Open instead of the built-in open function. The next line prints the metadata from the TIF, and the output is shown as follows:

{'AREA_OR_POINT': 'Area', 'TIFFTAG_DATETIME': '2002:12:18 8:10:06', 'TIFFTAG_RESOLUTIONUNIT': '2 (pixels/inch)', 'TIFFTAG_SOFTWARE': 'IMAGINE TIFF Support\nCopyright 1991 - 1999 by ERDAS, Inc. All Rights Reserved\n@(#)$RCSfile: etif.c $ $Revision: 1.9.3.3 $ $Date: 2002/07/29 15:51:11EDT $', 'TIFFTAG_XRESOLUTION': '96', 'TIFFTAG_YRESOLUTION': '96'}
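One thing to watch for when opening a raster: unless GDAL exceptions have been enabled, gdal.Open reports a failure (for example, a mistyped path) by returning None rather than raising an error. The helper below is a hedged sketch of a defensive wrapper; the name open_raster is hypothetical and is not part of GDAL.

```python
def open_raster(path):
    """Open a raster with GDAL and fail loudly if it cannot be read."""
    # Imported inside the function so the helper can be defined even on a
    # machine where GDAL is not installed yet.
    from osgeo import gdal

    dataset = gdal.Open(path)
    # Without exceptions enabled, a failed open returns None instead of raising.
    if dataset is None:
        raise IOError("GDAL could not open " + path)
    return dataset
```

With this in place, a call such as open_raster on the nm_relief_color.tif path returns the same dataset object as before, while a wrong path stops immediately instead of failing later on GetMetadata.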

The metadata printed above gives you some basic information, such as the creation and revision dates, the resolution, and the resolution unit (pixels per inch). One characteristic of the data we are interested in is the projection. To find it, use the following code:

nmtif.GetProjection()

Calling the GetProjection method on the TIF shows that no projection is defined. The output of the code is as follows:

'LOCAL_CS[" Geocoding information not available Projection Name = Unknown Units = other GeoTIFF Units = other",UNIT["unknown",1]]'

If you open this TIF in QGIS, you will get a warning that the CRS is undefined and it will default to EPSG:4326. I know that the image is projected, and we can confirm this by looking at the nm_relief_color.tif.xml file. If you scroll to the bottom, you will see the values under the XML tag <cordsysn>, as follows:

<cordsysn>
<geogcsn>GCS_North_American_1983</geogcsn>
<projcsn>NAD_1983_UTM_Zone_13N</projcsn>
</cordsysn>
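Instead of scrolling through the file, you could pull the projection name out with Python's standard xml.etree.ElementTree module. The following is a sketch only: it parses the excerpt shown above from a string, and the helper name projected_crs_name is hypothetical. In the real file, the <cordsysn> element is nested inside other elements, which is why the code searches the whole tree.

```python
import xml.etree.ElementTree as ET

# The excerpt from nm_relief_color.tif.xml shown above, as a string.
snippet = """<cordsysn>
<geogcsn>GCS_North_American_1983</geogcsn>
<projcsn>NAD_1983_UTM_Zone_13N</projcsn>
</cordsysn>"""


def projected_crs_name(root):
    """Return the text of the first <projcsn> element, or None if absent."""
    element = next(root.iter("projcsn"), None)
    if element is None or element.text is None:
        return None
    return element.text.strip()


root = ET.fromstring(snippet)
print(projected_crs_name(root))  # prints NAD_1983_UTM_Zone_13N

# For the file on disk, you would build the root like this instead:
# root = ET.parse("nm_relief_color.tif.xml").getroot()
```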

If you look up the projection at spatialreference.org, you will find that it is EPSG:26913. We can use gdal to set the projection, as shown in the following code:

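from osgeo import osr
p = osr.SpatialReference()            # start with an empty spatial reference
p.ImportFromEPSG(26913)               # load NAD83 / UTM zone 13N
nmtif.SetProjection(p.ExportToWkt())  # assign the projection as WKT
nmtif.GetProjection()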

The previous code imports the osr library. It then uses the library to create a new SpatialReference. Next, it imports a known reference using ImportFromEPSG and passes 26913. It then uses SetProjection, passing the WKT for EPSG:26913. Lastly, it calls GetProjection so that we can see that the code worked. The results are as follows:

'PROJCS["NAD83 / UTM zone 13N",GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6269"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4269"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-105],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","26913"]]'

The previous output is the WKT for EPSG:26913.
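If you would rather check the result in code than read the WKT by eye, one option is to pull the EPSG code out of the string. This is a rough sketch that assumes the older WKT layout shown above, where the outermost AUTHORITY entry comes last; the function name epsg_code_from_wkt is hypothetical, and the sample string is an abbreviated version of the real output.

```python
import re


def epsg_code_from_wkt(wkt):
    """Return the last EPSG authority code in a WKT string, or None."""
    codes = re.findall(r'AUTHORITY\["EPSG","(\d+)"\]', wkt)
    # In the output above, the final AUTHORITY entry describes the whole CRS.
    return int(codes[-1]) if codes else None


# Abbreviated stand-in for the GetProjection() output shown above.
sample = ('PROJCS["NAD83 / UTM zone 13N",GEOGCS["NAD83",'
          'AUTHORITY["EPSG","4269"]],UNIT["metre",1,'
          'AUTHORITY["EPSG","9001"]],AUTHORITY["EPSG","26913"]]')

print(epsg_code_from_wkt(sample))  # prints 26913
```

Passing the original LOCAL_CS string to the same function gives None, which is a quick way to tell that no EPSG code is attached.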

Open the TIF in QGIS and it will load with no warnings. I can add a copy of the Albuquerque streets to it and they will appear exactly where they should, because both sets of data are in EPSG:26913. The following image shows the TIF and the streets of Albuquerque, in the center of New Mexico:

Tif of NM with Streets shapefile

Now that we have added a projection, we can save a new version of the TIF:

geoTiffDriver="GTiff"
driver=gdal.GetDriverByName(geoTiffDriver)
out=driver.CreateCopy("copy.tif",nmtif,strict=0)

To see that the new file has the spatial reference, use the following code:

out.GetProjection()

The previous code will output the well-known text (WKT) for EPSG:26913, as follows:

'PROJCS["NAD83 / UTM zone 13N",GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6269"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4269"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-105],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","26913"]]'
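When saving a copy from a script rather than an interactive session, it can help to flush and release the new dataset so that the file is fully written to disk before other programs read it. The following is a hedged sketch of the copy step wrapped in a function; the name save_copy and its error handling are assumptions for illustration, not part of GDAL.

```python
def save_copy(source, path, driver_name="GTiff"):
    """Copy an open GDAL dataset to a new file and make sure it is written."""
    # Imported inside the function so the helper can be defined even on a
    # machine where GDAL is not installed yet.
    from osgeo import gdal

    driver = gdal.GetDriverByName(driver_name)
    if driver is None:
        raise ValueError("Unknown GDAL driver: " + driver_name)
    copy = driver.CreateCopy(path, source, strict=0)
    if copy is None:
        raise IOError("Could not create " + path)
    copy.FlushCache()  # push any pending writes to disk
    copy = None        # dropping the last reference closes the file
    return path
```

Calling save_copy(nmtif, "copy.tif") would then correspond to the three lines used earlier, followed by an explicit flush and close.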

A color raster dataset has three bands: red, green, and blue. You can find out how many bands a dataset has using the following code:

nmtif.RasterCount 

The previous code will return 3. Unlike an array, the bands are indexed from 1 to n, so a three-band raster will have indexes 1, 2, and 3. You can grab a single band by passing the index to GetRasterBand(), which is shown in the following code:

band=nmtif.GetRasterBand(1)
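Because band numbers start at 1, a loop over all bands has to run from 1 to RasterCount inclusive. The helper below is a small sketch of that pattern; the name all_bands is hypothetical, and it only relies on the RasterCount attribute and the GetRasterBand method used above.

```python
def all_bands(dataset):
    """Return every band of a GDAL dataset, in band order."""
    # range() stops before its end value, so add 1 to include the last band.
    return [dataset.GetRasterBand(number)
            for number in range(1, dataset.RasterCount + 1)]
```

For the New Mexico TIF, all_bands(nmtif) would return a list of three band objects, with band 1 at list position 0.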

Now that you have a raster band, you can query it and look up values at specific positions. To find the value at a specified row and column, first read the band into an array using the following code:

values=band.ReadAsArray()

Now, values is an array, so you can look up values using index notation, as follows:

values[1100,1100]

The previous code will return a value of 216. For a single-band raster, this would be enough, but in a color image, you would most likely want to know the color at a location. This requires knowing the value of all three bands. You can get them by using the following code:

one = nmtif.GetRasterBand(1).ReadAsArray()
two = nmtif.GetRasterBand(2).ReadAsArray()
three = nmtif.GetRasterBand(3).ReadAsArray()
print(str(one[1100,1100]) + "," + str(two[1100,1100]) + "," + str(three[1100,1100]))

The previous code returns the values 216, 189, and 157. These are the RGB values of the pixel. The three values are composited, that is, overlaid on each other, which should give the color shown in the following image:

The color represented by the three bands at [1100,1100]
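Reading three full bands into memory just to inspect one pixel can be wasteful for a large raster. As an alternative sketch, the band's ReadAsArray method accepts an offset and window size, so you can read a single 1 x 1 window per band. The function names pixel_rgb and to_hex are hypothetical, and the code assumes bands 1 to 3 hold red, green, and blue.

```python
def pixel_rgb(dataset, row, col):
    """Return the (red, green, blue) values at one pixel of a 3-band raster."""
    values = []
    for number in (1, 2, 3):
        band = dataset.GetRasterBand(number)
        # ReadAsArray takes the x offset (column) first, then the y offset (row).
        window = band.ReadAsArray(col, row, 1, 1)
        values.append(int(window[0][0]))
    return tuple(values)


def to_hex(rgb):
    """Format an (r, g, b) tuple as an HTML-style hex color."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


print(to_hex((216, 189, 157)))  # prints #d8bd9d
```

The hex string is handy if you want to preview the color in a browser or a graphics program.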

With a band, you have access to several methods for obtaining information about the band. You can get the mean and standard deviation of the values, as shown in the following code:

one = nmtif.GetRasterBand(1)
two = nmtif.GetRasterBand(2)
three = nmtif.GetRasterBand(3)
# print each result so that all three show up, not only the last one
print(one.ComputeBandStats())
print(two.ComputeBandStats())
print(three.ComputeBandStats())

The output is shown as follows:

(225.05771967375847, 34.08382839593031)
(215.3145137636133, 37.83657996026153)
(195.34890652292185, 53.08308166590347)
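Each pair above is a (mean, standard deviation) tuple. To make it concrete what those two numbers summarize, here is a tiny sketch using Python's standard statistics module on made-up pixel values. The values are hypothetical, and the use of the population form of the standard deviation is an assumption; the text does not say which form GDAL uses.

```python
import statistics

# Hypothetical pixel values standing in for one band of a raster.
pixels = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.fmean(pixels)
std_dev = statistics.pstdev(pixels)  # population form: divides by n

print((mean, std_dev))  # prints (5.0, 2.0)
```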

You can also get the minimum and maximum values from a band, as shown in the following code:

print(str(one.GetMinimum())+","+str(one.GetMaximum()))

The result should be 0.0,255.0.
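In the Python bindings, GetMinimum and GetMaximum can return None when the file carries no stored minimum or maximum. If that happens, one fallback is to compute the values from the pixels with ComputeRasterMinMax. The helper below is a sketch of that idea; the name band_min_max is hypothetical.

```python
def band_min_max(band):
    """Return (minimum, maximum) for a band, computing them if not stored."""
    minimum = band.GetMinimum()
    maximum = band.GetMaximum()
    if minimum is None or maximum is None:
        # False asks for an exact scan of the pixels, not an approximation.
        minimum, maximum = band.ComputeRasterMinMax(False)
    return minimum, maximum
```

Calling band_min_max(one) should give the same two numbers as the print statement above when the stored values are present.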

You can also get the description of the band. The following code shows you how to get and set the description:

two.GetDescription()    # returns 'band_2'
two.SetDescription("The Green Band")
two.GetDescription() # returns "The Green Band"

The most obvious thing you may want to do with a raster dataset is to view it in a Jupyter Notebook. There are several ways to load images in a Jupyter notebook, one of which is to use HTML and an <img> tag. The following code shows you how to plot the image using matplotlib:

import numpy as np
from matplotlib.pyplot import imshow
%matplotlib inline

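data_array = nmtif.ReadAsArray()  # all bands, as one array
x = np.array(data_array[0])       # first band (GDAL band 1 is array index 0)
# x.shape ---> (6652, 6300)
image = x.reshape(x.shape[0], x.shape[1])  # same shape as x; an explicit 2D array
imshow(image, cmap='gist_earth')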

The previous code imports numpy and matplotlib.pyplot.imshow.

NumPy is a popular library for working with arrays. When dealing with rasters, which are arrays, you will benefit from having a strong understanding of the library. Packt has published several books on NumPy, such as NumPy Cookbook, NumPy Beginners Guide, and Learning NumPy Array, and these would be a good place to start learning more.

The code then sets plotting to inline for this notebook. Next, it reads in the TIF as an array and makes a numpy array from the first band.

Bands are indexed from 1 to n, but once read in as an array, they are indexed from 0.

The first band is isolated by indexing the array with 0. The code then reshapes it using its height and width. Using x.shape, you can get them both, and if you index it, you can get each one individually. Lastly, using imshow, the code plots the image using the gist_earth color map. The image will display in Jupyter as follows:

Tif in Jupyter using imshow
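The plot above shows a single band drawn with a color map. To experiment with the array layout without loading the full TIF, the following sketch uses a small synthetic array in place of the result of nmtif.ReadAsArray(). It assumes the usual layout for a multi-band dataset, which is (bands, rows, columns), and shows how the axes could be reordered into the (rows, columns, bands) layout that imshow accepts for an RGB image.

```python
import numpy as np

# Synthetic stand-in for nmtif.ReadAsArray(): 3 bands, 4 rows, 5 columns.
data = np.arange(3 * 4 * 5, dtype=np.uint8).reshape(3, 4, 5)

band1 = data[0]  # GDAL band 1 sits at array index 0
print(band1.shape)  # prints (4, 5)

# Reshaping to (rows, columns) leaves a 2D band unchanged.
same = band1.reshape(band1.shape[0], band1.shape[1])
print(np.array_equal(same, band1))  # prints True

# Move the band axis to the end to get a (rows, columns, bands) image.
rgb = np.transpose(data, (1, 2, 0))
print(rgb.shape)  # prints (4, 5, 3)
```

With the real data, passing the transposed array to imshow without a cmap argument would draw the three bands as one color image.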

Now that you know how to load a raster and perform basic operations, you will learn how to create a raster in the following section.
