LIDAR stands for Light Detection and Ranging. It is similar to radar-based imaging but uses laser pulses, which strike the ground hundreds of thousands of times per second to collect a huge number of very precise (x, y, z) locations, as well as time and intensity values. The intensity value is what really separates LIDAR from other data types. For example, the asphalt rooftop of a building may be at the same elevation as the top of a nearby tree, but the intensities will be different. Just as the radiance values in a multispectral satellite image allow us to build classification libraries, the intensity values of LIDAR data allow us to classify and colorize LIDAR data.
The high volume and precision of LIDAR actually make it difficult to use. A LIDAR dataset is referred to as a point cloud because the shape of the dataset is usually irregular, as the data is three-dimensional with outlying points. There are not many software packages that effectively visualize point clouds. Furthermore, an irregularly shaped collection of discrete points is simply hard to interact with, even in appropriate software.
For these reasons, one of the most common operations on LIDAR data is to project the data and resample it to a regular grid. We'll do exactly this using a small LIDAR dataset. This dataset is approximately 7 MB uncompressed and contains over 600,000 points. The data captures some easily identifiable features, such as buildings, trees, and cars in parking lots. You can download the zipped dataset at http://git.io/vOERW.
The file format is a very common binary format specific to LIDAR called LAS, which is short for laser. Unzip this file to your working directory. To read this format, we'll use a pure Python library called laspy. You can install the version for Python 3 using the following command:
pip install http://git.io/vOER9
With laspy installed, we are ready to create a grid from the LIDAR data. This script is fairly straightforward. We loop through the (x, y) point locations in the LIDAR data and project them onto our grid with a cell size of one meter. Due to the density of the LIDAR data, we'll end up with multiple points in a single cell, so we average these points to create a common elevation value. Another issue that we have to deal with is data loss: whenever you resample data, you lose information. In this case, we'll end up with NODATA holes in the middle of the raster. To deal with this issue, we fill these holes with average values from the surrounding cells, which is a form of interpolation.
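The bin-and-average step at the heart of this approach can be sketched on a toy point set before we run it on real data. The coordinates below are invented; two of the three points land in the same grid cell:

```python
import numpy as np

# Toy points: (x, y, z), with the first two landing in cell (0, 0)
xs = np.array([0.2, 0.7, 1.4])
ys = np.array([0.3, 0.6, 1.1])
zs = np.array([10.0, 14.0, 20.0])

cell = 1.0
rows = cols = 2
# Project each point to a grid cell index
ix = (xs / cell).astype(np.int32)
iy = (ys / cell).astype(np.int32)

count = np.zeros((rows, cols))
zsum = np.zeros((rows, cols))
for x, y, z in zip(ix, iy, zs):
    count[y, x] += 1
    zsum[y, x] += z

# Avoid division by zero in empty cells
zavg = zsum / np.where(count > 0, count, 1)
print(zavg[0, 0])  # 12.0 -- the mean of the two points in cell (0, 0)
```

The full script below does exactly this, plus the y-axis flip and hole filling needed for a proper DEM.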
We only need two modules, both available on PyPI, as shown in the following code:
from laspy.file import File
import numpy as np

# Source LAS file
source = "lidar.las"
# Output ASCII DEM file
target = "lidar.asc"
# Grid cell size (data units)
cell = 1.0
# No data value for output DEM
NODATA = 0
# Open LIDAR LAS file
las = File(source, mode="r")
# xyz min and max
min = las.header.min
max = las.header.max
# Get the x axis distance in meters
xdist = max[0] - min[0]
# Get the y axis distance in meters
ydist = max[1] - min[1]
# Number of columns for our grid
cols = int(xdist / cell)
# Number of rows for our grid
rows = int(ydist / cell)
cols += 1
rows += 1
# Track how many elevation
# values we aggregate
count = np.zeros((rows, cols)).astype(np.float32)
# Aggregate elevation values
zsum = np.zeros((rows, cols)).astype(np.float32)
# Y resolution is negative
ycell = -1 * cell
# Project x, y values to grid
projx = (las.x - min[0]) / cell
projy = (las.y - min[1]) / ycell
# Cast to integers for use as an index
ix = projx.astype(np.int32)
iy = projy.astype(np.int32)
# Loop through x, y, z arrays, add to grid shape,
# and aggregate values for averaging
for x, y, z in np.nditer([ix, iy, las.z]):
    count[y, x] += 1
    zsum[y, x] += z
# Change 0 values to 1 to avoid numpy warnings
# and NaN values in the array
nonzero = np.where(count > 0, count, 1)
# Average our z values
zavg = zsum / nonzero
# Interpolate 0 values in the array to avoid any
# holes in the grid
mean = np.ones((rows, cols)) * np.mean(zavg)
left = np.roll(zavg, -1, 1)
lavg = np.where(left > 0, left, mean)
right = np.roll(zavg, 1, 1)
ravg = np.where(right > 0, right, mean)
interpolate = (lavg + ravg) / 2
fill = np.where(zavg > 0, zavg, interpolate)
# Create our ASCII DEM header
header = "ncols {}\n".format(fill.shape[1])
header += "nrows {}\n".format(fill.shape[0])
header += "xllcorner {}\n".format(min[0])
header += "yllcorner {}\n".format(min[1])
header += "cellsize {}\n".format(cell)
header += "NODATA_value {}\n".format(NODATA)
# Open the output file, add the header, save the array
with open(target, "wb") as f:
    f.write(bytes(header, 'UTF-8'))
    # The fmt string ensures we output floats
    # that have at least one number but only
    # two decimal places
    np.savetxt(f, fill, fmt="%1.2f")
The result of our script is an ASCIIGRID, which looks like the following image when viewed in OpenEV. Higher elevations are lighter while lower elevations are darker. Even in this form, you can see buildings, trees, and cars:
If we assign a heat map color ramp, the colors give a sharper sense of the elevation differences:
So, what happens if we run this output DEM through our shaded relief script from earlier? There's a big difference between straight-sided buildings and sloping mountains. If you change the input and output names in the shaded relief script to process the LIDAR DEM, we get the following slope result:
The gently rolling slope of the mountainous terrain is reduced to outlines of major features in the image. In the aspect image, the changes are so sharp and over such short distances that the output image is very chaotic to view, as shown in the following screenshot:
Despite the difference in these images and the coarser but smoother mountain versions, we still get a very nice shaded relief, which visually resembles a black and white photograph:
The previous DEM images in this chapter were visualized using QGIS and OpenEV. We can also create output images in Python by introducing some new functions of the Python Imaging Library (PIL), which we didn't use in the previous chapters. In this example, we'll use the PIL.ImageOps
module, which has functions for histogram equalization and automatic contrast enhancement. We'll use PIL's fromarray()
method to import the data from NumPy. Let's see how close we can get to the output of the desktop GIS programs pictured in this chapter with the help of the following code:
import numpy as np
try:
    import Image
    import ImageOps
except ImportError:
    from PIL import Image, ImageOps

# Source gridded LIDAR DEM file
source = "lidar.asc"
# Output image file
target = "lidar.bmp"
# Load the ASCII DEM into a numpy array
arr = np.loadtxt(source, skiprows=6)
# Convert the numpy array to a PIL image
im = Image.fromarray(arr).convert("RGB")
# Enhance the image:
# equalize and increase contrast
im = ImageOps.equalize(im)
im = ImageOps.autocontrast(im)
# Save the image
im.save(target)
As you can see in the following screenshot, the enhanced shaded relief has a sharper relief than the previous version:
Now let's colorize our shaded relief. We'll use the built-in Python colorsys
module for color space conversion. Normally, we specify colors as RGB values. However, to create a color ramp for a heat map scheme, we'll generate our colors using HSV values, which stand for Hue, Saturation, and Value. The advantage of HSV is that you can treat the H value as a degree between 0 and 360 on a color wheel. Using a single value for hue allows you to use a linear ramping equation, which is much easier than trying to deal with combinations of three separate RGB values. The following image, taken from the online magazine Qt Quarterly, illustrates the HSV color model:
The colorsys
module lets you switch back and forth between the HSV and RGB color spaces. The module returns RGB values as fractions between 0 and 1, which must then be mapped to the 0-255 range for each color.
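This mapping step can be sketched with a small helper. The function name `hsv_to_rgb255` is just an illustration, not part of the standard library; only `colorsys.hsv_to_rgb()` is:

```python
import colorsys

def hsv_to_rgb255(h, s, v):
    """Convert HSV (each component 0.0-1.0) to 8-bit RGB."""
    # colorsys returns fractions between 0 and 1
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return int(r * 255), int(g * 255), int(b * 255)

print(hsv_to_rgb255(0.0, 1.0, 1.0))  # (255, 0, 0) -- red
print(hsv_to_rgb255(0.5, 1.0, 1.0))  # (0, 255, 255) -- cyan
```

The color ramp in the script below applies this same conversion 256 times while decrementing the hue.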
In the following code, we'll convert the ASCII DEM to a PIL image, build our color palette, apply the color palette to the grayscale image, and save the image:
import numpy as np
try:
    import Image
    import ImageOps
except ImportError:
    from PIL import Image, ImageOps
import colorsys

# Source LIDAR DEM file
source = "lidar.asc"
# Output image file
target = "lidar.bmp"
# Load the ASCII DEM into a numpy array
arr = np.loadtxt(source, skiprows=6)
# Convert the numpy array to a PIL image.
# Use black and white mode so we can stack
# three bands for the color image.
im = Image.fromarray(arr).convert('L')
# Enhance the image
im = ImageOps.equalize(im)
im = ImageOps.autocontrast(im)
# Begin building our color ramp
palette = []
# Hue, Saturation, Value
# color space starting with blue
h = .67
s = 1
v = 1
# We'll step through colors from:
# blue-green-yellow-orange-red.
# Blue=low elevation, Red=high elevation
step = h / 256.0
# Build the palette
for i in range(256):
    rp, gp, bp = colorsys.hsv_to_rgb(h, s, v)
    r = int(rp * 255)
    g = int(gp * 255)
    b = int(bp * 255)
    palette.extend([r, g, b])
    h -= step
# Apply the palette to the image
im.putpalette(palette)
# Save the image
im.save(target)
The code produces the following image with higher elevations in warmer colors and lower elevations in cooler colors:
In this image, we actually get more variation than the default QGIS version. We could potentially improve this image with a smoothing algorithm that would blend the colors where they meet and soften the image visually. As you can see, we have the full range of our color ramp expressed from cool to warm colors as the elevation change increases.
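One simple smoothing approach, offered here as a sketch rather than the method used for the figures, is a 3x3 mean filter built from `np.roll()`, in the same style as the hole-filling interpolation earlier in the chapter. Note that `np.roll()` wraps at the edges, which is acceptable away from the image border:

```python
import numpy as np

def smooth(grid):
    """Average each cell with its 8 neighbors (edges wrap)."""
    total = np.zeros_like(grid, dtype=np.float64)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            # Shift the grid so each neighbor lines up with the cell
            total += np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
    return total / 9.0

# A single spike gets spread over its neighborhood
arr = np.zeros((3, 3))
arr[1, 1] = 9.0
print(smooth(arr)[1, 1])  # 1.0
```

Applied to the color-ramped elevation array before writing the image, a filter like this would blend the bands where adjacent colors meet.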
The following example is our most sophisticated yet. A triangulated irregular network (TIN) is a vector representation of a point dataset: a surface of points connected as triangles. An algorithm determines which points are truly necessary to accurately represent the terrain, as opposed to a raster, which stores a fixed number of cells over a given area and may repeat elevation values in adjacent cells that could be more efficiently stored as a polygon. A TIN can also be resampled on the fly more efficiently than a raster, so using a TIN requires less computer memory and processing power in a GIS. The most common type of TIN is based on Delaunay triangulation, which includes all of the points without redundant triangles.
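To make the idea concrete before we process the LIDAR data, here is a minimal triangulation of four points. This sketch assumes you have SciPy available; it is a stand-in for the pure Python voronoi.py module used in the actual example below:

```python
import numpy as np
from scipy.spatial import Delaunay

# Four points forming a convex quadrilateral --
# the smallest interesting case
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 2.0]])
tri = Delaunay(pts)

# A convex quadrilateral splits into exactly two triangles;
# each simplex is a triple of indices into pts
print(len(tri.simplices))  # 2
```

Each triangle is stored as indices into the original point array, which is exactly how the voronoi.py module returns its results in the script below.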
The Delaunay triangulation is very complex. We'll use a pure Python library called voronoi.py, written by Bill Simons and based on Steve Fortune's Delaunay triangulation algorithm, to calculate the triangles in our LIDAR data. You can download the script to your working directory or site-packages directory from http://git.io/vOEuJ.
This script reads the LAS file, generates the triangles, loops through them, and writes out a shapefile. For this example, we'll use a clipped version of our LIDAR data to reduce the area to process. If we run our entire dataset of 600,000-plus points, the script will run for hours and generate over half a million triangles. You can download the clipped LIDAR dataset as a ZIP file from the following URL:
Because of the time-intensive nature of the following example, which can take several minutes to complete, the script prints several status messages as it runs. We'll store the triangles as PolygonZ types, which allow the vertices to have a z elevation value. Unzip the LAS file and run the following code to generate a shapefile called mesh.shp:
import pickle
import os
import time
import math
import numpy as np
import shapefile
# laspy for Python 3: pip install http://git.io/vOER9
from laspy.file import File
# voronoi.py for Python 3: pip install http://git.io/vOEuJ
import voronoi

# Source LAS file
source = "clippedLAS.las"
# Output shapefile
target = "mesh"
# Triangles pickle archive
archive = "triangles.p"
# Pyshp archive
pyshp = "mesh_pyshp.p"

class Point:
    """Point class required by the voronoi module"""
    def __init__(self, x, y):
        self.px = x
        self.py = y

    def x(self):
        return self.px

    def y(self):
        return self.py

# The triangle array holds tuples of
# 3 point indices used to retrieve the points.
# Load it from a pickle file or use the
# voronoi module to create the triangles.
triangles = None
if os.path.exists(archive):
    print("Loading triangle archive...")
    f = open(archive, "rb")
    triangles = pickle.load(f)
    f.close()
    # Open LIDAR LAS file
    las = File(source, mode="r")
else:
    # Open LIDAR LAS file
    las = File(source, mode="r")
    points = []
    print("Assembling points...")
    # Pull points from LAS file
    for x, y in np.nditer((las.x, las.y)):
        points.append(Point(x, y))
    print("Composing triangles...")
    # Delaunay Triangulation
    triangles = voronoi.computeDelaunayTriangulation(points)
    # Save the triangles to save time if we write more than
    # one shapefile.
    f = open(archive, "wb")
    pickle.dump(triangles, f, protocol=2)
    f.close()

print("Creating shapefile...")
w = None
if os.path.exists(pyshp):
    f = open(pyshp, "rb")
    w = pickle.load(f)
    f.close()
else:
    # PolygonZ shapefile (x, y, z, m)
    w = shapefile.Writer(shapefile.POLYGONZ)
    w.field("X1", "C", "40")
    w.field("X2", "C", "40")
    w.field("X3", "C", "40")
    w.field("Y1", "C", "40")
    w.field("Y2", "C", "40")
    w.field("Y3", "C", "40")
    w.field("Z1", "C", "40")
    w.field("Z2", "C", "40")
    w.field("Z3", "C", "40")
    tris = len(triangles)
    # Loop through shapes and
    # track progress every 10 percent
    last_percent = 0
    for i in range(tris):
        t = triangles[i]
        percent = int((i / (tris * 1.0)) * 100.0)
        if percent % 10.0 == 0 and percent > last_percent:
            last_percent = percent
            print("{} % done - Shape {}/{} at {}".format(
                percent, i, tris, time.asctime()))
        part = []
        x1 = las.x[t[0]]
        y1 = las.y[t[0]]
        z1 = las.z[t[0]]
        x2 = las.x[t[1]]
        y2 = las.y[t[1]]
        z2 = las.z[t[1]]
        x3 = las.x[t[2]]
        y3 = las.y[t[2]]
        z3 = las.z[t[2]]
        # Check segments for large triangles
        # along the convex hull, which is a common
        # artifact in Delaunay triangulation
        max_dist = 3
        if math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2) > max_dist:
            continue
        if math.sqrt((x3 - x2) ** 2 + (y3 - y2) ** 2) > max_dist:
            continue
        if math.sqrt((x3 - x1) ** 2 + (y3 - y1) ** 2) > max_dist:
            continue
        part.append([x1, y1, z1, 0])
        part.append([x2, y2, z2, 0])
        part.append([x3, y3, z3, 0])
        w.poly(parts=[part])
        w.record(x1, x2, x3, y1, y2, y3, z1, z2, z3)

print("Saving shapefile...")
# Pickle the Writer in case something
# goes wrong. Be sure to delete this
# file to recreate the shapefile.
f = open(pyshp, "wb")
pickle.dump(w, f, protocol=2)
f.close()
w.save(target)
print("Done.")
The following image shows a zoomed-in version of the TIN over the colorized LIDAR data: