Working with LIDAR

LIDAR stands for Light Detection and Ranging. It is similar to radar-based imagery, but it uses laser pulses that strike the ground hundreds of thousands of times per second, collecting a huge number of very precise (x, y, z) locations along with time and intensity values. The intensity value is what really separates LIDAR from other data types. For example, the asphalt rooftop of a building may be at the same elevation as the top of a nearby tree, but the intensities will be different. Just as the radiance values in a multispectral satellite image allow us to build classification libraries, the intensity values of LIDAR data allow us to classify and colorize the data.

The high volume and precision of LIDAR data actually make it difficult to use. A LIDAR dataset is referred to as a point cloud because the shape of the dataset is usually irregular: the data is three-dimensional, with outlying points. Not many software packages visualize point clouds effectively. Furthermore, an irregularly shaped collection of discrete points is simply hard to interact with, even in appropriate software.

For these reasons, one of the most common operations on LIDAR data is to project the data and resample it to a regular grid. We'll do exactly this using a small LIDAR dataset. This dataset is approximately 7 MB uncompressed and contains over 600,000 points. The data captures some easily identifiable features, such as buildings, trees, and cars in parking lots. You can download the zipped dataset at http://git.io/vOERW.

The file format is a very common binary format specific to LIDAR called LAS, which is short for laser. Unzip this file to your working directory. In order to read this format, we'll use a pure Python library called laspy. You can install the version for Python 3 using the following command:

pip install http://git.io/vOER9
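Before building a grid, you can do a quick sanity check that laspy reads the file and peek at the intensity values discussed earlier. This is just an illustrative sketch; the median threshold is arbitrary:

from laspy.file import File
import numpy as np

# Open the LIDAR LAS file downloaded earlier
las = File("lidar.las", mode="r")

# Intensity is an integer array parallel to x, y, and z
print("Intensity range:", las.intensity.min(), "-", las.intensity.max())

# Flag returns stronger than the median intensity
# (an arbitrary threshold, purely for illustration)
strong = las.intensity > np.percentile(las.intensity, 50)
print("Strong returns:", strong.sum(), "of", len(las.intensity))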

Creating a grid from LIDAR

With laspy installed, we are ready to create a grid from LIDAR. This script is fairly straightforward. We loop through the (x,y) point locations in the LIDAR data and project them to our grid with a cell size of one meter. Due to the precision of the LIDAR data, we'll end up with multiple points in a single cell. We average these points to create a common elevation value. Another issue that we have to deal with is data loss. Whenever you resample the data, you lose information. In this case, we'll end up with NODATA holes in the middle of the raster. To deal with this issue, we fill these holes with average values from the surrounding cells, which is a form of interpolation.

We only need two modules, both available on PyPI, as shown in the following code:

from laspy.file import File
import numpy as np

# Source LAS file
source = "lidar.las"

# Output ASCII DEM file
target = "lidar.asc"

# Grid cell size (data units)
cell = 1.0

# No data value for output DEM
NODATA = 0

# Open LIDAR LAS file
las = File(source, mode="r")

# xyz min and max
min = las.header.min
max = las.header.max

# Get the x axis distance in meters
xdist = max[0] - min[0]

# Get the y axis distance in meters
ydist = max[1] - min[1]

# Number of columns for our grid
# (cast to int so numpy can size the array)
cols = int(xdist / cell)

# Number of rows for our grid
rows = int(ydist / cell)

cols += 1
rows += 1

# Track how many elevation
# values we aggregate
count = np.zeros((rows, cols)).astype(np.float32)
# Aggregate elevation values
zsum = np.zeros((rows, cols)).astype(np.float32)

# Y resolution is negative
ycell = -1 * cell

# Project x, y values to grid
projx = (las.x - min[0]) / cell
projy = (las.y - min[1]) / ycell
# Cast to integers for use as array indexes
ix = projx.astype(np.int32)
iy = projy.astype(np.int32)

# Loop through x, y, z arrays, add to grid shape,
# and aggregate values for averaging
for x, y, z in np.nditer([ix, iy, las.z]):
    count[y, x] += 1
    zsum[y, x] += z

# Change 0 values to 1 to avoid numpy
# divide-by-zero warnings and NaN values in the array
nonzero = np.where(count > 0, count, 1)
# Average our z values
zavg = zsum / nonzero

# Interpolate 0 values in array to avoid any
# holes in the grid
mean = np.ones((rows, cols)) * np.mean(zavg)
left = np.roll(zavg, -1, 1)
lavg = np.where(left > 0, left, mean)
right = np.roll(zavg, 1, 1)
ravg = np.where(right > 0, right, mean)
interpolate = (lavg + ravg) / 2
fill = np.where(zavg > 0, zavg, interpolate)

# Create our ASCII DEM header
header = "ncols        {}
".format(fill.shape[1])
header += "nrows        {}
".format(fill.shape[0])
header += "xllcorner    {}
".format(min[0])
header += "yllcorner    {}
".format(min[1])
header += "cellsize     {}
".format(cell)
header += "NODATA_value      {}
".format(NODATA)

# Open the output file, add the header, save the array
with open(target, "wb") as f:
    f.write(bytes(header, 'UTF-8'))
    # The fmt string ensures we output floats
    # with at least one digit and only
    # two decimal places
    np.savetxt(f, fill, fmt="%1.2f")

The result of our script is an ASCIIGRID, which looks like the following image when viewed in OpenEV. Higher elevations are lighter while lower elevations are darker. Even in this form, you can see buildings, trees, and cars:

[Figure: The gridded LIDAR DEM viewed in OpenEV]

If we assign a heat map color ramp, the colors give a sharper sense of the elevation differences:

[Figure: The LIDAR DEM with a heat map color ramp]

So, what happens if we run this output DEM through our shaded relief script from earlier? There's a big difference between straight-sided buildings and sloping mountains. If you change the input and output file names in the shaded relief script to process the LIDAR DEM, you get the following slope result:

[Figure: The slope result from the LIDAR DEM]

The gently rolling slope of the mountainous terrain is reduced to outlines of major features in the image. In the aspect image, the changes are so sharp and over such short distances that the output image is very chaotic to view, as shown in the following screenshot:

[Figure: The aspect result from the LIDAR DEM]

Despite the differences between these images and the coarser but smoother mountain versions, we still get a very nice shaded relief that visually resembles a black-and-white photograph:

[Figure: The shaded relief of the LIDAR DEM]
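If you don't have the shaded relief script from earlier in the book handy, the following minimal numpy sketch produces a comparable hillshade from the gridded DEM. The 315/45 degree light source and the output handling are assumptions for illustration, not the earlier script's exact parameters:

import numpy as np

# Load the gridded LIDAR DEM, skipping the
# six-line ASCIIGRID header
arr = np.loadtxt("lidar.asc", skiprows=6)

# Light source direction: azimuth and altitude
# (315/45 degrees is a common default)
az = np.radians(315.0)
alt = np.radians(45.0)

# Elevation gradients along the rows and columns
dy, dx = np.gradient(arr)

# Slope and aspect of each cell
slope = np.pi / 2.0 - np.arctan(np.hypot(dx, dy))
aspect = np.arctan2(dy, dx)

# Illumination of each cell, ranging from -1 to 1
shaded = (np.sin(alt) * np.sin(slope) +
          np.cos(alt) * np.cos(slope) * np.cos(az - aspect))

# Rescale to 0-255 grayscale values; add an
# ASCIIGRID header, as in the gridding script,
# to view the result in a GIS
relief = (255 * (shaded + 1) / 2).astype(np.uint8)
np.savetxt("relief_sketch.asc", relief, fmt="%d")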

Using PIL to visualize LIDAR

The previous DEM images in this chapter were visualized using QGIS and OpenEV. We can also create output images in Python by introducing some new functions of the Python Imaging Library (PIL), which we didn't use in the previous chapters. In this example, we'll use the PIL.ImageOps module, which has functions for histogram equalization and automatic contrast enhancement. We'll use PIL's fromarray() method to import the data from NumPy. Let's see how close we can get to the output of the desktop GIS programs pictured in this chapter with the help of the following code:

import numpy as np

try:
    import Image
    import ImageOps
except ImportError:
    from PIL import Image, ImageOps

# Source gridded LIDAR DEM file
source = "lidar.asc"

# Output image file
target = "lidar.bmp"

# Load the ASCII DEM into a numpy array
arr = np.loadtxt(source, skiprows=6)

# Convert the numpy array to a PIL image
im = Image.fromarray(arr).convert("RGB")

# Enhance the image:
# equalize and increase contrast
im = ImageOps.equalize(im)
im = ImageOps.autocontrast(im)

# Save the image
im.save(target)

As you can see in the following screenshot, the enhanced shaded relief has a sharper relief than the previous version:

[Figure: The enhanced shaded relief created with PIL]

Now let's colorize our shaded relief. We'll use the built-in Python colorsys module for color space conversion. Normally, we specify colors as RGB values. However, to create a color ramp for a heat map scheme, we'll generate our colors as HSV values, which stands for Hue, Saturation, and Value. The advantage of HSV is that you can tweak the hue as a degree between 0 and 360 on a color wheel. Using a single value for hue allows you to use a linear ramping equation, which is much easier than dealing with combinations of three separate RGB values. The following image, taken from the online magazine Qt Quarterly, illustrates the HSV color model:

[Figure: The HSV color model (from Qt Quarterly)]

The colorsys module lets you switch back and forth between the HSV and RGB color spaces. It takes and returns values as floats between 0 and 1 rather than degrees or color levels, so the RGB results must be scaled to the 0-255 range for each color.
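For example, the starting hue of 0.67 used in the following script maps to blue:

import colorsys

# Hue, saturation, and value all range from 0 to 1
r, g, b = colorsys.hsv_to_rgb(0.67, 1, 1)

# Scale the float results to 0-255 color levels
print(int(r * 255), int(g * 255), int(b * 255))  # prints: 5 0 255 (blue)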

In the following code, we'll convert the ASCII DEM to a PIL image, build our color palette, apply the color palette to the grayscale image, and save the image:

import numpy as np
try:
    import Image
    import ImageOps
except:
    from PIL import Image, ImageOps
import colorsys

# Source LIDAR DEM file
source = "lidar.asc"

# Output image file
target = "lidar.bmp"

# Load the ASCII DEM into a numpy array
arr = np.loadtxt(source, skiprows=6)

# Convert the numpy array to a PIL image.
# Use grayscale ("L") mode so we can apply
# a color palette to the single band later.
im = Image.fromarray(arr).convert('L')

# Enhance the image
im = ImageOps.equalize(im)
im = ImageOps.autocontrast(im)

# Begin building our color ramp
palette = []

# Hue, Saturation, Value color space,
# starting with blue (hue 0.67 on
# colorsys's 0-1 scale)
h = .67
s = 1
v = 1

# We'll step through colors from:
# blue-green-yellow-orange-red.
# Blue=low elevation, Red=high-elevation
step = h / 256.0

# Build the palette
for i in range(256):
    rp, gp, bp = colorsys.hsv_to_rgb(h, s, v)
    r = int(rp * 255)
    g = int(gp * 255)
    b = int(bp * 255)
    palette.extend([r, g, b])
    h -= step

# Apply the palette to the image
im.putpalette(palette)

# Save the image
im.save(target)

The code produces the following image with higher elevations in warmer colors and lower elevations in cooler colors:

[Figure: The colorized LIDAR DEM]

In this image, we actually get more variation than the default QGIS version. We could potentially improve this image with a smoothing algorithm that blends the colors where they meet, softening the image visually. As you can see, the full range of our color ramp is expressed, from cool to warm colors, as the elevation increases.
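As one possible experiment along these lines, PIL's ImageFilter module, which we haven't used in this chapter, provides built-in smoothing kernels. The following quick sketch blurs the color boundaries; the filter choice and file names are illustrative:

from PIL import Image, ImageFilter

# Convert the palette image to RGB so the filter
# blends actual colors rather than palette indexes
im = Image.open("lidar.bmp").convert("RGB")

# Apply a built-in smoothing convolution kernel
im = im.filter(ImageFilter.SMOOTH_MORE)

im.save("lidar_smooth.bmp")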

Creating a triangulated irregular network

The following example is our most sophisticated yet. A triangulated irregular network (TIN) is a vector representation of a point dataset: a surface of points connected as triangles. An algorithm determines which points are absolutely necessary to represent the terrain accurately, in contrast to a raster, which stores a fixed number of cells over a given area and may repeat elevation values in adjacent cells that could be stored more efficiently as a polygon. A TIN can also be resampled on the fly more efficiently than a raster, so it requires less computer memory and processing power when used in a GIS. The most common type of TIN is based on Delaunay triangulation, which includes all the points without redundant triangles.

Delaunay triangulation is computationally complex. To calculate the triangles in our LIDAR data, we'll use voronoi.py, a pure Python module written by Bill Simons and based on Steve Fortune's Delaunay triangulation algorithm. You can download the script to your working directory or site-packages directory from http://git.io/vOEuJ.
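Before diving into the full script, it helps to see what the module returns. The following toy example (with made-up coordinates) shows that each triangle is a tuple of three indexes into the input point list; the exact triangles and their order may vary:

import voronoi

class Point:
    """Point class required by the voronoi module"""
    def __init__(self, x, y):
        self.px = x
        self.py = y

    def x(self):
        return self.px

    def y(self):
        return self.py

# A triangle of points with one point inside it
pts = [Point(0, 0), Point(4, 0), Point(2, 3), Point(2, 1)]

# Prints a list of 3-tuples of point indexes,
# for example [(0, 3, 1), (1, 3, 2), (0, 2, 3)]
print(voronoi.computeDelaunayTriangulation(pts))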

This script reads the LAS file, generates the triangles, loops through them, and writes out a shapefile. For this example, we'll use a clipped version of our LIDAR data to reduce the area to process. If we ran our entire dataset of more than 600,000 points, the script would run for hours and generate over half a million triangles. You can download the clipped LIDAR dataset as a ZIP file from the following URL:

http://git.io/vOE62

Because the following example is time intensive and can take several minutes to complete, the script prints several status messages while it runs. We'll store the triangles as PolygonZ types, which allow the vertices to have a z elevation value. Unzip the LAS file and run the following code to generate a shapefile called mesh.shp:

import pickle
import os
import time
import math
import numpy as np
import shapefile

# laspy for Python 3: pip install http://git.io/vOER9
from laspy.file import File

# voronoi.py for Python 3: pip install http://git.io/vOEuJ
import voronoi

# Source LAS file
source = "clippedLAS.las"

# Output shapefile
target = "mesh"

# Triangles pickle archive
archive = "triangles.p"

# Pyshp archive
pyshp = "mesh_pyshp.p"


class Point:
    """Point class required by the voronoi module"""
    def __init__(self, x, y):
        self.px = x
        self.py = y

    def x(self):
        return self.px

    def y(self):
        return self.py

# The triangles array holds tuples of
# 3 point indices used to retrieve the points.
# Load it from a pickle file or use the
# voronoi module to create the triangles.
triangles = None

if os.path.exists(archive):
    print("Loading triangle archive...")
    f = open(archive, "rb")
    triangles = pickle.load(f)
    f.close()
    # Open LIDAR LAS file
    las = File(source, mode="r")
else:
    # Open LIDAR LAS file
    las = File(source, mode="r")
    points = []
    print("Assembling points...")
    # Pull points from LAS file
    for x, y in np.nditer((las.x, las.y)):
        points.append(Point(x, y))
    print("Composing triangles...")
    # Delaunay Triangulation
    triangles = voronoi.computeDelaunayTriangulation(points)
    # Save the triangles to save time if we write more than
    # one shapefile.
    f = open(archive, "wb")
    pickle.dump(triangles, f, protocol=2)
    f.close()

print("Creating shapefile...")
w = None
if os.path.exists(pyshp):
    f = open(pyshp, "rb")
    w = pickle.load(f)
    f.close()
else:
    # PolygonZ shapefile (x, y, z, m)
    w = shapefile.Writer(shapefile.POLYGONZ)
    w.field("X1", "C", "40")
    w.field("X2", "C", "40")
    w.field("X3", "C", "40")
    w.field("Y1", "C", "40")
    w.field("Y2", "C", "40")
    w.field("Y3", "C", "40")
    w.field("Z1", "C", "40")
    w.field("Z2", "C", "40")
    w.field("Z3", "C", "40")
    tris = len(triangles)
    # Loop through shapes and
    # track progress every 10 percent
    last_percent = 0
    for i in range(tris):
        t = triangles[i]
        percent = int((i/(tris*1.0))*100.0)
        if percent % 10.0 == 0 and percent > last_percent:
            last_percent = percent
            print("{} % done - Shape {}/{} at {}".format(percent, i, tris, time.asctime()))
        part = []
        x1 = las.x[t[0]]
        y1 = las.y[t[0]]
        z1 = las.z[t[0]]
        x2 = las.x[t[1]]
        y2 = las.y[t[1]]
        z2 = las.z[t[1]]
        x3 = las.x[t[2]]
        y3 = las.y[t[2]]
        z3 = las.z[t[2]]
        # Check segments for large triangles
        # along the convex hull, which is a common
        # artifact in Delaunay triangulation
        max_len = 3
        if math.sqrt((x2-x1)**2+(y2-y1)**2) > max_len:
            continue
        if math.sqrt((x3-x2)**2+(y3-y2)**2) > max_len:
            continue
        if math.sqrt((x3-x1)**2+(y3-y1)**2) > max_len:
            continue
        part.append([x1, y1, z1, 0])
        part.append([x2, y2, z2, 0])
        part.append([x3, y3, z3, 0])
        w.poly(parts=[part])
        w.record(x1, x2, x3, y1, y2, y3, z1, z2, z3)
print("Saving shapefile...")
# Pickle the Writer in case something
# goes wrong. Be sure to delete this
# file to recreate the shapefile.
f = open(pyshp, "wb")
pickle.dump(w, f, protocol=2)
f.close()
w.save(target)
print("Done.")

The following image shows a zoomed-in version of the TIN over the colorized LIDAR data:

[Figure: A zoomed-in view of the TIN over the colorized LIDAR data]