In the next example, we'll enter the world of hydrology. Flooding is one of the most common and devastating natural disasters, affecting nearly every population on the globe. Geospatial models are a powerful tool for estimating the impact of a flood and mitigating that impact before it happens. We often hear in the news that a river is reaching flood stage, but that information is meaningless if we can't understand the actual impact. Hydrological flood models are expensive to develop and can be very complex. These models are essential for engineers building flood control systems. However, first responders and potential flood victims are interested only in the impact of an impending flood.
We can understand the flooding impact in an area using a very simple and easy-to-comprehend tool called a flood inundation model. This model starts with a single point and floods an area with the maximum volume of water that a flood basin can hold at a particular flood stage. This analysis usually represents the worst-case scenario. Hundreds of other factors go into calculating how much water will enter a basin from a river topping flood stage, but we can still learn a lot from this simple first-order model.
The following image is a Digital Elevation Model (DEM) with a source point displayed as a yellow star near Houston, Texas. In real-world analysis, this point would likely be a stream gauge where you would have data about the river's water level.
As mentioned in the Elevation data section in Chapter 1, Learning Geospatial Analysis with Python, the Shuttle Radar Topography Mission (SRTM) dataset provides a nearly global DEM that you can use for these types of models. More information on SRTM data can be found at the following link:
You can download the ASCII Grid data in EPSG:4326 and a shapefile containing the point as a ZIP file from the following URL:
The given shapefile image here is just for reference and has no role in this model:
The algorithm we are introducing in this example is, unsurprisingly, called a flood fill algorithm. This algorithm is well known in the field of computer science and is used in the classic computer game Minesweeper to clear empty squares on the board when a user clicks on a square. It is also the method behind the well-known paint bucket tool in graphics programs such as Adobe Photoshop, where the paint bucket fills an area of adjacent pixels of the same color with a different color. There are many ways to implement this algorithm. One of the oldest and most common ways is to recursively crawl through each pixel of the image. The problem with recursion is that you end up processing pixels more than once, which creates an unnecessary amount of work. The resource usage for a recursive flood fill can easily crash a program on even a moderately sized image.
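As a rough illustration of the resource problem described above, the following sketch runs a naive recursive fill over a single long row of floodable cells. The function name recursive_fill and the test strip are hypothetical and are not part of this chapter's script. With Python's default settings, one nested call per cell is normally enough to exhaust the interpreter's recursion limit:

```python
import sys


def recursive_fill(mask, x, y, seen):
    """Naive 4-way recursive fill over a list-of-lists mask (illustrative only)."""
    if y < 0 or y >= len(mask) or x < 0 or x >= len(mask[0]):
        return
    if (x, y) in seen or mask[y][x] != 1:
        return
    seen.add((x, y))
    recursive_fill(mask, x + 1, y, seen)
    recursive_fill(mask, x - 1, y, seen)
    recursive_fill(mask, x, y + 1, seen)
    recursive_fill(mask, x, y - 1, seen)


# A single row of 10,000 floodable cells forces one nested call per cell.
strip = [[1] * 10000]
seen = set()
try:
    recursive_fill(strip, 0, 0, seen)
    overflowed = False
except RecursionError:
    overflowed = True

print("Recursion limit:", sys.getrecursionlimit())
print("Hit the recursion limit:", overflowed)
```

A real terrain grid has millions of cells, so a non-recursive approach avoids depending on the recursion limit at all.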
This script uses a four-way queue-based flood fill that may visit a cell more than once but ensures we only process a cell once. The queue contains unique, unprocessed cells by using Python's built-in set type, which only holds unique values. We use two sets: fill, which contains the cells we need to fill, and filled, which contains processed cells.
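The way a set behaves as a queue of unique cells can be sketched in a few lines. The cell coordinates here are made up for illustration; only the names fill and filled come from the script:

```python
# Pending cells are stored as (column, row) tuples in a set.
fill = set()
fill.add((3, 4))
fill.add((3, 4))  # duplicate entry, silently ignored by the set
fill.add((2, 4))

filled = set()
while fill:
    cell = fill.pop()  # removes and returns an arbitrary pending cell
    filled.add(cell)

print(sorted(filled))
```

Note that set.pop() returns cells in no particular order, which is acceptable here because the final flooded area does not depend on the order in which cells are visited.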
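This example executes the following steps:
1. Extract the header information from the ASCII Grid.
2. Open the image as a numpy array.
3. Define the starting point as a row and column in the array.
4. Declare a flood elevation value.
5. Filter the terrain to the desired elevation value and below.
6. Process the filtered array.
7. Create a (1, 0) array (that is, binary array) with flooded pixels as 1.
8. Save the flood inundation array as an ASCII Grid.

Note that because this example can take a minute or two to run on a slower machine, we'll use print statements throughout the script as a simple way to track progress. Once again, we'll break this script up with explanations for clarity.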
We use ASCII Grids in this example, which means the engine for this model is completely in NumPy. We start off by defining the floodFill() function, which is the heart and soul of this model. The following Wikipedia article on flood fill algorithms provides an excellent overview of the different approaches:
http://en.wikipedia.org/wiki/Flood_fill
Flood fill algorithms start at a given cell and then check the neighboring cells for similarity. The similarity factor might be color or, in our case, elevation. In this script, the terrain is first reduced to a mask in which cells below the flood elevation have a value of 1, so a neighboring cell qualifies when its mask value is 1. Each qualifying cell is filled and its own neighbors are queued in turn, until no unchecked connected cells remain. NumPy isn't designed to crawl over an array in this way, but it is still efficient at handling multi-dimensional arrays overall. We step through each cell and check its neighbors to the north, south, east, and west. Any cell that can be flooded is added to the filled set, and its neighbors are added to the fill set to be checked by the algorithm.
As mentioned earlier, if you try to add the same value to a set twice, it just ignores the duplicate entry and maintains a collection of unique values. By using sets to track the array cells, we efficiently process a cell only once because the fill set contains unique cells, as shown in the following script:
import numpy as np
from linecache import getline


def floodFill(c, r, mask):
    """
    Crawls a mask array containing only 1 and 0 values from the
    starting point (c=column, r=row - a.k.a. x, y) and returns an
    array with all 1 values connected to the starting cell.
    This algorithm performs a 4-way check non-recursively.
    """
    # cells already filled
    filled = set()
    # cells to fill
    fill = set()
    fill.add((c, r))
    width = mask.shape[1]-1
    height = mask.shape[0]-1
    # Our output inundation array
    flood = np.zeros_like(mask, dtype=np.int8)
    # Loop through and modify the cells which
    # need to be checked.
    while fill:
        # Grab a cell
        x, y = fill.pop()
        # Stop at the grid edge (the last row and
        # column are treated as the boundary)
        if y == height or x == width or x < 0 or y < 0:
            # Don't fill
            continue
        if mask[y][x] == 1:
            # Do fill
            flood[y][x] = 1
            filled.add((x, y))
            # Check neighbors for 1 values
            west = (x-1, y)
            east = (x+1, y)
            north = (x, y-1)
            south = (x, y+1)
            if west not in filled:
                fill.add(west)
            if east not in filled:
                fill.add(east)
            if north not in filled:
                fill.add(north)
            if south not in filled:
                fill.add(south)
    return flood
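To see the behavior on a grid small enough to check by eye, the following sketch re-implements the same 4-way, non-recursive idea in compact form. It is not the floodFill() function above: the name flood_fill_small is hypothetical, it uses a collections.deque instead of two sets, and its bounds check includes the last row and column. The mask values are made up:

```python
from collections import deque

import numpy as np


def flood_fill_small(c, r, mask):
    """4-way, non-recursive fill of cells equal to 1 connected to (c, r)."""
    rows, cols = mask.shape
    flood = np.zeros_like(mask, dtype=np.int8)
    queue = deque([(c, r)])
    seen = {(c, r)}  # every cell that has ever been queued
    while queue:
        x, y = queue.popleft()
        if mask[y, x] != 1:
            continue  # dry cell: do not fill and do not spread from it
        flood[y, x] = 1
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if 0 <= nx < cols and 0 <= ny < rows and (nx, ny) not in seen:
                seen.add((nx, ny))
                queue.append((nx, ny))
    return flood


mask = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
])
result = flood_fill_small(0, 0, mask)
print(result)
```

The 1 values in the right-hand column and the bottom-right corner are left at 0 in the result because no chain of north, south, east, or west moves through 1 values connects them to the starting cell.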
In the remainder of the script, we load our terrain data from an ASCII Grid, define our output grid file name, and then execute the algorithm on the terrain data. The seed of the flood-fill algorithm is an arbitrary point, given as sx and sy, within the lower elevation areas. In a real-world application, this point would likely be a known location such as a stream gauge or a breach in a dam. In the final step, we save the output grid, as shown here:
source = "terrain.asc"
target = "flood.asc"

print("Opening image...")
img = np.loadtxt(source, skiprows=6)
print("Image opened")

# Mask elevations lower than 70 meters.
wet = np.where(img < 70, 1, 0)
print("Image masked")

# Parse the header using a loop and
# the built-in linecache module
hdr = [getline(source, i) for i in range(1, 7)]
values = [float(h.split(" ")[-1].strip()) for h in hdr]
cols, rows, lx, ly, cell, nd = values
xres = cell
yres = cell * -1

# Starting point for the
# flood inundation in pixel coordinates
sx = 2582
sy = 2057

print("Beginning flood fill")
fld = floodFill(sx, sy, wet)
print("Finished flood fill")

header = ""
for i in range(6):
    header += hdr[i]

print("Saving grid")
# Open the output file, add the hdr, save the array
with open(target, "wb") as f:
    f.write(bytes(header, 'UTF-8'))
    np.savetxt(f, fld, fmt="%1i")
print("Done!")
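The loading and masking steps can be tried without downloading any data. The sketch below is a minimal illustration that assumes a tiny, made-up ASCII Grid held in memory; the header values and elevations are hypothetical and do not come from terrain.asc. It parses the six header lines into a dictionary rather than using linecache:

```python
import io

import numpy as np

# A tiny, made-up ASCII Grid (values are hypothetical).
grid_text = """ncols 4
nrows 3
xllcorner -95.5
yllcorner 29.5
cellsize 0.001
NODATA_value -9999
65 68 72 80
66 69 75 82
64 71 74 90
"""

lines = grid_text.splitlines()
# The first six lines are "key value" pairs; split() tolerates repeated spaces.
header = {}
for line in lines[:6]:
    key, value = line.split()
    header[key.lower()] = float(value)

# Load the elevation rows, skipping the six header lines.
elev = np.loadtxt(io.StringIO(grid_text), skiprows=6)

# 1 where the terrain lies below the flood elevation, 0 elsewhere.
wet = np.where(elev < 70, 1, 0)

print(header["ncols"], header["nrows"], header["cellsize"])
print(wet)
```

The resulting wet array is the kind of binary mask that the flood fill function expects as its third argument.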
The image in the following screenshot shows the flood inundation output over a classified version of the DEM with lower elevation values in brown, mid-range values in green, and higher values in gray and white. The flood raster, which includes all connected areas below 70 meters, is colored blue. This image was created with QGIS, but the raster could also be displayed in ArcGIS as EPSG:4326. You could also use GDAL to save the flood raster grid as an 8-bit TIFF or JPEG, just like the NDVI example, to view it in a standard graphics program:
The image in the following screenshot is nearly identical except that it also shows the filtered mask from which the inundation was derived. The mask is displayed in yellow by generating a file for the array called wet instead of fld, to show the non-contiguous regions that were not included as part of the flood. These areas are not connected to the source point, so they are unlikely to be reached during a flood event:
By changing the elevation value, you can create additional flood inundation rasters. We started with an elevation of 70 meters. If we increase that value to 90, we can expand the flood. The following screenshot shows a flood event at both 70 and 90 meters. The 90-meter inundation is the lighter blue polygon. You can take bigger or smaller steps and show different impacts as different layers:
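Producing masks for several flood elevations is a small loop over the threshold value. The following sketch assumes a made-up elevation array in meters (not taken from terrain.asc) and only builds the binary masks; in the full script, each mask would then be passed to floodFill() with the same starting point:

```python
import numpy as np

# Hypothetical elevations in meters.
elev = np.array([
    [60, 65, 85, 95],
    [62, 75, 88, 99],
    [68, 80, 92, 70],
])

levels = [70, 90]
# One binary mask per flood elevation, keyed by level.
masks = {level: np.where(elev < level, 1, 0) for level in levels}

for level in levels:
    count = int(masks[level].sum())
    print(level, "m:", count, "cells below the flood elevation")
```

Because a higher threshold can only add cells, every cell in the 70-meter mask is also present in the 90-meter mask, which is why the layers nest when drawn on top of each other.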
This model is an excellent and useful visualization. However, you could take this analysis even further by using GDAL's polygonize() method on the flood mask, as we did with the island in the Extracting features from images section in Chapter 6, Python and Remote Sensing. This operation will give you a vector flood polygon. Then you could use the principles we discussed in the Performing selections section in Chapter 5, Python and Geographic Information Systems, to select buildings using the polygon to determine the population impact. You could also combine that flood polygon with the dot-density example in the Dot density calculations section in Chapter 5, Python and Geographic Information Systems, to assess the potential population impact of a flood. The possibilities are endless.