© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
M. Paluszek et al., Practical MATLAB Deep Learning, https://doi.org/10.1007/978-1-4842-7912-0_9

9. Terrain-Based Navigation

Michael Paluszek (1), Stephanie Thomas (2), and Eric Ham (2)
(1) Plainsboro, NJ, USA
(2) Princeton, NJ, USA

9.1 Introduction

Before the widespread availability of GPS, Loran, and other electronic navigation aids, pilots used visual cues from terrain to navigate. Now everyone uses GPS. We want to return to the good old days of terrain-based navigation. We will design a system that matches observed terrain against a database and then uses that information to determine where the aircraft is flying.

9.2 Modeling Our Aircraft

9.2.1 Problem

We want a three-dimensional aircraft model that can change direction.

9.2.2 Solution

Write the equations of motion for three-dimensional flight.

9.2.3 How It Works

The motion of a point mass through three-dimensional space has three degrees of freedom. Our aircraft model is therefore given three degrees of spatial freedom. The velocity vector is expressed as a wind-relative magnitude (V ) with directional components for heading (ψ) and flight path angle (γ). The position is a direct integral of the velocity and is expressed in y =  North, x =  East, h =  Vertical coordinates. In addition, the engine thrust is modeled as a first-order system where the time constant can be changed to approximate the engine response times of different aircraft.

Figure 9.1 shows a diagram of the velocity vector in the North-East-Up coordinate system. The time derivatives are taken in this frame. This is not a purely inertial coordinate system, because it is rotating with the Earth. However, the rate of rotation of the Earth is sufficiently small compared to the aircraft turning rates so that it can be safely neglected.
Figure 9.1

Velocity in North-East-Up coordinates.

The point-mass aircraft equations of motion are
$$\dot{v} = \left(T\cos\alpha - D - mg\sin\gamma\right)/m - f_v$$
(9.1)
$$\dot{\gamma} = \frac{1}{mv}\left((L+T\sin\alpha)\cos\phi - mg\cos\gamma + f_\gamma\right)$$
(9.2)
$$\dot{\psi} = \frac{1}{mv\cos\gamma}\left((L+T\sin\alpha)\sin\phi - f_\psi\right)$$
(9.3)
$$\dot{x}_e = v\cos\gamma\sin\psi + W_x$$
(9.4)
$$\dot{y}_n = v\cos\gamma\cos\psi + W_y$$
(9.5)
$$\dot{h} = v\sin\gamma + W_h$$
(9.6)
$$\dot{m} = -\frac{T}{u_e}$$
(9.7)
where v is the true airspeed, T is the thrust, L is the lift, g is the acceleration of gravity, γ is the air-relative flight path angle, ψ is the air-relative heading (measured clockwise from North), ϕ is the bank angle, x and y are the East and North positions, respectively, and h is the altitude. The mass m is the total of the dry mass and the fuel mass, and $u_e$ is the engine exhaust velocity, so $T/u_e$ is the fuel mass flow. The terms $\{f_v, f_\gamma, f_\psi\}$ represent additional forces due to modeling uncertainty, and the terms $\{W_x, W_y, W_h\}$ are wind speed components. If the vertical wind speed is zero, then γ = 0 produces level flight. α, ϕ, and T are the controls. Figure 9.2 shows the longitudinal symbols for the aircraft. γ is the angle between the velocity vector and the local horizontal. α is the angle of attack, the angle between the nose of the aircraft and the velocity vector. The wings may be oriented, or have airfoils shaped, so that they produce lift at zero angle of attack. Drag is opposite to the velocity, and lift is perpendicular to the drag. Lift must balance gravity and any downward component of drag; otherwise, the aircraft will descend.
Figure 9.2

Aircraft model showing lift, drag, and gravity.

We are using a very simple aerodynamic model. The lift coefficient is defined as
$$c_L = c_{L_\alpha}\alpha$$
(9.8)
In reality, the lift coefficient is a nonlinear function of the angle of attack: there is a maximum angle of attack above which the wing stalls and lift is lost. Equation 9.8 is the linear approximation valid at small angles of attack. For a flat plate, $c_{L_\alpha} = 2\pi$. The drag coefficient is
$$c_D = c_{D_0} + \frac{c_L^2}{\pi A_R \epsilon}$$
(9.9)
where $A_R$ is the aspect ratio and $\epsilon$ is the Oswald efficiency factor, which is typically between 0.8 and 0.95. The efficiency factor measures how efficiently the wing produces lift relative to the induced drag; a value less than one means the lift produces more lift-induced drag than an ideal wing. The aspect ratio is the ratio of the wingspan (from the point nearest the fuselage to the tip) to the chord (the length from the front to the back of the wing).
The dynamic pressure, the pressure due to the motion of the aircraft, is
$$q = \frac{1}{2}\rho v^2$$
(9.10)
where v is the speed and ρ is the atmospheric density. This is the pressure on your hand if you stick it out of the window of a moving car. The lift and drag forces are
$$L = q c_L s$$
(9.11)
$$D = q c_D s$$
(9.12)
where s is the wetted area, the surface of the aircraft that produces lift and drag. We use the same area for both lift and drag, but in a real aircraft some parts (like the nose) cause drag without producing any lift. In essence, we assume the aircraft is all wing.
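To make the aerodynamic model concrete, the short MATLAB snippet below evaluates Equations 9.8 through 9.12 for one flight condition. All of the parameter values are illustrative assumptions, not the values used in the book's simulation.

% Evaluate the simple aerodynamic model (Equations 9.8 - 9.12)
% All parameter values below are illustrative assumptions.
cLAlpha = 2*pi;       % Lift curve slope (flat-plate value), 1/rad
cD0     = 0.015;      % Zero-lift drag coefficient (assumed)
aR      = 7.0;        % Aspect ratio (assumed)
oswald  = 0.9;        % Oswald efficiency factor (assumed)
s       = 100;        % Wetted area, m^2 (assumed)
rho     = 0.41;       % Atmospheric density near 10 km, kg/m^3
v       = 250;        % True airspeed, m/s
alpha   = 0.05;       % Angle of attack, rad

q  = 0.5*rho*v^2;               % Dynamic pressure (Eq. 9.10)
cL = cLAlpha*alpha;             % Lift coefficient  (Eq. 9.8)
cD = cD0 + cL^2/(pi*aR*oswald); % Drag coefficient  (Eq. 9.9)
L  = q*cL*s;                    % Lift  (Eq. 9.11)
D  = q*cD*s;                    % Drag  (Eq. 9.12)
fprintf('q = %6.0f Pa  L = %8.0f N  D = %8.0f N\n',q,L,D)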
We create a right-hand-side function for the model; it will be called by the numerical integration function and contains the dynamical model. The default data structure is defined in the subfunction DefaultDataStructure and includes both constant parameters and the control inputs. We use a modified exponential atmosphere for the density.
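The book's RHSPointMassAircraft listing is not reproduced here. The following is a minimal sketch of how such a right-hand-side function could be organized; the field names, parameter values, and the simple exponential density (standing in for the modified exponential atmosphere) are assumptions, and the wind, uncertainty, and first-order engine lag terms are omitted for brevity.

function [xDot, lift, drag] = RHSPointMassAircraft( ~, x, d )
% Point-mass aircraft dynamics (sketch). State x = [v;gamma;psi;xE;yN;h;m].
% d is a data structure of parameters and controls. Call with no arguments
% to get the default data structure.

if nargin < 1
  xDot = DefaultDataStructure;
  return
end

v     = x(1); gamma = x(2); psi = x(3); h = x(6); m = x(7);

rho   = AtmDensity( h );
q     = 0.5*rho*v^2;                        % Dynamic pressure
cL    = d.cLAlpha*d.alpha;                  % Lift coefficient
cD    = d.cD0 + cL^2/(pi*d.aspectRatio*d.oswald);
lift  = q*cL*d.s;
drag  = q*cD*d.s;
g     = 9.806;
T     = d.thrust;

xDot    = zeros(7,1);
xDot(1) = (T*cos(d.alpha) - drag - m*g*sin(gamma))/m;
xDot(2) = ((lift + T*sin(d.alpha))*cos(d.phi) - m*g*cos(gamma))/(m*v);
xDot(3) = ((lift + T*sin(d.alpha))*sin(d.phi))/(m*v*cos(gamma));
xDot(4) = v*cos(gamma)*sin(psi);            % East
xDot(5) = v*cos(gamma)*cos(psi);            % North
xDot(6) = v*sin(gamma);                     % Altitude
xDot(7) = -T/d.uE;                          % Fuel consumption

function d = DefaultDataStructure
% Assumed parameter set, roughly a business-jet class aircraft
d = struct('cLAlpha',2*pi,'cD0',0.015,'aspectRatio',7,'oswald',0.9,...
           's',100,'thrust',40e3,'alpha',0,'phi',0,'uE',30e3);

function rho = AtmDensity( h )
% Simple exponential atmosphere with an assumed scale height
rho = 1.225*exp(-h/8500);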
We want to maintain a force balance so that the speed of the aircraft is constant and the aircraft does not change its flight path angle. For example, in level flight the aircraft would not ascend or descend. We need to control the aircraft in level flight so that the velocity stays constant and γ = 0 for any ϕ. The relevant equations are
$$0 = T\cos\alpha - D$$
(9.13)
$$0 = (L+T\sin\alpha)\cos\phi - mg$$
(9.14)
We need to find T and α given ϕ.
A simple way is to use fminsearch. It calls RHSPointMassAircraft and numerically finds controls that, for a given ψ, h, and v, produce zero time derivatives of the speed and flight path angle. The following code finds the equilibrium angle of attack and thrust. RHS is called by fminsearch; it returns a scalar cost that is a quadratic in the acceleration (the time derivative of the velocity) and the time derivative of the flight path angle. Our initial guess is a value of thrust that balances the drag. Even with an angle-of-attack guess of zero, it converges with the default set of parameters, opt = optimset('fminsearch').
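A sketch of the trim computation, assuming the RHSPointMassAircraft sketch above; the cost function name RHS follows the text, but its body, the state values, and the initial thrust guess are illustrative assumptions.

% Find the equilibrium angle of attack and thrust for a given bank angle.
d     = RHSPointMassAircraft;            % Default data structure
d.phi = 0.2;                             % Bank angle, rad (assumed)
x     = [250;0;0;0;0;10e3;20e3];         % [v;gamma;psi;xE;yN;h;m] (assumed)

u0    = [0; 50e3];                       % Guess: alpha = 0, thrust near the drag
opt   = optimset('fminsearch');
u     = fminsearch( @(u) RHS(u,x,d), u0, opt );
fprintf('Equilibrium: alpha = %6.4f rad, thrust = %8.0f N\n',u(1),u(2))

function c = RHS( u, x, d )
% Scalar cost: quadratic in the speed and flight path angle derivatives
d.alpha  = u(1);
d.thrust = u(2);
xDot     = RHSPointMassAircraft( 0, x, d );
c        = xDot(1)^2 + xDot(2)^2;
end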
The demo is for a Gulfstream 350 flying at 250 m/s and 10 km altitude.
The results of the demo are quite reasonable.
With these values, the plane will turn without changing altitude or airspeed. We simulate the Gulfstream in the script AircraftSim.m. The first part runs our equilibrium computation demo.
The next part does the simulation. It breaks out of the loop if the aircraft altitude falls below zero, that is, if it crashes. We call RHSPointMassAircraft once to get the lift and drag values for plotting; it is then called by RungeKutta to do the numerical integration. The @ operator creates a handle (pointer) to the function.
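Continuing from the trim sketch, the propagation loop could look like the following. The RungeKutta calling convention, time step, and duration are assumptions rather than the book's AircraftSim.m listing.

% Propagate the aircraft with the equilibrium controls
d.alpha  = u(1);                    % Trim controls from fminsearch
d.thrust = u(2);
dT    = 1;                          % Time step, s (assumed)
nSim  = 600;                        % Number of steps (assumed)
xPlot = zeros(length(x),nSim);      % State storage for plotting
lD    = zeros(2,nSim);              % Lift and drag storage

for k = 1:nSim
  [~,lift,drag] = RHSPointMassAircraft( 0, x, d );      % For plotting only
  xPlot(:,k)    = x;
  lD(:,k)       = [lift;drag];
  x = RungeKutta( @RHSPointMassAircraft, 0, x, dT, d );  % Numerical integration
  if x(6) < 0                       % Stop if the aircraft crashes
    break
  end
end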
The remainder produces three plots. The first plot is the states that are numerically integrated. The next gives the controls, lift, and drag. The final plot shows the planar trajectory. We do unit conversions since degrees and kilometers are a bit clearer.

As you can see in Figure 9.3, the radius of the turn is 15 km as expected. The drag and lift remain constant. In practice, we would have a velocity and flight path angle control system to handle disturbances or parameter variations. For our deep learning example, we just use the ideal dynamics. Figure 9.4 shows the simulation outputs.

Figure 9.3 will provide a nice trajectory for our deep learning examples. You can change the aircraft simulation to produce other trajectories.
Figure 9.3

Aircraft trajectory.

Figure 9.4

Simulation outputs. States (the integrated quantities) are on the top. Lift, drag, and the controls ϕ, α, and T are on the bottom.

9.3 Generating Terrain

9.3.1 Problem

We want to create an artificial terrain model from a set of terrain “tiles.” A tile is a segment of terrain from a bigger picture, much like bathroom tiles make up a bathroom wall, unless, of course, you have the modern fiberglass shower.

9.3.2 Solution

Find images of terrain and tile them together. There are many sources of terrain tiles. Google Earth is one.

9.3.3 How It Works

We start by compiling a database of terrain tiles. We have them in the folder terrain in our MATLAB package. A segment of the terrain folder is shown in Figure 9.5. This is just one way to get terrain tiles. There are online sources for downloading tiles. Also, many flight simulator games have extensive terrain libraries. Each folder is named by its latitude and longitude. For example, -10-10 is −10 degrees latitude and −10 degrees longitude. Our database only extends to ± 60 degrees latitude. The first block creates a list of the folders in terrain. An important thing with this code is that your script needs to be in the correct directory. We don't do any fancy directory searching.
Figure 9.5

A segment of the terrain folder.

The next code block finds the indices for the desired tiles.
The following code creates the filenames based on our latitudes and longitudes. We just create correctly formatted strings. This shows one way to create strings. Notice we use %d to create integers. It automatically makes them the right length. We need to check for positive and negative so that the + and - signs are correct.
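For illustration, a folder name such as -10-10 could be assembled with sprintf as shown below; the sign convention for positive latitudes and longitudes is an assumption based on the example above.

% Build a tile folder name such as '-10-10' from latitude and longitude.
lat = -10;  lon = -10;              % Degrees
if lat >= 0
  sLat = sprintf('+%d',lat);
else
  sLat = sprintf('%d',lat);
end
if lon >= 0
  sLon = sprintf('+%d',lon);
else
  sLon = sprintf('%d',lon);
end
folderName = [sLat sLon];
fprintf('Tile folder: %s\n',folderName)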
The next block reads in each image, flips it upside down, and scales it. The images happen to be stored north-down and south-up. We first change directory into terrain, then cd into each folder; cd .. changes back up into terrain.
The next block of code calls image to draw each image in the correct spot on the 3 by 3 tiled map.
The subfunction ScaleImage scales the image down by averaging each block of pixels into one pixel. At the very end, we cd .., putting us back into the original directory.
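A minimal sketch of what a pixel-averaging ScaleImage could look like; the book's subfunction may differ in its details.

function s = ScaleImage( img, n )
% Scale an RGB image down by a factor n by averaging n-by-n blocks of pixels.
% Assumes the image dimensions are divisible by n. This is a sketch of the
% idea, not the book's listing.
img        = double(img);
[nR,nC,nL] = size(img);
s          = zeros(nR/n,nC/n,nL);
for k = 1:nL
  for i = 1:nR/n
    for j = 1:nC/n
      block    = img((i-1)*n+1:i*n,(j-1)*n+1:j*n,k);
      s(i,j,k) = mean(block(:));
    end
  end
end
s = uint8(s);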
Figure 9.6

Terrain tiled image of the Middle East.

The demo picks a latitude and longitude in the Middle East. The resulting 3 by 3 tiled image is shown in Figure 9.6. We can't use this image for the neural net because of its very low resolution.

9.4 Close-Up Terrain

9.4.1 Problem

We want higher-resolution terrain.

9.4.2 Solution

Specialize the terrain code to produce a small segment of higher-resolution terrain suitable for experiments with a commercial drone.

9.4.3 How It Works

The preceding terrain code would work well for an orbiting satellite, but not so well for a drone. Per FAA regulations, the maximum altitude for small unmanned aircraft is 400 feet or about 122 meters. A satellite in a low Earth orbit (LEO) typically has an altitude of 300–500 km. Thus, drones are typically about 2500–4000 times closer to the surface than a satellite! We take the code and specialize it to read in just four images. It is much simpler than CreateTerrain and is less flexible. If you want to change it, you will need to change the code in the file.
Figure 9.7

Close-up terrain.

We don’t have any options for scaling. This runs the function:
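The listing is not reproduced here; the sketch below illustrates the idea of reading four adjacent tiles and drawing them as a 2 by 2 map. The function name, folder names, and tile file name are assumptions. Running the demo is then just a call to the function with no arguments.

function CreateTerrainClose
% Read four adjacent 1 degree tiles and draw them as a 2 by 2 close-up map.
% Folder and file names are illustrative assumptions.
folders = {'+30+40' '+31+40';'+30+41' '+31+41'};
figure('Name','Close Up Terrain');
for i = 1:2
  for j = 1:2
    img = imread(fullfile('terrain',folders{i,j},'tile.jpg'));
    img = flipud(img);                 % Stored images are north-down
    image([j-1 j],[i-1 i],img);        % Place the tile, in degrees
    hold on
  end
end
axis image
set(gca,'YDir','normal')
xlabel('Longitude offset (deg)')
ylabel('Latitude offset (deg)')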

Figure 9.7 shows the terrain. It is 2 degrees by 2 degrees.

9.5 Building the Camera Model

9.5.1 Problem

We want to build a camera model for our deep learning system. We want a model that emulates the function of a drone-mounted camera. Ultimately, we will use this camera model as part of a terrain-based navigation system, and we’ll apply deep learning techniques to do terrain navigation.

9.5.2 Solution

We will model a pinhole camera and create a high-altitude aircraft. A pinhole camera is a lowest-order approximation to a real optical system. We’ll then build the simulation and demonstrate the camera.

9.5.3 How It Works

We’ve already created an aircraft simulation in Recipe 9.2. The addition will be the terrain model and the camera model. A pinhole camera is shown in Figure 9.8. A pinhole camera has an infinite depth of field, and the images are rectilinear.

A point P(x, y, z) is mapped to the imaging plane by the relationships
$$u = \frac{fx}{h}$$
(9.15)
$$v = \frac{fy}{h}$$
(9.16)
where u and v are coordinates in the focal plane, f is the focal length, and h is the distance from the pinhole to the point along the axis normal to the focal plane. This assumes that the z-axis of the coordinate frame x, y, z is aligned with the boresight of the camera. The angle that is seen by the imaging chip is
$$\theta = \tan^{-1}\left(\frac{w}{2f}\right)$$
(9.17)
where w is the width of the imaging chip and f is the focal length. The shorter the focal length, the wider the field of view. Depth of field is not a concern for a pinhole camera, and in any case it is unimportant for this far-field imaging problem. The field of view of a pinhole camera is limited only by the sensing element. Most cameras have lenses, and the images are not perfect across the imaging array; this presents practical problems that need to be solved in real machine vision systems.
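As a quick numerical check of Equations 9.15 through 9.17, the snippet below projects a point and computes the chip angle; the focal length, chip width, and point location are assumed values.

% Numerical check of the pinhole relations
f     = 0.05;                   % Focal length, m (assumed)
w     = 0.01;                   % Imaging chip width, m (assumed)
h     = 10e3;                   % Distance from the pinhole to the terrain, m
x     = 2000;  y = -1500;       % Point coordinates in the camera frame, m
u     = f*x/h;                  % Focal plane coordinate (Eq. 9.15)
v     = f*y/h;                  % Focal plane coordinate (Eq. 9.16)
theta = atan(w/(2*f));          % Angle seen by the imaging chip (Eq. 9.17)
fprintf('u = %6.4f m  v = %6.4f m  theta = %5.2f deg\n',u,v,theta*180/pi)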

We want our camera to see 16 pixels by 16 pixels from the terrain image in Figure 9.7. We will assume a flight altitude of 10 km. Figure 9.9 gives the dimensions.

We are not simulating a particular camera. Instead, our camera model produces 16 by 16 pixel maps given a position as input. The output is a data structure with the x and y coordinates and an image. If no inputs are given, it will create a tiled map of the image. We scaled the image in the GraphicConverter app (Lemke Software GmbH) and saved it in the file TerrainClose.jpg. Its dimensions are x pixels by y pixels by three layers; the third index selects the red, green, and blue matrices. This is a three-dimensional array, typical for color images.
Figure 9.8

Pinhole camera.

Figure 9.9

Pinhole camera with dimensions.

The code is shown as follows. We convert everything to pixels, get the image using [~,~,i] = getimage(h), and get the segment.

The first part of the code is to provide defaults for the user.
The next part computes the pixels.
The remainder displays the image.
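A sketch of how such a camera function could be structured, assuming the full terrain image is already drawn in a figure whose image handle is passed in; the function signature, terrain width, and pixel mapping are assumptions, not the book's listing.

function d = TerrainCamera( rPlane, hImage, nPixels )
% Return an nPixels-by-nPixels camera image centered on the aircraft position.
% rPlane is the [x;y] position in meters, hImage is the handle of the terrain
% image, nPixels defaults to 16. The terrain width and the pixel mapping are
% assumptions for illustration; no boundary checking is done.

if nargin < 3
  nPixels = 16;
end

[~,~,i]   = getimage( hImage );        % Full RGB terrain image
[nY,nX,~] = size(i);
xWidth    = 60e3;                      % Terrain width in x, m (assumed)
yWidth    = 60e3;                      % Terrain width in y, m (assumed)

% Convert the position to pixel indices measured from the image center
kX = round( rPlane(1)/xWidth*nX + nX/2 );
kY = round( rPlane(2)/yWidth*nY + nY/2 );

% Extract the segment around the aircraft
jX      = kX - nPixels/2 + (1:nPixels);
jY      = kY - nPixels/2 + (1:nPixels);
d.x     = rPlane(1);
d.y     = rPlane(2);
d.image = i(jY,jX,:);

% Display the camera view
image(d.image)
axis off
axis image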
The demo draws the source image and then the camera image. Both are shown in Figure 9.10.
Figure 9.10

Terrain camera source image and camera view. The camera view is 16 × 16 pixels.

The terrain image from the camera is blurry because it has so few pixels.

9.6 Plotting the Trajectory

9.6.1 Problem

We want to plot our trajectory over an image.

9.6.2 Solution

Create a function to draw the image and plot the trajectory on top.

9.6.3 How It Works

We write a function that reads in an image and plots the trajectory on top. We scale the image using image. The x-dimension is set and the y-dimension is scaled to match.
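A minimal sketch of such a plotting function, assuming the image is scaled to a known width in x and the trajectory is given in the same units; the name and calling convention are illustrative.

function PlotXYTrajectory( x, y, img, xWidth )
% Draw an image scaled so that it spans xWidth in x (y is scaled to match the
% image aspect ratio) and plot the trajectory on top.
[nY,nX,~] = size(img);
yWidth    = xWidth*nY/nX;              % Keep the image aspect ratio
figure('Name','Trajectory')
image([0 xWidth],[0 yWidth],flipud(img))
hold on
plot( x, y, 'b', 'LineWidth', 1.5 )    % Trajectory over the terrain
axis image
set(gca,'YDir','normal')
xlabel('x (km)')
ylabel('y (km)')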
Figure 9.11

Trajectory plot.

The demo draws a circle over our terrain image. This is shown in Figure 9.11.
While the deep learning system will analyze all of the pixels in the image, it is interesting to see how the mean values of the pixel colors vary across an image. This is shown in Figure 9.12. The x-axis is the image number, going by rows of constant y. As can be seen, there is considerable variation even in nearby images. This indicates that there is sufficient information in each image for our deep learning system to be able to find locations. It also shows that it might be possible just to use mean values to identify the location. Remember that each image varies from the previous by only 16 pixels.
Figure 9.12

Mean red, green, blue values for the images.

9.7 Creating the Training Images

9.7.1 Problem

We want to create training images for our terrain model.

9.7.2 Solution

We build a script to read in the 64 by 64 pixel image and create training images.

9.7.3 How It Works

We first create a 64 by 64 pixel version of our terrain, using any image processing app. We've already done that, and it is saved as TerrainClose64.jpg. The following script reads in the image and generates training images by displacing the index one pixel at a time. We save the images in the folder TerrainImages. We also create labels; each image location is a different label. For each terrain snippet, we create nN copies with noise, so there will be nN images with the label l. The noise must be converted to uint8, like the image, before it is added; you'll get an error if you don't convert to uint8. You can also select different strides, that is, moving the window more than one pixel at a time. The first code sets up the image processing. We choose 16 by 16 pixel images because, after training, there is enough information in each image to classify it. We tried 8 by 8 pixel images, but the training didn't converge.
This line is very important. It makes sure the names correspond to distinct images. We will make copies of each image for training purposes.
We do some directory manipulations here.
The image splitting is done in this code. We add noise, if desired.
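A compact sketch of the splitting and noise-addition loop, including the uint8 conversion noted above; the noise amplitude, number of copies, and file naming are assumptions rather than the book's listing.

% Slide a 16-by-16 pixel window over the 64-by-64 pixel terrain image one
% pixel at a time and save nN noisy copies of each snippet.
im = imread('TerrainClose64.jpg');
n  = 16;                              % Training image size, pixels
nN = 5;                               % Noisy copies per location (assumed)
aN = 4;                               % Noise amplitude in pixel counts (assumed)
if ~exist('TerrainImages','dir')
  mkdir('TerrainImages');
end
l = 0;                                % Label index
for i = 1:size(im,1)-n+1
  for j = 1:size(im,2)-n+1
    l = l + 1;
    s = im(i:i+n-1,j:j+n-1,:);
    for k = 1:nN
      sN = s + uint8(aN*rand(n,n,3)); % Noise must be uint8, like the image
      imwrite(sN,fullfile('TerrainImages',sprintf('%d_%d.jpg',l,k)));
    end
  end
end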
Figure 9.13 shows that the images cover the area. We also verified that the sum of R, G, and B was different for each image. This indicates that there is enough information for the machine learning algorithm.
Figure 9.13

This figure shows that the images cover the landscape.

9.8 Training and Testing

9.8.1 Problem

We want to create and test a convolutional neural network. The neural net will be trained to associate images with an x and y location.

9.8.2 Solution

We create and test a convolutional neural network in TerrainNeuralNet.m. This will be trained on the images created earlier and will be able to return the x and y coordinates. Convolutional neural networks are widely used for image identification.

9.8.3 How It Works

This example is much like the one in Chapter 3. The difference is that each image is a separate category. This is like face identification where each category is a different person.

We have an image input layer to read in each image. We next convolve the images with filters; the weights of the filters are determined during learning. We normalize the outputs and pass them through the ReLU activation function. Pooling compresses the data. Padding keeps the convolution output size equal to the input size. As seen in the layers printout, no extra padding is needed since the images are all the same size. The first convolutional layer has eight 3 by 3 pixel filters; the second has 32 3 by 3 pixel filters. The final set of layers classifies the images. As noted in the previous section, each image has a unique "class" which is associated with its location. We use a constant learning rate. The batch size is smaller than the default.
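A sketch of a network and training options consistent with this description (16 by 16 RGB inputs, eight and then 32 3 by 3 filters, batch normalization, ReLU, pooling, a constant learning rate, and a reduced batch size); the solver, number of classes, pool size, and epoch count are assumptions.

% Convolutional network sketch for the terrain classification problem
nClasses = 49^2;                          % (64-16+1)^2 locations for one-pixel strides (assumed)
layers = [
  imageInputLayer([16 16 3])
  convolution2dLayer(3,8,'Padding','same')
  batchNormalizationLayer
  reluLayer
  maxPooling2dLayer(2,'Stride',2)
  convolution2dLayer(3,32,'Padding','same')
  batchNormalizationLayer
  reluLayer
  fullyConnectedLayer(nClasses)
  softmaxLayer
  classificationLayer];

options = trainingOptions('sgdm', ...
  'InitialLearnRate',0.01, ...            % Constant learning rate
  'LearnRateSchedule','none', ...
  'MaxEpochs',10, ...
  'MiniBatchSize',64, ...                 % Smaller than the default of 128
  'Shuffle','every-epoch', ...
  'Plots','training-progress');

% Assumed datastore setup and training call:
% imds = imageDatastore('TerrainImages','LabelSource','foldernames');
% net  = trainNetwork(imds,layers,options);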

Figure 9.14 shows some of the images. Figure 9.15 shows the training window. It can categorize the images after seven epochs. The difference between the two adjacent images is only 16 pixels. It isn’t a lot of data, but the neural net can categorize each image with 100% accuracy.
Figure 9.14

These selected terrain images show what the neural net is classifying.

In each epoch in Figure 9.15, it is processing all of the training data.
Figure 9.15

Training window.

We get 100% accuracy. You can explore changing the number of layers and trying different activation functions. Training takes a few minutes.

9.9 Simulation

9.9.1 Problem

We want to test our deep learning algorithm using our terrain model.

9.9.2 Solution

We build a simulation using the trained neural net.

9.9.3 How It Works

We reproduce the simulation from the previous section and remove some unneeded output so that we can focus on the neural net. We read in the trained neural net.
The neural net classifies the image obtained by the camera. We convert the category into an integer using int32. A subplot displays both the image the neural net identifies as the match and the actual camera image. The simulation loop stops if the altitude, x(6), is less than 1.
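A sketch of the neural-net measurement step inside the loop. The saved network file, the location lookup arrays xLoc and yLoc, and the stored training images trainImage are assumptions for illustration.

% Neural net "position sensor" inside the simulation loop
s   = load('TerrainNet.mat');                 % Trained network (loaded once, before the loop)
net = s.net;

d   = TerrainCamera( [x(4);x(5)], hImage );   % Camera image at the current position
c   = classify( net, d.image );               % Predicted class (a categorical)
k   = int32( c );                             % Convert the category to an index
xNN = xLoc(k);                                % Location associated with that image
yNN = yLoc(k);

subplot(1,2,1), image(trainImage{k}), axis off, title('Identified')
subplot(1,2,2), image(d.image),       axis off, title('Camera')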

Figure 9.16 shows the trajectory and the camera view. We simulate one full circle.

The identified terrain segment and the path, based on the neural network location, are shown in Figure 9.17. The neural net classifies the terrain it is seeing. The location of each image is read out and used to plot the trajectory.

The 2D trajectory is shown in Figure 9.18 for a circular path. We make sure we stay in the region where each image shifts by one pixel; on the edges there is a one-image-wide border, and if the trajectory entered that region, the resolution would be lower. The trajectory from the images is reasonably close to the actual trajectory. Better results would require higher resolution. In practice, the measured positions would be inputs to a Kalman filter [42] that modeled the aircraft dynamics given earlier in this chapter. An input to the Kalman filter could be a 3-axis accelerometer (see Chapter 7). This would smooth the trajectory and improve accuracy.
Figure 9.16

The camera view and trajectory. This is one full circle. The two images are one pixel different.

Figure 9.17

The aircraft path and the identified terrain segments. “Image” in the bottom plot refers to the image index.

Figure 9.18

The aircraft path, blue, and the identified terrain segments, red.

This chapter showed how a neural network can be used to identify terrain for aircraft navigation. We simplified things by flying at a constant altitude, using a pinhole camera model with fixed image orientation, and ignoring clouds and other complications. We used a convolutional neural network to train the neural net with good results. As noted, higher-resolution images and a Kalman filter would produce a smoother trajectory.
