Chapter 5

Sensors Used in Mobile Systems

Abstract

Sensors are one of the key elements in mobile robotics. Together with other essential elements, they enable mobile robots to autonomously perform their actions, such as tracking a trajectory, locating and tracking targets, acting safely by preventing collisions, and localizing and mapping the environment. Although they play a vital role, they are also a limiting factor in mobile robotics, because perfect, robust sensors that would directly measure the robot location are usually not available. Therefore, this chapter starts by introducing the different transformations that are later needed to relate locally sensed information to information in robot coordinates. Then the main methods and sensors used to estimate the robot pose are presented; they need to be a part of every robot localization system. The chapter ends with a brief overview of sensors, their classifications, and their main characteristics.

Keywords

Sensors; Pose estimation; Transformations; Dead reckoning; Active markers; Features

5.1 Introduction

Wheeled mobile robots need to sense the environment using sensors in order to autonomously perform their mission. Sensors are used to cope with uncertainties and disturbances that are always present in the environment and in all robot subsystems. Mobile robots do not have exact knowledge of the environment, and they also have imperfect knowledge about their motion models (uncertainty of the map, unknown motion models, unknown dynamics, etc.). The outcomes of the actions are also uncertain due to nonideal actuators. The main purpose of the sensors is therefore to lower these uncertainties and to enable estimation of robot states as well as the states of the environment.

Usually there is no elegant single-sensor solution (especially for indoor use) that would be accurate and robust enough in measuring the desired information, such as the robot pose. The pose is necessary to localize the robot, which is one of the most important challenges in mobile robotics. Developers therefore need to rely on several sensors and on a fusion of their measurements. Such approaches yield higher-quality and more robust information. Robot pose estimation usually combines relative sensors and absolute sensors. Relative sensors provide information that is given relative to the robot coordinates, while the information of absolute sensors is defined in some global coordinate system (e.g., Earth coordinates).

Using sensors, a mobile robot can sense the states of the environment. The measured sensor information needs to be analyzed and interpreted adequately. In the real world measurements change dynamically (e.g., changes of illumination and different sound or light absorption on surfaces). The measurement error is often modeled statistically by a probability density function, for which a symmetric or even normal distribution is usually assumed. However, this assumption may not always be correct (e.g., an ultrasonic distance measurement can be larger than the true distance due to multiple reflections along the ultrasound path from the sensor transmitter to the sensor receiver).

First, the coordinate system transformations are briefly described. They are needed to correctly map sensor measurements to the robot and to estimate relevant information in robot-related coordinates. Then, the main localization methods for estimation of robot pose in the environment using specific sensors are explained. Finally, a short overview of sensors used in mobile robotics is given.

5.2 Coordinate Frame Transformations

Sensors that are mounted on the robot are usually not placed in the robot's center or in the origin of the robot's coordinate frame. Their position and orientation on the robot are described by a translation vector and a rotation with respect to the robot's frame. These transformations are needed to relate the quantities measured in the sensor frame to the robot coordinates.

With these transformations we can describe how a sensed direction vector (e.g., from an accelerometer or magnetometer) or sensed position coordinates (e.g., from a laser range scanner or camera) are expressed in the robot coordinates. Furthermore, mobile robots move in space, and therefore their poses and movements can also be described by appropriate transformations.

Here, a short general overview for 3D space is given, although in wheeled mobile robots two dimensions are usually sufficient (e.g., motion on a plane described by two translations and one rotation). First, the notation for rotation transformations is described, and translation is included later.

5.2.1 Orientation and Rotation

The orientation of some local coordinate frame (e.g., sensor) with respect to a reference frame (e.g., robot) is described by a rotation matrix R:

R = \begin{bmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{bmatrix}   (5.1)

where the unit vectors u, v, and w of the local coordinate system are expressed in the reference frame as u = [u_1, u_2, u_3]^T, v = [v_1, v_2, v_3]^T, and w = [w_1, w_2, w_3]^T, with u × v = w. The rows of R are the components of the body unit vectors along the reference coordinate unit vectors x, y, and z. The elements of matrix R are the cosines of the angles between the axes of the two coordinate systems; therefore, matrix R is also called the direction cosine matrix or DCM. Because the vectors u, v, and w are orthonormal, the inverse of R equals its transpose (det R = 1 and R^{-1} = R^T). The DCM has nine parameters to describe three degrees of freedom, but these parameters are not mutually independent; they are bound by six constraints (the sum of the squared elements of each row of R is one, and the dot product of any two rows of R is zero).

The vector v_L in a local (body) coordinate frame is obtained from its description v_G in the global (reference) coordinates using the following transformation:

v_L = R_G^L v_G   (5.2)

Matrix R_G^L therefore transforms the vector description (given by its components) in the global coordinates to the vector description in the local coordinates.

Basic rotation transformations around the x-, y-, and z-axes are obtained by the elementary rotation matrices:

R_x(\varphi) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & \sin\varphi \\ 0 & -\sin\varphi & \cos\varphi \end{bmatrix}   (5.3)

R_y(\theta) = \begin{bmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{bmatrix}   (5.4)

R_z(\psi) = \begin{bmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix}   (5.5)

where R_axis(angle) denotes the rotation around the given axis by the given angle. Successive rotations are described by the product of rotation matrices, where the order of the rotations is important. The DCM is the elementary representation of rigid body orientation; however, in some cases other representations are more appropriate. Therefore, two additional representations are presented in the following.
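Listing 5.2 later in this chapter assumes Matlab helper functions rotX and rotY that implement the elementary rotations (5.3), (5.4). A minimal sketch of such helpers is given below (the names rotX, rotY, and rotZ follow that listing; each function would be saved in its own file or as a local function):

function R = rotX(phi)
% Elementary rotation around the x-axis, Eq. (5.3)
R = [1, 0, 0; 0, cos(phi), sin(phi); 0, -sin(phi), cos(phi)];
end

function R = rotY(theta)
% Elementary rotation around the y-axis, Eq. (5.4)
R = [cos(theta), 0, -sin(theta); 0, 1, 0; sin(theta), 0, cos(theta)];
end

function R = rotZ(psi)
% Elementary rotation around the z-axis, Eq. (5.5)
R = [cos(psi), sin(psi), 0; -sin(psi), cos(psi), 0; 0, 0, 1];
end

The 3-2-1 DCM discussed below is then obtained as rotX(phi)*rotY(theta)*rotZ(psi).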

Euler Angles

Euler angles describe the orientation of a rigid body by rotations around the x-, y-, and z-axes. The angles are denoted as

 φ: roll angle (around x-axis),

 θ: pitch angle (around y-axis), and

 ψ: yaw or heading angle (around z-axis).

There are 12 possible rotation combinations around the axes x, y, and z [1]. The most frequently used is the combination 3–2–1, where the orientation of some body coordinate frame with respect to the reference coordinate frame is obtained from the initial pose, in which both coordinate frames are aligned, by rotating the body frame in the following order:

1. First is rotation around z-axis for yaw angle ψ,

2. then rotation around newly obtained y for a pitch angle θ, and

3. finally rotation around newly obtained x for a roll angle φ.

The DCM of this rotation transformation is

R = R_x(\varphi) R_y(\theta) R_z(\psi) = \begin{bmatrix} \cos\theta\cos\psi & \cos\theta\sin\psi & -\sin\theta \\ \sin\varphi\sin\theta\cos\psi - \cos\varphi\sin\psi & \sin\varphi\sin\theta\sin\psi + \cos\varphi\cos\psi & \sin\varphi\cos\theta \\ \cos\varphi\sin\theta\cos\psi + \sin\varphi\sin\psi & \cos\varphi\sin\theta\sin\psi - \sin\varphi\cos\psi & \cos\varphi\cos\theta \end{bmatrix}

and the Euler presentation is as follows:

\varphi = \arctan\frac{R_{23}}{R_{33}}, \quad \theta = -\arcsin(R_{13}), \quad \psi = \arctan\frac{R_{12}}{R_{11}}   (5.6)

Parameterization using Euler angles is not redundant (three parameters for three degrees of freedom). However, its main drawback is that it becomes singular at θ = ±π/2, where rotations around the z- and x-axes have the same effect (the two axes coincide). This effect is known as gimbal lock in classic airplane gyroscopes. In simulation of rotation using the Euler angle notation, this singularity appears due to division by cos θ (see Eq. 5.47 in Section 5.2.3).
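As a quick numerical check of the 3–2–1 parameterization, the DCM can be built from chosen angles and the angles recovered again using Eq. (5.6). The short script below is only an illustrative sketch that reuses the rotX, rotY, and rotZ helpers sketched above (atan2 is used instead of arctan to resolve the quadrant):

% Build the 3-2-1 DCM from Euler angles and recover the angles, Eq. (5.6)
phi = 0.3; theta = 0.5; psi = -0.7;      % roll, pitch, yaw in rad
R = rotX(phi)*rotY(theta)*rotZ(psi);     % DCM

phiR   = atan2(R(2,3), R(3,3));          % recovered roll
thetaR = -asin(R(1,3));                  % recovered pitch
psiR   = atan2(R(1,2), R(1,1));          % recovered yaw

Away from θ = ±π/2 the recovered angles match the chosen ones.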

Quaternions

Quaternions represent orientation in 3D space using four parameters and one constraint equation, without the singularity problems of the Euler angle representation. Mathematically, quaternions are a noncommutative extension of the complex numbers, defined by

q = q_0 + q_1 i + q_2 j + q_3 k   (5.7)

where the complex elements i, j, and k are related by the expression i^2 = j^2 = k^2 = ijk = -1. The term q_0 is the scalar part of the quaternion, and q_1 i + q_2 j + q_3 k is termed the vector part. The quaternion norm is defined by

\|q\| = \sqrt{q \circ q^*} = \sqrt{q_0^2 + q_1^2 + q_2^2 + q_3^2}   (5.8)

where q^* is the conjugated quaternion obtained by q^* = q_0 - q_1 i - q_2 j - q_3 k. The inverse quaternion is calculated using its conjugate and norm value as follows:

q^{-1} = \frac{q^*}{\|q\|^2}   (5.9)

Rotation in space is parameterized using unit quaternions by

q_0 = \cos(\Delta\varphi/2), \quad q_1 = e_1\sin(\Delta\varphi/2), \quad q_2 = e_2\sin(\Delta\varphi/2), \quad q_3 = e_3\sin(\Delta\varphi/2)   (5.10)

where the vector e^T = [e_1, e_2, e_3] is the unit vector of the rotation axis and Δφ is the rotation angle around that axis. For a unit quaternion it holds that q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1.
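Relation (5.10) can be wrapped into a small helper function; the sketch below is illustrative only, and the name quatAxisAngle is not from the book:

function q = quatAxisAngle(e, dphi)
% Unit quaternion for a rotation by angle dphi around the axis e, Eq. (5.10)
e = e(:)/norm(e);                 % normalize the rotation axis
q = [cos(dphi/2); e*sin(dphi/2)]; % quaternion as a column [q0; q1; q2; q3]
end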

Transformation

q_{v'} = q^{-1} \circ q_v \circ q   (5.11)

rotates vector v = [x, y, z]T expressed by quaternion

q_v = x i + y j + z k   (5.12)

(or equivalently qv = [0, x, y, z]T) around axis e for angle Δφ to the vector v′ = [x′, y′, z′]T expressed by quaternion

q_{v'} = x' i + y' j + z' k   (5.13)

where ∘ denotes the product of quaternions defined in Eqs. (5.16), (5.17).

An additional advantage of quaternions is the relatively easy combination of successive rotations. The product of two DCMs can be equivalently written as the product of two quaternions [1]. Having two quaternions,

q = q_0 + q_1 i + q_2 j + q_3 k   (5.14)

and

q' = q_0' + q_1' i + q_2' j + q_3' k   (5.15)

and rotating some vector from its initial orientation first by the rotation defined by q' and then by the rotation defined by q, we obtain the following:

q'' = q \circ q' = (q_0 q_0' - q_1 q_1' - q_2 q_2' - q_3 q_3') + i(q_1 q_0' + q_2 q_3' - q_3 q_2' + q_0 q_1') + j(-q_1 q_3' + q_2 q_0' + q_3 q_1' + q_0 q_2') + k(q_1 q_2' - q_2 q_1' + q_3 q_0' + q_0 q_3')   (5.16)

or in vector-matrix form

\begin{bmatrix} q_0'' \\ q_1'' \\ q_2'' \\ q_3'' \end{bmatrix} = \begin{bmatrix} q_0 & -q_1 & -q_2 & -q_3 \\ q_1 & q_0 & -q_3 & q_2 \\ q_2 & q_3 & q_0 & -q_1 \\ q_3 & -q_2 & q_1 & q_0 \end{bmatrix} \begin{bmatrix} q_0' \\ q_1' \\ q_2' \\ q_3' \end{bmatrix}   (5.17)
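The product (5.17) translates directly into a few lines of Matlab. The following sketch (with the illustrative name quatProduct and quaternions stored as columns [q0; q1; q2; q3]) is used in the numerical checks later in this section:

function qpp = quatProduct(q, qp)
% Quaternion product q'' = q o q' in the matrix form of Eq. (5.17)
Q = [q(1), -q(2), -q(3), -q(4);
     q(2),  q(1), -q(4),  q(3);
     q(3),  q(4),  q(1), -q(2);
     q(4), -q(3),  q(2),  q(1)];
qpp = Q*qp(:);
end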

The relation between the quaternion and DCM presentation is given by

R(q) = \begin{bmatrix} q_0^2+q_1^2-q_2^2-q_3^2 & 2(q_0 q_3 + q_1 q_2) & 2(q_1 q_3 - q_0 q_2) \\ 2(q_1 q_2 - q_3 q_0) & q_0^2-q_1^2+q_2^2-q_3^2 & 2(q_0 q_1 + q_2 q_3) \\ 2(q_0 q_2 + q_1 q_3) & 2(q_2 q_3 - q_0 q_1) & q_0^2-q_1^2-q_2^2+q_3^2 \end{bmatrix}   (5.18)

or in opposite direction

q_0 = \frac{1}{2}\sqrt{1 + R_{11} + R_{22} + R_{33}}   (5.19)

q_1 = \frac{1}{4 q_0}\left(R_{23} - R_{32}\right)   (5.20)

q_2 = \frac{1}{4 q_0}\left(R_{31} - R_{13}\right)   (5.21)

q_3 = \frac{1}{4 q_0}\left(R_{12} - R_{21}\right)   (5.22)

If q_0 in Eq. (5.19) is close to zero, then the transform from the DCM to the quaternion (relations 5.19–5.22) is singular. In this case we can calculate the quaternion using the equivalent form (relations 5.23–5.26) without numerical problems:

q_0 = \frac{1}{4T}\left(R_{23} - R_{32}\right)   (5.23)

q_1 = T   (5.24)

q_2 = \frac{1}{4T}\left(R_{12} + R_{21}\right)   (5.25)

q_3 = \frac{1}{4T}\left(R_{13} + R_{31}\right)   (5.26)

where T = \frac{1}{2}\sqrt{1 + R_{11} - R_{22} - R_{33}}.

The relation between quaternions and Euler angles (notation 3–2–1) is obtained from the matrices R_x(φ), R_y(θ), and R_z(ψ), which correspond to the quaternions [\cos(φ/2) + i\sin(φ/2)], [\cos(θ/2) + j\sin(θ/2)], and [\cos(ψ/2) + k\sin(ψ/2)], respectively. The quaternion for the rotation 3–2–1 is

q = \left[\cos(\psi/2) + k\sin(\psi/2)\right] \circ \left[\cos(\theta/2) + j\sin(\theta/2)\right] \circ \left[\cos(\varphi/2) + i\sin(\varphi/2)\right]   (5.27)

or in vector form

q = \begin{bmatrix} \cos(\varphi/2)\cos(\theta/2)\cos(\psi/2) + \sin(\varphi/2)\sin(\theta/2)\sin(\psi/2) \\ \sin(\varphi/2)\cos(\theta/2)\cos(\psi/2) - \cos(\varphi/2)\sin(\theta/2)\sin(\psi/2) \\ \cos(\varphi/2)\sin(\theta/2)\cos(\psi/2) + \sin(\varphi/2)\cos(\theta/2)\sin(\psi/2) \\ \cos(\varphi/2)\cos(\theta/2)\sin(\psi/2) - \sin(\varphi/2)\sin(\theta/2)\cos(\psi/2) \end{bmatrix}   (5.28)

The opposite transformation is

\varphi = \arctan\frac{2(q_1 q_0 + q_2 q_3)}{q_0^2 - q_1^2 - q_2^2 + q_3^2}   (5.29)

\theta = -\arcsin\left(2(q_1 q_3 - q_2 q_0)\right)   (5.30)

\psi = \arctan\frac{2(q_3 q_0 + q_1 q_2)}{q_0^2 + q_1^2 - q_2^2 - q_3^2}   (5.31)
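Relations (5.28)–(5.31) can be checked numerically with a few lines of Matlab. The sketch below simply evaluates the formulas for one arbitrarily chosen set of angles (atan2 is used to resolve the quadrant):

% 3-2-1 Euler angles to quaternion, Eq. (5.28)
phi = 0.3; theta = 0.5; psi = -0.7;
q = [cos(phi/2)*cos(theta/2)*cos(psi/2) + sin(phi/2)*sin(theta/2)*sin(psi/2);
     sin(phi/2)*cos(theta/2)*cos(psi/2) - cos(phi/2)*sin(theta/2)*sin(psi/2);
     cos(phi/2)*sin(theta/2)*cos(psi/2) + sin(phi/2)*cos(theta/2)*sin(psi/2);
     cos(phi/2)*cos(theta/2)*sin(psi/2) - sin(phi/2)*sin(theta/2)*cos(psi/2)];

% Back to Euler angles, Eqs. (5.29)-(5.31)
phiR   = atan2(2*(q(2)*q(1) + q(3)*q(4)), q(1)^2 - q(2)^2 - q(3)^2 + q(4)^2);
thetaR = -asin(2*(q(2)*q(4) - q(3)*q(1)));
psiR   = atan2(2*(q(4)*q(1) + q(2)*q(3)), q(1)^2 + q(2)^2 - q(3)^2 - q(4)^2);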

Example 5.1

A sensor (magnetometer) coordinate system is rotated with respect to a robot coordinate system. Let us mark the sensor coordinate system as local (L) and the robot coordinate system as global (G). The orientation of the sensor with respect to the robot is described by two rotations: suppose that initially (L) is aligned with (G); (L) is then rotated around the x-axis by αx = 90 degree and finally around the new y-axis by αy = 45 degree. Answer the following:

1. What is the orientation of the sensor expressed by the DCM R_G^L with respect to the robot coordinates?

2. Determine Euler angles (notation 3–2–1) for this transformation.

3. Determine the quaternion q_G^L that describes this transformation.

4. The magnetometer measures direction vector v = [0, 0, 1]T of the magnetic field. What is the presentation of this vector in robot coordinates?

Solution

1. Final orientation is described by the rotation matrix obtained from successive rotations where multiplication order is important.

R_G^L = R_y(\alpha_y) R_x(\alpha_x) = \begin{bmatrix} \cos\alpha_y & 0 & -\sin\alpha_y \\ 0 & 1 & 0 \\ \sin\alpha_y & 0 & \cos\alpha_y \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha_x & \sin\alpha_x \\ 0 & -\sin\alpha_x & \cos\alpha_x \end{bmatrix} = \begin{bmatrix} 0.7071 & 0.7071 & 0 \\ 0 & 0 & 1 \\ 0.7071 & -0.7071 & 0 \end{bmatrix}

The rows of matrix R_G^L are the components of the sensor axes expressed in robot coordinates, which can also be verified graphically by sketching both coordinate frames.


2. Euler angles for the 3–2–1 notation are calculated from the matrix R = R_G^L:

\varphi = \arctan\frac{R_{23}}{R_{33}} = 90°, \quad \theta = -\arcsin(R_{13}) = 0°, \quad \psi = \arctan\frac{R_{12}}{R_{11}} = 45°

where the rotation matrix is obtained by R_G^L = R_x(\varphi) R_y(\theta) R_z(\psi) and the elementary rotations R_x(\varphi), R_y(\theta), and R_z(\psi) are defined in Eqs. (5.3)–(5.5).


3. Quaternion q_G^L is obtained by the product of the quaternions for the successive rotations:

q_G^L = q_x \circ q_y

where q_x is, according to Eq. (5.10), defined by the rotation angle Δφ_x = 90 degree around the rotation axis e_x = [1, 0, 0]^T, and q_y by the rotation angle Δφ_y = 45 degree around the rotation axis e_y = [0, 1, 0]^T:

q_x = \begin{bmatrix} \cos(\Delta\varphi_x/2) \\ e_{x1}\sin(\Delta\varphi_x/2) \\ e_{x2}\sin(\Delta\varphi_x/2) \\ e_{x3}\sin(\Delta\varphi_x/2) \end{bmatrix} = \begin{bmatrix} 0.7071 \\ 0.7071 \\ 0 \\ 0 \end{bmatrix}

q_y = \begin{bmatrix} \cos(\Delta\varphi_y/2) \\ e_{y1}\sin(\Delta\varphi_y/2) \\ e_{y2}\sin(\Delta\varphi_y/2) \\ e_{y3}\sin(\Delta\varphi_y/2) \end{bmatrix} = \begin{bmatrix} 0.9239 \\ 0 \\ 0.3827 \\ 0 \end{bmatrix}

The final quaternion (see quaternion product definition in Eq. 5.17) reads

q_G^L = q_x \circ q_y = \begin{bmatrix} 0.6533 \\ 0.6533 \\ 0.2706 \\ 0.2706 \end{bmatrix}

which corresponds to the rotation angle \Delta\varphi = 2\arccos(q_0) = 2\arccos(0.6533) = 98.41 degree around the rotation axis e = \frac{1}{\sin(\Delta\varphi/2)}[q_1, q_2, q_3]^T = [0.8630, 0.3574, 0.3574]^T.

4. The measured direction vector vL = [0, 0, 1]T is in robot coordinates:

v_G = R_L^G v_L = [0.7071, -0.7071, 0]^T

where R_L^G = (R_G^L)^{-1} = (R_G^L)^T. The same result is obtained with quaternions, where the vector after rotation is

q_{v_G} = (q_L^G)^{-1} \circ q_{v_L} \circ q_L^G

where q_{v_L} = [0, v_L^T]^T and q_L^G = (q_G^L)^{-1} (see Eq. 5.9). Products of quaternions are defined in Eq. (5.17). The solution is

q_{v_G} = \begin{bmatrix} 0.6533 \\ 0.6533 \\ 0.2706 \\ 0.2706 \end{bmatrix} \circ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \circ \begin{bmatrix} 0.6533 \\ -0.6533 \\ -0.2706 \\ -0.2706 \end{bmatrix} = \begin{bmatrix} 0 \\ 0.7071 \\ -0.7071 \\ 0 \end{bmatrix}

which corresponds to the vector v_G = [0.7071, -0.7071, 0]^T (only the vector part of the quaternion is considered).
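The computations of Example 5.1 can also be reproduced numerically. The script below is a possible sketch that reuses the helper functions sketched earlier in this section (rotX, rotY, quatAxisAngle, and quatProduct are illustrative names, not code from the book):

% Example 5.1: DCM, quaternion, and rotation of the measured vector
ax = pi/2; ay = pi/4;                  % rotation angles
RGL = rotY(ay)*rotX(ax);               % DCM of the sensor w.r.t. the robot frame

qx  = quatAxisAngle([1; 0; 0], ax);    % rotation around x
qy  = quatAxisAngle([0; 1; 0], ay);    % rotation around the new y
qGL = quatProduct(qx, qy);             % [0.6533; 0.6533; 0.2706; 0.2706]

vL = [0; 0; 1];                        % magnetometer measurement in sensor frame
vG = RGL.'*vL;                         % [0.7071; -0.7071; 0]

qvL  = [0; vL];                        % vector written as a quaternion
qGLi = [qGL(1); -qGL(2:4)];            % inverse of the unit quaternion qGL
qvG  = quatProduct(quatProduct(qGL, qvL), qGLi)  % [0; 0.7071; -0.7071; 0]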

Example 5.2

Initially, the global coordinate frame G and the local coordinate frame L are aligned; then frame L is rotated around the x-axis by the angle αx = 90 degree and then around the newly obtained z-axis by the angle αz = 90 degree. Answer the following:

1. What is the orientation of frame L expressed by the DCM R_G^L with respect to frame G?

2. Determine Euler angles (notation 3–2–1) for this transformation.

3. Determine the quaternion q_G^L that describes this transformation.

4. The vector in global coordinates is vG = [0, 0, 1]T. What is the presentation of this vector in local coordinates?

Solution

1. The DCM for this rotation, obtained as R_G^L = R_z(\alpha_z) R_x(\alpha_x), is

R_G^L = \begin{bmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{bmatrix}


2. Euler angles (3–2–1) are

\varphi = \arctan\frac{R_{23}}{R_{33}} = \text{undefined}, \quad \theta = -\arcsin(R_{13}) = -90°, \quad \psi = \arctan\frac{R_{12}}{R_{11}} = \text{undefined}

Notice that θ = −90 degree, which means that the parameterization using Euler angles is singular: the angles φ and ψ are not uniquely defined (only their combination is determined by the DCM), so this orientation cannot be unambiguously described using Euler angles.

3. Quaternion q_G^L is

q_G^L = q_x \circ q_z

where

q_x = \begin{bmatrix} 0.7071 \\ 0.7071 \\ 0 \\ 0 \end{bmatrix}, \quad q_z = \begin{bmatrix} 0.7071 \\ 0 \\ 0 \\ 0.7071 \end{bmatrix}

and

q_G^L = q_x \circ q_z = \begin{bmatrix} 0.5 \\ 0.5 \\ -0.5 \\ 0.5 \end{bmatrix}

which results in the rotation angle \Delta\varphi = 2\arccos(0.5) = 120 degree around the axis of rotation e = \frac{1}{\sin(\Delta\varphi/2)}[q_1, q_2, q_3]^T = [0.5774, -0.5774, 0.5774]^T.

4. Vector vG = [0, 0, 1]T reads in local coordinates as

v_L = R_G^L v_G = [1, 0, 0]^T

or with quaternions

q_{v_L} = (q_G^L)^{-1} \circ q_{v_G} \circ q_G^L = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}

where the vector part is vL = [1, 0, 0]T.

5.2.2 Translation and Rotation

To make the presentation more general, let us mark the sensor coordinates by L (as local coordinates) and the robot coordinates by G (as global coordinates). The sensor location with respect to the robot coordinates is described by a translation vector t_L^G = [t_x, t_y, t_z]^T and a rotation matrix R_G^L. The translation t_L^G describes the position of the origin of the local coordinates in the global coordinates, and the rotation matrix R_G^L describes the orientation of the local coordinate frame with respect to the global (robot) coordinates. A point p_G given in global coordinates can be described in local coordinates using the transformation

p_L = R_G^L \left(p_G - t_L^G\right)   (5.32)

and its inverse transformation is given by

p_G = \left(R_G^L\right)^{-1} p_L + t_L^G = \left(R_G^L\right)^T p_L + t_L^G   (5.33)
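Transformations (5.32) and (5.33) can be collected in two small helper functions; the following is a minimal sketch (the function names globalToLocal and localToGlobal are chosen here only for illustration):

function pL = globalToLocal(pG, RGL, tLG)
% Express a point given in global coordinates in the local frame, Eq. (5.32)
pL = RGL*(pG(:) - tLG(:));
end

function pG = localToGlobal(pL, RGL, tLG)
% Express a point given in local coordinates in the global frame, Eq. (5.33)
pG = RGL.'*pL(:) + tLG(:);
end

With the data of Example 5.3 below, localToGlobal([1; 0.5; 0.4], rotZ(pi/6), [0.1; 0; 0.25]) returns the point [0.716; 0.933; 0.65].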

Example 5.3

A robot has a laser range finder (LRF) sensor that measures the position of the closest obstacle point p_L = [1, 0.5, 0.4]^T m in sensor coordinates. It also has a magnetometer that measures Earth's magnetic field vector v_L^T = [22, 1, 42] nT.

The laser range sensor is translated with respect to the robot (global) coordinate frame by t_1 = [0.1, 0, 0.25]^T and rotated (with respect to the robot coordinate frame) around the z-axis by 30 degree. The magnetometer translation is t_2 = [0, 0.1, 0.2]^T, and its rotation is described by the roll, pitch, and yaw angles φ = 0 degree, θ = 10 degree, and ψ = 20 degree (Euler notation 3–2–1), respectively.

Answer the following:

1. What are the closest obstacle point coordinates in the robot coordinate system?

2. How is Earth's magnetic field vector expressed in robot coordinates?

Solution

1. The DCM is

R_G^L = \begin{bmatrix} \cos 30° & \sin 30° & 0 \\ -\sin 30° & \cos 30° & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0.866 & 0.5 & 0 \\ -0.5 & 0.866 & 0 \\ 0 & 0 & 1 \end{bmatrix}

The point expressed in robot coordinates reads

p_G = \left(R_G^L\right)^T p_L + t_1 = [0.716, 0.933, 0.65]^T

2. For vector transformations only the rotation matters, so the magnetic field vector components in robot coordinates are obtained by the rotation transformation

R_G^L = R_x(0°) R_y(10°) R_z(20°) = \begin{bmatrix} 0.9254 & 0.3368 & -0.1736 \\ -0.3420 & 0.9397 & 0 \\ 0.1632 & 0.0594 & 0.9848 \end{bmatrix}

v_G = \left(R_G^L\right)^T v_L = [26.8705, 10.8443, 37.5417]^T

5.2.3 Kinematics of Rotating Frames

This section introduces how the rigid body orientation, represented by quaternions or the DCM, is related to the angular rates around the local axes of the rigid body. The rigid body rotates around its x-, y-, and z-axes with angular velocities ωx, ωy, and ωz, respectively. Therefore, the orientation of the rigid body (e.g., robot or sensor unit) changes with respect to the reference coordinate system.

Rotational Kinematics Expressed by Quaternions

The time dependency of the rigid body rotation (given by a differential equation) can be derived from the definition of the product of two quaternions (5.17). If the orientation q(t) of the rigid body at time t is known, then its orientation q(t + Δt) at time t + Δt can be written as

q(t + \Delta t) = q(t) \circ \Delta q(t)   (5.34)

where Δq(t) defines the change of the rigid body orientation from q(t) to q(t + Δt). In other words, Δq(t) is the orientation of the body at time t + Δt relative to its orientation at time t. The final orientation of the body q(t + Δt) is therefore obtained by first rotating the body with respect to some reference frame by the rotation q(t), and then additionally by the rotation Δq(t) with respect to q(t). Δq(t) is defined by relation (5.10):

\Delta q(t) = \begin{bmatrix} \cos(\Delta\varphi/2) \\ e_x\sin(\Delta\varphi/2) \\ e_y\sin(\Delta\varphi/2) \\ e_z\sin(\Delta\varphi/2) \end{bmatrix}   (5.35)

where e(t) = [ex, ey, ez]T is the rotation axis expressed in rigid body local coordinates at time t and Δφ is the rotation angle during time period Δt. Assuming e(t) and Δφ are constant during the period Δt, the product of quaternions (5.34) can be reformulated using definition (5.17) as follows:

q(t + \Delta t) = \left(\cos\frac{\Delta\varphi}{2} I + \sin\frac{\Delta\varphi}{2}\begin{bmatrix} 0 & -e_x & -e_y & -e_z \\ e_x & 0 & e_z & -e_y \\ e_y & -e_z & 0 & e_x \\ e_z & e_y & -e_x & 0 \end{bmatrix}\right) q(t)   (5.36)

where I is a 4 × 4 identity matrix. For a short interval Δt we can approximate \Delta\varphi \approx \sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2}\,\Delta t, where ω(t) = [ωx, ωy, ωz]^T is the vector of the current angular rates, which can also be written in the form \omega(t) = \sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2}\, e. For small angles, Eq. (5.36) can be approximated by

q(t + \Delta t) \approx \left(I + \frac{\Delta t}{2}\Omega\right) q(t)   (5.37)

where

\Omega = \begin{bmatrix} 0 & -\omega_x & -\omega_y & -\omega_z \\ \omega_x & 0 & \omega_z & -\omega_y \\ \omega_y & -\omega_z & 0 & \omega_x \\ \omega_z & \omega_y & -\omega_x & 0 \end{bmatrix}   (5.38)

A differential equation that describes rigid body orientation is obtained by limiting Δt toward zero:

\frac{dq}{dt} = \lim_{\Delta t \to 0}\frac{q(t + \Delta t) - q(t)}{\Delta t} = \frac{1}{2}\Omega q   (5.39)

where angular rates in Ω are given in the rigid body coordinates.
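In practice, Eq. (5.39) is typically integrated numerically from gyroscope readings. The loop below is a possible sketch based on the small-angle prediction (5.37), with renormalization to keep the quaternion at unit length; it is an illustration, not the book's implementation:

% Integrate body angular rates into the orientation quaternion, Eq. (5.37)
q = [1; 0; 0; 0];                 % initial orientation (aligned with reference)
dt = 0.01;                        % sample time, in s
w = [0.1; -0.2; 0.3];             % measured angular rates, in rad/s (constant here)
for k = 1:100
    Om = [0,    -w(1), -w(2), -w(3);
          w(1),  0,     w(3), -w(2);
          w(2), -w(3),  0,     w(1);
          w(3),  w(2), -w(1),  0];        % Eq. (5.38)
    q = (eye(4) + dt/2*Om)*q;             % prediction step, Eq. (5.37)
    q = q/norm(q);                        % keep the quaternion normalized
end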

Rotational Kinematics Expressed by DCM

The differential equation for the rigid body orientation represented by the DCM is derived in the following. Similarly as in Eq. (5.34), we define

R(t + \Delta t) = \Delta R(t)\, R(t)   (5.40)

where R(t) is the orientation of the rigid body at time t, R(t + Δt) is the orientation at time t + Δt, and ΔR(t) is the change of orientation (the orientation of the body at time t + Δt) with respect to the orientation at time t.

Change of orientation ΔR(t) is defined as

\Delta R(t) = e^{\int_t^{t+\Delta t} \Omega'\, dt}   (5.41)

where

\Omega' = \begin{bmatrix} 0 & \omega_z & -\omega_y \\ -\omega_z & 0 & \omega_x \\ \omega_y & -\omega_x & 0 \end{bmatrix}   (5.42)

and ω(t) = [ωx, ωy, ωz]T is the vector of the current angular rates of the body.

Assuming Ω′ is a constant matrix during the time period Δt, we can approximate \int_t^{t+\Delta t} \Omega'(t)\, dt \approx \Omega'\Delta t = B. The exponential in relation (5.41), written as a Taylor series, becomes

\Delta R(t) = e^{B} = I + B + \frac{B^2}{2!} + \frac{B^3}{3!} + \cdots = I + B + \frac{B^2}{2!} - \frac{\sigma^2 B}{3!} - \frac{\sigma^2 B^2}{4!} + \frac{\sigma^4 B}{5!} + \cdots = I + \frac{\sin\sigma}{\sigma} B + \frac{1 - \cos\sigma}{\sigma^2} B^2   (5.43)

where I is the 3 × 3 identity matrix and \sigma = \Delta t\sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2}. For small angles σ, relation (5.43) is approximated by

\Delta R(t) \approx I + B = I + \Omega'\Delta t   (5.44)

which can be used to predict rotation matrix (5.40):

R(t + \Delta t) = \left(I + \Omega'\Delta t\right) R(t)   (5.45)

The final differential equation is obtained by limiting Δt to zero:

\frac{dR}{dt} = \lim_{\Delta t \to 0}\frac{R(t + \Delta t) - R(t)}{\Delta t} = \Omega' R   (5.46)

For the sake of completeness, the equivalent rotation parameterization using Euler angles (notation 3–2–1) is also given:

\dot{\varphi} = \omega_x + \omega_y\sin\varphi\tan\theta + \omega_z\cos\varphi\tan\theta, \quad \dot{\theta} = \omega_y\cos\varphi - \omega_z\sin\varphi, \quad \dot{\psi} = \frac{\omega_y\sin\varphi + \omega_z\cos\varphi}{\cos\theta}   (5.47)

where it can be seen that this notation becomes singular at θ = ±π/2 (in the first and the third equations of Eq. 5.47).
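Similarly, the DCM can be propagated from gyroscope readings using the prediction (5.45). The sketch below is an illustration under the same assumptions as the quaternion example above; it additionally re-orthonormalizes the matrix to counter numerical drift:

% Propagate the DCM with body angular rates, Eq. (5.45)
R = eye(3);                       % initial orientation
dt = 0.01; w = [0.1; -0.2; 0.3];  % sample time and angular rates
for k = 1:100
    Om = [0,    w(3), -w(2);
         -w(3), 0,     w(1);
          w(2), -w(1), 0];        % Eq. (5.42)
    R = (eye(3) + Om*dt)*R;       % prediction step, Eq. (5.45)
    [U, ~, V] = svd(R);           % re-orthonormalize the rotation matrix
    R = U*V.';
end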

5.2.4 Projective Geometry

Projection is the transformation of a space with N > 0 dimensions into a space with M < N dimensions. Normally, under a general projective transformation some information is inevitably lost. However, if multiple projections of an object are available, it is in some cases possible to reconstruct the original object in N-dimensional space. Two of the most common projective transformations are perspective projection and parallel projection (linear transformation with a focal point in infinity).

According to the pinhole camera model, the 3D point p_C^T = [x_C, y_C, z_C] given in the camera frame C is projected to the 2D point p_P^T = [x_P, y_P] in the image (picture) frame P as follows (see Fig. 5.1):

\underline{p}_P = \frac{1}{z_C} S p_C   (5.48)

Fig. 5.1 Pinhole camera model.

where \underline{p}_P is the representation of the point p_P in homogeneous coordinates, that is, \underline{p}_P^T = [x_P, y_P, 1]. The matrix S \in \mathbb{R}^{3\times 3} describes the internal camera model:

S = \begin{bmatrix} \alpha_x f & \gamma & c_x \\ 0 & \alpha_y f & c_y \\ 0 & 0 & 1 \end{bmatrix}   (5.49)

The intrinsic camera parameters contained in S are the focal length f; the scaling factors αx and αy in horizontal and vertical directions, respectively; the optical axis image center (cx, cy); and the skewness γ. The camera model parameters are normally obtained or refined in the process of camera calibration. The perspective camera model (5.48) is nonlinear because of the term z_C^{-1} (the inverse of the object distance along the z-axis in the camera frame C). Although points, lines, and general conics are invariant under perspective transformation (i.e., the points transform into points, the lines into lines, and conics into conics), the projected image is a somewhat distorted representation of reality. In general, the angles between the lines and distance ratios are not preserved (e.g., the parallel lines do not transform into parallel lines). In some particular camera configurations the perspective projection can be approximated with an appropriate linear model [2]; this can enable, for example, simple camera calibration. The camera model (5.48) can also be written as

\underline{p}_P \simeq S p_C   (5.50)

where ≃ denotes equality up to a nonzero scale factor.

The image of an object is formed on the screen behind the camera optical center (at the distance of the focal length f along the negative z_C semiaxis), and it is rotated by 180 degree and scaled down. When creating a visual presentation of the camera model, a virtual unrotated image can be assumed to be formed in front of the camera's optical center (on the positive z_C semiaxis at the same distance from the optical center as the real image), as shown in Fig. 5.1.

Example 5.4

The internal camera model (5.49) has the following parameters: α_x f = α_y f = 1000, no skewness (γ = 0), and the image center is in the middle of the image with dimensions of 1024 × 768. Project the following set of 3D points, given in the camera frame, to the image:

p_C^T \in \left\{ [-1, 1, 4], [1, 1, 5], [0, -1, 4], [-1, 1, -4], [4, 1, 5] \right\}

Solution

The projection of 3D points to an image according to Eq. (5.48) is given in Listing 5.1. The results are also shown graphically on the left-hand side of Fig. 5.2. Notice that the last point does not appear in the image since it is out of the image screen’s boundaries. Moreover, the fourth point also should not appear in the image since the point is behind the camera. Remember that the mathematical model (5.48) does not take visibility constraints into account. Therefore additional checks need to be made that ensure that only visible points are projected to the image: that is, the projected points must be inside the image boundaries and the point to be projected must be in front of the camera. Therefore, only the first three points actually appear in the image (see right-hand side of Fig. 5.2).

Fig. 5.2 Projected points from Example 5.4. On the left there are points projected to the image according to the camera model (5.48), and on the right there are the points that are actually projected.

Listing 5.1

Solution of Example 5.4

% Intrinsic camera parameters and image screen size
alphaF = 1000; % alpha*f, in px/m
s = [1024; 768]; % screen size, in px
c = s/2; % image centre, in px
S = [alphaF, 0, c(1); 0, alphaF, c(2); 0, 0, 1]; % Camera model

% Set of 3-D points in camera frame
pC = [-1 1 4; 1 1 5; 0 -1 4; -1 1 -4; 4 1 5].';

% Projection of points to image frame
pP = (S*pC)./repmat(pC(3,:), 3, 1)

pP =
   262   712   512   762  1312
   634   584   134   134   584
     1     1     1     1     1
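The visibility checks discussed in the solution can be added to Listing 5.1 with a few additional lines. The sketch below reuses the variables pC, pP, and s from that listing and keeps only the points that are in front of the camera and inside the image boundaries:

% Keep only the visible points (in front of the camera and inside the image)
inFront  = pC(3,:) > 0;
onScreen = pP(1,:) >= 0 & pP(1,:) <= s(1) & pP(2,:) >= 0 & pP(2,:) <= s(2);
pPvisible = pP(:, inFront & onScreen)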

From Eq. (5.48) it is clear that every point in the 3D space is transformed to a point in 2D space, but this does not hold for the inverse transformation. Since the perspective transformation causes loss of the scene depth, the point in the image space can only be back-projected to a ray in 3D space if no additional information is available. The scene can be reconstructed if scene depth is somehow recovered. There are various techniques that enable 3D reconstruction that can be based on depth cameras, structured light, visual cues, motion, and more. The position of the point in a 3D scene can also be recovered from corresponding images of the 3D point obtained from multiple (at least two) views. Three-dimensional reconstruction is therefore possible with a stereo camera.

Multiview Geometry

Multiview geometry is important not only because it enables scene reconstruction; some of its properties can also be exploited in the design of machine vision algorithms (e.g., image point matching and image-based camera pose estimation). Consider that the rotation matrix R_{C2}^{C1} and the translation vector t_{C2}^{C1} describe the relative pose between two cameras (Fig. 5.3). Under the assumption that the camera centers do not coincide (t_{C2}^{C1} ≠ 0), the relation (5.52) can be obtained after a short mathematical manipulation (cameras with identical internal parameters are assumed):

Fig. 5.3 Two-view perspective-camera geometry.

p_{C1} = R_{C2}^{C1} p_{C2} + t_{C2}^{C1},   (5.51)

S^{-1}\underline{p}_{P1} \simeq R_{C2}^{C1} S^{-1}\underline{p}_{P2} + t_{C2}^{C1}
\left[t_{C2}^{C1}\right]_\times S^{-1}\underline{p}_{P1} \simeq \left[t_{C2}^{C1}\right]_\times R_{C2}^{C1} S^{-1}\underline{p}_{P2}
0 = \underline{p}_{P1}^T S^{-T}\left[t_{C2}^{C1}\right]_\times R_{C2}^{C1} S^{-1}\underline{p}_{P2}
0 = \underline{p}_{P1}^T F \underline{p}_{P2}   (5.52)

Notice that the cross product of the vectors a^T = [a_1, a_2, a_3] and b^T = [b_1, b_2, b_3] is written as a × b = [a]_\times b, where [a]_\times is a skew-symmetric matrix:

[a]_\times = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}   (5.53)

The matrix F is known as the fundamental matrix, and it describes the epipolar constraint (5.52): the point \underline{p}_{P1} lies on the line F\underline{p}_{P2} in the first image, and the point \underline{p}_{P2} lies on the line F^T\underline{p}_{P1} in the second image. Another important relation in machine vision is p_{C1}^T E\, p_{C2} = 0, where E = [t_{C2}^{C1}]_\times R_{C2}^{C1} is known as the essential matrix. The relation between the essential and the fundamental matrix is E = S^T F S.

The epipolar constraint can be used to improve the matching of image points that belong to the same world point in multiple images, given the known pose between two calibrated cameras. Since the corresponding pair of the point \underline{p}_{P1} in the first image must lie on the line F^T\underline{p}_{P1} in the second image, the 2D search over the entire image area is simplified to a 1D search along the epipolar line; the matching process can therefore be sped up significantly, and it can also be made more robust, since matches that do not satisfy the epipolar constraint are rejected.

Example 5.5

A set of 3D points is observed from two cameras with the relative displacement described by the rotation matrix R_{C2}^{C1} = R_x(30°) R_y(60°) and the translation vector t_{C2}^{C1} = [4, -1, 2]^T. Both cameras have the same internal camera model (5.49): α_x f = α_y f = 1000, no skewness (γ = 0), and the image center is in the middle of the image with dimensions of 1024 × 768. The set of points projected into the image of the first camera is

p_{P1}^T \in \left\{ [262, 634], [762, 634], [512, 134], [443, 457], [412, 284] \right\}

and the set of points projected into the image of the second camera is

p_{P2}^T \in \left\{ [259, 409], [397, 153], [488, 513], [730, 569], [115, 214] \right\}

The points in both images are shown in Fig. 5.4. Determine all possible image point correspondences taking into account the epipolar constraint. For all the matched point pairs reconstruct the position of the points in the 3D space with respect to both camera frames.

Fig. 5.4 Points in the two images from Example 5.5.

Solution

When matching points between the images from two cameras with a known relative pose between them, the points that belong to the same world point must satisfy the epipolar constraint (5.52). Eq. (5.52) has the form l^T \underline{p} = 0; that is, the homogeneous point \underline{p}^T = [x, y, 1] lies on the line l^T = [a, b, c], since ax + by + c = 0 is a line equation. Therefore, the epipolar constraint (5.52) can be utilized to find the epipolar lines l_{P2} in the second image for every point \underline{p}_{P1} from the first image, and vice versa, the lines l_{P1} for every point \underline{p}_{P2}. The epipolar lines for the first and the second image are shown in Fig. 5.5. From Figs. 5.4 and 5.5, four possible point matches can be found graphically.

Fig. 5.5 Points and epipolar lines in the two images from Example 5.5.

The implementation of the algorithm for point matching is given in Listing 5.2 (the functions rotX and rotY are Matlab implementations of Eqs. 5.3, 5.4, respectively). The resulting point matches are collected in the index matrix pairs.

Listing 5.2

Point matching from Example 5.5

% Intrinsic camera parameters and image screen size
alphaF = 1000; % alpha*f, in px/m
s = [1024; 768]; % screen size, in px
c = s/2; % image centre, in px
S = [alphaF, 0, c(1); 0, alphaF, c(2); 0, 0, 1]; % Camera model

% Relative pose between the cameras
R = rotX(pi/6)*rotY(pi/3); t = [4; -1; 2];

% Set of points
pP1 = [262, 634; 762, 634; 512, 134; 443, 457; 412, 284].';
pP2 = [259, 409; 397, 153; 488, 513; 730, 569; 115, 214].';

N1 = size(pP1, 2); N2 = size(pP2, 2); % Number of points

% Fundamental matrix
tx = [0, -t(3), t(2); t(3), 0, -t(1); -t(2), t(1), 0];
F = S.'\tx*R/S;

epsilon = 1e-2; % Distance error tolerance
pP1 = [pP1; ones(1,N1)]; pP2 = [pP2; ones(1,N2)]; % Homogeneous points

% Epipolar lines in frame P1 associated with the points in frame P2
lP1 = F*pP2;
% Epipolar lines in frame P2 associated with the points in frame P1
lP2 = F.'*pP1;

% Find point pairs (evaluate fundamental constraint in frame P2)
pairs = [];
for i = 1:N1
    d = abs(lP2(:,i).'*pP2);
    k = find(d < epsilon);
    if ~isempty(k), pairs = [pairs, [i; k(1)]]; end
end

pairs =
   1   2   3   5
   3   4   2   1

Singular Cases

The relation (5.52) becomes singular if there is no translation between the camera centers. In this case the following relation can be obtained from Eq. (5.51) if t_{C2}^{C1} is set to zero:

\underline{p}_{P1} \simeq S R_{C2}^{C1} S^{-1}\underline{p}_{P2}   (5.54)

A similar equation form can also be obtained in a case where all the points in the 3D space are confined to a single common plane. Without loss of generality, the plane zW = 0 can be selected:

\underline{p}_{P1} \simeq S \begin{bmatrix} r_{W,1}^{C1} & r_{W,2}^{C1} & t_W^{C1} \end{bmatrix} \begin{bmatrix} r_{W,1}^{C2} & r_{W,2}^{C2} & t_W^{C2} \end{bmatrix}^{-1} S^{-1}\underline{p}_{P2}   (5.55)

where R_W^C = \begin{bmatrix} r_{W,1}^{C} & r_{W,2}^{C} & r_{W,3}^{C} \end{bmatrix}. The transformation from the world plane to the picture plane is

\underline{p}_P \simeq S \begin{bmatrix} r_{W,1}^{C} & r_{W,2}^{C} & t_W^{C} \end{bmatrix} \underline{p}_W   (5.56)

where in this particular case \underline{p}_W^T = [x_W, y_W, 1]. Eqs. (5.54)–(5.56) all have a similar form: \underline{p}' \simeq H\underline{p}. The matrix H is known as a homography.
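A small numerical illustration of the planar homography (5.56) is given below; the camera pose values are made up for this illustration, and the camera matrix S and the rotX, rotY helpers are reused from the previous listings:

% Homography from the world plane zW = 0 to the image plane, Eq. (5.56)
RWC = rotX(pi/6)*rotY(pi/3);     % rotation from world to camera coordinates (example values)
tWC = [0.5; -0.2; 3];            % world origin expressed in the camera frame
H = S*[RWC(:,1), RWC(:,2), tWC]; % homography matrix

pW = [0.4; 0.1; 1];              % homogeneous point on the world plane zW = 0
pP = H*pW; pP = pP/pP(3)         % its projection in the image, in px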

3D Reconstruction

In a stereo camera configuration the 3D position of a point can be reconstructed from both images of the point. This procedure requires finding corresponding image points among the views, which is one of the fundamental problems in machine vision. But once the point correspondences have been established, the relative pose between the camera views is known (the translation between the cameras must be different from zero), and the internal camera model is known, the 3D position of the point can be estimated. If both camera models are the same, the point depths in both camera frames (z_{C1} and z_{C2}) can be estimated by solving the following set of equations (e.g., using the least squares method):

S^{-1}\underline{p}_{P1}\, z_{C1} - R_{C2}^{C1} S^{-1}\underline{p}_{P2}\, z_{C2} = t_{C2}^{C1}   (5.57)

Example 5.6

For all the obtained point matches between the two images from Example 5.5, reconstruct the 3D positions of the world points with respect to the first and the second camera frame.

Solution

Three-dimensional positions of the points can be obtained if the system of Eq. (5.57) is solved for every pair of matched image points from Example 5.5, and the solution is inserted into the inverse transformation of Eq. (5.48). The implementation of the solution in Matlab is given in Listing 5.3. The reconstructed 3D points in the first and the second camera frames are saved to variables pC1 and pC2, respectively.

Listing 5.3

3D point reconstruction from Example 5.6

M = size(pairs, 2);
pC1 = zeros(3, M);
for i = 1:M
    a = pairs(1,i); b = pairs(2,i);
    c1 = S\pP1(:,a);
    c2 = -R*(S\pP2(:,b));
    psi = [c1, c2]\t;
    pC1(:,i) = psi(1)*c1;
end
pC2 = R.'*(pC1 - repmat(t, 1, M));

pC1 =
   -0.9989    0.9994         0   -0.2999
    0.9989    0.9994   -1.0005   -0.2999
    3.9956    3.9978    4.0018    2.9994

pC2 =
   -0.1372    0.8638   -0.4988   -1.0973
    0.7333    0.7327   -1.0013    0.1066
    5.6930    3.9635    4.3308    4.3316

The reconstruction problem simplifies in a canonical stereo configuration of two cameras, where one camera frame is only translated along the x-axis by the baseline distance b (as shown in Fig. 5.6), since the epipolar lines are parallel and the epipolar line of the point p_{P1} passes through that point (not only through the point p_{P2} in the other image). In the case of a digital image, this means that the matching point of a point in the first image lies in the same image row of the second image. For the sake of simplicity, set α_x = α_y = α and γ = 0 in the camera model (5.49). The position of a 3D world point in the first camera frame can then be obtained from the two images of the world point according to

p_{C1}^T = \frac{b}{d}\begin{bmatrix} x_{P1} - c_x & y_{P1} - c_y & \alpha f \end{bmatrix}   (5.58)

where the disparity is introduced as d = x_{P1} - x_{P2}. The disparity contains information about the scene depth, as can be seen from the last element in Eq. (5.58): z_{C1} = \alpha f b\, d^{-1}. For all the points that are in front of the camera the disparity is positive, d ≥ 0.

Fig. 5.6 Canonical stereo setup.

Example 5.7

Assume a canonical stereo camera setup with the baseline distance b = 0.2 m and the following internal camera model parameters: α_x f = α_y f = 1000, c_x = 512, and c_y = 384. Reconstruct the position of a 3D point that projects to the points p_{P1}^T = [351, 522] and p_{P2}^T = [236, 522] in the first and the second camera image, respectively.

Solution

Since this is a canonical stereo camera setup the solution can be obtained directly from Eq. (5.58) (see Listing 5.4).

Listing 5.4

3D point reconstruction from Example 5.7

% Intrinsic camera parameters and image screen size
alphaF = 1000; % alpha*f, in px/m
c = [512; 384]; % image centre, in px
S = [alphaF, 0, c(1); 0, alphaF, c(2); 0, 0, 1]; % Camera model

b = 0.2; % Stereo camera baseline, in m

% Matched points
pP1 = [351; 522];
pP2 = [236; 522];

% 3D reconstruction
d = pP1(1) - pP2(1);
pC1 = b/d*[pP1 - c; alphaF]

pC1 =
   -0.2800
    0.2400
    1.7391
