Engineering, physics, and computer sciences, in general, but particularly solid mechanics, aerodynamics, aeronautics, fluid flow, heat flow, electrostatics, quantum physics, laser technology, robotics as well as other areas have applications that require an understanding of vector calculus. This field encompasses vector differential calculus and vector integral calculus. Indeed, the engineer, physicist, and mathematician need a good grounding in these areas as provided by the carefully chosen material of Chaps. 9 and 10.
Forces, velocities, and various other quantities may be thought of as vectors. Vectors appear frequently in the applications above and also in the biological and social sciences, so it is natural that problems are modeled in 3-space. This is the space of three dimensions with the usual measurement of distance, as given by the Pythagorean theorem. Within that realm, 2-space (the plane) is a special case. Working in 3-space requires that we extend the common differential calculus to vector differential calculus, that is, the calculus that deals with vector functions and vector fields and is explained in this chapter.
Chapter 9 is arranged in three groups of sections. Sections 9.1–9.3 extend the basic algebraic operations of vectors into 3-space. These operations include the inner product and the cross product. Sections 9.4 and 9.5 form the heart of vector differential calculus. Finally, Secs. 9.7–9.9 discuss three physically important concepts related to scalar and vector fields: gradient (Sec. 9.7), divergence (Sec. 9.8), and curl (Sec. 9.9). They are expressed in Cartesian coordinates in this chapter and, if desired, expressed in curvilinear coordinates in a short section in App. A3.4.
We shall keep this chapter independent of Chaps. 7 and 8. Our present approach is in harmony with Chap. 7, with the restriction to two and three dimensions providing for a richer theory with basic physical, engineering, and geometric applications.
Prerequisite: Elementary use of second- and third-order determinants in Sec. 9.3.
Sections that may be omitted in a shorter course: 9.5, 9.6.
References and Answers to Problems: App. 1 Part B, App. 2.
In engineering, physics, mathematics, and other areas we encounter two kinds of quantities. They are scalars and vectors.
A scalar is a quantity that is determined by its magnitude. It takes on a numerical value, i.e., a number. Examples of scalars are time, temperature, length, distance, speed, density, energy, and voltage.
In contrast, a vector is a quantity that has both magnitude and direction. We can say that a vector is an arrow or a directed line segment. For example, a velocity vector has length or magnitude, which is speed, and direction, which indicates the direction of motion. Typical examples of vectors are displacement, velocity, and force; see Fig. 164 for an illustration.
More formally, we have the following. We denote vectors by lowercase boldface letters a, b, v, etc. In handwriting you may use arrows, for instance, a⃗ (in place of a), b⃗, etc.
A vector (arrow) has a tail, called its initial point, and a tip, called its terminal point. This is motivated in the translation (displacement without rotation) of the triangle in Fig. 165, where the initial point P of the vector a is the original position of a point, and the terminal point Q is the terminal position of that point, its position after the translation. The length of the arrow equals the distance between P and Q. This is called the length (or magnitude) of the vector a and is denoted by |a|. Another name for length is norm (or Euclidean norm).
A vector of length 1 is called a unit vector.
Of course, we would like to calculate with vectors. For instance, we want to find the resultant of forces or compare parallel forces of different magnitude. This motivates our next ideas: to define components of a vector, and then the two basic algebraic operations of vector addition and scalar multiplication.
For this we must first define equality of vectors in a way that is practical in connection with forces and other applications.
DEFINITION Equality of Vectors
Two vectors a and b are equal, written a = b, if they have the same length and the same direction [as explained in Fig. 166; in particular, note (B)]. Hence a vector can be arbitrarily translated; that is, its initial point can be chosen arbitrarily.
We choose an xyz Cartesian coordinate system in space (Fig. 167), that is, a usual rectangular coordinate system with the same scale of measurement on the three mutually perpendicular coordinate axes. Let a be a given vector with initial point P: (x1, y1, z1) and terminal point Q: (x2, y2, z2). Then the three coordinate differences
$$a_1 = x_2 - x_1, \qquad a_2 = y_2 - y_1, \qquad a_3 = z_2 - z_1 \tag{1}$$
are called the components of the vector a with respect to that coordinate system, and we write simply a = [a1, a2, a3]. See Fig. 168.
The length |a| of a can now readily be expressed in terms of components because from (1) and the Pythagorean theorem we have
$$|\mathbf{a}| = \sqrt{a_1^2 + a_2^2 + a_3^2}. \tag{2}$$
EXAMPLE 1 Components and Length of a Vector
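The vector a with initial point P: (4, 0, 2) and terminal point Q: (6, −1, 2) has the components
$$a_1 = 6 - 4 = 2, \qquad a_2 = -1 - 0 = -1, \qquad a_3 = 2 - 2 = 0.$$
Hence a = [2, −1, 0]. (Can you sketch a, as in Fig. 168?) Equation (2) gives the length
$$|\mathbf{a}| = \sqrt{2^2 + (-1)^2 + 0^2} = \sqrt{5}.$$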
If we choose (−1, 5, 8) as the initial point of a, the corresponding terminal point is (1, 4, 8).
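The computations of Example 1 can be sketched in a few lines of Python. The function names `components`, `length`, and `terminal_point` are illustrative choices, not notation from the text.

```python
import math

def components(p, q):
    """Components (1) of the vector with initial point p and terminal point q."""
    return [qi - pi for pi, qi in zip(p, q)]

def length(a):
    """Length (2) of a vector given by its components."""
    return math.sqrt(sum(ai * ai for ai in a))

def terminal_point(p, a):
    """Terminal point of the vector a when its initial point is p."""
    return [pi + ai for pi, ai in zip(p, a)]

a = components([4, 0, 2], [6, -1, 2])
print(a)                               # [2, -1, 0]
print(length(a))                       # sqrt(5), about 2.236
print(terminal_point([-1, 5, 8], a))   # [1, 4, 8]
```

Translating the initial point changes the terminal point but not the components, which is the content of the equality definition above.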
If we choose the origin (0, 0, 0) as the initial point of a, the corresponding terminal point is (2, −1, 0); its coordinates equal the components of a. This suggests that we can determine each point in space by a vector, called the position vector of the point, as follows.
A Cartesian coordinate system being given, the position vector r of a point A: (x, y, z) is the vector with the origin (0, 0, 0) as the initial point and A as the terminal point (see Fig. 169). Thus in components, r = [x, y, z]. This can be seen directly from (1) with x1 = y1 = z1 = 0.
Furthermore, if we translate a vector a, with initial point P and terminal point Q, then corresponding coordinates of P and Q change by the same amount, so that the differences in (1) remain unchanged. This proves
THEOREM 1 Vectors as Ordered Triples of Real Numbers
A fixed Cartesian coordinate system being given, each vector is uniquely determined by its ordered triple of corresponding components. Conversely, to each ordered triple of real numbers (a1, a2, a3) there corresponds precisely one vector a = [a1, a2, a3], with (0, 0, 0) corresponding to the zero vector 0, which has length 0 and no direction.
Hence a vector equation a = b is equivalent to the three equations a1 = b1, a2 = b2, a3 = b3 for the components.
We now see that from our “geometric” definition of a vector as an arrow we have arrived at an “algebraic” characterization of a vector by Theorem 1. We could have started from the latter and reversed our process. This shows that the two approaches are equivalent.
Calculations with vectors are very useful and are almost as simple as the arithmetic for real numbers. Vector arithmetic follows almost naturally from applications. We first define how to add vectors and later on how to multiply a vector by a number.
DEFINITION Addition of Vectors
The sum a + b of two vectors a = [a1, a2, a3] and b = [b1, b2, b3] is obtained by adding the corresponding components,
$$\mathbf{a} + \mathbf{b} = [a_1 + b_1,\; a_2 + b_2,\; a_3 + b_3]. \tag{3}$$
Geometrically, place the vectors as in Fig. 170 (the initial point of b at the terminal point of a); then a + b is the vector drawn from the initial point of a to the terminal point of b.
For forces, this addition is the parallelogram law by which we obtain the resultant of two forces in mechanics. See Fig. 171.
Figure 172 shows (for the plane) that the “algebraic” way and the “geometric way” of vector addition give the same vector.
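Basic Properties of Vector Addition. Familiar laws for real numbers give immediately
$$\begin{aligned}
\text{(a)}\quad & \mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a} && \text{(Commutativity)} \\
\text{(b)}\quad & (\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}) && \text{(Associativity)} \\
\text{(c)}\quad & \mathbf{a} + \mathbf{0} = \mathbf{0} + \mathbf{a} = \mathbf{a} \\
\text{(d)}\quad & \mathbf{a} + (-\mathbf{a}) = \mathbf{0}.
\end{aligned} \tag{4}$$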
Properties (a) and (b) are verified geometrically in Figs. 173 and 174. Furthermore, −a denotes the vector having the length |a| and the direction opposite to that of a.
In (4b) we may simply write u + v + w, and similarly for sums of more than three vectors. Instead of a + a we also write 2a, and so on. This (and the notation −a used just before) motivates defining the second algebraic operation for vectors as follows.
DEFINITION Scalar Multiplication (Multiplication by a Number)
The product ca of any vector a = [a1, a2, a3] and any scalar c (real number c) is the vector obtained by multiplying each component of a by c,
$$c\mathbf{a} = [ca_1,\; ca_2,\; ca_3]. \tag{5}$$
Geometrically, if a ≠ 0, then ca with c > 0 has the direction of a and with c < 0 the direction opposite to a. In any case, the length of ca is |ca| = |c||a|, and ca = 0 if a = 0 or c = 0 (or both). (See Fig. 175.)
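Basic Properties of Scalar Multiplication. From the definitions we obtain directly
$$\begin{aligned}
\text{(a)}\quad & c(\mathbf{a} + \mathbf{b}) = c\mathbf{a} + c\mathbf{b} \\
\text{(b)}\quad & (c + k)\mathbf{a} = c\mathbf{a} + k\mathbf{a} \\
\text{(c)}\quad & c(k\mathbf{a}) = (ck)\mathbf{a} \quad (\text{written } ck\mathbf{a}) \\
\text{(d)}\quad & 1\mathbf{a} = \mathbf{a}.
\end{aligned} \tag{6}$$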
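You may prove that (4) and (6) imply for any vector a
$$\text{(a)}\quad 0\mathbf{a} = \mathbf{0}, \qquad \text{(b)}\quad (-1)\mathbf{a} = -\mathbf{a}. \tag{7}$$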
Instead of b + (−a) we simply write b − a (Fig. 176).
EXAMPLE 2 Vector Addition. Multiplication by Scalars
With respect to a given coordinate system, let a = [4, 0, 1].
Then −a = [−4, 0, −1] and 7a = [28, 0, 7].
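Definitions (3) and (5) translate directly into componentwise code. The following is a minimal sketch; the helper names `add`, `scale`, and `sub` are illustrative, and the second vector `b` is a hypothetical value chosen only to exercise the operations (it is not taken from the example).

```python
def add(a, b):
    """Vector addition: add corresponding components."""
    return [x + y for x, y in zip(a, b)]

def scale(c, a):
    """Scalar multiplication: multiply each component by c."""
    return [c * x for x in a]

def sub(b, a):
    """Difference b - a, defined as b + (-1)a."""
    return add(b, scale(-1, a))

a = [4, 0, 1]    # the vector a of Example 2, recovered from -a and 7a
b = [1, 2, 3]    # hypothetical second vector, for illustration only

print(scale(-1, a))   # [-4, 0, -1]
print(scale(7, a))    # [28, 0, 7]
print(add(a, b))      # [5, 2, 4]
print(sub(b, a))      # [-3, 2, 2]
```

With integer components the commutative and distributive laws can be checked by exact list comparison, for example `add(a, b) == add(b, a)`.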
Unit Vectors i, j, k. Besides a = [a1, a2, a3] another popular way of writing vectors is
$$\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}. \tag{8}$$
In this representation, i, j, k are the unit vectors in the positive directions of the axes of a Cartesian coordinate system (Fig. 177). Hence, in components,
$$\mathbf{i} = [1, 0, 0], \qquad \mathbf{j} = [0, 1, 0], \qquad \mathbf{k} = [0, 0, 1] \tag{9}$$
and the right side of (8) is a sum of three vectors parallel to the three axes.
EXAMPLE 3 ijk Notation for Vectors
In Example 2 we have a = 4i + k, and so on.
All the vectors a = [a1, a2, a3] = a1i + a2j + a3k (with real numbers as components) form the real vector space R3 with the two algebraic operations of vector addition and scalar multiplication as just defined. R3 has dimension 3. The triple of vectors i, j, k is called a standard basis of R3. Given a Cartesian coordinate system, the representation (8) of a given vector is unique.
Vector space R3 is a model of a general vector space, as discussed in Sec. 7.9, but that general concept is not needed in this chapter.
1–5 COMPONENTS AND LENGTH
Find the components of the vector v with initial point P and terminal point Q. Find |v|. Sketch v. Find the unit vector u in the direction of v.
6–10 Find the terminal point Q of the vector v with components as given and initial point P. Find |v|.
11–18 ADDITION, SCALAR MULTIPLICATION
Let a = [3, 2, 0] = 3i + 2j; b = [−4, 6, 0] = −4i + 6j, c = [5, −1, 8] = 5i − j + 8k, d = [0, 0, 4] = 4k.
Find:
21–25 FORCES, RESULTANT
Find the resultant in terms of components and its magnitude.
26–37 FORCES, VELOCITIES
(a) The diagonals of a parallelogram bisect each other.
(b) The line through the midpoints of adjacent sides of a parallelogram divides one of the diagonals in the ratio 1 : 3.
(c) Obtain (b) from (a).
(d) The three medians of a triangle (the segments from a vertex to the midpoint of the opposite side) meet at a single point, which divides the medians in the ratio 2 : 1.
(e) The quadrilateral whose vertices are the midpoints of the sides of an arbitrary quadrilateral is a parallelogram.
(f) The four space diagonals of a parallelepiped meet and bisect each other.
(g) The sum of the vectors drawn from the center of a regular polygon to its vertices is the zero vector.
The inner product or dot product can be motivated by calculating work done by a constant force, determining components of forces, or other applications. It involves the length of vectors and the angle between them. The inner product is a kind of multiplication of two vectors, defined in such a way that the outcome is a scalar. Indeed, another term for inner product is scalar product, a term we shall not use here. The definition of the inner product is as follows.
DEFINITION Inner Product (Dot Product) of Vectors
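The inner product or dot product a • b (read “a dot b”) of two vectors a and b is the product of their lengths times the cosine of their angle (see Fig. 178),
$$\begin{aligned}
\mathbf{a} \bullet \mathbf{b} &= |\mathbf{a}||\mathbf{b}| \cos\gamma && \text{if } \mathbf{a} \neq \mathbf{0},\ \mathbf{b} \neq \mathbf{0} \\
\mathbf{a} \bullet \mathbf{b} &= 0 && \text{if } \mathbf{a} = \mathbf{0} \text{ or } \mathbf{b} = \mathbf{0}.
\end{aligned} \tag{1}$$
The angle γ, 0 ≤ γ ≤ π, between a and b is measured when the initial points of the vectors coincide, as in Fig. 178. In components, a = [a1, a2, a3], b = [b1, b2, b3], and
$$\mathbf{a} \bullet \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3. \tag{2}$$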
The second line in (1) is needed because γ is undefined when a = 0 or b = 0. The derivation of (2) from (1) is shown below.
Orthogonality. Since the cosine in (1) may be positive, 0, or negative, so may be the inner product (Fig. 178). The case that the inner product is zero is of particular practical interest and suggests the following concept.
A vector a is called orthogonal to a vector b if a • b = 0. Then b is also orthogonal to a, and we call a and b orthogonal vectors. Clearly, this happens for nonzero vectors if and only if cos γ = 0; thus γ = π/2 (90°). This proves the important
THEOREM 1 Orthogonality Criterion
The inner product of two nonzero vectors is 0 if and only if these vectors are perpendicular.
Length and Angle. Equation (1) with b = a gives a • a = |a|². Hence
$$|\mathbf{a}| = \sqrt{\mathbf{a} \bullet \mathbf{a}}. \tag{3}$$
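From (3) and (1) we obtain for the angle γ between two nonzero vectors
$$\cos\gamma = \frac{\mathbf{a} \bullet \mathbf{b}}{|\mathbf{a}||\mathbf{b}|} = \frac{\mathbf{a} \bullet \mathbf{b}}{\sqrt{\mathbf{a} \bullet \mathbf{a}}\,\sqrt{\mathbf{b} \bullet \mathbf{b}}}. \tag{4}$$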
EXAMPLE 1 Inner Product. Angle Between Vectors
Find the inner product and the lengths of a = [1, 2, 0] and b = [3, −2, 1] as well as the angle between these vectors.
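Solution. a • b = 1 · 3 + 2 · (−2) + 0 · 1 = −1, |a| = √5, |b| = √14, and (4) gives the angle
$$\gamma = \arccos\frac{\mathbf{a} \bullet \mathbf{b}}{|\mathbf{a}||\mathbf{b}|} = \arccos\left(-\frac{1}{\sqrt{70}}\right) \approx \arccos(-0.11952) \approx 1.69061 \approx 96.865^\circ.$$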
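As a numerical check of Example 1, formulas (2), (3), and (4) can be coded directly. This is a minimal sketch using only the standard `math` module; the function names `dot`, `norm`, and `angle` are illustrative.

```python
import math

def dot(a, b):
    """Inner product in components, formula (2)."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Length from the inner product, formula (3)."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle between two nonzero vectors in radians, formula (4)."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

a = [1, 2, 0]
b = [3, -2, 1]
gamma = angle(a, b)
print(dot(a, b))                       # -1
print(round(gamma, 5))                 # 1.69061
print(round(math.degrees(gamma), 3))   # 96.865
```

The negative inner product corresponds to an angle larger than 90°, in line with the remark on the sign of the cosine under Orthogonality.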
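From the definition we see that the inner product has the following properties. For any vectors a, b, c and scalars q1, q2,
$$\begin{aligned}
\text{(a)}\quad & (q_1\mathbf{a} + q_2\mathbf{b}) \bullet \mathbf{c} = q_1\,\mathbf{a} \bullet \mathbf{c} + q_2\,\mathbf{b} \bullet \mathbf{c} && \text{(Linearity)} \\
\text{(b)}\quad & \mathbf{a} \bullet \mathbf{b} = \mathbf{b} \bullet \mathbf{a} && \text{(Symmetry)} \\
\text{(c)}\quad & \mathbf{a} \bullet \mathbf{a} \geq 0, \quad \mathbf{a} \bullet \mathbf{a} = 0 \text{ if and only if } \mathbf{a} = \mathbf{0} && \text{(Positive-definiteness)}.
\end{aligned} \tag{5}$$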
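Hence dot multiplication is commutative as shown by (5b). Furthermore, it is distributive with respect to vector addition. This follows from (5a) with q1 = 1 and q2 = 1:
$$(\mathbf{a} + \mathbf{b}) \bullet \mathbf{c} = \mathbf{a} \bullet \mathbf{c} + \mathbf{b} \bullet \mathbf{c} \qquad \text{(Distributivity)}. \tag{5a*}$$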
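Furthermore, from (1) and |cos γ| ≤ 1 we see that
$$|\mathbf{a} \bullet \mathbf{b}| \leq |\mathbf{a}||\mathbf{b}| \qquad \text{(Cauchy–Schwarz inequality)}. \tag{6}$$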
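Using this and (3), you may prove (see Prob. 16)
$$|\mathbf{a} + \mathbf{b}| \leq |\mathbf{a}| + |\mathbf{b}| \qquad \text{(Triangle inequality)}. \tag{7}$$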
Geometrically, (7) with < says that one side of a triangle must be shorter than the other two sides together; this motivates the name of (7).
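A simple direct calculation with inner products shows that
$$|\mathbf{a} + \mathbf{b}|^2 + |\mathbf{a} - \mathbf{b}|^2 = 2\left(|\mathbf{a}|^2 + |\mathbf{b}|^2\right) \qquad \text{(Parallelogram equality)}. \tag{8}$$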
Equations (6)–(8) play a basic role in so-called Hilbert spaces, which are abstract inner product spaces. Hilbert spaces form the basis of quantum mechanics; for details see [GenRef7] listed in App. 1.
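Derivation of (2) from (1). We write a = a1i + a2j + a3k and b = b1i + b2j + b3k, as in (8) of Sec. 9.1. If we substitute this into a • b and use (5a*), we first have a sum of 3 × 3 = 9 products
$$\mathbf{a} \bullet \mathbf{b} = a_1 b_1\,\mathbf{i} \bullet \mathbf{i} + a_1 b_2\,\mathbf{i} \bullet \mathbf{j} + \cdots + a_3 b_3\,\mathbf{k} \bullet \mathbf{k}.$$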
Now i, j, k are unit vectors, so that i • i = j • j = k • k = 1 by (3). Since the coordinate axes are perpendicular, so are i, j, k, and Theorem 1 implies that the other six of those nine products are 0, namely, i • j = j • i = j • k = k • j = k • i = i • k = 0. But this reduces our sum for a • b to (2).
Typical applications of inner products are shown in the following examples and in Problem Set 9.2.
EXAMPLE 2 Work Done by a Force Expressed as an Inner Product
This is a major application. It concerns a body on which a constant force p acts. (For a variable force, see Sec. 10.1.) Let the body be given a displacement d. Then the work done by p in the displacement is defined as
$$W = |\mathbf{p}||\mathbf{d}| \cos\alpha = \mathbf{p} \bullet \mathbf{d}, \tag{9}$$
that is, magnitude |p| of the force times length |d| of the displacement times the cosine of the angle α between p and d (Fig. 179). If α < 90°, as in Fig. 179, then W > 0. If p and d are orthogonal, then the work is zero (why?). If α > 90°, then W < 0, which means that in the displacement one has to do work against the force. For example, think of swimming across a river at some angle α against the current.
EXAMPLE 3 Component of a Force in a Given Direction
What force in the rope in Fig. 180 will hold a car of 5000 lb in equilibrium if the ramp makes an angle of 25° with the horizontal?
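Solution. Introducing coordinates as shown, the weight is a = [0, −5000] because this force points downward, in the negative y-direction. We have to represent a as a sum (resultant) of two forces, a = c + p, where c is the force the car exerts on the ramp, which is of no interest to us, and p is parallel to the rope. A vector in the direction of the rope is (see Fig. 180)
$$\mathbf{b} = [-1, \tan 25^\circ] = [-1, 0.46631], \qquad \text{thus} \qquad |\mathbf{b}| = 1.10338.$$
The direction of the unit vector u is opposite to the direction of the rope so that
$$\mathbf{u} = -\frac{1}{|\mathbf{b}|}\mathbf{b} = [0.90631, -0.42262].$$
Since |u| = 1 and cos γ > 0, we see that we can write our result as
$$|\mathbf{p}| = (|\mathbf{a}| \cos\gamma)|\mathbf{u}| = \mathbf{a} \bullet \mathbf{u} = -\frac{\mathbf{a} \bullet \mathbf{b}}{|\mathbf{b}|} = \frac{5000 \cdot 0.46631}{1.10338} = 2113 \ [\text{lb}].$$
We can also note that γ = 90° − 25° = 65° is the angle between a and p so that
$$|\mathbf{p}| = |\mathbf{a}| \cos\gamma = 5000 \cos 65^\circ = 2113 \ [\text{lb}].$$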
Answer: About 2100 lb.
Example 3 is typical of applications that deal with the component or projection of a vector a in the direction of a vector b (≠ 0). If we denote by p the length of the orthogonal projection of a on a straight line l parallel to b as shown in Fig. 181, then
$$p = |\mathbf{a}| \cos\gamma. \tag{10}$$
Here p is taken with the plus sign if pb has the direction of b and with the minus sign if pb has the direction opposite to b.
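Multiplying (10) by |b|/|b| = 1, we have a • b in the numerator and thus
$$p = \frac{\mathbf{a} \bullet \mathbf{b}}{|\mathbf{b}|} \qquad (\mathbf{b} \neq \mathbf{0}). \tag{11}$$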
If b is a unit vector, as it is often used for fixing a direction, then (11) simply gives
$$p = \mathbf{a} \bullet \mathbf{b} \qquad (|\mathbf{b}| = 1). \tag{12}$$
Figure 182 shows the projection p of a in the direction of b (as in Fig. 181) and the projection q = |b| cos γ of b in the direction of a.
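The component formula can be tried on the car of Example 3. The sketch below assumes that the ramp rises toward the negative x-direction, so that the unit vector pointing down the ramp is [cos 25°, −sin 25°]; this orientation, and the names `component` and `down_ramp`, are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def component(a, b):
    """Component p = (a . b) / |b| of a in the direction of b, with b nonzero."""
    return dot(a, b) / norm(b)

weight = [0.0, -5000.0]                          # weight of the car in lb
theta = math.radians(25.0)                       # inclination of the ramp
down_ramp = [math.cos(theta), -math.sin(theta)]  # assumed orientation of the ramp

p = component(weight, down_ramp)
print(round(p))   # 2113
```

The sign convention stated after (10) also shows up numerically: for a = [1, 2, 0] and b = [3, −2, 1] from Example 1, `component(a, b)` is negative because the angle between the vectors exceeds 90°.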
By definition, an orthonormal basis for 3-space is a basis {a, b, c} consisting of orthogonal unit vectors. It has the great advantage that the determination of the coefficients in representations v = l1a + l2b + l3c of a given vector v is very simple. We claim that l1 = a • v, l2 = b • v, l3 = c • v. Indeed, this follows simply by taking the inner products of the representation with a, b, c, respectively, and using the orthonormality of the basis, a • v = l1a • a + l2a • b + l3a • c = l1, etc.
For example, the unit vectors i, j, k in (8), Sec. 9.1, associated with a Cartesian coordinate system form an orthonormal basis, called the standard basis with respect to the given coordinate system.
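The coefficient rule l1 = a • v, l2 = b • v, l3 = c • v can be checked numerically. In this sketch the orthonormal basis (the standard basis rotated by 45° about the z-axis) and the vector v are hypothetical values chosen for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

s = 1 / math.sqrt(2)
a = [s, s, 0.0]      # hypothetical orthonormal basis:
b = [-s, s, 0.0]     # i and j rotated by 45 degrees about the z-axis,
c = [0.0, 0.0, 1.0]  # k unchanged
v = [3.0, 1.0, 2.0]  # hypothetical vector to be represented

# Coefficients are obtained by inner products alone; no linear system is solved.
l1, l2, l3 = dot(a, v), dot(b, v), dot(c, v)

# Rebuild v from its representation l1*a + l2*b + l3*c.
rebuilt = [l1 * a[n] + l2 * b[n] + l3 * c[n] for n in range(3)]
print(l1, l2, l3)
print(rebuilt)
```

For a basis that is not orthonormal the same coefficients would require solving a 3 × 3 linear system, which is the advantage referred to in the text.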
EXAMPLE 5 Orthogonal Straight Lines in the Plane
Find the straight line L1 through the point P: (1, 3) in the xy-plane and perpendicular to the straight line L2: x − 2y + 2 = 0; see Fig. 183.
Solution. The idea is to write a general straight line L1: a1x + a2y = c as a • r = c with a = [a1, a2] ≠ 0 and r = [x, y], according to (2). Now the line L1* through the origin and parallel to L1 is a • r = 0. Hence, by Theorem 1, the vector a is perpendicular to r. Hence it is perpendicular to L1* and also to L1 because L1 and L1* are parallel. a is called a normal vector of L1 (and of L1*).
Now a normal vector of the given line x − 2y + 2 = 0 is b = [1, −2]. Thus L1 is perpendicular to L2 if b • a = a1 − 2a2 = 0, for instance, if a = [2, 1]. Hence L1 is given by 2x + y = c. It passes through P: (1, 3) when 2 · 1 + 3 = c = 5. Answer: y = −2x + 5. Show that the point of intersection is (x, y) = (1.6, 1.8).
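The result of Example 5 can be verified by checking that the normal vectors are orthogonal and by solving the two line equations simultaneously. The helper `intersect`, which applies Cramer's rule to a 2 × 2 system, is an illustrative name and not part of the text.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def intersect(a, c1, b, c2):
    """Intersection of the lines a . r = c1 and b . r = c2 (Cramer's rule)."""
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("lines are parallel")
    x = (c1 * b[1] - a[1] * c2) / det
    y = (a[0] * c2 - c1 * b[0]) / det
    return x, y

a = [2, 1]     # normal vector of L1: 2x + y = 5
b = [1, -2]    # normal vector of L2: x - 2y = -2

print(dot(a, b))                # 0, so the lines are perpendicular
print(intersect(a, 5, b, -2))   # (1.6, 1.8)
```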
EXAMPLE 6 Normal Vector to a Plane
Find a unit vector perpendicular to the plane 4x + 2y + 4z = −7.
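Solution. Using (2), we may write any plane in space as
$$\mathbf{a} \bullet \mathbf{r} = a_1 x + a_2 y + a_3 z = c \tag{13}$$
where a = [a1, a2, a3] ≠ 0 and r = [x, y, z]. The unit vector in the direction of a is (Fig. 184)
$$\mathbf{n} = \frac{1}{|\mathbf{a}|}\,\mathbf{a}.$$
Dividing by |a|, we obtain from (13)
$$\mathbf{n} \bullet \mathbf{r} = p \qquad \text{where} \qquad p = \frac{c}{|\mathbf{a}|}. \tag{14}$$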
From (12) we see that p is the projection of r in the direction of n. This projection has the same constant value c/|a| for the position vector r of any point in the plane. Clearly this holds if and only if n is perpendicular to the plane. n is called a unit normal vector of the plane (the other being −n).
Furthermore, from this and the definition of projection, it follows that |p| is the distance of the plane from the origin. Representation (14) is called Hesse's normal form of a plane. In our case, a = [4, 2, 4], c = −7, |a| = 6, n = (1/6)a = [2/3, 1/3, 2/3], and the plane has the distance 7/6 from the origin.
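The numbers of Example 6 follow from (13) and (14) in a few lines. This is a minimal sketch; the function name `hesse_normal_form` is an illustrative choice.

```python
import math

def hesse_normal_form(a, c):
    """Return (n, p) with n . r = p for the plane a . r = c, where a is nonzero."""
    length = math.sqrt(sum(x * x for x in a))
    n = [x / length for x in a]
    return n, c / length

n, p = hesse_normal_form([4, 2, 4], -7)
print(n)        # approximately [2/3, 1/3, 2/3]
print(abs(p))   # distance of the plane from the origin, 7/6

# Any point of the plane has the same projection p on n, for example (0, 0, -7/4).
point = [0.0, 0.0, -7 / 4]
print(sum(ni * ri for ni, ri in zip(n, point)))
```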
1–10 INNER PRODUCT
Let a = [1, −3, 5], b = [4, 0, 8], c = [−2, 9, 1]. Find:
11–16 GENERAL PROBLEMS
17–20 WORK
Find the work done by a force p acting on a body if the body is displaced along the straight segment AB from A to B. Sketch AB and p. Show the details.
22–30 ANGLE BETWEEN VECTORS
Let a = [1, 1, 0], b = [3, 2, 1], and c = [1, 0, 2]. Find the angle between:
31–35 ORTHOGONALITY is particularly important, mainly because of orthogonal coordinates, such as Cartesian coordinates, whose natural basis [Eq. (9), Sec. 9.1] consists of three orthogonal unit vectors.
36–40 COMPONENT IN THE DIRECTION OF A VECTOR
Find the component of a in the direction of b. Make a sketch.
We shall define another form of multiplication of vectors, inspired by applications, whose result will be a vector. This is in contrast to the dot product of Sec. 9.2 where multiplication resulted in a scalar. We can construct a vector v that is perpendicular to two vectors a and b, which are two sides of a parallelogram on a plane in space as indicated in Fig. 185, such that the length |v| is numerically equal to the area of that parallelogram. Here then is the new concept.
DEFINITION Vector Product (Cross Product, Outer Product) of Vectors
The vector product or cross product a × b (read “a cross b”) of two vectors a and b is the vector v denoted by
$$\mathbf{v} = \mathbf{a} \times \mathbf{b}.$$
I. If a = 0 or b = 0, then we define v = a × b = 0.
II. If both vectors are nonzero vectors, then vector v has the length
$$|\mathbf{v}| = |\mathbf{a} \times \mathbf{b}| = |\mathbf{a}||\mathbf{b}| \sin\gamma, \tag{1}$$
where γ is the angle between a and b as in Sec. 9.2.
Furthermore, by design, a and b form the sides of a parallelogram on a plane in space. The parallelogram is shaded in blue in Fig. 185. The area of this blue parallelogram is precisely given by Eq. (1), so that the length |v| of the vector v is equal to the area of that parallelogram.
III. If a and b lie in the same straight line, i.e., a and b have the same or opposite directions, then γ is 0° or 180° so that sin γ = 0. In that case |v| = 0 so that v = a × b = 0.
IV. If cases I and III do not occur, then v is a nonzero vector. The direction of v = a × b is perpendicular to both a and b such that a, b, v—precisely in this order (!)—form a right-handed triple as shown in Figs. 185–187 and explained below.
Another term for vector product is outer product.
Remark. Note that I and III completely characterize the exceptional case when the cross product is equal to the zero vector, and II and IV the regular case where the cross product is perpendicular to two vectors.
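Just as we did with the dot product, we would also like to express the cross product in components. Let a = [a1, a2, a3] and b = [b1, b2, b3]. Then v = [υ1, υ2, υ3] = a × b has the components
$$\upsilon_1 = a_2 b_3 - a_3 b_2, \qquad \upsilon_2 = a_3 b_1 - a_1 b_3, \qquad \upsilon_3 = a_1 b_2 - a_2 b_1. \tag{2}$$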
Here the Cartesian coordinate system is right-handed, as explained below (see also Fig. 188). (For a left-handed system, each component of v must be multiplied by −1. Derivation of (2) in App. 4.)
Right-Handed Triple. A triple of vectors a, b, v is right-handed if the vectors in the given order assume the same sort of orientation as the thumb, index finger, and middle finger of the right hand when these are held as in Fig. 186. We may also say that if a is rotated into the direction of b through the angle γ(<π), then v advances in the same direction as a right-handed screw would if turned in the same way (Fig. 187).
Right-Handed Cartesian Coordinate System. The system is called right-handed if the corresponding unit vectors i, j, k in the positive directions of the axes (see Sec. 9.1) form a right-handed triple as in Fig. 188a. The system is called left-handed if the sense of k is reversed, as in Fig. 188b. In applications, we prefer right-handed systems.
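How to Memorize (2). If you know second- and third-order determinants, you see that (2) can be written
$$\upsilon_1 = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix}, \qquad \upsilon_2 = -\begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} = \begin{vmatrix} a_3 & a_1 \\ b_3 & b_1 \end{vmatrix}, \qquad \upsilon_3 = \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \tag{2*}$$
and v = [υ1, υ2, υ3] = υ1i + υ2j + υ3k is the expansion of the following symbolic determinant by its first row. (We call the determinant “symbolic” because the first row consists of vectors rather than of numbers.)
$$\mathbf{v} = \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix}\mathbf{i} - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix}\mathbf{j} + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}\mathbf{k}. \tag{2**}$$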
For a left-handed system the determinant has a minus sign in front.
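For the vector product v = a × b of a = [1, 1, 0] and b = [3, 0, 0] in right-handed coordinates we obtain from (2)
$$\upsilon_1 = 0, \qquad \upsilon_2 = 0, \qquad \upsilon_3 = 1 \cdot 0 - 1 \cdot 3 = -3.$$
We confirm this by (2**):
$$\mathbf{v} = \mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 1 & 0 \\ 3 & 0 & 0 \end{vmatrix} = \begin{vmatrix} 1 & 0 \\ 0 & 0 \end{vmatrix}\mathbf{i} - \begin{vmatrix} 1 & 0 \\ 3 & 0 \end{vmatrix}\mathbf{j} + \begin{vmatrix} 1 & 1 \\ 3 & 0 \end{vmatrix}\mathbf{k} = -3\mathbf{k} = [0, 0, -3].$$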
To check the result in this simple case, sketch a, b, and v. Can you see that two vectors in the xy-plane must always have their vector product parallel to the z-axis (or equal to the zero vector)?
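The component formula for the cross product in right-handed coordinates is easy to code and reproduces the result above. This is a minimal sketch; the function names `cross` and `dot` are illustrative.

```python
def cross(a, b):
    """Vector product a x b in right-handed Cartesian coordinates."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a = [1, 1, 0]
b = [3, 0, 0]
v = cross(a, b)
print(v)                      # [0, 0, -3]
print(dot(v, a), dot(v, b))   # 0 0, so v is perpendicular to both a and b
```

Because the third components of a and b are zero, only the third component of v can be nonzero, which answers the question about vectors in the xy-plane.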
THEOREM 1 General Properties of Vector Products
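(a) For every scalar l,
$$(l\mathbf{a}) \times \mathbf{b} = l(\mathbf{a} \times \mathbf{b}) = \mathbf{a} \times (l\mathbf{b}). \tag{4}$$
(b) Cross multiplication is distributive with respect to vector addition; that is,
$$\begin{aligned}
(\alpha)\quad & \mathbf{a} \times (\mathbf{b} + \mathbf{c}) = (\mathbf{a} \times \mathbf{b}) + (\mathbf{a} \times \mathbf{c}), \\
(\beta)\quad & (\mathbf{a} + \mathbf{b}) \times \mathbf{c} = (\mathbf{a} \times \mathbf{c}) + (\mathbf{b} \times \mathbf{c}).
\end{aligned} \tag{5}$$
(c) Cross multiplication is not commutative but anticommutative; that is,
$$\mathbf{b} \times \mathbf{a} = -(\mathbf{a} \times \mathbf{b}). \tag{6}$$
(d) Cross multiplication is not associative; that is, in general,
$$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) \neq (\mathbf{a} \times \mathbf{b}) \times \mathbf{c} \tag{7}$$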
so that the parentheses cannot be omitted.
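Equation (4) follows directly from the definition. In (5α), formula (2*) gives for the first component on the left
$$\begin{vmatrix} a_2 & a_3 \\ b_2 + c_2 & b_3 + c_3 \end{vmatrix} = a_2(b_3 + c_3) - a_3(b_2 + c_2) = (a_2 b_3 - a_3 b_2) + (a_2 c_3 - a_3 c_2) = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} + \begin{vmatrix} a_2 & a_3 \\ c_2 & c_3 \end{vmatrix}.$$
By (2*) the sum of the two determinants is the first component of (a × b) + (a × c), the right side of (5α). For the other components in (5α) and in (5β), equality follows by the same idea.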
Anticommutativity (6) follows from (2**) by noting that the interchange of Rows 2 and 3 multiplies the determinant by −1. We can confirm this geometrically if we set a × b = v and b × a = w; then |v| = |w| by (1), and for b, a, w to form a right-handed triple, we must have w = −v.
Finally, i ×(i × j) = i × k = −j, whereas (i × i) × j = 0 × j = 0 (see Example 2). This proves (7).
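The properties just proved can be spot-checked numerically. The sketch below uses the vectors a, b, c given in Problem Set 9.3 together with the unit vectors i, j, k; a numerical check on particular vectors illustrates the statements but does not replace the proof.

```python
def cross(a, b):
    """Vector product a x b in right-handed Cartesian coordinates."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def neg(a):
    return [-x for x in a]

i, j, k = [1, 0, 0], [0, 1, 0], [0, 0, 1]
a, b, c = [2, 1, 0], [-3, 2, 0], [1, 4, -2]

# Anticommutativity: b x a = -(a x b)
print(cross(b, a), neg(cross(a, b)))                          # [0, 0, -7] [0, 0, -7]
# Distributivity: a x (b + c) = (a x b) + (a x c)
print(cross(a, add(b, c)), add(cross(a, b), cross(a, c)))     # [-2, 4, 14] [-2, 4, 14]
# Non-associativity: i x (i x j) = -j, but (i x i) x j = 0
print(cross(i, cross(i, j)), cross(cross(i, i), j))           # [0, -1, 0] [0, 0, 0]
```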
In mechanics the moment m of a force p about a point Q is defined as the product m = |p|d, where d is the (perpendicular) distance between Q and the line of action L of p (Fig. 190). If r is the vector from Q to any point A on L, then d = |r| sin γ, as shown in Fig. 190, and
$$m = |\mathbf{r}||\mathbf{p}| \sin\gamma.$$
Since γ is the angle between r and p, we see from (1) that m = |r × p|. The vector
$$\mathbf{m} = \mathbf{r} \times \mathbf{p} \tag{8}$$
is called the moment vector or vector moment of p about Q. Its magnitude is m. If m ≠ 0, its direction is that of the axis of the rotation about Q that p has the tendency to produce. This axis is perpendicular to both r and p.
Find the moment of the force p about the center Q of a wheel, as given in Fig. 191.
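Solution. Introducing coordinates as shown in Fig. 191, we have
$$\mathbf{p} = [1000 \cos 30^\circ,\; 1000 \sin 30^\circ,\; 0] = [866, 500, 0], \qquad \mathbf{r} = [0, 1.5, 0].$$
(Note that the center of the wheel is at y = −1.5 on the y-axis.) Hence (8) and (2**) give
$$\mathbf{m} = \mathbf{r} \times \mathbf{p} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 1.5 & 0 \\ 866 & 500 & 0 \end{vmatrix} = 0\mathbf{i} - 0\mathbf{j} + \begin{vmatrix} 0 & 1.5 \\ 866 & 500 \end{vmatrix}\mathbf{k} = [0, 0, -1299].$$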
This moment vector m is normal, i.e., perpendicular to the plane of the wheel. Hence it has the direction of the axis of rotation about the center Q of the wheel that the force p has the tendency to produce. The moment m points in the negative z-direction. This is the direction in which a right-handed screw would advance if turned in that way.
EXAMPLE 5 Velocity of a Rotating Body
A rotation of a rigid body B in space can be simply and uniquely described by a vector w as follows. The direction of w is that of the axis of rotation and such that the rotation appears clockwise if one looks from the initial point of w to its terminal point. The length of w is equal to the angular speed ω (> 0) of the rotation, that is, the linear (or tangential) speed of a point of B divided by its distance from the axis of rotation.
Let P be any point of B and d its distance from the axis. Then P has the speed ωd. Let r be the position vector of P referred to a coordinate system with origin 0 on the axis of rotation. Then d = |r| sin γ, where γ is the angle between w and r. Therefore,
$$\omega d = |\mathbf{w}||\mathbf{r}| \sin\gamma = |\mathbf{w} \times \mathbf{r}|.$$
From this and the definition of vector product we see that the velocity vector v of P can be represented in the form (Fig. 192)
$$\mathbf{v} = \mathbf{w} \times \mathbf{r}. \tag{9}$$
This simple formula is useful for determining v at any point of B.
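A short numerical illustration of v = w × r follows. The angular speed ω = 2 about the z-axis and the point P: (3, 4, 5) are hypothetical values chosen so that the distance of P from the axis is d = 5.

```python
import math

def cross(a, b):
    """Vector product a x b in right-handed Cartesian coordinates."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

omega = 2                 # hypothetical angular speed
w = [0, 0, omega]         # rotation vector along the z-axis
r = [3, 4, 5]             # hypothetical position vector of a point P of the body

v = cross(w, r)
speed = math.sqrt(sum(x * x for x in v))
d = math.hypot(r[0], r[1])   # distance of P from the z-axis

print(v)                  # [-8, 6, 0]
print(speed, omega * d)   # 10.0 10.0
```

The velocity has no component along the axis and is perpendicular to r, as the definition of the vector product requires.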
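Certain products of vectors, having three or more factors, occur in applications. The most important of these products is the scalar triple product or mixed product of three vectors a, b, c.
$$(\mathbf{a}\ \ \mathbf{b}\ \ \mathbf{c}) = \mathbf{a} \bullet (\mathbf{b} \times \mathbf{c}). \tag{10*}$$
The scalar triple product is indeed a scalar since (10*) involves a dot product, which in turn is a scalar. We want to express the scalar triple product in components and as a third-order determinant. To this end, let a = [a1, a2, a3], b = [b1, b2, b3], and c = [c1, c2, c3]. Also set b × c = v = [υ1, υ2, υ3]. Then from the dot product in components [formula (2) in Sec. 9.2] and from (2*) with b and c instead of a and b we first obtain
$$\mathbf{a} \bullet (\mathbf{b} \times \mathbf{c}) = \mathbf{a} \bullet \mathbf{v} = a_1\upsilon_1 + a_2\upsilon_2 + a_3\upsilon_3 = a_1\begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} + a_2\begin{vmatrix} b_3 & b_1 \\ c_3 & c_1 \end{vmatrix} + a_3\begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}.$$
The sum on the right is the expansion of a third-order determinant by its first row. Thus we obtain the desired formula for the scalar triple product, that is,
$$(\mathbf{a}\ \ \mathbf{b}\ \ \mathbf{c}) = \mathbf{a} \bullet (\mathbf{b} \times \mathbf{c}) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \tag{10}$$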
The most important properties of the scalar triple product are as follows.
THEOREM 2 Properties and Applications of Scalar Triple Products
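(a) In (10) the dot and cross can be interchanged:
$$(\mathbf{a}\ \ \mathbf{b}\ \ \mathbf{c}) = \mathbf{a} \bullet (\mathbf{b} \times \mathbf{c}) = (\mathbf{a} \times \mathbf{b}) \bullet \mathbf{c}. \tag{11}$$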
(b) Geometric interpretation. The absolute value |(a b c)| of (10) is the volume of the parallelepiped (oblique box) with a, b, c as edge vectors (Fig. 193).
(c) Linear independence. Three vectors in R3 are linearly independent if and only if their scalar triple product is not zero.
PROOF
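(a) Dot multiplication is commutative, so that by (10)
$$(\mathbf{a} \times \mathbf{b}) \bullet \mathbf{c} = \mathbf{c} \bullet (\mathbf{a} \times \mathbf{b}) = \begin{vmatrix} c_1 & c_2 & c_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}.$$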
From this we obtain the determinant in (10) by interchanging Rows 1 and 2 and in the result Rows 2 and 3. But this does not change the value of the determinant because each interchange produces a factor −1, and (−1)(−1) = 1. This proves (11).
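(b) The volume of that box equals the height h = |a||cos γ| (Fig. 193) times the area of the base, which is the area |b × c| of the parallelogram with sides b and c. Hence the volume is
$$|\mathbf{a}||\mathbf{b} \times \mathbf{c}||\cos\gamma| = |\mathbf{a} \bullet (\mathbf{b} \times \mathbf{c})|$$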
as given by the absolute value of (11).
(c) Three nonzero vectors, whose initial points coincide, are linearly independent if and only if the vectors do not lie in the same plane nor lie on the same straight line.
This happens if and only if the triple product in (b) is not zero, so that the independence criterion follows. (The case of one of the vectors being the zero vector is trivial.)
A tetrahedron is determined by three edge vectors a, b, c, as indicated in Fig. 194. Find the volume of the tetrahedron in Fig. 194, when a = [2, 0, 3], b = [0, 4, 1], c = [5, 6, 0].
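Solution. The volume V of the parallelepiped with these vectors as edge vectors is the absolute value of the scalar triple product
$$(\mathbf{a}\ \ \mathbf{b}\ \ \mathbf{c}) = \begin{vmatrix} 2 & 0 & 3 \\ 0 & 4 & 1 \\ 5 & 6 & 0 \end{vmatrix} = 2\begin{vmatrix} 4 & 1 \\ 6 & 0 \end{vmatrix} + 3\begin{vmatrix} 0 & 4 \\ 5 & 6 \end{vmatrix} = -12 - 60 = -72.$$
Hence V = 72. The minus sign indicates that if the coordinates are right-handed, the triple a, b, c is left-handed. The volume of a tetrahedron is 1/6 of that of the parallelepiped (can you prove it?), hence 12.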
Can you sketch the tetrahedron, choosing the origin as the common initial point of the vectors? What are the coordinates of the four vertices?
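The tetrahedron volume can be recomputed from the scalar triple product. This is a minimal sketch with illustrative function names; it also confirms that dot and cross may be interchanged, as stated in Theorem 2(a).

```python
def cross(a, b):
    """Vector product a x b in right-handed Cartesian coordinates."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def scalar_triple(a, b, c):
    """Scalar triple product (a b c) = a . (b x c)."""
    return dot(a, cross(b, c))

a, b, c = [2, 0, 3], [0, 4, 1], [5, 6, 0]

t = scalar_triple(a, b, c)
print(t)                       # -72
print(dot(cross(a, b), c))     # -72, dot and cross interchanged
print(abs(t) / 6)              # 12.0, volume of the tetrahedron
```

A nonzero triple product also shows, by Theorem 2(c), that these three edge vectors are linearly independent.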
This is the end of vector algebra (in space R3 and in the plane). Vector calculus (differentiation) begins in the next section.
1–10 GENERAL PROBLEMS
11–23 VECTOR AND SCALAR TRIPLE PRODUCTS
With respect to right-handed Cartesian coordinates, let a = [2, 1, 0], b = [−3, 2, 0], c = [1, 4, −2], and d = [5, −1, 3]. Showing details, find:
25–35 APPLICATIONS
Our discussion of vector calculus begins with identifying the two types of functions on which it operates. Let P be any point in a domain of definition. Typical domains in applications are three-dimensional, or a surface or a curve in space. Then we define a vector function v, whose values are vectors, that is,
$$\mathbf{v} = \mathbf{v}(P) = [\upsilon_1(P),\; \upsilon_2(P),\; \upsilon_3(P)]$$
that depends on points P in space. We say that a vector function defines a vector field in a domain of definition. Typical domains were just mentioned. Examples of vector fields are the field of tangent vectors of a curve (shown in Fig. 195), normal vectors of a surface (Fig. 196), and velocity field of a rotating body (Fig. 197). Note that vector functions may also depend on time t or on some other parameters.
Similarly, we define a scalar function f, whose values are scalars, that is,
$$f = f(P)$$
that depends on P. We say that a scalar function defines a scalar field in that three-dimensional domain or surface or curve in space. Two representative examples of scalar fields are the temperature field of a body and the pressure field of the air in Earth's atmosphere. Note that scalar functions may also depend on some parameter such as time t.
Notation. If we introduce Cartesian coordinates x, y, z, then, instead of writing v(P) for the vector function, we can write
$$\mathbf{v}(x, y, z) = [\upsilon_1(x, y, z),\; \upsilon_2(x, y, z),\; \upsilon_3(x, y, z)].$$
We have to keep in mind that the components depend on our choice of coordinate system, whereas a vector field that has a physical or geometric meaning should have magnitude and direction depending only on P, not on the choice of coordinate system.
Similarly, for a scalar function, we write
$$f(P) = f(x, y, z).$$
We illustrate our discussion of vector functions, scalar functions, vector fields, and scalar fields by the following three examples.
EXAMPLE 1 Scalar Function (Euclidean Distance in Space)
The distance f(P) of any point P from a fixed point P0 in space is a scalar function whose domain of definition is the whole space. f(P) defines a scalar field in space. If we introduce a Cartesian coordinate system and P0 has the coordinates x0, y0, z0, then f is given by the well-known formula
$$f(P) = f(x, y, z) = \sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}$$
where x, y, z are the coordinates of P. If we replace the given Cartesian coordinate system with another such system by translating and rotating the given system, then the values of the coordinates of P and P0 will in general change, but f(P) will have the same value as before. Hence f(P) is a scalar function. The direction cosines of the straight line through P and P0 are not scalars because their values depend on the choice of the coordinate system.
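The invariance claimed in Example 1 can be observed numerically. The sketch below describes two points in a second Cartesian system obtained by translating the origin and rotating about the z-axis; the points, the shift, and the angle are hypothetical values, and `change_coordinates` is an illustrative helper.

```python
import math

def distance(p, p0):
    """Euclidean distance f(P) of P from P0."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, p0)))

def change_coordinates(p, angle, shift):
    """Coordinates of p in a system translated by `shift` and rotated about z by `angle`."""
    x, y, z = (pi - si for pi, si in zip(p, shift))
    c, s = math.cos(angle), math.sin(angle)
    return [c * x + s * y, -s * x + c * y, z]

P = [1.0, 2.0, 3.0]      # hypothetical point
P0 = [4.0, -1.0, 0.5]    # hypothetical fixed point
angle = 0.7              # hypothetical rotation angle in radians
shift = [2.0, -3.0, 1.0] # hypothetical translation of the origin

Q = change_coordinates(P, angle, shift)
Q0 = change_coordinates(P0, angle, shift)

print(P, Q)                                # the coordinates change
print(distance(P, P0), distance(Q, Q0))    # the distance agrees up to rounding
```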
EXAMPLE 2 Vector Field (Velocity Field)
At any instant the velocity vectors v(P) of a rotating body B constitute a vector field, called the velocity field of the rotation. If we introduce a Cartesian coordinate system having the origin on the axis of rotation, then (see Example 5 in Sec. 9.3)
$$\mathbf{v}(x, y, z) = \mathbf{w} \times \mathbf{r} = \mathbf{w} \times [x, y, z] = \mathbf{w} \times (x\mathbf{i} + y\mathbf{j} + z\mathbf{k})$$
where x, y, z are the coordinates of any point P of B at the instant under consideration. If the coordinates are such that the z-axis is the axis of rotation and w points in the positive z-direction, then w = ωk and
$$\mathbf{v} = \omega\,\mathbf{k} \times (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) = \omega(-y\mathbf{i} + x\mathbf{j}) = \omega[-y,\; x,\; 0].$$
An example of a rotating body and the corresponding velocity field are shown in Fig. 197.
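As a small illustrative sketch (the helper names `cross` and `rotation_velocity` are hypothetical, not taken from the text), the velocity field of Example 2 can be evaluated by forming w × r with w = ωk. The output should agree with ω[−y, x, 0].

```python
def cross(a, b):
    """Cross product a x b of two vectors in 3-space."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotation_velocity(omega, point):
    """Velocity v = w x r at `point` for a rotation about the z-axis (w = omega * k)."""
    w = (0.0, 0.0, omega)
    return cross(w, point)

# Hypothetical data: angular speed 2 and the point (1, 3, 5).
print(rotation_velocity(2.0, (1.0, 3.0, 5.0)))  # (-6.0, 2.0, 0.0)
```

Note that the z-coordinate of the point does not influence the velocity, as expected for a rotation about the z-axis.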
EXAMPLE 3 Vector Field (Field of Force, Gravitational Field)
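Let a particle A of mass M be fixed at a point P0 and let a particle B of mass m be free to take up various positions P in space. Then A attracts B. According to Newton's law of gravitation the corresponding gravitational force p is directed from P to P0, and its magnitude is proportional to 1/r², where r is the distance between P and P0, say,
$$|\mathbf{p}| = \frac{c}{r^2}, \qquad c = GMm. \tag{2}$$
Here G = 6.67 · 10⁻⁸ cm³/(g · sec²) is the gravitational constant. Hence p defines a vector field in space. If we introduce Cartesian coordinates such that P0 has the coordinates x0, y0, z0 and P has the coordinates x, y, z, then by the Pythagorean theorem,
$$r = \sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2} \qquad (\ge 0).$$
Assuming that r > 0 and introducing the vector
$$\mathbf{r} = [x - x_0,\; y - y_0,\; z - z_0] = (x - x_0)\mathbf{i} + (y - y_0)\mathbf{j} + (z - z_0)\mathbf{k},$$
we have |r| = r, and (−1/r)r is a unit vector in the direction of p; the minus sign indicates that p is directed from P to P0 (Fig. 198). From this and (2) we obtain
$$\mathbf{p} = |\mathbf{p}|\left(-\frac{1}{r}\,\mathbf{r}\right) = -\frac{c}{r^3}\,\mathbf{r} = \left[-c\,\frac{x - x_0}{r^3},\; -c\,\frac{y - y_0}{r^3},\; -c\,\frac{z - z_0}{r^3}\right].$$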
This vector function describes the gravitational force acting on B.
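The field of Example 3 can be sketched numerically as follows. This is only an illustration under stated assumptions: the function name `gravitational_force` is hypothetical, and `c` stands for the proportionality constant of the force law (the product GMm), chosen arbitrarily here.

```python
import math

def gravitational_force(c, p0, p):
    """Force on B at point p: magnitude c / r**2, directed from p toward p0."""
    r_vec = [a - b for a, b in zip(p, p0)]        # vector from P0 to P
    r = math.sqrt(sum(x * x for x in r_vec))      # distance between P and P0
    if r == 0.0:
        raise ValueError("the field is not defined at P0 (r = 0)")
    return [-c * x / r**3 for x in r_vec]

# Hypothetical data: c = 2, P0 at the origin, P = (1, 2, 2), so r = 3.
force = gravitational_force(2.0, (0.0, 0.0, 0.0), (1.0, 2.0, 2.0))
magnitude = math.sqrt(sum(x * x for x in force))
print(force, magnitude)  # the magnitude is close to c / r**2 = 2/9
```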
The student may be pleased to learn that many of the concepts covered in (regular) calculus carry over to vector calculus. Indeed, we show how the basic concepts of convergence, continuity, and differentiability from calculus can be defined for vector functions in a simple and natural way. Most important of these is the derivative of a vector function.
Convergence. An infinite sequence of vectors a(n), n = 1, 2, …, is said to converge if there is a vector a such that
$$\lim_{n \to \infty} |\mathbf{a}_{(n)} - \mathbf{a}| = 0.$$
a is called the limit vector of that sequence, and we write
$$\lim_{n \to \infty} \mathbf{a}_{(n)} = \mathbf{a}.$$
If the vectors are given in Cartesian coordinates, then this sequence of vectors converges to a if and only if the three sequences of components of the vectors converge to the corresponding components of a. We leave the simple proof to the student.
Similarly, a vector function v(t) of a real variable t is said to have the limit l as t approaches t0, if v(t) is defined in some neighborhood of t0 (possibly except at t0) and
$$\lim_{t \to t_0} |\mathbf{v}(t) - \mathbf{l}| = 0.$$
Then we write
$$\lim_{t \to t_0} \mathbf{v}(t) = \mathbf{l}.$$
Here, a neighborhood of t0 is an interval (segment) on the t-axis containing t0 as an interior point (not as an endpoint).
Continuity. A vector function v(t) is said to be continuous at t = t0 if it is defined in some neighborhood of t0 (including at t0 itself!) and
$$\lim_{t \to t_0} \mathbf{v}(t) = \mathbf{v}(t_0).$$
If we introduce a Cartesian coordinate system, we may write
$$\mathbf{v}(t) = [v_1(t),\; v_2(t),\; v_3(t)] = v_1(t)\mathbf{i} + v_2(t)\mathbf{j} + v_3(t)\mathbf{k}.$$
Then v(t) is continuous at t0 if and only if its three components are continuous at t0.
We now state the most important of these definitions.
DEFINITION Derivative of a Vector Function
A vector function v(t) is said to be differentiable at a point t if the following limit exists:
$$\mathbf{v}'(t) = \lim_{\Delta t \to 0} \frac{\mathbf{v}(t + \Delta t) - \mathbf{v}(t)}{\Delta t}. \tag{9}$$
This vector v′(t) is called the derivative of v(t). See Fig. 199.
In components with respect to a given Cartesian coordinate system,
$$\mathbf{v}'(t) = [v_1'(t),\; v_2'(t),\; v_3'(t)]. \tag{10}$$
Hence the derivative v′(t) is obtained by differentiating each component separately. For instance, if v = [t, t², 0], then v′ = [1, 2t, 0].
Equation (10) follows from (9) and conversely because (9) is a “vector form” of the usual formula of calculus by which the derivative of a function of a single variable is defined. [The curve in Fig. 199 is the locus of the terminal points representing v(t) for values of the independent variable in some interval containing t and t + Δt in (9)]. It follows that the familiar differentiation rules continue to hold for differentiating vector functions, for instance,
$$(c\mathbf{v})' = c\mathbf{v}' \quad (c \text{ constant}), \qquad (\mathbf{u} + \mathbf{v})' = \mathbf{u}' + \mathbf{v}'$$
and in particular
$$(\mathbf{u} \cdot \mathbf{v})' = \mathbf{u}' \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{v}' \tag{11}$$
$$(\mathbf{u} \times \mathbf{v})' = \mathbf{u}' \times \mathbf{v} + \mathbf{u} \times \mathbf{v}' \tag{12}$$
$$(\mathbf{u}\;\mathbf{v}\;\mathbf{w})' = (\mathbf{u}'\;\mathbf{v}\;\mathbf{w}) + (\mathbf{u}\;\mathbf{v}'\;\mathbf{w}) + (\mathbf{u}\;\mathbf{v}\;\mathbf{w}').$$
The simple proofs are left to the student. In (12), note the order of the vectors carefully because cross multiplication is not commutative.
EXAMPLE 4 Derivative of a Vector Function of Constant Length
Let v(t) be a vector function whose length is constant, say, |v(t)| = c. Then |v|² = v • v = c², and (v • v)′ = 2v • v′ = 0, by differentiation [see (11)]. This yields the following result. The derivative of a vector function v(t) of constant length is either the zero vector or is perpendicular to v(t).
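The result of Example 4 can be checked numerically. The sketch below is only an illustration: the vector function `v` is a hypothetical example of constant length 3, and the derivative is approximated component by component with a central difference rather than computed exactly.

```python
import math

def v(t):
    """A hypothetical vector function of constant length 3."""
    return (3 * math.cos(t * t), 3 * math.sin(t * t), 0.0)

def derivative(f, t, h=1e-6):
    """Central-difference approximation of f'(t), one component at a time."""
    return tuple((a - b) / (2 * h) for a, b in zip(f(t + h), f(t - h)))

def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

t = 0.8
print(dot(v(t), v(t)))              # close to 9, the squared length
print(dot(v(t), derivative(v, t)))  # close to 0, so v'(t) is perpendicular to v(t)
```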
Our present discussion shows that partial differentiation of vector functions of two or more variables can be introduced as follows. Suppose that the components of a vector function
$$\mathbf{v} = [v_1,\; v_2,\; v_3] = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$$
are differentiable functions of n variables t1, …, tn. Then the partial derivative of v with respect to tm is denoted by ∂v/∂tm and is defined as the vector function
$$\frac{\partial \mathbf{v}}{\partial t_m} = \frac{\partial v_1}{\partial t_m}\mathbf{i} + \frac{\partial v_2}{\partial t_m}\mathbf{j} + \frac{\partial v_3}{\partial t_m}\mathbf{k}.$$
Similarly, second partial derivatives are
$$\frac{\partial^2 \mathbf{v}}{\partial t_l\,\partial t_m} = \frac{\partial^2 v_1}{\partial t_l\,\partial t_m}\mathbf{i} + \frac{\partial^2 v_2}{\partial t_l\,\partial t_m}\mathbf{j} + \frac{\partial^2 v_3}{\partial t_l\,\partial t_m}\mathbf{k},$$
and so on.
Various physical and geometric applications of derivatives of vector functions will be discussed in the next sections as well as in Chap. 10.
1–8 SCALAR FIELDS IN THE PLANE
Let the temperature T in a body be independent of z so that it is given by a scalar function T = T(x, y). Identify the isotherms T(x, y) = const. Sketch some of them.
9–14 SCALAR FIELDS IN SPACE
What kind of surfaces are the level surfaces f(x, y, z) = const?
Sketch figures similar to Fig. 198. Try to interpret the field of v as a velocity field.
22–25 DIFFERENTIATION
Vector calculus has important applications to curves (Sec. 9.5) and surfaces (to be covered in Sec. 10.5) in physics and geometry. The application of vector calculus to geometry is a field known as differential geometry. Differential geometric methods are applied to problems in mechanics, computer-aided as well as traditional engineering design, geodesy, geography, space travel, and relativity theory. For details, see [GenRef8] and [GenRef9] in App. 1.
Bodies that move in space form paths that may be represented by curves C. This and other applications show the need for parametric representations of C with parameter t, which may denote time or something else (see Fig. 200). A typical parametric representation is given by
$$\mathbf{r}(t) = [x(t),\; y(t),\; z(t)] = x(t)\mathbf{i} + y(t)\mathbf{j} + z(t)\mathbf{k}. \tag{1}$$
Here t is the parameter and x, y, z are Cartesian coordinates, that is, the usual rectangular coordinates as shown in Sec. 9.1. To each value t = t0, there corresponds a point of C with position vector r(t0) whose coordinates are x(t0), y(t0), z(t0). This is illustrated in Figs. 201 and 202.
The use of parametric representations has key advantages over other representations that involve projections into the xy-plane and xz-plane or involve a pair of equations with y or with z as independent variable. The projections look like this:
$$y = f(x), \qquad z = g(x). \tag{2}$$
The advantages of using (1) instead of (2) are that, in (1), the coordinates x, y, z all play an equal role, that is, all three coordinates are dependent variables. Moreover, the parametric representation (1) induces an orientation on C. This means that as we increase t, we travel along the curve C in a certain direction. The sense of increasing t is called the positive sense on C. The sense of decreasing t is then called the negative sense on C, given by (1).
Examples 1–4 give parametric representations of several important curves.
EXAMPLE 1 Circle. Parametric Representation. Positive Sense
The circle x² + y² = 4, z = 0 in the xy-plane with center 0 and radius 2 can be represented parametrically by
$$\mathbf{r}(t) = [2\cos t,\; 2\sin t,\; 0] \qquad \text{or simply by} \qquad \mathbf{r}(t) = [2\cos t,\; 2\sin t],$$
where 0 ≤ t ≤ 2π. Indeed, x² + y² = (2 cos t)² + (2 sin t)² = 4(cos² t + sin² t) = 4. For t = 0 we have r(0) = [2, 0], for t = π/2 we get r(π/2) = [0, 2], and so on. The positive sense induced by this representation is the counterclockwise sense.
If we replace t with t* = −t, we have t = −t* and get
$$\mathbf{r}^*(t^*) = [2\cos(-t^*),\; 2\sin(-t^*)] = [2\cos t^*,\; -2\sin t^*].$$
This has reversed the orientation, and the circle is now oriented clockwise.
The vector function
$$\mathbf{r}(t) = [a\cos t,\; b\sin t,\; 0] = a\cos t\,\mathbf{i} + b\sin t\,\mathbf{j} \tag{3}$$
represents an ellipse in the xy-plane with center at the origin and principal axes in the direction of the x- and y-axes. In fact, since cos² t + sin² t = 1, we obtain from (3)
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \qquad z = 0.$$
If b = a, then (3) represents a circle of radius a.
A straight line L through a point A with position vector a in the direction of a constant vector b (see Fig. 203) can be represented parametrically in the form
$$\mathbf{r}(t) = \mathbf{a} + t\mathbf{b} = [a_1 + tb_1,\; a_2 + tb_2,\; a_3 + tb_3]. \tag{4}$$
If b is a unit vector, its components are the direction cosines of L. In this case, |t| measures the distance of the points of L from A. For instance, the straight line in the xy-plane through A: (3, 2) having slope 1 is (sketch it)
$$\mathbf{r}(t) = [3,\; 2,\; 0] + t[1,\; 1,\; 0] = [3 + t,\; 2 + t,\; 0].$$
A plane curve is a curve that lies in a plane in space. A curve that is not plane is called a twisted curve. A standard example of a twisted curve is the following.
The twisted curve C represented by the vector function
$$\mathbf{r}(t) = [a\cos t,\; a\sin t,\; ct] = a\cos t\,\mathbf{i} + a\sin t\,\mathbf{j} + ct\,\mathbf{k} \qquad (c \ne 0) \tag{5}$$
is called a circular helix. It lies on the cylinder x² + y² = a². If c > 0, the helix is shaped like a right-handed screw (Fig. 204). If c < 0, it looks like a left-handed screw (Fig. 205). If c = 0, then (5) is a circle.
A simple curve is a curve without multiple points, that is, without points at which the curve intersects or touches itself. Circle and helix are simple curves. Figure 206 shows curves that are not simple. An example is [sin 2t, cos t, 0]. Can you sketch it?
An arc of a curve is the portion between any two points of the curve. For simplicity, we say “curve” for curves as well as for arcs.
The next idea is the approximation of a curve by straight lines, leading to tangents and to a definition of length. Tangents are straight lines touching a curve. The tangent to a simple curve C at a point P of C is the limiting position of a straight line L through P and a point Q of C as Q approaches P along C. See Fig. 207.
Let us formalize this concept. If C is given by r(t), and P and Q correspond to t and t + Δt, then a vector in the direction of L is
$$\frac{1}{\Delta t}\,[\mathbf{r}(t + \Delta t) - \mathbf{r}(t)].$$
In the limit this vector becomes the derivative
$$\mathbf{r}'(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\,[\mathbf{r}(t + \Delta t) - \mathbf{r}(t)], \tag{7}$$
provided r(t) is differentiable, as we shall assume from now on. If r′ (t) ≠ 0, we call r′ (t) a tangent vector of C at P because it has the direction of the tangent. The corresponding unit vector is the unit tangent vector (see Fig. 207)
$$\mathbf{u} = \frac{1}{|\mathbf{r}'|}\,\mathbf{r}'. \tag{8}$$
Note that both r′ and u point in the direction of increasing t. Hence their sense depends on the orientation of C. It is reversed if we reverse the orientation.
It is now easy to see that the tangent to C at P is given by
$$\mathbf{q}(w) = \mathbf{r} + w\mathbf{r}'. \tag{9}$$
This is the sum of the position vector r of P and a multiple of the tangent vector r′ of C at P. Both vectors depend on P. The variable w is the parameter in (9).
EXAMPLE 5 Tangent to an Ellipse
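Find the tangent to the ellipse x²/4 + y² = 1 at P: (√2, 1/√2).
Solution. Equation (3) with semi-axes a = 2 and b = 1 gives r(t) = [2 cos t, sin t]. The derivative is r′(t) = [−2 sin t, cos t]. Now P corresponds to t = π/4 because
$$\mathbf{r}(\pi/4) = [2\cos(\pi/4),\; \sin(\pi/4)] = [\sqrt{2},\; 1/\sqrt{2}].$$
Hence r′(π/4) = [−√2, 1/√2]. From (9) we thus get the answer
$$\mathbf{q}(w) = [\sqrt{2},\; 1/\sqrt{2}] + w[-\sqrt{2},\; 1/\sqrt{2}] = [\sqrt{2}\,(1 - w),\; (1/\sqrt{2})(1 + w)].$$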
To check the result, sketch or graph the ellipse and the tangent.
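One way to carry out such a check is a short computation. The sketch below assumes the representation r(t) = [2 cos t, sin t] used in the solution; the function names are hypothetical. It evaluates points r(t) + w r′(t) on the tangent at t = π/4.

```python
import math

def r(t):
    """Position vector of the ellipse with semi-axes a = 2 and b = 1."""
    return (2 * math.cos(t), math.sin(t))

def r_prime(t):
    """Tangent vector r'(t) of the ellipse."""
    return (-2 * math.sin(t), math.cos(t))

def tangent_point(t, w):
    """Point r(t) + w r'(t) on the tangent at the curve point r(t)."""
    return tuple(p + w * d for p, d in zip(r(t), r_prime(t)))

t0 = math.pi / 4
print(r(t0))                   # the point P, approximately (1.4142, 0.7071)
print(tangent_point(t0, 1.0))  # approximately (0, 1.4142)
```

For w = 0 the function returns P itself, and each other value of w gives another point of the tangent line.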
We are now ready to define the length l of a curve. l will be the limit of the lengths of broken lines of n chords (see Fig. 209, where n = 5) with larger and larger n. For this, let r(t), a ≤ t ≤ b, represent C. For each n = 1, 2, …, we subdivide (“partition”) the interval a ≤ t ≤ b by points
$$t_0\,(= a),\; t_1,\; \ldots,\; t_{n-1},\; t_n\,(= b), \qquad \text{where} \quad t_0 < t_1 < \cdots < t_n.$$
This gives a broken line of chords with endpoints r(t0), …, r(tn). We do this arbitrarily but so that the greatest |Δtm| = |tm − tm−1| approaches 0 as n → ∞. The lengths l1, l2, … of these chords can be obtained from the Pythagorean theorem. If r(t) has a continuous derivative r′(t), it can be shown that the sequence l1, l2, … has a limit, which is independent of the particular choice of the representation of C and of the choice of subdivisions. This limit is given by the integral
$$l = \int_a^b \sqrt{\mathbf{r}' \cdot \mathbf{r}'}\; dt \qquad \left(\mathbf{r}' = \frac{d\mathbf{r}}{dt}\right). \tag{10}$$
l is called the length of C, and C is called rectifiable. Formula (10) is made plausible in calculus for plane curves and is proved for curves in space in [GenRef8] listed in App. 1. The actual evaluation of the integral (10) will, in general, be difficult. However, some simple cases are given in the problem set.
The length (10) of a curve C is a constant, a positive number. But if we replace the fixed b in (10) with a variable t, the integral becomes a function of t, denoted by s(t) and called the arc length function or simply the arc length of C. Thus
$$s(t) = \int_a^t \sqrt{\mathbf{r}' \cdot \mathbf{r}'}\; d\tilde{t} \qquad \left(\mathbf{r}' = \frac{d\mathbf{r}}{d\tilde{t}}\right). \tag{11}$$
Here the variable of integration is denoted by t̃ because t is now used in the upper limit.
Geometrically, s(t0) with some t0 > a is the length of the arc of C between the points with parametric values a and t0. The choice of a (the point s = 0) is arbitrary; changing a means changing s by a constant.
Linear Element ds. If we differentiate (11) and square, we have
$$\left(\frac{ds}{dt}\right)^2 = \frac{d\mathbf{r}}{dt} \cdot \frac{d\mathbf{r}}{dt} = |\mathbf{r}'(t)|^2 = \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2. \tag{12}$$
It is customary to write
$$d\mathbf{r} = [dx,\; dy,\; dz] = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$$
and
$$ds^2 = d\mathbf{r} \cdot d\mathbf{r} = dx^2 + dy^2 + dz^2.$$
ds is called the linear element of C.
Arc Length as Parameter. The use of s in (1) instead of an arbitrary t simplifies various formulas. For the unit tangent vector (8) we simply obtain
$$\mathbf{u}(s) = \mathbf{r}'(s). \tag{14}$$
Indeed, |r′ (s)| = (ds/ds) = 1 in (12) shows that r′ (s) is a unit vector. Even greater simplifications due to the use of s will occur in curvature and torsion (below).
EXAMPLE 6 Circular Helix. Circle. Arc Length as Parameter
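The helix r(t) = [a cos t, a sin t, ct] in (5) has the derivative r′(t) = [−a sin t, a cos t, c]. Hence r′ • r′ = a² + c², a constant, which we denote by K². Hence the integrand in (11) is constant, equal to K, and the integral is s = Kt. Thus t = s/K, so that a representation of the helix with the arc length s as parameter is
$$\mathbf{r}^*(s) = \mathbf{r}\!\left(\frac{s}{K}\right) = \left[a\cos\frac{s}{K},\; a\sin\frac{s}{K},\; \frac{cs}{K}\right], \qquad K = \sqrt{a^2 + c^2}.$$
A circle is obtained if we set c = 0. Then K = a, t = s/a, and a representation with arc length s as parameter is
$$\mathbf{r}^*(s) = \mathbf{r}\!\left(\frac{s}{a}\right) = \left[a\cos\frac{s}{a},\; a\sin\frac{s}{a}\right].$$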
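The relation s = Kt for the helix can also be checked by approximating the length integral numerically. The following is a minimal sketch: the midpoint rule, the function name `arc_length`, and the values of a and c are illustrative choices, not part of the text.

```python
import math

def arc_length(r_prime, a, b, n=1000):
    """Approximate the integral of |r'(t)| over [a, b] with the midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += math.sqrt(sum(c * c for c in r_prime(t))) * h
    return total

a_radius, c_pitch = 2.0, 1.5   # hypothetical values of a and c in (5)

def helix_prime(t):
    """Derivative r'(t) = [-a sin t, a cos t, c] of the circular helix."""
    return (-a_radius * math.sin(t), a_radius * math.cos(t), c_pitch)

K = math.sqrt(a_radius**2 + c_pitch**2)   # K = 2.5 for these values
print(arc_length(helix_prime, 0.0, 2 * math.pi), K * 2 * math.pi)  # the two values agree closely
```

Because the integrand is constant for the helix, the numerical rule is essentially exact here; for other curves the agreement depends on n.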
Curves play a basic role in mechanics, where they may serve as paths of moving bodies. Then such a curve C should be represented by a parametric representation r(t) with time t as parameter. The tangent vector (7) of C is then called the velocity vector v because, being tangent, it points in the instantaneous direction of motion and its length gives the speed |v| = |r′| = ds/dt; see (12). The second derivative of r(t) is called the acceleration vector and is denoted by a. Its length |a| is called the acceleration of the motion. Thus
$$\mathbf{v}(t) = \mathbf{r}'(t), \qquad \mathbf{a}(t) = \mathbf{v}'(t) = \mathbf{r}''(t). \tag{16}$$
Tangential and Normal Acceleration. Whereas the velocity vector is always tangent to the path of motion, the acceleration vector will generally have another direction. We can split the acceleration vector into two directional components, that is,
$$\mathbf{a} = \mathbf{a}_{\tan} + \mathbf{a}_{\text{norm}}, \tag{17}$$
where the tangential acceleration vector atan is tangent to the path (or, sometimes, 0) and the normal acceleration vector anorm is normal (perpendicular) to the path (or, sometimes, 0).
Expressions for the vectors in (17) are obtained from (16) by the chain rule. We first have
$$\mathbf{v}(t) = \frac{d\mathbf{r}}{dt} = \frac{d\mathbf{r}}{ds}\,\frac{ds}{dt} = \mathbf{u}(s)\,\frac{ds}{dt}$$
where u(s) is the unit tangent vector (14). Another differentiation gives
$$\mathbf{a}(t) = \frac{d\mathbf{v}}{dt} = \frac{d}{dt}\left(\mathbf{u}(s)\,\frac{ds}{dt}\right) = \frac{d\mathbf{u}}{ds}\left(\frac{ds}{dt}\right)^2 + \mathbf{u}(s)\,\frac{d^2 s}{dt^2}. \tag{18}$$
Since the tangent vector u(s) has constant length (length one), its derivative du/ds is perpendicular to u(s), from the result in Example 4 in Sec. 9.4. Hence the first term on the right of (18) is the normal acceleration vector, and the second term on the right is the tangential acceleration vector, so that (18) is of the form (17).
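Now the length |atan| is the absolute value of the projection of a in the direction of v, given by (11) in Sec. 9.2 with b = v; that is, |atan| = |a • v|/|v|. Hence atan is this expression times the unit vector (1/|v|)v in the direction of v, that is,
$$\mathbf{a}_{\tan} = \frac{\mathbf{a} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}}\,\mathbf{v}. \qquad \text{Also,} \quad \mathbf{a}_{\text{norm}} = \mathbf{a} - \mathbf{a}_{\tan}.$$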
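The splitting of the acceleration into its tangential and normal parts can be sketched in a few lines. The tangential part is the projection of a onto v, and the normal part is what remains. The function name `split_acceleration` and the sample path r(t) = [t, t², 0] are hypothetical choices made for illustration.

```python
def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def split_acceleration(v, a):
    """Return (a_tan, a_norm): the projection of a onto v, and the remainder."""
    vv = dot(v, v)
    if vv == 0.0:
        raise ValueError("the velocity is zero, so no tangent direction is defined")
    factor = dot(a, v) / vv
    a_tan = tuple(factor * x for x in v)
    a_norm = tuple(x - y for x, y in zip(a, a_tan))
    return a_tan, a_norm

# Hypothetical path r(t) = [t, t**2, 0] at t = 1: v = [1, 2, 0], a = [0, 2, 0].
v = (1.0, 2.0, 0.0)
a = (0.0, 2.0, 0.0)
a_tan, a_norm = split_acceleration(v, a)
print(a_tan, a_norm)   # approximately (0.8, 1.6, 0.0) and (-0.8, 0.4, 0.0)
print(dot(a_norm, v))  # close to 0: the normal part is perpendicular to v
```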
We now turn to two examples that are relevant to applications in space travel. They deal with the centripetal and centrifugal accelerations, as well as the Coriolis acceleration.
EXAMPLE 7 Centripetal Acceleration. Centrifugal Force
The vector function
$$\mathbf{r}(t) = [R\cos\omega t,\; R\sin\omega t] = R\cos\omega t\,\mathbf{i} + R\sin\omega t\,\mathbf{j}$$
(with fixed i and j) represents a circle C of radius R with center at the origin of the xy-plane and describes the motion of a small body B counterclockwise around the circle. Differentiation gives the velocity vector
$$\mathbf{v} = \mathbf{r}' = [-R\omega\sin\omega t,\; R\omega\cos\omega t] = -R\omega\sin\omega t\,\mathbf{i} + R\omega\cos\omega t\,\mathbf{j}.$$
v is tangent to C. Its magnitude, the speed, is
$$|\mathbf{v}| = |\mathbf{r}'| = \sqrt{\mathbf{r}' \cdot \mathbf{r}'} = R\omega.$$
Hence it is constant. The speed divided by the distance R from the center is called the angular speed. It equals ω, so that it is constant, too. Differentiating the velocity vector, we obtain the acceleration vector
$$\mathbf{a} = \mathbf{v}' = [-R\omega^2\cos\omega t,\; -R\omega^2\sin\omega t] = -R\omega^2\cos\omega t\,\mathbf{i} - R\omega^2\sin\omega t\,\mathbf{j}.$$
This shows that a = −ω²r (Fig. 210), so that there is an acceleration toward the center, called the centripetal acceleration of the motion. It occurs because the velocity vector is changing direction at a constant rate. Its magnitude is constant, |a| = ω²|r| = ω²R. Multiplying a by the mass m of B, we get the centripetal force ma. The opposite vector −ma is called the centrifugal force. At each instant these two forces are in equilibrium.
We see that in this motion the acceleration vector is normal (perpendicular) to C; hence there is no tangential acceleration.
EXAMPLE 8 Superposition of Rotations. Coriolis Acceleration
A projectile is moving with constant speed along a meridian of the rotating earth in Fig. 211. Find its acceleration.
Solution. Let x, y, z be a fixed Cartesian coordinate system in space, with unit vectors i, j, k in the directions of the axes. Let the Earth, together with a unit vector b, be rotating about the z-axis with angular speed ω > 0 (see Example 7). Since b is rotating together with the Earth, it is of the form
$$\mathbf{b}(t) = \cos\omega t\,\mathbf{i} + \sin\omega t\,\mathbf{j}.$$
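Let the projectile be moving on the meridian whose plane is spanned by b and k (Fig. 211) with constant angular speed γ > 0. Then its position vector in terms of b and k is
$$\mathbf{r}(t) = R\cos\gamma t\,\mathbf{b}(t) + R\sin\gamma t\,\mathbf{k} \qquad (R = \text{radius of the Earth}).$$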
We have finished setting up the model. Next, we apply vector calculus to obtain the desired acceleration of the projectile. Our result will be unexpected—and highly relevant for air and space travel. The first and second derivatives of b with respect to t are
$$\mathbf{b}'(t) = -\omega\sin\omega t\,\mathbf{i} + \omega\cos\omega t\,\mathbf{j}, \qquad \mathbf{b}''(t) = -\omega^2\cos\omega t\,\mathbf{i} - \omega^2\sin\omega t\,\mathbf{j} = -\omega^2\,\mathbf{b}(t).$$
The first and second derivatives of r(t) with respect to t are
$$\mathbf{v} = \mathbf{r}'(t) = R\cos\gamma t\,\mathbf{b}' - \gamma R\sin\gamma t\,\mathbf{b} + \gamma R\cos\gamma t\,\mathbf{k},$$
$$\mathbf{a} = \mathbf{v}' = R\cos\gamma t\,\mathbf{b}'' - 2\gamma R\sin\gamma t\,\mathbf{b}' - \gamma^2 R\cos\gamma t\,\mathbf{b} - \gamma^2 R\sin\gamma t\,\mathbf{k}$$
$$\phantom{\mathbf{a}} = R\cos\gamma t\,\mathbf{b}'' - 2\gamma R\sin\gamma t\,\mathbf{b}' - \gamma^2\,\mathbf{r}.$$
By analogy with Example 7 and because of b″ = −ω²b in (20) we conclude that the first term in a (involving ω in b″!) is the centripetal acceleration due to the rotation of the Earth. Similarly, the third term in the last line (involving γ!) is the centripetal acceleration due to the motion of the projectile on the meridian M of the rotating Earth.
The second, unexpected term −2γR sin γt b′ in a is called the Coriolis acceleration acor (Fig. 211) and is due to the interaction of the two rotations. In the Northern Hemisphere, sin γt > 0 (for t > 0; also γ > 0 by assumption), so that acor has the direction of −b′, that is, opposite to the rotation of the Earth. |acor| is maximum at the North Pole and zero at the equator. The projectile B of mass m0 experiences a force −m0 acor opposite to m0 acor, which tends to let B deviate from M to the right (and in the Southern Hemisphere, where sin γt < 0, to the left). This deviation has been observed for missiles, rockets, shells, and atmospheric airflow.
This last topic of Sec. 9.5 is optional but completes our discussion of curves relevant to vector calculus.
The curvature κ(s) of a curve C: r(s) (s the arc length) at a point P of C measures the rate of change |u′ (s)| of the unit tangent vector u(s) at P. Hence κ(s) measures the deviation of C at P from a straight line (its tangent at P). Since u(s) = r′ (s), the definition is
$$\kappa(s) = |\mathbf{u}'(s)| = |\mathbf{r}''(s)| \qquad (' = d/ds).$$
The torsion τ(s) of C at P measures the rate of change of the osculating plane O of curve C at point P. Note that this plane is spanned by u and u′ and shown in Fig. 212. Hence τ(s) measures the deviation of C at P from a plane (from O at P). Now the rate of change is also measured by the derivative b′ of a normal vector b at O. By the definition of vector product, a unit normal vector of O is b = u × (1/κ)u′ = u × p. Here p = (1/κ)u′ is called the unit principal normal vector and b is called the unit binormal vector of C at P. The vectors are labeled in Fig. 212. Here we must assume that κ ≠ 0; hence κ > 0. The absolute value of the torsion is now defined by
$$|\tau(s)| = |\mathbf{b}'(s)|.$$
Whereas κ(s) is nonnegative, it is practical to give the torsion a sign, motivated by “right-handed” and “left-handed” (see Figs. 204 and 205). This needs a little further calculation. Since b is a unit vector, it has constant length. Hence b′ is perpendicular to b (see Example 4 in Sec. 9.4). Now b′ is also perpendicular to u because, by the definition of vector product, we have b • u = 0, b • u′ = 0. This implies
$$(\mathbf{b} \cdot \mathbf{u})' = \mathbf{b}' \cdot \mathbf{u} + \mathbf{b} \cdot \mathbf{u}' = 0; \qquad \text{that is,} \quad \mathbf{b}' \cdot \mathbf{u} = 0.$$
Hence if b′ ≠ 0 at P, it must have the direction of p or −p, so that it must be of the form b′ = −τp. Taking the dot product of this by p and using p • p = 1 gives
$$\tau(s) = -\mathbf{p}(s) \cdot \mathbf{b}'(s).$$
The minus sign is chosen to make the torsion of a right-handed helix positive and that of a left-handed helix negative (Figs. 204 and 205). The orthonormal vector triple u, p, b is called the trihedron of C. Figure 212 also shows the names of the three straight lines in the directions of u, p, b, which are the intersections of the osculating plane, the normal plane, and the rectifying plane.
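As a numerical illustration, curvature and torsion can be computed for the circular helix (5). The sketch below does not use the arc length s. Instead it assumes the standard formulas for an arbitrary parameter t, κ = |r′ × r″| / |r′|³ and τ = (r′ r″ r‴) / |r′ × r″|², which are not derived in this section. The function names and the helix data are hypothetical.

```python
import math

def cross(a, b):
    """Cross product a x b of two vectors in 3-space."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def curvature_and_torsion(d1, d2, d3):
    """kappa and tau from r', r'', r''' at one point (assumed general-parameter formulas)."""
    n = cross(d1, d2)
    nn = dot(n, n)
    if nn == 0.0:
        raise ValueError("zero curvature: the torsion is not defined here")
    kappa = math.sqrt(nn) / dot(d1, d1) ** 1.5
    tau = dot(n, d3) / nn
    return kappa, tau

a, c, t = 3.0, 4.0, 0.5   # hypothetical helix data; K**2 = a**2 + c**2 = 25
d1 = (-a * math.sin(t), a * math.cos(t), c)       # r'(t)
d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)    # r''(t)
d3 = (a * math.sin(t), -a * math.cos(t), 0.0)     # r'''(t)
kappa, tau = curvature_and_torsion(d1, d2, d3)
print(kappa, tau)  # close to a/K**2 = 0.12 and c/K**2 = 0.16
```

With c > 0 (a right-handed helix) the computed torsion is positive, in agreement with the sign convention chosen above; replacing c by −c changes the sign of τ.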
1–10 PARAMETRIC REPRESENTATIONS
What curves are represented by the following? Sketch them.
11–20 FIND A PARAMETRIC REPRESENTATION
(a) r(t) = [2 cos t + cos 2t, 2 sin t − sin 2t] (Steiner's hypocycloid).
(b) r(t) = [cos t + k cos 2t, sin t − k sin 2t] with k = 10, 2, 1, , 0, , −1.
(c) r(t) = [cos t, sin 5t] (a Lissajous curve).
(d) r(t) = [cos t, sin kt]. For what k's will it be closed?
(e) r(t) = [R sin ωt + ωRt, R cos ωt + R] (cycloid).
24–28 TANGENT
Given a curve C: r(t), find a tangent vector r′(t), a unit tangent vector u(t), and the tangent of C at P. Sketch curve and tangent.
29–32 LENGTH
Find the length and sketch the curve.
where ρ′ = dρ/dθ. Derive this. Use it to find the total length of the cardioid ρ = a(1 − cos θ). Sketch this curve. Hint. Use (10) in App. 3.1.
35–46 CURVES IN MECHANICS
Forces acting on moving objects (cars, airplanes, ships, etc.) require the engineer to know corresponding tangential and normal accelerations. In Probs. 35–38 find them, along with the velocity and speed. Sketch the path.
39–42 THE USE OF A CAS may greatly facilitate the investigation of more complicated paths, as they occur in gear transmissions and other constructions. To grasp the idea, using a CAS, graph the path and find velocity, speed, and tangential and normal acceleration.
47–55 CURVATURE AND TORSION
The parametric representations of curves C required vector functions that depended on a single variable x, s, or t. We now want to systematically cover vector functions of several variables. This optional section is inserted into the book for your convenience and to make the book reasonably self-contained. Go onto Sec. 9.7 and consult Sec. 9.6 only when needed. For partial derivatives, see App. A3.2.
Figure 213 shows the notations in the following basic theorem.
Let w = f(x, y, z) be continuous and have continuous first partial derivatives in a domain D in xyz-space. Let x = x(u, v), y = y(u, v), z = z(u, v) be functions that are continuous and have first partial derivatives in a domain B in the uv-plane, where B is such that for every point (u, v) in B, the corresponding point [x(u, v), y(u, v), z(u, v)] lies in D. See Fig. 213. Then the function
$$w = f(x(u, v),\; y(u, v),\; z(u, v))$$
is defined in B, has first partial derivatives with respect to u and v in B, and
$$\frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial u}, \qquad \frac{\partial w}{\partial v} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial v}. \tag{1}$$
In this theorem, a domain D is an open connected point set in xyz-space, where “connected” means that any two points of D can be joined by a broken line of finitely many linear segments all of whose points belong to D. “Open” means that every point P of D has a neighborhood (a little ball with center P) all of whose points belong to D. For example, the interior of a cube or of an ellipsoid (the solid without the boundary surface) is a domain.
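In calculus, x, y, z are often called the intermediate variables, in contrast with the independent variables u, v and the dependent variable w.
If w = f(x, y) and x = x(u, v), y = y(u, v) as before, then (1) becomes
$$\frac{\partial w}{\partial u} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial u}, \qquad \frac{\partial w}{\partial v} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial v}. \tag{2}$$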
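If w = f(x, y, z) and x = x(t), y = y(t), z = z(t), then (1) gives
$$\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt}. \tag{3}$$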
If w = f(x, y) and x = x(t), y = y(t), then (3) reduces to
$$\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt}.$$
Finally, the simplest case w = f(x), x = x(t) gives
$$\frac{dw}{dt} = \frac{dw}{dx}\,\frac{dx}{dt}.$$
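If w = x² − y² and we define polar coordinates r, θ by x = r cos θ, y = r sin θ, then (2) gives
$$\frac{\partial w}{\partial r} = 2x\cos\theta - 2y\sin\theta = 2r\cos^2\theta - 2r\sin^2\theta = 2r\cos 2\theta,$$
$$\frac{\partial w}{\partial \theta} = 2x(-r\sin\theta) - 2y(r\cos\theta) = -2r^2\cos\theta\sin\theta - 2r^2\sin\theta\cos\theta = -2r^2\sin 2\theta.$$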
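The chain rule computation for w = x² − y² in polar coordinates can be checked numerically. In this sketch (the function name `chain_rule` and the sample point are hypothetical), the chain rule is applied term by term and the result is compared with the closed forms 2r cos 2θ and −2r² sin 2θ.

```python
import math

def chain_rule(r, theta):
    """Partial derivatives of w = x**2 - y**2 with respect to r and theta via the chain rule."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    w_x, w_y = 2 * x, -2 * y                       # partials with respect to x and y
    x_r, y_r = math.cos(theta), math.sin(theta)    # partials of x, y with respect to r
    x_t, y_t = -r * math.sin(theta), r * math.cos(theta)  # partials with respect to theta
    return w_x * x_r + w_y * y_r, w_x * x_t + w_y * y_t

r, theta = 2.0, 0.3   # hypothetical sample point
w_r, w_theta = chain_rule(r, theta)
print(w_r, 2 * r * math.cos(2 * theta))              # the two values agree closely
print(w_theta, -2 * r**2 * math.sin(2 * theta))      # the two values agree closely
```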
Let w = f(x, y, z) and let z = g(x, y) represent a surface S in space. Then on S the function becomes
$$\tilde{w}(x, y) = f(x,\; y,\; g(x, y)).$$
Hence, by (1), the partial derivatives are
$$\frac{\partial \tilde{w}}{\partial x} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial g}{\partial x}, \qquad \frac{\partial \tilde{w}}{\partial y} = \frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial g}{\partial y} \qquad [z = g(x, y)]. \tag{6}$$
We shall need this formula in Sec. 10.9.
EXAMPLE 2 Partial Derivatives on Surface
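Let w = f = x³ + y³ + z³ and let z = g = x² + y². Then (6) gives
$$\frac{\partial \tilde{w}}{\partial x} = 3x^2 + 3z^2 \cdot 2x = 3x^2 + 3(x^2 + y^2)^2 \cdot 2x, \qquad \frac{\partial \tilde{w}}{\partial y} = 3y^2 + 3z^2 \cdot 2y = 3y^2 + 3(x^2 + y^2)^2 \cdot 2y.$$
We confirm this by substitution, using w(x, y) = x³ + y³ + (x² + y²)³, that is,
$$\frac{\partial \tilde{w}}{\partial x} = 3x^2 + 3(x^2 + y^2)^2 \cdot 2x, \qquad \frac{\partial \tilde{w}}{\partial y} = 3y^2 + 3(x^2 + y^2)^2 \cdot 2y.$$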
Let f(x, y, z) be continuous and have continuous first partial derivatives in a domain D in xyz-space. Let P0: (x0, y0, z0) and P: (x0 + h, y0 + k, z0 + l) be points in D such that the straight line segment P0P joining these points lies entirely in D. Then
$$f(x_0 + h,\; y_0 + k,\; z_0 + l) - f(x_0, y_0, z_0) = h\,\frac{\partial f}{\partial x} + k\,\frac{\partial f}{\partial y} + l\,\frac{\partial f}{\partial z}, \tag{7}$$
the partial derivatives being evaluated at a suitable point of that segment.
For a function f(x, y) of two variables (satisfying assumptions as in the theorem), formula (7) reduces to (Fig. 214)
$$f(x_0 + h,\; y_0 + k) - f(x_0, y_0) = h\,\frac{\partial f}{\partial x} + k\,\frac{\partial f}{\partial y},$$
and, for a function f(x) of a single variable, (7) becomes
$$f(x_0 + h) - f(x_0) = h\,\frac{df}{dx}, \tag{9}$$
where in (9), the domain D is a segment of the x-axis and the derivative is taken at a suitable point between x0 and x0 + h.
We shall see that some of the vector fields that occur in applications (though not all of them!) can be obtained from scalar fields. Using scalar fields instead of vector fields is a considerable advantage because scalar fields are easier to use than vector fields. It is the “gradient” that allows us to obtain vector fields from scalar fields, and thus the gradient is of great practical importance to the engineer.
DEFINITION 1 Gradient
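The setting is that we are given a scalar function f(x, y, z) that is defined and differentiable in a domain in 3-space with Cartesian coordinates x, y, z. We denote the gradient of that function by grad f or ∇f (read nabla f). Then the gradient of f(x, y, z) is defined as the vector function
$$\operatorname{grad} f = \nabla f = \left[\frac{\partial f}{\partial x},\; \frac{\partial f}{\partial y},\; \frac{\partial f}{\partial z}\right] = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j} + \frac{\partial f}{\partial z}\mathbf{k}. \tag{1}$$
Remarks. For a definition of the gradient in curvilinear coordinates, see App. A3.4. As a quick example, if f(x, y, z) = 2y³ + 4xz + 3x, then grad f = [4z + 3, 6y², 4x]. Furthermore, we will show later in this section that (1) actually does define a vector.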
The notation ∇f is suggested by the differential operator ∇ (read nabla) defined by
$$\nabla = \frac{\partial}{\partial x}\mathbf{i} + \frac{\partial}{\partial y}\mathbf{j} + \frac{\partial}{\partial z}\mathbf{k}.$$
Gradients are useful in several ways, notably in giving the rate of change of f(x, y, z) in any direction in space, in obtaining surface normal vectors, and in deriving vector fields from scalar fields, as we are going to show in this section.
From calculus we know that the partial derivatives in (1) give the rates of change of f(x, y, z) in the directions of the three coordinate axes. It seems natural to extend this and ask for the rate of change of f in an arbitrary direction in space. This leads to the following concept.
DEFINITION 2 Directional Derivative
The directional derivative Dbf or df/ds of a function f(x, y, z) at a point P in the direction of a vector b is defined by (see Fig. 215)
$$D_{\mathbf{b}} f = \frac{df}{ds} = \lim_{s \to 0} \frac{f(Q) - f(P)}{s}. \tag{2}$$
Here Q is a variable point on the straight line L in the direction of b, and |s| is the distance between P and Q. Also, s > 0 if Q lies in the direction of b (as in Fig. 215), s < 0 if Q lies in the direction of −b, and s = 0 if Q = P.
The next idea is to use Cartesian xyz-coordinates and for b a unit vector. Then the line L is given by
$$\mathbf{r}(s) = x(s)\mathbf{i} + y(s)\mathbf{j} + z(s)\mathbf{k} = \mathbf{p}_0 + s\mathbf{b} \qquad (|\mathbf{b}| = 1), \tag{3}$$
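where p0 is the position vector of P. Equation (2) now shows that Dbf = df/ds is the derivative of the function f(x(s), y(s), z(s)) with respect to the arc length s of L. Hence, assuming that f has continuous partial derivatives and applying the chain rule [formula (3) in the previous section], we obtain
$$D_{\mathbf{b}} f = \frac{df}{ds} = \frac{\partial f}{\partial x}\,x' + \frac{\partial f}{\partial y}\,y' + \frac{\partial f}{\partial z}\,z', \tag{4}$$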
where primes denote derivatives with respect to s (which are taken at s = 0). But here, differentiating (3) gives r′ = x′i + y′j + z′k = b. Hence (4) is simply the inner product of grad f and b [see (2), Sec. 9.2]; that is,
$$D_{\mathbf{b}} f = \frac{df}{ds} = \mathbf{b} \cdot \operatorname{grad} f \qquad (|\mathbf{b}| = 1). \tag{5}$$
ATTENTION! If the direction is given by a vector a of any length (≠ 0), then
$$D_{\mathbf{a}} f = \frac{df}{ds} = \frac{1}{|\mathbf{a}|}\,\mathbf{a} \cdot \operatorname{grad} f. \tag{5*}$$
EXAMPLE 1 Gradient. Directional Derivative
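Find the directional derivative of f(x, y, z) = 2x² + 3y² + z² at P: (2, 1, 3) in the direction of a = [1, 0, −2].
Solution. grad f = [4x, 6y, 2z] gives at P the vector grad f(P) = [8, 6, 6]. From this and (5*) we obtain, since |a| = √5,
$$D_{\mathbf{a}} f(P) = \frac{1}{\sqrt{5}}\,[1,\; 0,\; -2] \cdot [8,\; 6,\; 6] = \frac{1}{\sqrt{5}}\,(8 + 0 - 12) = -\frac{4}{\sqrt{5}} \approx -1.789.$$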
The minus sign indicates that at P the function f is decreasing in the direction of a.
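The calculation in Example 1 can be reproduced with a few lines of code. This is a minimal sketch with hypothetical function names; it divides by |a| so that the direction vector may have any nonzero length, as in (5*).

```python
import math

def grad_f(x, y, z):
    """Gradient of f(x, y, z) = 2x^2 + 3y^2 + z^2."""
    return (4 * x, 6 * y, 2 * z)

def directional_derivative(grad, a):
    """D_a f = (a . grad f) / |a| for a direction vector a of any nonzero length."""
    length = math.sqrt(sum(c * c for c in a))
    if length == 0.0:
        raise ValueError("the direction vector must be nonzero")
    return sum(g * c for g, c in zip(grad, a)) / length

print(directional_derivative(grad_f(2, 1, 3), (1, 0, -2)))  # -4/sqrt(5), about -1.789
```

Scaling a by a positive factor leaves the result unchanged, whereas replacing a by −a reverses its sign.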
Here is a finer point of mathematics that concerns the consistency of our theory: grad f in (1) looks like a vector (after all, it has three components!). But to prove that it actually is a vector, since it is defined in terms of components depending on the Cartesian coordinates, we must show that grad f has a length and direction independent of the choice of those coordinates. See proof of Theorem 1. In contrast, [∂f/∂x, 2∂f/∂y, ∂f/∂z] also looks like a vector but does not have a length and direction independent of the choice of Cartesian coordinates.
Incidentally, the direction makes the gradient eminently useful: grad f points in the direction of maximum increase of f.
THEOREM 1 Use of Gradient: Direction of Maximum Increase
Let f(P) = f(x, y, z) be a scalar function having continuous first partial derivatives in some domain B in space. Then grad f exists in B and is a vector, that is, its length and direction are independent of the particular choice of Cartesian coordinates. If grad f(P) ≠ 0 at some point P, it has the direction of maximum increase of f at P.
PROOF
From (5) and the definition of inner product [(1) in Sec. 9.2] we have
$$D_{\mathbf{b}} f = |\mathbf{b}|\,|\operatorname{grad} f|\cos\gamma = |\operatorname{grad} f|\cos\gamma, \tag{6}$$
where γ is the angle between b and grad f. Now f is a scalar function. Hence its value at a point P depends on P but not on the particular choice of coordinates. The same holds for the arc length s of the line L in Fig. 215, hence also for Dbf. Now (6) shows that Dbf is maximum when cos γ = 1, γ = 0, and then Dbf = |grad f|. It follows that the length and direction of grad f are independent of the choice of coordinates. Since γ = 0 if and only if b has the direction of grad f, the latter is the direction of maximum increase of f at P, provided grad f ≠ 0 at P. Make sure that you understood the proof to get a good feel for mathematics.
Gradients have an important application in connection with surfaces, namely, as surface normal vectors, as follows. Let S be a surface represented by f(x, y, z) = c = const, where f is differentiable. Such a surface is called a level surface of f, and for different c we get different level surfaces. Now let C be a curve on S through a point P of S. As a curve in space, C has a representation r(t) = [x(t), y(t), z(t)]. For C to lie on the surface S, the components of r(t) must satisfy f(x, y, z) = c, that is,
$$f(x(t),\; y(t),\; z(t)) = c. \tag{7}$$
Now a tangent vector of C is r′(t) = [x′(t), y′(t), z′(t)]. And the tangent vectors of all curves on S passing through P will generally form a plane, called the tangent plane of S at P. (Exceptions occur at edges or cusps of S, for instance, at the apex of the cone in Fig. 217.) The normal of this plane (the straight line through P perpendicular to the tangent plane) is called the surface normal to S at P. A vector in the direction of the surface normal is called a surface normal vector of S at P. We can obtain such a vector quite simply by differentiating (7) with respect to t. By the chain rule,
$$\frac{\partial f}{\partial x}\,x' + \frac{\partial f}{\partial y}\,y' + \frac{\partial f}{\partial z}\,z' = (\operatorname{grad} f) \cdot \mathbf{r}' = 0.$$
Hence grad f is orthogonal to all the vectors r′ in the tangent plane, so that it is a normal vector of S at P. Our result is as follows (see Fig. 216).
THEOREM 2 Gradient as Surface Normal Vector
Let f be a differentiable scalar function in space. Let f(x, y, z) = c = const represent a surface S. Then if the gradient of f at a point P of S is not the zero vector, it is a normal vector of S at P.
EXAMPLE 2 Gradient as Surface Normal Vector. Cone
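Find a unit normal vector n of the cone of revolution z² = 4(x² + y²) at the point P: (1, 0, 2).
Solution. The cone is the level surface f = 0 of f(x, y, z) = 4(x² + y²) − z². Thus (Fig. 217)
$$\operatorname{grad} f = [8x,\; 8y,\; -2z], \qquad \operatorname{grad} f(P) = [8,\; 0,\; -4], \qquad \mathbf{n} = \frac{1}{|\operatorname{grad} f(P)|}\,\operatorname{grad} f(P) = \left[\frac{2}{\sqrt{5}},\; 0,\; -\frac{1}{\sqrt{5}}\right].$$
n points downward since it has a negative z-component. The other unit normal vector of the cone at P is −n.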
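A short sketch of the same computation follows; the function names are hypothetical. The gradient of f(x, y, z) = 4(x² + y²) − z² is evaluated at P and then normalized to unit length.

```python
import math

def grad_cone(x, y, z):
    """Gradient of f(x, y, z) = 4(x^2 + y^2) - z^2."""
    return (8 * x, 8 * y, -2 * z)

def unit_normal(grad):
    """Normalize a nonzero gradient vector to obtain a unit surface normal."""
    length = math.sqrt(sum(c * c for c in grad))
    if length == 0.0:
        raise ValueError("zero gradient: no normal direction is determined")
    return tuple(c / length for c in grad)

n = unit_normal(grad_cone(1.0, 0.0, 2.0))
print(n)  # approximately (0.894, 0.0, -0.447)
```

At the apex (0, 0, 0) the gradient is the zero vector, so the function raises an error there, in line with the exception noted for the apex of the cone.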
At the beginning of this section we mentioned that some vector fields have the advantage that they can be obtained from scalar fields, which can be worked with more easily. Such a vector field is given by a vector function v(P), which is obtained as the gradient of a scalar function, say, v(P) = grad f(P). The function f(P) is called a potential function or a potential of v(P). Such a v(P) and the corresponding vector field are called conservative because in such a vector field, energy is conserved; that is, no energy is lost (or gained) in displacing a body (or a charge in the case of an electrical field) from a point P to another point in the field and back to P. We show this in Sec. 10.2.
Conservative fields play a central role in physics and engineering. A basic application concerns the gravitational force (see Example 3 in Sec. 9.4) and we show that it has a potential which satisfies Laplace's equation, the most important partial differential equation in physics and its applications.
THEOREM 3 Gravitational Field. Laplace's Equation
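The force of attraction
$$\mathbf{p} = -\frac{c}{r^3}\,\mathbf{r} = -c\left[\frac{x - x_0}{r^3},\ \frac{y - y_0}{r^3},\ \frac{z - z_0}{r^3}\right] \tag{8}$$
between two particles at points P0: (x0, y0, z0) and P: (x, y, z) (as given by Newton's law of gravitation) has the potential f(x, y, z) = c/r, where r (> 0) is the distance between P0 and P.
Thus p = grad f = grad (c/r). This potential is a solution of Laplace's equation
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 0. \tag{9}$$
[∇²f (read nabla squared f) is called the Laplacian of f.]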
PROOF
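That distance is r = √((x − x0)² + (y − y0)² + (z − z0)²). The key observation now is that for the components of p = [p1, p2, p3] we obtain by partial differentiation
$$\frac{\partial}{\partial x}\left(\frac{1}{r}\right) = \frac{-2(x - x_0)}{2\left[(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2\right]^{3/2}} = -\frac{x - x_0}{r^3} \tag{10a}$$
and similarly
$$\frac{\partial}{\partial y}\left(\frac{1}{r}\right) = -\frac{y - y_0}{r^3}, \qquad \frac{\partial}{\partial z}\left(\frac{1}{r}\right) = -\frac{z - z_0}{r^3}. \tag{10b}$$
From this we see that, indeed, p is the gradient of the scalar function f = c/r. The second statement of the theorem follows by partially differentiating (10), that is,
$$\frac{\partial^2}{\partial x^2}\left(\frac{1}{r}\right) = -\frac{1}{r^3} + \frac{3(x - x_0)^2}{r^5}, \qquad \frac{\partial^2}{\partial y^2}\left(\frac{1}{r}\right) = -\frac{1}{r^3} + \frac{3(y - y_0)^2}{r^5}, \qquad \frac{\partial^2}{\partial z^2}\left(\frac{1}{r}\right) = -\frac{1}{r^3} + \frac{3(z - z_0)^2}{r^5},$$
and then adding these three expressions. Their common denominator is r⁵. Hence the three terms −1/r³ contribute −3r² to the numerator, and the three other terms give the sum
$$3(x - x_0)^2 + 3(y - y_0)^2 + 3(z - z_0)^2 = 3r^2,$$
so that the numerator is 0, and we obtain (9).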
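As a plausibility check of the proof (not a substitute for it), one can approximate the Laplacian of f = c/r by central differences and observe that it is close to zero away from P0. The sketch below assumes c = 1 and P0 at the origin; the helper names and the step size h are choices made here for illustration only.

```python
import math

def potential(x, y, z, c=1.0, p0=(0.0, 0.0, 0.0)):
    """f = c/r with r the distance from p0 (c and p0 are illustrative values)."""
    r = math.sqrt((x - p0[0]) ** 2 + (y - p0[1]) ** 2 + (z - p0[2]) ** 2)
    return c / r

def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference approximation of f_xx + f_yy + f_zz."""
    return (
        f(x + h, y, z) + f(x - h, y, z)
        + f(x, y + h, z) + f(x, y - h, z)
        + f(x, y, z + h) + f(x, y, z - h)
        - 6.0 * f(x, y, z)
    ) / (h * h)

# Any point with r > 0 will do; this one is arbitrary.
value = laplacian(potential, 1.0, 2.0, -0.5)
print(value)  # small in magnitude, limited by discretization and rounding
```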
∇²f is also denoted by Δf. The differential operator
$$\nabla^2 = \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{11}$$
(read “nabla squared” or “delta”) is called the Laplace operator. It can be shown that the field of force produced by any distribution of masses is given by a vector function that is the gradient of a scalar function f, and f satisfies (9) in any region that is free of matter.
The great importance of the Laplace equation also results from the fact that there are other laws in physics that are of the same form as Newton's law of gravitation. For instance, in electrostatics the force of attraction (or repulsion) between two particles of opposite (or like) charge Q1 and Q2 is
$$\mathbf{p} = \frac{k}{r^3}\,\mathbf{r} \qquad \text{(Coulomb's law)},$$
where the constant k depends on the charges Q1, Q2 and on the medium.
Laplace's equation will be discussed in detail in Chaps. 12 and 18.
A method for finding out whether a given vector field has a potential will be explained in Sec. 9.9.
1–6 CALCULATION OF GRADIENTS
Find grad f. Graph some level curves f = const. Indicate ∇f by arrows at some points of these curves.
7–10 USEFUL FORMULAS FOR GRADIENT AND LAPLACIAN
Prove and illustrate by an example.
11–15 USE OF GRADIENTS. ELECTRIC FORCE
The force in an electrostatic field given by f(x, y, z) has the direction of the gradient. Find ∇f and its value at P.
18–23 VELOCITY FIELDS
Given the velocity potential f of a flow, find the velocity v = ∇f of the field and its value v(P) at P. Sketch v(P) and the curve f = const passing through P.
24–27 HEAT FLOW
Experiments show that in a temperature field, heat flows in the direction of maximum decrease of temperature T. Find this direction in general and at the given point P. Sketch that direction at P as an arrow.
Vector calculus owes much of its importance in engineering and physics to the gradient, divergence, and curl. From a scalar field we can obtain a vector field by the gradient (Sec. 9.7). Conversely, from a vector field we can obtain a scalar field by the divergence or another vector field by the curl (to be discussed in Sec. 9.9). These concepts were suggested by basic physical applications. This will be evident from our examples.
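To begin, let v(x, y, z) be a differentiable vector function, where x, y, z are Cartesian coordinates, and let υ1, υ2, υ3 be the components of v. Then the function
$$\operatorname{div} \mathbf{v} = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} + \frac{\partial v_3}{\partial z} \tag{1}$$
is called the divergence of v or the divergence of the vector field defined by v. For example, if
$$\mathbf{v} = [3xz,\ 2xy,\ -yz^2] = 3xz\,\mathbf{i} + 2xy\,\mathbf{j} - yz^2\,\mathbf{k}, \qquad \text{then} \qquad \operatorname{div} \mathbf{v} = 3z + 2x - 2yz.$$
Another common notation for the divergence is
$$\operatorname{div} \mathbf{v} = \nabla \cdot \mathbf{v} = \left[\frac{\partial}{\partial x},\ \frac{\partial}{\partial y},\ \frac{\partial}{\partial z}\right] \cdot [v_1,\ v_2,\ v_3] = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} + \frac{\partial v_3}{\partial z},$$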
with the understanding that the “product” (∂/∂x)υ1 in the dot product means the partial derivative ∂υ1/∂x, etc. This is a convenient notation, but nothing more. Note that ∇ • v means the scalar div v, whereas ∇f means the vector grad f defined in Sec. 9.7.
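The definition of the divergence as a sum of three partial derivatives translates directly into a finite-difference sketch. The field below, v = [3xz, 2xy, −yz²], is an illustrative choice whose divergence 3z + 2x − 2yz is easy to obtain by hand; the function names and the step size h are assumptions of this sketch.

```python
def v(x, y, z):
    """Illustrative vector field v = [3xz, 2xy, -y z^2]."""
    return (3.0 * x * z, 2.0 * x * y, -y * z * z)

def divergence(field, x, y, z, h=1e-5):
    """Central-difference approximation of dv1/dx + dv2/dy + dv3/dz."""
    dv1_dx = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2 * h)
    dv2_dy = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2 * h)
    dv3_dz = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2 * h)
    return dv1_dx + dv2_dy + dv3_dz

x, y, z = 1.0, 2.0, 3.0
numeric = divergence(v, x, y, z)
exact = 3.0 * z + 2.0 * x - 2.0 * y * z  # hand-computed divergence at (1, 2, 3)
print(numeric, exact)
```

Note that the result is a single number at each point (a scalar), in contrast to the gradient, which is a vector.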
In Example 2 we shall see that the divergence has an important physical meaning. Clearly, the values of a function that characterizes a physical or geometric property must be independent of the particular choice of coordinates. In other words, these values must be invariant with respect to coordinate transformations. Accordingly, the following theorem should hold.
THEOREM 1 Invariance of the Divergence
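The divergence div v is a scalar function, that is, its values depend only on the points in space (and, of course, on v) but not on the choice of the coordinates in (1), so that with respect to other Cartesian coordinates x*, y*, z* and corresponding components υ1*, υ2*, υ3* of v,
$$\operatorname{div} \mathbf{v} = \frac{\partial v_1^*}{\partial x^*} + \frac{\partial v_2^*}{\partial y^*} + \frac{\partial v_3^*}{\partial z^*}. \tag{2}$$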
We shall prove this theorem in Sec. 10.7, using integrals.
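Presently, let us turn to the more immediate practical task of gaining a feel for the significance of the divergence. Let f(x, y, z) be a twice differentiable scalar function. Then its gradient exists,
$$\mathbf{v} = \operatorname{grad} f = \left[\frac{\partial f}{\partial x},\ \frac{\partial f}{\partial y},\ \frac{\partial f}{\partial z}\right] = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k},$$
and we can differentiate once more, the first component with respect to x, the second with respect to y, the third with respect to z, and then form the divergence,
$$\operatorname{div} \mathbf{v} = \operatorname{div}(\operatorname{grad} f) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.$$
Hence we have the basic result that the divergence of the gradient is the Laplacian (Sec. 9.7),
$$\operatorname{div}(\operatorname{grad} f) = \nabla^2 f. \tag{3}$$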
EXAMPLE 1 Gravitational Force. Laplace's Equation
The gravitational force p in Theorem 3 of the last section is the gradient of the scalar function f(x, y, z) = c/r, which satisfies Laplace's equation ∇²f = 0. According to (3) this implies that div p = 0 (r > 0).
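A short worked instance may help to see the relation between the divergence of the gradient and the Laplacian on a function that is not harmonic. The function f = x²y + z³ is an arbitrary illustrative choice, not one taken from the text.

```latex
% Illustrative function (an assumption of this sketch): f = x^2 y + z^3
\begin{align*}
\operatorname{grad} f &= \left[\,2xy,\ x^2,\ 3z^2\,\right], \\
\operatorname{div}(\operatorname{grad} f)
  &= \frac{\partial}{\partial x}(2xy) + \frac{\partial}{\partial y}(x^2)
     + \frac{\partial}{\partial z}(3z^2)
   = 2y + 0 + 6z, \\
\nabla^2 f &= f_{xx} + f_{yy} + f_{zz} = 2y + 0 + 6z .
\end{align*}
```

Both routes give 2y + 6z, so this f satisfies Laplace's equation only on the plane y = −3z, whereas the gravitational potential c/r satisfies it everywhere for r > 0.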
The following example from hydrodynamics shows the physical significance of the divergence of a vector field. We shall get back to this topic in Sec. 10.8 and add further physical details.
EXAMPLE 2 Flow of a Compressible Fluid. Physical Meaning of the Divergence
We consider the motion of a fluid in a region R having no sources or sinks in R, that is, no points at which fluid is produced or disappears. The concept of fluid state is meant to cover also gases and vapors. Fluids in the restricted sense, or liquids, such as water or oil, have very small compressibility, which can be neglected in many problems. In contrast, gases and vapors have high compressibility. Their density ρ (= mass per unit volume) depends on the coordinates x, y, z in space and may also depend on time t. We assume that our fluid is compressible. We consider the flow through a rectangular box B of small edges Δx, Δy, Δz parallel to the coordinate axes as shown in Fig. 218. (Here Δ is a standard notation for small quantities and, of course, has nothing to do with the notation for the Laplacian in (11) of Sec. 9.7.) The box B has the volume ΔV = Δx Δy Δz. Let v = [υ1, υ2, υ3] = υ1i + υ2j + υ3k be the velocity vector of the motion. We set
$$\mathbf{u} = \rho\mathbf{v} = [u_1,\ u_2,\ u_3] = u_1\mathbf{i} + u_2\mathbf{j} + u_3\mathbf{k} \tag{4}$$
and assume that u and v are continuously differentiable vector functions of x, y, z, and t, that is, they have first partial derivatives which are continuous. Let us calculate the change in the mass included in B by considering the flux across the boundary, that is, the total loss of mass leaving B per unit time. Consider the flow through the left of the three faces of B that are visible in Fig. 218, whose area is Δx Δz. Since the vectors υ1i and υ3k are parallel to that face, the components υ1 and υ3 of v contribute nothing to this flow. Hence the mass of fluid entering through that face during a short time interval Δt is given approximately by
$$(\rho v_2)_y\,\Delta x\,\Delta z\,\Delta t = (u_2)_y\,\Delta x\,\Delta z\,\Delta t,$$
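where the subscript y indicates that this expression refers to the left face. The mass of fluid leaving the box B through the opposite face during the same time interval is approximately $(u_2)_{y+\Delta y}\,\Delta x\,\Delta z\,\Delta t$, where the subscript y + Δy indicates that this expression refers to the right face (which is not visible in Fig. 218). The difference
$$\Delta u_2\,\Delta x\,\Delta z\,\Delta t = \frac{\Delta u_2}{\Delta y}\,\Delta V\,\Delta t \qquad \left[\Delta u_2 = (u_2)_{y+\Delta y} - (u_2)_y\right]$$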
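is the approximate loss of mass. Two similar expressions are obtained by considering the other two pairs of parallel faces of B. If we add these three expressions, we find that the total loss of mass in B during the time interval Δt is approximately
$$\left(\frac{\Delta u_1}{\Delta x} + \frac{\Delta u_2}{\Delta y} + \frac{\Delta u_3}{\Delta z}\right)\Delta V\,\Delta t,$$
where
$$\Delta u_1 = (u_1)_{x+\Delta x} - (u_1)_x \qquad \text{and} \qquad \Delta u_3 = (u_3)_{z+\Delta z} - (u_3)_z.$$
This loss of mass in B is caused by the time rate of change of the density and is thus equal to
$$-\frac{\partial \rho}{\partial t}\,\Delta V\,\Delta t.$$
If we equate both expressions, divide the resulting equation by ΔV Δt, and let Δx, Δy, Δz, and Δt approach zero, then we obtain
$$\operatorname{div} \mathbf{u} = \operatorname{div}(\rho\mathbf{v}) = -\frac{\partial \rho}{\partial t}$$
or
$$\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = 0. \tag{5}$$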
This important relation is called the condition for the conservation of mass or the continuity equation of a compressible fluid flow.
If the flow is steady, that is, independent of time, then ∂ρ/∂t = 0 and the continuity equation is
$$\operatorname{div}(\rho\mathbf{v}) = 0. \tag{6}$$
If the density ρ is constant, so that the fluid is incompressible, then equation (6) becomes
$$\operatorname{div} \mathbf{v} = 0. \tag{7}$$
This relation is known as the condition of incompressibility. It expresses the fact that the balance of outflow and inflow for a given volume element is zero at any time. Clearly, the assumption that the flow has no sources or sinks in R is essential to our argument. v is also referred to as solenoidal.
From this discussion you should conclude and remember that, roughly speaking, the divergence measures outflow minus inflow.
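The box argument of Example 2 can be imitated numerically: add up outflow minus inflow through the three pairs of faces of a small box and compare the result, per unit volume, with the divergence at the center of the box. The field u = [xy, yz, zx] and all names below are hypothetical choices for illustration; the face values are sampled at face centers, which is one of several reasonable approximations.

```python
def u(x, y, z):
    """Hypothetical mass-flux field u = rho * v, chosen only for illustration."""
    return (x * y, y * z, z * x)

def div_u(x, y, z):
    """Hand-computed divergence of u: d(xy)/dx + d(yz)/dy + d(zx)/dz."""
    return y + z + x

def net_outflow_rate(field, corner, dx, dy, dz):
    """Outflow minus inflow per unit time through the six faces of a box,
    using the normal component of the field at each face center."""
    x, y, z = corner
    xm, ym, zm = x + dx / 2, y + dy / 2, z + dz / 2
    out_x = (field(x + dx, ym, zm)[0] - field(x, ym, zm)[0]) * dy * dz
    out_y = (field(xm, y + dy, zm)[1] - field(xm, y, zm)[1]) * dx * dz
    out_z = (field(xm, ym, z + dz)[2] - field(xm, ym, z)[2]) * dx * dy
    return out_x + out_y + out_z

corner = (1.0, 2.0, 3.0)
d = 1e-3                      # edge length of the small box
volume = d ** 3
flux = net_outflow_rate(u, corner, d, d, d)
center = tuple(c + d / 2 for c in corner)
print(flux / volume, div_u(*center))  # the two numbers should nearly agree
```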
Comment. The divergence theorem of Gauss, an integral theorem involving the divergence, follows in the next chapter (Sec. 10.7).
1–6 CALCULATION OF THE DIVERGENCE
Find div v and its value at P.
Verify (b) for f = e^(xyz) and v = axi + byj + czk. Obtain the answer to Prob. 6 from (b). Verify (c) for f = x² − y² and g = e^(x+y). Give examples of your own for which (a)–(d) are advantageous.
15–20 LAPLACIAN
Calculate ∇²f by Eq. (3). Check by direct differentiation. Indicate when (3) is simpler. Show the details of your work.
The concepts of gradient (Sec. 9.7), divergence (Sec. 9.8), and curl are of fundamental importance in vector calculus and frequently applied in vector fields. In this section we define and discuss the concept of the curl and apply it to several engineering problems.
Let v(x, y, z) = [υ1, υ2, υ3] = υ1i + υ2j + υ3k be a differentiable vector function of the Cartesian coordinates x, y, z. Then the curl of the vector function v or of the vector field given by v is defined by the “symbolic” determinant
$$\operatorname{curl} \mathbf{v} = \nabla \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ v_1 & v_2 & v_3 \end{vmatrix} = \left(\frac{\partial v_3}{\partial y} - \frac{\partial v_2}{\partial z}\right)\mathbf{i} + \left(\frac{\partial v_1}{\partial z} - \frac{\partial v_3}{\partial x}\right)\mathbf{j} + \left(\frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}\right)\mathbf{k}. \tag{1}$$
This is the formula when x, y, z are right-handed. If they are left-handed, the determinant has a minus sign in front (just as in (2**) in Sec. 9.3).
Instead of curl v one also uses the notation rot v. This is suggested by “rotation,” an application explored in Example 2. Note that curl v is a vector, as shown in Theorem 3.
EXAMPLE 1 Curl of a Vector Function
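Let v = [yz, 3zx, z] = yzi + 3zxj + zk with right-handed x, y, z. Then (1) gives
$$\operatorname{curl} \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ yz & 3zx & z \end{vmatrix} = -3x\,\mathbf{i} + y\,\mathbf{j} + (3z - z)\,\mathbf{k} = -3x\,\mathbf{i} + y\,\mathbf{j} + 2z\,\mathbf{k}.$$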
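The componentwise evaluation of (1) for this field can be cross-checked numerically. The sketch below approximates each partial derivative by a central difference and compares the result with the hand-computed components −3x, y, 2z at one sample point; the helper names `partial` and `curl` and the step size are assumptions of this sketch.

```python
def v(x, y, z):
    """Field of Example 1: v = [yz, 3zx, z]."""
    return (y * z, 3.0 * z * x, z)

def partial(field, component, axis, point, h=1e-5):
    """Central difference of one component along one coordinate axis."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    return (field(*plus)[component] - field(*minus)[component]) / (2 * h)

def curl(field, point):
    """Components of curl v in right-handed Cartesian coordinates."""
    return (
        partial(field, 2, 1, point) - partial(field, 1, 2, point),
        partial(field, 0, 2, point) - partial(field, 2, 0, point),
        partial(field, 1, 0, point) - partial(field, 0, 1, point),
    )

point = (1.0, 2.0, 3.0)
numeric = curl(v, point)
exact = (-3.0 * point[0], point[1], 2.0 * point[2])  # [-3x, y, 2z] by hand
print(numeric, exact)
```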
The curl has many applications. A typical example follows. More about the nature and significance of the curl will be considered in Sec. 10.9.
EXAMPLE 2 Rotation of a Rigid Body. Relation to the Curl
We have seen in Example 5, Sec. 9.3, that a rotation of a rigid body B about a fixed axis in space can be described by a vector w of magnitude ω in the direction of the axis of rotation, where ω (> 0) is the angular speed of the rotation, and w is directed so that the rotation appears clockwise if we look in the direction of w. According to (9), Sec. 9.3, the velocity field of the rotation can be represented in the form
$$\mathbf{v} = \mathbf{w} \times \mathbf{r}$$
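where r is the position vector of a moving point with respect to a Cartesian coordinate system having the origin on the axis of rotation. Let us choose right-handed Cartesian coordinates such that the axis of rotation is the z-axis. Then (see Example 2 in Sec. 9.4)
$$\mathbf{w} = [0,\ 0,\ \omega] = \omega\mathbf{k}, \qquad \mathbf{v} = \mathbf{w} \times \mathbf{r} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 0 & \omega \\ x & y & z \end{vmatrix} = [-\omega y,\ \omega x,\ 0] = -\omega y\,\mathbf{i} + \omega x\,\mathbf{j}.$$
Hence
$$\operatorname{curl} \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ -\omega y & \omega x & 0 \end{vmatrix} = [0,\ 0,\ 2\omega] = 2\omega\mathbf{k} = 2\mathbf{w}.$$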
This proves the following theorem.
THEOREM 1 Rotating Body and Curl
The curl of the velocity field of a rotating rigid body has the direction of the axis of the rotation, and its magnitude equals twice the angular speed of the rotation.
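Theorem 1 can be tried out on a rotation about an axis that is not a coordinate axis. In the sketch below the rotation vector w and the sample point are arbitrary illustrative values, the velocity field is v = w × r, and the curl is approximated by central differences; the expectation is curl v ≈ 2w at every point.

```python
def cross(a, b):
    """Cross product in right-handed Cartesian coordinates."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

w = (0.3, -1.2, 2.0)  # illustrative rotation vector through the origin

def velocity(x, y, z):
    """Velocity field of the rigid rotation, v = w x r."""
    return cross(w, (x, y, z))

def partial(field, component, axis, point, h=1e-5):
    """Central difference of one component along one coordinate axis."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    return (field(*plus)[component] - field(*minus)[component]) / (2 * h)

def curl(field, point):
    return (
        partial(field, 2, 1, point) - partial(field, 1, 2, point),
        partial(field, 0, 2, point) - partial(field, 2, 0, point),
        partial(field, 1, 0, point) - partial(field, 0, 1, point),
    )

result = curl(velocity, (0.7, -0.4, 1.5))
print(result)  # close to (0.6, -2.4, 4.0), that is, 2w
```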
Next we show how the grad, div, and curl are interrelated, thereby shedding further light on the nature of the curl.
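THEOREM 2 Grad, Div, Curl
Gradient fields are irrotational. That is, if a continuously differentiable vector function is the gradient of a scalar function f, then its curl is the zero vector,
$$\operatorname{curl}(\operatorname{grad} f) = \mathbf{0}. \tag{2}$$
Furthermore, the divergence of the curl of a twice continuously differentiable vector function v is zero,
$$\operatorname{div}(\operatorname{curl} \mathbf{v}) = 0. \tag{3}$$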
PROOF
Both (2) and (3) follow directly from the definitions by straightforward calculation. In the proof of (3) the six terms cancel in pairs.
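For readers who want to see the cancellation, the calculation can be written out as follows. It uses only the definitions of grad, div, and curl from this chapter and the equality of mixed second partial derivatives, which holds under the stated differentiability assumptions; subscripts on f denote partial derivatives in this sketch.

```latex
% First component of curl(grad f); the other two follow by cyclic permutation.
\begin{align*}
\left[\operatorname{curl}(\operatorname{grad} f)\right]_1
  &= \frac{\partial}{\partial y}\!\left(\frac{\partial f}{\partial z}\right)
   - \frac{\partial}{\partial z}\!\left(\frac{\partial f}{\partial y}\right)
   = f_{zy} - f_{yz} = 0, \\[4pt]
\operatorname{div}(\operatorname{curl}\mathbf{v})
  &= \frac{\partial}{\partial x}\!\left(\frac{\partial v_3}{\partial y}
       - \frac{\partial v_2}{\partial z}\right)
   + \frac{\partial}{\partial y}\!\left(\frac{\partial v_1}{\partial z}
       - \frac{\partial v_3}{\partial x}\right)
   + \frac{\partial}{\partial z}\!\left(\frac{\partial v_2}{\partial x}
       - \frac{\partial v_1}{\partial y}\right) \\
  &= \left(\frac{\partial^2 v_3}{\partial x\,\partial y}
       - \frac{\partial^2 v_3}{\partial y\,\partial x}\right)
   + \left(\frac{\partial^2 v_1}{\partial y\,\partial z}
       - \frac{\partial^2 v_1}{\partial z\,\partial y}\right)
   + \left(\frac{\partial^2 v_2}{\partial z\,\partial x}
       - \frac{\partial^2 v_2}{\partial x\,\partial z}\right) = 0 .
\end{align*}
```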
EXAMPLE 3 Rotational and Irrotational Fields
The field in Example 2 is not irrotational. A similar velocity field is obtained by stirring tea or coffee in a cup. The gravitational field in Theorem 3 of Sec. 9.7 has curl p = 0. It is an irrotational gradient field.
The term “irrotational” for curl v = 0 is suggested by the use of the curl for characterizing the rotation in a field. If a gradient field occurs elsewhere, not as a velocity field, it is usually called conservative (see Sec. 9.7). Relation (3) is plausible because of the interpretation of the curl as a rotation and of the divergence as a flux (see Example 2 in Sec. 9.8).
Finally, since the curl is defined in terms of coordinates, we should do what we did for the gradient in Sec. 9.7, namely, to find out whether the curl is a vector. This is true, as follows.
THEOREM 3 Invariance of the Curl
curl v is a vector. It has a length and a direction that are independent of the particular choice of a Cartesian coordinate system in space.
PROOF
The proof is quite involved and shown in App. 4.
We have completed our discussion of vector differential calculus. The companion Chap. 10 on vector integral calculus follows and makes use of many concepts covered in this chapter, including dot and cross products, parametric representation of curves C, along with grad, div, and curl.
Find curl v for v given with respect to right-handed Cartesian coordinates. Show the details of your work.
9–13 FLUID FLOW
Let v be the velocity vector of a steady fluid flow. Is the flow irrotational? Incompressible? Find the streamlines (the paths of the particles). Hint. See the answers to Probs. 9 and 11 for a determination of a path.
15–20 DIV AND CURL
With respect to right-handed coordinates, let u = [y, z, x], v = [yz, zx, xy], f = xyz, and g = x + y + z. Find the given expressions. Check your result by a formula in Proj. 14 if applicable.
11–19 ALGEBRAIC OPERATIONS FOR VECTORS
Let a = [4, 7, 0], b = [3, −1, 5], c = [−6, 2, 0] and d = [1, −2, 8]. Calculate the following expressions. Try to make a sketch.
32–40 GRAD, DIV, CURL, ∇², D_v f
Let f = xy − yz, v = [2y, 2x, 4x + z], and w = [3z², x² − y², y²]. Find:
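All vectors of the form a = [a1, a2, a3] = a1i + a2j + a3k constitute the real vector space R³ with componentwise vector addition
$$[a_1,\ a_2,\ a_3] + [b_1,\ b_2,\ b_3] = [a_1 + b_1,\ a_2 + b_2,\ a_3 + b_3]$$
and componentwise scalar multiplication (c a scalar, a real number)
$$c\,[a_1,\ a_2,\ a_3] = [ca_1,\ ca_2,\ ca_3].$$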
For instance, the resultant of forces a and b is the sum a + b.
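The inner product or dot product of two vectors is defined by
$$\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \cos\gamma = a_1 b_1 + a_2 b_2 + a_3 b_3$$
where γ is the angle between a and b. This gives for the norm or length |a| of a
$$|\mathbf{a}| = \sqrt{\mathbf{a} \cdot \mathbf{a}} = \sqrt{a_1^2 + a_2^2 + a_3^2}$$
as well as a formula for γ. If a • b = 0, we call a and b orthogonal. The dot product is suggested by the work W = p • d done by a force p in a displacement d.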
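The vector product or cross product v = a × b is a vector of length
$$|\mathbf{a} \times \mathbf{b}| = |\mathbf{a}|\,|\mathbf{b}| \sin\gamma$$
and perpendicular to both a and b such that a, b, v form a right-handed triple. In terms of components with respect to right-handed coordinates,
$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}.$$
The vector product is suggested, for instance, by moments of forces or by rotations. CAUTION! This multiplication is anticommutative, a × b = −b × a, and is not associative.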
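The component formulas for the dot and cross products, together with the caution about anticommutativity and non-associativity, can be tried out on the vectors a, b, c of the review problems above. The helper names `dot` and `cross` are choices made for this sketch.

```python
def dot(a, b):
    """Dot product a1*b1 + a2*b2 + a3*b3."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product with respect to right-handed Cartesian coordinates."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

a = (4, 7, 0)
b = (3, -1, 5)
c = (-6, 2, 0)

print(dot(a, b))             # 5
print(cross(a, b))           # (35, -20, -25)
print(cross(b, a))           # (-35, 20, 25): anticommutative
print(dot(a, cross(a, b)))   # 0: a x b is orthogonal to a
print(cross(cross(a, b), c)) # (50, 150, -50)
print(cross(a, cross(b, c))) # (0, 0, -50): differs, so not associative
```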
An (oblique) box with edges a, b, c has volume equal to the absolute value of the scalar triple product
$$(\mathbf{a}\ \ \mathbf{b}\ \ \mathbf{c}) = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = (\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c}.$$
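Sections 9.4–9.9 extend differential calculus to vector functions
$$\mathbf{v}(t) = [v_1(t),\ v_2(t),\ v_3(t)] = v_1(t)\,\mathbf{i} + v_2(t)\,\mathbf{j} + v_3(t)\,\mathbf{k}$$
and to vector functions of more than one variable (see below). The derivative of v(t) is
$$\mathbf{v}' = \frac{d\mathbf{v}}{dt} = \lim_{\Delta t \to 0} \frac{\mathbf{v}(t + \Delta t) - \mathbf{v}(t)}{\Delta t} = [v_1',\ v_2',\ v_3'] = v_1'\,\mathbf{i} + v_2'\,\mathbf{j} + v_3'\,\mathbf{k}.$$
Differentiation rules are as in calculus. They imply (Sec. 9.4)
$$(\mathbf{u} \cdot \mathbf{v})' = \mathbf{u}' \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{v}', \qquad (\mathbf{u} \times \mathbf{v})' = \mathbf{u}' \times \mathbf{v} + \mathbf{u} \times \mathbf{v}'.$$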
Curves C in space represented by the position vector r(t) have r′(t) as a tangent vector (the velocity in mechanics when t is time), r′(s) (s arc length, Sec. 9.5) as the unit tangent vector, and |r″ (s)| = κ as the curvature (the acceleration in mechanics).
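Vector functions v(x, y, z) = [υ1 (x, y, z), υ2 (x, y, z), υ3 (x, y, z)] represent vector fields in space. Partial derivatives with respect to the Cartesian coordinates x, y, z are obtained componentwise, for instance,
$$\frac{\partial \mathbf{v}}{\partial x} = \left[\frac{\partial v_1}{\partial x},\ \frac{\partial v_2}{\partial x},\ \frac{\partial v_3}{\partial x}\right] = \frac{\partial v_1}{\partial x}\,\mathbf{i} + \frac{\partial v_2}{\partial x}\,\mathbf{j} + \frac{\partial v_3}{\partial x}\,\mathbf{k}.$$
The gradient of a scalar function f is
$$\operatorname{grad} f = \nabla f = \left[\frac{\partial f}{\partial x},\ \frac{\partial f}{\partial y},\ \frac{\partial f}{\partial z}\right].$$
The directional derivative of f in the direction of a vector a is
$$D_{\mathbf{a}} f = \frac{df}{ds} = \frac{1}{|\mathbf{a}|}\,\mathbf{a} \cdot \nabla f.$$
The divergence of a vector function v is
$$\operatorname{div} \mathbf{v} = \nabla \cdot \mathbf{v} = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} + \frac{\partial v_3}{\partial z}.$$
The curl of v is
$$\operatorname{curl} \mathbf{v} = \nabla \times \mathbf{v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ v_1 & v_2 & v_3 \end{vmatrix}$$
or minus the determinant if the coordinates are left-handed.
Some basic formulas for grad, div, curl are (Secs. 9.7–9.9)
$$\nabla(fg) = f\,\nabla g + g\,\nabla f, \qquad \nabla(f/g) = \frac{1}{g^2}\,(g\,\nabla f - f\,\nabla g),$$
$$\operatorname{div}(f\mathbf{v}) = f \operatorname{div} \mathbf{v} + \mathbf{v} \cdot \nabla f, \qquad \operatorname{div}(f\,\nabla g) = f\,\nabla^2 g + \nabla f \cdot \nabla g,$$
$$\nabla^2 f = \operatorname{div}(\nabla f), \qquad \nabla^2(fg) = g\,\nabla^2 f + 2\,\nabla f \cdot \nabla g + f\,\nabla^2 g,$$
$$\operatorname{curl}(f\mathbf{v}) = \nabla f \times \mathbf{v} + f \operatorname{curl} \mathbf{v}, \qquad \operatorname{div}(\mathbf{u} \times \mathbf{v}) = \mathbf{v} \cdot \operatorname{curl} \mathbf{u} - \mathbf{u} \cdot \operatorname{curl} \mathbf{v},$$
$$\operatorname{curl}(\nabla f) = \mathbf{0}, \qquad \operatorname{div}(\operatorname{curl} \mathbf{v}) = 0.$$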
For grad, div, curl, and ∇2 in curvilinear coordinates see App. A3.4.
1 Named after the French philosopher and mathematician RENATUS CARTESIUS, latinized for RENÉ DESCARTES (1596–1650), who invented analytic geometry. His basic work Géométrie appeared in 1637, as an appendix to his Discours de la méthode.
2 LUDWIG OTTO HESSE (1811–1874), German mathematician who contributed to the theory of curves and surfaces.
3 GUSTAVE GASPARD CORIOLIS (1792–1843), French engineer who did research in mechanics.
4 Named after ARCHIMEDES (c. 287–212 B.C.), DESCARTES (Sec. 9.1), DIOCLES (200 B.C.), MACLAURIN (Sec. 15.4), NICOMEDES (250? B.C.), ÉTIENNE PASCAL (1588–1651), father of BLAISE PASCAL (1623–1662).
5 JEAN-FRÉDÉRIC FRENET (1816–1900), French mathematician.
6 CHARLES AUGUSTIN DE COULOMB (1736–1806), French physicist and engineer. Coulomb's law was derived by him from his own very precise measurements.