The discussion of functions of one real variable is continued in this chapter by considering three classes of functions that play an important role in physical sciences. They are: simple algebraic functions; trigonometric functions; logarithms and exponentials.
Here we discuss polynomials and the more complicated functions that can be defined in terms of them.
The polynomial function has the general form
$$P_n(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = \sum_{i=0}^{n} a_i x^i, \tag{2.1}$$
where $a_i$ ($i = 0, 1, 2, \ldots, n$) are constants with $a_n \neq 0$, n is a non-negative integer (i.e. including zero), and the symbol Σ means that a sum is to be taken of all terms labelled by the indices 0, 1, 2, …, n. The value of n defines the order (or degree) of the polynomial. The expression $x^3 - 3x^2 - 6x + 8$ plotted in Figure 1.1 is therefore a polynomial of order 3.
The roots of polynomials are defined as the solutions of the equation
$$P_n(x) = 0, \tag{2.2}$$
and correspond to the points where a graph of $P_n(x)$ crosses the x-axis. For first-order polynomials, (2.2) is a linear equation of the form
$$ax + b = 0, \tag{2.3}$$
where a and b are constants with $a \neq 0$. This has one root, which is trivially given by x = −b/a.
For second-order polynomials, (2.2) leads to a quadratic equation of the form
$$ax^2 + bx + c = 0, \tag{2.4}$$
where a, b and c are numerical constants. When the coefficients are simple, for example integers, one can sometimes spot that the quadratic form factorises, i.e.
$$ax^2 + bx + c = a(x - \alpha)(x - \beta) = 0, \tag{2.5}$$
where α and β are real numbers. The solutions are then clearly $x_1 = \alpha$ and $x_2 = \beta$. For example, $2x^2 - 7x + 3 = (2x - 1)(x - 3)$ and so the roots of this polynomial are $x = 1/2$ and $x = 3$. In some circumstances a solution can be ‘lost’ if care is not taken. For example, the equation $x^2 - 5x = 0$ factorises to x(x − 5) = 0, with solutions x = 0 and x = 5. However, had we divided both sides of the original equation by x, we would have found only the solution x = 5. So when manipulating the original equation, particularly when using division to simplify it, one should always check that no solutions are thereby being omitted.
If there is no obvious factorisation, the general solution of the quadratic is obtained by a process known as ‘completing the square’, as follows. By taking the constant c onto the right-hand side of the equation and then adding $b^2/4a$ to both sides, the left-hand side may be written as a perfect square. We have
$$ax^2 + bx + \frac{b^2}{4a} = a\left(x + \frac{b}{2a}\right)^2 = \frac{b^2}{4a} - c, \tag{2.6a}$$
and hence
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \tag{2.6b}$$
We thus see that for $b^2 \ge 4ac$ there are two real solutions, which coincide when $b^2 = 4ac$. If we denote these two solutions by α and β, then from (2.6b) one easily confirms that the polynomial can be written in the factorised form (2.5) and that
$$\alpha + \beta = -\frac{b}{a}, \qquad \alpha\beta = \frac{c}{a}. \tag{2.7}$$
These results are sometimes useful because a quadratic equation may be written
$$x^2 - (\text{sum of roots})\,x + (\text{product of roots}) = 0, \tag{2.8}$$
so that, for example, the polynomial with roots 2.1 and 3.2 is $x^2 - 5.30x + 6.72$. If $b^2 - 4ac < 0$, the argument of the square root in (2.6b) is negative, and (2.4) has no solutions for real x. For example, the quadratic equation $x^2 - 3x + k = 0$ has two real roots (which coincide when k = 9/4) if 9 − 4k ≥ 0, i.e. k ≤ 9/4, but no real roots if 9 − 4k < 0.
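The case analysis above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `real_quadratic_roots` is our own choice, and the formula used is the standard quadratic formula $x = [-b \pm \sqrt{b^2 - 4ac}]/2a$ obtained by completing the square.

```python
import math

def real_quadratic_roots(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0 as a sorted tuple.

    The tuple is empty if b**2 - 4ac < 0, has one entry if the two
    roots coincide, and two entries otherwise.
    """
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    disc = b * b - 4 * a * c          # argument of the square root
    if disc < 0:
        return ()                     # no real roots
    if disc == 0:
        return (-b / (2 * a),)        # coincident roots
    s = math.sqrt(disc)
    return tuple(sorted(((-b - s) / (2 * a), (-b + s) / (2 * a))))

# The example from the text: 2x^2 - 7x + 3 = (2x - 1)(x - 3)
roots = real_quadratic_roots(2, -7, 3)
print(roots)
```

For this example the sum and product of the returned roots equal −b/a and c/a, respectively, which gives a quick consistency check on any computed pair of roots.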
Exact solutions for cubic and quartic equations exist, so that the roots of third-order and fourth-order polynomials may be determined exactly. However, the solutions are algebraically very complicated and we will not pursue them further. Except in special cases, the roots of higher-order polynomials are found by approximate methods. However, one can establish some important general results. To obtain these we initially consider the result obtained by dividing a polynomial $P_n(x)$ of order n by a factor (x − a) using long division, until only a constant remainder is left. For example, on dividing $x^4 - 2x^3 + 3x^2 - 4x + 5$ by (x − 1) one obtains the quotient $x^3 - x^2 + 2x - 2$ and the remainder 3,
so that
$$x^4 - 2x^3 + 3x^2 - 4x + 5 = (x - 1)(x^3 - x^2 + 2x - 2) + 3.$$
More generally, dividing any polynomial of order n by (x − a) leads to an expression of the form
$$P_n(x) = (x - a)\,Q(x) + R, \tag{2.9}$$
where the quotient Q(x) is a polynomial of order (n − 1) and the remainder $R = P_n(a)$, as follows on setting x = a in (2.9). This result is called the remainder theorem and implies that if a = α is a root of $P_n(x)$, then R = 0 and (2.9) reduces to the partially factorised form
$$P_n(x) = (x - \alpha)\,P_{n-1}(x), \tag{2.10a}$$
where $P_{n-1}(x)$ is a polynomial of order n − 1. This is called the factor theorem. Furthermore, repeating the process for all m roots $\alpha_1, \alpha_2, \ldots, \alpha_m$ gives
$$P_n(x) = (x - \alpha_1)(x - \alpha_2)\cdots(x - \alpha_m)\,P_{n-m}(x), \tag{2.10b}$$
and since the highest power on the left is xn, there are at most n real roots. Thus a polynomial Pn(x) of order n has at most n real roots.1 Beyond this, one can only say that the number of roots is odd or even, corresponding to whether the order of the polynomial is odd or even, respectively, provided that if two or more factors in (2.10b) are equal, we still count them separately. This is most easily seen by considering a graph of the polynomial, in which the roots correspond to the values at which the curve intercepts the x-axis, as illustrated in Figure 1.1. The results then follow by considering the asymptotic behaviour of the polynomial as x → ±∞, which is dominated by the term anxn in (2.1). Hence if n is even, the polynomial has the same sign in the limits x → ±∞, and since it is continuous, it must either not cross the x-axis at all, corresponding to no roots, or cross it an even number of times, corresponding to an even number of roots. A similar argument shows that a polynomial whose order is odd must have at least one root and there can only be an odd number of roots.
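The long division by a linear factor (x − a) and the remainder theorem can be illustrated with a short Python sketch. It uses synthetic division, which is a compact way of organising the same long division; the function names `divide_by_linear` and `evaluate` are our own and are not taken from the text.

```python
def divide_by_linear(coeffs, a):
    """Divide a polynomial by (x - a) using synthetic division.

    coeffs lists the coefficients with the highest power first.
    Returns (quotient_coefficients, remainder).
    """
    values = [coeffs[0]]
    for c in coeffs[1:]:
        # each new entry is the next coefficient plus a times the previous entry
        values.append(c + a * values[-1])
    return values[:-1], values[-1]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x (nested multiplication)."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

# The example from the text: x^4 - 2x^3 + 3x^2 - 4x + 5 divided by (x - 1)
p = [1, -2, 3, -4, 5]
quotient, remainder = divide_by_linear(p, 1)
print(quotient, remainder)
```

The remainder returned by the division agrees with the value of the polynomial at x = 1, as the remainder theorem requires.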
We now return to the problem of finding the roots. As noted above, the general solutions for third- and fourth-order polynomials are very complicated, and for higher orders no general solution in terms of radicals exists. However, for simple cases it may still be possible to find exact solutions of higher-order polynomials by spotting factors and using the factor theorem. For example, consider the fourth-order polynomial
$$f(x) = 2x^4 - x^3 - 8x^2 + x + 6. \tag{2.11}$$
By inspection, f(1) = 0, so (x − 1) is a factor. To find the quotient Q(x) we need to carry out a long division, which yields $2x^3 + x^2 - 7x - 6$, so that
$$f(x) = (x - 1)(2x^3 + x^2 - 7x - 6). \tag{2.12}$$
We now repeat the process by finding factors (if they exist) of the cubic. The final result is
$$f(x) = (x - 1)(x + 1)(x - 2)(2x + 3), \tag{2.13}$$
so the solutions are $x = \pm 1$, $x = 2$ and $x = -3/2$.
Not all polynomials factorise, and in practice equations involving higher order polynomials are solved by approximate methods, either graphical or numerical. In the former, the function is plotted and the points where f(x) crosses the x-axis are found. In Figure 2.1 the function x3 − 3x2 − 4x + 7 is plotted and it is seen that it crosses the x-axis at the values x ≈ −1.7, 1.1 and 3.6, which are therefore the approximate solutions of the equation x3 − 3x2 − 4x + 7 = 0. It is worth reiterating that a polynomial of order n does not necessarily have n real roots, and so a graph of the function will not necessarily cut the x-axis at n points.
Although only approximate, the graphical solutions are still useful, as numerical methods for finding accurate roots often rely on knowing approximate solutions as starting values. One simple technique is the so-called bisection method, which can be applied to any continuous function. In this method one starts by finding two values of x, say $x_1$ and $x_2$, that straddle the position of a zero. Thus $f(x_1)$ and $f(x_2)$ will have opposite signs and so $f(x_1)f(x_2) < 0$. Now let $x_m = (x_1 + x_2)/2$ be the mid-point and calculate $f(x_m)$. If $f(x_1)f(x_m) < 0$, then the root lies between $x_1$ and $x_m$. In this case, the mid-point of $x_1$ and $x_m$ is found and the calculation repeated. If, however, $f(x_1)f(x_m) > 0$, then the root lies between $x_2$ and $x_m$, and in this case, the mid-point of $x_2$ and $x_m$ is found and the calculation repeated. This iterative method can be rapidly implemented on a computer, and when applied to any of the roots, the range of values of x that produces a value of f(x) as close to zero as desired may be found.
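A minimal sketch of the bisection method in Python is given below, applied to the cubic $x^3 - 3x^2 - 4x + 7$ of Figure 2.1. The function name `bisect`, the tolerance `tol` and the starting brackets are our own illustrative choices; the brackets were picked by reading the approximate roots off the graph.

```python
def bisect(f, x1, x2, tol=1e-10, max_iter=200):
    """Find a zero of a continuous function f between x1 and x2 by bisection."""
    f1, f2 = f(x1), f(x2)
    if f1 == 0:
        return x1
    if f2 == 0:
        return x2
    if f1 * f2 > 0:
        raise ValueError("f(x1) and f(x2) must have opposite signs")
    for _ in range(max_iter):
        xm = 0.5 * (x1 + x2)          # mid-point of the current interval
        fm = f(xm)
        if fm == 0 or 0.5 * abs(x2 - x1) < tol:
            return xm
        if f1 * fm < 0:
            x2 = xm                   # root lies between x1 and xm
        else:
            x1, f1 = xm, fm           # root lies between xm and x2
    return 0.5 * (x1 + x2)

def f(x):
    return x**3 - 3 * x**2 - 4 * x + 7

# Brackets chosen around the graphical estimates -1.7, 1.1 and 3.6
roots = [bisect(f, lo, hi) for lo, hi in [(-2, -1), (1, 2), (3, 4)]]
print(roots)
```

Each halving of the interval gains one binary digit of accuracy, so roughly 34 iterations reduce an interval of unit length to the tolerance used here.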
Given two polynomials P(x) and Q(x), we can form the rational function f(x), defined by f(x) ≡ P(x)/Q(x). These are generalisations of numerical fractions and, by analogy with those, the rational expression P/Q is said to be proper if the order of the numerator is less than the order of the denominator. Otherwise it is called an improper fractional expression. Examples are:
(2.14)
In contrast to polynomials, rational functions are not in general continuous, but can have discontinuities corresponding to the roots of the denominator function Q(x), that is, where Q(x) = 0, and so are undefined at those points. For example, the rational function
(2.15)
has discontinuities at x = ±1, where the denominator vanishes, as shown in Figure 1.6b.
Rational expressions where the denominator is itself the product of polynomials may often usefully be decomposed into a sum of simpler terms called partial fractions. Assume for the moment that the initial expression is a proper fraction. There are several possible forms this can take and we will look at each in turn, before illustrating them with specific examples.
The first form is
$$\frac{P(x)}{(x - a)(x - b)\cdots(x - n)}, \tag{2.16}$$
where a, b, …, n are distinct constants, and because the fraction is proper, P(x) is a polynomial of lower order than the product of factors in the denominator. In this case, we may write the identity
$$\frac{P(x)}{(x - a)(x - b)\cdots(x - n)} \equiv \frac{A}{x - a} + \frac{B}{x - b} + \cdots + \frac{N}{x - n}, \tag{2.17}$$
where A, B, …, N are constants. By putting the terms on the right-hand side over a common denominator, (2.17) may be written
$$P(x) \equiv A(x - b)\cdots(x - n) + B(x - a)(x - c)\cdots(x - n) + \cdots + N(x - a)(x - b)\cdots. \tag{2.18}$$
Because this is an identity, it is true for all values of x. Thus we can choose any values of x to evaluate it. So, in particular, if we choose x = a, x = b, … in turn, in each case all the terms on the right-hand side are zero except one and we can solve for the coefficients A, B, etc.
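The procedure of substituting x = a, x = b, … in turn can be sketched in Python for the case of distinct linear factors. The helper names and the sample fraction (3x + 5)/[(x − 1)(x + 2)] are our own illustrative choices, not taken from the text; exact rational arithmetic is used so that the coefficients come out as fractions.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate a polynomial (highest power first) at x."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

def partial_fraction_coefficients(numerator, roots):
    """Coefficients A, B, ... in P(x)/((x-a)(x-b)...) = A/(x-a) + B/(x-b) + ...

    numerator: coefficients of P(x), highest power first.
    roots: the distinct constants a, b, ... in the denominator factors.
    The fraction must be proper, i.e. the order of P is below len(roots).
    """
    if len(set(roots)) != len(roots):
        raise ValueError("the denominator factors must be distinct")
    if len(numerator) > len(roots):
        raise ValueError("the fraction must be proper")
    coefficients = []
    for i, r in enumerate(roots):
        # Setting x = r kills every term except the one belonging to r
        product = Fraction(1)
        for j, s in enumerate(roots):
            if j != i:
                product *= Fraction(r) - Fraction(s)
        coefficients.append(evaluate(numerator, Fraction(r)) / product)
    return coefficients

# (3x + 5) / ((x - 1)(x + 2)) = A/(x - 1) + B/(x + 2)
coeffs = partial_fraction_coefficients([3, 5], [1, -2])
print(coeffs)
```

Recombining the two terms over a common denominator reproduces the numerator 3x + 5, which is a useful check on any decomposition.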
In all the above cases the original fractional function was proper. If the fraction is improper, then an initial long division must be made to write it as the sum of a polynomial and a proper fraction. The latter is then decomposed into partial fractions as above.
Polynomials and rational functions are the simplest examples of a broader class of functions, called algebraic functions. An algebraic function is any function y that can be defined by an equation of the form
$$P^{(0)}(x)\,y^n + P^{(1)}(x)\,y^{n-1} + \cdots + P^{(n-1)}(x)\,y + P^{(n)}(x) = 0, \tag{2.19}$$
where $P^{(i)}(x)$ ($i = 0, 1, \ldots, n$) are given polynomials of any order. This definition is implicit, and for any x the function can be evaluated by first evaluating the polynomials $P^{(i)}(x)$, and then finding the roots of the resulting polynomial in y.
For n = 1, one easily sees that the above definition reduces to a rational function, or a polynomial in the case of P(0) = 1. More generally, it implies that any algebraic function can be defined in terms of a finite number of the basic operations of algebra (i.e. addition, subtraction, multiplication and division).
In contrast, functions that are not of the above form cannot be defined by a finite sequence of basic algebraic operations. Such functions are called transcendental functions and are somewhat analogous to irrational numbers, which cannot be evaluated from integers by a finite sequence of the operations of arithmetic. The functions to be discussed in the next two subsections – trigonometric functions, logarithms and exponential functions – are all examples of transcendental functions.
The trigonometric functions, sine, cosine and others (also called circular functions) have many applications. In particular, because of their periodic behaviour, they play a central role in the mathematical description of the phenomena of waves and oscillations that permeate the whole of physical science. Here we discuss their basic properties and some of their important applications in geometry.
Trigonometry is the study of angles, and before turning to the trigonometric functions themselves, it will be useful to consider angles and their use as co-ordinates. In doing so, we will make reference to Figure 2.2, which shows the angle of intersection θ between two lines OA, OB, together with a circle of radius r whose centre lies at the point of intersection.
One unit of angle is the degree, which is defined to be 1/360 of a complete rotation and is denoted 1°. Thus a right-angle corresponds to 90°. In scientific work it is more usual to work in terms of the radian, which is defined as the angle when the length l of the arc $P_0P_1$ is equal to the radius r. Since the circumference of a circle of radius r is 2πr and corresponds to an angle of 2π radians, it follows that the length of an arc of a circle of radius r that subtends an angle θ at the centre of the circle is
$$l = r\theta. \tag{2.21}$$
It also follows that 2π radians = 360 degrees, so that a right angle is π/2 radians and 1 radian ≈ 57.3°. In addition, the area of the corresponding sector shown in Figure 2.2 is
$$\text{area} = \tfrac{1}{2}\,r^2\theta. \tag{2.22}$$
We stress that, like many other equations in this book, (2.21) and (2.22) are only valid if the angles are expressed in radians. Unless stated otherwise it will be assumed from now on that all angles are expressed in radians.
Angles can also be used as co-ordinates, provided we adopt a convention to specify their sign. This is illustrated in Figure 2.3, where the position of the point P can be specified by the Cartesian co-ordinates (x, y) used in Section 1.3.1, or the plane polar co-ordinates (r, θ). Here, r > 0 is the distance of P from the origin O, with
$$r^2 = x^2 + y^2 \tag{2.23}$$
by Pythagoras' theorem, and θ is the angle between the line OP and the x-axis measured in a counter-clockwise sense. Thus in Figure 2.4a the point P corresponds to θ = −π/4, since OP is at an angle π/4 to the x-axis when measured in a clockwise direction. However, the polar angle is not unique, and P also corresponds to θ = 7π/4, since OP is at an angle 7π/4 to the x-axis when measured in the counter-clockwise direction, as shown in Figure 2.4b. In general, the points (r, θ) and (r, θ + 2nπ) correspond to the same point in the plane for any integer n. This is illustrated for the case n = 1 in Figures 2.4c and 2.4d.
The ambiguity in the value of the polar angle corresponding to a given point can be removed by restricting the range of θ to 0 ≤ θ < 2π. However, this is not always convenient. Consider, for example, a particle moving in a circular orbit of constant radius r with constant speed $v$, as shown in Figure 2.5.
Assuming that θ = 0 at time t = 0, the motion is described in polar co-ordinates by the simple equations
$$r = \text{constant}, \qquad \theta = \frac{vt}{r}, \tag{2.24}$$
where we have deduced the equation for θ from (2.21) together with the fact that the particle traverses a length of arc $l = vt$ in time t. The angle increases indefinitely as t increases and θ = 2nπ + φ, with 0 ≤ φ < 2π, corresponds to the particle arriving at the point (r, φ) after n complete revolutions since t = 0.
For angles between 0 and $\pi/2$, the sine and cosine functions (written ‘sin’ and ‘cos’) are defined in terms of the sides of a right-angled triangle by
$$\sin\theta = \frac{\text{opposite}}{\text{hypotenuse}}, \qquad \cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}},$$
and applying this to the triangle in Figure 2.3 we obtain
$$\sin\theta = \frac{y}{r}, \qquad \cos\theta = \frac{x}{r}, \tag{2.25}$$
where x and y are the projections of OP onto the x-axis and y-axis, respectively, and r is the length of OP. However, if we consider a point P rotating in a counter-clockwise direction about the origin at (0, 0), as shown in Figure 2.5, then (2.25) allows us to extend the definitions of sine and cosine to all angles provided the signs of x and y are taken into account. For example, in the fourth quadrant, $3\pi/2 < \theta < 2\pi$, we see from Figure 2.4(a) that x > 0 while y < 0, so that cos θ > 0 and sin θ < 0. More generally, as θ increases in Figure 2.5, one sees that x and y oscillate between r and −r, and hence cos θ and sin θ oscillate between −1 and +1, with a period of 2π corresponding to a single revolution. In other words,
$$-1 \le \sin\theta \le 1, \qquad -1 \le \cos\theta \le 1, \tag{2.26}$$
and they are periodic with a period of 2π, that is, the form of the function repeats at intervals of 2π, so that
$$\sin(\theta + 2\pi) = \sin\theta, \qquad \cos(\theta + 2\pi) = \cos\theta. \tag{2.27}$$
In addition, together with (2.23), the definitions (2.25) imply the important relation
$$\sin^2\theta + \cos^2\theta = 1 \tag{2.28}$$
for all values of θ.
The graphical forms of the sine and cosine functions follow from the definitions (2.25), together with the construction in Figure 2.5. They are shown in Figures 2.6a and 2.6b, respectively, and have a number of other general features which, in view of the enormous importance of these functions in physical science, are worth emphasising.
For the first quadrant, , this result follows from the construction of Figure 2.7, where P and P′ correspond to polar angles θ and . From this diagram, one easily sees that the triangles OAP and OBP′ are similar triangles and hence, since OP = OP′( = r), they are congruent, that is, identical if superimposed. Thus OC = BP′ = AC, and the result (2.31) follows. A similar construction works in the other three quadrants in 0 < θ < 2π, establishing the result for all angles.
Sine and cosine are not the only important circular functions, but the others can be defined in terms of them. In particular, we define the tangent and cotangent, written as ‘tan’ and ‘cot’, respectively, as
$$\tan\theta = \frac{\sin\theta}{\cos\theta}, \qquad \cot\theta = \frac{\cos\theta}{\sin\theta} = \frac{1}{\tan\theta}, \tag{2.32a}$$
and the secant and cosecant, written as ‘sec’ and ‘cosec’, by
$$\sec\theta = \frac{1}{\cos\theta}, \qquad \text{cosec}\,\theta = \frac{1}{\sin\theta}, \tag{2.32b}$$
which, together with (2.28), lead to the relations
$$\sec^2\theta = 1 + \tan^2\theta, \qquad \text{cosec}^2\theta = 1 + \cot^2\theta. \tag{2.33}$$
The behaviours of these functions follow from the behaviour of sine and cosine shown in Figures 2.6a and 2.6b. The functions tan θ and cot θ are plotted in Figures 2.8a and 2.8b, respectively. Like the sine and cosine functions, they are periodic, but with a period of π rather than 2π. However, unlike those functions, tan θ is unbounded and is discontinuous at the points $\theta = (2n + 1)\pi/2$, for n = 0, ± 1, ± 2, …, where cos θ vanishes. Similarly, cot θ is discontinuous at the points $\theta = n\pi$, where sin θ vanishes. The remaining circular functions may also be deduced from (2.28) and are shown in Figure 2.9.
In Section 1.3.1 we defined inverse functions. In the case of the circular functions this must be done with care, because it is clear from Figures 2.6, 2.8 and 2.9 that there are an infinite number of angles for a given value of sine, cosine or tangent. To obtain a single-valued function, we would therefore have to formally restrict the angular range of θ. The corresponding inverse circular functions for sine, cosine and tangent are shown for convenient choices in Figures 2.10a–2.10c, respectively. Using the notation of Section 1.3, it would be natural to refer to these as $\sin^{-1}$, $\cos^{-1}$ and $\tan^{-1}$, but to avoid ambiguity with $(\sin x)^{-1}$, etc. it is probably better to always use their alternative explicit names arcsin, arccos and arctan. An example of their use is furnished by the relation between the Cartesian co-ordinates (x, y) and the polar co-ordinates (r, θ) of Figure 2.3. From (2.25) we see that x and y are given in terms of r and θ by
$$x = r\cos\theta, \qquad y = r\sin\theta, \tag{2.34}$$
from which the relations $r^2 = x^2 + y^2$ and $\tan\theta = y/x$ directly follow. Hence r and θ are given in terms of x and y by the relations
$$r = \sqrt{x^2 + y^2}, \qquad \theta = \arctan\left(\frac{y}{x}\right), \tag{2.35}$$
respectively, where the quadrant of θ must be chosen to agree with the signs of x and y.
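The conversion between Cartesian and plane polar co-ordinates can be sketched in Python as follows. The function names are our own. Note that the sketch uses the standard library function `math.atan2(y, x)` rather than arctan(y/x) alone: arctan cannot distinguish, say, the second quadrant from the fourth, whereas `atan2` uses the signs of both arguments. The final reduction to the range 0 ≤ θ < 2π is one possible convention, assumed here for illustration.

```python
import math

def to_polar(x, y):
    """Return (r, theta) with 0 <= theta < 2*pi for the point (x, y)."""
    r = math.hypot(x, y)                         # sqrt(x**2 + y**2)
    theta = math.atan2(y, x) % (2 * math.pi)     # atan2 resolves the quadrant
    return r, theta

def to_cartesian(r, theta):
    """Return (x, y) from the polar co-ordinates (r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)

# A point in the fourth quadrant, at an angle pi/4 below the x-axis
r, theta = to_polar(1.0, -1.0)
print(r, theta)          # theta is 7*pi/4, equivalent to -pi/4
```

Converting the result back with `to_cartesian` recovers the original point, up to rounding error.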
Equation (2.33) is an example of a trigonometric identity. Here we list some of the most important identities, before commenting on their derivation and giving examples of their use. They are:
(2.36d)
(2.36e)
Specific useful cases that follow directly from (2.36a) (2.36b) and (2.36f) are the ‘double-angle’ formulas obtained by setting φ = θ:
and
(2.37b)
The analogous ‘half-angle’ formulas are
(2.37c)
and
(2.37d)
These identities can be proved by simple geometrical methods. To illustrate this we will prove (2.36a) by referring to Figure 2.11. From triangle ABC, we have
(2.38a)
since DC = EF. But from the triangles BDE and AEF,
(2.38b)
so that
Also, from triangle ABE,
(2.38d)
so finally, using these relations in (2.38c), we have
This derivation establishes (2.38e) for acute angles only, since this is what we have assumed in Figure 2.11. However, the proof can be extended to all angles. The result (2.36a) with a minus sign follows by letting φ → −φ and using the odd and even properties of the sine and cosine functions (2.29).
The rest of the formulas (2.36) follow from (2.30a) using our previous results. For example, to derive (2.36b) we write
using (2.30b) and (2.36a). Equation (2.36b) then follows, since
using (2.36a) and (2.29). The other identities follow in a similar way, and can be used to derive many more results, and solve trigonometric equations, as the following examples illustrate.
The trigonometric functions enable the discussion of co-ordinate geometry to be extended in a number of ways. One is to solve triangles, that is, to determine completely the lengths of their sides and the magnitude of all three angles. If two angles and one side, or two sides and a non-included angle are given, this can be done using the sine rule,
$$\frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C}, \tag{2.39}$$
where the definitions of the angles A, B and C and lengths of the sides a, b and c are specified in Figure 2.12.
Alternatively, if the three sides, or two sides and the included angle are known, we can use the cosine rule,
$$a^2 = b^2 + c^2 - 2bc\cos A, \tag{2.40a}$$
together with its permutations
$$b^2 = a^2 + c^2 - 2ac\cos B, \qquad c^2 = a^2 + b^2 - 2ab\cos C. \tag{2.40b}$$
In what follows we shall prove these rules and then illustrate their use by examples.
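As an illustration of solving a triangle, the following Python sketch treats the case of two sides and the included angle, using the cosine rule in its usual form $a^2 = b^2 + c^2 - 2bc\cos A$ with side a opposite angle A, and so on. The function name `solve_sas` and the sample values are our own. The second angle is also found from the cosine rule rather than the sine rule, because arcsin alone cannot tell an acute angle from its obtuse supplement.

```python
import math

def solve_sas(b, c, A):
    """Given sides b, c and the included angle A (radians), return (a, B, C)."""
    # Cosine rule for the unknown side opposite A
    a = math.sqrt(b * b + c * c - 2 * b * c * math.cos(A))
    # Cosine rule again, rearranged for the angle B opposite side b
    B = math.acos((a * a + c * c - b * b) / (2 * a * c))
    # The three angles of a triangle sum to pi
    C = math.pi - A - B
    return a, B, C

# A right-angled triangle with sides 3 and 4 enclosing the right angle
a, B, C = solve_sas(3.0, 4.0, math.pi / 2)
print(a, B, C)
```

The sine rule then provides an independent check, since a/sin A, b/sin B and c/sin C should all be equal.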
The sine and cosine rules can both be derived from the construction of Figure 2.13. To obtain the sine rule, we infer the length AP by using the definition of sine in the triangles PAB and PAC to give
(2.41a)
from which the second equality in (2.39) immediately follows. Applying the same argument to the triangles BCP′ and ACP′ gives
(2.41b)
where we have used (2.36a) and (2.30a) to show that sin (π − A) = sin A. The first equality in (2.39) then follows directly, completing the derivation of the sine rule.
To prove the cosine rule (2.40a), we apply Pythagoras' theorem to the triangle P′BC to obtain
where we have multiplied out the brackets and used (2.31). Since cos (π − A) = −cos A, from (2.36b) and (2.30b), this gives
and the cosine rule (2.40a). In a similar way, applying Pythagoras' theorem to the triangle PAC gives
thus establishing the first part of (2.40b). The second equation in (2.40b) follows by the same argument applied to the triangle PAB.
Finally, before giving examples of the application of the sine and cosine rules, we note that in proving them we have assumed that one of the angles A is obtuse.2 The corresponding proofs for the case of three acute angles are very similar and are left as exercises for the reader.
In this section we first define logarithms with respect to an arbitrary base and obtain the ‘laws of logarithms’. We then introduce the irrational number e as a favoured base, in order to discuss natural logarithms and exponentials, and the hyperbolic functions defined in terms of them.
In Section 1.1.2 we met expressions of the form $a^b$, where the index, or power, b was a rational number. For a > 0, this definition can be extended to irrational numbers, since, for example, $2^{\pi}$ can be evaluated to arbitrary precision by exploiting the fact that π itself, like any irrational number, can be approximated to arbitrary accuracy by a rational number.3 Since we wish to include irrational numbers in our discussion, we restrict ourselves to the case a > 0, when the expression $c = a^b$ is called an exponential expression with a the base and b the index. Conversely, b is called the logarithm of c to base a and is written $b = \log_a c$. To summarise,
$$c = a^b \iff b = \log_a c, \tag{2.42}$$
where the symbol ⇔ means ‘implies and is implied by’, that is, the expressions on either side of the symbol are equivalent. Graphs of $\log_a x$ for various integer values of a are shown in Figure 2.15.
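Logarithms obey a number of laws that are easily derived from the basic result (2.42). If we set
$$A' = \log_a A, \qquad B' = \log_a B, \qquad C = \log_a (AB), \tag{2.43a}$$
then
$$A = a^{A'}, \qquad B = a^{B'}, \qquad AB = a^{C}, \tag{2.43b}$$
and hence, from the result on indices (1.5), A′ + B′ = C, i.e.
$$\log_a (AB) = \log_a A + \log_a B, \tag{2.44a}$$
and in general
$$\log_a (ABC\cdots) = \log_a A + \log_a B + \log_a C + \cdots. \tag{2.44b}$$
By setting A = B = C = … in (2.44b), it follows that
$$\log_a A^n = n\log_a A, \tag{2.44c}$$
a result that holds also for fractional and negative values of n. In a similar way to the proof of (2.44b), we can show that
$$\log_a \left(\frac{A}{B}\right) = \log_a A - \log_a B. \tag{2.44d}$$
Finally, setting A = B in (2.44d) gives $\log_a 1 = 0$ for any base a. The results (2.44) are referred to as the laws of logarithms.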
These relations may be used to simplify expressions and solve equations involving logarithms, as we shall illustrate by examples below. They may also be used to derive the general formula for changing a logarithm from base a to base b. Thus, if $\log_a c = x$, then $c = a^x$. So $\log_b c = x\log_b a$ and
$$\log_a c = \frac{\log_b c}{\log_b a}, \tag{2.45}$$
which, for the special case b = c, implies
$$\log_a b = \frac{1}{\log_b a}. \tag{2.46}$$
Because the decimal system is so widespread, logarithms to base 10 are called common logarithms and the base is usually omitted. For example, log 7 = 0.845. In the binary system, it would be equally appropriate to use base 2, when
$$\log_2 7 = \frac{\log 7}{\log 2} = 2.807$$
by (2.45) and (2.46). However, it is usual instead to choose the irrational number e = 2.71828… as the base, for reasons to be explained in the next section.
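The change of base can be checked numerically with a short Python sketch. It assumes the standard change-of-base relation $\log_a c = \log_b c / \log_b a$ with b = 10; the function name `log_base` is our own, while `math.log10` and `math.log2` are standard library functions used here only for comparison.

```python
import math

def log_base(c, a):
    """Logarithm of c to base a, computed from common (base 10) logarithms."""
    return math.log10(c) / math.log10(a)

log2_of_7 = log_base(7, 2)
print(round(math.log10(7), 3), round(log2_of_7, 3))
```

Swapping the roles of the two numbers gives the reciprocal, so that `log_base(7, 2) * log_base(2, 7)` equals 1 up to rounding error.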
We next consider the exponential function $a^x$, where again a > 0, but now the exponent is a real variable x. The resulting function is plotted for several values of a in Figure 2.16. As can be seen, $a^x$ increases rapidly for large positive x if a > 1, but decreases with increasing x if a < 1. In addition, $a^x = 1$ at x = 0 for all a, and the behaviours for positive and negative x are related by
$$a^{-x} = \left(\frac{1}{a}\right)^{x}, \tag{2.47}$$
so that, for example, the curves $2^{-x}$ and $(1/2)^x$ are identical.
Perhaps the most important property of the exponential function is that it is proportional to its own gradient. To see this, consider the line AB joining the function $a^x$ at x and x + d, as shown in Figure 2.17. As can be seen, the gradient of this line becomes a better and better approximation to the gradient at x itself as d becomes smaller. Hence, since
$$\text{slope}(AB) = \frac{a^{x+d} - a^x}{d} = a^x\,\frac{a^d - 1}{d} \tag{2.48}$$
by (1.45) and (1.17), we immediately obtain the desired result
$$\text{gradient of } a^x = k\,a^x, \tag{2.49}$$
where the constant of proportionality
$$k = \lim_{d \to 0}\left[\frac{a^d - 1}{d}\right], \tag{2.50}$$
and the notation $\lim_{d \to 0}$ means ‘take the limit of the term in the brackets as d approaches zero’.4
At this point, we note that the constant k depends on the base a, and we define the Euler number e such that k = 1 for a = e. To find this number, for any given d, we choose a value
$$a = (1 + d)^{1/d}, \tag{2.51}$$
so that $a^d = 1 + d$ and (2.48) gives
$$\text{slope}(AB) = a^x$$
for any given d. Since as d approaches zero, the slope (AB) approaches the slope of the curve, this implies that
$$\text{gradient of } e^x = e^x, \tag{2.52}$$
as required, if
$$e = \lim_{d \to 0}\,(1 + d)^{1/d}. \tag{2.53}$$
The number e can now be estimated with increasing accuracy by choosing smaller and smaller d, the values d = 0.1, 0.01, 0.001, 0.0001, … giving 2.594, 2.705, 2.717, 2.718, … A better method for evaluating Euler's number will be given in Section 5.3.4, and to 6 significant figures
$$e = 2.71828. \tag{2.54}$$
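The estimates of e quoted above can be reproduced with a few lines of Python, assuming the estimate for a given d is $(1 + d)^{1/d}$, which is the value of a for which the slope of the chord AB equals $a^x$. The function name `estimate_e` is our own.

```python
import math

def estimate_e(d):
    """Estimate Euler's number from a chord of width d: (1 + d)**(1/d)."""
    return (1.0 + d) ** (1.0 / d)

estimates = [estimate_e(d) for d in (0.1, 0.01, 0.001, 0.0001)]
print([round(v, 3) for v in estimates])   # compare with math.e = 2.71828...
```

The convergence is slow, with the error shrinking roughly in proportion to d, which is why a better method of evaluation is preferable in practice.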
The corresponding behaviour of $e^x$ and of the closely related function $e^{-x} = (e^x)^{-1}$ is shown in Figure 2.18.
Because of the property (2.52), the number e is almost always chosen as a base in physical science work and the corresponding function exp (x), defined by
$$\exp(x) \equiv e^x, \tag{2.55}$$
is called the natural exponential function, or more usually, but imprecisely, just the exponential function. The corresponding inverse function
$$\ln x \equiv \log_e x \tag{2.56}$$
is referred to as the natural logarithmic function, or simply the natural logarithm. Since it is the inverse of $e^x$, its behaviour can be inferred from the plot of $e^x$, and is shown in Figure 2.18. Finally, from (2.45), natural and common logarithms are related by
$$\ln x = \frac{\log x}{\log e} = \ln 10\,\log x \approx 2.3026\,\log x.$$
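Given the natural exponential function, we can define the hyperbolic functions as follows.
$$\sinh x = \frac{e^x - e^{-x}}{2}, \qquad \cosh x = \frac{e^x + e^{-x}}{2}, \qquad \tanh x = \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}}. \tag{2.57}$$
These are called the hyperbolic sine, hyperbolic cosine and hyperbolic tangent, respectively, and are shown in Figures 2.19 and 2.20. Their reciprocals are defined as
$$\text{cosech}\,x = \frac{1}{\sinh x}, \qquad \text{sech}\,x = \frac{1}{\cosh x}, \qquad \coth x = \frac{1}{\tanh x}.$$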
We will show in Section 6.4.1 that the hyperbolic functions are related to the circular functions, hence the origin of their names. The word ‘hyperbolic’ appears because they are also related to the equation for a hyperbola. This follows from the first of the identities
$$\cosh^2 x - \sinh^2 x = 1 \tag{2.58a}$$
and
(2.58b)
which may be checked using the definitions (2.57). Equations (2.57) and (2.58a) imply that if x = cosh θ and y = sinh θ, then the point (x, y) lies on the branch of the rectangular hyperbola $x^2 - y^2 = 1$ for which x + y > 0. (Hyperbolas are discussed in Section 2.4.)
By analogy with the circular functions, we have
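and we can also form the inverse hyperbolic functions by ‘inverting’ (2.59). For example, let
$$y = \sinh x = \tfrac{1}{2}\left(e^x - e^{-x}\right). \tag{2.60}$$
This can be written as a quadratic in the variable $z = e^x$, namely $z^2 - 2yz - 1 = 0$, treating y as if it were a constant, leading to the solution $z = y \pm \sqrt{y^2 + 1}$. Since $e^x$ must be positive, only the positive square root is allowed. Hence
$$x = \ln\left(y + \sqrt{y^2 + 1}\right)$$
and
$$\sinh^{-1} x = \ln\left(x + \sqrt{x^2 + 1}\right). \tag{2.61}$$
In a similar way we find that
$$\cosh^{-1} x = \ln\left(x \pm \sqrt{x^2 - 1}\right), \qquad x \ge 1, \tag{2.62}$$
where unlike in the case of $\sinh^{-1} x$, both signs of the square root lead to positive values for $e^x$. However, because
$$x - \sqrt{x^2 - 1} = \frac{1}{x + \sqrt{x^2 - 1}}, \tag{2.63a}$$
the result for $\cosh^{-1} x$ may also be written
$$\cosh^{-1} x = \pm\ln\left(x + \sqrt{x^2 - 1}\right), \tag{2.63b}$$
which shows explicitly that the two values are equal in magnitude but with opposite signs. Finally,
$$\tanh^{-1} x = \frac{1}{2}\ln\left(\frac{1 + x}{1 - x}\right), \qquad |x| < 1. \tag{2.64}$$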
Just as for the inverse trigonometric functions, both the ‘arc’ and ‘−1’ notations used in (2.61b), (2.62) and (2.64) are in common use.
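The logarithmic forms of the inverse hyperbolic functions can be checked numerically. The sketch below assumes the standard expressions $\sinh^{-1}x = \ln(x + \sqrt{x^2 + 1})$, $\cosh^{-1}x = \ln(x + \sqrt{x^2 - 1})$ (positive branch, x ≥ 1) and $\tanh^{-1}x = \tfrac{1}{2}\ln[(1 + x)/(1 - x)]$ (|x| < 1), and compares them with the library functions `math.asinh`, `math.acosh` and `math.atanh`. The function names `arcsinh`, `arccosh` and `arctanh` are our own.

```python
import math

def arcsinh(x):
    """Inverse hyperbolic sine from its logarithmic form."""
    return math.log(x + math.sqrt(x * x + 1.0))

def arccosh(x):
    """Inverse hyperbolic cosine (positive branch), valid for x >= 1."""
    return math.log(x + math.sqrt(x * x - 1.0))

def arctanh(x):
    """Inverse hyperbolic tangent, valid for |x| < 1."""
    return 0.5 * math.log((1.0 + x) / (1.0 - x))

print(arcsinh(1.5), math.asinh(1.5))
print(arccosh(2.0), math.acosh(2.0))
print(arctanh(0.5), math.atanh(0.5))
```

For large negative x the expression inside the logarithm in `arcsinh` suffers from cancellation, so library routines are preferable when accuracy matters.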
The hyperbolic functions satisfy a number of identities. By analogy with (2.36), they are
(2.65a)
(2.65b)
(2.65c)
(2.65d)
(2.65e)
(2.65f)
Specific useful cases that follow directly from (2.65) by setting x = y are the double-argument identities,
(2.66b)
and
(2.66c)
Another class of functions that is commonly met in physics are the conic sections. Their name derives from the fact that they are formed by the intersection of a plane with a double circular cone, that is, a pair of symmetric cones that are constructed by rotating a straight line through one revolution about an axis through the vertex of the cones and the centre of the base of the cones. This is illustrated in Figure 2.21 where β is the angle of rotation relative to the bases, which are taken to be horizontal when viewed in profile. Also shown in this figure are the intersections of a plane oriented at different angles α relative to the horizontal. The resulting curves are of four possible types. If α < β, the plane intersects only one cone and the resulting closed curve is called an ellipse. In the limiting case where α = 0, that is, the plane is horizontal, the closed curve is a circle. If α = β, that is, the plane is parallel to the edge of the cone, it again only intersects one cone, but the resulting curve is now open. It is called a parabola. Finally, if α > β, the plane intersects both cones and results in two non-intersecting branches of an open curve called a hyperbola.
All conic sections can be shown to have the property that there exists in the plane of the curve a point F called the focus, and a straight line d, called the directrix, such that if P is any point on the curve, the ratio of the distance from P to F to that of the perpendicular distance from P to a point N on the directrix is a fixed number e called the eccentricity. In this section, we shall take this as a definition of a conic section, rather than the geometrical constructions of Figure 2.21, and use it to derive the functions that describe them.
Consider a point P lying on a conic section, where F is the focus and we assume that P and F are on the same side of the directrix d, as shown in Figure 2.22. We then introduce polar co-ordinates P(r, θ) where the focus F is taken to be at the origin and θ = 0 corresponds to the direction XA. From the general property of a conic section
$$r = FP = e\,NP, \tag{2.67}$$
so the curve is specified by e and the distance XF, or equivalently the length L = 2l of the chord parallel to d through F. From Figure 2.22, $XF = l/e$ and so the length NP is given by
$$NP = \frac{l}{e} + r\cos\theta. \tag{2.68a}$$
But $NP = r/e$, and so
$$r = \frac{l}{1 - e\cos\theta}. \tag{2.68b}$$
If, on the other hand, we consider the case where P and F are on opposite sides of d, a similar argument leads to
$$NP = -\left(\frac{l}{e} + r\cos\theta\right) \tag{2.69a}$$
and
$$r = -\frac{l}{1 + e\cos\theta}. \tag{2.69b}$$
The above equations define the functions r(θ) that describe conic sections, where, since l, r > 0, the second result (2.69b) applies only when $e\cos\theta < -1$, that is, when e > 1 and $\cos\theta < -1/e$. However, in both cases rearranging and squaring gives
$$r^2 = (l + e\,r\cos\theta)^2, \tag{2.70}$$
which applies for any e and θ. Equivalently, if we consider Cartesian co-ordinates with the origin at the focus F and the positive x-axis in the direction of the line XF, the corresponding equation is found by substituting
$$x = r\cos\theta, \qquad y = r\sin\theta$$
into (2.70) to give
$$x^2 + y^2 = (l + ex)^2. \tag{2.71}$$
The properties of the different types of conic sections are now obtained by choosing different values of the eccentricity, starting with e = 1.
Parabola. For e = 1, (2.71) becomes
$$y^2 = 2l\left(x + \frac{l}{2}\right),$$
which is the implicit function for a parabola. A simpler form is obtained by writing $l = 2a$ and shifting the origin of the co-ordinate system to ( − a, 0), so that in the new variables
$$y^2 = 4ax. \tag{2.72}$$
This corresponds to the unbounded curve in Figure 2.23a, where, in this frame of reference, the focus is F = (a, 0) and the directrix d is the line x = −a, as shown.
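The second possibility is that e ≠ 1. In this case, (2.71) may be written
$$(1 - e^2)\left(x - \frac{el}{1 - e^2}\right)^2 + y^2 = \frac{l^2}{1 - e^2},$$
which becomes
$$(1 - e^2)\,x^2 + y^2 = \frac{l^2}{1 - e^2} \tag{2.73}$$
on shifting the origin to $\left(el/(1 - e^2),\,0\right)$, that is, the centre of the conic, while the focus and directrix are now at $\left(-el/(1 - e^2),\,0\right)$ and $x = -l/[e(1 - e^2)]$, respectively. There are now two possibilities, e < 1 and e > 1, and we consider each in turn.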
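Ellipse. For e < 1, (2.73) becomes
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \tag{2.74a}$$
with
$$a = \frac{l}{1 - e^2}, \qquad b = \frac{l}{\sqrt{1 - e^2}} = a\sqrt{1 - e^2}, \tag{2.74b}$$
and where the focus is at (−ae, 0) and the directrix is the line $x = -a/e$. Because (2.74a) is symmetric under reflection in the y-axis, it follows that the ellipse has a second focus at (ae, 0), with a corresponding directrix at $x = a/e$. Equation (2.74a) is the equation of an ellipse, and may alternatively be expressed in parametric form
$$x = a\cos\phi, \qquad y = b\sin\phi, \tag{2.74c}$$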
where 0 ≤ φ ≤ 2π. It corresponds to the closed curve shown in Figure 2.23b, which cuts the x and y axes at ± a and ± b, respectively.
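The relations between the parametric form, the focus and the directrix can be checked numerically. The sketch below assumes the parametrisation x = a cos φ, y = b sin φ with b = a√(1 − e²), the focus at (−ae, 0) and the directrix x = −a/e. It also evaluates the sum of the distances to the two foci, which is the constant 2a for an ellipse. The function name is illustrative.

```python
import math

def ellipse_checks(a, e, phi):
    """Return (PF, e*NP, PF + PF2, x^2/a^2 + y^2/b^2) for the point at parameter phi."""
    b = a * math.sqrt(1.0 - e * e)
    x, y = a * math.cos(phi), b * math.sin(phi)
    pf = math.hypot(x + a * e, y)    # distance to the focus (-a*e, 0)
    pf2 = math.hypot(x - a * e, y)   # distance to the second focus (a*e, 0)
    np_length = x + a / e            # distance to the directrix x = -a/e
    implicit = (x / a) ** 2 + (y / b) ** 2
    return pf, e * np_length, pf + pf2, implicit

if __name__ == "__main__":
    for phi in (0.0, 0.7, 2.0, 4.5):
        print(phi, ellipse_checks(2.0, 0.6, phi))
```

For each φ the first two entries agree, the third equals 2a and the fourth equals 1, up to rounding error.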
The line joining the two points on an ellipse which are most widely separated is called the major (or focal) axis. In this frame of reference it coincides with the x-axis and is of length 2a. The axis perpendicular to the major axis is called the minor axis and is of length 2b. For e = 0, a and b are equal and (2.74a) reduces to
$$x^2 + y^2 = a^2,$$
which is the equation of a circle centred at the origin. Hence a circle can be regarded as an ellipse with zero eccentricity. This allows us to infer the formula
$$\text{area} = \pi ab \tag{2.75}$$
for the area of an ellipse, since the area must be proportional to both a and b and reduce to the area of a circle when a = b.
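The area formula πab can be checked numerically without calculus. The sketch below, which is an illustration rather than a derivation, approximates the ellipse by a polygon with vertices (a cos φ, b sin φ) and evaluates the polygon's area with the shoelace formula. The function names and the choice of 10000 vertices are arbitrary.

```python
import math

def polygon_area(points):
    """Shoelace formula for a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return 0.5 * abs(total)

def ellipse_area_estimate(a, b, n=10000):
    """Area of the n-sided polygon inscribed in the ellipse with semi-axes a and b."""
    step = 2.0 * math.pi / n
    points = [(a * math.cos(k * step), b * math.sin(k * step)) for k in range(n)]
    return polygon_area(points)

if __name__ == "__main__":
    a, b = 3.0, 2.0
    print(ellipse_area_estimate(a, b), math.pi * a * b)
```

The inscribed polygon slightly underestimates the area, and the estimate approaches πab as the number of vertices increases.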
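Hyperbola. For e > 1, (2.73) becomes
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \tag{2.76a}$$
with
$$a = \frac{l}{e^2 - 1}, \qquad b = \frac{l}{\sqrt{e^2 - 1}} = a\sqrt{e^2 - 1}. \tag{2.76b}$$
This is the equation of a hyperbola. The corresponding curves are shown in Figure 2.23c. It is clear that the hyperbola has two distinct branches because for y = 0 there are two solutions for x, but for x = 0 there are no real solutions for y. In this reference frame, the focus F and the directrix are at (ae, 0) and $x = a/e$, respectively, and as for the ellipse, the symmetry of (2.76a) under reflection in the y-axis implies that the hyperbola has a second focus at (−ae, 0), with a corresponding directrix at $x = -a/e$. Equation (2.76a) can be written in the parametric form
$$x = \pm a\cosh u, \qquad y = b\sinh u, \tag{2.76c}$$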
where − ∞ < u < ∞.
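A numerical check analogous to that for the ellipse can be made for the hyperbola. The sketch below assumes the parametrisation x = a cosh u, y = b sinh u for the right-hand branch, with b = a√(e² − 1), the focus at (ae, 0) and the directrix x = a/e. It also evaluates the difference of the distances to the two foci, which is the constant 2a on this branch. The function name is illustrative.

```python
import math

def hyperbola_checks(a, e, u):
    """Return (PF, e*NP, PF2 - PF, x^2/a^2 - y^2/b^2) for the right-hand branch."""
    b = a * math.sqrt(e * e - 1.0)
    x, y = a * math.cosh(u), b * math.sinh(u)
    pf = math.hypot(x - a * e, y)    # distance to the focus (a*e, 0)
    pf2 = math.hypot(x + a * e, y)   # distance to the second focus (-a*e, 0)
    np_length = x - a / e            # distance to the directrix x = a/e
    implicit = (x / a) ** 2 - (y / b) ** 2
    return pf, e * np_length, pf2 - pf, implicit

if __name__ == "__main__":
    for u in (-1.5, -0.3, 0.0, 0.8, 2.0):
        print(u, hyperbola_checks(1.0, 2.0, u))
```

For each u the first two entries agree, the third equals 2a and the fourth equals 1, up to rounding error.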
Equations (2.72), (2.74a) and (2.76a) are called the standard forms for the parabola, ellipse and hyperbola. They apply in co-ordinate systems chosen so that the directrix is parallel to the y-axis and the focus is at (a, 0) for the parabola, (−ae, 0) for the ellipse and (ae, 0) for the hyperbola. In an arbitrary Cartesian co-ordinate system, the three conic sections are described by second-order equations of the form
$$Ax^2 + 2Hxy + By^2 + 2Gx + 2Fy + C = 0, \tag{2.77}$$
where A, B, C, F, G, and H are constants. The following conditions, which we state without proof, determine which conic section this equation represents:
$H^2 < AB$: ellipse;
$H^2 = AB$: parabola;
$H^2 > AB$: hyperbola.
For example, the equation
represents an ellipse because $H^2 < AB$, but the non-zero terms in x and y indicate that the centre of the ellipse is not at the origin, and those in xy indicate that the major and minor axes do not coincide with the co-ordinate axes, that is, the ellipse has been rotated.
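The classification by the sign of H² − AB is easily automated. The sketch below assumes the convention that the coefficient of xy in the general second-order equation is 2H, so that A and B multiply x² and y² respectively. It ignores degenerate cases such as pairs of straight lines. The function name and the sample coefficients are hypothetical and are not taken from the text.

```python
def classify_conic(A, B, H, tol=1e-12):
    """Classify A*x^2 + 2*H*x*y + B*y^2 + ... = 0 by the sign of H^2 - A*B.

    Degenerate conics (for example, pairs of lines) are not detected.
    """
    disc = H * H - A * B
    if abs(disc) <= tol:
        return "parabola"
    return "ellipse" if disc < 0 else "hyperbola"

if __name__ == "__main__":
    # Hypothetical sample equations, chosen only for illustration.
    print(classify_conic(0.25, 1.0, 0.0))  # x^2/4 + y^2 - 1 = 0
    print(classify_conic(0.0, 1.0, 0.0))   # y^2 - 4x = 0
    print(classify_conic(1.0, -1.0, 0.0))  # x^2 - y^2 - 1 = 0
    print(classify_conic(5.0, 5.0, 3.0))   # 5x^2 + 6xy + 5y^2 - 8 = 0
```

Only the quadratic coefficients enter the test; the linear and constant terms shift the curve without changing its type.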
If α and β are the roots of the equation $x^2 - 2x - 3 = 0$, find the equation whose roots are and .
If x is real and , show that $9p^2 \geq 20(p + 10)$.
Find the gradients of the tangents to the circle $x^2 + y^2 = r^2$ that intersect the y-axis at the point (0, c), where c is greater than r.
Two circles of radius 2 are centred at (x, y) = (0, 0) and (1, −1), respectively. What are the co-ordinates of their points of intersection? The line joining these points is a chord of both circles. What angle does it subtend at their centres?
Determine the integer roots of $x^4 - 2x^3 - 2x^2 + 5x - 2$ and find its other two roots.
The function $f(x) = x^3 - 2x^2 + 4x - 5$ has a real root in the range 1.5 < x < 1.6. Find the value of the root correct to three decimal places.
Express in partial fractions:
Express in partial fractions:
Prove the identities:
Solve the following equations for angles in the range 0 < θ < 2π.
Find the general solution of the equation sin kθ = sin θ.
Show that the straight line $x\sin\theta + y\cos\theta = p$ is a tangent to the hyperbola $x^2/a^2 - y^2/b^2 = 1$ if $(a\sin\theta)^2 - (b\cos\theta)^2 = p^2$ and find the co-ordinates of the point of contact.
Prove the identity
The triangle ABC has lengths a = BC = 5 cm, b = AC = 4 cm and the angle B is 0.5 rad or 28.65 degrees. Find the length c = AB and the angles A and C.
The co-ordinates (x, y) of a triangle ABC are A = (1, 3), B = (5, 6) and C = (7, 2). Find the angles at the vertices.
Use the method of induction to show that sin [(2n + 1)θ] and can be expressed as polynomials in sin θ for all n ≥ 0.
Simplify the expressions:
Solve the equations:
Solve the equations:
Solve for real values of x the equations:
Solve the equation cosh 4x + 4cosh 2x − 125 = 0.
A straight line passes through the focus of the parabola $y^2 = 4ax$ and cuts the parabola at the points $P_i(at_i^2, 2at_i)$, i = 1, 2. Find the relationship between $t_1$ and $t_2$.
Find the equation of the tangent and the normal to the parabola $y^2 = 4x$ at the point (x, y) = (1, 2).