In this chapter we generalise the discussion of differential calculus in Chapter 3 to functions of more than one variable. Many results will be taken over from Chapter 3 and will be dealt with rather briefly, so that we can focus on the differences between the two cases.
Given a function f(x1, x2, …, xn) of n independent variables, x1, x2, …, xn, the partial derivative of f with respect to x1 is defined by

∂f/∂x1 ≡ lim_{δx1→0} [f(x1 + δx1, x2, …, xn) − f(x1, x2, …, xn)]/δx1,   (7.1)

provided the limit exists. In other words, it is obtained by differentiating f with respect to x1, while treating the other variables x2, x3, …, xn as fixed parameters. Partial derivatives with respect to the other variables are defined in a similar way. For example, if
then differentiating with respect to x keeping y fixed gives, using the product rule (3.20),
while differentiating with respect to y keeping x fixed gives
Higher derivatives are obtained by repeated partial differentiation, so that

∂²f/∂xi∂xj ≡ (∂/∂xi)(∂f/∂xj)  and  ∂²f/∂xj∂xi ≡ (∂/∂xj)(∂f/∂xi)   (7.4)

for the second derivatives.
for the second derivatives. Thus for the function (7.2), using (7.3) one obtains
and
From this, one sees that
In general

∂²f/∂xi∂xj = ∂²f/∂xj∂xi   (7.5)
for any f such that both the derivatives in (7.4) are continuous in xi and xj at the point of evaluation.
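Since (7.5) underlies much of what follows, it is worth checking numerically. The sketch below, for a sample function chosen here purely for illustration, compares the two analytic mixed derivatives with a symmetric finite-difference estimate:

```python
import math

# Numerical check of the equality of mixed partial derivatives for the
# sample function f(x, y) = x^3 y^2 + sin(xy) (our choice, not from the text).

def f(x, y):
    return x**3 * y**2 + math.sin(x * y)

def d2f_dydx(x, y):
    # d/dy (df/dx), computed analytically: df/dx = 3x^2 y^2 + y cos(xy)
    return 6 * x**2 * y + math.cos(x * y) - x * y * math.sin(x * y)

def d2f_dxdy(x, y):
    # d/dx (df/dy), computed analytically: df/dy = 2x^3 y + x cos(xy)
    return 6 * x**2 * y + math.cos(x * y) - x * y * math.sin(x * y)

def mixed_fd(x, y, h=1e-4):
    # symmetric finite-difference estimate of the mixed second derivative
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h * h)
```

Differentiating in either order gives the same expression, and the finite-difference estimate agrees with it to the accuracy of the stencil.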
It is very important when working with partial derivatives to keep track of which variables are kept constant. This can be made explicit by adopting a notation in which the partial derivatives are written in brackets with the fixed variables as subscripts, so that (7.1) becomes

(∂f/∂x1)_{x2, x3, …, xn},   (7.6)
and (7.3a) and (7.3b) are written
To emphasise the importance of keeping track of which variables are held constant, we note that if we define z = xy, then (7.2) can be written
so that
This notation is widely used in thermal physics, for example, where different choices of variables are often used within the same calculation. Thus, the energy E of a gas at equilibrium is often written both as a function of temperature T and volume V, and also as a function of temperature and pressure P, but

(∂E/∂T)_V ≠ (∂E/∂T)_P,
except in the case of a ‘perfect gas’. In this chapter, we shall generally use the simpler notation (7.1), resorting to (7.6) only where there is room for ambiguity.
For functions f(x) of a single variable x, we are already familiar with the result [cf. (5.27)]

f(x + δx) − f(x) = (df/dx) δx + O[(δx)²]   (7.7)
for small changes δx, provided the derivative exists. In the same way, the definition (7.1) implies

f(x1 + δx1, x2, …, xn) − f(x1, x2, …, xn) = (∂f/∂x1) δx1 + O[(δx1)²],   (7.8)

since x2, x3, …, xn are treated as fixed parameters in defining the partial derivatives. Analogous results are obtained for small changes in the other variables x2, x3, …, xn. From this, for a function of two variables f(x1, x2) one obtains

f(x1 + δx1, x2 + δx2) − f(x1, x2) = [f(x1 + δx1, x2 + δx2) − f(x1, x2 + δx2)] + [f(x1, x2 + δx2) − f(x1, x2)],

and substituting (7.8) into the first term of this equation gives

f(x1 + δx1, x2 + δx2) − f(x1, x2) = (∂f/∂x1) δx1 + (∂f/∂x2) δx2 + ⋯,

where the omitted terms are quadratic in δx1, δx2. On generalising to n variables, this becomes

δf ≡ f(x1 + δx1, …, xn + δxn) − f(x1, …, xn) = Σ_i (∂f/∂xi) δxi + ⋯,   (7.9)

where the omitted terms are again quadratic in the δxi.
At this point, we denote small changes by dx or dxi, and define the differential df by

df ≡ (df/dx) dx   (7.10)

for the case of a single variable, and

df ≡ Σ_i (∂f/∂xi) dxi   (7.11)

for the case of several variables. The important distinction between (7.10, 7.11) and (7.7, 7.9) is that the latter are approximations, with corrections of the order indicated, whereas the former, being definitions, are exact.
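This distinction can be seen numerically. For a sample function chosen here for illustration, the differential reproduces the true change up to corrections quadratic in the increments, so halving the step quarters the error:

```python
# Sample function (our choice): f(x, y) = x^2 y + y^3, with
# df = (df/dx) dx + (df/dy) dy = 2xy dx + (x^2 + 3y^2) dy.

def f(x, y):
    return x**2 * y + y**3

def df(x, y, dx, dy):
    # the differential, using the analytic partial derivatives
    return 2 * x * y * dx + (x**2 + 3 * y**2) * dy

x0, y0 = 1.0, 2.0
errors = []
for d in (1e-2, 5e-3, 2.5e-3):
    true_change = f(x0 + d, y0 + d) - f(x0, y0)
    errors.append(abs(true_change - df(x0, y0, d, d)))

# successive error ratios should be close to 4 (quadratic corrections)
ratios = [errors[i] / errors[i + 1] for i in range(2)]
```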
Differentials are used repeatedly throughout the rest of this chapter. Here we will show, by an example, how they can be used to obtain partial derivatives when the definition of the relevant function is implicit.
In this subsection we will consider a function of two variables f(x, y) and use differentials to derive the standard results

(∂f/∂x)_y = 1/(∂x/∂f)_y   (7.12)

and

(∂f/∂y)_x = −(∂f/∂x)_y (∂x/∂y)_f.   (7.13)
To do this, we use (7.11) to give

df = (∂f/∂x)_y dx + (∂f/∂y)_x dy,   (7.14a)

and then consider the corresponding function x(y, f) that specifies x in terms of y and f, to obtain the corresponding differential

dx = (∂x/∂y)_f dy + (∂x/∂f)_y df.   (7.14b)

Substituting (7.14b) into (7.14a) gives

df = (∂f/∂x)_y (∂x/∂f)_y df + [(∂f/∂x)_y (∂x/∂y)_f + (∂f/∂y)_x] dy.   (7.15)
Since any two of dx, dy, df are independent, the coefficient of df on the right-hand side must be unity, which gives (7.12), and the square bracket giving the coefficient of dy must vanish, which gives (7.13), as required.
Finally, we stress again that in using (7.12) and (7.13), it is important to pay attention to the variables being kept fixed in each derivative. In particular,
in general, and the equality only holds if, as in (7.12), the same variables are kept fixed in each partial derivative.
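Both (7.12) and (7.13) can be checked on an explicit example of our own choosing, f(x, y) = x²y, which for x, y > 0 can be inverted as x(y, f) = (f/y)^(1/2), so that all four partial derivatives are available analytically:

```python
import math

# f(x, y) = x^2 y  and its inverse  x(y, f) = (f/y)^(1/2)  (sample choice).

def dfdx_y(x, y):      # (df/dx) at constant y
    return 2 * x * y

def dfdy_x(x, y):      # (df/dy) at constant x
    return x**2

def dxdf_y(x, y):      # (dx/df) at constant y, from x = (f/y)^(1/2)
    f = x**2 * y
    return 1.0 / (2.0 * math.sqrt(f * y))

def dxdy_f(x, y):      # (dx/dy) at constant f
    f = x**2 * y
    return -0.5 * math.sqrt(f) * y**(-1.5)
```

At any point with x, y > 0, (∂f/∂x)_y equals the reciprocal of (∂x/∂f)_y, and (∂f/∂y)_x equals −(∂f/∂x)_y (∂x/∂y)_f, as the relations require.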
Given two functions A(x, y) and B(x, y), the quantity

A(x, y) dx + B(x, y) dy   (7.16)

is called an exact (or perfect) differential if there exists a function f(x, y) such that

A(x, y) = ∂f/∂x  and  B(x, y) = ∂f/∂y,   (7.17a)

so that (7.16) is just the differential df. If no such function exists, it is called an inexact differential. A simple test for whether a differential is exact or not is to note that if it is, (7.17a) implies

∂A/∂y = ∂²f/∂y∂x  and  ∂B/∂x = ∂²f/∂x∂y,   (7.18)

so that, by (7.5),

∂A/∂y = ∂B/∂x.   (7.19a)
The definition of an exact differential may be extended to functions of more than two variables, so that (7.17a) becomes

Ai(x1, x2, …, xn) = ∂f/∂xi,  i = 1, 2, …, n,   (7.17b)

and the condition (7.19a) becomes

∂Ai/∂xj = ∂Aj/∂xi  for all i, j.   (7.19b)
Exact differentials are used in solving an important class of differential equations (i.e. equations that contain a function and its derivatives), as we shall see in Section 14.1.4; and in thermal physics, where relations of the form (7.19) are called Maxwell relations. In fact, (7.19b) is both a necessary and sufficient condition for (7.16) to be an exact differential. We shall, however, omit the proof of this, and in particular cases where it is satisfied, we shall establish the existence of a suitable function f(x, y) by constructing it, as is shown in the following example.
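A numerical sketch of both the test and the construction, for an example differential of our own choosing, A dx + B dy with A = 2xy and B = x² + 3y²: here ∂A/∂y = ∂B/∂x = 2x, so the differential is exact; integrating A with respect to x gives x²y, and matching B fixes the remaining y-dependence as y³.

```python
# Test for exactness and construction of f for A dx + B dy (sample choice).

def A(x, y):
    return 2 * x * y

def B(x, y):
    return x**2 + 3 * y**2

def f(x, y):
    # integral of A with respect to x is x^2 y;
    # B - d/dy(x^2 y) = 3y^2, so the y-dependent "constant" is y^3
    return x**2 * y + y**3

def partial(g, x, y, wrt, h=1e-6):
    # central-difference partial derivative of g(x, y)
    if wrt == "x":
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)
```

The exactness condition holds, and the constructed f reproduces both A and B as its partial derivatives.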
We next consider a function f(x1, x2, …, xn) where the variables xi are themselves functions of another variable t. The rate of change of f with t can then be calculated by substituting the expressions xi(t) into f and differentiating the result with respect to t. Alternatively, one can divide the differential (7.11) by dt to obtain the chain rule,

df/dt = Σ_i (∂f/∂xi)(dxi/dt).   (7.20)
An important special case is when t is itself one of the arguments of the function, that is, when f ≡ f(t, x1, x2, …, xn). Equation (7.20), with n + 1 variables (xn+1 = t), then gives

df/dt = ∂f/∂t + Σ_i (∂f/∂xi)(dxi/dt).   (7.21)
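The two routes described above, direct substitution and the chain rule, can be compared numerically for a sample case of our own choosing, f(x, y) = xy² with x = cos t and y = t², so that f becomes t⁴ cos t after substitution:

```python
import math

def F(t):
    # f after substituting x(t) = cos t and y(t) = t^2
    return math.cos(t) * t**4

def chain_rule(t):
    # df/dt = (df/dx)(dx/dt) + (df/dy)(dy/dt)
    x, y = math.cos(t), t**2
    dfdx, dfdy = y**2, 2 * x * y          # partial derivatives of f = x y^2
    dxdt, dydt = -math.sin(t), 2 * t      # derivatives of x(t) and y(t)
    return dfdx * dxdt + dfdy * dydt

def dFdt_numeric(t, h=1e-6):
    # central-difference derivative of the substituted function
    return (F(t + h) - F(t - h)) / (2 * h)
```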
A function f(x1, x2, …, xn) is said to be homogeneous of degree k if

f(λx1, λx2, …, λxn) = λ^k f(x1, x2, …, xn),   (7.22)

where λ is an arbitrary parameter. For example, the functions
are both homogeneous, of degree −1 and 3, respectively. Euler's theorem states that if f(x1, x2, …, xn) is homogeneous of degree k, then

Σ_i xi (∂f/∂xi) = k f(x1, x2, …, xn).   (7.23)
To derive (7.23) we make the substitutions xi = λti and write

f(x1, x2, …, xn) = f(λt1, λt2, …, λtn) = λ^k f(t1, t2, …, tn).

For any fixed set of t1, t2, …, tn, this is a function of λ only, and differentiating it with respect to λ using the chain rule (7.20) gives

Σ_i ti (∂f/∂xi) = k λ^(k−1) f(t1, t2, …, tn).

Euler's theorem then follows on multiplying by λ, since λti = xi.
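Both the homogeneity property (7.22) and Euler's theorem (7.23) can be verified numerically for a sample homogeneous function of our own choosing, f(x, y, z) = x²y + 4xyz + z³, which is homogeneous of degree 3:

```python
# Check of Euler's theorem for f(x, y, z) = x^2 y + 4xyz + z^3 (degree 3).

def f(x, y, z):
    return x**2 * y + 4 * x * y * z + z**3

def euler_sum(x, y, z):
    # x df/dx + y df/dy + z df/dz, with the derivatives taken analytically
    dfdx = 2 * x * y + 4 * y * z
    dfdy = x**2 + 4 * x * z
    dfdz = 4 * x * y + 3 * z**2
    return x * dfdx + y * dfdy + z * dfdz
```

The sum equals 3f, and rescaling every argument by λ = 2 multiplies f by 2³ = 8, as (7.22) requires.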
In this section we address the problem of how to change variables in equations that contain partial derivatives. To do this, we first consider a function f ≡ f(x1, x2, …, xn) of n variables x1, x2, …, xn that are each functions of another n variables xi ≡ xi(t1, t2, …, tn). Using (7.11) twice then gives

df = Σ_i (∂f/∂xi) dxi = Σ_j [Σ_i (∂f/∂xi)(∂xi/∂tj)] dtj,

where we remind the reader that partial differentiation with respect to xi implies that all the other xj (j ≠ i) are kept constant; and similarly, differentiating with respect to tj means that all the other ti (i ≠ j) are kept constant. In the same notation, expressing f directly in terms of the tj (j = 1, 2, …, n) gives

df = Σ_j (∂f/∂tj) dtj,

and comparing these two results gives the relation

∂f/∂tj = Σ_i (∂f/∂xi)(∂xi/∂tj)   (7.24)

between partial derivatives with respect to xi and tj.
To illustrate the use of this result, consider a function f(x, y) of the Cartesian co-ordinates x, y. We will change variables to the plane polar co-ordinates r, θ of Figure 2.3, where [cf. (2.34) and (2.35)]

x = r cos θ,  y = r sin θ,   (7.25a)

and conversely

r = (x² + y²)^(1/2),  θ = tan⁻¹(y/x).   (7.25b)
From (7.24), setting (x1, x2) = (x, y) and (t1, t2) = (r, θ), we obtain

∂f/∂r = (∂f/∂x)(∂x/∂r) + (∂f/∂y)(∂y/∂r)

and

∂f/∂θ = (∂f/∂x)(∂x/∂θ) + (∂f/∂y)(∂y/∂θ),

where the derivatives ∂x/∂r, ∂y/∂r, ∂x/∂θ and ∂y/∂θ are evaluated from (7.25a).
Using (7.25a), these equations imply

∂f/∂r = cos θ (∂f/∂x) + sin θ (∂f/∂y)   (7.26a)

and

∂f/∂θ = −r sin θ (∂f/∂x) + r cos θ (∂f/∂y),   (7.26b)

and conversely

∂f/∂x = cos θ (∂f/∂r) − (sin θ/r)(∂f/∂θ)   (7.27a)

and

∂f/∂y = sin θ (∂f/∂r) + (cos θ/r)(∂f/∂θ).   (7.27b)
Corresponding results involving higher order partial derivatives can be obtained by repeated use of (7.26) and (7.27). As an example, we will transform Laplace's equation in two dimensions,

∂²f/∂x² + ∂²f/∂y² = 0,   (7.28)
into polar co-ordinates (r, θ). From (7.27a), we have
In a similar way one obtains
Adding these two results and substituting in (7.28) then gives

∂²f/∂r² + (1/r)(∂f/∂r) + (1/r²)(∂²f/∂θ²) = 0   (7.29)

as Laplace's equation expressed in plane polar co-ordinates.
The generalisation of Taylor's theorem (5.21) to more than one variable is straightforward. For simplicity, we start by finding an expansion of a function f(x, y) of two variables about x = x0, y = y0 in powers of h = x − x0, k = y − y0. To do this, for any given values of h and k, we define a function

F(t) ≡ f(x0 + ht, y0 + kt),

which reduces to f(x0, y0) when t = 0 and to f(x, y) when t = 1. Provided the first N + 1 derivatives of F(t) exist over the whole range 0 ≤ t ≤ 1, Taylor's theorem (5.21) gives
F(t) = Σ_{n=0}^{N} (t^n/n!) F^(n)(0) + R_N(t),   (7.32a)
where F^(n)(0) is the nth derivative of F with respect to t, evaluated at t = 0 [cf. (3.40b)], and where the remainder term is
R_N(t) = [t^(N+1)/(N + 1)!] F^(N+1)(θt)   (7.32b)
for at least one θ in the range 0 ≤ θ ≤ 1. However, from the chain rule (7.20) we also have
Substituting this into (7.32) and setting t = 1 then gives

f(x0 + h, y0 + k) = Σ_{n=0}^{N} (1/n!) [(h ∂/∂x + k ∂/∂y)^n f]_0 + R_N,   (7.33a)
where the operator power (h ∂/∂x + k ∂/∂y)^n is defined by its binomial expansion, and the subscript 0 means the derivatives of f(x, y) are evaluated at x = x0, y = y0. The remainder term is
R_N = [1/(N + 1)!] [(h ∂/∂x + k ∂/∂y)^(N+1) f],   (7.33b)

where the derivatives are now evaluated at x = x0 + θh, y = y0 + θk,
for at least one θ in the range 0 ≤ θ ≤ 1. Assuming R_N → 0 as N → ∞ then leads to the Taylor series

f(x, y) = Σ_{n=0}^{∞} (1/n!) [(h ∂/∂x + k ∂/∂y)^n f]_0.   (7.34)
The above results are easily generalised to more than two variables. For a function f(x1, x2, …, xk) of k variables, (7.34), for example, becomes
on expanding about xi = ai(i = 1, 2, …, k), where the right-hand side is evaluated at x1 = a1, x2 = a2, …, xk = ak. However, expansions such as (7.35) for several variables rapidly become unwieldy, so we will restrict ourselves to explicitly expanding (7.34), when one obtains
f(x, y) = f(x0, y0) + [h(∂f/∂x) + k(∂f/∂y)] + (1/2)[h²(∂²f/∂x²) + 2hk(∂²f/∂x∂y) + k²(∂²f/∂y²)] + ⋯,   (7.36)
where all the derivatives are evaluated at x = x0, y = y0, and we have assumed (7.5). In general, if one assumes that the order of the cross derivatives is unimportant, that is,
as is usually the case, (7.34) becomes
f(x, y) = Σ_{n=0}^{∞} (1/n!) Σ_{m=0}^{n} [n!/(m!(n − m)!)] h^m k^(n−m) (∂^n f/∂x^m ∂y^(n−m)),   (7.37)
where we have used the binomial expansion (1.23) and where all the derivatives are again evaluated at x = x0, y = y0.
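A numerical check of the expansion truncated at second order, for a sample function of our own choosing, f(x, y) = eˣ cos y about (0, 0): the neglected terms are cubic in (h, k), so halving the step should reduce the error by roughly a factor of 8.

```python
import math

def f(x, y):
    return math.exp(x) * math.cos(y)    # sample function (our choice)

def taylor2(h, k):
    # second-order expansion about (0, 0); the derivatives of e^x cos y
    # there are f = 1, f_x = 1, f_y = 0, f_xx = 1, f_xy = 0, f_yy = -1
    return 1.0 + h + 0.5 * (h**2 - k**2)

err1 = abs(f(0.1, 0.1) - taylor2(0.1, 0.1))
err2 = abs(f(0.05, 0.05) - taylor2(0.05, 0.05))
```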
The necessary and sufficient conditions for the differential df(x1, x2, …, xn) to vanish for arbitrary dx1, dx2, …, dxn are, from (7.11),

∂f/∂xi = 0,  i = 1, 2, …, n.   (7.38)
Points at which (7.38) are satisfied are called stationary points, in analogy to those discussed for a function of a single variable in Section 3.4.1. However, determining whether such points are local minima, maxima or saddle points is more complicated than for functions of a single variable. For simplicity, we shall restrict ourselves to functions of two variables f(x, y), which can be regarded as two-dimensional surfaces as shown in Figures 7.2 and 7.3. Suppose that f(x, y) has a stationary point at x = x0, y = y0, where by (7.38),

∂f/∂x = ∂f/∂y = 0.
Then making a Taylor expansion about (x0, y0) gives

Δf ≡ f(x0 + h, y0 + k) − f(x0, y0) = (1/2)[h²(∂²f/∂x²) + 2hk(∂²f/∂x∂y) + k²(∂²f/∂y²)],   (7.39)

where we have neglected higher-order terms and assumed that the quadratic form on the right-hand side does not vanish for all values of h and k. Then if (x0, y0) is a minimum (maximum), as opposed to a saddle point, we must have Δf > 0 (Δf < 0) for all non-zero h, k values. For h ≠ 0, this implies that the quadratic equation

(∂²f/∂y²)z² + 2(∂²f/∂x∂y)z + (∂²f/∂x²) = 0

has no real roots, where z = k/h. Since the condition for a quadratic az² + bz + c = 0 to have no real roots is b² < 4ac, this implies

(∂²f/∂x²)(∂²f/∂y²) > (∂²f/∂x∂y)²,   (7.40a)

and the same condition is obtained if instead we consider k ≠ 0. Hence (7.40a) is a necessary condition for f(x0, y0) to be either a maximum or a minimum; and since the right-hand side of (7.40a) is non-negative, it implies that ∂²f/∂x² and ∂²f/∂y² are either both positive or both negative. Specifically, if (7.40a) is true and

∂²f/∂x² < 0,   (7.40b)

then ∂²f/∂y² < 0 also, and f(x0, y0) is a maximum; whereas if (7.40a) holds and

∂²f/∂x² > 0,   (7.40c)

then ∂²f/∂y² > 0 also, and f(x0, y0) is a minimum. Examples of a maximum and a minimum in two variables are shown in Figure 7.2.
If on the other hand

(∂²f/∂x²)(∂²f/∂y²) < (∂²f/∂x∂y)²,   (7.41)

f(x0, y0) is a saddle point. There are several different types of saddle point, depending on the behaviour of the second derivatives, and one example is shown in Figure 7.3.
Finally, if

(∂²f/∂x²)(∂²f/∂y²) = (∂²f/∂x∂y)²,

then the second-order terms in the Taylor expansion vanish for some values of h and k, contradicting our earlier assumption, and higher-order terms in the expansion must be inspected to determine the nature of the stationary point.
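The classification just described can be written as a small routine. As an illustration, consider a sample function of our own choosing, f(x, y) = x³ − 3x + y², whose stationary points are (1, 0), where f_xx = 6, f_yy = 2, f_xy = 0, and (−1, 0), where f_xx = −6:

```python
def classify(fxx, fxy, fyy):
    # second-derivative test for a stationary point of f(x, y)
    d = fxx * fyy - fxy**2
    if d > 0:
        return "minimum" if fxx > 0 else "maximum"
    if d < 0:
        return "saddle point"
    return "undetermined"   # higher-order terms must be inspected
```

Applied to the sample function, (1, 0) is a minimum and (−1, 0) is a saddle point.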
In the preceding section, we discussed how to find the stationary points of a function of two or more variables. However, sometimes one needs to find the stationary points of the function when the variables are subject to one or more additional conditions, called ‘constraints’. To take a very simple example, one could ask: “What is the maximum area of a rectangular field surrounded by a fence of fixed length, say 200 m?” In other words, if the length and breadth of the field are x and y metres respectively, what is the maximum value of the area A = xy subject to the constraint x + y = 100 m? In simple problems of this kind, one can use the constraint to eliminate one of the variables. In the above case, eliminating y gives

A = x(100 − x) m²,

which is easily shown to have a maximum value A = 2500 m² for x = 50 m, corresponding to a square field with x = y = 50 m. However, in cases where the function and/or the constraint is more complicated, or there are more than two variables and more than one constraint, solving the problem by using each of the constraints to eliminate a variable can become very clumsy and tedious, and it is often easier to use an alternative method due to Lagrange.
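The elimination argument for the fence problem can be confirmed by brute force: with y = 100 − x eliminated, scanning the integer side lengths locates the maximum.

```python
def area(x):
    # A = xy with the constraint used to set y = 100 - x
    return x * (100.0 - x)

# scan all integer side lengths 1..99 and pick the one maximising the area
best_x = max(range(1, 100), key=area)
```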
Suppose we need to find the stationary points of a function f(x1, x2, …, xn), where the variables are restricted to a limited range of values by k constraints, which we shall assume can be written in the form

gj(x1, x2, …, xn) = 0,  j = 1, 2, …, k,   (7.42)
where k < n. In this case, the relation

df = Σ_i (∂f/∂xi) dxi = 0   (7.43)

no longer leads to the usual conditions ∂f/∂xi = 0 (i = 1, 2, …, n), because the dxi are no longer independent, but are related by conditions of the form

dgj = Σ_i (∂gj/∂xi) dxi = 0,  j = 1, 2, …, k.   (7.44)
This problem can in principle be solved, as in the simple example discussed above, by using the conditions (7.42) to eliminate k of the variables, and expressing f(x1, x2, …, xn) as a function of the remaining independent variables, which can then be minimised in the usual way. However, following Lagrange, it is often more efficient to consider a new function,

F(x1, x2, …, xn) ≡ f(x1, x2, …, xn) + Σ_j λj gj(x1, x2, …, xn),   (7.45)

where the λj (j = 1, 2, …, k) are new variables called undetermined multipliers. One then determines the stationary points of F by treating x1, x2, …, xn as independent variables, to give the n conditions

∂F/∂xi = 0,  i = 1, 2, …, n.   (7.46)
These determine the values of x1, x2, …, xn as functions of the variables λj(j = 1, 2, …, k), whose values can then be determined by requiring the k conditions (7.42) to be satisfied. In other words, the n + k variables x1, …, xn, λ1, …, λk are determined by the n + k equations (7.46) and (7.42); and since F → f when (7.42) are satisfied, the xi values correspond to the stationary points of f subject to the constraints (7.42). This procedure is best illustrated by example.
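A minimal sketch of the procedure, on an example of our own choosing (not one worked in the text): find the stationary point of f(x, y) = x² + y² subject to g(x, y) = x + y − 1 = 0. With F = f + λg, the conditions ∂F/∂x = ∂F/∂y = 0 read 2x + λ = 0 and 2y + λ = 0, which give x = y; the constraint then fixes x = y = 1/2 and λ = −1.

```python
def grad_F(x, y, lam):
    # (dF/dx, dF/dy, g): all three residuals must vanish at the
    # constrained stationary point
    return (2 * x + lam, 2 * y + lam, x + y - 1)

# the solution found by elimination above
x, y, lam = 0.5, 0.5, -1.0
residuals = grad_F(x, y, lam)
```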
We conclude this chapter by using the properties of partial derivatives to deduce the rules for differentiating integrals with respect to a variable parameter, starting with the indefinite integral
where (4.1) together with the definition of partial derivatives implies
Then the partial derivative
provided that F satisfies (7.5), that is,
To see this, we note that (7.52) and (7.50) imply
Integrating this equation with respect to x then gives
which together with (7.49) gives (7.51). In other words, we may reverse the order of the differentiation and integration, as in (7.51), provided (7.52) is satisfied. As we saw in Section 7.1, this is so if the first- and second-order partial derivatives of F are continuous in x and t, as is usually the case.
We next consider the definite integral
I(t) = ∫_{a(t)}^{b(t)} f(x, t) dx,   (7.54)
where the limits of integration, as well as the integrand, may also depend on t. Then, using the chain rule (7.21), we have
provided a, b are differentiable functions of t. In addition, (7.53) implies
so that, using this together with (7.50), one finally obtains Leibnitz's rule,

(d/dt) ∫_{a(t)}^{b(t)} f(x, t) dx = f(b, t)(db/dt) − f(a, t)(da/dt) + ∫_a^b (∂f/∂t) dx,   (7.55)
which reduces to

(d/dt) ∫_a^b f(x, t) dx = ∫_a^b (∂f/∂t) dx   (7.56)

for fixed limits a, b. Finally, one may allow b → ∞ and/or a → −∞, provided all the integrals converge.
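Leibnitz's rule can be checked numerically on a case of our own choosing, f(x, t) = xt with a(t) = 0 and b(t) = t, so that I(t) = ∫₀ᵗ xt dx = t³/2 in closed form:

```python
def I(t):
    # I(t) = ∫_0^t x t dx, evaluated in closed form
    return t**3 / 2

def leibnitz_rhs(t):
    boundary = (t * t) * 1.0   # f(b(t), t) * db/dt, with b(t) = t
    integral = t**2 / 2        # ∫_0^t (∂f/∂t) dx = ∫_0^t x dx
    return boundary + integral  # the a-term vanishes since a = 0 is fixed

def dIdt_numeric(t, h=1e-6):
    # central-difference derivative of I(t) for comparison
    return (I(t + h) - I(t - h)) / (2 * h)
```

At t = 1 both sides equal 3t²/2 = 1.5.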
As well as allowing given integrals to be differentiated, these results can be exploited by using known integrals to evaluate related, unknown integrals. For example, in thermal physics one frequently needs to evaluate integrals of the form

I_n(α) = ∫_0^∞ x^n e^(−αx²) dx,

where n ≥ 0 and α > 0. There are no problems with convergence, and for n = 0 one easily obtains

I_0(α) = (1/2)(π/α)^(1/2),   (7.57)
while differentiating (7.57) with respect to α using (7.56) gives

dI_0/dα = −∫_0^∞ x² e^(−αx²) dx = −(1/4) π^(1/2) α^(−3/2),

and hence

I_2(α) = (1/4)(π/α³)^(1/2).

Repeating the differentiation with respect to α generates the integrals for higher values of n, and in general
(7.58)
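The values just quoted can be confirmed by direct numerical integration. The sketch below uses a simple trapezoidal rule with the upper limit truncated where the integrand is negligible, for α = 1, where I₀ = √π/2 and I₂ = √π/4:

```python
import math

def I_n(n, alpha, upper=10.0, steps=20000):
    # trapezoidal estimate of ∫_0^∞ x^n e^(-αx²) dx, truncated at `upper`
    # (the integrand is ~e^(-100) there, far below the quadrature error)
    h = upper / steps
    total = 0.5 * (0.0**n * math.exp(0.0)
                   + upper**n * math.exp(-alpha * upper**2))
    for i in range(1, steps):
        x = i * h
        total += x**n * math.exp(-alpha * x * x)
    return total * h
```

Besides reproducing (7.57), a finite-difference derivative of I₀ with respect to α reproduces −I₂, which is the differentiation trick in action.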
Show that the relation
is satisfied for each of the following functions:
A function f(x, y) is of the form
where g is an arbitrary function of . Show that
The plane z = αx + βy + γ is tangential to the sphere z² = 14 − x² − y² at the point (x, y, z) = (1, 2, 3). Find the values of the constants α, β and γ, and hence the equation of the plane.
F is a function of three independent variables x, y and z, and a, b and k are constants.
If F = sin(ax) sin(by) sin[kz(a² + b²)^(1/2)], show that
If F = e^(−kz)[sin(ax) + cos(by)], show that
Two independent variables u and w are given in terms of two other independent variables x and y, by
where a, b, k and h are constants. By using differentials, show that
and
A wide class of systems (e.g. a sample of liquid or gas) satisfies the fundamental thermodynamic identity
dE = T dS − P dV,   (7.59)
where E is the energy, S is the entropy, and P, V and T are the pressure, volume and temperature of the system, respectively.
Use (7.5) to derive the Maxwell identity

(∂T/∂V)_S = −(∂P/∂S)_V.
Obtain an expression for dG, where G ≡ E − TS + PV, and hence derive the second Maxwell identity

(∂S/∂P)_T = −(∂V/∂T)_P.
The equilibrium behaviour of a gas at high temperature can be described approximately by Dieterici's equation:
P(V − b) = RT e^(−a/RTV),   (7.60)
where P, V and T are the pressure, volume and temperature, respectively, R is the gas constant, and a and b are parameters that are characteristic of the particular gas.
(7.61)
in this approximation.

Which of the following differentials are exact?
Show that the following are exact differentials df of a function f(x, y) and identify the function.
Find when z is given by the following expressions
Which of the following functions f(x, y, z) satisfy the equation
and what is the corresponding value of the constant k?
If f(x1, x2, …, xn) is a homogeneous function of degree k, show that
so that, for example,
if f(x, y) is homogeneous of degree k.
If z = f(x, y), where
show that
and
A function f(x, t) is given by
where φ1 and φ2 are arbitrary differentiable functions, and c is a constant. Show that
If the function f(x, y) is transformed to a function g(u, w) by the substitutions
show that
Use Taylor's theorem to expand to second order about the point x = 2, y = 1.
Expand as a Taylor series about x = y = 0 up to cubic terms.
Find the maximum and minimum values of the function
inside the square defined by 0 < x, y < π.
Find the stationary points of the function
and classify them as either minima, maxima, or saddle points.
Find the stationary points of the function f(x, y) = x² − y² − 2, subject to the constraint x² − 2y = 2.
Find the volume of the largest box with sides parallel to the x, y, z axes that can be fitted into the ellipsoid
A set of numbers xi (i = 1, 2, …, n) has a product P. What is the largest value of P, if their sum is equal to N?
Evaluate
where x > 0, and hence find I(x) itself, given that I(1) = 0.
If f(x, t) = 1/ln (x + t), find
Evaluate
Show by differentiation with respect to α, that
Find an explicit expression for
where a > 0 and k ≥ 0 is an integer, given that .