In Chapter 8, we introduced vectors as objects associated with a direction in everyday three-dimensional space and showed how they can be discussed using equations for their three components in a given reference frame. Here we shall show how to extend the number of components to define vectors in spaces of more than three dimensions. This leads to the introduction of matrices, which are two-dimensional arrays that enable vectors to be transformed into other vectors. The properties of matrices are discussed in detail and their uses illustrated in, for example, solving simultaneous linear equations. In the following chapter we continue the discussion of matrices, with applications to vibrating systems and to geometry. Firstly, however, we study related quantities called determinants, which will play a crucial role in this development.
These occur in many contexts and we have already met examples in the discussion of vectors in Chapter 8. From (8.16b), the vector product of two vectors a and b in Cartesian co-ordinates has an x-component (aybz − azby). Any four quantities aij(i, j = 1, 2) combined in this way can be written in the form of a square array, denoted by Δ2, called a determinant. This is written in the form
where the quantities aij(i, j = 1, 2) are called the elements of the determinant. For example,
The result, in this case − 5, is called the value of the determinant. It is important to note that the vertical bars in (9.1) do not mean that a modulus is to be taken, as this example confirms. Although we have used real numbers for the elements in this example, in general they can be algebraic expressions, real or complex, so the value of the determinant may also be a real or complex expression or number.
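A quick numerical check of the 2 × 2 rule a11a22 − a12a21 is sketched below in Python. The elements are hypothetical, chosen only so that the value happens to be −5 as in the example above; they are not the elements used in the text.

```python
# 2 x 2 determinant: a11*a22 - a12*a21 (hypothetical elements, for illustration only)
a11, a12 = 1, 3
a21, a22 = 4, 7
value = a11 * a22 - a12 * a21
print(value)   # 1*7 - 3*4 = -5: a signed value, not a modulus
```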
Determinants of larger dimensionality can also be constructed. Thus the 3 × 3 determinant
is defined as
Comparing this with (8.18), we see that the triple scalar product of three vectors a, b, c
is a determinant whose elements are the Cartesian components of the vectors. Likewise, comparing (9.2a) with (8.16a) shows that the vector product of two vectors a and b can also be written as a 3 × 3 determinant
The two compact forms (9.3a) and (9.3b) are probably the easiest way of remembering the expressions (8.18) and (8.16a) for the triple scalar product and vector product, respectively.
Returning to (9.2b), we see that the terms in brackets on the right-hand side are themselves 2 × 2 determinants. Hence we can write
where the determinants that occur on the right-hand side are examples of minors. In general, the minor mij of any element aij of Δ3 is the 2 × 2 determinant obtained by deleting all the elements in the ith row and jth column of Δ3. Therefore (9.2b) can be written
where the co-factor of any element aij is defined by
Equation (9.4a) is called the Laplace expansion along the first row of Δ3. For example, the minors of the elements along the first row of the determinant
are
so that (9.4a) gives
Laplace expansions can be made along any row or column. For example, the expression in (9.2b) can be rearranged to give
which is the Laplace expansion
along the second row. Using this expansion for the determinant (9.5) gives
in agreement with the value obtained by expanding along the first row. Alternatively (9.2b) can be written in the form
which is a Laplace expansion along the third column.
The definition of a determinant can now be extended to integers n > 3 by generalising the Laplace expansion (9.4a) to any n. To do this, we first write an n × n array
(9.6a)
where the elements are aij (i, j = 1, 2, …, n) and the indices i and j again label the rows and columns, respectively. Then, by analogy with the expansion (9.4a) for 3 × 3 determinants, we define
where the minors mij are again the determinants obtained by deleting all the elements of the ith row and jth column, and the co-factors are given by (9.4b). Since the minors associated with the elements of an n × n determinant are (n − 1) × (n − 1) determinants, (9.6b) defines 4 × 4 determinants in terms of a sum of 3 × 3 determinants, and so on. Such higher order determinants are required in, for example, the solution of n simultaneous linear equations, as we shall see in Sections 9.1.2 and 9.4.4.
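As an illustration of how the recursive definition (9.6b) can be turned into a calculation, the following Python sketch expands a determinant along its first row. It is meant only to make the definition concrete: the number of operations grows roughly factorially with n, so it is not how determinants are evaluated in practice.

```python
def det_laplace(a):
    """Determinant by Laplace expansion along the first row.

    `a` is a list of n lists, each of length n. Educational sketch only:
    the cost grows roughly as n!, so use library routines for large n.
    """
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for j in range(n):
        # minor m_1j: delete the first row and column j (0-based column index)
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        cofactor = (-1) ** j * det_laplace(minor)   # sign (-1)**(1+j) in 1-based indices
        total += a[0][j] * cofactor
    return total

print(det_laplace([[1, 2], [3, 4]]))                    # -2
print(det_laplace([[2, 0, 1], [1, 3, 4], [0, 5, 6]]))   # 1
```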
The evaluation of determinants using the Laplace expansion involves the arithmetical operations of addition, subtraction and multiplication, the number of which increases rapidly as the dimensionality of the determinant increases. The work involved can sometimes be reduced by exploiting a number of general properties of determinants that are given below.
Although these results hold in general, here we will only consider the case for 3 × 3 determinants. In this case it is convenient to define the totally antisymmetric symbol ϵijk as follows:
where cyclic permutations were defined following (8.16b). Using (9.7), Eqs. (9.2a) and (9.2b) may be written
where we have used a shorthand notation for a sum over three dummy indices i, j and k, which may each take the values 1, 2 and 3, i.e.
The theorems are as follows.
(i) The value of a determinant is unchanged by interchanging (called transposing) its rows and columns.
This corresponds to the transformation aij → aji for i, j equal to 1, 2 and 3. Using the notation in (9.8) and denoting the new determinant by ΔT3, this gives
Rearranging the right-hand side gives
It follows that theorems about rows also apply to columns, so it is sufficient to prove them only for the former.
(ii) The sign of a determinant is reversed by interchanging any two of its rows (or columns).
This result again follows directly from (9.8). For example, interchanging the first and second rows gives
and using the definition (9.7),
(iii) The value of a determinant is zero if any two rows (or columns) are identical.
This follows immediately from the preceding result, because this interchange gives Δ3 = −Δ3 and hence Δ3 = 0.
(iv) If the elements of any one row (or column) are multiplied by a common factor, the value of the determinant is multiplied by this factor.
This follows trivially, because each term in (9.8) contains a single element from each row (or column).
Using these theorems, a number of other useful results may be established. In particular, the one used repeatedly below [property (vii)] is that the value of a determinant is unchanged if a multiple of any row (or column) is added to another row (or column).
These properties can often be used to manipulate a determinant into a form that is easier to evaluate. For example, consider the determinant
The elements of the first row are all multiples of 9, which can therefore be factored out to give
Then by property (vii) we can add row 3 to row 1 without changing the value of the determinant, when we obtain
because a determinant with two equal rows has a value zero [property (iii)].
In other cases, property (vii) can often be used to manipulate a determinant into a form where it has one or more zeros in a given row or column. Then if this row or column is used in the Laplace expansion, the number of arithmetic operations can be reduced considerably. Consider the evaluation of the determinant
In this case, one way of proceeding is to add column 4 to each of columns 1 and 3, and add twice column 4 to column 2, when we obtain
Making a Laplace expansion along the first row gives
Then subtracting row 2 from row 1 gives
The Laplace expansion is most suited for determinants of low dimensionality (i.e., small values of n) and where in numerical calculations the elements do not differ much in magnitude. For large-dimensional determinants, the final result may still be formed from the addition and subtraction of many terms, each of which is itself the product of several elements. In these cases there is a significant probability of inaccuracies being introduced in numerical calculations due to rounding errors, particularly if the elements differ considerably in magnitude. Special computer programs exist1 that address this problem, and are capable of evaluating determinants exactly.
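In practice, numerical libraries evaluate determinants by factorisation rather than by a direct Laplace expansion. A minimal sketch using NumPy (assumed available; not a routine referred to in the text) is:

```python
import numpy as np

A = np.array([[4.0, 2.0, 1.0],
              [3.0, 5.0, 7.0],
              [1.0, 0.0, 2.0]])

# numpy.linalg.det works via an LU factorisation, which is far more efficient
# and numerically stable than a direct Laplace expansion for large matrices.
print(np.linalg.det(A))   # ~37.0 (floating point, so expect tiny rounding error)
```

For exact results with integer or symbolic elements, a computer-algebra system can be used instead of floating-point routines.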
We have seen that determinants appear naturally when manipulating vectors. They also appear in the theory of simultaneous linear equations. If there are n simultaneous linear equations in n unknowns xi(i = 1, 2, …, n), they may be written in the general form,
where the aij (i, j = 1, 2, …, n) and bj (j = 1, 2, …, n) are constants. These equations are not necessarily compatible. In the general case where the bj are not all zero, the equations are called inhomogeneous, and their solution will be discussed in Section 9.4.4. In the simpler homogeneous case, where all the constants bj are zero, the equations are never inconsistent, because they always have a so-called trivial solution where all the xi are zero. But they may also have non-trivial solutions, where not all the xi are zero. Because the equations are linear and homogeneous, it follows that if a non-trivial solution exists for a particular set of values xi(i = 1, 2, …, n), then the set cxi(i = 1, 2, …, n), where c is a constant, is also a solution. Thus non-trivial solutions are characterised by the ratios x1 : x2 : x3: ⋅⋅⋅: xn, rather than by unique values.
We will examine below how to find non-trivial solutions, using initially the example of n = 3, that is, the set of equations
which has an associated determinant of coefficients
The value of this determinant determines whether or not a non-trivial solution exists.
An obvious way to proceed is to use the third equation in (9.10) to give an expression for x3 in terms of x2 and x1, then substitute this into the other two equations and examine the two resulting equations in x1 and x2 to see if they have compatible solutions. However, this is algebraically rather cumbersome and rapidly becomes very tedious if one considers more than three equations.
Instead, we will use another method, in which the key result follows from the equation
obtained by multiplying the first equation in (9.10) by the co-factor A11, the second by A21, and the third by A31, and adding the three resulting equations together. The first term in brackets in (9.11a) is seen to be the Laplace expansion of Δ using the first column, and so has the value Δ. On comparing the second bracket with the first, we see that it is the Laplace expansion of a determinant in which the first column a11, a21, a31 of Δ has been replaced by a12, a22, a32. Hence
because two columns are identical. The third bracket in (9.11a) vanishes for a similar reason, so that (9.11a) reduces to
x1Δ = 0, (9.11b)
and therefore x1 = 0 unless Δ = 0. Analogous arguments show that x2Δ = x3Δ = 0, so a necessary condition for a non-trivial solution to (9.10) is
Furthermore, if we substitute
into (9.10), we see that the left-hand sides of the three equations (9.10) equal the three terms in brackets in (9.11a), which have all been shown to vanish for Δ = 0. Hence (9.13a) is the desired non-trivial solution and (9.12) is both a necessary and sufficient condition for it to exist. A similar argument shows that the solution can equally well be expressed in the form
(9.13b)
In contrast to the direct method of solution, the above chain of reasoning can be extended in a straightforward way to solve n homogeneous linear equations for any integer n. The condition for a non-trivial solution then becomes
(9.14)
and provided this is satisfied, the non-trivial solution is given by the co-factors, i.e.,
(9.15)
Finally, we note that for the case n = 3, the homogeneous equations (9.10) have a simple geometrical interpretation if we interpret x1, x2 and x3 as Cartesian co-ordinates x, y and z. On comparing to (1.51), we see that the three equations (9.10) are those of three planes passing through the origin. Hence the line of intersection of two of these planes, assuming they are not identical, will also pass through the origin. If this line lies in the plane described by the third equation, then any point on it is a solution to all three equations (9.10). In this case, there is a non-trivial solution given by (9.13a), which is indeed the equation of a straight line through the origin, as can be seen by comparing with (8.40). On the other hand, if it does not lie in the plane described by the third equation, then it just passes through that plane at the origin and there is no non-trivial solution to all three equations.
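The condition (9.12) and the cofactor form of the non-trivial solution are easy to check numerically. The sketch below (assuming NumPy) uses a coefficient matrix whose third row is the sum of the first two, so that its determinant vanishes, and verifies that the co-factors of one row solve the homogeneous equations; whether (9.13a) uses the co-factors of a row or of a column is not shown above, but either choice works.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [5.0, 7.0, 9.0]])        # row 3 = row 1 + row 2, so det A = 0

print(np.isclose(np.linalg.det(A), 0.0))   # True: a non-trivial solution exists

def cofactor(a, i, j):
    """Co-factor A_ij = (-1)**(i+j) times the minor m_ij."""
    minor = np.delete(np.delete(a, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(minor)

# The co-factors of the first row give one non-trivial solution (up to an overall scale)
x = np.array([cofactor(A, 0, j) for j in range(3)])
print(x)        # [ 3. -6.  3.]
print(A @ x)    # approximately [0. 0. 0.]
```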
In Chapter 8, three-dimensional vectors were defined as mathematical quantities having magnitude and direction and satisfying the parallelogram law of addition. This approach is a geometrical one and is independent of the co-ordinate system. We also developed an algebraic approach using basis vectors (i, j, k) in the directions of the x, y, z axes of a three-dimensional Cartesian co-ordinate system. Any vector a could then be specified by its components ax, ay, az along the directions of the basis vectors, i.e.
or equivalently a = (ax, ay, az). The basis vectors are not unique (for example, we could rotate the three axes through a fixed angle and use these new directions to define new basis vectors) but they are linearly independent. This means that there is no linear combination of them that vanishes, unless the coefficients are all zero. That is,
only if
In the physical sciences it is common to encounter ordered sets of n quantities a = (a1, a2, …, an), b = (b1, b2, …, bn) etc., whose elements satisfy the same algebraic properties as the components of vectors. In particular, if we define their sums by a + b = (a1 + b1, a2 + b2, …, an + bn) (9.16a)
and multiplication by a scalar λ by λa = (λa1, λa2, …, λan), (9.16b)
then they obey all the general rules (8.1), (8.2) deduced for vectors in Chapter 8. For this reason (a1, a2, …, an) and (b1, b2, …, bn) are referred to as the components of vectors a and b in an n-dimensional vector space. In addition, we can define a null vector 0, whose n components are all zero, so that for any vector a,
Implicit in the choice of the word ‘component’ to describe (a1, a2, …, an), (b1, b2, …, bn), etc. is the existence of a set of basis vectors, for example, e1 = (1, 0, 0, …, 0), e2 = (0, 1, 0, …, 0), …, en = (0, 0, 0, …, 1), (9.17)
so that a = a1e1 + a2e2 + ⋯ + anen, (9.18)
in analogy to a = axi + ayj + azk for ordinary three-dimensional vectors. As for the case of ordinary vectors, the choice of basis vectors is not unique, and we can equally well expand the vector a in terms of any set of basis vectors ei(i = 1, 2, …, n), providing the latter are linearly independent, that is, provided that
μ1e1 + μ2e2 + ⋯ + μnen = 0 (9.19a)
has no solutions for the constants μi except
μi = 0 for all i = 1, 2, …, n. (9.19b)
This ensures that none of the basis vectors can be expressed in terms of the others, and, in general, a vector space is said to be n-dimensional if it contains no linearly independent set of vectors with more than n members. Such a set of n linearly independent vectors is called a complete set. Linear independence also guarantees the uniqueness of the expansion (9.18). This is easily seen by writing
and equating this to (9.18) gives
which from (9.19) has no solution other than a′i = ai for all i = 1, 2, …, n. Of course the components (a1, a2, …, an) will depend on the particular basis vectors chosen, and (a1, a2, …, an) is said to be a representation of a in the basis ei(i = 1, 2, …, n).
In what follows, we will need to relate the components ai in a given representation (9.18) to the components a′i in a representation
defined with respect to a different set of basis vectors e′i (i = 1, 2, …, n). To do this, we note that any vector in the space can be written in the form (9.18), including the new basis vectors e′i. Hence we can write
where pij are numerical constants. On substituting (9.21a) into (9.20), we obtain
This is only compatible with (9.18) for arbitrary vectors a if
(9.21b)
which is the desired relation.
The components of vectors need not be restricted to real quantities. Complex vectors in an arbitrary number of dimensions play an important role in, for example, quantum mechanics. Generalising the vectors and scalar variables to complex quantities does not alter any of the equations (8.1), (8.2) or (9.16)–(9.18), but does affect the definition of the scalar product. To distinguish this from the scalar product defined in Chapter 8 for three-dimensional vectors, we will use the notation (a, b) (also called the inner product in this context).
For the moment, we restrict ourselves to the basis (9.17), when the inner product of two vectors a = (a1, a2, …, an) and b = (b1, b2, …, bn) is defined to be
It reduces to the scalar (dot) product defined in Chapter 8 for the case of real coefficients and ensures that the squared length
remains real and positive. This leads to the basic properties
(9.23b)
from which it follows that
(9.23d)
and
(9.23e)
where λ and μ are both in general complex constants. Note that these relations reduce to the corresponding relations (8.8a), (8.8b) and (8.8c) for the real vectors discussed in Chapter 8 when λ, μ and the vectors themselves are real. In particular, we see from (9.23c) that the scalar product is only commutative for real vectors.
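These properties can be checked numerically. The sketch below (assuming NumPy) uses numpy.vdot, which conjugates its first argument; this corresponds to the convention in which the first vector in (a, b) is conjugated, and if the text conjugates the second vector instead, the two arguments are simply swapped.

```python
import numpy as np

a = np.array([1 + 2j, 3 - 1j])
b = np.array([2 - 1j, 0 + 1j])

ab = np.vdot(a, b)            # sum of conj(a_i) * b_i
ba = np.vdot(b, a)
print(ab, ba)                 # (-1-2j) and (-1+2j): complex conjugates, not equal
print(np.vdot(a, a).real)     # squared length: real and positive (here 15.0)
```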
We can now apply the general properties (9.23a)–(9.23e) to a general basis (9.18). In doing so, we will assume that the chosen basis satisfies the orthonormality relations [cf. (8.11)]
(ei, ej) = δij, (9.24a)
where δij is the Kronecker delta symbol, defined by
δij = 1 if i = j, and δij = 0 if i ≠ j. (9.24b)
Then using (9.23) repeatedly we have
using (9.24). Thus the expression (9.22) holds in all bases (9.18) provided the orthonormality relations (9.24) are satisfied. Furthermore, using (9.18) and (9.24) we have
i.e. the vector a is given by
(9.25)
In this section we introduce matrices and discuss their role in transforming vectors into other vectors.
Consider the set of linear simultaneous equations
where the coefficients aij(i = 1, 2, …, m; j = 1, 2, …, n) are constants. These equations determine m variables yi(i = 1, 2, …, m) in terms of n given variables xj(j = 1, 2, …, n), where the integers m and n are not necessarily equal. It is convenient to write (9.27) in a form that separates the variables xj from the coefficients aij as follows:
This array of coefficients is called a matrix and the quantities aij are called the elements of the matrix. It is said to be of order m × n because it has m rows and n columns. The vertical arrays yi(i = 1, 2, …, m) and xj(j = 1, 2, …, n) are also matrices, in this case of order m × 1 and n × 1. They are referred to as column matrices, or column vectors. Likewise, matrices of order 1 × n are referred to as row matrices, or row vectors. On comparing (9.28) with (9.27), we see that each of the yi(i = 1, 2, …, m) is obtained by multiplying the element in the ith row of the m × n matrix by the numbers xj(j = 1, 2, …, n) in turn and adding, so that yi = ∑j aij xj (i = 1, 2, …, m). (9.29)
For example, if
then
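The index relation (9.29) is exactly what a numerical matrix–vector product computes. The sketch below (assuming NumPy; the matrix and vector are hypothetical, not those of the example above) forms y = Ax for a 2 × 3 matrix A.

```python
import numpy as np

A = np.array([[1, 2, 0],
              [3, -1, 4]])    # a 2 x 3 matrix: maps a 3-component x to a 2-component y
x = np.array([2, 1, -1])

y = A @ x                     # y_i = sum_j a_ij * x_j, as in (9.29)
print(y)                      # [4 1]
```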
So far we have merely rewritten (9.27) in the different, but equivalent, form (9.28). The usefulness of this form results from developing rules for manipulating matrices directly. In doing this, it is convenient to denote matrices by upper-case bold Roman letters A, B, C, etc., with the exception that both row and column vectors are denoted by lower-case bold Roman letters a, b, c, etc. Thus, (9.28) may be written in the compact form y = Ax. (9.30)
Matrix algebra is then defined by the following rules.
Equality
Two matrices A, with elements aij, and B, with elements bij, are equal, if, and only if, they are of the same order m × n, and aij = bij for all i = 1, 2, …, m and j = 1, 2, …, n.
Addition
The sum S of two matrices A and B is defined if, and only if, they have the same order. The elements of S are then given by
This leads directly to the commutative and associative laws
A + B = B + A (9.32a)
and
(A + B) + C = A + (B + C), (9.32b)
respectively.
Scalar multiplication
If a matrix A is multiplied by a scalar quantity λ, then every element of A is multiplied by λ, i.e.
If λ and μ are arbitrary constants, (9.31)–(9.33) lead to the associative and distributive laws
(9.34a)
(9.34b)
and
λ(A + B) = λA + λB, (9.34c)
provided again that A and B are of the same order. In addition, we define null matrices 0 of any dimension, whose elements are all zero, so that
A + 0 = 0 + A = A. (9.34d)
Matrix multiplication
The product of two matrices AB is defined if, and only if, the number of columns in A is the same as the number of rows in B. Then, if A is an l × m matrix and B is an m × n matrix, the product AB is an l × n matrix whose elements are defined by (AB)ik = ∑j aij bjk (9.35)
for all i = 1, 2, …, l; j = 1, 2, …, n. In other words, the element (AB)ik is obtained by multiplying each element of row i of A by the corresponding element of column k of B, and adding. For example, if
then AB is the 2 × 2 matrix
It is worth noting that, just as the scalar product of two ordinary three-dimensional vectors can vanish without either vector being zero, the product AB can be a null matrix even though neither A nor B is a null matrix. For example, if
then
but neither A nor B is a null matrix.
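Both features, the failure of commutativity and the possibility of a vanishing product with non-null factors, can be exhibited with very small matrices; the pair below is a standard illustration, not the pair used in the text's example.

```python
import numpy as np

A = np.array([[0, 1],
              [0, 0]])
B = np.array([[1, 0],
              [0, 0]])

print(A @ B)    # [[0 0] [0 0]]: AB is the null matrix, yet neither A nor B is null
print(B @ A)    # [[0 1] [0 0]]: and BA is not equal to AB
```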
To motivate the definition (9.35) and to derive another important relation, let us suppose the n-component column vector x in (9.30) is related to a p-component column vector z by x = Bz, (9.37a)
where B is an n × p matrix, so that xj = ∑k bjk zk. (9.37b)
Substituting (9.37a) into (9.30) gives y = A(Bz). (9.38a)
On the other hand, substituting (9.37b) into (9.29), gives
which, on comparing with (9.35), is seen to be
Hence y = (AB)z and on comparing this with (9.38a), we finally obtain A(Bz) = (AB)z. (9.38b)
From this we see that the position of the brackets is immaterial and we can write y = ABz without ambiguity. By a similar argument one can show that
(AB)C = A(BC) = ABC, (9.39)
and so on. However, while the position of brackets in matrix products is not important, the order is crucial, since matrix multiplication is not in general commutative, that is, AB ≠ BA. This is obvious for the multiplication of an n × m matrix A and an m × n matrix B, because the products AB and BA have different dimensionalities, but it is also true even if n = m. Matrix multiplication is however distributive with respect to addition, i.e.
and
Column matrices are special cases of m × n matrices with n = 1 and are written with the second index suppressed, that is, we write them with a single row index. For example,
(9.41)
With this convention, for any two column matrices a and b, (9.31) and (9.33) reduce to
(a + b)i = ai + bi (9.42a)
and
(λa)i = λai. (9.42b)
These relations are identical to (9.16a) and (9.16b) used to characterise the components of an n-dimensional vector in Section 9.2. Similarly, the matrix relations (9.32)–(9.34) reduce to the vector relations (8.1) and (8.2) when applied to column matrices. Hence column matrices are with justification referred to as column vectors. The scalar product of a vector a with a vector b is also easily expressed in matrix notation, since the product of a row vector and a column vector of the same order n is given by ab = a1b1 + a2b2 + ⋯ + anbn = ∑i aibi. (9.43)
Comparing this with (9.22), we see that in an orthonormal basis, the scalar product is (a, b) = a†b, (9.44)
where the row vector a† corresponding to the column vector a is defined by
and is called the Hermitian conjugate of a for reasons that will become clear in Section 9.3.3.
Returning to (9.30), we now interpret the matrix A as a matrix operator that transforms an n-dimensional vector x into an m-dimensional vector y. By an operator we mean anything that acts on the object to its right, called the operand, to give a new object. Furthermore, it is easy to show, using (9.29) and (9.42), that A(λa + μb) = λAa + μAb, (9.45)
where λ and μ are arbitrary constants and a, b are arbitrary vectors. Any operator that satisfies an equation of the form (9.45) is called a linear operator and, correspondingly, (9.30) is called a linear transformation. Another linear operator, which we will meet in Chapter 10, is the differential operator d/dx, which transforms a function f(x) into its derivative. Thus,
(9.46a)
where the linearity condition
(d/dx)[λf(x) + μg(x)] = λ df(x)/dx + μ dg(x)/dx (9.46b)
follows directly from (3.19).
Linear operators and transformations are widely used in mathematics and physical science. Here we shall confine ourselves to matrix operators. A simple example is provided by considering a position vector in two dimensions,
When rotated through an angle θ, this gives a new position vector
of the same length r, as shown in Figure 9.1. Using the trigonometric identities (2.36), we have
and similarly
Hence in matrix notation,
(9.48)
or equivalently,
r′ = R(θ)r, (9.49)
where the rotation matrix
Finally, we consider the product of two transformation matrices A and B. Equation (9.38b) implies
so that the transformation AB is equivalent to the operator B acting first, followed by the operator A. In other words, the operator on the right acts first, and if A acts before B, the appropriate operator is BA ≠ AB, since in general matrices do not commute.
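A short numerical sketch of the rotation matrix and of operator composition is given below, assuming NumPy and the usual anticlockwise sign convention (which may differ from the convention in Figure 9.1). It also checks that two rotations in the plane compose into a single rotation through the summed angle.

```python
import numpy as np

def R(theta):
    """2 x 2 rotation matrix, anticlockwise convention."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

r = np.array([1.0, 0.0])
print(R(np.pi / 2) @ r)                     # ~[0, 1]
print(np.linalg.norm(R(0.3) @ r))           # 1.0: the length is unchanged

# In the plane, applying R(b) and then R(a) equals a single rotation R(a + b)
a, b = 0.4, 1.1
print(np.allclose(R(a) @ R(b), R(a + b)))   # True
```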
Given a matrix A with elements aij, it is useful to define three related matrices, as follows.
The transpose of A, denoted AT, is obtained by interchanging rows and columns. An example is
while the general relation is
(AT)ij = aji. (9.51)
It follows from this that (AT)T = A, since ((AT)T)ij = (AT)ji = aij.
In general, the transpose of a product of matrices is the product of the individual transposed matrices taken in reverse order. Thus, (AB)T = BTAT (9.52) and (ABC)T = CTBTAT,
and so on, which follows by repeated application of (9.52).
The complex conjugate of a matrix A is denoted A* and has elements a*ij. Complex conjugation has no effect on the order in products, i.e.
The Hermitian conjugate2 of a matrix A, written A†, is defined as the transpose of the complex conjugate matrix, or vice versa, i.e.
A† ≡ (A*)T = (AT)*, (9.53a)
so that3
Since Hermitian conjugation involves a transpose, it also reverses the order of products, i.e.
(AB)† = B†A†, (9.54)
For a real matrix, the Hermitian conjugate is just the transpose.
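In NumPy (assuming complex arrays), the three related matrices correspond directly to the operations in this sketch.

```python
import numpy as np

A = np.array([[1 + 1j, 2 + 0j],
              [0 + 3j, 4 - 2j]])

print(A.T)          # transpose A^T
print(A.conj())     # complex conjugate A*
print(A.conj().T)   # Hermitian conjugate: conjugate and transpose, in either order
print(np.array_equal(A.conj().T, A.T.conj()))   # True: the order does not matter
```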
Matrices with the same number of rows and columns are called square matrices, and their dimension n = m is called their order. We discuss here some of the most important types of square matrices that will be required in later sections.
Diagonal matrix
A matrix A is diagonal if its elements aij are zero unless they lie on the leading diagonal i = j, so that aij = aiδij, where δij is the Kronecker delta symbol of (9.24b). The sum of the elements along this diagonal is called the trace, denoted Tr. As an exception to the general rule, diagonal matrices of the same order commute under multiplication, that is, AB = BA if A and B are both diagonal. An important example of a diagonal matrix is the unit matrix I defined by
(I)ij = δij, (9.55)
which has the property
AI = IA = A (9.56)
for any matrix A (not necessarily diagonal) of the same order.
Symmetric and anti-symmetric matrices
A matrix is symmetric if it satisfies the condition A = AT, i.e. aij = aji, and anti-symmetric (or skew symmetric) if A = −AT, i.e. aij = −aji, where AT is the transpose of A. Any matrix A may be expressed as the sum of a symmetric and an anti-symmetric matrix, by analogy with the decomposition of functions as the sum of symmetric and anti-symmetric functions, as discussed in Section 1.3.1. Thus
where by construction the first bracket is a symmetric matrix and the second is anti-symmetric.
Hermitian matrix
A matrix is Hermitian, if it satisfies A = A†, where the dagger indicates the combined operation of complex conjugation and transposition, carried out in either order, that is, if a†ij = (aji)* = aij. If A† = −A, the matrix A is said to be anti-Hermitian (or skew Hermitian). Any complex matrix can be expressed as the sum of a Hermitian matrix and an anti-Hermitian matrix. Thus,
where by construction the first bracket is a Hermitian matrix and the second is anti-Hermitian. A real, symmetric matrix is automatically Hermitian, because A† = AT in this case.
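The two decompositions are straightforward to verify numerically; a minimal sketch for the Hermitian case (assuming NumPy) is:

```python
import numpy as np

A = np.array([[1 + 2j, 4 + 0j],
              [2 - 1j, 0 + 3j]])

H  = (A + A.conj().T) / 2        # Hermitian part
AH = (A - A.conj().T) / 2        # anti-Hermitian part

print(np.allclose(H, H.conj().T))      # True: H is Hermitian
print(np.allclose(AH, -AH.conj().T))   # True: AH is anti-Hermitian
print(np.allclose(H + AH, A))          # True: they sum back to A
```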
Unitary matrix
A matrix U is said to be unitary if it satisfies
If we make the unitary transformation
on a vector x, then by (9.43) and (9.57a),
so that the length of the vector is unchanged.
Orthogonal matrix
An orthogonal matrix O is a real unitary matrix. It therefore also leaves the length of a vector unchanged and (9.57a) becomes
OTO = OOT = I. (9.57b)
Given a square matrix A of order n, we can define an associated determinant by
(9.58)
If det A = 0, the matrix is said to be singular; if det A ≠ 0, then A is non-singular.
The properties of determinants have been summarised in Section 9.1. Since interchanging rows and columns leaves the value of the determinant unchanged, it follows that
det AT = det A. (9.59a)
Similarly, since det A* = (det A)*, we have
det A† = (det A)* (9.59b)
for the Hermitian conjugate matrix A†. Multiplying a matrix by a scalar constant λ multiplies every element aij by λ, but since each term in the determinant contains exactly one element from each row, we have
det (λA) = λⁿ det A (9.60a)
for a square matrix of order n. The determinant of a product of matrices is equal to the product of the determinants, i.e. det (AB) = det A det B. (9.60b)
The proof of (9.60b) is rather lengthy and will not be reproduced here4. However, it follows from it that
and repeated application of (9.60b) leads to
det (ABC ⋯) = det A det B det C ⋯ (9.60d)
for any number of matrices, independent of their order.
Equation (9.60b) also leads to useful results for unitary and orthogonal matrices. Specifically, from (9.57a) and (9.60b), we obtain det U† det U = (det U)* det U = |det U|² = det I = 1. (9.61)
Hence the modulus of the determinant of a unitary matrix is unity, and since an orthogonal matrix O is just a real unitary matrix, its determinant is real, so that det O = ±1.
A simple example of an orthogonal matrix is the rotation matrix in two dimensions R(θ) described in (9.50). One sees that det R(θ) = cos²θ + sin²θ = 1,
consistent with (9.61). In contrast, a matrix that generates a reflection in a given axis, for example
so that x′ = −x, y′ = y, has determinant − 1. This behaviour is characteristic of rotations and reflections about any given axis.
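The determinants of rotations and reflections can be checked directly; the sketch below (assuming NumPy) uses the two-dimensional rotation matrix and the reflection x′ = −x, y′ = y mentioned above.

```python
import numpy as np

theta = 0.7
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
reflection = np.array([[-1.0, 0.0],
                       [0.0, 1.0]])     # x' = -x, y' = y

print(np.linalg.det(rotation))    # ~ +1
print(np.linalg.det(reflection))  #   -1
```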
We can now complete the discussion of matrix algebra. The operation of division by a matrix is not defined. However, if we can find a matrix D such that AD = DA = I, then D is called the inverse of A and is written A−1, so that AA−1 = A−1A = I. (9.62)
The analogy with division is then multiplication by A−1, so that, for example,
Equation (9.62) can only be satisfied if A and A−1 are square matrices of the same order, while (9.60b) then implies det A det A−1 = det I = 1,
so that a singular matrix (one having det A = 0) has no inverse, whereas a non-singular matrix does have an inverse. To find the inverse of a matrix A, we need a new matrix called the adjoint, denoted adj A. This is defined as the transpose of the matrix of co-factors of A. Thus for the n × n matrix A, with co-factors Aij corresponding to the elements aij, the adjoint matrix is
(adj A)ij = Aji, (9.63)
from which it follows that ∑k aik Ajk = δij det A. (9.64)
To see this, we note that for i = j, (9.64) is just the Laplace expansion of det A along row i; while for i ≠ j, it is the Laplace expansion of the determinant of a matrix A′ which differs from A in that the jth row is replaced by the ith row, and which therefore vanishes because two of its rows are identical. Thus we have arrived at the result that the matrix defined by D = adj A/det A has the property that AD = I and hence D can be identified with the inverse matrix A−1, i.e. A−1 = adj A/det A, (9.65)
and AA−1 = I. A similar argument gives A−1A = I, and hence (9.62) is satisfied.
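The construction A−1 = adj A/det A can be written out directly. The sketch below (assuming NumPy, and using np.linalg.det for the minors) builds the adjoint as the transpose of the matrix of co-factors and checks that it gives the inverse.

```python
import numpy as np

def adjoint(a):
    """Transpose of the matrix of co-factors of a square matrix a (a sketch of (9.63))."""
    n = a.shape[0]
    cof = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(a, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])

A_inv = adjoint(A) / np.linalg.det(A)
print(np.allclose(A @ A_inv, np.eye(3)))   # True
print(np.allclose(A_inv @ A, np.eye(3)))   # True
```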
Using this result, it is easy to prove that
(9.66a)
and
(9.66b)
while
(9.66c)
For a 2 × 2 matrix A, (9.65) reduces to
(9.67)
but the evaluation of the inverses of matrices with higher dimensionality can be somewhat tedious. However the computational work needed can be reduced by a process called row reduction, or Gaussian elimination.
The three elementary operations used in row reductions are: interchanging two rows; multiplying a row by a non-zero constant; and adding a multiple of one row to another row.
Since, by the law of matrix multiplication, the identity AA−1 = I involves only the rows of A and the columns of A−1, it follows that the equality is preserved if one applies the same row reductions to A and to the unit matrix; hence if a set of row reductions can be found that transforms A to I, the same set will transform I to A−1. For example, if
then the row reduction r1 → r1 − 2r3 transforms the first row of A to (1, 0, 0), and when followed by the reduction r2 → r2 − r1 yields a unit matrix, as follows:
Applying the same sequence of reductions to the unit matrix I gives
so that
The calculations involved in manipulating matrices of large dimensionality can be very tedious and in these cases useful computer programs exist, such as that referenced in footnote 1 in Section 9.1.1. Simpler, but effective, free programs may also be found on the internet.
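A bare-bones version of Gauss–Jordan elimination for the inverse can be sketched as follows (assuming NumPy). It applies the same row operations to A and to the unit matrix by working on the augmented array [A | I], and it omits the pivoting and singularity checks that a serious implementation would need.

```python
import numpy as np

def inverse_by_row_reduction(a):
    """Gauss-Jordan sketch: reduce [A | I] until the left block is I;
    the right block is then the inverse. No pivoting or singularity checks."""
    n = a.shape[0]
    aug = np.hstack([a.astype(float), np.eye(n)])    # the augmented array [A | I]
    for col in range(n):
        aug[col] /= aug[col, col]                    # scale the pivot row
        for row in range(n):
            if row != col:
                aug[row] -= aug[row, col] * aug[col] # clear the rest of the column
    return aug[:, n:]

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
print(inverse_by_row_reduction(A))   # [[ 1. -1.] [-1.  2.]]
print(np.linalg.inv(A))              # the library routine agrees
```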
The n simultaneous linear equations in n unknowns xi(i = 1, 2, …, n) given in (9.9) are conveniently written in matrix form Ax = b, (9.68a)
where
The solution of (9.68) for the homogeneous case b = 0 was discussed in Section 9.1.3. Here we consider the inhomogeneous case, when b ≠ 0. We will also start by assuming that A is non-singular so that A−1 exists. Then the solution of (9.68) is
x = A−1b, (9.69)
and the solution is unique. The latter statement follows from assuming there are two solutions, x(1) and x(2), so that Ax(i) = b (i = 1, 2). Then Ax(1) = Ax(2), and since A has an inverse, we may multiply by A−1 to obtain x(1) = x(2), as required for the solution to be unique.
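Numerically, inhomogeneous systems are usually solved without forming A−1 explicitly; library routines such as numpy.linalg.solve use elimination directly, as in this sketch (with a hypothetical A and b).

```python
import numpy as np

A = np.array([[2.0, 1.0, -1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 1.0]])
b = np.array([1.0, 13.0, 4.0])

x = np.linalg.solve(A, b)       # solves Ax = b by elimination, not by forming inv(A)
print(x)                        # ~[1. 2. 3.]
print(np.allclose(A @ x, b))    # True: the solution reproduces b
```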
The solution of linear simultaneous equations by finding the inverse matrix A−1 can be tedious and it is sometimes simpler to use an alternative method based on Cramer's rule, which we now discuss. We will again consider the set of equations (9.68a), which we will write in the form
∑j aij xj = bi (i = 1, 2, …, n). (9.70)
Multiplying the equation for bi by Aij and summing over i using (9.64) gives xj det A = ∑i Aij bi. (9.71)
Hence, provided det A ≠ 0, and setting Δ = det A, (9.71) becomes xj = (1/Δ) ∑i Aij bi, (9.72a)
or equivalently, xj = Δj/Δ, (9.72b)
where Δj is the determinant obtained by replacing the elements in the jth column of Δ by the elements of the column vector b. Equations (9.72a) and (9.72b) are the combined statement of Cramer's rule.
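Cramer's rule translates directly into code: each unknown is a ratio of two determinants. The sketch below (assuming NumPy) replaces one column of A at a time by b; this is convenient for small systems but far more costly than elimination for large ones.

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule: x_j = det(A_j) / det(A),
    where A_j is A with its column j replaced by b."""
    d = np.linalg.det(A)
    x = np.empty(len(b))
    for j in range(len(b)):
        Aj = A.copy()
        Aj[:, j] = b
        x[j] = np.linalg.det(Aj) / d
    return x

A = np.array([[2.0, 1.0, -1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 1.0]])
b = np.array([1.0, 13.0, 4.0])

print(cramer(A, b))              # ~[1. 2. 3.]
print(np.linalg.solve(A, b))     # agrees with the elimination-based routine
```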
We now briefly consider the cases where A−1 does not exist, that is, when det A = 0. There are two possibilities:
In the case of three simultaneous equations, these results have a simple geometrical interpretation. For n = 3, (9.68b) reduces to the three equations
and if we interpret x1, x2 and x3 as Cartesian co-ordinates x, y and z, on comparing to (1.51) we see that these are the equations of three planes. Assuming they are not identical, the first two planes will intersect in a straight line. There are then three possibilities. If the line lies in the plane described by the third equation, then any point on it is a solution to all three equations so that there is an infinite number of solutions. This corresponds to case (ii) above. Alternatively, if the line of intersection is parallel to, but not in, the third plane, there is no solution. This corresponds to case (i) above. Finally, if the line of intersection is not parallel to the third plane, it will pass through it at a single point, corresponding to a unique solution.
The vectors a, b, c, are given by
Use determinants to evaluate a × b and b · a × c.
Evaluate the determinant
by using the Laplace expansion about (i) the third column and (ii) the first row.
Use the general properties of a determinant, as stated in Section 9.1.2, to show that the determinant
may be written
and find its value.
Simplify and hence evaluate the determinant
Solve the equation
Write the determinant
as the product of factors that are linear in α, β, γ.
The n × n determinant Δn is given by
Establish a recurrence relation for Sn ≡ Δn + Δn−1 and hence find an explicit formula for Δn.
Consider the two sets of homogeneous equations
Determine whether these sets have non-trivial solutions for x, y, z and, if so, find them.
Find the values of α for which the equations
have a unique consistent solution and solve the equations for the larger of these values.
Given two vectors a and b in an arbitrary number of dimensions, use the properties of the inner product and the Cauchy–Schwarz inequality, (9.26), to prove:
the parallelogram equality
Consider the matrices
The three matrices
called the Pauli spin matrices, form a ‘vector’ σ. Show that (σ · a)2 = a2 I, where a is an arbitrary real vector a = (ax, ay, az) and I is the 2 × 2 unit matrix.
If the matrices M± are defined by M± ≡ Mx ± iMy, where
show that the commutator [M+, M−] ≡ M+M− − M−M+ = 2Mz.
Write down the matrix operator corresponding to a rotation R(θ) through an angle θ about the z-axis in three dimensions, where positive θ corresponds to the x-axis moving towards the original y-axis. Use the form of this matrix to verify explicitly that
and that
The matrix operators corresponding to rotations Rx(θ) and Ry(θ) through an angle θ about the x and y axes are given by
Show that the matrix corresponding to a rotation through θ1 about the x-axis, followed by a rotation through θ2 about the y-axis, is given by
Do Rx(θ1) and Ry(θ2) commute?
The powers of a matrix X are defined by X2 ≡ XX, X3 ≡ XXX etc., while its exponential is defined as
If A and B are square matrices: (a) find an expression for (A + B)3 in terms of the products of A and B and their powers; (b) derive a condition for the relation
to be valid.
Find the transpose, complex conjugate and Hermitian conjugate of the matrix
Verify that the matrix
is unitary.
Express the matrix
in the form AS + AAS, where AS is a symmetric matrix and AAS is an anti-symmetric matrix.
Which of the matrices below are: (i) symmetric, (ii) orthogonal, (iii) unitary or (iv) Hermitian? Use the matrix that has none of these properties to construct (v) an anti-symmetric matrix and (vi) an anti-Hermitian matrix.
Find the inverse of the matrix
and check the answer by direct multiplication.
Find the inverse of the matrix
and hence solve the matrix equation
Find by matrix inversion the solution of the equations
Find the solution of the equations
by Cramer's rule.
The half-life τ of a radioactive atom is defined as the time it takes for half of a given quantity of atoms to decay. A sample consists of just two radioactive components A and B, both of which decay to gaseous products that rapidly disperse. The sample is weighed after 8 and 12 hours and is found to weigh 90 and 30 grams, respectively. If the half-lives of A and B are τa = 2 h and τb = 4 h, respectively, use Cramer's rule to calculate the amounts of A and B initially in the sample.
For what values of the constants α and β do the simultaneous equations
have a unique solution?
Comment on both the existence and uniqueness of solutions in the cases: (i) α = 3, β = 6 ; (ii) α = 3, β = 2.