6.7 The general linear model

6.7.1 Formulation of the general linear model

The last few sections have all been concerned with particular cases of the so-called general linear model, and it is possible to treat them all at once using matrix theory. In most of this book, substantial use of matrix theory has been avoided, but a reader with some knowledge of matrices may find this section helpful, since the intention here is to put some of the models already considered into the form of the general linear model. An understanding of how these models fit into such a framework will put the reader in a good position to approach the theory in its full generality, as it is dealt with in such works as Box and Tiao (1992).

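It is important to distinguish row vectors from column vectors. We write $\mathbf{x}$ for a column vector and $\mathbf{x}^{\mathrm T}$ for its transpose; similarly, if $\mathbf{A}$ is an $n \times r$ matrix then $\mathbf{A}^{\mathrm T}$ is its $r \times n$ transpose. Consider a situation in which we have a column vector $\mathbf{x}$ of observations, so that $\mathbf{x} = (x_1, x_2, \dots, x_n)^{\mathrm T}$ (the equation is written thus to save the excessive space taken up by column vectors). We suppose that the $x_i$ are independently normally distributed with common variance $\phi$ and a vector $\mathrm{E}\mathbf{x}$ of means satisfying

$$\mathrm{E}\mathbf{x} = \mathbf{A}\boldsymbol{\theta}$$

where $\boldsymbol{\theta} = (\theta_1, \theta_2, \dots, \theta_r)^{\mathrm T}$ is a vector of unknown parameters and $\mathbf{A}$ is a known $n \times r$ matrix.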

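In the case of the original formulation of the bivariate linear regression model in which, conditional on $x_i$, we have $y_i \sim \mathrm{N}(\eta_0 + \eta_1 x_i, \phi)$, then $\mathbf{y}$ takes the part of $\mathbf{x}$, $r = 2$, $\boldsymbol{\eta} = (\eta_0, \eta_1)^{\mathrm T}$ takes the part of $\boldsymbol{\theta}$ and $\mathbf{A}_0$ takes the part of $\mathbf{A}$ where

$$\mathbf{A}_0 = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}.$$

This model is reformulated in terms of $\boldsymbol{\theta} = (\alpha, \beta)^{\mathrm T}$ and

$$\mathbf{A} = \begin{pmatrix} 1 & x_1 - \bar{x} \\ 1 & x_2 - \bar{x} \\ \vdots & \vdots \\ 1 & x_n - \bar{x} \end{pmatrix}.$$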

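In the case of the one way model (where, for simplicity, we shall restrict ourselves to the case where $K_i = K$ for all $i$), $n = N = IK$, $r = I$, $\boldsymbol{\theta} = (\theta_1, \theta_2, \dots, \theta_I)^{\mathrm T}$ and

$$\mathbf{x} = (x_{11}, \dots, x_{1K}, x_{21}, \dots, x_{2K}, \dots, x_{I1}, \dots, x_{IK})^{\mathrm T}, \qquad \mathbf{A} = \begin{pmatrix} \mathbf{1}_K & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{1}_K & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{1}_K \end{pmatrix}$$

where $\mathbf{1}_K$ denotes a column of $K$ ones and $\mathbf{0}$ a column of $K$ zeros, so that $\mathbf{A}$ is an $IK \times I$ matrix.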

The two way layout can be expressed similarly using a matrix of 0s and 1s. It is also possible to write the multiple regression model

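$$y_i \sim \mathrm{N}(\eta_0 + \eta_1 x_{i1} + \dots + \eta_{r-1} x_{i,r-1},\ \phi)$$

(the $x_{ij}$ being treated as known) as a case of the general linear model; here the first column of $\mathbf{A}$ consists of 1s and the remaining $r - 1$ columns hold the $x_{ij}$.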
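As an illustration of how these models fit the common form $\mathrm{E}\mathbf{x} = \mathbf{A}\boldsymbol{\theta}$, the following sketch builds the matrix $\mathbf{A}$ for simple regression (in both formulations) and for a balanced one way layout. It uses NumPy; the function names `regression_design` and `one_way_design` and the covariate values are invented for the example and are not part of the text.

```python
import numpy as np

def regression_design(x, centred=True):
    """n x 2 matrix for simple linear regression.

    centred=False gives columns (1, x_i)          [original formulation];
    centred=True  gives columns (1, x_i - x_bar)  [reformulated version].
    """
    x = np.asarray(x, dtype=float)
    second = x - x.mean() if centred else x
    return np.column_stack([np.ones_like(x), second])

def one_way_design(I, K):
    """(I*K) x I matrix of 0s and 1s for a balanced one way layout.

    The K rows belonging to group i have a single 1, in column i.
    """
    return np.kron(np.eye(I), np.ones((K, 1)))

# Illustrative covariate values (assumed, not taken from the text)
x = np.array([1.0, 2.0, 4.0, 7.0])
A0 = regression_design(x, centred=False)
A = regression_design(x, centred=True)
A_one_way = one_way_design(I=3, K=2)

print(A.T @ A)                   # diagonal for the centred formulation
print(A0.T @ A0)                 # not diagonal for the original formulation
print(A_one_way.T @ A_one_way)   # K times the I x I identity
```

The centred regression matrix and the one way matrix both have mutually orthogonal columns, which is what makes the corresponding $\mathbf{A}^{\mathrm T}\mathbf{A}$ diagonal.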

6.7.2 Derivation of the posterior

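Noting that for any vector $\mathbf{u}$ we have $\sum u_i^2 = \mathbf{u}^{\mathrm T}\mathbf{u}$, we can write the likelihood function for the general linear model in the form

$$(2\pi\phi)^{-n/2} \exp\left\{-\tfrac{1}{2}(\mathbf{x} - \mathbf{A}\boldsymbol{\theta})^{\mathrm T}(\mathbf{x} - \mathbf{A}\boldsymbol{\theta})/\phi\right\}.$$

Taking standard reference priors, that is, $p(\boldsymbol{\theta}, \phi) \propto 1/\phi$, the posterior is

$$p(\boldsymbol{\theta}, \phi \mid \mathbf{x}) \propto \phi^{-n/2-1} \exp\left[-\tfrac{1}{2}(\mathbf{x} - \mathbf{A}\boldsymbol{\theta})^{\mathrm T}(\mathbf{x} - \mathbf{A}\boldsymbol{\theta})/\phi\right].$$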

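Now as $(\mathbf{A}\mathbf{B})^{\mathrm T} = \mathbf{B}^{\mathrm T}\mathbf{A}^{\mathrm T}$ and scalars equal their own transposes

$$(\mathbf{x} - \mathbf{A}\boldsymbol{\theta})^{\mathrm T}(\mathbf{x} - \mathbf{A}\boldsymbol{\theta}) = \boldsymbol{\theta}^{\mathrm T}\mathbf{A}^{\mathrm T}\mathbf{A}\boldsymbol{\theta} - 2\mathbf{x}^{\mathrm T}\mathbf{A}\boldsymbol{\theta} + \mathbf{x}^{\mathrm T}\mathbf{x}$$

so that if $\hat{\boldsymbol{\theta}}$ is such that

$$\mathbf{A}^{\mathrm T}\mathbf{A}\hat{\boldsymbol{\theta}} = \mathbf{A}^{\mathrm T}\mathbf{x}$$

(so that $\hat{\boldsymbol{\theta}}^{\mathrm T}\mathbf{A}^{\mathrm T}\mathbf{A} = \mathbf{x}^{\mathrm T}\mathbf{A}$), that is, assuming $\mathbf{A}^{\mathrm T}\mathbf{A}$ is non-singular,

$$\hat{\boldsymbol{\theta}} = (\mathbf{A}^{\mathrm T}\mathbf{A})^{-1}\mathbf{A}^{\mathrm T}\mathbf{x}$$

we have

$$S = (\mathbf{x} - \mathbf{A}\boldsymbol{\theta})^{\mathrm T}(\mathbf{x} - \mathbf{A}\boldsymbol{\theta}) = (\boldsymbol{\theta} - \hat{\boldsymbol{\theta}})^{\mathrm T}\mathbf{A}^{\mathrm T}\mathbf{A}(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}) + \mathbf{x}^{\mathrm T}\mathbf{x} - \hat{\boldsymbol{\theta}}^{\mathrm T}\mathbf{A}^{\mathrm T}\mathbf{A}\hat{\boldsymbol{\theta}} = S_t(\boldsymbol{\theta}) + S_e$$

where

$$S_t(\boldsymbol{\theta}) = (\boldsymbol{\theta} - \hat{\boldsymbol{\theta}})^{\mathrm T}\mathbf{A}^{\mathrm T}\mathbf{A}(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}), \qquad S_e = \mathbf{x}^{\mathrm T}\mathbf{x} - \hat{\boldsymbol{\theta}}^{\mathrm T}\mathbf{A}^{\mathrm T}\mathbf{A}\hat{\boldsymbol{\theta}}.$$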

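It is also useful to define

$$\nu = n - r, \qquad s^2 = S_e/\nu.$$

Because $S_t(\boldsymbol{\theta})$ is of the form $\mathbf{u}^{\mathrm T}\mathbf{u}$ (with $\mathbf{u} = \mathbf{A}(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}})$), it is always non-negative, and it clearly vanishes if $\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}$. Further, $S_e$ is the minimum value of the sum of squares $S$ and so is non-negative (and positive unless the observations are fitted exactly). It is sometimes worth noting that

$$S_e = (\mathbf{x} - \mathbf{A}\hat{\boldsymbol{\theta}})^{\mathrm T}(\mathbf{x} - \mathbf{A}\hat{\boldsymbol{\theta}})$$

as is easily shown.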

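It follows that the posterior can be written as

$$p(\boldsymbol{\theta}, \phi \mid \mathbf{x}) \propto \phi^{-n/2-1} \exp\left[-\tfrac{1}{2}\{S_t(\boldsymbol{\theta}) + S_e\}/\phi\right].$$

In fact, this means that for given $\phi$ the vector $\boldsymbol{\theta}$ has a multivariate normal distribution of mean $\hat{\boldsymbol{\theta}}$ and variance–covariance matrix $\phi(\mathbf{A}^{\mathrm T}\mathbf{A})^{-1}$.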
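The quantities entering the posterior are straightforward to compute. The sketch below, which assumes NumPy and uses invented data and hypothetical helper names (`glm_posterior_summary`, `S_t`), solves the normal equations for the estimate of the parameter vector, forms the residual sum of squares, and checks numerically that the sum of squares about an arbitrary parameter value decomposes into the quadratic form about the estimate plus the residual sum of squares.

```python
import numpy as np

def glm_posterior_summary(A, x):
    """Return theta_hat, A^T A, S_e, nu = n - r and s^2 = S_e / nu."""
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    n, r = A.shape
    AtA = A.T @ A
    # Solve the normal equations A^T A theta_hat = A^T x
    # (assumes A^T A is non-singular)
    theta_hat = np.linalg.solve(AtA, A.T @ x)
    resid = x - A @ theta_hat
    S_e = float(resid @ resid)
    nu = n - r
    return {"theta_hat": theta_hat, "AtA": AtA,
            "S_e": S_e, "nu": nu, "s2": S_e / nu}

def S_t(theta, summary):
    """Quadratic form (theta - theta_hat)^T A^T A (theta - theta_hat)."""
    d = np.asarray(theta, dtype=float) - summary["theta_hat"]
    return float(d @ summary["AtA"] @ d)

# Illustrative data (assumed for the example)
rng = np.random.default_rng(0)
A = np.column_stack([np.ones(6), np.arange(6.0) - 2.5])
x = A @ np.array([1.0, 0.5]) + rng.normal(scale=0.3, size=6)

summary = glm_posterior_summary(A, x)
theta = np.array([0.8, 0.7])                 # an arbitrary parameter value
S_direct = float((x - A @ theta) @ (x - A @ theta))

# The two numbers printed should agree up to rounding error
print(S_direct, S_t(theta, summary) + summary["S_e"])
```

Given a value of $\phi$, the conditional posterior covariance matrix could then be obtained as `phi * np.linalg.inv(summary["AtA"])`.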

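If you are interested in $\boldsymbol{\theta}$ as a whole you can now integrate with respect to $\phi$ to get

$$p(\boldsymbol{\theta} \mid \mathbf{x}) \propto \{S_t(\boldsymbol{\theta}) + S_e\}^{-n/2} \propto \{1 + rF(\boldsymbol{\theta})/\nu\}^{-n/2}$$

where

$$F(\boldsymbol{\theta}) = \frac{S_t(\boldsymbol{\theta})/r}{S_e/\nu} = \frac{S_t(\boldsymbol{\theta})/r}{s^2}.$$

It may also be noted that the set

$$E(F) = \{\boldsymbol{\theta};\ F(\boldsymbol{\theta}) \leqslant F\}$$

is a hyperellipsoid in $r$-dimensional space in which the length of each of the axes is in a constant ratio to $\sqrt{F}$. The argument of Section 6.5 on the one way layout can now be adapted to show that $F(\boldsymbol{\theta}) \sim \mathrm{F}_{r,\nu}$, so that $E(F)$ is an HDR for $\boldsymbol{\theta}$ of probability $p$ if $F$ is the appropriate percentage point of $\mathrm{F}_{r,\nu}$.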
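The claim about the HDR can be checked by simulation. The following sketch (NumPy only, with invented data and $r = 2$) draws from the joint posterior by first sampling $\phi$, using the fact that $S_e/\phi$ has a chi-squared distribution on $\nu$ degrees of freedom a posteriori, and then sampling the parameter vector from its conditional normal distribution. The proportion of draws falling inside the hyperellipsoid is then compared with the nominal probability. For the critical value it uses the closed form $F = (\nu/2)(p_{\text{tail}}^{-2/\nu} - 1)$, which holds only for an F distribution with 2 numerator degrees of freedom; this shortcut is an assumption of the example, adopted to avoid a dependence on a statistics library.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data with r = 2 parameters (assumed for the example)
n, r = 20, 2
A = np.column_stack([np.ones(n), np.linspace(-1.0, 1.0, n)])
x = A @ np.array([2.0, -1.0]) + rng.normal(size=n)

AtA = A.T @ A
theta_hat = np.linalg.solve(AtA, A.T @ x)
S_e = float(np.sum((x - A @ theta_hat) ** 2))
nu = n - r

# Draw (theta, phi) from the joint posterior
m = 200_000
phi = S_e / rng.chisquare(nu, size=m)        # S_e / phi ~ chi-squared on nu d.f.
L = np.linalg.cholesky(np.linalg.inv(AtA))   # L L^T = (A^T A)^{-1}
z = rng.standard_normal((m, r))
theta = theta_hat + np.sqrt(phi)[:, None] * (z @ L.T)

# F(theta) = {S_t(theta) / r} / {S_e / nu} for each draw
d = theta - theta_hat
S_t = np.einsum("ij,jk,ik->i", d, AtA, d)
F = (S_t / r) / (S_e / nu)

# Upper 5% point of F on (2, nu) d.f.; closed form valid for r = 2 only
F_crit = (nu / 2.0) * (0.05 ** (-2.0 / nu) - 1.0)
coverage = float(np.mean(F <= F_crit))
print(F_crit, coverage)   # coverage should be close to 0.95
```

For general $r$ the percentage point would normally be taken from tables or from a statistics library rather than from a closed form.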

6.7.3 Inference for a subset of the parameters

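However, it is often the case that most of the interest centres on a subset of the parameters, say on $\theta_1, \theta_2, \dots, \theta_k$. If so, then it is convenient to write $\boldsymbol{\theta}' = (\theta_1, \theta_2, \dots, \theta_k)$ and $\boldsymbol{\theta}'' = (\theta_{k+1}, \dots, \theta_r)$. If it happens that $S_t(\boldsymbol{\theta})$ splits into a sum

$$S_t(\boldsymbol{\theta}) = S_t'(\boldsymbol{\theta}') + S_t''(\boldsymbol{\theta}'')$$

then it is easy to integrate

$$p(\boldsymbol{\theta}', \boldsymbol{\theta}'', \phi \mid \mathbf{x}) \propto \phi^{-n/2-1} \exp\left[-\tfrac{1}{2}\{S_t'(\boldsymbol{\theta}') + S_t''(\boldsymbol{\theta}'') + S_e\}/\phi\right]$$

to get

$$p(\boldsymbol{\theta}', \phi \mid \mathbf{x}) \propto \phi^{-(n-r+k)/2-1} \exp\left[-\tfrac{1}{2}\{S_t'(\boldsymbol{\theta}') + S_e\}/\phi\right]$$

and thus, as $n - r = \nu$,

$$p(\boldsymbol{\theta}' \mid \mathbf{x}) \propto \{S_t'(\boldsymbol{\theta}') + S_e\}^{-(\nu+k)/2} \propto \{1 + kF_k(\boldsymbol{\theta}')/\nu\}^{-(\nu+k)/2}$$

where

$$F_k(\boldsymbol{\theta}') = \frac{S_t'(\boldsymbol{\theta}')/k}{S_e/\nu} = \frac{S_t'(\boldsymbol{\theta}')/k}{s^2}.$$

It is now easy to show that $F_k(\boldsymbol{\theta}') \sim \mathrm{F}_{k,\nu}$ and hence to make inferences for $\boldsymbol{\theta}'$.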
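The two integrations involved can be spelled out as follows. This is a sketch of the intermediate steps rather than a quotation from another source; it assumes that the part of the quadratic form involving only the last $r - k$ parameters is positive definite, and it writes $C$ (a symbol introduced here) for the sum of the remaining quadratic form and the residual sum of squares.

```latex
% Step 1: integrate out \theta'' (a Gaussian integral over r - k coordinates)
\int \exp\left[-\tfrac12 S_t''(\boldsymbol\theta'')/\phi\right]
     \,\mathrm{d}\boldsymbol\theta'' \;\propto\; \phi^{(r-k)/2},
\qquad\text{so that}\qquad
\phi^{-n/2-1}\,\phi^{(r-k)/2} = \phi^{-(n-r+k)/2-1} = \phi^{-(\nu+k)/2-1}.

% Step 2: integrate out \phi (an inverse-gamma type integral)
\int_0^\infty \phi^{-(\nu+k)/2-1} \exp\left[-\tfrac12 C/\phi\right] \mathrm{d}\phi
  \;=\; \Gamma\!\left(\tfrac{\nu+k}{2}\right)\left(\tfrac12 C\right)^{-(\nu+k)/2}
  \;\propto\; C^{-(\nu+k)/2},
\qquad C = S_t'(\boldsymbol\theta') + S_e .
```

Dividing $C$ by $S_e$ and using $S_t'(\boldsymbol\theta')/S_e = kF_k(\boldsymbol\theta')/\nu$ then gives the form in terms of $F_k$.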

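Unfortunately, the quadratic form $S_t(\boldsymbol{\theta})$ does in general contain terms $\theta_i\theta_j$ where $i \leqslant k$ but $j > k$, and hence it does not in general split into $S_t'(\boldsymbol{\theta}') + S_t''(\boldsymbol{\theta}'')$. We will not discuss such cases further; useful references are Box and Tiao (1992), Lindley and Smith (1972) and Seber (2003).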

6.7.4 Application to bivariate linear regression

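The theory can be illustrated by considering the simple linear regression model. Consider first the reformulated version in terms of $\mathbf{A}$ and $\boldsymbol{\theta} = (\alpha, \beta)^{\mathrm T}$. In this case

$$\mathbf{A}^{\mathrm T}\mathbf{A} = \begin{pmatrix} n & 0 \\ 0 & S_{xx} \end{pmatrix}$$

and the fact that this matrix is easy to invert is one of the underlying reasons why this reformulation was sensible. Also

$$\hat{\boldsymbol{\theta}} = (\mathbf{A}^{\mathrm T}\mathbf{A})^{-1}\mathbf{A}^{\mathrm T}\mathbf{y} = (\bar{y},\ S_{xy}/S_{xx})^{\mathrm T} = (a, b)^{\mathrm T},$$

$$S_t(\boldsymbol{\theta}) = n(\alpha - a)^2 + S_{xx}(\beta - b)^2,$$

$$S_e = \mathbf{y}^{\mathrm T}\mathbf{y} - \hat{\boldsymbol{\theta}}^{\mathrm T}\mathbf{A}^{\mathrm T}\mathbf{A}\hat{\boldsymbol{\theta}} = \sum y_i^2 - na^2 - S_{xx}b^2 = S_{yy} - S_{xy}^2/S_{xx}$$

so that $S_e$ is what was denoted $S_{ee}$ in Section 6.3 on bivariate linear regression.

If you are particularly interested in $\alpha$, then in this case the thing to do is to note that the quadratic form splits with $S_t'(\boldsymbol{\theta}') = n(\alpha - a)^2$ and $S_t''(\boldsymbol{\theta}'') = S_{xx}(\beta - b)^2$. Consequently, the posterior distribution of $\alpha$ is given by

$$\frac{n(\alpha - a)^2}{S_e/(n-2)} = \frac{n(\alpha - a)^2}{s^2} \sim \mathrm{F}_{1,n-2}.$$

Since the square of a $t_{n-2}$ variable can easily be shown to have an $\mathrm{F}_{1,n-2}$ distribution, this conclusion is equivalent to that of Section 6.3.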
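A small numerical illustration may make the correspondence concrete. The sketch below uses NumPy and a made-up data set; the names `F_alpha` and `t_alpha` are introduced only for this example. It checks that the matrix formulae reproduce the familiar regression quantities, and that the F statistic for the intercept parameter of the reformulated model is the square of the corresponding t statistic.

```python
import numpy as np

# Made-up data for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1, 6.2])
n = len(x)

# Reformulated model: columns (1, x_i - x_bar)
A = np.column_stack([np.ones(n), x - x.mean()])
AtA = A.T @ A
a, b = np.linalg.solve(AtA, A.T @ y)   # theta_hat = (a, b)

S_xx = np.sum((x - x.mean()) ** 2)
S_xy = np.sum((x - x.mean()) * (y - y.mean()))
S_yy = np.sum((y - y.mean()) ** 2)

# Residual sum of squares from the matrix formula, and s^2 = S_e / nu
S_e = y @ y - n * a ** 2 - S_xx * b ** 2
s2 = S_e / (n - 2)

def F_alpha(alpha):
    """n (alpha - a)^2 / s^2, referred to F on (1, n - 2) d.f."""
    return n * (alpha - a) ** 2 / s2

def t_alpha(alpha):
    """(alpha - a) / (s / sqrt(n)), referred to t on n - 2 d.f."""
    return (alpha - a) / np.sqrt(s2 / n)

print(AtA)                                   # diag(n, S_xx)
print(a, y.mean(), b, S_xy / S_xx)           # a = y_bar, b = S_xy / S_xx
print(S_e, S_yy - S_xy ** 2 / S_xx)          # two forms of S_e
print(F_alpha(3.0), t_alpha(3.0) ** 2)       # F statistic equals t squared
```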

The greater difficulties that arise when $\mathbf{A}^{\mathrm T}\mathbf{A}$ is non-diagonal can be seen by following the same process through for the original formulation of the bivariate linear regression model in terms of $\mathbf{A}_0$ and $\boldsymbol{\eta} = (\eta_0, \eta_1)^{\mathrm T}$. In this case it is easy enough to find the posterior distribution of $\eta_1$, but it involves some rearrangement to get that of $\eta_0$.
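To see where the off-diagonal terms come from, the matrix for the original formulation can be written out explicitly. The following is a worked sketch derived from the definition of the matrix with rows $(1, x_i)$, writing $S_{xx} = \sum (x_i - \bar{x})^2$; it is offered as an illustration rather than as part of the original argument.

```latex
\mathbf{A}_0^{\mathrm T}\mathbf{A}_0
  = \begin{pmatrix} n & n\bar{x} \\ n\bar{x} & \sum x_i^2 \end{pmatrix},
\qquad
(\mathbf{A}_0^{\mathrm T}\mathbf{A}_0)^{-1}
  = \frac{1}{nS_{xx}}
    \begin{pmatrix} \sum x_i^2 & -n\bar{x} \\ -n\bar{x} & n \end{pmatrix}
  = \begin{pmatrix} 1/n + \bar{x}^2/S_{xx} & -\bar{x}/S_{xx} \\
                    -\bar{x}/S_{xx} & 1/S_{xx} \end{pmatrix}.
```

The second form uses $\sum x_i^2 = S_{xx} + n\bar{x}^2$. For given $\phi$, the slope therefore has variance $\phi/S_{xx}$, exactly as in the reformulated model, whereas the intercept has variance $\phi(1/n + \bar{x}^2/S_{xx})$ and is correlated with the slope unless $\bar{x} = 0$.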

