2.3 Several normal observations with a normal prior

2.3.1 Posterior distribution

We can generalize the situation in the previous section by supposing that a priori
\[ \theta \sim \mathrm{N}(\theta_0, \phi_0) \]
but that instead of having just one observation we have n independent observations $x = (x_1, x_2, \dots, x_n)$ such that
\[ x_i \sim \mathrm{N}(\theta, \phi) \qquad (i = 1, 2, \dots, n). \]
We sometimes refer to $X$ as an n-sample from $\mathrm{N}(\theta, \phi)$. Then
\[ p(x \mid \theta) = \prod_{i=1}^{n} p(x_i \mid \theta) \propto \exp\left\{ -\sum_{i=1}^{n} (x_i - \theta)^2 / 2\phi \right\}. \]

Proceeding just as we did in the previous section when we had only one observation, we see that the posterior distribution is
\[ \theta \mid x \sim \mathrm{N}(\theta_1, \phi_1), \]
where
\[ \phi_1 = \left( \frac{1}{\phi_0} + \frac{n}{\phi} \right)^{-1}, \qquad \theta_1 = \phi_1 \left( \frac{\theta_0}{\phi_0} + \frac{n\bar{x}}{\phi} \right), \]
in which $\bar{x} = \sum_i x_i / n$ is the sample mean. We could alternatively write these formulae as
\[ \phi_1 = \left( \frac{1}{\phi_0} + \frac{1}{\phi/n} \right)^{-1}, \qquad \theta_1 = \phi_1 \left( \frac{\theta_0}{\phi_0} + \frac{\bar{x}}{\phi/n} \right), \]
which shows that, assuming a normal prior and likelihood, the result is just the same as the posterior distribution obtained from a single observation of the mean $\bar{x}$, since we know that
\[ \bar{x} \sim \mathrm{N}(\theta, \phi/n), \]
and the above formulae are the ones we had before with $\phi$ replaced by $\phi/n$ and $x$ by $\bar{x}$. (Note that the use of a bar over the x here to denote a mean is unrelated to the use of a tilde over x to denote a random variable.)
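As a numerical check on the algebra, the update is easy to compute directly. The sketch below follows the section's notation (prior mean and variance $\theta_0$, $\phi_0$; known data variance $\phi$); the function names and the idea of comparing the two forms of the formulae are illustrative, not from the text.

```python
# Posterior for a normal mean with known variance and a normal prior.
# Prior: theta ~ N(theta0, phi0); data: x_i ~ N(theta, phi), independent.
# Variances (not precisions) are used throughout, matching the text.

def normal_posterior(theta0, phi0, xs, phi):
    """Return (theta1, phi1) such that theta | x ~ N(theta1, phi1)."""
    n = len(xs)
    xbar = sum(xs) / n
    phi1 = 1.0 / (1.0 / phi0 + n / phi)               # posterior variance
    theta1 = phi1 * (theta0 / phi0 + n * xbar / phi)  # posterior mean
    return theta1, phi1

def posterior_from_mean(theta0, phi0, xbar, n, phi):
    """Same posterior, treating xbar as one observation of variance phi/n."""
    phi1 = 1.0 / (1.0 / phi0 + 1.0 / (phi / n))
    theta1 = phi1 * (theta0 / phi0 + xbar / (phi / n))
    return theta1, phi1
```

The two functions give identical answers, which is exactly the point of the alternative form of the formulae: an n-sample carries the same information about $\theta$ as a single observation of $\bar{x}$ with variance $\phi/n$.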

We would of course obtain the same result by proceeding sequentially from $p(\theta)$ to $p(\theta \mid x_1)$ and then treating $p(\theta \mid x_1)$ as prior and $x_2$ as data to obtain $p(\theta \mid x_1, x_2)$, and so on. This is in accordance with the general result mentioned in Section 2.1 on ‘Nature of Bayesian Inference’.
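The sequential property can also be verified numerically: updating one observation at a time, with each posterior serving as the next prior, lands on the same answer as processing the whole sample at once. A minimal sketch with illustrative numbers (variances assumed known):

```python
def update(theta0, phi0, x, phi):
    """One-observation normal update: returns the new (mean, variance)."""
    phi1 = 1.0 / (1.0 / phi0 + 1.0 / phi)
    theta1 = phi1 * (theta0 / phi0 + x / phi)
    return theta1, phi1

xs = [4.1, 3.7, 5.2, 4.4]   # illustrative data with known variance phi = 2.0
theta, var = 2.0, 1.5       # illustrative prior N(2.0, 1.5)
for x in xs:                # posterior after x1 becomes the prior for x2, etc.
    theta, var = update(theta, var, x, 2.0)

# Batch answer for comparison: precisions add, and the posterior mean is a
# precision-weighted average of the prior mean and the sample mean.
n, xbar = len(xs), sum(xs) / len(xs)
var_batch = 1.0 / (1.0 / 1.5 + n / 2.0)
theta_batch = var_batch * (2.0 / 1.5 + n * xbar / 2.0)
```

Up to floating-point rounding, the sequential and batch results coincide, since each update simply adds $1/\phi$ to the accumulated precision.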

2.3.2 Example

We now consider a numerical example. The basic assumption in this section is that the variance is known, even though in most practical cases, it has to be estimated. There are a few circumstances in which the variance could be known, for example when we are using a measuring instrument which has been used so often that its measurement errors are well known, but there are not many. Later in this book, we will discover two things which mitigate this assumption – firstly, that the numerical results are not much different when we do take into account the uncertainty about the variance, and, secondly, that the larger the sample size is, the less difference it makes.

The data we will consider are quoted by Whittaker and Robinson (1940, Section 97). They consider chest measurements of 10 000 men. Now, based on memories of my experience as an assistant in a gentlemen’s outfitters in my university vacations, I would suggest a prior
\[ \theta \sim \mathrm{N}(38, 9). \]
Of course, it is open to question whether these men form a random sample from the whole population, but unless I am given information to the contrary I would stick to the prior I have just quoted, except that I might be inclined to increase the variance. Whittaker and Robinson’s data show that the mean turned out to be 39.8 with a standard deviation of 2.0 for their sample of 10 000. If we put the two together, we conclude that the posterior distribution of the mean chest measurement of men in this population is normal with variance
\[ \phi_1 = \left( \frac{1}{9} + \frac{10\,000}{2.0^2} \right)^{-1} = 0.0004 \]
and mean
\[ \theta_1 = 0.0004 \times \left( \frac{38}{9} + \frac{10\,000 \times 39.8}{2.0^2} \right) = 39.8. \]
Thus, for all practical purposes we have ended up with the distribution
\[ \mathrm{N}(39.8,\ 0.0004) \]

suggested by the data. You should note that this distribution is
\[ \mathrm{N}(\bar{x}, \phi/n), \]
the distribution we referred to in Section 2.1 on ‘Nature of Bayesian Inference’ as the standardized likelihood. Naturally, the closeness of the posterior to the standardized likelihood results from the large sample size, and whatever my prior had been, unless it were very, very extreme, I would have got very much the same result. More formally, the posterior will be close to the standardized likelihood insofar as the weight
\[ \frac{1/\phi_0}{1/\phi_0 + n/\phi} \]
associated with the prior mean is small, that is, insofar as $\phi_0$ is large compared with $\phi/n$. This is reassuring in cases where the prior is not very easy to specify, although of course there are cases where the amount of data available is not enough to get to this comforting position.
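The arithmetic of the chest-measurement example is worth checking directly. The sketch below uses the sample figures quoted from Whittaker and Robinson (n = 10 000, mean 39.8, standard deviation 2.0) together with a mildly informative prior of mean 38 and variance 9 of the kind discussed above; with a sample this large, almost any reasonable prior gives the same answer.

```python
# Chest-measurement example: posterior from a normal prior and a large
# n-sample with known variance. Prior values are illustrative.

n, xbar, sd = 10_000, 39.8, 2.0
theta0, phi0 = 38.0, 9.0      # prior mean and variance (illustrative)
phi = sd ** 2                 # known data variance, 4.0

phi1 = 1.0 / (1.0 / phi0 + n / phi)               # posterior variance ~ 0.0004
theta1 = phi1 * (theta0 / phi0 + n * xbar / phi)  # posterior mean ~ 39.8

# Weight attached to the prior mean -- tiny here, which is why the
# posterior is essentially the standardized likelihood N(xbar, phi/n).
weight_on_prior = (1.0 / phi0) / (1.0 / phi0 + n / phi)
```

The weight on the prior mean comes out at well under one part in ten thousand, making concrete the claim that the data overwhelm the prior at this sample size.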

2.3.3 Predictive distribution

If we consider taking one more observation $x_{n+1}$, then the predictive distribution can be found just as in the previous section by writing
\[ x_{n+1} = \theta + (x_{n+1} - \theta) \]
and noting that, independently of one another,
\[ \theta \sim \mathrm{N}(\theta_1, \phi_1), \qquad x_{n+1} - \theta \sim \mathrm{N}(0, \phi), \]
so that
\[ x_{n+1} \sim \mathrm{N}(\theta_1, \phi_1 + \phi). \]
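This two-stage decomposition translates directly into simulation: draw $\theta$ from its posterior, then draw the new observation around $\theta$. A sketch with illustrative parameter values, checking that the simulated spread matches $\phi_1 + \phi$:

```python
import random

random.seed(0)
theta1, phi1 = 39.8, 0.0004   # illustrative posterior mean and variance
phi = 4.0                     # illustrative known data variance

draws = []
for _ in range(100_000):
    # theta ~ N(theta1, phi1), then x_{n+1} | theta ~ N(theta, phi)
    theta = random.gauss(theta1, phi1 ** 0.5)
    draws.append(random.gauss(theta, phi ** 0.5))

m = sum(draws) / len(draws)
v = sum((d - m) ** 2 for d in draws) / (len(draws) - 1)
# m should be close to theta1, and v close to phi1 + phi = 4.0004,
# the predictive variance: posterior uncertainty plus sampling noise.
```

The predictive variance decomposes as uncertainty about $\theta$ (namely $\phi_1$) plus the irreducible sampling variance $\phi$ of a fresh observation.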

It is easy enough to adapt this argument to find the predictive distribution of an m-vector $\boldsymbol{x}$, where
\[ \boldsymbol{x} = (x_{n+1}, x_{n+2}, \dots, x_{n+m}), \]
by writing
\[ \boldsymbol{x} = \theta\boldsymbol{1} + (\boldsymbol{x} - \theta\boldsymbol{1}), \]
where $\boldsymbol{1}$ is the constant vector
\[ \boldsymbol{1} = (1, 1, \dots, 1)^{\mathsf{T}}. \]
Then $\theta$ has its posterior distribution $\mathrm{N}(\theta_1, \phi_1)$ and the components of the vector $\boldsymbol{x} - \theta\boldsymbol{1}$ are $\mathrm{N}(0, \phi)$ variates independent of $\theta$ and of one another, so that $\boldsymbol{x}$ has a multivariate normal distribution, although its components are not independent of one another.
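The shared $\theta$ is what couples the components: every pair of future observations picks up a common covariance $\phi_1$ from the uncertainty about $\theta$, while each adds its own sampling variance $\phi$, giving variance–covariance matrix $\phi_1 \boldsymbol{1}\boldsymbol{1}^{\mathsf{T}} + \phi I$. A sketch building that matrix (function name illustrative):

```python
def predictive_cov(m, phi1, phi):
    """m x m predictive covariance of (x_{n+1}, ..., x_{n+m}):
    phi1 in every entry, plus phi on the diagonal."""
    return [[phi1 + (phi if i == j else 0.0) for j in range(m)]
            for i in range(m)]

# Each component has variance phi1 + phi; distinct components share
# covariance phi1 > 0, so they are correlated, not independent.
cov = predictive_cov(3, 0.5, 2.0)
```

Note that as the sample size grows, $\phi_1 \to 0$ and the off-diagonal covariances vanish, so the future observations become approximately independent, as one would expect once $\theta$ is essentially known.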

2.3.4 Robustness

It should be noted that any statement of a posterior distribution and any inference is conditional not merely on the data, but also on the assumptions made about the likelihood. So, in this section, the posterior distribution ends up being normal as a consequence partly of the prior but also of the assumption that the data were normally distributed, albeit with an unknown mean. We say that an inference is robust if it is not seriously affected by changes in the assumptions on which it is based. The notion of robustness is not one which can be pinned down into a more precise definition, and its meaning depends on the context, but nevertheless the concept is of great importance, and increasing attention is paid in statistics to investigations of the robustness of various techniques. We can immediately say that the conclusion about the nature of the posterior is robust against changes in the prior, provided that the sample size is large and the prior is a not-too-extreme normal distribution or nearly so. Some detailed exploration of the notion of robustness (or sensitivity analysis) can be found in Kadane (1984).
