8.2 The hierarchical normal model
8.2.1 The model
Suppose that
\[ \theta = (\theta_1, \theta_2, \dots, \theta_r) \]
is a vector of fixed, unknown parameters and that
\[ X = (X_1, X_2, \dots, X_r) \]
is a vector of independent observations such that
\[ X_i \sim N(\theta_i, \phi). \]
Of course, the $X_i$ could each be means of a number of observations. For the moment, we shall suppose that $\phi$ is known, so, after a suitable normalization, we can suppose that $\phi = 1$.
It is useful to establish some notation for use later on. We shall consider a fixed origin
\[ \mu = (\mu_1, \mu_2, \dots, \mu_r) \]
and we will write
\[ S_1 = \sum_i (X_i - \mu_i)^2 \]
and
\[ \mathbf 1 = (1, 1, \dots, 1) \]
for a vector of $r$ elements all equal to unity.
We suppose that on the basis of our knowledge of the $X_i$ we form estimates $\widehat\theta_i = \widehat\theta_i(X)$ of the $\theta_i$ and write
\[ \widehat\theta = (\widehat\theta_1, \widehat\theta_2, \dots, \widehat\theta_r). \]
In general, our estimates will not be exactly right, and we will adopt a decision theoretic approach as described in Section 7.5 on ‘Bayesian decision theory’. In particular, we shall suppose that by estimating the parameters as $\widehat\theta$ when their true values are $\theta$ we suffer a loss
\[ L(\theta, \widehat\theta) = \frac1r \sum_i (\widehat\theta_i - \theta_i)^2. \]
We recall that the risk function is defined as the expected loss over repeated sampling for fixed $\theta$,
\[ R(\theta, \widehat\theta) = \mathrm E\, L(\theta, \widehat\theta) = \frac1r \sum_i \mathrm E(\widehat\theta_i - \theta_i)^2. \]
For our problem, the ‘obvious’ estimator (ignoring the hierarchical structure which will be introduced later) is
\[ \widehat\theta = X, \]
and indeed since the log-likelihood is
\[ L(\theta \mid X) = \text{constant} - \tfrac12 \sum_i (X_i - \theta_i)^2, \]
it is the maximum likelihood estimator. It is clearly unbiased.
It is easy to find the risk of this obvious estimator – since each $X_i \sim N(\theta_i, 1)$, it is
\[ R(\theta, X) = \frac1r \sum_i \mathrm E(X_i - \theta_i)^2 = 1. \]
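As a quick numerical check (not part of the original argument), the following Python sketch simulates repeated sampling for an arbitrary fixed $\theta$ and confirms that the normalized quadratic loss of the maximum likelihood estimator averages out to 1; the values of $r$ and $\theta$ below are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)
r = 10
theta = rng.normal(size=r)          # an arbitrary "true" parameter vector
n_sims = 100_000

# Each row is one sample X with X_i ~ N(theta_i, 1)
X = rng.normal(loc=theta, size=(n_sims, r))
loss = ((X - theta) ** 2).mean(axis=1)   # L = (1/r) sum_i (X_i - theta_i)^2
print(loss.mean())                       # approximately 1 = R(theta, X)
```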
8.2.2 The Bayesian analysis for known overall mean
To express this situation in terms of a hierarchical model, we need to suppose that the parameters $\theta_i$ come from some population, and the simplest possibility is to suppose that
\[ \theta_i \sim N(\mu, \psi), \]
in which case it is convenient to take $\mu_i = \mu$ for all $i$, that is, to take the origin as $\mu\mathbf 1$. With the additional structure assumed for the means, the problem has the structure of a situation variously described as a random effects model, Model II or a components of variance model (cf. Eisenhart, 1947, or Scheffé, 1959, Section 7.2). We are, however, primarily interested in the means $\theta_i$ and not in the variance components $\phi$ and $\psi$, at least for the moment.
It follows that the posterior distribution of $\theta_i$ given $X_i$ is
\[ \theta_i \mid X_i \sim N(\theta_i^*, \phi_1), \]
where (writing $\lambda = 1/(1+\psi)$)
\[ \theta_i^* = \lambda\mu + (1-\lambda)X_i = X_i - \lambda(X_i - \mu) \]
and
\[ \phi_1 = \bigl(1 + 1/\psi\bigr)^{-1} = \frac{\psi}{1+\psi} = 1 - \lambda \]
(cf. Section 2.2 on ‘Normal Prior and Likelihood’).
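Computationally the posterior is immediate. The short sketch below (the function name posterior_params is ours, purely for illustration) returns the posterior means $\theta_i^*$ and the common posterior variance $1-\lambda$ for given $\mu$, $\psi$ and data.

```python
import numpy as np

def posterior_params(x, mu, psi):
    """Posterior of each theta_i given X_i = x_i, for X_i ~ N(theta_i, 1)
    and theta_i ~ N(mu, psi)."""
    lam = 1.0 / (1.0 + psi)            # lambda = 1/(1 + psi)
    theta_star = x - lam * (x - mu)    # posterior means: shrink X_i towards mu
    return theta_star, 1.0 - lam       # common posterior variance 1 - lambda

x = np.array([2.3, -0.7, 1.1])
print(posterior_params(x, mu=0.0, psi=1.0))   # lambda = 1/2: halfway shrinkage
```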
To minimize the expectation of the loss over the posterior distribution of $\theta$, it is clearly necessary to use the Bayes estimator
\[ \widehat\theta = \theta^*, \]
where
\[ \theta_i^* = \mathrm E(\theta_i \mid X_i) = X_i - \lambda(X_i - \mu), \]
the posterior mean of $\theta_i$ given $X_i$ (see the subsection of Section 7.5 on ‘Point estimators resulting from quadratic loss’). Further, if we do this, then the value of the posterior expected loss is
\[ \rho(\theta^*, X) = \frac1r \sum_i \operatorname{Var}(\theta_i \mid X_i) = 1 - \lambda. \]
It follows that the Bayes risk
\[ r(\theta^*) = \mathrm E\,\rho(\theta^*, X) \]
(the expectation being taken over values of $X$) is
\[ r(\theta^*) = 1 - \lambda. \]
We note that if instead we use the maximum likelihood estimator $\widehat\theta = X$, then the posterior expected loss is increased by an amount
\[ \frac1r \sum_i (X_i - \theta_i^*)^2 = \frac{\lambda^2}{r} \sum_i (X_i - \mu)^2 = \frac{\lambda^2 S_1}{r}, \]
which is always positive, so that
\[ \rho(X, X) = 1 - \lambda + \frac{\lambda^2 S_1}{r}. \]
Further, since the unconditional distribution of $X_i$ is evidently $N(\mu, 1+\psi) = N(\mu, 1/\lambda)$, so that $\mathrm E\,S_1 = r/\lambda$, its expectation over repeated sampling (the Bayes risk) is
\[ r(X) = 1 - \lambda + \frac{\lambda^2}{r}\cdot\frac{r}{\lambda} = 1. \]
This is, in fact, obvious since we can also write
\[ r(\widehat\theta) = \frac1r \sum_i \mathrm E(\widehat\theta_i - \theta_i)^2, \]
where the expectation is over both $\theta$ and $X$, and since for the maximum likelihood estimator $\mathrm E(X_i - \theta_i)^2 = 1$ for all $i$, we have $r(X) = 1$.
We can thus see that use of the Bayes estimator always diminishes the posterior expected loss, and that the amount ‘saved’ by its use averages out at $\lambda$ over repeated sampling.
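A small simulation (again with illustrative values of $r$, $\mu$ and $\psi$, not taken from the text) bears this out: sampling from the hierarchical model, the Bayes estimator's average loss comes out near $1-\lambda$ while the maximum likelihood estimator's comes out near 1.

```python
import numpy as np

rng = np.random.default_rng(1)
r, mu, psi, n_sims = 10, 0.0, 1.0, 100_000
lam = 1.0 / (1.0 + psi)

# Hierarchical sampling: theta_i ~ N(mu, psi), then X_i ~ N(theta_i, 1)
theta = rng.normal(mu, np.sqrt(psi), size=(n_sims, r))
X = rng.normal(theta, 1.0)

theta_star = X - lam * (X - mu)            # Bayes estimator
print(((theta_star - theta) ** 2).mean())  # ~ 1 - lambda = 0.5
print(((X - theta) ** 2).mean())           # ~ 1 (maximum likelihood)
```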
8.2.3 The empirical Bayes approach
Typically, however, you will not know $\psi$ (or equivalently $\lambda$). In such a situation, you can attempt to estimate it from the data. Since the $X_i$ have an unconditional distribution which is $N(\mu, 1/\lambda)$, it is clear that $S_1$ is a sufficient statistic for $\psi$, or equivalently for $\lambda$, which is such that
\[ \lambda S_1 \sim \chi^2_r \quad\text{or}\quad S_1 \sim \lambda^{-1}\chi^2_r, \]
so that if we define
\[ \widehat\lambda = \frac{r-2}{S_1}, \]
then, using the probability density of a chi-squared distribution (as given in Appendix A),
\[ \mathrm E\,\widehat\lambda = (r-2)\,\mathrm E\,\frac{1}{S_1} = (r-2)\,\lambda\,\mathrm E\,\frac{1}{\chi^2_r} = \frac{(r-2)\lambda}{r-2} = \lambda, \]
so that $\widehat\lambda$ is an unbiased estimator of $\lambda$.
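This unbiasedness rests on the standard fact that $\mathrm E(1/\chi^2_r) = 1/(r-2)$ for $r > 2$, and is easy to check by simulation using the distributional result just established, $S_1 \sim \lambda^{-1}\chi^2_r$; the values of $r$ and $\lambda$ below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
r, lam, n_sims = 10, 0.5, 100_000

# Unconditionally, lambda * S_1 has a chi-squared distribution on r d.f.
S1 = rng.chisquare(r, size=n_sims) / lam
lam_hat = (r - 2) / S1
print(lam_hat.mean())    # approximately lambda = 0.5
```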
Now consider the effect of using the empirical Bayes estimator
\[ \widehat\theta_i^{\,\text{EB}} = X_i - \widehat\lambda\,(X_i - \mu) = X_i - \frac{r-2}{S_1}(X_i - \mu), \]
which results from replacing $\lambda$ by $\widehat\lambda$ in the expression for $\theta_i^*$. If we use this, then the value of the posterior expected loss exceeds that incurred by the Bayes rule by an amount
\[ \frac1r \sum_i \bigl(\widehat\theta_i^{\,\text{EB}} - \theta_i^*\bigr)^2 = \frac{(\widehat\lambda - \lambda)^2}{r} \sum_i (X_i - \mu)^2 = \frac{(\widehat\lambda - \lambda)^2 S_1}{r}, \]
which is always positive, so that
\[ \rho\bigl(\widehat\theta^{\,\text{EB}}, X\bigr) = 1 - \lambda + \frac{(\widehat\lambda - \lambda)^2 S_1}{r}. \]
Further, if we write $S_1 = t/\lambda$ (so that $t \sim \chi^2_r$ and $\widehat\lambda = (r-2)\lambda/t$), then we see that the expectation of $(\widehat\lambda - \lambda)^2 S_1$ over repeated sampling is
\[ \mathrm E(\widehat\lambda - \lambda)^2 S_1 = \lambda\,\mathrm E\,\frac{(r-2-t)^2}{t} = \lambda\left\{\frac{(r-2)^2}{r-2} - 2(r-2) + \mathrm E\,t\right\} = \lambda\{r - (r-2)\} = 2\lambda. \]
It follows that the Bayes risk resulting from the use of the empirical Bayes estimator is
\[ r\bigl(\widehat\theta^{\,\text{EB}}\bigr) = 1 - \lambda + \frac{2\lambda}{r} = 1 - \frac{(r-2)\lambda}{r}, \]
as opposed to $1 - \lambda$ for the Bayes estimator or 1 for the maximum likelihood estimator.
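To tie the three results together, a final sketch (illustrative parameter values once more) estimates all three Bayes risks from a single hierarchical simulation; with $r = 10$ and $\lambda = 1/2$, the empirical Bayes figure should come out near $1 - (r-2)\lambda/r = 0.6$, between the other two.

```python
import numpy as np

rng = np.random.default_rng(3)
r, mu, psi, n_sims = 10, 0.0, 1.0, 100_000
lam = 1.0 / (1.0 + psi)                     # true lambda = 0.5

theta = rng.normal(mu, np.sqrt(psi), size=(n_sims, r))
X = rng.normal(theta, 1.0)

S1 = ((X - mu) ** 2).sum(axis=1, keepdims=True)
lam_hat = (r - 2) / S1                      # one estimate per simulated sample

estimators = {
    "maximum likelihood": X,                    # Bayes risk 1
    "Bayes": X - lam * (X - mu),                # Bayes risk 1 - lambda
    "empirical Bayes": X - lam_hat * (X - mu),  # ~ 1 - (r-2)*lambda/r
}
for name, est in estimators.items():
    print(name, ((est - theta) ** 2).mean())
print("theory for EB:", 1 - (r - 2) * lam / r)  # 0.6 here
```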