7.8 Empirical Bayes methods
7.8.1 Von Mises’ example
Only a very brief idea about empirical Bayes methods will be given in this chapter; more will be said about this topic in Chapter 8 and a full account can be found in Maritz and Lwin (1989). One of the reasons for this brief treatment is that, despite their name, very few empirical Bayes procedures are, in fact, Bayesian; for a discussion of this point see, for example, Deely and Lindley (1981).
The problems we will consider in this section are concerned with a sequence $x_i$ of observations such that the distribution of the $i$th observation $x_i$ depends on a parameter $\theta_i$, typically in such a way that $p(x_i\mid\theta_i)$ has the same functional form for all $i$. The parameters $\theta_i$ are themselves supposed to be a random sample from some (unknown) distribution, and it is this unknown distribution that plays the role of a prior distribution and so accounts for the use of the name of Bayes. There is a clear contrast with the situation in the rest of the book, where the prior distribution represents our prior beliefs, and so by definition it cannot be unknown. Further, the prior distribution in empirical Bayes methods is usually given a frequency interpretation, by contrast with the situation arising in true Bayesian methods.
One of the earliest examples of an empirical Bayes procedure was due to von Mises (1942). He supposed that in examining the quality of a batch of water for possible contamination by certain bacteria, $m = 5$ samples of a given volume were taken, and he was interested in determining the probability $\theta$ that a sample contains at least one bacterium. Evidently, the probability of $x$ positive results in the $m = 5$ samples is
$$p(x\mid\theta)=\binom{m}{x}\theta^x(1-\theta)^{m-x}$$
for a given value of $\theta$.
for a given value of θ. If the same procedure is to be used with a number of batches of different quality, then the predictive distribution (denoted to avoid ambiguity) is
where the density represents the variation of the quality θ of batches. [If comes from the beta family, and there is no particular reason why it should, then is a beta-binomial distribution, as mentioned at the end of Section 3.1 on ‘The binomial distribution’]. In his example, von Mises wished to estimate the density function on the basis of n = 3420 observations.
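If, purely for illustration, $p(\theta)$ is taken to be a beta density (which, as noted above, there is no particular reason to assume), the integral above has the closed beta-binomial form $\tilde p(x)=\binom{m}{x}B(x+\alpha,\,m-x+\beta)/B(\alpha,\beta)$. A minimal sketch, with hypothetical function names and an arbitrary $\mathrm{Be}(2,3)$ prior chosen only for the example:

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    # log B(a, b) = log Gamma(a) + log Gamma(b) - log Gamma(a + b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(x, m, alpha, beta):
    # Predictive probability p~(x) when theta ~ Be(alpha, beta):
    # C(m, x) * B(x + alpha, m - x + beta) / B(alpha, beta)
    return comb(m, x) * exp(log_beta(x + alpha, m - x + beta) - log_beta(alpha, beta))

# m = 5 samples per batch, as in von Mises' example; the Be(2, 3)
# prior for the batch quality theta is an illustrative assumption.
probs = [beta_binomial_pmf(x, 5, 2.0, 3.0) for x in range(6)]
assert abs(sum(probs) - 1.0) < 1e-12  # a pmf must sum to one
```

With a uniform $\mathrm{Be}(1,1)$ prior this reduces, as it should, to the discrete uniform distribution on $\{0,\dots,m\}$.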
7.8.2 The Poisson case
Instead of considering the binomial distribution further, we shall consider a problem to do with the Poisson distribution which, of course, provides an approximation to the binomial distribution when the number $m$ of samples is large and the probability $\theta$ is small. Suppose that we have observations $x_i \sim \mathrm{P}(\lambda_i)$, where the $\lambda_i$ have a distribution with a density $p(\lambda)$, and that we have available $n$ past observations, among which $f_n(x)$ were equal to $x$ for $x = 0, 1, 2, \dots$. Thus, $f_n(x)$ is an empirical frequency and $f_n(x)/n$ is an estimate of the predictive density $\tilde p(x)$. As $x$ has a Poisson distribution for given $\lambda$,
$$\tilde p(x)=\int p(x\mid\lambda)\,p(\lambda)\,\mathrm{d}\lambda=\int\frac{\lambda^x\mathrm{e}^{-\lambda}}{x!}\,p(\lambda)\,\mathrm{d}\lambda.$$
Now suppose that, with this past data available, a new observation $x$ is made, and we want to say something about the corresponding value of $\lambda$. In Section 7.5 on ‘Bayesian decision theory’, we saw that the posterior mean of $\lambda$ is
$$\mathrm{E}(\lambda\mid x)=\frac{\int\lambda\,p(x\mid\lambda)\,p(\lambda)\,\mathrm{d}\lambda}{\int p(x\mid\lambda)\,p(\lambda)\,\mathrm{d}\lambda}=\frac{(x+1)\,\tilde p(x+1)}{\tilde p(x)}.$$
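The identity $\mathrm{E}(\lambda\mid x)=(x+1)\tilde p(x+1)/\tilde p(x)$ holds for any prior, which can be checked numerically. The sketch below (function names and the discretized prior are illustrative assumptions, not from the text) computes the posterior mean directly from Bayes' theorem and compares it with the ratio formula:

```python
from math import exp, log, lgamma

def poisson_pmf(x, lam):
    # p(x | lambda) = lambda^x e^{-lambda} / x!
    return exp(x * log(lam) - lam - lgamma(x + 1))

# An arbitrary discretized prior p(lambda) on a grid, for illustration only
grid = [0.05 * k for k in range(1, 401)]        # lambda in (0, 20]
weights = [exp(-lam / 3.0) for lam in grid]     # unnormalized prior density
total = sum(weights)
prior = [w / total for w in weights]

def predictive(x):
    # p~(x) = integral of p(x | lambda) p(lambda) d(lambda), here a finite sum
    return sum(poisson_pmf(x, lam) * p for lam, p in zip(grid, prior))

def posterior_mean(x):
    # E(lambda | x) computed directly from Bayes' theorem
    num = sum(lam * poisson_pmf(x, lam) * p for lam, p in zip(grid, prior))
    return num / predictive(x)

# Check: E(lambda | x) = (x + 1) p~(x + 1) / p~(x) for any prior
for x in range(5):
    assert abs(posterior_mean(x) - (x + 1) * predictive(x + 1) / predictive(x)) < 1e-9
```

The point of the identity is that the right-hand side involves only the predictive density $\tilde p$, which (unlike the prior) can be estimated from the past frequencies.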
To use this formula, we need to know the prior $p(\lambda)$, or at least to know $\tilde p(x)$ and $\tilde p(x+1)$, which we do not know. However, it is clear that a reasonable estimate of $\tilde p(x)$ is $\{f_n(x)+1\}/(n+1)$, after allowing for the latest observation. Similarly, a reasonable estimate for $\tilde p(x+1)$ is $f_n(x+1)/(n+1)$. It follows that a possible point estimate for the current value of $\lambda$, corresponding to the value resulting from a quadratic loss function, is
$$\delta_n(x)=\frac{(x+1)\,f_n(x+1)}{f_n(x)+1}.$$
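This estimator needs nothing beyond the table of past frequencies. A minimal sketch (the function name and the past counts are hypothetical, chosen only to illustrate):

```python
from collections import Counter

def empirical_bayes_poisson(past, x):
    """Empirical Bayes point estimate of lambda for a new Poisson count x,
    given a list of past counts. Computes (x + 1) f_n(x + 1) / (f_n(x) + 1),
    where f_n(x) is the number of past observations equal to x; the '+1'
    in the denominator allows for the latest observation."""
    f = Counter(past)  # missing keys count as zero
    return (x + 1) * f[x + 1] / (f[x] + 1)

# Illustrative past data (hypothetical counts, not von Mises' actual data)
past = [0] * 120 + [1] * 90 + [2] * 50 + [3] * 25 + [4] * 10 + [5] * 5
estimate = empirical_bayes_poisson(past, 1)  # (1 + 1) * 50 / (90 + 1) = 100/91
```

Note that the estimate can behave erratically for values of $x$ rarely seen in the past data (and is zero whenever $f_n(x+1)=0$), which motivates the smoother parametric alternative discussed below.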
This formula could be used in a case like that investigated by von Mises if the number m of samples taken from each batch were fairly large and the probability θ that a sample contained at least one bacterium were fairly small, so that the Poisson approximation to the binomial could be used.
This method can easily be adapted to any case where the posterior mean of the parameter of interest takes the form
$$\mathrm{E}(\lambda\mid x)=\frac{\psi(x)\,\tilde p(x+1)}{\tilde p(x)}$$
for some known function $\psi(x)$ (in the Poisson case $\psi(x)=x+1$), and there are quite a number of such cases (Maritz and Lwin, 1989, Section 1.3).
Going back to the Poisson case, if it were known that the underlying distribution were of the form $\lambda\sim S_0^{-1}\chi^2_\nu$ for some $S_0$ and $\nu$, then it is known (cf. Section 7.5) that
$$\mathrm{E}(\lambda\mid x)=\frac{\nu+2x}{S_0+2}.$$
In this case, we could use the past observations to estimate $S_0$ and $\nu$ in some way, by, say, $\hat S_0$ and $\hat\nu$, giving an alternative point estimate
$$\delta_n'(x)=\frac{\hat\nu+2x}{\hat S_0+2}$$
for the current value of $\lambda$.
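One simple way of producing $\hat S_0$ and $\hat\nu$ (an assumption for illustration; the text does not prescribe a method) is by matching moments. Since $\lambda\sim S_0^{-1}\chi^2_\nu$ gives $\mathrm{E}(x)=\nu/S_0$ and $\mathrm{var}(x)=\nu/S_0+2\nu/S_0^2$ for the predictive distribution, equating these to the sample mean and variance yields $\hat S_0=2\bar x/(s^2-\bar x)$ and $\hat\nu=\hat S_0\bar x$, valid when $s^2>\bar x$ (overdispersion). A sketch with hypothetical names:

```python
def fit_chi2_prior(past):
    """Method-of-moments estimates of S0 and nu for a prior
    lambda ~ S0^{-1} chi^2_nu (one simple fitting method among many).
    Matching E(x) = nu/S0 and var(x) = nu/S0 + 2*nu/S0**2 to the sample
    mean and variance gives S0 = 2*mean/(var - mean), nu = S0*mean,
    which requires the sample variance to exceed the mean."""
    n = len(past)
    mean = sum(past) / n
    var = sum((x - mean) ** 2 for x in past) / n
    if var <= mean:
        raise ValueError("sample variance must exceed the sample mean")
    s0 = 2 * mean / (var - mean)
    return s0, s0 * mean

def smoothed_estimate(past, x):
    # Alternative point estimate (nu_hat + 2x)/(S0_hat + 2); unlike the
    # frequency-based estimate, this is a smooth (linear) function of x.
    s0, nu = fit_chi2_prior(past)
    return (nu + 2 * x) / (s0 + 2)
```

Because $\delta_n'(x)$ is linear in $x$, successive values differ by the constant $2/(\hat S_0+2)$, in contrast to the possibly erratic frequency-based estimate.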
The advantage of an estimate like this is that, considered as a function of $x$, it is smoother than $\delta_n(x)$, and so it could be expected to do better. This is analogous to the situation in regression analysis, where a fitted regression line can be expected to give a better estimate of the mean of the dependent variable $y$ at a particular value of the independent variable $x$ than would be obtained by concentrating on the values of $y$ observed at that single value of $x$. On the other hand, the method just described does depend on assuming a particular form for the prior, which is probably not justifiable. There are, however, other methods of producing a ‘smoother’ estimate.
Empirical Bayes methods can also be used for testing whether a parameter θ lies in one or another of a number of sets, that is, for hypothesis testing and its generalizations.