4.4 Point (or sharp) null hypotheses with prior information

4.4.1 When are point null hypotheses reasonable?

As was mentioned in Section 4.3, it is very common in classical statistics to conduct a test of a point (or sharp) null hypothesis

$$H_0: \theta = \theta_0 \quad\text{versus}\quad H_1: \theta \neq \theta_0.$$

In such a case, the full-scale Bayesian approach (as opposed to the compromise described in the previous section) gives rise to conclusions which differ radically from the classical answers.

Before getting on to the answers, a few basic comments about the whole problem are in order. First, tests of point null hypotheses are often performed in inappropriate circumstances. It will virtually never be the case that one seriously entertains the hypothesis that $\theta = \theta_0$ exactly, a point which classical statisticians fully admit (cf. Lehmann, 1986, Sections 4.5, 5.2). More reasonable would be the null hypothesis

$$H_0: \theta \in \Theta_0 = (\theta_0 - \varepsilon,\ \theta_0 + \varepsilon),$$

where $\varepsilon > 0$ is so chosen that all $\theta \in \Theta_0$ can be considered ‘indistinguishable’ from $\theta_0$. An example in which this might arise would be an attempt to analyze a chemical by observing some aspect, described by a parameter θ, of its reaction with a known chemical. If it were desired to test whether or not the unknown chemical was a specific compound, with a reaction strength $\theta_0$ known to an accuracy of ε, it would be reasonable to test

$$H_0: \theta \in (\theta_0 - \varepsilon,\ \theta_0 + \varepsilon) \quad\text{versus}\quad H_1: \theta \notin (\theta_0 - \varepsilon,\ \theta_0 + \varepsilon).$$

An example where ε might be extremely close to zero is a test for extra-sensory perception (ESP) with $\theta_0$ representing the hypothesis of no ESP. (The only reason that ε would probably not be zero here is that an experiment designed to test for ESP probably would not lead to a perfectly well-defined $\theta_0$.) Of course, there are also many decision problems that would lead to a null hypothesis of the aforementioned form with a large ε, but such problems will rarely be well approximated by testing a point null hypothesis.

The question arises, if a realistic null hypothesis is $H_0: \theta \in (\theta_0 - \varepsilon,\ \theta_0 + \varepsilon)$, when is it reasonable to approximate it by $H_0: \theta = \theta_0$? From a Bayesian viewpoint, it will be reasonable if and only if, when we spread the quantity $\pi_0$ of prior probability over $\Theta_0$, the posterior probability of $\Theta_0$ is close to that of $\theta_0$ when a lump of prior probability $\pi_0$ is concentrated on the single value $\theta_0$. This will certainly happen if the likelihood function is approximately constant on $\Theta_0$, but this is a very strong condition, and one can often get away with less.
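The comparison just described can be explored numerically. The following is a minimal sketch, not a definitive implementation: it assumes that the prior probability is spread uniformly over the interval (one possible choice, not one made in the text), so that the null hypothesis contributes through the likelihood averaged over the interval rather than the likelihood at the single point. The normal-shaped likelihood, the values `xbar = 0.4` and `var = 0.04`, the choice of zero for the null value, and the function names are all illustrative assumptions.

```python
import math

def average_likelihood(lik, theta0, eps, steps=1000):
    """Mean of lik over (theta0 - eps, theta0 + eps), by the midpoint rule."""
    h = 2 * eps / steps
    return sum(lik(theta0 - eps + (i + 0.5) * h) for i in range(steps)) / steps

def normal_lik(theta, xbar=0.4, var=0.04):
    """Illustrative (assumed) likelihood: normal shape centred on xbar."""
    return math.exp(-0.5 * (xbar - theta) ** 2 / var)

theta0 = 0.0  # assumed null value
ratios = {}
for eps in (0.0025, 0.05, 0.2):
    # Ratio near 1 means the interval null and the point null give
    # nearly the same weight to the data.
    ratios[eps] = average_likelihood(normal_lik, theta0, eps) / normal_lik(theta0)
    print(f"eps = {eps}: averaged / point likelihood = {ratios[eps]:.5f}")
```

For a narrow interval the ratio stays very close to one, so replacing the interval by the point changes little; for a wide interval the two can differ substantially.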

4.4.2 A case of nearly constant likelihood

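Suppose that $x_1, x_2, \dots, x_n$ are independently $\mathrm{N}(\theta, \phi)$ where $\phi$ is known. Then we know from Section 2.3 on ‘Several normal observations with a normal prior’ that the likelihood is proportional to an $\mathrm{N}(\bar{x}, \phi/n)$ density for θ. Now over the interval $(\theta_0 - \varepsilon,\ \theta_0 + \varepsilon)$ this likelihood varies by a factor

$$\frac{\exp\{-\tfrac{1}{2}(\bar{x} - (\theta_0 + \varepsilon))^2/(\phi/n)\}}{\exp\{-\tfrac{1}{2}(\bar{x} - (\theta_0 - \varepsilon))^2/(\phi/n)\}} = \exp\left\{\frac{2\varepsilon(\bar{x} - \theta_0)}{\phi/n}\right\}.$$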

It follows that if we define z to be the statistic

$$z = \frac{|\bar{x} - \theta_0|}{\sqrt{\phi/n}}$$

used in classical tests of significance, and

$$k \geqslant \frac{\varepsilon z}{\sqrt{\phi/n}},$$

then the likelihood varies over $\Theta_0$ by a factor which is at most exp(2k). Hence, provided that ε is reasonably small, there is a useful bound on the variation of the likelihood.

For example, if ε can be taken to be 0.0025 and

$$k = \frac{0.0025\,z}{\sqrt{\phi/n}}$$

then the likelihood varies by at most exp(2k) over $\Theta_0$. More specifically, if z = 2, $\phi = 1$ and n = 25, then k becomes

$$k = \frac{0.0025 \times 2}{\sqrt{1/25}} = 0.025$$

and exp(2k) = exp(0.05) ≈ 1.05 ≈ 1/0.95. In summary, if all values within $\pm 0.0025$ of $\theta_0$ are regarded as indistinguishable from $\theta_0$, then we can feel reassured that the likelihood function does not vary by more than 5% over this range of indistinguishable values, and if the interval can be made even smaller then the likelihood is even nearer to being constant.

Note that the bound depends on $|\bar{x} - \theta_0|$ as well as on $\phi/n$.
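The arithmetic of this example can be checked with a short script. This is only a sketch under stated assumptions: the null value is taken to be zero and the sample mean 0.4 (chosen so that z = 2 with unit variance and n = 25), and the function name `endpoint_variation` is hypothetical. It compares the ratio of the normal likelihood at the two ends of the interval with exp(2k), and it presumes the sample mean lies outside the interval so that the extremes occur at the endpoints.

```python
import math

def endpoint_variation(xbar, theta0, eps, phi, n):
    """Return (ratio, k): ratio of the N(xbar, phi/n) likelihood at the two
    endpoints of (theta0 - eps, theta0 + eps), and k = eps * z / sqrt(phi/n)."""
    var = phi / n

    def lik(theta):
        return math.exp(-0.5 * (xbar - theta) ** 2 / var)

    a, b = lik(theta0 - eps), lik(theta0 + eps)
    ratio = max(a, b) / min(a, b)
    z = abs(xbar - theta0) / math.sqrt(var)
    k = eps * z / math.sqrt(var)
    return ratio, k

# Assumed illustrative values: theta0 = 0 and xbar = 0.4 give z = 2.
ratio, k = endpoint_variation(xbar=0.4, theta0=0.0, eps=0.0025, phi=1.0, n=25)
print(f"k = {k:.4f}, exp(2k) = {math.exp(2 * k):.4f}, endpoint ratio = {ratio:.4f}")
```

With these values the endpoint ratio coincides with exp(2k), which is the bound attained with equality.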

4.4.3 The Bayesian method for point null hypotheses

We shall now develop a theory for testing point null hypotheses, which can then be compared with the classical theory. If there is doubt as to the adequacy of the point null hypothesis as a representation of the real null hypothesis, it is always possible to test an interval null hypothesis directly by Bayesian methods and compare the results (and this will generally be easier than checking the constancy of the likelihood function).

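You cannot use a continuous prior density to conduct a test of $H_0: \theta = \theta_0$ because that would of necessity give $\theta = \theta_0$ a prior probability of zero and hence a posterior probability of zero. A reasonable way of proceeding is to give $\theta = \theta_0$ a prior probability of $\pi_0 > 0$ and to assign a probability density $\pi_1 \rho_1(\theta)$ to values $\theta \neq \theta_0$ where $\pi_1 = 1 - \pi_0$ and $\rho_1(\theta)$ integrates to unity. If you are thinking of the hypothesis $\theta = \theta_0$ as an approximation to a hypothesis $\theta \in (\theta_0 - \varepsilon,\ \theta_0 + \varepsilon)$ then $\pi_0$ is really your prior probability for the whole interval $(\theta_0 - \varepsilon,\ \theta_0 + \varepsilon)$.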

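You can then derive the predictive density $p(\mathbf{x})$ of a vector $\mathbf{x} = (x_1, x_2, \dots, x_n)$ of observations in the form

$$p(\mathbf{x}) = \pi_0\, p(\mathbf{x} \mid \theta_0) + \pi_1 \int \rho_1(\theta)\, p(\mathbf{x} \mid \theta)\, d\theta.$$

Writing

$$p_1(\mathbf{x}) = \int \rho_1(\theta)\, p(\mathbf{x} \mid \theta)\, d\theta$$

for what might be called the predictive distribution under the alternative hypothesis we see that

$$p(\mathbf{x}) = \pi_0\, p(\mathbf{x} \mid \theta_0) + \pi_1\, p_1(\mathbf{x}).$$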

It follows that the posterior probabilities are

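$$p_0 = \frac{\pi_0\, p(\mathbf{x} \mid \theta_0)}{\pi_0\, p(\mathbf{x} \mid \theta_0) + \pi_1\, p_1(\mathbf{x})} = \frac{\pi_0\, p(\mathbf{x} \mid \theta_0)}{p(\mathbf{x})}, \qquad p_1 = \frac{\pi_1\, p_1(\mathbf{x})}{\pi_0\, p(\mathbf{x} \mid \theta_0) + \pi_1\, p_1(\mathbf{x})} = \frac{\pi_1\, p_1(\mathbf{x})}{p(\mathbf{x})}$$

and so, the Bayes factor is

$$B = \frac{p_0/p_1}{\pi_0/\pi_1} = \frac{p(\mathbf{x} \mid \theta_0)}{p_1(\mathbf{x})}.$$

Of course, it is possible to find the posterior probabilities $p_0$ and $p_1$ in terms of the Bayes factor B and the prior probability $\pi_0$ as noted in Section 4.1 when hypothesis testing in general was discussed.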
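The relationship between the posterior probability of the null hypothesis, the Bayes factor and the prior probability can be illustrated with a small sketch. The function names and the numerical inputs below are hypothetical, chosen only for illustration; the second function uses the algebraically equivalent form in which the posterior probability of the null is the reciprocal of one plus the prior odds against the null divided by B.

```python
def posterior_null_from_parts(pi0, lik_null, pred_alt):
    """Posterior probability of the null from the prior mass pi0, the
    likelihood at the null value, and the predictive density under the
    alternative."""
    pi1 = 1.0 - pi0
    return pi0 * lik_null / (pi0 * lik_null + pi1 * pred_alt)

def posterior_null_from_bayes_factor(pi0, bayes_factor):
    """The same posterior probability, written in terms of the Bayes factor."""
    pi1 = 1.0 - pi0
    return 1.0 / (1.0 + (pi1 / pi0) / bayes_factor)

# Made-up illustrative inputs, not taken from the text.
pi0 = 0.5
lik_null = 0.02   # likelihood at the null value
pred_alt = 0.05   # predictive density under the alternative
B = lik_null / pred_alt

p0_direct = posterior_null_from_parts(pi0, lik_null, pred_alt)
p0_via_B = posterior_null_from_bayes_factor(pi0, B)
print(f"B = {B:.3f}, p0 (direct) = {p0_direct:.4f}, p0 (via B) = {p0_via_B:.4f}")
```

Both routes give the same posterior probability, which is why the Bayes factor together with the prior probability is enough to summarize the test.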

4.4.4 Sufficient statistics

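Sometimes, we have a sufficient statistic $t = t(\mathbf{x})$ for x given θ, so that

$$p(\mathbf{x} \mid \theta) = p(t \mid \theta)\, p(\mathbf{x} \mid t)$$

where $p(\mathbf{x} \mid t)$ is not a function of θ. Clearly in such a case,

$$p_1(\mathbf{x}) = \int \rho_1(\theta)\, p(t \mid \theta)\, p(\mathbf{x} \mid t)\, d\theta = \left\{\int \rho_1(\theta)\, p(t \mid \theta)\, d\theta\right\} p(\mathbf{x} \mid t) = p_1(t)\, p(\mathbf{x} \mid t)$$

so that we can cancel a common factor $p(\mathbf{x} \mid t)$ to get

$$p_0 = \frac{\pi_0\, p(t \mid \theta_0)}{\pi_0\, p(t \mid \theta_0) + \pi_1\, p_1(t)}, \qquad p_1 = \frac{\pi_1\, p_1(t)}{\pi_0\, p(t \mid \theta_0) + \pi_1\, p_1(t)}$$

and the Bayes factor is

$$B = \frac{p(t \mid \theta_0)}{p_1(t)}.$$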

In short, x can be replaced by t in the formulas for p0, p1 and the Bayes factor B.
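A numerical check can make this replacement concrete. The sketch below assumes normal observations with known variance, for which the sample mean is sufficient, and it assumes a normal density centred on the null value for the prior under the alternative; that prior, the data, and all names are illustrative choices rather than anything fixed by the text. It computes the Bayes factor once from the full data and once from the sample mean alone, using simple trapezoidal integration, and compares both with the closed form available for this normal case.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

def bayes_factor(lik, rho1, theta0, lo, hi):
    """Likelihood at theta0 divided by the likelihood averaged over rho1."""
    pred_alt = integrate(lambda th: rho1(th) * lik(th), lo, hi)
    return lik(theta0) / pred_alt

# Assumed illustrative data and settings.
x = [0.9, -0.2, 0.7, 0.4, 0.2]
phi, theta0, psi = 1.0, 0.0, 1.0   # psi: assumed prior variance under H1
n = len(x)
xbar = sum(x) / n

def rho1(th):
    return normal_pdf(th, theta0, psi)

def lik_full(th):   # joint density of all observations
    return math.prod(normal_pdf(xi, th, phi) for xi in x)

def lik_suff(th):   # density of the sufficient statistic xbar
    return normal_pdf(xbar, th, phi / n)

lo, hi = theta0 - 10 * math.sqrt(psi), theta0 + 10 * math.sqrt(psi)
B_full = bayes_factor(lik_full, rho1, theta0, lo, hi)
B_suff = bayes_factor(lik_suff, rho1, theta0, lo, hi)
# Closed form in the normal case: xbar is N(theta0, psi + phi/n) under H1.
B_closed = normal_pdf(xbar, theta0, phi / n) / normal_pdf(xbar, theta0, psi + phi / n)
print(B_full, B_suff, B_closed)
```

The three values agree up to integration error, because the factor that does not depend on θ cancels between numerator and denominator.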

Many of the ideas in this section should become clearer when you come to look at Section 4.5, in which the particular case of the normal mean is explored in detail.

