7.7 Decision theory and hypothesis testing
7.7.1 Relationship between decision theory and classical hypothesis testing
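It is possible to reformulate hypothesis testing in the language of decision theory. If we want to test $H_0: \theta \in \Theta_0$ versus $H_1: \theta \in \Theta_1$, we have two actions open to us, namely,
\[ a_0: \text{accept } H_0 \qquad \text{and} \qquad a_1: \text{accept } H_1. \]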
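As before, we shall write $\pi_0$ and $\pi_1$ for the prior probabilities of $H_0$ and $H_1$, $p_0$ and $p_1$ for their posterior probabilities, and
\[ B = \frac{p_0/p_1}{\pi_0/\pi_1} \]
for the Bayes factor. We also need the notation
\[ \rho_0(\theta) = p(\theta)/\pi_0 \quad (\theta \in \Theta_0), \qquad \rho_1(\theta) = p(\theta)/\pi_1 \quad (\theta \in \Theta_1), \]
where $p(\theta)$ is the prior density function.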
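Now let us suppose that there is a loss function $L(\theta, a)$ defined by
\[
\begin{array}{c|cc}
a \backslash \theta & \theta \in \Theta_0 & \theta \in \Theta_1 \\ \hline
a_0 & 0 & 1 \\
a_1 & 1 & 0
\end{array}
\]
so that the use of a decision rule $d(x)$ results in a posterior expected loss function
\[ \rho(a_0, x) = p_1, \qquad \rho(a_1, x) = p_0. \]
A decision $d(x)$ which minimizes the posterior expected loss is therefore just a decision to accept the hypothesis with the greater posterior probability, which is the way of choosing between hypotheses suggested when hypothesis testing was first introduced.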
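More generally, if there is a ‘0–$K_i$’ loss function, that is,
\[
\begin{array}{c|cc}
a \backslash \theta & \theta \in \Theta_0 & \theta \in \Theta_1 \\ \hline
a_0 & 0 & K_0 \\
a_1 & K_1 & 0
\end{array}
\]
then the posterior expected losses of the two actions are
\[ \rho(a_0, x) = K_0 p_1, \qquad \rho(a_1, x) = K_1 p_0, \]
so that a Bayes decision rule results in rejecting the null hypothesis, that is, in taking action $a_1$, if and only if $K_1 p_0 < K_0 p_1$, that is,
\[ B = \frac{p_0/p_1}{\pi_0/\pi_1} < \frac{K_0/K_1}{\pi_0/\pi_1}. \]
In the terminology of classical statistics, this corresponds to the use of a rejection region
\[ R = \{ x;\ B < K_0 \pi_1 / (K_1 \pi_0) \}. \]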
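The rule for a 0–K_i loss can be sketched in a few lines of Python. This is only an illustration: the function names and the numerical values are assumptions, not part of the text. Here `k0` is the loss incurred by accepting H0 when H1 is true, `k1` is the loss incurred by accepting H1 when H0 is true, and the two functions are meant to show that comparing posterior expected losses and comparing the Bayes factor with a threshold give the same decision.

```python
def bayes_decision(p0, k0, k1):
    """Pick the action with the smaller posterior expected loss.

    p0 is the posterior probability of H0, so p1 = 1 - p0.
    Returns (action, loss of a0, loss of a1).
    """
    p1 = 1.0 - p0
    loss_a0 = k0 * p1  # accepting H0 costs k0 when H1 is true
    loss_a1 = k1 * p0  # accepting H1 costs k1 when H0 is true
    action = "a1" if loss_a1 < loss_a0 else "a0"
    return action, loss_a0, loss_a1


def rejects_via_bayes_factor(p0, pi0, k0, k1):
    """Equivalent test: reject H0 when B < k0*pi1 / (k1*pi0)."""
    p1 = 1.0 - p0
    pi1 = 1.0 - pi0
    bayes_factor = (p0 / p1) / (pi0 / pi1)
    threshold = (k0 * pi1) / (k1 * pi0)
    return bayes_factor < threshold
```

With equal losses (`k0 = k1 = 1`) the rule reduces to accepting the hypothesis with the greater posterior probability; raising `k0` makes wrongly accepting H0 more costly and so enlarges the rejection region.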
When hypothesis testing was first introduced in Section 4.1, we noted that in the case where $\Theta_0 = \{\theta_0\}$ and $\Theta_1 = \{\theta_1\}$ are simple hypotheses, Bayes' theorem implies that
\[ B = \frac{p(x|\theta_0)}{p(x|\theta_1)}, \]
so that the rejection region takes the form
\[ R = \{ x;\ p(x|\theta_0)/p(x|\theta_1) < K_0 \pi_1 / (K_1 \pi_0) \}, \]
which is the likelihood ratio test prescribed by Neyman–Pearson theory. The difference is that in the Neyman–Pearson theory, the ‘critical value’ of the rejection region is determined by fixing the size α, that is, the probability that x lies in the rejection region R if the null hypothesis is true, whereas in a decision-theoretic approach, it is fixed in terms of the loss function and the prior probabilities of the hypotheses.
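To make the contrast concrete, the following sketch compares the two ways of fixing the critical value for a pair of simple hypotheses. Everything specific in it is an assumption chosen for illustration: a single observation x from a normal distribution with unit variance, θ0 = 0 and θ1 = 1, and the function names. Both rules reject when the likelihood ratio p(x|θ0)/p(x|θ1) falls below a cutoff; they differ only in where the cutoff comes from.

```python
from statistics import NormalDist

# Illustrative simple hypotheses (assumed values, not from the text).
THETA0, THETA1, SIGMA = 0.0, 1.0, 1.0


def likelihood_ratio(x):
    """p(x | theta0) / p(x | theta1) for one normal observation."""
    return NormalDist(THETA0, SIGMA).pdf(x) / NormalDist(THETA1, SIGMA).pdf(x)


def decision_theoretic_cutoff(pi0, k0, k1):
    """Cutoff k0*pi1 / (k1*pi0), fixed by the losses and the prior."""
    return (k0 * (1.0 - pi0)) / (k1 * pi0)


def neyman_pearson_cutoff(alpha):
    """Cutoff giving a test of size alpha.

    With theta1 > theta0 the ratio is decreasing in x, so the region
    {ratio < c} equals {x > x_crit}, where x_crit is the upper-alpha
    point of the null distribution.
    """
    x_crit = NormalDist(THETA0, SIGMA).inv_cdf(1.0 - alpha)
    return likelihood_ratio(x_crit)
```

For example, at x = 1.0 the likelihood ratio is exp(-0.5), about 0.61. With equal prior probabilities and equal losses the decision-theoretic cutoff is 1, so H0 is rejected, whereas the size-0.05 cutoff is about 0.32, so the Neyman–Pearson test does not reject.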
7.7.2 Composite hypotheses
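If the hypotheses are composite (i.e. not simple), then, again as in Section 4.1 on ‘Hypothesis testing’,
\[ B = \frac{p_0/p_1}{\pi_0/\pi_1} = \frac{\int_{\Theta_0} p(x|\theta)\,\rho_0(\theta)\,d\theta}{\int_{\Theta_1} p(x|\theta)\,\rho_1(\theta)\,d\theta}, \]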
so that there is still a rejection region that can be interpreted in a similar manner. However, it should be noted that classical statisticians faced with similar problems are more inclined to work in terms of a likelihood ratio
\[ \frac{\sup_{\theta \in \Theta_0} p(x|\theta)}{\sup_{\theta \in \Theta_1} p(x|\theta)} \]
(cf. Lehmann, 1986, Section 1.7). In fact, it is possible to express quite a lot of the ideas of classical statistics in a language involving loss functions.
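The two quantities contrasted above can be compared numerically. The sketch below is a rough illustration under stated assumptions: one observation x from a normal distribution with mean θ and unit variance, a standard normal prior density for θ, the one-sided hypotheses Θ0 = (−∞, 0] and Θ1 = (0, ∞), a midpoint rule on a truncated grid, and a ratio of maximized likelihoods written with H0 in the numerator. None of these choices comes from the text. The Bayes factor averages the likelihood over each hypothesis with respect to the restricted prior (the prior density divided by the prior probability of that hypothesis), while the classical ratio maximizes it.

```python
from statistics import NormalDist

PRIOR = NormalDist(0.0, 1.0)   # assumed prior density p(theta)
LO, HI, N = -10.0, 10.0, 4000  # truncated range and grid size (assumed)


def likelihood(x, theta):
    """p(x | theta) for one observation with unit variance."""
    return NormalDist(theta, 1.0).pdf(x)


def midpoints(a, b, n):
    """Midpoints of n equal cells on [a, b], and the cell width."""
    h = (b - a) / n
    return [a + (i + 0.5) * h for i in range(n)], h


def bayes_factor(x):
    """Ratio of prior-weighted average likelihoods under H0 and H1."""
    pi0 = PRIOR.cdf(0.0)
    pi1 = 1.0 - pi0
    t0, h0 = midpoints(LO, 0.0, N)
    t1, h1 = midpoints(0.0, HI, N)
    num = sum(likelihood(x, t) * PRIOR.pdf(t) / pi0 for t in t0) * h0
    den = sum(likelihood(x, t) * PRIOR.pdf(t) / pi1 for t in t1) * h1
    return num / den


def sup_likelihood_ratio(x):
    """Grid approximation to sup over Theta0 divided by sup over Theta1."""
    t0, _ = midpoints(LO, 0.0, N)
    t1, _ = midpoints(0.0, HI, N)
    return max(likelihood(x, t) for t in t0) / max(likelihood(x, t) for t in t1)
```

Calling `bayes_factor(1.0)` and `sup_likelihood_ratio(1.0)` gives different values for the same observation, which reflects the fact that the first depends on the prior density within each hypothesis and the second does not.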
It may be noted that it is easy to extend the above discussion about dichotomies (i.e. situations where a choice has to be made between two hypotheses) to deal with trichotomies or polytomies, although some theories of statistical inference find choices between more than two hypotheses difficult to deal with.
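The extension to more than two hypotheses amounts to minimizing the posterior expected loss over a larger set of actions. A minimal sketch follows, assuming a finite list of hypotheses with known posterior probabilities and a loss table indexed by action and hypothesis; the function name and data layout are hypothetical.

```python
def bayes_action(posterior, loss):
    """Return (index of best action, list of posterior expected losses).

    posterior[j] is the posterior probability of hypothesis H_j.
    loss[i][j] is the loss from taking action a_i when H_j is true.
    """
    expected = [
        sum(l_ij * p_j for l_ij, p_j in zip(row, posterior))
        for row in loss
    ]
    # Ties are broken in favour of the lowest-numbered action.
    best = min(range(len(expected)), key=expected.__getitem__)
    return best, expected
```

With a 0–1 loss table (zeros on the diagonal and ones elsewhere) this picks the hypothesis with the greatest posterior probability, just as in the two-hypothesis case.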