8.3 The baseball example

Efron and Morris’s example on baseball statistics was outlined in Section 8.1. As their primary data, they take the number of hits Si or, equivalently, the batting averages Yi=Si/n of r=18 major league players, as recorded after n=45 times at bat in the 1970 season. These were, in fact, all the players who happened to have batted exactly 45 times on the day the data were tabulated. If Xi and θi are as in Section 8.1, so that approximately

Unnumbered Display Equation

then we have a case of the hierarchical normal model. With the actual data, we have

Unnumbered Display Equation

and so with

Unnumbered Display Equation

the empirical Bayes estimator for the θi takes the form

Unnumbered Display Equation

so giving estimates

Unnumbered Display Equation
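As a rough illustration of the kind of calculation involved, the following sketch shrinks each observation towards the overall mean. It assumes unit sampling variance and the usual Efron and Morris form, in which the estimate for θi is the mean of the Xi plus (1 - (r-3)/S) times the deviation of Xi from that mean, where S is the sum of squared deviations. The function name `efron_morris_estimate` and the input values are hypothetical; they are not the baseball data, and the formula is an assumption rather than a transcription of the display above.

```python
import numpy as np


def efron_morris_estimate(x, variance=1.0):
    """Shrink each observation towards the overall mean.

    Assumes X_i ~ N(theta_i, variance) with a known, common variance.
    This is an illustrative form of the estimator, not a transcription
    of the book's display equation.
    """
    x = np.asarray(x, dtype=float)
    r = x.size
    if r < 4:
        raise ValueError("shrinkage towards the mean needs at least 4 observations")
    xbar = x.mean()
    s = np.sum((x - xbar) ** 2)  # sum of squared deviations about the mean
    if s == 0.0:
        return x.copy()  # all observations equal: nothing to shrink
    shrink = 1.0 - (r - 3) * variance / s
    return xbar + shrink * (x - xbar)


# Made-up transformed values, for illustration only (not the 1970 data).
x_demo = np.array([-4.0, -3.5, -3.0, -2.5, -2.0, -1.0])
print(efron_morris_estimate(x_demo))
```

The overall mean is left unchanged, and when the shrinkage factor lies between 0 and 1 every estimate moves towards it. A positive-part variant, which truncates the factor at zero, is a common refinement that this sketch omits.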

We can test how well an estimator performs by comparing it with the batting averages actually observed over the remainder of the season. We suppose that, during the remainder of the season, the ith player had Ti hits and was at bat mi times, so that his batting average for that period was pi=Ti/mi. If we write

Unnumbered Display Equation

we could consider a mean square error

Unnumbered Display Equation

or more directly

Unnumbered Display Equation

In either case, it turns out that the empirical Bayes estimator appears to be about three and a half times better than the ‘obvious’ (maximum likelihood) estimator, which ignores the hierarchical model and just estimates each θi by the corresponding Xi. The original data and the resulting estimates are tabulated in Table 8.1.
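A comparison of this kind can be imitated on simulated data. The sketch below is only an illustration under stated assumptions: the true values θi are drawn from a normal prior (the `prior_sd` parameter is hypothetical), each Xi is θi plus unit-variance normal noise, and the loss is the sum of squared errors against the true θi rather than against remainder-of-season averages. It does not use the baseball data and is not expected to reproduce the figure quoted above.

```python
import numpy as np


def squared_error(estimates, targets):
    """Sum of squared differences between estimates and targets."""
    diff = np.asarray(estimates, dtype=float) - np.asarray(targets, dtype=float)
    return float(np.sum(diff ** 2))


def simulate_loss_ratio(r=18, prior_sd=0.5, n_reps=2000, seed=0):
    """Ratio of total squared error: maximum likelihood over empirical Bayes.

    All settings are illustrative assumptions, not values from the text.
    """
    rng = np.random.default_rng(seed)
    ml_total = 0.0
    eb_total = 0.0
    for _ in range(n_reps):
        theta = rng.normal(0.0, prior_sd, size=r)   # simulated true values
        x = theta + rng.normal(0.0, 1.0, size=r)    # unit-variance observations
        xbar = x.mean()
        s = np.sum((x - xbar) ** 2)
        eb = xbar + (1.0 - (r - 3) / s) * (x - xbar)  # shrink towards the mean
        ml_total += squared_error(x, theta)           # ML estimate is x itself
        eb_total += squared_error(eb, theta)
    return ml_total / eb_total


ratio = simulate_loss_ratio()
print(f"ML loss / empirical Bayes loss: {ratio:.2f}")
```

A ratio above 1 means that the shrinkage estimator incurred the smaller total loss in the simulation; how far above 1 depends on the spread of the simulated θi relative to the sampling variance.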

Table 8.1 Data for the baseball example.

So there is evidence that, in at least one practical case, use of the hierarchical model and a corresponding empirical Bayes estimator is genuinely worthwhile.
