1.5 Means and variances

1.5.1 Expectations

Suppose that m is a discrete random variable and that the series

\[ \sum_m m\,p(m) \]

is absolutely convergent, that is, such that

\[ \sum_m |m|\,p(m) < \infty. \]

Then the sum of the original series is called the mean or expectation of the random variable, and we denote it

\[ \mathsf{E}m. \]

A motivation for this definition is as follows. In a large number N of trials, we would expect the value m to occur about p(m)N times, so that the sum total of the values that would occur in these N trials (counted according to their multiplicity) would be about

\[ \sum_m m\,p(m)N \]

so that the average value should be about

\[ \sum_m m\,p(m)N / N = \mathsf{E}m. \]

Thus, we can think of expectation as being, at least in some circumstances, a form of very long-term average. On the other hand, there are circumstances in which it is difficult to believe in the possibility of arbitrarily large numbers of trials, so this interpretation is not always available. Expectation can also be thought of as giving the position of the ‘centre of gravity’ of the distribution imagined as a distribution of mass spread along the x-axis.

More generally, if g(m) is a function of the random variable and $\sum_m g(m)\,p(m)$ is absolutely convergent, then its sum is the expectation of g(m). Similarly, if h(m, n) is a function of two random variables m and n and the series $\sum_m \sum_n h(m, n)\,p(m, n)$ is absolutely convergent, then its sum is the expectation of h(m, n). These definitions are consistent in that if we consider g(m) and h(m, n) as random variables with densities of their own, then it is easily shown that we get these values for their expectations.
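As a minimal sketch of these definitions for a finite discrete distribution, the following computes the expectation of m and of a function g(m), and checks the consistency claim by treating g(m) as a random variable with a distribution of its own. The distribution and the helper name `expectation` are hypothetical choices made for illustration.

```python
from fractions import Fraction

def expectation(pmf, g=lambda m: m):
    """Sum of g(m) * p(m) over a finite pmf given as {value: probability}."""
    return sum(g(m) * p for m, p in pmf.items())

# Hypothetical distribution: m is equally likely to be -1, 0, 1 or 2.
pmf = {m: Fraction(1, 4) for m in (-1, 0, 1, 2)}

mean = expectation(pmf)                          # E m
mean_square = expectation(pmf, lambda m: m * m)  # E g(m) with g(m) = m^2

# Consistency check: build the distribution of g(m) = m^2 itself
# (the values -1 and 1 both map to 1) and take its ordinary mean.
pmf_of_square = {}
for m, p in pmf.items():
    pmf_of_square[m * m] = pmf_of_square.get(m * m, 0) + p
mean_square_direct = expectation(pmf_of_square)

print(mean, mean_square, mean_square_direct)
```

Exact rational arithmetic is used so that the two routes to the expectation of g(m) can be compared with equality rather than with a tolerance.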

In the continuous case, we define the expectation of a random variable x by

\[ \mathsf{E}x = \int x\,p(x)\,dx \]

provided that the integral is absolutely convergent, and more generally define the expectation of a function g(x) of x by

\[ \mathsf{E}g(x) = \int g(x)\,p(x)\,dx \]

provided that the integral is absolutely convergent, and similarly for the expectation of a function h(x, y) of two random variables. Note that the formulae in the discrete and continuous cases are, as usual, identical except for the use of summation in the one case and integration in the other.

1.5.2 The expectation of a sum and of a product

If x and y are any two random variables, independent or not, and a, b and c are constants, then in the continuous case

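\[
\begin{aligned}
\mathsf{E}(ax + by + c) &= \iint (ax + by + c)\,p(x, y)\,dx\,dy \\
&= a \iint x\,p(x, y)\,dx\,dy + b \iint y\,p(x, y)\,dx\,dy + c \iint p(x, y)\,dx\,dy \\
&= a \int x\,p(x)\,dx + b \int y\,p(y)\,dy + c \\
&= a\,\mathsf{E}x + b\,\mathsf{E}y + c
\end{aligned}
\]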

and similarly in the discrete case. Yet more generally, if g(x) is a function of x and h(y) a function of y, then

\[ \mathsf{E}\{a\,g(x) + b\,h(y) + c\} = a\,\mathsf{E}g(x) + b\,\mathsf{E}h(y) + c. \]

We have already noted that the idea of independence is closely tied up with multiplication, and this is true when it comes to expectations as well. Thus, if x and y are independent, then

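\[
\begin{aligned}
\mathsf{E}xy &= \iint xy\,p(x, y)\,dx\,dy = \iint xy\,p(x)\,p(y)\,dx\,dy \\
&= \left(\int x\,p(x)\,dx\right)\left(\int y\,p(y)\,dy\right) = (\mathsf{E}x)(\mathsf{E}y)
\end{aligned}
\]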

and more generally if g(x) and h(y) are functions of independent random variables x and y, then

\[ \mathsf{E}\,g(x)h(y) = \{\mathsf{E}g(x)\}\{\mathsf{E}h(y)\}. \]
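The contrast between the two rules can be sketched numerically: linearity of expectation holds for any joint distribution, whereas the product rule needs independence. The two small joint distributions below, and the helper names `expect` and `product_gap`, are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

def expect(joint, h):
    """E h(x, y) for a finite joint pmf given as {(x, y): probability}."""
    return sum(h(x, y) * p for (x, y), p in joint.items())

def product_gap(joint):
    """E xy - (E x)(E y); zero whenever x and y are independent."""
    ex = expect(joint, lambda x, y: x)
    ey = expect(joint, lambda x, y: y)
    return expect(joint, lambda x, y: x * y) - ex * ey

# Independent case: the joint pmf is the product of the marginals.
px = {0: Fraction(1, 2), 1: Fraction(1, 2)}
py = {1: Fraction(1, 3), 2: Fraction(2, 3)}
independent = {(x, y): px[x] * py[y] for x, y in product(px, py)}

# Dependent case: y is forced to equal x + 1.
dependent = {(0, 1): Fraction(1, 2), (1, 2): Fraction(1, 2)}

a, b, c = 2, -3, 5
for joint in (independent, dependent):
    lhs = expect(joint, lambda x, y: a * x + b * y + c)
    rhs = (a * expect(joint, lambda x, y: x)
           + b * expect(joint, lambda x, y: y) + c)
    assert lhs == rhs  # linearity holds in both cases

print(product_gap(independent), product_gap(dependent))
```

In the dependent case the gap between the expectation of the product and the product of the expectations is nonzero, which is the sense in which independence is tied up with multiplication.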

1.5.3 Variance, precision and standard deviation

We often need a measure of how spread out a distribution is, and for most purposes the most useful such measure is the variance $\mathcal{V}x$ of x, defined by

\[ \mathcal{V}x = \mathsf{E}(x - \mathsf{E}x)^2. \]

Clearly if the distribution is very little spread out, then most values are close to one another and so close to their mean, so that $(x - \mathsf{E}x)^2$ is small with high probability and hence $\mathsf{E}(x - \mathsf{E}x)^2$ is small. Conversely, if the distribution is well spread out then $\mathcal{V}x$ is large. It is sometimes useful to refer to the reciprocal of the variance, which is called the precision. Further, because the variance is essentially quadratic, we sometimes work in terms of its positive square root, the standard deviation, especially in numerical work. It is often useful that

\[ \mathcal{V}x = \mathsf{E}(x - \mathsf{E}x)^2 = \mathsf{E}\{x^2 - 2(\mathsf{E}x)x + (\mathsf{E}x)^2\} = \mathsf{E}x^2 - (\mathsf{E}x)^2. \]

The notion of a variance is analogous to that of a moment of inertia in mechanics, and this formula corresponds to the parallel axes theorem in mechanics. This analogy seldom carries much weight nowadays, because so many of those studying statistics took it up with the purpose of avoiding mechanics.

In discrete cases, it is sometimes useful that

\[ \mathcal{V}m = \mathsf{E}\,m(m - 1) + \mathsf{E}m - (\mathsf{E}m)^2. \]
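The three routes to the variance can be compared on a small example. The sketch below assumes the standard identities: the variance equals the mean of the square minus the square of the mean, and, in the discrete case, the mean of m(m - 1) plus the mean minus the square of the mean. The distribution is a hypothetical one chosen for illustration.

```python
from fractions import Fraction

def expectation(pmf, g=lambda m: m):
    """Sum of g(m) * p(m) over a finite pmf given as {value: probability}."""
    return sum(g(m) * p for m, p in pmf.items())

# Hypothetical distribution on 0, 1, 2, 3.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

mean = expectation(pmf)

# Definition: expected squared deviation from the mean.
var_definition = expectation(pmf, lambda m: (m - mean) ** 2)

# Shortcut: mean of the square minus the square of the mean.
var_shortcut = expectation(pmf, lambda m: m * m) - mean ** 2

# Discrete form using E m(m - 1).
var_factorial = expectation(pmf, lambda m: m * (m - 1)) + mean - mean ** 2

precision = 1 / var_definition          # reciprocal of the variance
std_dev = float(var_definition) ** 0.5  # positive square root of the variance

print(var_definition, var_shortcut, var_factorial, precision, std_dev)
```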

1.5.4 Examples

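As an example, suppose that $k \sim \mathrm{B}(n, \pi)$. Then

\[ \mathsf{E}k = \sum_{k=0}^{n} k \binom{n}{k} \pi^k (1 - \pi)^{n-k}. \]

After a little manipulation, this can be expressed as

\[ \mathsf{E}k = n\pi \sum_{j=0}^{n-1} \binom{n-1}{j} \pi^j (1 - \pi)^{n-1-j}. \]

Because the sum is the sum of all the binomial $\mathrm{B}(n-1, \pi)$ probabilities, it equals 1, so this expression reduces to $n\pi$, and so

\[ \mathsf{E}k = n\pi. \]

Similarly,

\[ \mathsf{E}\,k(k - 1) = n(n - 1)\pi^2 \]

and so

\[ \mathcal{V}k = n(n - 1)\pi^2 + n\pi - (n\pi)^2 = n\pi(1 - \pi). \]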
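The binomial results can be checked by direct summation. The sketch below assumes the standard binomial probabilities with index n and parameter π (written `pi` in the code), and uses exact rational arithmetic with hypothetical values n = 10 and π = 3/10.

```python
from fractions import Fraction
from math import comb  # available from Python 3.8

def binomial_pmf(n, pi):
    """Binomial probabilities as {k: probability} for k = 0, ..., n."""
    return {k: comb(n, k) * pi ** k * (1 - pi) ** (n - k) for k in range(n + 1)}

n, pi = 10, Fraction(3, 10)
pmf = binomial_pmf(n, pi)

mean = sum(k * p for k, p in pmf.items())                 # E k
factorial = sum(k * (k - 1) * p for k, p in pmf.items())  # E k(k - 1)
variance = factorial + mean - mean ** 2                   # discrete variance formula

print(mean, factorial, variance)
```

The printed values can be compared with nπ, n(n − 1)π² and nπ(1 − π) respectively.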

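For a second example, suppose $x \sim \mathrm{N}(\mu, \phi)$. Then

\[
\begin{aligned}
\mathsf{E}x &= \int x\,(2\pi\phi)^{-1/2} \exp\{-\tfrac{1}{2}(x - \mu)^2/\phi\}\,dx \\
&= \mu + \int (x - \mu)(2\pi\phi)^{-1/2} \exp\{-\tfrac{1}{2}(x - \mu)^2/\phi\}\,dx.
\end{aligned}
\]

The integrand in the last expression is an odd function of $x - \mu$, and so the integral vanishes, so that

\[ \mathsf{E}x = \mu. \]

Moreover,

\[ \mathcal{V}x = \int (x - \mu)^2 (2\pi\phi)^{-1/2} \exp\{-\tfrac{1}{2}(x - \mu)^2/\phi\}\,dx \]

so that on writing $z = (x - \mu)/\sqrt{\phi}$

\[ \mathcal{V}x = \phi \int z^2 (2\pi)^{-1/2} \exp(-\tfrac{1}{2}z^2)\,dz. \]

Integrating by parts (using z as the part to differentiate), we get

\[ \mathcal{V}x = \phi \int (2\pi)^{-1/2} \exp(-\tfrac{1}{2}z^2)\,dz = \phi. \]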
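For readers who want the integration by parts spelled out, the following is a sketch of the intermediate step, assuming the standardized variable $z = (x - \mu)/\sqrt{\phi}$ and the fact that the standard normal density integrates to 1.

```latex
\begin{aligned}
\int_{-\infty}^{\infty} z^2 (2\pi)^{-1/2} e^{-z^2/2}\,dz
  &= \int_{-\infty}^{\infty} z \cdot \left\{ z\,(2\pi)^{-1/2} e^{-z^2/2} \right\} dz \\
  &= \left[ -z\,(2\pi)^{-1/2} e^{-z^2/2} \right]_{-\infty}^{\infty}
     + \int_{-\infty}^{\infty} (2\pi)^{-1/2} e^{-z^2/2}\,dz \\
  &= 0 + 1 = 1 .
\end{aligned}
```

The boundary term vanishes because the exponential factor decays faster than z grows, so the variance of x is φ times 1.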

1.5.5 Variance of a sum; covariance and correlation

Sometimes we need to find the variance of a sum of random variables. To do this, note that

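\[
\begin{aligned}
\mathcal{V}(x + y) &= \mathsf{E}\{x + y - \mathsf{E}(x + y)\}^2 = \mathsf{E}\{(x - \mathsf{E}x) + (y - \mathsf{E}y)\}^2 \\
&= \mathsf{E}(x - \mathsf{E}x)^2 + \mathsf{E}(y - \mathsf{E}y)^2 + 2\,\mathsf{E}(x - \mathsf{E}x)(y - \mathsf{E}y) \\
&= \mathcal{V}x + \mathcal{V}y + 2\,\mathcal{C}(x, y)
\end{aligned}
\]

where the covariance $\mathcal{C}(x, y)$ of x and y is defined by

\[ \mathcal{C}(x, y) = \mathsf{E}(x - \mathsf{E}x)(y - \mathsf{E}y) = \mathsf{E}xy - (\mathsf{E}x)(\mathsf{E}y). \]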

More generally,

\[ \mathcal{V}(ax + by + c) = a^2\,\mathcal{V}x + b^2\,\mathcal{V}y + 2ab\,\mathcal{C}(x, y) \]

for any constants a, b and c. By considering this expression as a quadratic in a for fixed b or vice versa and noting that (because its value is never negative) this quadratic cannot have two unequal real roots, we see that

\[ \{\mathcal{C}(x, y)\}^2 \leqslant (\mathcal{V}x)(\mathcal{V}y). \]

We define the correlation coefficient $\rho(x, y)$ between x and y by

\[ \rho(x, y) = \frac{\mathcal{C}(x, y)}{\sqrt{(\mathcal{V}x)(\mathcal{V}y)}}. \]

It follows that

\[ -1 \leqslant \rho(x, y) \leqslant 1 \]

and indeed a little further thought shows that $\rho(x, y) = 1$ if and only if

\[ ax + by + c = 0 \]

with probability 1 for some constants a, b and c with a and b having opposite signs, while $\rho(x, y) = -1$ if and only if the same thing happens except that a and b have the same sign. If $\rho(x, y) = 0$ we say that x and y are uncorrelated.

It is easily seen that if x and y are independent then

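\[ \mathcal{C}(x, y) = \mathsf{E}xy - (\mathsf{E}x)(\mathsf{E}y) = (\mathsf{E}x)(\mathsf{E}y) - (\mathsf{E}x)(\mathsf{E}y) = 0 \]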

from which it follows that independent random variables are uncorrelated.

The converse is not in general true, but it can be shown that if x and y have a bivariate normal distribution (as described in Appendix A), then they are independent if and only if they are uncorrelated.

It should be noted that if x and y are uncorrelated, and in particular if they are independent, then

\[ \mathcal{V}(x \pm y) = \mathcal{V}x + \mathcal{V}y \]

(observe that there is a plus sign on the right-hand side even if there is a minus sign on the left).
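A small example may help separate the ideas of being uncorrelated and being independent. The sketch below uses a hypothetical pair in which x is equally likely to be -1, 0 or 1 and y is the square of x, so that y is completely determined by x and yet their covariance is zero. It also checks that the variances of the sum and of the difference both equal the sum of the variances. The helper names are illustrative.

```python
from fractions import Fraction

def expect(joint, h):
    """E h(x, y) for a finite joint pmf given as {(x, y): probability}."""
    return sum(h(x, y) * p for (x, y), p in joint.items())

def variance_of(joint, h):
    """Variance of h(x, y) under the joint pmf."""
    m = expect(joint, h)
    return expect(joint, lambda x, y: (h(x, y) - m) ** 2)

def covariance(joint):
    """E (x - E x)(y - E y) under the joint pmf."""
    ex = expect(joint, lambda x, y: x)
    ey = expect(joint, lambda x, y: y)
    return expect(joint, lambda x, y: (x - ex) * (y - ey))

# x uniform on {-1, 0, 1} and y = x^2.
joint = {(-1, 1): Fraction(1, 3), (0, 0): Fraction(1, 3), (1, 1): Fraction(1, 3)}

cov = covariance(joint)
var_x = variance_of(joint, lambda x, y: x)
var_y = variance_of(joint, lambda x, y: y)
var_sum = variance_of(joint, lambda x, y: x + y)
var_diff = variance_of(joint, lambda x, y: x - y)

# Dependence: P(x = 0, y = 0) differs from P(x = 0) P(y = 0).
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)
dependent = joint[(0, 0)] != p_x0 * p_y0

print(cov, var_x, var_y, var_sum, var_diff, dependent)
```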

1.5.6 Approximations to the mean and variance of a function of a random variable

Very occasionally, it will be useful to have an approximation to the mean and variance of a function of a random variable. Suppose that

\[ z = g(x). \]

Then if g is a reasonably smooth function and x is not too far from its expectation, Taylor’s theorem implies that

\[ z \cong g(\mathsf{E}x) + (x - \mathsf{E}x)\,g'(\mathsf{E}x). \]

It therefore seems reasonable that a fair approximation to the expectation of z is given by

\[ \mathsf{E}z \cong g(\mathsf{E}x) \]

and if this is so, then a reasonable approximation to $\mathcal{V}z$ may well be given by

\[ \mathcal{V}z \cong \mathcal{V}x\,\{g'(\mathsf{E}x)\}^2. \]

As an example, suppose that

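\[ x \sim \mathrm{B}(n, \pi) \]

and that z = g(x), where

\[ g(x) = \sin^{-1}\sqrt{x/n} \]

so that

\[ g'(x) = \tfrac{1}{2} n^{-1} \left[(x/n)\{1 - (x/n)\}\right]^{-1/2} \]

and thus $g'(\mathsf{E}x) = g'(n\pi) = \tfrac{1}{2} n^{-1} [\pi(1 - \pi)]^{-1/2}$. The aforementioned argument then implies that

\[ \mathsf{E}z \cong \sin^{-1}\sqrt{\pi}, \qquad \mathcal{V}z \cong 1/4n. \]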

The interesting thing about this transformation, which has a long history [see Eisenhart et al. (1947, Chapter 16) and Fisher (1954)], is that, to the extent to which the approximation is valid, the variance of z does not depend on the parameter π. It is accordingly known as a variance-stabilizing transformation. We will return to this transformation in Section 3.2 on the ‘Reference Prior for the Binomial Distribution’.
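The quality of this approximation can be examined by computing the mean and variance of the transformed variable exactly from the binomial probabilities. The sketch below assumes the inverse-sine square-root transformation of x/n and compares the exact variance with 1/(4n); the values n = 100 and π = 0.3, 0.5 are hypothetical choices, and `pi` in the code denotes the binomial parameter, not the mathematical constant.

```python
from math import asin, comb, sqrt

def binomial_pmf(n, pi):
    """Binomial probabilities for k = 0, ..., n as a list."""
    return [comb(n, k) * pi ** k * (1 - pi) ** (n - k) for k in range(n + 1)]

def mean_and_variance(values, probs):
    """Exact mean and variance of a finite discrete distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
    return mean, var

def arcsine_moments(n, pi):
    """Mean and variance of z = arcsin(sqrt(x / n)) when x is binomial."""
    probs = binomial_pmf(n, pi)
    z_values = [asin(sqrt(k / n)) for k in range(n + 1)]
    return mean_and_variance(z_values, probs)

n = 100
results = {pi: arcsine_moments(n, pi) for pi in (0.3, 0.5)}
for pi, (mean_z, var_z) in results.items():
    # Columns: pi, exact mean, approximate mean, exact variance, 1/(4n)
    print(pi, mean_z, asin(sqrt(pi)), var_z, 1 / (4 * n))
```

By contrast, the variance of the untransformed proportion x/n is π(1 − π)/n, which changes with π.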

1.5.7 Conditional expectations and variances

If the reader wishes, the following may be omitted on a first reading and then returned to as needed.

We define the conditional expectation of y given x by

\[ \mathsf{E}(y \mid x) = \int y\,p(y \mid x)\,dy \]

in the continuous case and by the corresponding sum in the discrete case. If we wish to be pedantic, it can occasionally be useful to indicate what we are averaging over by writing

\[ \mathsf{E}_{y \mid x}(y \mid x) \]

just as we can write $\mathsf{E}_x\,x$, but this is rarely necessary (though it can slightly clarify a proof on occasion). More generally, the conditional expectation of a function g(y) of y given x is

\[ \mathsf{E}(g(y) \mid x) = \int g(y)\,p(y \mid x)\,dy. \]

We can also define a conditional variance as

\[ \mathcal{V}(y \mid x) = \mathsf{E}\left[\{y - \mathsf{E}(y \mid x)\}^2 \mid x\right] = \mathsf{E}(y^2 \mid x) - \{\mathsf{E}(y \mid x)\}^2. \]

Despite some notational complexity, this is easy enough to find since after all a conditional distribution is just a particular case of a probability distribution. If we are really pedantic, then $\mathsf{E}(\tilde{y} \mid x)$ is a real number which is a function of the real number x, while $\mathsf{E}(\tilde{y} \mid \tilde{x})$ is a random variable which is a function of the random variable $\tilde{x}$, which takes the value $\mathsf{E}(\tilde{y} \mid x)$ when $\tilde{x}$ takes the value x. However, the distinction, which is hard to grasp in the first place, is usually unimportant.

We may note that the formula

\[ p(y) = \int p(y \mid x)\,p(x)\,dx \]

could be written as

\[ p(y) = \mathsf{E}\,p(y \mid \tilde{x}) \]

but we must be careful that it is an expectation over values of $\tilde{x}$ (i.e. $\mathsf{E}_{\tilde{x}}$) that occurs here.

Very occasionally we make use of results like

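\[
\begin{aligned}
\mathsf{E}y &= \mathsf{E}\{\mathsf{E}(y \mid x)\}, \\
\mathcal{V}y &= \mathsf{E}\,\mathcal{V}(y \mid x) + \mathcal{V}\,\mathsf{E}(y \mid x).
\end{aligned}
\]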

The proofs are possibly more confusing than helpful. They run as follows:

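\[
\begin{aligned}
\mathsf{E}\{\mathsf{E}(y \mid x)\} &= \int \left\{\int y\,p(y \mid x)\,dy\right\} p(x)\,dx \\
&= \iint y\,p(x, y)\,dy\,dx = \int y\,p(y)\,dy = \mathsf{E}y.
\end{aligned}
\]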

Similarly, we get the generalization

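\[ \mathsf{E}g(y) = \mathsf{E}\{\mathsf{E}(g(y) \mid x)\} \]

and in particular

\[ \mathsf{E}y^2 = \mathsf{E}\{\mathsf{E}(y^2 \mid x)\} \]

hence

\[ \mathsf{E}\,\mathcal{V}(y \mid x) = \mathsf{E}\{\mathsf{E}(y^2 \mid x)\} - \mathsf{E}\left[\{\mathsf{E}(y \mid x)\}^2\right] = \mathsf{E}y^2 - \mathsf{E}\left[\{\mathsf{E}(y \mid x)\}^2\right] \]

while

\[ \mathcal{V}\,\mathsf{E}(y \mid x) = \mathsf{E}\left[\{\mathsf{E}(y \mid x)\}^2\right] - \left[\mathsf{E}\{\mathsf{E}(y \mid x)\}\right]^2 = \mathsf{E}\left[\{\mathsf{E}(y \mid x)\}^2\right] - (\mathsf{E}y)^2 \]

from which it follows that

\[ \mathsf{E}\,\mathcal{V}(y \mid x) + \mathcal{V}\,\mathsf{E}(y \mid x) = \mathsf{E}y^2 - (\mathsf{E}y)^2 = \mathcal{V}y. \]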
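These two results, that the mean of y is the mean of its conditional mean and that the variance of y is the mean of the conditional variance plus the variance of the conditional mean, can be checked on a small discrete example. The joint distribution below is hypothetical, and the discrete sums stand in for the integrals of the continuous case.

```python
from fractions import Fraction

# Hypothetical joint pmf of (x, y), given as {(x, y): probability}.
joint = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 6), (1, 2): Fraction(1, 3),
}

# Marginal distribution of x.
px = {}
for (x, y), p in joint.items():
    px[x] = px.get(x, 0) + p

# Conditional mean and variance of y for each value of x.
cond_mean, cond_var = {}, {}
for x0, p_x0 in px.items():
    cond = {y: p / p_x0 for (x, y), p in joint.items() if x == x0}
    m = sum(y * q for y, q in cond.items())
    cond_mean[x0] = m
    cond_var[x0] = sum((y - m) ** 2 * q for y, q in cond.items())

# Unconditional mean and variance of y, straight from the joint pmf.
mean_y = sum(y * p for (x, y), p in joint.items())
var_y = sum((y - mean_y) ** 2 * p for (x, y), p in joint.items())

# Averages over x of the conditional quantities.
mean_of_cond_mean = sum(cond_mean[x] * px[x] for x in px)
mean_of_cond_var = sum(cond_var[x] * px[x] for x in px)
var_of_cond_mean = sum((cond_mean[x] - mean_of_cond_mean) ** 2 * px[x] for x in px)

print(mean_y, mean_of_cond_mean)
print(var_y, mean_of_cond_var + var_of_cond_mean)
```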

1.5.8 Medians and modes

The mean is not the only measure of the centre of a distribution. We also need to consider the median from time to time, which is defined as any value x0 such that

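\[ \mathsf{P}(x \leqslant x_0) \geqslant \tfrac{1}{2} \quad\text{and}\quad \mathsf{P}(x \geqslant x_0) \geqslant \tfrac{1}{2}. \]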

In the case of most continuous random variables there is a unique median such that

\[ \mathsf{P}(x \leqslant x_0) = \mathsf{P}(x \geqslant x_0) = \tfrac{1}{2}. \]
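For a discrete distribution the median need not be unique. The sketch below assumes the usual definition, namely that a median is any value with probability at least one half at or below it and at least one half at or above it, and applies it to two hypothetical distributions. The function name `is_median` is illustrative.

```python
from fractions import Fraction

def is_median(pmf, x0):
    """True if P(x <= x0) >= 1/2 and P(x >= x0) >= 1/2 for a finite pmf."""
    half = Fraction(1, 2)
    p_le = sum(p for x, p in pmf.items() if x <= x0)
    p_ge = sum(p for x, p in pmf.items() if x >= x0)
    return p_le >= half and p_ge >= half

# A fair die: both 3 and 4 (and anything in between) qualify as medians.
die = {m: Fraction(1, 6) for m in range(1, 7)}
die_medians = [m for m in die if is_median(die, m)]

# A lopsided two-point distribution has a single median.
lopsided = {0: Fraction(3, 5), 1: Fraction(2, 5)}
lopsided_medians = [m for m in lopsided if is_median(lopsided, m)]

print(die_medians, lopsided_medians)
```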

We occasionally refer also to the mode, defined as that value at which the pdf is a maximum. One important use we shall have for the mode will be in methods for finding the median based on the approximation

\[ \text{mean} - \text{mode} = 3\,(\text{mean} - \text{median}) \]

or equivalently

\[ \text{median} = (2\,\text{mean} + \text{mode})/3 \]

(see the preliminary remarks in Appendix A).
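To get a feel for how rough this rule of thumb is, it can be tried on a distribution whose mean, mode and median are all known. The sketch below assumes the approximation median ≈ (2 × mean + mode)/3 and uses the exponential distribution with unit rate, which has mean 1, mode 0 and median ln 2. The function name `approx_median` is hypothetical.

```python
from math import log

def approx_median(mean, mode):
    """Rule-of-thumb median from the mean and the mode: (2 * mean + mode) / 3."""
    return (2 * mean + mode) / 3

# Exponential distribution with unit rate: mean 1, mode 0, exact median ln 2.
exact = log(2)
approx = approx_median(1.0, 0.0)
error = abs(approx - exact)

# For a symmetric unimodal distribution, mean, median and mode coincide.
symmetric = approx_median(5.0, 5.0)

print(exact, approx, error, symmetric)
```

Even for a distribution as skewed as the exponential the approximation lands close to the exact median, and it is exact when mean and mode coincide.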
