Chapter 34
A Conditional Law of Large Numbers

Annals of Probability, 8(1) (1980), 142–147.

Abstract

It is shown that, conditionally on a set of given average values, the frequency distribution of a series of independent random variables with a common finite distribution converges in probability to the distribution that has the maximum relative entropy subject to the given mean values.

Introduction

In statistical mechanics and other areas of physics, empirical distributions in the phase space conform in many circumstances to the distribution maximizing the entropy of the system subject to its constraints. The constraints are typically in the form of specified mean values of some functions of phase. If $p = (p_1, p_2, \ldots, p_k)$ denotes the probability distribution over the state space, the constraints on $p$ take the form

$$\sum_{j=1}^{k} g_{ij} p_j = b_i, \qquad i = 1, 2, \ldots, m,$$

where the $g_{ij}$ are the values of the constrained functions in the individual states and the $b_i$ are the prescribed means, and the maximum entropy distribution is the one that maximizes the entropy function

$$H(p) = -\sum_{j=1}^{k} p_j \log p_j$$

subject to the constraints.

A principle stating that the empirical distribution possesses the maximum entropy within the restrictions of the system is due to Gibbs (1902). As a special case, he proposed the so-called canonical distribution as a description of systems subject to a single constraint that the average energy has a fixed value,

$$\sum_{j=1}^{k} \epsilon_j p_j = b,$$

where $\epsilon_1, \epsilon_2, \ldots, \epsilon_k$ are the energy levels of the individual states. In this case, the maximum entropy distribution has the form

$$p_j = \alpha e^{-\beta \epsilon_j}, \qquad j = 1, 2, \ldots, k,$$

where the constants $\alpha$ and $\beta$ are determined by the normalization $\sum_j p_j = 1$ and by the energy constraint, which is the form that Gibbs called canonical.
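As a concrete numerical illustration (an addition, not part of the paper), the sketch below computes a canonical distribution for hypothetical energy levels and a prescribed mean energy. The levels, the target mean, and the helper name `canonical_distribution` are illustrative choices; $\beta$ is found by bisection, assuming the target mean lies strictly between the smallest and largest level.

```python
import numpy as np

def canonical_distribution(energies, mean_energy, tol=1e-12):
    """Return p_j proportional to exp(-beta * e_j) with sum_j e_j p_j = mean_energy.

    The mean energy under the canonical form is strictly decreasing in beta,
    so beta can be located by bisection.  Assumes min(e) < mean_energy < max(e).
    """
    e = np.asarray(energies, dtype=float)

    def mean_under(beta):
        w = np.exp(-beta * (e - e.min()))   # shift exponents for numerical stability
        p = w / w.sum()
        return p @ e, p

    lo, hi = -1.0, 1.0
    while mean_under(lo)[0] < mean_energy:  # widen the bracket until it contains beta
        lo *= 2.0
    while mean_under(hi)[0] > mean_energy:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if mean_under(mid)[0] > mean_energy else (lo, mid)
    return mean_under(0.5 * (lo + hi))[1]

# Hypothetical example: four energy levels and a prescribed mean energy of 1.2.
levels = [0.0, 1.0, 2.0, 3.0]
p = canonical_distribution(levels, 1.2)
print(np.round(p, 4), float(p @ np.array(levels)))
```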

Gibbs offered no justification for the canonical distribution, or for the principle of maximum entropy in general. In spite of its apparent arbitrariness, however, the maximum entropy principle has since found a number of successful applications in a wide range of situations, and has led to many new developments in physics. For an informed discussion, see Jaynes (1967).

In a subsequent paper, Jaynes (1968) presented a demonstration that the distribution with the maximum entropy “can be realized experimentally in overwhelmingly more ways than can any other.” Therefore, for large physical systems, the empirical distribution should, indeed, agree with the maximum entropy distribution.

In this chapter, a limit theorem is given that provides a foundation for the above principle, in the same sense in which the law of large numbers justifies the interpretation of limiting frequencies as probabilities. Informally stated, the theorem asserts that in the equiprobable case, the frequencies conditional on the given constraints converge in probability to the distribution that has the maximum entropy subject to these constraints.

A generalization of this result is also given, which relaxes the assumption that all states are equally likely. In the general case, the frequencies conditional on a set of constraints converge to the distribution that maximizes the entropy relative to the underlying distribution, subject to those constraints.

The Limit Theorems

Let $A = \{a_1, a_2, \ldots, a_k\}$ be a finite set of $k$ elements and consider a series $X_1, X_2, \ldots$ of independent identically distributed random variables with values in $A$, such that

$$P[X_i = a_j] = \frac{1}{k}, \qquad j = 1, 2, \ldots, k. \tag{1}$$

Denote by $P_n = (P_{n1}, P_{n2}, \ldots, P_{nk})$ the frequency distribution of $X_1, X_2, \ldots, X_n$,

$$P_{nj} = \frac{1}{n} \sum_{i=1}^{n} I(X_i = a_j), \qquad j = 1, 2, \ldots, k,$$

where $I$ is the characteristic function. Let $G = (g_{ij})$ be a given $m \times k$ matrix and $b = (b_1, b_2, \ldots, b_m)'$ a given vector. Put $\|x\| = \max_i |x_i|$ for a vector $x$, and define

$$D_0 = \{p \in S : Gp = b\}, \tag{2}$$

where $S$ is the set of probability distributions on $A$,

$$S = \Big\{p = (p_1, \ldots, p_k) \in \mathbb{R}^k : p_j \ge 0,\ j = 1, \ldots, k,\ \sum_{j=1}^{k} p_j = 1\Big\}.$$
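For concreteness, the following sketch (not from the paper) computes the frequency distribution $P_n$ of a simulated equiprobable sample and the residual $GP_n - b$ in the sup norm; the particular matrix `G`, vector `b`, and sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

k, n = 6, 10_000
# Equiprobable case (1): X_1, ..., X_n uniform on A = {a_1, ..., a_k},
# represented here by the indices 0, ..., k-1.
x = rng.integers(0, k, size=n)

# Frequency distribution P_n: P_nj = (1/n) * sum_i I(X_i = a_j).
p_n = np.bincount(x, minlength=k) / n

# One illustrative linear constraint (m = 1): the mean of the face values 1..k.
G = np.arange(1, k + 1, dtype=float).reshape(1, k)   # g_{1j} = j
b = np.array([3.5])                                   # prescribed mean value

print("P_n          :", p_n)
print("G P_n - b    :", G @ p_n - b)
print("||G P_n - b||:", np.max(np.abs(G @ p_n - b)))
```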

Assume that $D_0 \neq \emptyset$. Define the entropy of a distribution $p \in S$ by

$$H(p) = -\sum_{j=1}^{k} p_j \log p_j, \tag{3}$$

with the convention $0 \log 0 = 0$. Denote by $p^0 = (p^0_1, p^0_2, \ldots, p^0_k)$ the maximum point of $H$ on $D_0$,

$$H(p^0) = \max_{p \in D_0} H(p). \tag{4}$$

Since $H$ is continuous on $S$ and $D_0$ is compact, the maximum exists. Moreover, it is unique by virtue of the strict concavity of $H$ on $S$ and the convexity of the set $D_0$.
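The uniqueness argument can be spelled out in one line (this verification is added here and is not part of the original text):

```latex
% H(p) = \sum_{j} \varphi(p_j) with \varphi(x) = -x \log x and \varphi''(x) = -1/x < 0,
% so \varphi is strictly concave on [0,1] and hence, for p \neq p' in S,
H\!\left(\tfrac{1}{2}(p + p')\right) \;>\; \tfrac{1}{2}\,H(p) + \tfrac{1}{2}\,H(p').
% If p and p' were two distinct maximum points of H on the convex set D_0,
% their midpoint would lie in D_0 and attain a strictly larger value of H.
```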

Theorem 1. For every $\epsilon > 0$, there exists $\delta_0 > 0$ such that for every $\delta$, $0 < \delta \le \delta_0$,

$$P\big[\,\|P_n - p^0\| \ge \epsilon \;\big|\; \|GP_n - b\| \le \delta\,\big] \;\to\; 0 \tag{5}$$

as $n \to \infty$, where $p^0$ is the maximum entropy distribution, Eq. (4).
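Theorem 1 can be illustrated by simulation (this experiment is an addition, not part of the paper): for a fair six-sided die conditioned on the sample mean lying within $\delta$ of a prescribed value $b$, the conditional frequencies should lie close to the maximum entropy distribution, which in this case has the exponential form $p_j \propto e^{-\beta j}$. The values of $b$, $\delta$, $n$, and the number of trials below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

k = 6
faces = np.arange(1, k + 1, dtype=float)
b, delta = 3.8, 0.05          # prescribed mean value and tolerance (illustrative)
n, trials = 200, 100_000      # sample size and number of simulated samples

# Maximum entropy distribution subject to sum_j j * p_j = b:
# p_j proportional to exp(-beta * j), beta found by bisection.
def mean_of(beta):
    w = np.exp(-beta * faces)
    return (w / w.sum()) @ faces

lo, hi = -5.0, 5.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if mean_of(mid) > b else (lo, mid)
w = np.exp(-0.5 * (lo + hi) * faces)
p0 = w / w.sum()

# Frequency distributions P_n of the samples that satisfy the conditioning
# event |sample mean - b| <= delta of Theorem 1 (fair die, equiprobable case).
accepted = []
for _ in range(trials):
    x = rng.integers(1, k + 1, size=n)
    if abs(x.mean() - b) <= delta:
        accepted.append(np.bincount(x, minlength=k + 1)[1:] / n)

print("accepted samples            :", len(accepted))
print("average conditional P_n     :", np.round(np.mean(accepted, axis=0), 4))
print("maximum entropy distribution:", np.round(p0, 4))
print("average sup distance to p0  :",
      round(float(np.mean([np.abs(f - p0).max() for f in accepted])), 4))
```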

This theorem is a special case of the more general conditional law of large numbers, which will now be stated. Replace the assumption (1) of the equiprobable case by the general assumption that

$$P[X_i = a_j] = q_j, \qquad j = 1, 2, \ldots, k, \tag{6}$$

where $q = (q_1, q_2, \ldots, q_k)$ is a given distribution. Assume, without loss of generality, that $q_j > 0$ for $j = 1, 2, \ldots, k$. Define the entropy $H_q(p)$ of a distribution $p \in S$ relative to the distribution $q$ by

$$H_q(p) = -\sum_{j=1}^{k} p_j \log \frac{p_j}{q_j}. \tag{7}$$

Again let $D_0$ be the set in (2), assume $D_0 \neq \emptyset$, and replace the definition (4) of $p^0$ by the definition

$$H_q(p^0) = \max_{p \in D_0} H_q(p). \tag{8}$$

Again, the maximum relative entropy point $p^0$ exists and is unique.

Theorem 2. For every $\epsilon > 0$, there exists $\delta_0 > 0$ such that for every $\delta$, $0 < \delta \le \delta_0$,

$$P\big[\,\|P_n - p^0\| \ge \epsilon \;\big|\; \|GP_n - b\| \le \delta\,\big] \;\to\; 0 \tag{9}$$

as $n \to \infty$, where $p^0$ is the distribution with the maximum entropy relative to $q$,

$$H_q(p^0) = \max_{p \in D_0} H_q(p).$$

The maximum relative entropy distribution $p^0$ is easy to find. It is given by

$$p^0_j = q_j \exp\Big(\mu_0 + \sum_{i=1}^{m} \mu_i g_{ij}\Big), \qquad j = 1, 2, \ldots, k,$$

where the constants $\mu_0, \mu_1, \ldots, \mu_m$ are determined by the condition $p^0 \in D_0$.
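One standard way to compute the constants numerically (a sketch added here, not taken from the paper) is through the convex dual: minimize $\psi(\mu) = \log \sum_j q_j \exp(\sum_i \mu_i g_{ij}) - \sum_i \mu_i b_i$ over $\mu \in \mathbb{R}^m$, whose gradient is $Gp(\mu) - b$ for $p_j(\mu) \propto q_j \exp(\sum_i \mu_i g_{ij})$. The example data, step size, and iteration count below are arbitrary, and feasibility of the constraints is assumed.

```python
import numpy as np

def max_relative_entropy(G, b, q, steps=20_000, lr=0.05):
    """Find p maximizing H_q(p) = -sum_j p_j log(p_j / q_j) subject to G p = b.

    Uses the exponential form p_j(mu) = q_j exp(sum_i mu_i g_ij) / Z(mu) and
    plain gradient descent on the convex dual psi(mu) = log Z(mu) - mu . b,
    whose gradient is G p(mu) - b.  Assumes the constraints are feasible.
    """
    G, b, q = np.asarray(G, float), np.asarray(b, float), np.asarray(q, float)
    mu = np.zeros(len(b))
    for _ in range(steps):
        s = mu @ G                                   # s_j = sum_i mu_i g_ij
        w = q * np.exp(s - s.max())                  # shift for numerical stability
        p = w / w.sum()
        mu -= lr * (G @ p - b)                       # gradient step on the dual
    return p

# Illustrative example: q uniform on 6 states, one constraint fixing the mean.
q = np.full(6, 1 / 6)
G = np.arange(1, 7, dtype=float).reshape(1, 6)
b = np.array([3.8])
p0 = max_relative_entropy(G, b, q)
print(np.round(p0, 4), (G @ p0).item())
```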

Proof of the Theorems

Theorem 1 follows immediately from Theorem 2, since for $q_j = 1/k$, $j = 1, 2, \ldots, k$,

$$H_q(p) = H(p) - \log k,$$

so that the maximum points in Eqs. (4) and (8) coincide.
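The identity is immediate (the one-line verification below is added for completeness):

```latex
H_q(p) \;=\; -\sum_{j=1}^{k} p_j \log \frac{p_j}{1/k}
        \;=\; -\sum_{j=1}^{k} p_j \log p_j \;-\; \log k \sum_{j=1}^{k} p_j
        \;=\; H(p) - \log k .
```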

Proof of Theorem 2. Let $\epsilon > 0$ be fixed, and put

$$V = \{p \in S : \|p - p^0\| < \epsilon\}, \tag{10}$$

where $p^0$ is given by (8). For each $\delta > 0$, define

$$D_\delta = \{p \in S : \|Gp - b\| \le \delta\}. \tag{11}$$

Define uniquely a point $p^\delta \in D_\delta$ by

$$H_q(p^\delta) = \max_{p \in D_\delta} H_q(p) \tag{12}$$

(as before, the maximum exists and is attained at a unique point, since $H_q$ is continuous and strictly concave and $D_\delta$ is compact and convex). Introduce a topology on $S$ by the metric

$$\rho(p, p') = \|p - p'\| = \max_{1 \le j \le k} |p_j - p'_j|.$$

We will first prove that

$$p^\delta \to p^0 \quad \text{as } \delta \to 0+. \tag{13}$$

Let the set $\{p^\delta : \delta > 0\}$ be directed by the relation $\delta \succ \delta'$ if $\delta < \delta'$. Since $S$ is compact, the directed set $\{p^\delta\}$ has at least one limit point. Let $p^*$ be one such limit point. Choose an arbitrary $\eta > 0$ and put

$$U = \{p \in S : |H_q(p) - H_q(p^*)| < \eta\}.$$

There exists $\delta > 0$ such that

$$p^\delta \in U,$$

since $U$ is a neighborhood of $p^*$ by continuity of $H_q$ and $p^*$ is a limit point of the directed set. Then

$$H_q(p^*) > H_q(p^\delta) - \eta \ge H_q(p^0) - \eta,$$

the second inequality because $p^0 \in D_0 \subseteq D_\delta$, and therefore $H_q(p^*) \ge H_q(p^0) - \eta$. Since this is true for every $\eta > 0$, it follows that

$$H_q(p^*) \ge H_q(p^0).$$

Now $p^\delta \in D_\delta$ for every $\delta > 0$. Since $Gp$ is a continuous function of $p$, the same is true for the limiting point,

$$Gp^* = b,$$

so that $p^* \in D_0$ (for every $\delta_1 > 0$ there is $\delta \le \delta_1$ with $p^\delta$ arbitrarily close to $p^*$, whence $\|Gp^* - b\| \le \delta_1$). But $p^0$ is the unique maximum point of $H_q$ on $D_0$, and therefore $p^* = p^0$. Thus, $p^0$ is the only limit point of $\{p^\delta\}$, which proves Eq. (13).

It follows that there exists $\delta_0 > 0$ such that for every $\delta$, $0 < \delta \le \delta_0$,

$$\|p^\delta - p^0\| < \tfrac{1}{2}\epsilon. \tag{14}$$

Let $\delta$ be selected arbitrarily from $(0, \delta_0]$ and fixed. Put $W = D_\delta \setminus V$, where $V$ is given by (10), and denote the adherence of $W$ by $\overline{W}$. (If $W$ is empty, the conditional probability in (9) vanishes for every $n$ and there is nothing to prove, so assume $W \neq \emptyset$.) Put

$$h = \max_{p \in \overline{W}} H_q(p).$$

Since

$$H_q(p) < H_q(p^\delta) \qquad \text{for every } p \in D_\delta,\ p \neq p^\delta,$$

and $p^\delta \notin \overline{W}$ by virtue of Eq. (14) (every point of $\overline{W}$ satisfies $\|p - p^0\| \ge \epsilon$, while $\|p^\delta - p^0\| < \tfrac{1}{2}\epsilon$), it follows that

$$h < H_q(p^\delta).$$

Put

$$\gamma = \tfrac{1}{2}\big(H_q(p^\delta) - h\big),$$

so that

$$h = H_q(p^\delta) - 2\gamma, \qquad \gamma > 0, \tag{15}$$

and define

$$R = \{p \in S : H_q(p) > H_q(p^\delta) - \gamma\}.$$

Let

$$B = V \cap R \cap D_\delta.$$

We will now show that $B$ contains an open set. Let $0 < \lambda \le 1$ and put

$$p^\lambda = (1 - \lambda) p^\delta + \lambda p^0.$$

The point $p^\lambda$ is an interior point of $S$ (relative to the hyperplane $\sum_j p_j = 1$) for every $0 < \lambda \le 1$. To prove that, choose

$$\eta = \lambda \min_{1 \le j \le k} p^0_j,$$

which is positive because $p^0_j = q_j \exp(\mu_0 + \sum_i \mu_i g_{ij}) > 0$ for every $j$. For every $p$ with $\sum_{j=1}^{k} p_j = 1$ such that

$$\rho(p, p^\lambda) < \eta,$$

it is true that

$$p_j > p^\lambda_j - \eta \ge \lambda p^0_j - \lambda \min_{1 \le i \le k} p^0_i \ge 0, \qquad j = 1, 2, \ldots, k,$$

so that $p \in S$. Thus, $p^\lambda$ belongs to the interior of $S$ for every $0 < \lambda \le 1$. Moreover,

$$\|Gp^\lambda - b\| \le (1 - \lambda)\|Gp^\delta - b\| + \lambda\|Gp^0 - b\| \le (1 - \lambda)\delta < \delta,$$

so that $p^\lambda$ is also an interior point of $D_\delta$. Since $p^\delta$ is an interior point of $V$ by virtue of Eq. (14) and, by continuity of $H_q$, also of $R$, the point $p^\lambda$ will be in the interior of both $V$ and $R$ if $\lambda$ is sufficiently small. Thus, such $p^\lambda$ is an interior point of $B$, and consequently $B$ contains an open set, say $C$, which we may take to be an open ball around $p^\lambda$ in the metric $\rho$.

To summarize our results so far, we have proven that there exists an open set $C$ such that

$$C \subseteq B \subseteq D_\delta$$

and

$$H_q(p) > h + \gamma \ \text{ for every } p \in C, \qquad H_q(p) \le h \ \text{ for every } p \in W.$$

Now

$$P\big[\,\|P_n - p^0\| \ge \epsilon \;\big|\; \|GP_n - b\| \le \delta\,\big]
  = \frac{P[P_n \in W]}{P[P_n \in D_\delta]}
  \le \frac{P[P_n \in W]}{P[P_n \in C]},$$

where, for every $p$ in the set $S_n = \{p \in S : np_1, \ldots, np_k \text{ are integers}\}$ of possible values of $P_n$,

$$P[P_n = p] = \frac{n!}{(np_1)! \cdots (np_k)!}\, q_1^{np_1} \cdots q_k^{np_k}.$$

We will make use of the inequality

$$(n+1)^{-k} \prod_{j=1}^{k} p_j^{-np_j} \;\le\; \frac{n!}{(np_1)! \cdots (np_k)!} \;\le\; \prod_{j=1}^{k} p_j^{-np_j}, \tag{16}$$

valid for $p \in S_n$, where we define $0^0 = 1$ in agreement with the earlier convention $0 \log 0 = 0$. The inequality (16) is easily established from the Stirling formula. Then, combining (16) with the expression for $P[P_n = p]$ and using the bounds $H_q(p) \le h$ on $W$ and $H_q(p) > h + \gamma$ on $C$,

$$P[P_n \in W] \;\le\; \#[S_n \cap W]\, e^{nh}, \qquad
  P[P_n \in C] \;\ge\; (n+1)^{-k}\, \#[S_n \cap C]\, e^{n(h+\gamma)},$$

and therefore

$$P\big[\,\|P_n - p^0\| \ge \epsilon \;\big|\; \|GP_n - b\| \le \delta\,\big]
  \;\le\; (n+1)^{k}\, e^{-n\gamma}\, \frac{\#[S_n \cap W]}{\#[S_n \cap C]}, \tag{17}$$

where $\#[Z]$ denotes the number of elements of a finite set $Z$. Now

$$\frac{\#[S_n \cap W]}{\#[S_n \cap C]} \;\le\; \frac{\#[S_n]}{\#[S_n \cap C]},$$

and the right-hand side converges with $n \to \infty$ to the finite limit $\nu(S)/\nu(C)$, where $\nu(S)$, $\nu(C)$ are the volumes of $S$, $C$, respectively, by the $(k-1)$-dimensional Lebesgue measure, and $\nu(C) > 0$ because $C$ is a nonempty open set. Since $(n+1)^{k} e^{-n\gamma} \to 0$ as $n \to \infty$, the right-hand side of Eq. (17) converges to zero as $n \to \infty$, and consequently

$$P\big[\,\|P_n - p^0\| \ge \epsilon \;\big|\; \|GP_n - b\| \le \delta\,\big] \;\to\; 0 \qquad \text{as } n \to \infty,$$

which completes the proof.
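As a quick numerical sanity check (added here, not part of the original proof), the two-sided bound (16) can be verified directly for small count vectors by computing the multinomial coefficient exactly; the count vectors below are arbitrary examples.

```python
from math import factorial, prod

def check_bound(counts):
    """Verify (n+1)^(-k) * prod p_j^(-n p_j) <= n!/prod c_j! <= prod p_j^(-n p_j)
    for the frequency vector p = counts / n, with the convention 0**0 = 1."""
    n, k = sum(counts), len(counts)
    multinomial = factorial(n) // prod(factorial(c) for c in counts)
    # prod_j p_j^(-c_j); a count of zero contributes a factor 1 (0**0 = 1).
    upper = prod((n / c) ** c for c in counts if c > 0)
    lower = upper / (n + 1) ** k
    return lower <= multinomial <= upper

# A few arbitrary count vectors (each defines p in S_n for its own n).
for counts in [(3, 2, 5), (0, 4, 4, 2), (1, 1, 1, 1, 1), (10, 0, 0)]:
    print(counts, check_bound(counts))
```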

References

  1. Gibbs, J.W. (1902). Elementary Principles in Statistical Mechanics. New Haven, CT: Yale University Press.
  2. Jaynes, E.T. (1967). "Foundations of Probability Theory and Statistical Mechanics." In Delaware Seminar in the Foundations of Physics (Ed., M. Bunge). Berlin: Springer.
  3. Jaynes, E.T. (1968). "Prior Probabilities." IEEE Transactions on Systems Science and Cybernetics, 4, 227–241.