2.7.2 Chernoff bounds as an AR algorithm
Chernoff bounds [23] give upper limits on the probability that a sum of random variables is greater (or smaller) than a set value. Let $S_n = X_1 + \cdots + X_n$, where $X_1,\ldots,X_n$ are iid; then the analogous sampling goal is to generate a draw from $S_n$ conditioned on $S_n \geq a$ or $S_n \leq a$.
Markov's inequality can also be applied to the sum of random variables, but the bound only goes down as the inverse of $n$. Chernoff bounds, on the other hand, tend to decrease exponentially in $n$, thereby giving much closer matches to the actual behavior of the tails of $S_n$ and a much faster way of generating samples from the tails.
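As a quick numerical illustration (the parameters here are our own choices, not from the text), the following Python sketch compares the exact tail probability $P(S_n \geq \alpha n)$ for $S_n \sim \mathrm{Bin}(n,p)$ with the optimized Chernoff bound, using the optimal twist derived in Section 2.7.3:

from math import comb

p, alpha = 0.1, 0.5
for n in [10, 20, 40, 80]:
    a = n // 2                                  # threshold alpha*n with alpha = 0.5
    exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, n + 1))
    et = alpha * (1 - p) / (p * (1 - alpha))    # optimal e^t (Section 2.7.3)
    bound = ((et * p + (1 - p)) / et**alpha) ** n   # Lemma 2.11 right-hand side
    print(f"n={n:3d}  exact={exact:.3e}  Chernoff bound={bound:.3e}")

The bound is loose for any fixed $n$, but both the exact tail and the bound fall geometrically as $n$ grows.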
Chernoff bounds can be viewed as an application of Markov's inequality to the moment-generating function of $X$.
Lemma 2.9 (Chernoff bounds). Consider a random variable $X$ where $E[e^{tX}]$ is finite for $t \in [a,b]$, where $a \leq 0 \leq b$. Then for $\alpha \in \mathbb{R}$,

$P(X \geq \alpha) \leq \mathrm{mgf}_X(t)/\exp(t\alpha)$ for all $t \in [0,b]$,   (2.8)
$P(X \leq \alpha) \leq \mathrm{mgf}_X(t)/\exp(t\alpha)$ for all $t \in [a,0]$.   (2.9)

To get the best possible bound on the right-hand side, minimize over the allowed values of $t$.
Proof. First $P(X \geq \alpha) = P(tX \geq t\alpha)$ for any positive $t$. Next, $P(tX \geq t\alpha) = P(\exp(tX) \geq \exp(t\alpha))$. Finally, apply Markov's inequality to get $P(\exp(tX) \geq \exp(t\alpha)) \leq E[\exp(tX)]/\exp(t\alpha)$. The result for $P(X \leq \alpha)$ is shown similarly.
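For instance (a standard example, not from the text): for $X \sim \mathrm{N}(0,1)$, $\mathrm{mgf}_X(t) = e^{t^2/2}$ is finite for all $t$, so for $\alpha > 0$,
$$P(X \geq \alpha) \leq \min_{t \geq 0} e^{t^2/2 - t\alpha} = e^{-\alpha^2/2},$$
the minimum being attained at $t = \alpha$.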
Now consider what happens when Chernoff bounds are used on a sum of independent random variables. First, a fact about moment-generating functions.
Lemma 2.10. For $S_n = X_1 + \cdots + X_n$, where the $\{X_i\}$ are iid with finite moment-generating functions, $E[\exp(tS_n)] = (E[\exp(tX_i)])^n$. (This holds because $\exp(tS_n) = \prod_i \exp(tX_i)$, and the expectation of a product of independent random variables factors into the product of the expectations.)
Applying this to Chernoff bounds immediately yields the following.
Lemma 2.11. For $S_n = X_1 + \cdots + X_n$, where the $\{X_i\}$ are iid and $\mathrm{mgf}_{X_i}(t)$ is finite for $t \in [a,b]$,

$P(S_n \geq \alpha n) \leq [\mathrm{mgf}_{X_i}(t)/\exp(t\alpha)]^n$ for all $t \in [0,b]$,   (2.10)
$P(S_n \leq \alpha n) \leq [\mathrm{mgf}_{X_i}(t)/\exp(t\alpha)]^n$ for all $t \in [a,0]$.   (2.11)
Say that $X_i \sim f_X$. The goal is to use the moment-generating function idea that led to the Chernoff bound to obtain a better AR-type algorithm. To accomplish this goal, it must be possible to draw from the twisted density $g_t(x) \propto e^{tx} f_X(x)$. When $t$ is large and positive, the twist pushes the probability mass of $g_t$ out to larger values; when $t$ is negative, it pushes the mass of $g_t$ toward more negative values.
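As a concrete illustration (our example, not the text's): if $f_X$ is the $\mathrm{Exp}(\lambda)$ density $\lambda e^{-\lambda x}$, then for any $t < \lambda$,
$$g_t(x) \propto e^{tx} \lambda e^{-\lambda x} \propto (\lambda - t)e^{-(\lambda - t)x},$$
so the twisted density is simply $\mathrm{Exp}(\lambda - t)$: taking $t$ closer to $\lambda$ pushes the mass out to larger values, exactly as described.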
Let $t > 0$. For $x \geq a$, $g_t(x) \propto e^{tx} f_X(x) \geq e^{ta} f_X(x)$. Therefore a draw from $g_t$ can be accepted as a draw from $f_X$ conditioned to lie in $[a,\infty)$ with probability $e^{ta}/e^{tx}$. If the probability that $x$ is very much larger than $a$ is small, then this acceptance probability will be very close to 1. A similar method applies when $t < 0$.
AR_Chernoff_bound
  Input: density f_X, a, t
  Output: S_n = X_1 + ··· + X_n given S_n ≥ a (when t > 0) or S_n ≤ a (when t < 0)
  1) If t > 0, then A ← [a,∞), else A ← (−∞,a]
  2) Repeat
  3)     Draw X_1,...,X_n iid from the unnormalized density e^{tx} f(x)
  4)     S_n ← X_1 + ··· + X_n
  5)     Draw C ← Bern(exp(t(a − S_n)) · 1(S_n ∈ A))
  6) Until C = 1
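To make the steps concrete, here is a minimal Python sketch of AR_Chernoff_bound, specialized (our assumption, for illustration) to $X_i \sim \mathrm{Exp}(1)$, where the twisted density $\propto e^{tx}e^{-x}$ is simply $\mathrm{Exp}(1-t)$ for $0 < t < 1$; the function name and parameter choices are hypothetical:

import random
from math import exp

def ar_chernoff_exponential(n, a, t):
    """Draw S_n = X_1 + ... + X_n conditioned on S_n >= a, X_i iid Exp(1), 0 < t < 1."""
    while True:
        # Step 3: draw n iid variates from the twisted density, here Exp(1 - t)
        s = sum(random.expovariate(1 - t) for _ in range(n))
        # Steps 5-6: accept with probability exp(t(a - s)) * 1(s >= a)
        if s >= a and random.random() < exp(t * (a - s)):
            return s

# Example call: t = 0.5 makes the twisted mean n/(1 - t) = 20 match a
print(ar_chernoff_exponential(n=10, a=20.0, t=0.5))

Choosing $t$ so that the mean of the twisted density matches $a/n$ keeps the acceptance probability in step 5 close to 1.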
Fact 2.5. Suppose $\mathrm{mgf}_{X_i}(t)\exp(-ta/n) < 1$. Then AR_Chernoff_bound generates $S_n$ from $[S_n \mid S_n \geq a]$ when $t > 0$ or from $[S_n \mid S_n \leq a]$ when $t < 0$.
Proof. Consider any vector $(x_1,\ldots,x_n)$ such that $\sum_i x_i = s$. Let $X_1,\ldots,X_n$ be iid with density $f_X(x)$, and let $X'_1,\ldots,X'_n$ be iid with density proportional to $\exp(tx) f_X(x)$. Then the density of $S'_n = X'_1 + \cdots + X'_n$ is just the density of $S_n = X_1 + \cdots + X_n$ with an extra factor of $\exp(tx_1)\exp(tx_2)\cdots\exp(tx_n) = \exp(ts)$ (up to normalization).

Hence AR has generation density $g(s) \propto \exp(ts) f_{S_n}(s)$, and the target density is $f(s) = f_{S_n}(s)\,1(s \in A)$. Therefore $f(s)/g(s)$ is either 0 (when $s \notin A$) or proportional to $\exp(-ts)$ (when $s \in A$).

When $A = [a,\infty)$, then $t > 0$ and $\exp(-ts) \leq \exp(-ta)$. Similarly, when $A = (-\infty,a]$, then $t < 0$ and $\exp(-ts) \leq \exp(-ta)$. Hence $c = \exp(-ta)$ works, and $f(s)/[cg(s)] = \exp(ta)\exp(-ts)\,1(s \in A)$, as given in the algorithm.
2.7.3 Example: generating from the tails of a binomial random variable
Suppose $X$ is a binomial random variable: the number of successes in $n$ independent Bernoulli trials, each of which is a success with probability $p$. Write $X \sim \mathrm{Bin}(n,p)$. Then $E[X] = np$. For $a > np$, consider drawing samples from $[X \mid X \geq a]$.
The simplest way to draw a binomial is to draw $B_1,\ldots,B_n$ iid $\mathrm{Bern}(p)$, and then let $X = B_1 + \cdots + B_n$. Therefore this method works very well with the Chernoff bound approach.
The density of a Bernoulli is $p$ at 1 and $1-p$ at 0. Therefore, the mgf of $B \sim \mathrm{Bern}(p)$ is $\mathrm{mgf}_B(t) = e^t p + (1-p)$. Hence

$P(X \geq \alpha n) \leq [(e^t p + (1-p))/\exp(t\alpha)]^n$.   (2.12)

The value of $t$ that minimizes the right-hand side (giving the best bound) satisfies $e^t = \alpha(1-p)/[p(1-\alpha)]$.
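Filling in the calculus (a routine step, included here for completeness): differentiating the quantity inside the brackets with respect to $t$,
$$\frac{d}{dt}\left[(e^t p + (1-p))e^{-t\alpha}\right] = e^{-t\alpha}\left[e^t p - \alpha(e^t p + (1-p))\right],$$
which is zero exactly when $e^t p(1-\alpha) = \alpha(1-p)$, that is, when $e^t = \alpha(1-p)/[p(1-\alpha)]$.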
Hence the unnormalized density of the $B_i$ should be $e^t p = (1-p)\alpha/(1-\alpha)$ at 1 and $1-p$ at 0. These weights sum to $(1-p)/(1-\alpha)$, so normalizing gives $P(B_i = 1) = \alpha$ and $P(B_i = 0) = 1-\alpha$.
For drawing $[X \mid X \geq a]$, take $a = \alpha n$ so that $\alpha = a/n$, which gives the following algorithm.
AR_Chernoff_bound_binomial
  Input: n, p, a
  Output: [X | X ≥ a], where X ∼ Bin(n, p)
  1) γ ← (a/n)(1 − p)/[p(1 − (a/n))]
  2) Repeat
  3)     Draw B_1,...,B_n iid Bern(a/n)
  4)     X ← B_1 + ··· + B_n
  5)     Draw C ← Bern(γ^{a−X} · 1(X ≥ a))
  6) Until C = 1
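A minimal Python sketch of this routine (the function name is ours, not the text's):

import random

def ar_chernoff_binomial(n, p, a):
    """Draw [X | X >= a] for X ~ Bin(n, p), assuming np < a < n."""
    gamma = (a / n) * (1 - p) / (p * (1 - a / n))           # step 1
    while True:
        x = sum(random.random() < a / n for _ in range(n))  # steps 3-4: X ~ Bin(n, a/n)
        # steps 5-6: accept with probability gamma^(a - x) * 1(x >= a)
        if x >= a and random.random() < gamma ** (a - x):
            return x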
For example, when $n = 10$ and $p = 0.1$, consider drawing $[X \mid X > 5]$. The basic AR method of drawing $X$ until $X > 5$ requires on average over 6807.2 times through the repeat loop to generate one sample. The AR_Chernoff_bound_binomial routine (run with $a = 6$, since $X > 5$ is the same event as $X \geq 6$) requires on average about 3.7 times through the repeat loop to generate one conditional sample from $X$.
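These loop counts can be checked exactly (a quick verification sketch of ours, using the fact that the expected number of repeats in AR is the reciprocal of the per-loop acceptance probability):

from math import comb

n, p, a = 10, 0.1, 6    # [X | X > 5] is the event X >= 6
alpha = a / n
gamma = alpha * (1 - p) / (p * (1 - alpha))

# Basic AR: expected loops = 1 / P(X >= 6) for X ~ Bin(10, 0.1)
tail = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, n + 1))
print(1 / tail)         # prints roughly 6807.25

# Chernoff AR: expected loops = 1 / E[gamma^(a - X) 1(X >= a)], X ~ Bin(n, alpha)
accept = sum(comb(n, k) * alpha**k * (1 - alpha)**(n - k) * gamma**(a - k)
             for k in range(a, n + 1))
print(1 / accept)       # prints roughly 3.7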
2.8 Where AR fails
So why does our journey through perfect simulation not end here with the AR
method? Because for many problems of interest, the chance of acceptance goes down
exponentially with the size of the problem input.
Consider again the problem of generating a linear extension of a partial order by acceptance/rejection. Suppose there are an even number $n$ of elements, and the partial order is $1 \preceq 2$, $3 \preceq 4$, \ldots, $n-1 \preceq n$.

A permutation can be drawn uniformly at random by inserting item 1 into one location, then choosing 2 to be either before or after 1, then 3 to be in any of three positions, and so on. Then the chance that the partial order is respected by this random permutation is only $(1/2)^{n/2}$.
Or consider the Ising model with no magnetic field from earlier. The unnormalized density of a configuration has a minimum of 1 and a maximum of $\exp(\beta \#E)$. Therefore an acceptance/rejection-style algorithm draws a configuration uniformly from $\{-1,1\}^V$, and then accepts with probability $\exp(\beta H(x))/\exp(\beta \#E)$, where $H(x) = \sum_{\{i,j\} \in E} 1(x(i) = x(j))$. If the graph consists of an even number of nodes where the degree of each node is exactly 1, then each edge has $1(x(i) = x(j)) = 1$ with probability 1/2, independently of the others. For $\beta > 0$, the chance that $H(x) > (3/4)\#E$ declines exponentially in $\#E$, which in turn means that with high probability there is at most an $\exp(-\beta \#E/4)$ chance of accepting the result.
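In fact, for this degree-1 graph the acceptance probability can be computed in closed form (our calculation, complementing the bound above): each edge independently contributes a factor $(1 + e^{-\beta})/2$ to $E[\exp(\beta(H(x) - \#E))]$. A short sketch showing the exponential decay:

from math import exp

# Acceptance probability of naive AR for the Ising model on a graph where
# every node has degree exactly 1 (a perfect matching), no magnetic field:
# E[exp(beta * (H - #E))] = ((1 + exp(-beta)) / 2) ** num_edges
beta = 1.0
for num_edges in [10, 50, 100]:
    print(num_edges, ((1 + exp(-beta)) / 2) ** num_edges)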
It turns out that a more advanced form of AR for the Ising model exists that requires only a polynomial number of draws on average for a nontrivial range of $\beta$. This method is discussed in Chapter 9.