2.7.2 Chernoff bounds as an AR algorithm
Chernoff bounds [23] give upper limits on the probability that a sum of random variables is greater (or smaller) than a set value. Let S_n = X_1 + ··· + X_n, where X_1, ..., X_n are iid; then the analogous sampling goal is to generate a draw from S_n conditioned on S_n ≥ a or S_n ≤ a.
Markov’s inequality can also be applied to the sum of random variables, but the bound only goes down as the inverse of n. Chernoff bounds, on the other hand, tend to decrease exponentially in n, thereby giving much closer matches to the actual behavior of the tails of S_n and a much faster way of generating samples from the tails.
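To make the difference concrete, here is a minimal Python sketch; the Exp(1) summands, the threshold α = 2, and the use of the variance-based (Chebyshev) bound as the inverse-of-n benchmark are choices made here for illustration and are not taken from the text. It prints a bound that decays like 1/n next to the numerically optimized bound of Lemma 2.11 below, which decays geometrically in n.

import math

# Illustrative choices (not from the text): X_i ~ Exp(1), so mu = 1,
# Var(X_i) = 1, and mgf_X(t) = 1/(1 - t) for t < 1.  Bound P(S_n >= alpha*n).
alpha = 2.0

def chebyshev_bound(n):
    # Markov applied to (S_n - n*mu)^2 (Chebyshev): decays only like 1/n.
    return n * 1.0 / (n * (alpha - 1.0)) ** 2

def chernoff_bound(n):
    # Minimize mgf_X(t)*exp(-t*alpha) over a grid of t in [0, 1), then
    # raise to the n-th power (the form of Lemma 2.11 below).
    base = min(math.exp(-t * alpha) / (1.0 - t)
               for t in [i / 1000.0 for i in range(1000)])
    return base ** n

for n in [1, 5, 10, 50]:
    print(n, chebyshev_bound(n), chernoff_bound(n))

For these choices the optimal twist is t = 1/2, so the Chernoff bound is (2/e)^n ≈ 0.736^n.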
Chernoff bounds can be viewed as an application of Markov’s inequality to the
moment-generating function of X.
Lemma 2.9 (Chernoff bounds). Consider a random variable X where E[e^{tX}] is finite for t ∈ [a, b] with a ≤ 0 ≤ b. Then for α ∈ R,

P(X ≥ α) ≤ mgf_X(t)/exp(tα) for all t ∈ [0, b],    (2.8)
P(X ≤ α) ≤ mgf_X(t)/exp(tα) for all t ∈ [a, 0].    (2.9)

To get the best possible bound on the right-hand side, minimize over the allowed values of t.
Proof. First P(X ≥ α) = P(tX ≥ tα) for any positive t. Next P(tX ≥ tα) = P(exp(tX) ≥ exp(tα)). Finally, apply Markov’s inequality to get P(exp(tX) ≥ exp(tα)) ≤ E[exp(tX)]/exp(tα). The result for P(X ≤ α) is shown similarly, using a negative t so that the first inequality reverses.
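As a quick illustration of the lemma (the Exp(1) example is supplied here and is not part of the text), suppose X ∼ Exp(1), so that mgf_X(t) = 1/(1 − t) for t ∈ [0, 1). Minimizing the right-hand side of (2.8) over t gives, for α ≥ 1,

P(X ≥ α) ≤ min over t ∈ [0, 1) of exp(−tα)/(1 − t) = α e^{1−α}, attained at t = 1 − 1/α,

which matches the exponential rate of the exact tail P(X ≥ α) = e^{−α} up to the factor αe.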
Now consider what happens when Chernoff bounds are used on a sum of inde-
pendent random variables. First a fact about moment-generating functions.
Lemma 2.10. For S_n = X_1 + ··· + X_n, where the {X_i} are iid with finite moment-generating functions, E[exp(tS_n)] = [E[exp(tX_i)]]^n.
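The lemma is stated without proof; the standard one-line justification (supplied here, not quoted from the text) uses independence:

E[exp(tS_n)] = E[exp(tX_1) ··· exp(tX_n)] = E[exp(tX_1)] ··· E[exp(tX_n)] = [E[exp(tX_i)]]^n,

where the second equality holds because the X_i are independent and the last because they are identically distributed.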
Applying this to Chernoff bounds immediately yields the following.
Lemma 2.11. For S_n = X_1 + ··· + X_n, where the {X_i} are iid and mgf_{X_i}(t) is finite for t ∈ [a, b],

P(S_n ≥ αn) ≤ [mgf_{X_i}(t)/exp(tα)]^n for all t ∈ [0, b],    (2.10)
P(S_n ≤ αn) ≤ [mgf_{X_i}(t)/exp(tα)]^n for all t ∈ [a, 0].    (2.11)
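As a numerical sanity check on (2.10), here is a small Python sketch; the fair-coin summands, the values α = 0.75 and n = 40, the grid search over t, and the Monte Carlo comparison are all choices made here for illustration and are not taken from the text.

import math
import random

# Illustrative choices (not from the text): X_i ~ Bernoulli(1/2), so
# mgf_X(t) = (1 + e^t)/2, which is finite for all t, and we bound
# P(S_n >= alpha*n) using (2.10).
alpha, n = 0.75, 40

def chernoff_bound():
    # Grid search approximating the minimization of mgf_X(t)*exp(-t*alpha)
    # over t >= 0, then raise to the n-th power.
    base = min((1.0 + math.exp(t)) / 2.0 * math.exp(-t * alpha)
               for t in [i / 100.0 for i in range(1000)])
    return base ** n

def monte_carlo(trials=200000):
    # Direct simulation of the tail probability for comparison.
    hits = sum(sum(random.random() < 0.5 for _ in range(n)) >= alpha * n
               for _ in range(trials))
    return hits / trials

print("Chernoff bound on P(S_n >= alpha n):", chernoff_bound())
print("Monte Carlo estimate:               ", monte_carlo())

The optimized bound comes out near 0.005, while the simulated tail probability is a few times smaller, consistent with (2.10) being an upper bound with the correct exponential rate.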
Say that X_i ∼ f_X. The goal is to use the moment-generating function idea that led to the Chernoff bound to obtain a better AR-type algorithm. To accomplish this goal, it must be possible to draw from the twisted density g_t(x) ∝ e^{tx} f_X(x). When t is large, this density pushes the probability in g_t out to larger values; when t is negative, it pushes the probability in g_t to more negative values.
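For a density where the twisting can be carried out in closed form (the standard normal example below is chosen here for illustration and does not appear in the text), completing the square shows that e^{tx} times the N(0, 1) density is proportional to the N(t, 1) density, so g_t can be sampled directly and the effect of t is simply to shift the mass.

import random

# Illustrative choice (not from the text): when f_X is the standard normal
# density, g_t(x) ∝ e^{tx} f_X(x) is the N(t, 1) density, so it can be
# sampled with a single shifted Gaussian draw.
def draw_twisted_normal(t):
    return random.gauss(t, 1.0)

for t in [-2.0, 0.0, 3.0]:
    xs = [draw_twisted_normal(t) for _ in range(100000)]
    print("t =", t, " empirical mean under g_t:", sum(xs) / len(xs))

Positive t shifts the draws toward larger values and negative t toward more negative values, matching the behavior described above.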
Let t > 0. For x ≥ a, g_t(x) = e^{tx} f_X(x) ≥ e^{ta} f_X(x). Therefore a draw from g_t