48 COUPLING FROM THE PAST
Lemma 3.1. The number of calls to φ needed by Coupling_from_the_past is at most t[2P(U ∈ A)^{−1} − 1].
Proof. In the call to the algorithm, if U ∉ A, then the recursive call in line 5 calls Coupling_from_the_past again. Each of these recursive calls is independent of the time before, so the number of calls is a geometric random variable with parameter P(U ∈ A). That makes the mean number of calls P(U ∈ A)^{−1}.
Line 3 is a call to φ_t, which requires t calls to φ. If line 6 is executed, then a second call to φ_t is needed. Therefore, the expected number of calls to φ in any level of the algorithm is t[1 + P(U ∉ A)] = t[2 − P(U ∈ A)]. Multiplying by the mean number of levels then gives an overall expected number of calls of t[2 − P(U ∈ A)] · P(U ∈ A)^{−1} = t[2P(U ∈ A)^{−1} − 1].
To keep the running time small, it is important that t be large enough that P(U ∈ A) is reasonably sized, but once that is achieved t should be kept as small as possible.
It is not always easy to determine what such a t should be. Fortunately, there is a way
around this problem.
3.2.1 Varying the block size
In building CFTP, a single-step update φ was composed with itself t times. Call this set of t steps a block.
Suppose there is a threshold value t∗ such that P(U ∈ A) = 0 if t < t∗. In practice, it is usually very difficult to compute t∗ explicitly; therefore, it is common to use blocks of increasing size to make sure that eventually t is greater than the threshold.
In general, there is no reason why the same number of steps needs to be used for each block. The statement and proof of Theorem 1.1 easily adapt to the more general situation where the functions b, g, and f change with each level of recursion.
Theorem 3.2 (Fundamental Theorem of perfect simulation (2nd form)). Suppose for U_1, U_2, ... independent and identically distributed, there are sequences of computable functions {b_t}, {g_t}, and {f_t} such that each b_t has range {0, 1} and ∏_{t=1}^∞ P(b_t(U) = 0) = 0. Let X be a random variable such that for all t:

X ∼ b_t(U)g_t(U) + (1 − b_t(U)) f_t(X, U).   (3.10)

Let T = inf{t : b_t(U_t) = 1}. Then

Y = f_1(f_2(··· f_{T−1}(g_T(U_T), U_{T−1}) ···, U_2), U_1)   (3.11)

has the same distribution as X and E[T] = 1/P(b(U) = 1).
Proof. The proof is nearly identical to that of Theorem 1.1.
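Read operationally, the theorem is a recipe: draw U_1, U_2, ... until the first level T with b_T(U_T) = 1, start from g_T(U_T), and fold the f_t back down to level 1. A minimal Python sketch of that recipe, where b, g, and f are assumed callables taking the level t (and, for f, the partial result) along with the uniform draw; all names here are illustrative, not from the text:

```python
import random

def perfect_sample(b, g, f, max_levels=10_000):
    """Evaluate Y = f_1(f_2(... f_{T-1}(g_T(U_T), U_{T-1}) ..., U_2), U_1),
    where T = inf{t : b_t(U_t) = 1} and the U_t are iid Unif([0,1])."""
    us = []
    while True:
        u = random.random()
        us.append(u)
        t = len(us)
        if b(t, u) == 1:        # level T reached: coalescence detected
            break
        if t >= max_levels:
            raise RuntimeError("no coalescence within max_levels")
    y = g(t, us[-1])            # start from g_T(U_T)
    for s in range(t - 1, 0, -1):
        y = f(s, y, us[s - 1])  # apply f_s with U_s, from level T-1 down to 1
    return y
```

For instance, with b_t(u) = 1(t ≥ 3), g ≡ 0, and f_s(x, u) = x + 1, the call returns 2 = f_1(f_2(g_3(U_3), U_2), U_1), independent of the random draws.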
Using this more general version of the fundamental theorem for CFTP gives the following result.
Theorem 3.3. Let t(0) < t(1) < t(2) < ··· be positive integers. Let W_0, W_1, ... be independent with W_i ∼ Unif([0,1]^{t(i)}), and let A_{t(i)} be a measurable subset of [0,1]^{t(i)} such that W_i ∈ A_{t(i)} implies φ_{t(i)}(Ω, W_i) = {y} for some single state y. Suppose ∏_{i=1}^∞ P(W_i ∉ A_{t(i)}) = 0. Then for any x_0 ∈ Ω,

Y_0 = φ_{t(0)}(x_0, W_0)1(W_0 ∈ A_{t(0)})
    + φ_{t(0)}(φ_{t(1)}(x_0, W_1), W_0)1(W_1 ∈ A_{t(1)})
    + φ_{t(0)}(φ_{t(1)}(φ_{t(2)}(x_0, W_2), W_1), W_0)1(W_2 ∈ A_{t(2)}) + ···

has Y_0 ∼ π.
A common choice is to make t(i) = 2t(i−1), so that the number of steps taken doubles at each step. If the initial t = t(0) is 1, then doubling each time reaches t∗ after only a logarithmic number of recursions. Moreover, the total work done in all of the blocks will be about 1 + 2 + 4 + ··· + t∗ = 2t∗ − 1.
Doubling_coupling_from_the_past   Input: t   Output: Y

1) Draw U ← Unif([0,1]^t)
2) If U ∈ A_t
3)     Return φ_t(·, U)
4) Else
5)     Draw X ← Doubling_coupling_from_the_past(2t)
6)     Return φ_t(X, U)
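The recursion above can be sketched directly in Python. The chain here is an illustrative stand-in (a lazy walk on {0,...,4} with reflecting barriers), and coalescence is checked by brute force over the small state space rather than through an explicit set A_t; the chain, update rule, and names are mine, not from the text:

```python
import random

STATES = range(5)  # toy state space {0,...,4}

def phi(x, u):
    """Single-step update phi: move up if u < 1/2, else down, with barriers."""
    return min(x + 1, 4) if u < 0.5 else max(x - 1, 0)

def phi_t(x, us):
    """phi_t: compose phi over an entire block of randomness us."""
    for u in us:
        x = phi(x, u)
    return x

def doubling_cftp(t=1):
    """Doubling_coupling_from_the_past: if the block of size t coalesces,
    return the common image; otherwise recurse with 2t, then push the
    recursively obtained state through this same block."""
    us = [random.random() for _ in range(t)]
    images = {phi_t(x, us) for x in STATES}   # brute-force coalescence test
    if len(images) == 1:                      # plays the role of U in A_t
        return images.pop()
    x = doubling_cftp(2 * t)                  # sample from further in the past
    return phi_t(x, us)                       # reuse the SAME us, as in line 6
```

Note that the randomness us drawn for a block is reused after the recursive call returns; drawing fresh randomness at that point would bias the output.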
It should be noted, however, that this doubling is only a rule of thumb. One could use t(i) = (3/2)^i, or 4^i, or just i, and the algorithm will still return a perfect sample, no matter how big t∗ is. Only the number of steps (on average) taken by the algorithm could change.
For a more sophisticated block size method designed to give greater concentration of the running time around the mean, see Section 11.6.
3.2.2 Recursion as history
There is another way of viewing CFTP that can make more intuitive sense, especially to those already familiar with Markov chain Monte Carlo. Suppose that a Markov chain is run forward in time, yielding X_0, X_1, X_2, .... Then as the number of steps grows towards infinity, Theorem 1.3 tells us that the state will converge towards stationarity.
Now suppose that the Markov chain steps are indexed not just by nonnegative integers, but by all integers. So the process is ..., X_{−2}, X_{−1}, X_0, X_1, X_2, ..., and it not only runs infinitely forward in time, but infinitely backwards in time as well. Then by time 0, intuitively the process has already been running for an "infinite" number of steps. Therefore X_0 should already be stationary.
This intuitive idea can be made precise using the idea of the reverse process for a Markov chain (see Section 5.2.1), but for now let us just follow where this intuition leads.
Suppose CFTP is run with t = 1. Then φ_t can be viewed as the step from X_{−1} to X_0. If coalescence occurs, the algorithm quits and returns X_0. Otherwise, recursion
occurs, and t is doubled to 2. These two steps can be viewed as the step from X_{−3} to X_{−2} and the step from X_{−2} to X_{−1}.
If coalescence occurs, then one more step in the Markov chain is taken, and recall this was the step from X_{−1} up to X_0. The two blocks together have moved the chain from step X_{−3} up to step X_0.
If coalescence does not occur, then now t = 4, and the steps X_{−7} → X_{−6} → X_{−5} → X_{−4} → X_{−3} are taken. Then as the recursion unpacks, the two steps X_{−3} → X_{−2} → X_{−1} are taken, finally followed by the step X_{−1} → X_0.
No matter how far back the recursion goes, the final output of the procedure (from this point of view) is X_0 ∼ π. This way of viewing CFTP becomes very important for designing perfect simulation algorithms for spatial point processes (see Chapter 7).
3.3 Monotonic CFTP
To use CFTP requires an update function that is stationary with respect to π, and a set A such that for U ∈ A, φ(Ω, U) = {x}. The stationarity requirement is usually easy to meet, as it can be done by constructing a Markov chain whose stationary distribution is π. The hard part is creating the sets A_{t(i)} together with a method for determining if U ∈ A_{t(i)}.
It is this difficulty that prevents CFTP from being used for every Monte Carlo application. In the remainder of this chapter, different methods for finding such A_{t(i)} will be discussed, together with applications. The first method considered is the use of a monotonic update function.
Monotonicity was used by Johnson [74] to detect the speed of convergence for
the Gibbs chain for the Ising model, although the term monotonicity was not used
there. Propp and Wilson [112] saw that this same idea could be used as the basis of a
CFTP-type algorithm for perfect simulation.
The idea is as follows. Suppose that the state space Ω admits a partial order ⪯. Recall Definition 1.24, which states that the binary relation ⪯ is a partial order if it is reflexive (so s ⪯ s), antisymmetric (so s ⪯ t and t ⪯ s gives s = t), and transitive (so s ⪯ t and t ⪯ r gives s ⪯ r). But unlike a total order (such as ≤ over ℝ), there could be elements s and t for which neither s ⪯ t nor t ⪯ s holds.
As an example, consider two Ising configurations x and y. Say that x ⪯ y if for all nodes v, x(v) ≤ y(v). So if there are four nodes, (−1, 1, −1, 1) precedes (1, 1, −1, 1). However, the states (1, −1, 1, 1) and (−1, 1, 1, 1) are incomparable; neither one precedes the other.
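The componentwise order is easy to state in code. A small sketch, with configurations as tuples of ±1 labels (the function name is mine):

```python
def precedes(x, y):
    """Componentwise partial order: x precedes y iff x(v) <= y(v) at every node v."""
    return all(xv <= yv for xv, yv in zip(x, y))

# A comparable pair on four nodes:
assert precedes((-1, 1, -1, 1), (1, 1, -1, 1))

# An incomparable pair: neither configuration precedes the other.
assert not precedes((1, -1, 1, 1), (-1, 1, 1, 1))
assert not precedes((-1, 1, 1, 1), (1, -1, 1, 1))
```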
Suppose further that Ω has a largest element x_max and a smallest element x_min. Continuing the Ising model example, the configuration of all −1 values is the smallest element, and the configuration of all 1 values is the largest element.
Definition 3.3. Let φ be an update function for a single step of a Markov chain. If φ has the property that for all x ⪯ y and u ∈ [0,1], it holds that φ(x, u) ⪯ φ(y, u), then call φ monotonic.
Consider running two Markov chains {X_t} and {Y_t} where X_0 = x_0 and Y_0 = y_0 with x_0 ⪯ y_0. Suppose at each step, X_{t+1} = φ(X_t, U_t) and Y_{t+1} = φ(Y_t, U_t), where U_1, U_2, ... is an iid stream of Unif([0,1]) random variables.
Then a simple induction gives that X_{t+1} ⪯ Y_{t+1} for all values of t. Moreover, suppose x_0 = x_min and y_0 = x_max. Then any other state w_0 satisfies x_0 ⪯ w_0 ⪯ y_0, and so building a W_t chain in similar fashion gives that X_t ⪯ W_t ⪯ Y_t for all t.
Suppose that X_t = Y_t. Since X_t ⪯ W_t ⪯ Y_t, antisymmetry of the partial order implies that X_t = W_t = Y_t as well. The starting state of W_t was arbitrary, so for any starting state w_0 ∈ Ω, φ_t(w_0, u) = X_t = Y_t.
In other words, if after taking a fixed number of steps the largest state chain and the smallest state chain (also referred to as the upper and lower chains) have reached the same state, every other state in the chain has been trapped in between them, and has also reached the same state. This gives rise to one of the most important CFTP variants, monotonic CFTP.
Monotonic_coupling_from_the_past   Input: t   Output: Y

1) Draw U ← Unif([0,1]^t)
2) Let X_t ← φ_t(x_max, U), and Y_t ← φ_t(x_min, U)
3) If X_t = Y_t
4)     Return X_t
5) Else
6)     Draw X ← Monotonic_coupling_from_the_past(2t)
7)     Return φ_t(X, U)
3.3.1 The Ising model
Consider the Ising_Gibbs update from Section 1.8.1. In this update, a node v was chosen uniformly at random, as was U ∼ Unif([0,1]). Then N_c was the number of neighbors of v labeled c. If U < exp(N_1 β)/[exp(N_1 β) + exp(N_{−1} β)], then the node was labeled 1, otherwise it was labeled −1.
In order to use this update for perfect simulation, it needs to be written as an update function. That is, the current state and the random choices become input parameters.
Ising_Gibbs   Input: x ∈ {−1,1}^V, u ∈ [0,1], v ∈ V   Output: x ∈ {−1,1}^V

1) Let N_1 be the number of neighbors of v labeled 1 in x
2) Let N_{−1} be the number of neighbors of v labeled −1 in x
3) If u < exp(N_1 β)/[exp(N_1 β) + exp(N_{−1} β)]
4)     x(v) ← 1
5) Else
6)     x(v) ← −1
When u ∼ Unif([0,1]) and v ∼ Unif(V), this is the random scan Gibbs update from Section 1.8.1.
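A direct transcription of Ising_Gibbs, assuming the graph is supplied as an adjacency list and the state as a dict from nodes to labels in {−1, 1} (these representation choices, and the parameter name beta, are mine):

```python
import math

def ising_gibbs(x, u, v, nbrs, beta):
    """Ising_Gibbs update at node v driven by u in [0,1]; modifies x in place."""
    n_plus = sum(1 for w in nbrs[v] if x[w] == 1)    # N_1
    n_minus = sum(1 for w in nbrs[v] if x[w] == -1)  # N_{-1}
    p_plus = math.exp(n_plus * beta) / (
        math.exp(n_plus * beta) + math.exp(n_minus * beta))
    x[v] = 1 if u < p_plus else -1
    return x
```

On a path 0-1-2 with both neighbors of node 1 labeled 1 and beta = 1, the threshold is e^2/(e^2 + 1) ≈ 0.88, so u = 0.5 relabels node 1 as 1 while u = 0.95 relabels it as −1.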
Fact 3.1. The Ising_Gibbs update is monotonic when β > 0.
Proof. Let x ⪯ y. Let v be any node, and let N_c(z) denote the number of neighbors of v labeled c in configuration z. Then any neighbor of v labeled 1 in x is also labeled 1 in y, so N_1(x) ≤ N_1(y). Similarly, N_{−1}(x) ≥ N_{−1}(y).
Let f(n_1, n_{−1}) = exp(n_1 β)/[exp(n_1 β) + exp(n_{−1} β)]. Since β > 0, f(n_1, n_{−1}) is increasing in n_1 and decreasing in n_{−1}. Hence if U < f(N_1(x), N_{−1}(x)), then certainly U < f(N_1(y), N_{−1}(y)). That is, if x(v) is changed to 1, then certainly y(v) is as well.
Similarly, if U > f(N_1(y), N_{−1}(y)), so that y(v) has its label changed to −1, then U > f(N_1(x), N_{−1}(x)) and x(v) is changed to −1 as well.
Therefore, no matter what the choice of v and U is in the update, if x ⪯ y then φ(x, v, U) ⪯ φ(y, v, U).
This means that monotonic CFTP can be used to generate a random sample from the stationary distribution.
Monotonic_Ising_Gibbs   Input: t   Output: X

1) Draw (U_1, ..., U_t) ← Unif([0,1]^t), draw (V_1, ..., V_t) ← Unif(V^t)
2) Let x_max ← (1, 1, ..., 1), x_min ← (−1, −1, ..., −1)
3) For i from 1 to t
4)     x_max ← Ising_Gibbs(x_max, U_i, V_i)
5)     x_min ← Ising_Gibbs(x_min, U_i, V_i)
6) If x_max ≠ x_min
7)     Draw x_max ← Monotonic_Ising_Gibbs(2t)
8)     For i from 1 to t
9)         x_max ← Ising_Gibbs(x_max, U_i, V_i)
10) X ← x_max
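Combining the monotone update with the doubling recursion gives a runnable sketch of Monotonic_Ising_Gibbs. The graph (as an adjacency list) and inverse temperature beta are assumed inputs, the Ising update is repeated so the block is self-contained, and all names are mine:

```python
import math
import random

def ising_gibbs(x, u, v, nbrs, beta):
    """Gibbs update at node v driven by u; labels lie in {-1, +1}."""
    n_plus = sum(1 for w in nbrs[v] if x[w] == 1)
    n_minus = len(nbrs[v]) - n_plus
    p_plus = math.exp(n_plus * beta) / (
        math.exp(n_plus * beta) + math.exp(n_minus * beta))
    x[v] = 1 if u < p_plus else -1
    return x

def monotonic_ising_cftp(nbrs, beta, t=1):
    """Run the all-(+1) and all-(-1) chains with shared randomness (U_i, V_i).
    If they coincide, every chain sandwiched in between has coalesced too;
    otherwise recurse with 2t and replay this block on the returned state."""
    nodes = list(nbrs)
    us = [random.random() for _ in range(t)]
    vs = [random.choice(nodes) for _ in range(t)]
    x_max = {v: 1 for v in nodes}
    x_min = {v: -1 for v in nodes}
    for u, v in zip(us, vs):
        ising_gibbs(x_max, u, v, nbrs, beta)
        ising_gibbs(x_min, u, v, nbrs, beta)
    if x_max != x_min:
        x_max = monotonic_ising_cftp(nbrs, beta, 2 * t)  # go further into the past
        for u, v in zip(us, vs):                         # replay the same block
            ising_gibbs(x_max, u, v, nbrs, beta)
    return x_max
```

As in the pseudocode, the block's randomness (us, vs) is kept and replayed after the recursive call; this reuse is what makes the output exact rather than approximate.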
3.3.2 The hard-core gas model on bipartite graphs
Consider the HCGM_Gibbs update from Section 1.8.5. This update function is not monotonic using the simple partial order x ⪯ y ⟺ (∀v ∈ V)(x(v) ≤ y(v)).
Therefore either a new partial order or a new update is needed. The problem is that the chain is repulsive: neighbors in the graph cannot both be given the label 1. However, when the graph is bipartite, it is possible to build a new partial order under which the update is monotonic.
Definition 3.4. A graph is bipartite if the vertex set V can be partitioned into V_1 and V_2 so that for every edge e, one node of e belongs to V_1 and one node belongs to V_2.
For configurations on a bipartite graph, the partial order is similar to that of the previous section, but with the order reversed on V_2:

x ⪯ y ⟺ (∀v_1 ∈ V_1)(∀v_2 ∈ V_2)(x(v_1) ≤ y(v_1) ∧ x(v_2) ≥ y(v_2)).   (3.12)
Fact 3.2. HCGM_Gibbs is monotonic with respect to (3.12).
Proof. Suppose x ⪯ y and node i ∈ V_2 is chosen to update. If any neighbor w of i has x(w) = 1, then y(w) = 1 as well, and both x(i) and y(i) are set to 0. The only way there is a chance that y(i) is changed to 1 is if y(w) = 0 for all neighbors w of i. In this case all x(w) = 0 as well, so x(i) = y(i) = 1(U < λ/(λ + 1)), and x(i) ≥ y(i) is preserved.
When i ∈ V_1 the analysis is similar.
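The crossed order (3.12) and the hard-core update can be sketched as follows. Configurations map nodes to {0, 1}, lam stands for λ, and all names are illustrative:

```python
def hcgm_gibbs(x, u, v, nbrs, lam):
    """Hard-core Gibbs update at node v driven by u in [0,1]: v may only be
    labeled 1 when every neighbor is labeled 0."""
    if any(x[w] == 1 for w in nbrs[v]):
        x[v] = 0
    else:
        x[v] = 1 if u < lam / (lam + 1) else 0
    return x

def crossed_precedes(x, y, v1, v2):
    """x precedes y in the order (3.12): x <= y on V1 and x >= y on V2."""
    return (all(x[v] <= y[v] for v in v1)
            and all(x[v] >= y[v] for v in v2))
```

On a single edge {0, 1} with V_1 = {0} and V_2 = {1}, updating node 1 in two ordered configurations with the same u leaves them ordered, illustrating Fact 3.2.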