Chapter 3
Coupling from the Past
Life can only be understood backwards; but it must be lived forwards.
Søren Kierkegaard
Coupling from the past (CFTP) allowed the creation of perfect simulation algorithms for problems where only Markov chain methods existed before. The easiest way to use this approach is to take advantage of monotonicity in the update function for a Markov chain. However, even for nonmonotonic problems, the use of bounding chains can combine with CFTP to obtain perfect samples. Basic CFTP does have drawbacks, though, the two biggest being noninterruptibility and the fact that the random choices made by the algorithm must be used twice.
3.1 What is a coupling?
Coupling from the past, hereafter referred to as CFTP, is a method that allows ideas from Markov chain theory to be used to create perfect simulation algorithms. When introduced by Propp and Wilson [112] in the mid 1990s, it greatly extended the applicability of perfect simulation, and made efficient simulation of problems such as the Ising model feasible for the first time.
The idea is deceptively simple, and begins with the notion of an update function.
Definition 3.1. Say that φ : Ω × [0,1] → Ω is an update function for a Markov chain {X_t} if for U ~ Unif([0,1]), [X_{t+1} | X_t] ~ φ(X_t, U).
The function φ is deterministic—all of the randomness is contained in the value of U. Any Markov chain that can be simulated on a computer is an example of an update function, so creating an update function representation of a Markov chain is done every time a Markov chain is actually simulated. Update functions for a given transition kernel are not unique. In fact, a given Markov chain usually has an infinite number of different update functions that result in the same transition kernel.
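For instance, the following Python sketch exhibits two distinct update functions with the same transition kernel. The chain (a simple random walk with partially reflecting boundaries on {0,1,2}) anticipates Example 3.1 below; the function names and the grid-based kernel check are our own illustration, not from the text.

```python
import itertools

# Reflecting simple random walk on {0, 1, 2}.

def phi_a(x, u):
    # Step up when u > 1/2, down when u <= 1/2, respecting the boundaries.
    return x + (x < 2 and u > 0.5) - (x > 0 and u <= 0.5)

def phi_b(x, u):
    # A different update function: the two halves of [0,1] swap roles.
    return phi_a(x, 1.0 - u)

def kernel(phi, x, y, n=10_000):
    # Approximate P(x -> y) by averaging the indicator over a fine grid of u.
    return sum(phi(x, (i + 0.5) / n) == y for i in range(n)) / n

for x, y in itertools.product(range(3), repeat=2):
    assert abs(kernel(phi_a, x, y) - kernel(phi_b, x, y)) < 1e-3
print("phi_a and phi_b induce the same transition kernel")
```

Here phi_b simply reverses which half of [0,1] triggers an up move, yet from any interior state both functions step up or down with probability 1/2, so the kernel is unchanged.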
Given an iid sequence U_0, U_1, U_2, ..., the entire process can be simulated by letting X_1 = φ(x_0, U_0), and for i > 1, X_i = φ(X_{i−1}, U_{i−1}). Let U denote the entire sequence, then let φ_t(x_0, U) = φ(φ(···(φ(x_0, U_0), U_1), ···), U_{t−1}) be the state after t steps in the chain starting from x_0.
Suppose x_0 and y_0 are any two states in Ω. Let X_t = φ_t(x_0, U) and Y_t = φ_t(y_0, U)
Figure 3.1 A completely coupled set of processes. X (represented by squares) and Y (represented by circles) are simple symmetric random walks with partially reflecting boundaries on Ω = {0,1,2}. They have coupled at time t = 3.
(using the same values of U). Then {X_t} and {Y_t} are both instances of the Markov chain, just with (possibly) different starting points x_0 and y_0.
A Markov chain can be created for every possible starting state x ∈ Ω using the U values. A family of Markov chains that share a common kernel is known as a coupling.
Note that with this coupling, if X_t = Y_t = x for some t ≥ 0, then X_{t′} = Y_{t′} for all t′ ≥ t. Just as two train cars that are coupled together stay together, two processes at the same state remain together no matter how much time passes.
An update function and the use of common U_t values gives a coupling that applies to the family of Markov chains started at every possible state in the set Ω, so this is also sometimes called a complete coupling [51].
Definition 3.2. Let S be a collection of stochastic processes defined over a common index set I and common outcome space Ω. If there is an index i ∈ I and state x ∈ Ω such that for all S ∈ S, S_i = x, then say that the stochastic processes have coupled or coalesced.
Example 3.1. Consider the simple symmetric random walk with partially reflecting boundaries on Ω = {0,1,2}. This Markov chain takes a step to the right (or left) with probability 1/2, unless the move would take the state outside of Ω. An update function for this chain is

φ(x,U) = x + 1(x < 2, U > 1/2) − 1(x > 0, U ≤ 1/2).   (3.1)
Figure 3.1 is a picture of two Markov chains, {X_t} and {Y_t}, where X_0 = 0 and Y_0 = 2. The first random choice is U_1 = 0.64..., so both chains attempt to increase by 1. The next random choice is U_2 = 0.234..., so both chains attempt to decrease by 1. The third random choice is U_3 = 0.100..., so again both chains attempt to decrease by one. At this point, both X_3 = Y_3 = 0. So the two chains have coupled at time 3.
In fact, we could go further, and consider a chain started in state 1 at time 0. The first move takes this chain up to state 2, then the second move brings it down to 1, and then the last move takes it to state 0. No matter where the chain started, after three moves the state of the chain will be state 0.
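This three-step coalescence can be replayed directly. Here is a minimal Python sketch (variable names our own) that drives one copy of the chain from each starting state using update function (3.1) and the three uniforms from the example:

```python
def phi(x, u):
    # Update function (3.1): step up if u > 1/2, down if u <= 1/2,
    # respecting the boundaries of {0, 1, 2}.
    return x + (x < 2 and u > 0.5) - (x > 0 and u <= 0.5)

# The three uniforms from the example: up, down, down.
us = [0.64, 0.234, 0.100]

# Complete coupling: one copy of the chain from every starting state,
# all driven by the same uniforms.
states = [0, 1, 2]
for u in us:
    states = [phi(x, u) for x in states]
print(states)  # prints [0, 0, 0]: all three copies have coalesced
```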
3.2 From the past
Propp and Wilson started with this notion of coupling through use of an update func-
tion, and showed how this could be used to obtain perfect samples from the target
distribution.
To see how this works, consider a stationary state X. Then for a fixed t, let Y = φ_t(X, U). Then Y is also stationary, because φ_t is the composition of t stationary updates.
The output of CFTP will always be Y. Let W = (U_1, U_2, ..., U_t) be uniform over [0,1]^t. Consider a measurable set A ⊆ [0,1]^t. Then either W ∈ A or W ∉ A. So

Y = φ_t(X, W) 1(W ∈ A) + φ_t(X, W) 1(W ∉ A).   (3.2)
Suppose we could find a set A such that, when W ∈ A, φ_t(X, W) does not depend on X. That is to say, whenever W ∈ A, it holds that φ_t(x, W) = φ_t(x′, W) for all x, x′ ∈ Ω. Such a Markov chain has completely forgotten its starting state through its random moves. If this happens, then there is no need to know the value of X in order to find the value of Y!
Recall the example of the simple symmetric random walk with partially reflecting boundaries on {0,1,2} illustrated in Figure 3.1. In that example

φ_3(0,W) = φ_3(1,W) = φ_3(2,W) = 0.   (3.3)

So φ_3({0,1,2}, W) = {0}.
So what is a valid choice of A here? Suppose that the first three steps were (up, down, down). This sequence of moves corresponds to the set of values A_1 = (1/2,1] × [0,1/2] × [0,1/2]. If W ∈ A_1, then φ_3({0,1,2}, W) = {0}.
Another set of moves that brings all starting states together is (down, down, down). Here A_2 = [0,1/2] × [0,1/2] × [0,1/2]. Of course, if A_1 and A_2 are both valid, so is A_3 = A_1 ∪ A_2. So φ_3({0,1,2}, W) = {0} for all W ∈ A_3. The point is that the set A does not have to be exactly the set of moves that coalesce down to a single point. However, the bigger A is, the more likely it will be that W ∈ A.
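These claims about A_1 and A_2 can be checked mechanically. Since the update function only looks at whether each coordinate of W is above or below 1/2, one representative point per box suffices; a short Python sketch (names our own):

```python
def phi(x, u):
    # Update function (3.1) for the reflecting random walk on {0, 1, 2}.
    return x + (x < 2 and u > 0.5) - (x > 0 and u <= 0.5)

def phi3(x, w):
    # Three steps of the chain driven by w = (w0, w1, w2).
    for u in w:
        x = phi(x, u)
    return x

# phi only depends on which half of [0,1] each coordinate falls in, so one
# representative point per interval checks the whole box.
LO, HI = 0.25, 0.75        # stand-ins for [0, 1/2] and (1/2, 1]

A1 = (HI, LO, LO)          # moves (up, down, down)
A2 = (LO, LO, LO)          # moves (down, down, down)

for w in (A1, A2):
    assert {phi3(x, w) for x in (0, 1, 2)} == {0}
print("every W in A1 or A2 collapses {0,1,2} to {0}")
```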
Back to Equation (3.2). In the first term, the value of X does not matter, so φ_t(X, W) 1(W ∈ A) = φ_t(x_0, W) 1(W ∈ A), where x_0 is an arbitrary element of the state space. That is,

Y = φ_t(x_0, W) 1(W ∈ A) + φ_t(X, W) 1(W ∉ A).   (3.4)
So when W ∈ A, there is no need to figure out what X is in order to compute Y. Okay, that is all well and good, but what happens when W ∉ A? Then it is necessary to evaluate φ_t(X, W) to obtain Y. The central idea behind CFTP is to find X ~ π by recursively calling CFTP (that is the "from the past" part of the algorithm), and then evaluate Y = φ_t(X, W) as before.
To see how this works in practice, it helps to alter our notation somewhat by adding a time index. Let Y_0 = Y, W_0 = W, and Y_{−1} = X. With this notation,

Y_0 = φ_t(x_0, W_0) 1(W_0 ∈ A) + φ_t(Y_{−1}, W_0) 1(W_0 ∉ A).   (3.5)
So if W_0 ∈ A, we are done. Otherwise, it is necessary to draw Y_{−1}. But this is done in exactly the same way as drawing Y_0:
Y_{−1} = φ_t(x_0, W_{−1}) 1(W_{−1} ∈ A) + φ_t(Y_{−2}, W_{−1}) 1(W_{−1} ∉ A),   (3.6)

where W_{−1} ~ Unif([0,1]^t) and is independent of W_0.
In general,

Y_{−i} = φ_t(x_0, W_{−i}) 1(W_{−i} ∈ A) + φ_t(Y_{−(i+1)}, W_{−i}) 1(W_{−i} ∉ A).   (3.7)
This can be taken back as far as necessary. The end result is an infinite sequence representation of a stationary state.
Theorem 3.1 (Coupling from the past). Suppose that φ is an update function for a Markov chain over Ω such that for U = (U_0, ..., U_{t−1}) ~ Unif([0,1]^t) the following holds.

1. For Y ~ π, φ(Y, U_0) ~ π.
2. There exists a set A ⊆ [0,1]^t such that P(U ∈ A) > 0 and φ_t(Ω, A) = {x} for some state x ∈ Ω.

Then for all x_0 ∈ Ω,

Y_0 = φ_t(x_0, U_0) 1(U_0 ∈ A)
    + φ_t(φ_t(x_0, U_{−1}), U_0) 1(U_0 ∉ A, U_{−1} ∈ A)
    + φ_t(φ_t(φ_t(x_0, U_{−2}), U_{−1}), U_0) 1(U_0 ∉ A, U_{−1} ∉ A, U_{−2} ∈ A) + ···

has Y_0 ~ π.
Proof. Fix x_0 ∈ Ω. Then the result follows from the Fundamental Theorem of perfect simulation (Theorem 1.1) using g(U) = φ_t(x_0, U), b(U) = 1(U ∈ A), and f(X, U) = φ_t(X, U).
The key insight of Propp and Wilson was that it is not necessary to evaluate every term in the sequence to find Y_0. We only need U_0, ..., U_{−T}, where U_{−T} ∈ A. As long as P(U ∈ A) > 0, T will be a geometric random variable with mean 1/P(U ∈ A). The following code accomplishes this task.
Coupling_from_the_past        Output: Y ~ π

1)  Draw U ← Unif([0,1]^t)
2)  If U ∈ A
3)      Return φ_t(x_0, U)  (where x_0 is an arbitrary element of Ω)
4)  Else
5)      Draw X ← Coupling_from_the_past
6)      Return φ_t(X, U)
This is a recursive algorithm; the algorithm is allowed to call itself in line 5. When the algorithm recursively calls itself, it does not pass the values of U as a parameter. The new call to Coupling_from_the_past generates in line 1 its own value of U that is independent of the choice of U in the calling function. This is important to do, as otherwise the Fundamental Theorem of Perfect Simulation would not apply.
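To make the recursion concrete, here is a minimal Python sketch specialized to the reflecting random walk on {0,1,2}, taking t = 2 steps per block with A = [0,1/2] × [0,1/2] (two "down" moves, which send every state to 0). The function names are our own; note that each recursive call draws its own fresh uniforms:

```python
import random

def phi(x, u):
    # Update function (3.1) for the reflecting random walk on {0, 1, 2}.
    return x + (x < 2 and u > 0.5) - (x > 0 and u <= 0.5)

def phi_t(x, w):
    # Run the chain forward through the whole block of uniforms w.
    for u in w:
        x = phi(x, u)
    return x

def in_A(w):
    # A = [0,1/2] x [0,1/2]: two "down" moves, which send every state to 0.
    return all(u <= 0.5 for u in w)

def cftp(t=2):
    # Line 1: draw fresh uniforms; never reuse the caller's U.
    w = [random.random() for _ in range(t)]
    if in_A(w):
        # Lines 2-3: all states have coalesced, so any x0 gives the same answer.
        return phi_t(0, w)
    # Lines 5-6: recurse for a stationary X, then run it forward through w.
    return phi_t(cftp(t), w)

# Usage: the output should be uniform over {0, 1, 2}.
random.seed(1)
counts = [0, 0, 0]
for _ in range(30_000):
    counts[cftp()] += 1
print([round(c / 30_000, 3) for c in counts])  # each close to 1/3
```

The number of recursive calls before some block lands in A is geometric with success probability P(U ∈ A) = 1/4, matching the discussion above.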
Table 3.1 For the simple symmetric random walk on {0,1,2}, this lists the possible outcomes and their probabilities for taking two steps in the chain given U ∉ A.

              Moves
state X   up-up   up-down   down-up
   0        2        0         1
   1        2        1         1
   2        2        1         2
Example 3.2. Use CFTP to draw uniformly from {0,1,2} by using the Markov chain with update function

φ(x,U) = x + 1(x < 2, U > 1/2) − 1(x > 0, U ≤ 1/2),

t = 2, and A = [0,1/2] × [0,1/2].

As noted earlier, φ_2({0,1,2}, U) = {0} whenever U ∈ A. So to use CFTP, first draw U. If U ∈ A, output 0 and quit. Otherwise, recursively call the algorithm in order to find X ~ π, then output φ_2(X, U). Note that because this second step is only executed when U ∉ A, the output of φ_2(X, U) in this second step is not stationary, but instead has the distribution of φ_2(X, U) conditioned on U ∉ A.
The theory guarantees that the output of this procedure has the correct distribution, but it can also be verified directly in this example.
What is the probability that the output of the algorithm is 0? The event U ∈ A occurs with probability (1/2)(1/2) = 1/4, so there is a 1/4 chance that the state is 0 after the first step, and a 3/4 chance that the algorithm continues to the next step. During this next step, the state X is a draw from the stationary distribution, and so has a one-third chance of being either 0, 1, or 2. Conditioned on U ∉ A, the uniforms could either result in the chain moving up-up, up-down, or down-up. Therefore there are nine possibilities to consider, the outcomes of which are collected in Table 3.1. For instance, if the state starts at 0 and the chain moves down-up, then moving down leaves the state at 0 and moving up brings the chain to 1, so the final state is 1.

Each of these outcomes occurs with probability 1/9: there is a 1/3 chance for each starting state and (conditioned on U ∉ A) a 1/3 chance for each pair of moves.
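The nine cases, and the resulting exact output distribution, can be enumerated in a few lines of Python (the representative uniforms and variable names are our own):

```python
from fractions import Fraction

def phi(x, u):
    # Update function (3.1) for the reflecting random walk on {0, 1, 2}.
    return x + (x < 2 and u > 0.5) - (x > 0 and u <= 0.5)

# Conditioned on U not in A = [0,1/2]^2, the move pair is uniformly one of
# (up, up), (up, down), (down, up).  Representative uniforms for each half:
UP, DOWN = 0.75, 0.25
move_pairs = {"up-up": (UP, UP), "up-down": (UP, DOWN), "down-up": (DOWN, UP)}

# Reproduce Table 3.1: final state for each starting state and move pair.
table = {x: [phi(phi(x, u1), u2) for (u1, u2) in move_pairs.values()]
         for x in (0, 1, 2)}
for x, row in table.items():
    print(x, row)

# Combine with P(U in A) = 1/4 to get the exact output distribution.
dist = {0: Fraction(1, 4), 1: Fraction(0), 2: Fraction(0)}
for x in (0, 1, 2):                       # X ~ pi is uniform over three states
    for (u1, u2) in move_pairs.values():  # each pair has prob 1/3 given U not in A
        dist[phi(phi(x, u1), u2)] += Fraction(3, 4) * Fraction(1, 3) * Fraction(1, 3)
print(dist)  # each state has probability exactly 1/3
```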
Therefore, the chance of ending in state 0 is

1/4 + (3/4) · (1/9) = 4/12 = 1/3.   (3.8)
The chance the algorithm ends in state 1 or 2 is

0 + (3/4) · (4/9) = 1/3.   (3.9)
So the output does have the correct distribution! Now consider the running time, as measured by the number of evaluations of the update function φ that are needed.