Combining these gives
\[
\mathrm{E}[K_{t+1} \mid K_t] = K_t + \frac{1}{n^2} - \sum_{r=0}^{K_t} \frac{K_t - r}{n^2}\,\gamma^{K_t+1-r}\left(1 - \gamma^{K_t+1-r}\right)
  \geq K_t + \frac{1}{n^2}\left(1 - \frac{\gamma^2}{(1-\gamma)^2}\right). \tag{4.19}
\]
So K_t is changing on average by a small amount α, which is positive when γ < 1/2 (the solution to γ^2/(1 − γ)^2 = 1). Standard methods for martingales with drift (see [31]) can then be used to show that the expected time for K_t to reach n is n/α, which gives the desired result.
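As a rough numeric illustration of this drift argument (not from the text), the bound in (4.19) can be evaluated directly; the function names below and the sample values of n and γ are arbitrary choices.

```python
# Illustrative sketch only: evaluate the drift lower bound alpha from (4.19)
# and the resulting heuristic estimate n / alpha of the expected time for
# K_t to reach n.  The parameter names n and gamma follow the text.

def drift_lower_bound(n, gamma):
    """alpha = (1/n^2) * (1 - gamma^2 / (1 - gamma)^2), positive for gamma < 1/2."""
    return (1.0 / n**2) * (1.0 - gamma**2 / (1.0 - gamma)**2)

def hitting_time_estimate(n, gamma):
    """Heuristic n / alpha for the expected time for K_t to reach n."""
    return n / drift_lower_bound(n, gamma)

if __name__ == "__main__":
    for gamma in (0.1, 0.3, 0.45):
        print(f"gamma={gamma}: about {hitting_time_estimate(50, gamma):.3g} steps")
```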
Interestingly, it is known that for any probability vector (p_1, ..., p_n), the MA1 rule has a lower long-run expected access time than the MTF rule [114], but Monte Carlo simulation is necessary to discover exactly how much lower it is.
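For instance, a minimal Monte Carlo comparison of the two rules might look like the following sketch; the list representation, the access-cost convention (1-based position), and the example probability vector are illustrative assumptions rather than anything specified in the text.

```python
import random

def average_access_cost(p, steps, rule, seed=0):
    """Estimate the long-run expected access cost (1-based position of the
    requested item) of a self-organizing list under the MTF or MA1 rule."""
    rng = random.Random(seed)
    n = len(p)
    lst = list(range(n))                    # current ordering of items 0..n-1
    total = 0
    for _ in range(steps):
        item = rng.choices(range(n), weights=p)[0]   # request item i w.p. p[i]
        pos = lst.index(item)
        total += pos + 1                             # cost = position found
        if rule == "MTF":
            lst.insert(0, lst.pop(pos))              # move the item to the front
        elif rule == "MA1" and pos > 0:
            lst[pos - 1], lst[pos] = lst[pos], lst[pos - 1]   # move it ahead one
    return total / steps

if __name__ == "__main__":
    p = [0.4, 0.25, 0.15, 0.1, 0.06, 0.04]           # illustrative probabilities
    print("MTF:", average_access_cost(p, 200_000, "MTF"))
    print("MA1:", average_access_cost(p, 200_000, "MA1"))
```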
4.6 Using simple coupling with bounding chains
For many problems, finding an initial bounding state is easy, but it is unclear how to move away from that initial state.
An approach to this type of situation is to partition the state space into pieces where in each piece, some information about the labeling is known. For instance, if a node v is labeled from {0, 1, 2}, then the set of configurations can be partitioned into three sets Ω_0, Ω_1, and Ω_2, where Ω_i consists of those configurations where node v is labeled i.
This gives the bounding chain some information to start with, and allows for every state within each Ω_i to coalesce down to a single state. That leaves a single state coming from each Ω_i, for three states total.
Now these three states can be run forward in time using an update function with the hope that these final three then coalesce down to a single state, allowing the application of CFTP.
4.6.1 Sink-free orientations
This technique was used to give the first polynomial time perfect simulation algorithm for sampling uniformly from the sink-free orientations of a graph. In an undirected graph G = (V, E), the edges are size two subsets of the set of nodes where order does not matter. The edges {i, j} and {j, i} are identical. To orient an edge is to give it one of two directions, either from i to j, or from j to i.

Definition 4.4. An orientation of a graph is a labeling where each edge {i, j} is labeled either (i, j) or (j, i). An edge labeled (i, j) is said to be directed from i to j. Call i the tail of the edge and j the head.
Definition 4.5. An orientation is said to have a sink at node i if every edge of the form {i, j} is oriented (j, i). An orientation where none of the nodes is a sink is said to be sink-free.
If the degree of a node is 1, then there is only one way to orient its single edge to avoid creating a sink. That node and edge can then be removed from consideration,
and the process repeated on any new degree 1 nodes created by the removal. So assume without loss of generality that all the nodes of the graph have degree at least 2.
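A minimal sketch of this degree-1 peeling step, assuming the graph is stored as a dict of adjacency sets; the function name and representation are illustrative, not from the text.

```python
def peel_degree_one(adj):
    """Repeatedly remove degree-1 nodes, recording the forced orientation of
    each removed node's single edge (it must point out of that node, or the
    node would be a sink).  adj maps node -> set of neighbors and is modified
    in place.  Returns {frozenset({u, v}): (u, v)} of forced orientations."""
    forced = {}
    stack = [v for v in adj if len(adj[v]) == 1]
    while stack:
        v = stack.pop()
        if len(adj.get(v, ())) != 1:
            continue                          # degree changed since v was queued
        (w,) = adj[v]
        forced[frozenset((v, w))] = (v, w)    # the edge must leave v
        adj[w].discard(v)
        del adj[v]
        if len(adj.get(w, ())) == 1:
            stack.append(w)
    # If any remaining node now has degree 0, no sink-free orientation exists.
    return forced
```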
Similarly, if a graph is disconnected, then each component will have to be oriented separately, which means that again without loss of generality it is possible to only consider connected graphs.
Consider the set of sink-free orientations of a graph. A simple Gibbs sampler chain chooses an edge uniformly at random, and then an orientation for the edge is chosen uniformly from among the orientations that do not create a sink. Pseudocode for this update function looks like this.
Sink_Free_Orientation_Gibbs_chain
   Input: x, {i, j} ∈ E, B ∈ {0, 1}   Output: x
1) If B = 1 then a ← (i, j), else a ← (j, i)
2) Let n_out ← #{w ≠ a(1) : x({a(2), w}) = (a(2), w)}
3) If n_out ≥ 1 then x({i, j}) ← a
In other words, if the chosen orientation for {i, j} leaves at least one other edge
leaving the head of the edge (so the change does not create a sink), then make the
change in orientation.
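In Python, the update might be sketched as follows, under some illustrative representation assumptions: an orientation x is a dict mapping each edge, stored as a frozenset {i, j}, to an ordered pair (tail, head), and adj maps each node to the set of its neighbors.

```python
def sfo_gibbs_update(x, adj, edge, B):
    """One step of the sink-free orientation Gibbs update sketched above.
    x: dict mapping frozenset({i, j}) -> (tail, head); modified in place.
    edge: tuple (i, j) naming the chosen edge; B: 0 or 1."""
    i, j = edge
    a = (i, j) if B == 1 else (j, i)
    tail, head = a
    # Count edges other than {i, j} that currently leave the proposed head.
    n_out = sum(1 for w in adj[head]
                if w != tail and x[frozenset((head, w))] == (head, w))
    if n_out >= 1:                      # the flip cannot create a sink at the head
        x[frozenset((i, j))] = a
    return x
```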
Bubley and Dyer showed that this chain mixes rapidly as long as the graph is connected and contains more than one cycle. More specifically, they showed that if two chains X and X' are passed through the pseudocode above using the same choice of edge and coin B, then after O(#E^3) steps there is a large chance that the two chains have come together.
Lemma 4.5 ([17]). Suppose e_1, e_2, ... are iid uniform over E, and B_1, B_2, ... are iid Bernoulli with mean 1/2. Then consider using the same update for X_t and Y_t, so X_{t+1} = Sink_Free_Orientation_Gibbs_chain(X_t, e_{t+1}, B_{t+1}) and also Y_{t+1} = Sink_Free_Orientation_Gibbs_chain(Y_t, e_{t+1}, B_{t+1}). Then no matter the values of X_0 and Y_0, P(X_t ≠ Y_t) ≤ (1/2)^{t/#E^3}.
Using the coupling lemma, this means that starting at about #E^3 steps, the total variation distance between the Markov chain state distribution and the stationary distribution starts declining exponentially.
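To see this coupling in action, one can run two copies with shared randomness and record the first time they agree; the following helper is an illustrative sketch that takes the single-edge update (for example, the hypothetical sfo_gibbs_update above) as a parameter.

```python
import random

def coupling_time(update, x, y, edges, adj, max_steps, seed=0):
    """Drive two states with the same edge choice and the same coin at every
    step (as in Lemma 4.5) and return the first time they agree, or None if
    they have not met within max_steps.  update(state, adj, edge, B) should
    modify state in place."""
    rng = random.Random(seed)
    for t in range(1, max_steps + 1):
        e = rng.choice(edges)           # e_t, uniform over E
        B = rng.randint(0, 1)           # B_t, a fair coin
        update(x, adj, e, B)
        update(y, adj, e, B)
        if x == y:
            return t
    return None
```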
So it is possible to bring two states together. However, to use CFTP it is necessary to bring all the states together. The problem is that for this chain, it is impossible to use the bounding chain technique directly. Each edge {i, j} is oriented either (i, j) or (j, i), so initially each bounding chain state is {(i, j), (j, i)}. But then taking steps in the Markov chain does not help: no matter what edge is chosen to flip or not flip, the fact that all the adjacent edges have unknown orientation means that the state of the edge will still be {(i, j), (j, i)} at the next step.
Here is the solution (a revision of an idea first presented in [64]): first bring all the states down to two states, then use the Bubley-Dyer coupling to bring those two states together.
Pick an edge {i, j}. Let Ω_1 consist of those orientations where {i, j} is oriented (i, j), and Ω_2 be those orientations with {i, j} oriented (j, i). Then the Gibbs chain
will choose uniformly at random any edge other than {i, j}, and then uniformly choose from the orientations of that edge that do not create a sink.
Since {i, j} can never be picked, a bounding chain for Ω_1 can now make progress, since the orientation of {i, j} is fixed at (i, j), so i will always be sink-free regardless of how its adjacent edges are oriented.
Similarly, at the same time these moves can make progress on a bounding chain for Ω_2. After Θ(#E^3) steps, it is likely that each of these two bounding chains has reduced the set of possible states to just one apiece. Then the regular coupling of Bubley and Dyer can be used to bring these two states together down to just one state, thereby allowing the use of CFTP.
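The whole two-phase reduction might be organized as in the sketch below. This is only an outline under the same representation assumptions as the earlier snippets (the Gibbs update was sketched above, and a bounding-chain update sketch follows the pseudocode further down); in actual CFTP these phases would be driven by stored randomness from fixed blocks of past time, a bookkeeping detail omitted here.

```python
import random

def two_state_reduction(bc_update, gibbs_update, adj, edges, fixed_edge,
                        phase1_steps, phase2_steps, seed=0):
    """Phase 1: run one bounding chain for each orientation of fixed_edge
    (never selecting fixed_edge itself).  Phase 2: if both bounding chains
    collapse, couple the two resulting orientations with shared-randomness
    Gibbs updates.  Returns the common state if everything coalesces, else None."""
    rng = random.Random(seed)
    i, j = fixed_edge
    other_edges = [e for e in edges if frozenset(e) != frozenset(fixed_edge)]
    bounds = []
    for orient in ((i, j), (j, i)):     # the two pieces Omega_1 and Omega_2
        y = {frozenset(e): {(e[0], e[1]), (e[1], e[0])} for e in edges}
        y[frozenset(fixed_edge)] = {orient}
        bounds.append(y)
    for _ in range(phase1_steps):
        e, B = rng.choice(other_edges), rng.randint(0, 1)
        for y in bounds:
            bc_update(y, adj, e, B)
    if any(len(s) > 1 for y in bounds for s in y.values()):
        return None                     # some edge is still unknown
    states = [{e: next(iter(s)) for e, s in y.items()} for y in bounds]
    for _ in range(phase2_steps):
        e, B = rng.choice(edges), rng.randint(0, 1)
        for x in states:
            gibbs_update(x, adj, e, B)
    return states[0] if states[0] == states[1] else None
```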
As with the regular Gibbs step, the input to the bounding chain will be the edge {i, j} whose orientation the chain is trying to change, and B the coin flip that determines if the orientation to be attempted is (i, j) or (j, i).
Suppose the proposed orientation is (i, j). Then this could make j a sink. But suppose that there is a k ≠ i such that y({j, k}) = {(j, k)}. Then it is known that j already has another edge leaving it, so it is okay to change the orientation to (i, j). On the other hand, if no other edge is known to be leaving j, and some edges have unknown orientation, then it is unknown whether the flip can be accomplished or not. This procedure is given by the following pseudocode.
Sink_Free_Orientation_bounding_chain
   Input: y, {i, j} ∈ E, B ∈ {0, 1}   Output: y
1) If B = 1 then a ← (i, j), else a ← (j, i)
2) Let n_out ← #{w ≠ a(1) : y({a(2), w}) = {(a(2), w)}}
3) Let n_unknown ← #{w ≠ a(1) : y({a(2), w}) = {(a(2), w), (w, a(2))}}
4) If n_out ≥ 1 then y({i, j}) ← {(a(1), a(2))}
5) Else if n_out = 0 and n_unknown > 0, y({i, j}) ← y({i, j}) ∪ {(a(1), a(2))}
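Under the same illustrative representation as before (y maps each frozenset edge to the set of orientations still possible for it), the bounding-chain update might be coded as the following sketch.

```python
def sfo_bounding_update(y, adj, edge, B):
    """One step of the sink-free orientation bounding chain sketched above.
    y: dict mapping frozenset({i, j}) -> set of still-possible orientations;
    modified in place.  edge: tuple (i, j); B: 0 or 1."""
    i, j = edge
    a = (i, j) if B == 1 else (j, i)
    tail, head = a
    others = [w for w in adj[head] if w != tail]
    n_out = sum(1 for w in others if y[frozenset((head, w))] == {(head, w)})
    n_unknown = sum(1 for w in others if len(y[frozenset((head, w))]) == 2)
    key = frozenset((i, j))
    if n_out >= 1:
        y[key] = {a}                    # the flip is certainly accepted
    elif n_unknown >= 1:
        y[key] = y[key] | {a}           # the flip might be accepted
    # otherwise every other edge is known to point into the head, so the
    # flip is certainly rejected and y({i, j}) is left unchanged
    return y
```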
Lemma 4.6. Suppose there exists an edge e such that #Y_0(e) = 1, e_1, e_2, ..., e_t are iid uniform over E \ {e}, and B_1, ..., B_t are iid uniform over {0, 1}. Then if Y_{t+1} = Sink_Free_Orientation_bounding_chain(Y_t, e_{t+1}, B_{t+1}), the chance that any edge e' exists with #Y_t(e') > 1 is at most
\[
\exp\left(-t\left(\frac{1}{2\#E^3} - \frac{1}{24\#E^4}\right)\right). \tag{4.20}
\]
In particular, to get the chance of any edge remaining unknown to be at most δ requires (2#E^3 + O(#E)) log(δ^{-1}) steps.
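As a small illustrative helper (using the rate exactly as it appears in (4.20)), the number of steps needed for a given failure probability δ can be computed as follows; the function name and example values are assumptions.

```python
import math

def steps_for_confidence(num_edges, delta):
    """Smallest t making the bound (4.20) at most delta."""
    rate = 1.0 / (2 * num_edges**3) - 1.0 / (24 * num_edges**4)
    return math.ceil(math.log(1.0 / delta) / rate)

# For example, steps_for_confidence(30, 1e-3) is roughly 2 * 30**3 * log(1000),
# in line with the (2 #E^3 + O(#E)) log(1/delta) estimate above.
```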
Proof. Let D_t = #{e : #Y_t(e) > 1}. Then D_0 = #E − 1. At step t, only the value at edge e_{t+1} can change, so |D_{t+1} − D_t| ≤ 1.
How can D_t increase? Suppose Y_t({i, j}) = {(i, j)}, and e_{t+1} = {i, j}. Then if B = 1 the edge just keeps its current orientation, so consider B = 0.
Then #Y_{t+1}({i, j}) = 2 only if there is at least one edge {i, w} with #Y_{t+1}({i, w}) > 1, and no other edge {i, w'} with Y_{t+1}({i, w'}) = {(i, w')}. That means that for the edge {i, w}, the edge {i, j} is the unique edge known to be leaving i, and this is the only move that can be made to increase D_t by 1.
But since edge {i, j} is known to be oriented (i, j), if edge {i, w} had been selected and attempted to be oriented (w, i), that would have been accepted, and D_t would be reduced by 1.
This argument means that the chance that D_t increases by 1 exactly equals the chance that D_t decreases by 1. Let that common chance that it goes up 1 or down 1 be p_t.
Then if f(a) = sin(a/#E), then
\[
\mathrm{E}[f(D_{t+1}) \mid \mathcal{F}_t] = f(D_t + 1)p_t + f(D_t - 1)p_t + f(D_t)(1 - 2p_t)
 = f(D_t)\left[1 - 2p_t\bigl(1 - \cos(1/\#E)\bigr)\right],
\]
where the last line follows from the sum of angles formula for sine. Since at least one edge is always known, there is at least a (1/2)(1/#E) chance of D_t changing (the 1/2 comes from the choice of B), hence 2p_t ≥ #E^{-1}. An easy induction (and the Taylor series bound on cosine) then gives
\[
\mathrm{E}[f(D_t)] \leq f(D_0)\left[1 - \#E^{-1}\left(\frac{\#E^{-2}}{2} - \frac{\#E^{-4}}{24}\right)\right]^t.
\]
Of course f(D_0) ≤ 1, and when E[f(D_t)] ≤ sin(γ/#E), by Markov's inequality, P(D_t > 0) ≤ γ. Using sin(1/#E) ≥ #E^{-1} − #E^{-3}/6 completes the proof.