Now the proof of the theorem is easy.

*Proof of the RR theorem.* Let $A$ be any measurable set. Then
$$P(X_T \in A \mid T = t) = P(X_t \in A \mid X_0^{\rightarrow} \neq x_{\pi}^{\rightarrow}, \ldots, X_{t-1}^{\rightarrow} \neq x_{\pi}^{\rightarrow}, X_t^{\rightarrow} = x_{\pi}^{\rightarrow}) = K_I(x_{\pi}^{\rightarrow}, A) = \pi(A),$$
where the last two equalities hold because the design property makes the conditional distribution of the state, given the index history, depend only on the current index.
8.3.1 Simple symmetric random walk
Consider the simple symmetric random walk example. There $\Omega = \Omega^{\rightarrow} = \{1,2,\ldots,n\}$, and for $x^{\rightarrow} \in \Omega^{\rightarrow}$, $K_I(x^{\rightarrow}, \cdot) \sim \text{Unif}(\{1,2,\ldots,x^{\rightarrow}\})$. The target $\pi$ is $\text{Unif}(\{1,2,\ldots,n\})$, and $x_{\pi}^{\rightarrow} = n$. It is easy to sample from the starting index $x_0^{\rightarrow} = 1$, since $K_I(1,\cdot)$ puts all of its mass on state 1.
The bivariate kernel is a two-step procedure. First take one step in the simple random walk to get from state $x$ to $y$. Then, if $y > x^{\rightarrow} - 1$ or an independent $U \sim \text{Unif}([0,1])$ is at most $1/2$, let $y^{\rightarrow} = x^{\rightarrow} + 1$. Otherwise let $y^{\rightarrow} = x^{\rightarrow} - 1$.
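As a concrete illustration, here is a minimal sketch of this kernel in Python. The boundary convention for the walk (holding in place at 1) is an assumption made for the sketch, as are the function names.

```python
import random

def rr_ssrw_step(k, x, n):
    """One step of the bivariate kernel: move the configuration with a
    fair walk step, then update the index as described above."""
    # Walk step; holding at the boundary 1 is an assumed convention.
    y = max(min(x + random.choice([-1, 1]), n), 1)
    # A step landing above k - 1 always grows the index; otherwise a
    # fair coin decides between k + 1 and k - 1.
    if y > k - 1 or random.random() <= 0.5:
        return k + 1, y
    return k - 1, y

def rr_ssrw_sample(n):
    """Run RR from index 1 until the index reaches n; by the RR theorem
    the returned state is then an exact draw from Unif({1, ..., n})."""
    k, x = 1, 1
    while k < n:
        k, x = rr_ssrw_step(k, x, n)
    return x
```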
Fact 8.1. The design property holds for RR for ssrw.
*Proof.* Fix $x^{\rightarrow}, y^{\rightarrow} \in \{1,\ldots,n\}$ and $y \in \{1,\ldots,n\}$. There are several subcases to consider.

For instance, suppose $y^{\rightarrow} = x^{\rightarrow} + 1$ and $y = x^{\rightarrow} + 1$. Then the only way to end at $y$ is for $x = x^{\rightarrow}$, and a move to the right occurred. Note $K_I(x^{\rightarrow}, \{x\}) = 1/x^{\rightarrow}$ in this case, so
$$\int_{x \in \Omega} K_I(x^{\rightarrow}, dx)\, K((x^{\rightarrow},x),(dy^{\rightarrow},dy)) = \frac{1}{x^{\rightarrow}} \cdot \frac{1}{2}.$$
Now $K_I(y^{\rightarrow}, \{y\}) = 1/y^{\rightarrow} = 1/(x^{\rightarrow}+1)$ in this case, so it remains to compute $K^{\rightarrow}(x^{\rightarrow}, \{x^{\rightarrow}+1\})$.

For each $x \in \{1,2,\ldots,x^{\rightarrow}-2\}$, there will be a 50% chance of changing the index from $x^{\rightarrow}$ to $x^{\rightarrow}+1$. For $x \in \{x^{\rightarrow}-1, x^{\rightarrow}\}$, if the move is to the left, there is a 50% chance of increasing the index, but if the move is to the right, then the index always increases. Hence there is a 75% chance of an increase in these two cases. Therefore
$$K^{\rightarrow}(x^{\rightarrow}, \{x^{\rightarrow}+1\}) = \frac{x^{\rightarrow}-2}{x^{\rightarrow}} \cdot 0.5 + \frac{2}{x^{\rightarrow}} \cdot 0.75 = \frac{1}{2} \cdot \frac{x^{\rightarrow}+1}{x^{\rightarrow}}. \qquad (8.13)$$
Combining with $K_I(y^{\rightarrow},\{y\}) = 1/(x^{\rightarrow}+1)$ gives
$$K^{\rightarrow}(x^{\rightarrow},\{x^{\rightarrow}+1\})\, K_I(y^{\rightarrow},\{y\}) = \frac{1}{2} \cdot \frac{1}{x^{\rightarrow}}, \qquad (8.14)$$
which matches $\int_{x\in\Omega} K_I(x^{\rightarrow},dx)\, K((x^{\rightarrow},x),(dy^{\rightarrow},dy))$.
There are many other cases to consider, but they are all resolved in a similar fashion.
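The case analysis can also be checked mechanically. The following sketch enumerates the exact joint law of the new index and state for one small example and verifies the design property with exact rational arithmetic; the hold-at-1 boundary convention is again an assumption of the sketch.

```python
from collections import defaultdict
from fractions import Fraction

def joint_after_step(k, n):
    """Exact joint law of (new index, new state) from index k with the
    state uniform on {1, ..., k}; hold-at-1 boundary assumed."""
    joint = defaultdict(Fraction)
    for x in range(1, k + 1):              # x ~ Unif({1, ..., k})
        for step in (-1, 1):               # fair walk step
            y = max(min(x + step, n), 1)
            if y > k - 1:                  # index must increase
                joint[(k + 1, y)] += Fraction(1, 2 * k)
            else:                          # fair coin on the index
                joint[(k + 1, y)] += Fraction(1, 4 * k)
                joint[(k - 1, y)] += Fraction(1, 4 * k)
    return joint

# Conditioned on the new index j, the state must be Unif({1, ..., j}).
joint = joint_after_step(5, 10)
for j in (4, 6):
    total = sum(p for (i, _), p in joint.items() if i == j)
    assert all(joint[(j, y)] == total / j for y in range(1, j + 1))
```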
8.3.2 The Ising model
In the RR approach for Ising from Section 8.2, ingredient 1 is the set of configurations $\Omega = \{-1,1\}^V$, and ingredient 2 (the dual state space) consists of subsets of nodes (the frozen set) together with labels for each frozen node. The index kernel (ingredient 3) is just the Ising density conditioned on the values of the labels for the frozen nodes. The target distribution (ingredient 4) is the Ising density for the entire graph, and ingredient 5 is when no nodes are frozen. Ingredient 6 is when all nodes are frozen at label 1, and ingredient 7 is given by the RR for Ising procedure.
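In the proof below, the counts $f_{x(v)}$, $f_{-x(v)}$, $n_{x(v)}$, $n_{-x(v)}$ from RR for Ising appear repeatedly. A sketch of the natural way to compute such counts follows; the assumption here is that $f_c$ counts frozen neighbors of $v$ with label $c$ and $n_c$ counts unfrozen neighbors with label $c$, which may differ in detail from the procedure's exact definitions.

```python
def neighbor_counts(graph, frozen, x, v):
    """Neighbor counts at node v (illustrative definitions only).
    graph:  dict mapping each node to an iterable of its neighbors
    frozen: set of currently frozen nodes (x agrees with their labels)
    x:      dict mapping every node to a label in {-1, +1}"""
    c = x[v]
    f_same = sum(1 for w in graph[v] if w in frozen and x[w] == c)
    f_diff = sum(1 for w in graph[v] if w in frozen and x[w] == -c)
    n_same = sum(1 for w in graph[v] if w not in frozen and x[w] == c)
    n_diff = sum(1 for w in graph[v] if w not in frozen and x[w] == -c)
    return f_same, f_diff, n_same, n_diff
```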
Fact 8.2. The design property holds for RR for Ising.
*Proof.* Fix $x^{\rightarrow}$ and $y^{\rightarrow}$ as sets of frozen nodes together with their labels. Then either $y^{\rightarrow}$ has one fewer frozen node than $x^{\rightarrow}$, or it has more frozen nodes.

First suppose that $v$ is frozen in $x^{\rightarrow}$, but is unfrozen in $y^{\rightarrow}$. Let $y$ be a state whose frozen nodes in $y^{\rightarrow}$ have the correct labels. Then there is exactly one state $x$ such that $K((x^{\rightarrow},x),(y^{\rightarrow},y)) > 0$, namely, the same as $y$ but with the label of $v$ changed to what it was frozen to by $x^{\rightarrow}$.
Let $Z_{x^{\rightarrow}} = \sum_{x \text{ consistent with } x^{\rightarrow}} \exp(-H(x)\beta)$, and let $Z$, $f_{x(v)}$, $f_{-x(v)}$, $n_{x(v)}$, $n_{-x(v)}$ be as defined in RR for Ising. Suppose $y(v) = x(v)$. Then
$$K((x^{\rightarrow},x),(y^{\rightarrow},y)) = \exp(-H(x)\beta) Z_{x^{\rightarrow}}^{-1} \exp(f_{x(v)}\beta) Z^{-1} = \exp(-H(y)\beta) Z_{x^{\rightarrow}}^{-1} Z^{-1}.$$
Similarly, if $y(v) = -x(v)$, then
$$K((x^{\rightarrow},x),(y^{\rightarrow},y)) = \exp(-H(x)\beta) Z_{x^{\rightarrow}}^{-1} \exp((f_{-x(v)} + n_{-x(v)} - n_{x(v)})\beta) Z^{-1} = \exp(-H(y)\beta) Z_{x^{\rightarrow}}^{-1} Z^{-1}.$$
Hence the design property holds in this case with $K^{\rightarrow}(x^{\rightarrow},\{y^{\rightarrow}\}) = Z_{y^{\rightarrow}} Z_{x^{\rightarrow}}^{-1} Z^{-1}$.
Now suppose that $y^{\rightarrow}$ has more frozen nodes than $x^{\rightarrow}$. The only way this could have happened is if rejection occurred. For any $y$ consistent with $y^{\rightarrow}$, $x = y$ is the only state where moving from $(x^{\rightarrow},x)$ to $(y^{\rightarrow},y)$ is possible. In particular,
$$K((x^{\rightarrow},x),(y^{\rightarrow},y)) = \exp(-H(x)\beta) Z_{x^{\rightarrow}}^{-1}\, P(C = r) = \exp(-H(y)\beta) Z_{x^{\rightarrow}}^{-1}\, C(x^{\rightarrow},y^{\rightarrow}).$$
Note that $\exp(-H(x)\beta)/\exp(-H(y)\beta)$ consists only of factors arising from edges with one endpoint a neighbor of $v$ and the other endpoint a node frozen in $x^{\rightarrow}$, or edges both of whose endpoints are neighbors of $v$ that are unfrozen in $x^{\rightarrow}$.
Either way, these factors are recoverable from $x^{\rightarrow}$ and $y^{\rightarrow}$. Also, $P(C = r)$ is a function of $f_{x(v)}$, $f_{-x(v)}$, $n_{x(v)}$, $n_{-x(v)}$, all of which are recoverable from $x^{\rightarrow}$ and $y^{\rightarrow}$. Therefore all of these factors can be written as $C(x^{\rightarrow},y^{\rightarrow})$.
Since $Z_{x^{\rightarrow}}^{-1} C(x^{\rightarrow},y^{\rightarrow})$ is a function of $x^{\rightarrow}$ and $y^{\rightarrow}$, the design property holds in this case as well.
8.3.3 Using Metropolis-Hastings and the design property
One way to achieve the design property is to use a Metropolis-Hastings-like approach. Suppose that the current state is $(x^{\rightarrow},x)$, and the goal is to move to a new index $y^{\rightarrow}$. Then create a proposal state $Y$ for index $y^{\rightarrow}$ using a kernel $K_{\text{propose}}((x^{\rightarrow},x),\cdot)$, where for all measurable $A$, $K_I(y^{\rightarrow},A) > 0 \Rightarrow K_{\text{propose}}((x^{\rightarrow},x),A) > 0$.
To get the chance of proposing in state $A$, integrate over the possible values of $X_t$:
$$P(Y \in A \mid X_t^{\rightarrow} = x^{\rightarrow}) = \int_{x\in\Omega} K_I(x^{\rightarrow},dx)\, K_{\text{propose}}((x^{\rightarrow},x),A).$$
The goal is to have $Y \sim K_I(y^{\rightarrow},\cdot)$, so it is important that the right-hand side of the equation above be positive whenever $K_I(y^{\rightarrow},A) > 0$.
Definition 8.3. Say that measure $\nu_1$ is absolutely continuous with respect to $\nu_2$ if for all measurable $A$, $\nu_1(A) > 0 \Rightarrow \nu_2(A) > 0$.
Create a ratio similar to the Metropolis-Hastings ratio:
$$\rho(x^{\rightarrow},y^{\rightarrow},y) = \frac{K_I(y^{\rightarrow},dy)}{\int_{x\in\Omega} K_I(x^{\rightarrow},dx)\, K_{\text{propose}}((x^{\rightarrow},x),dy)}. \qquad (8.15)$$
Because the numerator is absolutely continuous with respect to the denominator, this ratio exists (formally, it is called a Radon-Nikodym derivative). Those familiar with the Metropolis-Hastings method for Markov chains know that to get reversibility, it is necessary to accept the move with probability equal to the minimum of 1 and the Metropolis-Hastings ratio. The ratio method for RR is not quite so forgiving.
For RR, it is necessary to accept with probability between 0 and 1, and so the ratio $\rho$ must be normalized. Let
$$M(x^{\rightarrow},y^{\rightarrow}) = \sup_{y\in\Omega} \rho(x^{\rightarrow},y^{\rightarrow},y), \qquad p_{\text{accept}}(x^{\rightarrow},y^{\rightarrow},y) = \frac{\rho(x^{\rightarrow},y^{\rightarrow},y)}{M(x^{\rightarrow},y^{\rightarrow})}.$$
Metropolis-Hastings RR then proceeds as follows:
1. Based on the current index $x^{\rightarrow}$, choose a new index $y^{\rightarrow}$ to attempt to move to.
2. Propose a new configuration $y$ for the new index using $K_{\text{propose}}((x^{\rightarrow},x),\cdot)$.
3. Accept this state and move to $(y^{\rightarrow},y)$ with probability $p_{\text{accept}}(x^{\rightarrow},y^{\rightarrow},y)$.
4. Otherwise reject, and move to an index $z^{\rightarrow}$ so that $K_I(z^{\rightarrow},\cdot)$ is the distribution of $x$ conditioned on rejection occurring.
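In code, one step of this scheme might look like the following sketch, where all of the problem-specific ingredients are passed in as functions (the names and signatures are illustrative, not fixed by the text):

```python
import random

def mh_rr_step(x_arrow, x, next_index, propose, rho, rho_max, reject_index):
    """One step of Metropolis-Hastings RR.
      next_index(x_arrow)               -> index y_arrow to attempt
      propose(x_arrow, x, y_arrow)      -> proposed configuration y
      rho(x_arrow, y_arrow, y)          -> the ratio in (8.15)
      rho_max(x_arrow, y_arrow)         -> M(x_arrow, y_arrow)
      reject_index(x_arrow, y_arrow, y) -> index z_arrow on rejection"""
    y_arrow = next_index(x_arrow)
    y = propose(x_arrow, x, y_arrow)
    if random.random() < rho(x_arrow, y_arrow, y) / rho_max(x_arrow, y_arrow):
        return y_arrow, y                        # accepted
    return reject_index(x_arrow, y_arrow, y), x  # rejected: recycle x
```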
Lemma 8.6. Metropolis-Hastings RR satisfies the design property.
*Proof.* What is $K^{\rightarrow}(x^{\rightarrow},\{y^{\rightarrow}\})$ for Metropolis-Hastings RR? In order to make the move, acceptance must occur, and so integrate over all possible values of $x$ given $X_t^{\rightarrow} = x^{\rightarrow}$, and all possible proposed states $y$:
$$K^{\rightarrow}(x^{\rightarrow},\{y^{\rightarrow}\}) = \int_{y\in\Omega}\int_{x\in\Omega} K_I(x^{\rightarrow},dx)\, K_{\text{propose}}((x^{\rightarrow},x),dy)\, \frac{\rho(x^{\rightarrow},y^{\rightarrow},y)}{M(x^{\rightarrow},y^{\rightarrow})} = \int_{y\in\Omega} \frac{K_I(y^{\rightarrow},dy)}{M(x^{\rightarrow},y^{\rightarrow})} = M(x^{\rightarrow},y^{\rightarrow})^{-1}.$$
On the other hand,
$$\int_{x\in\Omega} K_I(x^{\rightarrow},dx)\, K((x^{\rightarrow},x),(\{y^{\rightarrow}\},dy)) = \int_{x\in\Omega} K_I(x^{\rightarrow},dx)\, K_{\text{propose}}((x^{\rightarrow},x),dy)\, \frac{\rho(x^{\rightarrow},y^{\rightarrow},y)}{M(x^{\rightarrow},y^{\rightarrow})} = K_I(y^{\rightarrow},dy)\, M(x^{\rightarrow},y^{\rightarrow})^{-1} = K_I(y^{\rightarrow},dy)\, K^{\rightarrow}(x^{\rightarrow},\{y^{\rightarrow}\}),$$
so the design property is satisfied for the move from $x^{\rightarrow}$ to $y^{\rightarrow}$.
The proof for the move from $x^{\rightarrow}$ to $z^{\rightarrow}$ is similar.
The difficult part of applying RR is coming up with a good recycling procedure. One such procedure for the Ising model was given previously; the next section gives a very different RR approach to the same problem.
8.4 Edge-based RR
In the previous section the distribution was always the same, the standard Ising model distribution for the graph. A completely different way of using the index set is to let it change the distribution that the current configuration comes from.

For instance, suppose the index is not the set of frozen nodes, but rather a set of edges. If an edge is not in the index, that edge is effectively removed from the graph. This can be viewed as setting $g_{\{i,j\}}(c_1,c_2) = 1$ for all $c_1, c_2$, or equivalently, the values of the labels of the endpoints of an edge only affect the density if the edge is part of the index set $x^{\rightarrow}$.
Suppose that $x^{\rightarrow} = \emptyset$. Then $x^{\rightarrow}$ indicates that none of the edges are contributing to the Ising density, and so it is easy to sample from the correct distribution. Since there is no external magnetic field, the label for each node is independent, and uniformly distributed over $\{-1,1\}$.
Now suppose that an attempt is made to add an edge $\{i,j\}$ to $x^{\rightarrow}$ in order to get the next index $y^{\rightarrow}$. Note that the proposed state $y$ is the same as the original state. But $K_I(y^{\rightarrow},\{x\})$ might not equal $K_I(x^{\rightarrow},\{x\})$, depending on the values of $x(i)$ and $x(j)$. If $x(i) = x(j)$, then $K_I(y^{\rightarrow},\{x\})Z(y^{\rightarrow}) = \exp(\beta) K_I(x^{\rightarrow},\{x\})Z(x^{\rightarrow})$. If $x(i) \neq x(j)$, then $K_I(y^{\rightarrow},\{x\})Z(y^{\rightarrow}) = K_I(x^{\rightarrow},\{x\})Z(x^{\rightarrow})$. Using the Metropolis-Hastings RR approach from earlier:
$$\rho(x^{\rightarrow}, x^{\rightarrow}\cup\{\{i,j\}\}, y) = \exp(\beta\, 1(x(i) = x(j)))\, Z(x^{\rightarrow})/Z(y^{\rightarrow}).$$
Once again the nuisance factors $Z(x^{\rightarrow})/Z(y^{\rightarrow})$ always appear regardless of the configuration $x$, and so also appear in $M(x^{\rightarrow},y^{\rightarrow})$. That means these factors cancel out in the acceptance probability $p_{\text{accept}}$, and
$$p_{\text{accept}}(x^{\rightarrow}, x^{\rightarrow}\cup\{\{i,j\}\}, y) = \exp(-\beta\, 1(x(i) \neq x(j))).$$
When $x(i) = x(j)$, the acceptance probability is 1 and the new state is accepted. When $x(i) \neq x(j)$, the acceptance probability is $\exp(-\beta)$, and the possibility of rejection exists. Rejection means that $x(i) \neq x(j)$, and so the configuration no longer comes from the target distribution, but from the target distribution conditioned on $x(i) \neq x(j)$.
So if (for instance) $x(i) = 1$, then $x(j) = -1$, and there is a connected component of nodes labeled $-1$ containing $j$ that does not extend back to node $i$. The idea is to try to isolate these nodes from the rest of the sample by removing edges.
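A sketch of a single edge-addition attempt, using the acceptance probability just computed, follows; recording the second and third index components as return values is an illustrative choice.

```python
import math, random

def try_add_edge(beta, x, index_edges, i, j):
    """Attempt to add edge {i, j} to the first index component.
    Acceptance probability is exp(-beta * 1(x(i) != x(j))).  On
    rejection, i becomes the second component and {j} seeds the third,
    recording that x is now conditioned on x(i) != x(j)."""
    if x[i] == x[j] or random.random() < math.exp(-beta):
        index_edges.add(frozenset((i, j)))
        return None, set()             # accepted: no conditioning
    return i, {j}                      # rejected: condition recorded
```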
Removing an edge from the index is very similar to adding an edge. The proposed move is to index $y^{\rightarrow} = x^{\rightarrow} \setminus \{e\}$. If the endpoints of $e$ have the same label in $x$, then $K_I(x^{\rightarrow},\{x\}) \exp(-\beta) Z(x^{\rightarrow}) = K_I(y^{\rightarrow},\{x\}) Z(y^{\rightarrow})$. If they do not have the same label in $x$, then $K_I(x^{\rightarrow},\{x\}) Z(x^{\rightarrow}) = K_I(y^{\rightarrow},\{x\}) Z(y^{\rightarrow})$.
Hence, writing $e = \{i',j'\}$, $p_{\text{accept}}(x^{\rightarrow}, x^{\rightarrow}\setminus\{e\}, x) = \exp(-\beta\, 1(x(i') = x(j')))$. Now suppose this rejects. The only way this could happen is if $x(j') = x(i')$.
So either the edge is removed, or the state is conditioned on the endpoints of the edge being the same. At this point the edge can be removed anyway, since conditioned on the endpoints having the same value, the edge only changes the unnormalized density by the constant factor $\exp(\beta)$.
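A matching sketch for edge removal is below; that the far endpoint then joins the third component on rejection is an inference from the discussion here rather than an explicit statement of the text.

```python
import math, random

def try_remove_edge(beta, x, index_edges, third, e):
    """Attempt to remove index edge e, one endpoint of which lies in
    the third component.  Acceptance probability is exp(-beta) when
    the endpoint labels agree and 1 when they differ; the edge is
    removed either way."""
    u, w = tuple(e)
    if w in third:
        u, w = w, u                    # make u the third-component end
    index_edges.discard(e)
    if x[u] == x[w] and random.random() >= math.exp(-beta):
        third.add(w)                   # rejected: now x(w) = x(u)
```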
So the index actually consists of three components. The first component is the set of edges included in the graph. The second component is either empty, or consists of a single node of the graph. The third component consists of a set of nodes of the graph.

The configuration is a draw from the Ising model 1) using only the edges in the first component of the graph, conditioned so that 2) every node in the third component has the same label; and 3) every node in the third component has a different label from the node in the second component.
Now suppose that every edge adjacent to a node in the third component is absent from the first component. Then the nodes of the third component can be relabeled independently and uniformly from $\{-1,1\}$, and the configuration is no longer conditioned on the third-component nodes having the same label (which differs from that of the second-component node). So at this point the second (and third) components can be changed back to the empty set, and work can once again proceed on adding (rather than removing) edges to the first component.
In fact, the edge that we tried to add originally can now be added with probability 1: it is the only edge adjacent to node $j$ in $x^{\rightarrow}(1)$, so simply draw $x(j)$ equal to $x(i)$ with probability $\exp(\beta)/[1 + \exp(\beta)]$; otherwise it receives the opposite label.
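The recycling move just described might be sketched as follows, again with illustrative names:

```python
import math, random

def finish_round(beta, x, index_edges, i, j, third):
    """Recycle once no index edge touches the third component: relabel
    those nodes independently and uniformly, then add the original
    edge {i, j}, drawing x(j) from its conditional distribution given
    x(i) (valid since {i, j} is now j's only index edge)."""
    for v in third:
        x[v] = random.choice([-1, 1])
    p_same = math.exp(beta) / (1.0 + math.exp(beta))
    x[j] = x[i] if random.random() < p_same else -x[i]
    index_edges.add(frozenset((i, j)))
```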
Since every individual step in the chain follows Metropolis-Hastings RR, the final configuration will be a draw from the Ising model.
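Tying the sketches above together, one possible rendering of the overall loop is below; the worklist that re-adds removed edges is an implementation choice, not taken from the text.

```python
import random

def edge_based_rr_ising(graph_edges, beta):
    """Edge-based RR for the Ising model: start with no index edges and
    iid uniform labels, then add edges one at a time, recycling on
    rejection.  Uses try_add_edge, try_remove_edge, and finish_round
    from the sketches above."""
    nodes = {v for e in graph_edges for v in e}
    x = {v: random.choice([-1, 1]) for v in nodes}
    index_edges = set()
    to_add = [frozenset(e) for e in graph_edges]
    while to_add:
        i, j = tuple(to_add.pop())
        second, third = try_add_edge(beta, x, index_edges, i, j)
        while second is not None:
            touching = [f for f in index_edges if f & third]
            if touching:
                try_remove_edge(beta, x, index_edges, third, touching[0])
                to_add.append(touching[0])   # must be re-added later
            else:
                finish_round(beta, x, index_edges, i, j, third)
                second = None
    return x
```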