214 APPLICATIONS AND LIMITATIONS OF PERFECT SIMULATION
Lemma 11.8. Let Y denote the output of Tripled_concentrated_A(d). When Y ≠ ⊥, Y is a draw from the target distribution. Moreover,

P(Y = ⊥) ≤ (1/4)^{[2^{d−2}/q_{3/4}(T_A)^{ln(2)/ln(3)}] − 1}.   (11.28)
Proof. The fact that Y conditioned on Y ≠ ⊥ comes from the target distribution follows from induction on d. When d = 0, Y is the output from Bounded_time_A(1), and so [Y | Y ≠ ⊥] is just [Y | T_A ≤ 1], which from the interruptibility of A gives the target distribution.

For d ≥ 1, assume that it holds for d − 1. Then the output Y is either one of the draws from Tripled_concentrated_A(d − 1) or a draw from Bounded_time_A(3^{d−1}). Either way, the result (conditioned on Y ≠ ⊥) comes from the target distribution.
Let Y(d) denote the output of the algorithm with input d. Then the recursive structure of the algorithm gives

P(Y(d) = ⊥) = P(Y(d − 1) = ⊥)^{2} · P(T_A > 3^{d−1}) ≤ P(Y(d − 1) = ⊥)^{2} · (1/4)^{1(q_{3/4}(T_A) ≤ 3^{d−1})}.
So to get an upper bound on this probability, define p(d) recursively as

p(d) = p(d − 1)^{2} · (1/4)^{1(q_{3/4}(T_A) ≤ 3^{d−1})}   (11.29)

and p(0) = 1.
Then a simple induction gives P(Y(d) = ⊥) ≤ p(d) for all d. Note p(d) = (1/4)^{r(d)} for some nonnegative integer r(d). The recursion equation for p(d) then gives r(d) = 2r(d − 1) + 1(q_{3/4}(T_A) ≤ 3^{d−1}), r(0) = 0.
Therefore,

r(d) = 2^{n} + 2^{n−1} + ··· + 1 = 2^{n+1} − 1,   (11.30)

where n is the number of powers of three between 3^{d−1} and q_{3/4}(T_A). This is at least d − 2 − log_3(q_{3/4}(T_A)).
Since 2^{log_3(x)} = x^{ln(2)/ln(3)}, this gives r(d) ≥ [2^{d−2}/q_{3/4}(T_A)^{ln(2)/ln(3)}] − 1.
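As a sanity check on this argument, the recursion for r(d) can be compared numerically against the closed-form lower bound of the lemma. The Python sketch below is an illustration added here, not part of the book's algorithms; `q` stands in for the quantile q_{3/4}(T_A), and the function names are invented.

```python
import math

def r(d, q):
    # Recursion from the proof: r(d) = 2 r(d-1) + 1(q <= 3^(d-1)), r(0) = 0.
    val = 0
    for i in range(1, d + 1):
        val = 2 * val + (1 if q <= 3 ** (i - 1) else 0)
    return val

def r_lower_bound(d, q):
    # Closed form from Lemma 11.8: 2^(d-2) / q^(ln 2 / ln 3) - 1.
    return 2 ** (d - 2) / q ** (math.log(2) / math.log(3)) - 1

# The failure probability is then bounded as P(Y = bottom) <= (1/4)^r(d).
for q in (1, 2, 5, 27, 100):
    for d in range(0, 15):
        assert r(d, q) >= r_lower_bound(d, q)
```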
To actually be useful as an algorithm, it is necessary not to have to set d ahead of time, but to run the chain forward the proper number of steps. Fortunately, it is easy to develop the vector of times to give Bounded_time_A. In what follows, length(a) denotes the number of elements of the vector a, and [a b c] denotes the concatenation of vectors a, b, and c. (For example, if a = (3,4), b = (1), and c = (0,5), then [a b c] = (3,4,1,0,5).)
Then for d = 0, the first time to run Bounded_time_A is 1, so start with a = [1]. When d = 1, first the 1 time is used, then 1, then 3^{d−1} = 1. So now a = [1 1 1]. When d = 2, first the [1 1 1] times are used, then [1 1 1] again, and finally Bounded_time_A is run for 3^{d−1} = 3 steps. So a = [1 1 1 1 1 1 3]. At d = 3, a = [1 1 1 1 1 1 3 1 1 1 1 1 1 3 9], and so on. The vector a can be constructed in this fashion to have as many elements as needed.
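This append-and-triple construction is easy to state in code. The short Python sketch below (`times_vector` is a name invented here for illustration) reproduces the worked examples above.

```python
def times_vector(levels):
    # Start with a = [1]; at each level d, replace a with [a a 3^d].
    a = [1]
    for d in range(levels):
        a = a + a + [3 ** d]
    return a

# Matches the examples in the text.
assert times_vector(1) == [1, 1, 1]
assert times_vector(2) == [1, 1, 1, 1, 1, 1, 3]
assert times_vector(3) == [1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 3, 9]
```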
Forward_tripled_concentrated_A   Output: Y
1)  i ← 0, d ← 0, a ← [1], Y ← ⊥
2)  While Y = ⊥
3)    i ← i + 1
4)    If length(a) < i
5)      a ← [a a 3^{d}]
6)      d ← d + 1
7)    Y ← Bounded_time_A(a(i))
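In Python, the loop above might look as follows. Here `bounded_time_A` is a hypothetical stand-in for Bounded_time_A: it simulates an interruptible algorithm whose running time is geometric, returning None in place of ⊥ when the time budget t is exceeded. This is a sketch for illustration, not the book's implementation.

```python
import random

def bounded_time_A(t, rng):
    # Stand-in interruptible algorithm: the run length is geometric(1/2);
    # return a draw if it finishes within t steps, otherwise None (= bottom).
    steps = 1
    while rng.random() > 0.5:
        steps += 1
    return rng.random() if steps <= t else None

def forward_tripled_concentrated_A(rng):
    # Lazily extend the vector a via a <- [a a 3^d] whenever it runs out,
    # and keep calling bounded_time_A until a non-bottom draw appears.
    i, d, a, y = 0, 0, [1], None
    while y is None:
        i += 1
        if len(a) < i:
            a = a + a + [3 ** d]
            d += 1
        y = bounded_time_A(a[i - 1], rng)
    return y

sample = forward_tripled_concentrated_A(random.Random(42))
```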
Lemma 11.9. The output Y of Forward_tripled_concentrated_A comes from the target distribution. Let T be the total number of steps taken by the interruptible algorithm. Then for all k ≥ 1,

P(T > k·q_{3/4}(T_A)) ≤ 4 exp(−(1/4) ln(4) 3^{−ln(2)/ln(3)} k^{ln(2)/ln(3)}).   (11.31)
Proof. The output is always a draw from Bounded_time_A, and so even conditioned on the input, if Y ≠ ⊥, then Y is a draw from the target distribution by the interruptibility of A.

Let d be the largest integer such that 3^{d} ≤ k·q_{3/4}(T_A). Then from the last lemma,

P(T > k·q_{3/4}(T_A)) ≤ P(T > 3^{d})
  ≤ exp(−ln(4)([2^{d−2}/q_{3/4}(T_A)^{ln(2)/ln(3)}] − 1))
  = 4 exp(−(1/4) ln(4)[2^{d}/q_{3/4}(T_A)^{ln(2)/ln(3)}])
  = 4 exp(−(1/4) ln(4)[(3^{d}/q_{3/4}(T_A))^{ln(2)/ln(3)}]).

Since 3^{d} > k·q_{3/4}(T_A)/3, this gives

P(T > k·q_{3/4}(T_A)) < 4 exp(−(1/4) ln(4) 3^{−ln(2)/ln(3)} k^{ln(2)/ln(3)}),

which completes the proof.
11.6.2 Concentration for CFTP
While the algorithms above were written for interruptible algorithms, the same procedure can be applied to block lengths for CFTP. If the block lengths are generated by taking the vector a of length 3^{d} and replacing it with [a a 3^{d}] to get the next set of block lengths, nearly the same result as Lemma 11.9 holds. The only difference is that CFTP (being read twice) requires twice as many steps as a forward run algorithm. The subpolynomial tail, however, is exactly the same.
11.7 Relationship between sampling and approximation
What methods such as TPA and PPE show is that, given the ability to draw samples exactly from a distribution in polynomial time, it is usually possible to approximate the partition function of the distribution to any desired degree of accuracy in polynomial time.
When it is not possible to perfectly sample, but only approximate sampling is available, then it is still possible to use TPA and PPE, although the fact that samples are not coming exactly from the correct distribution has to be taken into account.
Now consider the opposite problem of sampling given the ability to compute the partition function. For instance, for the Ising model on planar graphs, it is possible to compute Z in O(n^{3}) time using the Pfaffian approach [78, 79, 80]. Consider a node v. If the partition function when x(v) = 1 (call it Z_{1}) and when x(v) = −1 (call it Z_{2}) are computed, then P(X(v) = 1) = Z_{1}/[Z_{1} + Z_{2}]. So it is possible to draw x(v) exactly from its marginal distribution.
Conditioned on the draw for x(v), the value of x(w) could then be drawn for some w ≠ v. Continue until all the labels have been decided. The point is that given a polynomial-time method for computing the partition function, there is a polynomial-time method for generating perfect samples.
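This node-by-node decomposition can be sketched generically. In the toy Python example below, a brute-force enumeration stands in for the polynomial-time partition-function oracle (for planar Ising, the Pfaffian computation would play that role); the function names and the tiny graph are invented for illustration.

```python
import itertools
import math
import random

def partition_function(n, edges, beta, fixed):
    # Brute-force Z: sum of exp(beta * sum_{(i,j) in E} x_i x_j) over all
    # spin vectors in {-1,+1}^n consistent with the partial assignment `fixed`.
    total = 0.0
    for x in itertools.product((-1, 1), repeat=n):
        if all(x[v] == s for v, s in fixed.items()):
            total += math.exp(beta * sum(x[i] * x[j] for i, j in edges))
    return total

def perfect_sample(n, edges, beta, rng):
    # Draw each spin from its exact conditional via partition function ratios:
    # P(X(v) = 1 | spins drawn so far) = Z_1 / (Z_1 + Z_2).
    fixed = {}
    for v in range(n):
        z1 = partition_function(n, edges, beta, {**fixed, v: 1})
        z2 = partition_function(n, edges, beta, {**fixed, v: -1})
        fixed[v] = 1 if rng.random() < z1 / (z1 + z2) else -1
    return [fixed[v] for v in range(n)]

# A perfect draw from the Ising model on a three-node path.
x = perfect_sample(3, [(0, 1), (1, 2)], 0.5, random.Random(7))
```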
An approximate method for computing the partition function will only yield an approximate sampling method. The result is that it is possible to approximate the partition function in polynomial time for these types of problems if and only if it is possible to approximately sample from the associated distribution.
If it is possible to perfectly sample from the distribution in polynomial time, it is possible to approximate the partition function in polynomial time. However, given the ability to approximate the partition function, it is not necessarily possible to perfectly simulate from the distribution.
11.8 Limits on perfect simulation
Perfect simulation algorithms have been found for a wide variety of distributions,
but not all. Moreover, as more and more algorithms have been discovered, a pattern
has emerged. Distributions where the label of a node only depends on immediate
neighbors, and where there is a chance of being able to ignore the neighbors, are the
most easily handled by perfect simulation protocols. Such high noise models have
long been used to get approximate sampling methods using Markov chain Monte
Carlo. Today these same models can be sampled from exactly.
Statistical models in particular tend to fall into this category, as they often do not
wish to restrict the outcome too severely, instead giving the data a chance to show
where the model is incomplete or incorrect.
However, in statistical physics, a different story emerges. Here models (such as
the Ising model) exhibit phase transitions, where the character of the distribution
changes sharply at a particular value of the parameters for the model. It is not sur-
prising, then, that the perfect simulation algorithms also display these types of phase
transitions, provably running in polynomial time only for restricted values of the
parameters.
Recent work such as that of Sly and Sun [118] has made this comparison with the phase transition explicit. Consider the hard-core gas model from Definition 1.20.
Recall that in this model on a finite graph, the weight of a configuration x in {0,1}^{V} is λ^{∑_{v∈V} x(v)} if no two nodes labeled 1 are adjacent to each other, and zero otherwise.
In an infinite graph, it is possible to define the Gibbs measure for the hard-core gas model by specifying the distribution at a single node conditioned on the neighbors.
Definition 11.5. The hard-core Gibbs measure has a specification given by

P(X(v) = 1 | X(V − v)) = [λ/(1 + λ)] · ∏_{w:{v,w}∈E} (1 − x(w)),
P(X(v) = 0 | X(V − v)) = 1 − P(X(v) = 1 | X(V − v)).
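A direct transcription of this specification into code may help fix ideas; the Python sketch below is an illustration invented here, with the graph given as an adjacency dictionary.

```python
def hard_core_conditional(v, x, neighbors, lam):
    # P(X(v) = 1 | X(V - v)) = (lam/(1+lam)) * prod over neighbors w of (1 - x(w)):
    # lam/(1+lam) if every neighbor of v is labeled 0, and 0 otherwise.
    prod = 1
    for w in neighbors[v]:
        prod *= 1 - x[w]
    return (lam / (1 + lam)) * prod

# Path on three nodes: the middle node is blocked by an occupied neighbor.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
x = {0: 1, 1: 0, 2: 0}
assert hard_core_conditional(1, x, neighbors, 2.0) == 0.0
assert hard_core_conditional(2, x, neighbors, 2.0) == 2.0 / 3.0
```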
Definition 11.6. An infinite graph G = (V, E) is a d-regular infinite tree if there is a unique path connecting any two distinct nodes of the tree, and the degree of each node is d.
For d-regular infinite trees, the phase transition for the hard-core Gibbs measure is known. In this case, the phase transition is a value of the parameter λ_{c} such that for λ < λ_{c}, there exists a unique measure that satisfies the local properties of Definition 11.5, and for λ > λ_{c}, there are multiple measures that do so.
Lemma 11.10. The uniqueness threshold for the d-regular tree is

λ_{c}(d) = (d − 2)^{−1}[(d − 1)/(d − 2)]^{d−1}.
When d is large, λ_{c}(d) ≈ e(d − 2)^{−1}. Compare that to the analysis of the running time for the bounding chain for the swap chain for the hard-core gas model in Lemma 4.2, which showed that it was possible to sample exactly from the model when λ ≤ 2(d − 2)^{−1}.
Recall that an (ε, δ)-randomized approximation scheme (Definition 2.13) returns a value with absolute relative error greater than ε with probability at most δ.
Definition 11.7. A fully polynomial-time randomized approximation scheme (fpras) is an (ε, δ)-ras that runs in time polynomial in the input size of the problem, δ^{−1}, and ε^{−1}.
Using TPA or PPE together with the bounding chain approach gives a fpras for the hard-core gas model when λ ≤ 2(d − 2)^{−1}. Sly and Sun [118] were able to show that this is only possible because λ falls below λ_{c}. To state their result properly, one more complexity class definition is needed.
Definition 11.8. A decision problem is in the class randomized polynomial time (RP) if the algorithm runs in time polynomial in the input; if the correct answer is false, then the algorithm returns false; and if the correct answer is true, then it returns true with probability at least 1/2.

The question of whether NP = RP (like that of whether NP = P) is still open, but the wealth of NP-complete problems indicates that it is unlikely that a randomized polynomial-time algorithm to solve a problem that is NP-complete will be found any time soon.
Theorem 11.4 (Theorem 1 of [118]). For d ≥ 3, unless NP = RP, there exists no fpras for the partition function of the hard-core model with parameter λ > λ_{c} = (d − 1)^{d−1}/(d − 2)^{d} on d-regular graphs.
Now, Weitz gave a fpras for the hard-core model partition function on graphs of maximum degree d whenever λ < λ_{c}(d). Therefore, for λ < 2(d − 2)^{−1} it is known how to perfectly sample in polynomial time, for λ < λ_{c}(d) ≈ e(d − 2)^{−1} it is known how to approximate the partition function with a fpras, and for λ > λ_{c}(d) there exist graphs where no fpras can exist unless NP = RP.
For this reason, the value 2(d − 2)^{−1} is known as an artificial phase transition of the perfect simulation algorithm. Currently it is unknown if it is possible to perfectly sample from all graphs with maximum degree d in polynomial time when λ < λ_{c}(d), or if this artificial phase transition reflects something deeper about the problem.
In other models, the question of how good the perfect simulation can be is still unknown. For example, the ferromagnetic Ising model with unidirectional magnetic field is known to have a polynomial-time algorithm for all values of the parameter [70]. However, despite the apparent success in practice for CFTP on this model, it is unknown if the method provably runs in polynomial time on these problems.
For other classes of problems the situation is different. For linear extensions (as has been seen), the perfect simulation algorithms are known to be the fastest possible for the problem at hand, beating even the approximate sampling algorithms. Another example is in sampling from weighted permutations for the permanent on dense matrices, where the perfect sampling algorithms have known bounds that beat the approximate sampling methods.
11.9 The future of perfect simulation
Perfect simulation has come far from its humble beginnings in the acceptance/rejection protocol. Coupling from the past opened up a multitude of problems for simulation, including the Ising model. While the Ising model does have a polynomial method for approximate simulation [70], this is far from a linear time method. For many decades it had resisted the creation of a near-linear time algorithm, to the point where many did not think that such an algorithm was even possible.
In the twenty years since the creation of CFTP, a wealth of new perfect simulation protocols have been invented. Many, such as partially recursive acceptance/rejection and retrospective sampling, are variations on the original acceptance/rejection that have opened up broad new problem areas to perfect simulation. Others, such as the randomness recycler, are a hybrid of Markov chain and acceptance/rejection type methods.
Fill's method showed that, looked at from a certain point of view, CFTP could be seen as a variant of AR run in the reverse direction in time. Are all perfect simulation algorithms in the end variants of AR? Is the artificial phase transition phenomenon intrinsic to certain problems, or merely an artifact of the algorithms in use? The set of problems addressable by perfect simulation protocols continues to grow, albeit more slowly than after the introduction of CFTP jumpstarted the field. Many of these algorithms run into problems that are NP or #P complete. The most interesting open question about perfect simulation is this: how far can these ideas be taken?