Chapter 9
Advanced Acceptance/Rejection
Happiness can exist only in acceptance.
George Orwell
Chapter 2 presented the acceptance/rejection (AR) protocol, which runs into trouble when naively applied to permutation problems such as estimating the permanent, and to high-dimensional Markov random field problems such as sampling from the Ising model. This is because typically the number of combinatorial objects in a set grows exponentially in the dimension. In this chapter three ideas are introduced to deal with this problem: sequential acceptance/rejection (SAR), partially recursive acceptance/rejection (PRAR), and popping. Together with the ability to preprocess a given problem instance, SAR gave the first perfect simulation algorithms for estimating the permanent that have provably polynomial running time over a nontrivial subclass of nonnegative matrices. PRAR allows the acceptance/rejection approach to be applied to Markov random fields such as the Ising or autonormal model without involving Markov chains in any way. Popping allows the pieces of the sample that are blocking acceptance to be pruned away, giving a perfect simulation algorithm for sampling from the rooted trees of a graph.
9.1 Sequential acceptance/rejection
Theoretically, AR can be used on any finite set whose size has a known upper bound. For instance, if $A$ is a finite set with $\#A \leq M$, simply draw $U \sim \text{Unif}([0,M])$ until $U \in (0, \#A]$. If the elements of $A$ are numbered, then return element $\lceil U \rceil$.
Unfortunately, numbering the elements of $A$ is not always possible in practice, especially when the upper bound comes from a complex argument rather than a simple assignment of numbers to objects.
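As a quick illustration, here is a minimal Python sketch of this scheme (the element list and the bound M below are invented for the example; a discrete index draw replaces the continuous uniform, which changes nothing):

import random

def ar_uniform(elements, M):
    # Draw uniformly from a numbered finite set with #A <= M by rejection:
    # propose an index in {1, ..., M} until it lands in {1, ..., #A}.
    n = len(elements)
    while True:
        u = random.randint(1, M)    # proposal from the superset of size M
        if u <= n:                  # accept: u falls in (0, #A]
            return elements[u - 1]

# Example: three labels, with only the crude bound M = 10 known.
print(ar_uniform(["a", "b", "c"], 10))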
Sequential acceptance/rejection (SAR) can be employed when an upper bound on the partition function can be shown using either induction or strong induction.
For instance, suppose that the goal is to sample from an unnormalized density on the set of permutations of $n$ objects, so $P(X = \sigma) = w(\sigma) / \sum_{\sigma' \in S_n} w(\sigma')$, where the normalizing constant $\sum_{\sigma \in S_n} w(\sigma)$ is difficult to compute.
When $X(1)$ is assigned, the remaining components $X(2), \ldots, X(n)$ form a permutation of the elements $\{1, 2, \ldots, n\} \setminus \{X(1)\}$. A problem where assigning one component leaves a subset of objects of the same type is called self-reducible, and this
is a key property for using random samples from an unweighted density in order to
estimate the partition function (see [72] for details).
For convenience, SAR will be described for discrete sets, but this method applies equally well to continuous problems. For a nonnegative function $w$, let $Z(A) = \sum_{x \in A} w(x)$, where each $x \in C^n$ is a vector in an $n$-dimensional space whose components are drawn from $C$.
Let $i \in \{1, \ldots, n\}$. If for all $x \in A$ with $w(x) > 0$, $x(i) = a_i$, say that the $i$th component of $x$ is fixed; otherwise it is free.
Lemma 9.1. Let $M(A)$ be a function with the following properties.
• If $A = \{x\}$, then $M(A) \geq w(x)$.
• For any $A$ with free dimension $i$, and $c \in C$, let $A_c = \{x : x(i) = c\}$. Then
$$M(A) \geq \sum_{c \in C} M(A \cap A_c). \qquad (9.1)$$
Then $M(A) \geq Z(A) = \sum_{x \in A} w(x)$.
Proof. Use induction on the number of free dimensions in $A$. If $A$ has no free dimensions, then $A = \{x\}$, and $M(A) \geq w(x) = Z(A)$, so the base case is done.
The induction hypothesis is that the result applies when $A$ has $k$ or fewer free dimensions; now consider $A$ with $k + 1$ free dimensions. Choose any one of the free dimensions. The key observation is that $Z(A) = \sum_{c \in C} Z(A \cap A_c)$. Hence
$$M(A) \geq \sum_{c \in C} M(A \cap A_c) \geq \sum_{c \in C} Z(A \cap A_c) = Z(A). \qquad (9.2)$$
[Note it is necessary to use a strong induction hypothesis here and suppose the technique works for $k$ or fewer free dimensions. This is because for certain $w$ functions, fixing one dimension might also effectively fix other dimensions as well!]
SAR gives samples from the unnormalized density $w(x)$ given the ability to compute an $M(A)$ with the properties in Lemma 9.1. As before, let $C = \{c_1, \ldots, c_k\}$ denote the set of possible labels for any $x(i)$.
SAR   Input: A,   Output: X

1)  A_0 ← A
2)  While #A > 1
3)      Let i be any free dimension of A
4)      For all j ∈ {1,...,k}, A_j ← {x ∈ A : x(i) = c_j}
5)      Draw I so P(I = j) = M(A_j)/M(A), P(I = 0) = 1 − ∑_j M(A_j)/M(A)
6)      If I = 0 then X ← SAR(A_0) and return X
7)      Else A ← A_I
8)  Draw B ← Bern(w(x)/M({x})), where x is the unique element of A
9)  If B = 0 then X ← SAR(A_0), return X
10) Else let X be the only element of A
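To make the control flow concrete, here is a minimal Python sketch of SAR. It is an assumption-laden illustration: the set A is stored as an explicit list of tuples, and the demo's w, M, and label set are invented for the example; in real applications the sets A_j are never enumerated and M(A_j) is computed in closed form. The recursive restarts are written as a loop, which keeps Python's recursion limit out of play.

import random
from itertools import product

def attempt(A0, w, M, values):
    # One pass of lines 2 through 9: returns an accepted x, or None on rejection.
    A = A0
    while len(A) > 1:
        # line 3: any free dimension (a coordinate on which A disagrees)
        i = next(d for d in range(len(A[0])) if len({x[d] for x in A}) > 1)
        u = random.uniform(0, M(A))      # line 5: one uniform draw decides I
        cum, chosen = 0.0, None
        for c in values:
            S = [x for x in A if x[i] == c]   # line 4: the subset A_j
            cum += M(S)
            if u <= cum and S:
                chosen = S
                break
        if chosen is None:               # I = 0, probability 1 - sum_j M(A_j)/M(A)
            return None
        A = chosen                       # line 7
    x = A[0]
    # line 8: accept the lone element with probability w(x)/M({x})
    return x if random.random() <= w(x) / M([x]) else None

def sar(A0, w, M, values):
    # The recursive calls of lines 6 and 9, written as restarts.
    while True:
        x = attempt(A0, w, M, values)
        if x is not None:
            return x

# Demo with the trivial bound M(A) = #A, valid whenever w maps into [0, 1];
# here w is an arbitrary density on binary vectors of length 3.
A0 = list(product([0, 1], repeat=3))
w = lambda x: 0.5 ** sum(x)
M = lambda A: float(len(A))
print(sar(A0, w, M, values=[0, 1]))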
Before showing that the overall algorithm works, it helps to have a preliminary
understanding of what happens when lines 2 through 9 complete without recursively
calling SAR.
Lemma 9.2. Let $E_x$ denote the event that SAR reaches line 10 with $A = \{x\}$. Then $P(E_x) = w(x)/M(A_0)$.
Proof. The proof is by induction on the size of $A$ at line 2. When $A$ has one element, then lines 8 and 9 make the lemma true. Suppose that the lemma holds for $\#A \leq n$; now suppose $\#A = n + 1$.
Let $x \in A$. Then $x \in A_j$ for some $j \in \{1, \ldots, k\}$. So $P(E_x)$ is the probability that set $A_j$ is chosen times the probability that $x$ is chosen uniquely from $A_j$. This second factor is $w(x)/M(A_j)$ by the strong induction hypothesis. Hence $P(E_x) = [M(A_j)/M(A)][w(x)/M(A_j)] = w(x)/M(A)$, and the induction is complete.
Lemma 9.3. Suppose there exists at least one $x$ with $w(x) > 0$. The output $X$ is a draw from the unnormalized density $w$. The expected running time is $M(A)/Z(A)$ times the number of free dimensions in $A$.
Proof. Conditioned on acceptance occurring (so no recursive calls to SAR are made), $P(X = x \mid \text{accept}) \propto w(x)$. Our AR theory then says that $P(X = x) \propto w(x)$. The chance of accepting is $\sum_x w(x)/M(A) = Z(A)/M(A)$. The running time result then follows from the basic AR theory.
As with all AR-type algorithms, if $n$ is the number of attempts needed to obtain a single sample, then $M(A)/n$ provides an immediate estimate of $Z(A)$; alternatively, the algorithm can be used as part of a GBAS approximation (see Section 2.5).
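For instance, a minimal sketch of this estimator, reusing the attempt routine from the SAR sketch above (the wrapper itself is our own illustration, not part of the text's algorithm):

def estimate_Z(A0, w, M, values, trials=10_000):
    # Each attempt accepts with probability Z(A0)/M(A0), so M(A0) times
    # the observed acceptance rate is an unbiased estimate of Z(A0).
    accepts = sum(attempt(A0, w, M, values) is not None for _ in range(trials))
    return M(A0) * accepts / trials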
9.2 Application: approximating permanents
In Section 1.6.2 the permanent of a matrix $A = (A(i,j))$ was given as
$$\mathrm{per}(A) = \sum_{\sigma \in S_n} \prod_{i=1}^{n} A(i, \sigma(i)). \qquad (9.3)$$
This should not be confused with the determinant of a matrix, defined as
$$\det(A) = \sum_{\sigma \in S_n} (-1)^{\mathrm{sign}(\sigma)} \prod_{i=1}^{n} A(i, \sigma(i)). \qquad (9.4)$$
While the problems appear similar on the surface, the determinant of a matrix can be computed using (to the same order) the time needed to perform matrix multiplication, whereas finding the permanent of a matrix is a #P-complete problem [124].
When $A$ is a nonnegative matrix, the permanent is the normalizing constant of the unnormalized density on permutations
$$w(\sigma) = \prod_{i=1}^{n} A(i, \sigma(i)). \qquad (9.5)$$
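To make (9.3) and (9.5) concrete, here is a brute-force evaluation in Python (usable only for tiny n, since it enumerates all n! permutations; the function names are ours):

from itertools import permutations

def weight(A, sigma):
    # w(sigma) = prod_i A(i, sigma(i)), with rows and columns indexed from 0
    p = 1.0
    for i, j in enumerate(sigma):
        p *= A[i][j]
    return p

def permanent(A):
    # per(A) = sum of w(sigma) over all permutations: O(n * n!) time
    n = len(A)
    return sum(weight(A, sigma) for sigma in permutations(range(n)))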
This problem has been studied extensively. A Markov chain for this distribution
was shown to have polynomial running time in 2001 ([71, 73]) but the large constants
involved in this result (and subsequent improvements [12]) make it impractical for
applications.
The techniques of this section not only generate samples from the distribution
given by w, they also immediately give an approximation algorithm for per(A) with-
out any extra work. These methods are provably polynomial when the matrix is
dense, with small constants.
9.2.1 Basic AR
Suppose that $A'$ is formed from $A$ by multiplying a row of $A$ by a constant $k$. Since each term in $\mathrm{per}(A)$ will then be multiplied by the same $k$, $\mathrm{per}(A') = k\,\mathrm{per}(A)$. So if $B$ is formed by dividing each row of $A$ by its largest element,
$$\mathrm{per}(B) = \left[ \prod_{i=1}^{n} \max_j A(i,j) \right]^{-1} \mathrm{per}(A), \qquad (9.6)$$
and the weights for $B$ give the same distribution as the weights for $A$. The difference is that the elements of $B$ now fall into $[0,1]$.
To use the basic AR approach, draw $\sigma$ uniformly from $S_n$, the set of all permutations. Then accept $\sigma$ as a draw from the weights associated with $A$ with probability $\prod_{i=1}^{n} B(i, \sigma(i))$.
The expected number of samples required is
$$n! / \mathrm{per}(B). \qquad (9.7)$$
Unfortunately, this can easily be exponential in $n$. For instance, if $A$ is the identity matrix, so is $B$, and $\mathrm{per}(B) = 1$, so on average $n!$ samples are required to generate a single sample using basic AR.
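A minimal sketch of this basic AR sampler (the function name is ours, and it assumes every row of A has a positive entry, so the row maxima are nonzero):

import random

def basic_ar(A):
    # Sample sigma with probability proportional to w(sigma) = prod_i A(i, sigma(i)).
    n = len(A)
    B = [[A[i][j] / max(A[i]) for j in range(n)] for i in range(n)]  # entries in [0,1]
    while True:
        sigma = list(range(n))
        random.shuffle(sigma)            # uniform proposal from S_n
        p = 1.0
        for i in range(n):
            p *= B[i][sigma[i]]          # acceptance probability prod_i B(i, sigma(i))
        if random.random() <= p:
            return tuple(sigma)

As noted above, the expected number of proposals is n!/per(B), so this sketch is practical only when B has many entries close to 1.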
9.2.2 Sequential AR
Sequential acceptance/rejection can be used with better bounds on the permanent to
greatly improve upon basic AR for estimating the permanent.
First consider the case when the entries of the matrix $(a_{ij})$ are either 0 or 1. In this case, $w(\sigma) = 1$ if and only if $a_{i,\sigma(i)} = 1$ for all $i$. Therefore for each $i$, there is a set $D_i$ into which $\sigma(i)$ must fall to obtain positive weight. Then the goal is to draw uniformly from the set of permutations such that $\sigma(i) \in D_i$ for all $i$.
For example, with the matrix
$$A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}, \qquad (9.8)$$
$D_1 = \{1,2\}$, $D_2 = \{2,3\}$, and $D_3 = \{1,3\}$. The permanent of this matrix is 2, since $(1,2,3)$ and $(2,3,1)$ are the only two permutations that have $\sigma(i) \in D_i$ for all $i$.
In the first row, it is possible to have $\sigma(1) = 1$ or $\sigma(1) = 2$. For a matrix $A$, let $A_{i,j}$ be the matrix where $\sigma(i) = j$ is forced, so the entry $A(i,j)$ is unchanged, and the other entries in row $i$ are changed to 0. Then each of these choices gives rise to a subproblem:
$$A_{1,1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}, \quad A_{1,2} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}, \quad A_{1,3} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}. \qquad (9.9)$$
The set of permutations allowed by $A$ consists of the union of the set of allowed permutations of $A_{1,1}$, the set of allowed permutations of $A_{1,2}$, and the set of allowed permutations of $A_{1,3}$. These last three sets are disjoint. So $\mathrm{per}(A) = \mathrm{per}(A_{1,1}) + \mathrm{per}(A_{1,2}) + \mathrm{per}(A_{1,3})$.
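Continuing the small example, the decomposition can be checked numerically (reusing A and permanent from the snippets above; the subproblem matrices are those of (9.9)):

A11 = [[1, 0, 0], [0, 1, 1], [1, 0, 1]]
A12 = [[0, 1, 0], [0, 1, 1], [1, 0, 1]]
A13 = [[0, 0, 0], [0, 1, 1], [1, 0, 1]]
# per(A_{1,1}) + per(A_{1,2}) + per(A_{1,3}) = 1 + 1 + 0 = 2 = per(A)
assert permanent(A11) + permanent(A12) + permanent(A13) == permanent(A)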
In order to use SAR on this problem, it is necessary to have a bound $\mathrm{per}(B) \leq M(B)$ that can be proved inductively, so $M(B) \geq M(B_1) + \cdots + M(B_n)$. The most basic approach uses a very simple bound on the permanent.
Lemma 9.4. Let $r_i = \#D_i$ be the sum of the elements of row $i$ of the matrix $A$. Define
$$M_1(A) = \prod_{i=1}^{n} r_i. \qquad (9.10)$$
Then for any row $i$,
$$M_1(A_{i,1}) + M_1(A_{i,2}) + \cdots + M_1(A_{i,n}) \leq M_1(A), \qquad (9.11)$$
and $\mathrm{per}(A) \leq M_1(A)$.
Proof. Prove (9.11) first. Note that if $A(i,j) = 0$, then $M_1(A_{i,j}) = 0$ as well, so (9.11) is really just
$$M_1(A) \geq \sum_{j: A(i,j)=1} M_1(A_{i,j}). \qquad (9.12)$$
But for $j$ such that $A(i,j) = 1$, $M_1(A_{i,j}) = M_1(A)/r_i$. Hence
$$\sum_{j: A(i,j)=1} M_1(A_{i,j}) = \sum_{j: A(i,j)=1} \frac{M_1(A)}{r_i} = M_1(A), \qquad (9.13)$$
which completes the proof of (9.11).
Now show $\mathrm{per}(A) \leq M_1(A)$ by induction on the number of rows $i$ with $r_i > 1$. If $r_i = 0$ or $r_i = 1$ for all $i$ then the result is true, so the base case holds.
Let $i$ be any row with $r_i > 1$. Then $Z(A) = \sum_{j: A(i,j)=1} Z(A_{i,j})$, and by induction $Z(A_{i,j}) \leq M_1(A_{i,j})$. So
$$Z(A) \leq \sum_{j: A(i,j)=1} M_1(A_{i,j}) = M_1(A), \qquad (9.14)$$
which finishes the proof.
Therefore the conditions to apply SAR are met. Since the $M_1(A_{i,j})$ are equal and sum to $M_1(A)$ over all $j$ such that $A(i,j) = 1$, at each step choose uniformly from this set of $j$ to be the value for $\sigma(i)$.
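A sketch of the resulting algorithm for 0-1 matrices under the $M_1$ bound (our own rendering, not verbatim from the text): each row $i$ picks a column uniformly from $D_i$, since $M_1(A_{i,j})/M_1(A) = 1/r_i$ for each allowed $j$; if the choices collide, the final assignment is not a permutation and has weight 0, so the concluding Bernoulli step rejects and the algorithm restarts. By Lemma 9.3 the acceptance probability is $\mathrm{per}(A)/M_1(A)$, which for the matrix of (9.8) is $2/(2 \cdot 2 \cdot 2) = 1/4$.

import random

def sar_permanent_01(D):
    # D[i] is the set of allowed columns for row i (1-based, as in the text).
    n = len(D)
    while True:
        sigma = [random.choice(sorted(D[i])) for i in range(n)]  # uniform over D_i
        if len(set(sigma)) == n:    # sigma is a permutation, so w(sigma) = 1: accept
            return tuple(sigma)
        # otherwise some column was used twice: w = 0, reject and restart

# The example of (9.8): returns (1,2,3) or (2,3,1) with equal probability.
print(sar_permanent_01([{1, 2}, {2, 3}, {1, 3}]))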