Chapter 2
Acceptance/Rejection
’Tis a lesson you should heed:
Try, try, try again.
If at first you don’t succeed,
Try, try, try again.
Thomas H. Palmer
The acceptance/rejection (AR) method was the first major protocol for perfect
simulation. The basic method gives a way of sampling from a random variable given
some condition on that variable. When used with an auxiliary variable, this method
can be extended to sampling from any bounded density. Other bounds on probabili-
ties such as Markov’s inequality and Chernoff bounds can actually be viewed as AR
algorithms, and these bounds assist in simulating from rare events.
2.1 The method
In Example 1.1 of the previous chapter, the goal was to draw uniformly from {1,2,...,5} given a fair six-sided die. The solution was to roll the die until the result fell into {1,...,5}. Intuitively this gives the desired distribution; the following theorem proves this fact and generalizes the result.
Theorem 2.1. Suppose that ν is a finite measure over B, and A is a measurable subset of B with ν(A) > 0. Let X_1, X_2, ... be iid draws from ν over B, and T = inf{t : X_t ∈ A}. Then

X_T ∼ [X_1 | X_1 ∈ A].   (2.1)

That is, if you repeatedly draw from the measure ν over B, and return the first value that lies in A, the resulting measure is from ν conditioned to lie in A.
Proof. This follows from the Fundamental Theorem of perfect simulation (Theorem 1.1), using U_1, U_2, ... iid draws from ν over B, b(U) = 1(U ∈ A), g(U) = U, and f(X,U) = X.

Since f(X,U) does not depend on U, this is an interruptible algorithm.
The above theorem gives a mathematical description of the algorithm; see Figure 2.1 for a graphical illustration. In pseudocode it is quite simple.
Figure 2.1 Acceptance/rejection draws from the larger set B until the result falls in the smaller set A. The final point comes from the distribution over B conditioned to lie in A.
AR
1) Repeat
2)     Draw X from ν over B
3) Until X ∈ A
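The AR loop above can be sketched in Python; the helper names ar_sample, draw, and accept below are illustrative, not from the text:

```python
import random

def ar_sample(draw, accept):
    """Repeatedly draw from the proposal until the result is accepted,
    then return it.  By Theorem 2.1, the result has the conditional
    distribution of a draw given acceptance."""
    while True:
        x = draw()
        if accept(x):
            return x

# Example 1.1: draw uniformly from {1,...,5} with a fair six-sided die.
roll = ar_sample(lambda: random.randint(1, 6), lambda x: x <= 5)
```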
The number of steps taken by this algorithm (on average) depends on how large A is
compared to B.
Lemma 2.1. The probability that X ∈ A is ν(A)/ν(B). The expected number of draws used by the algorithm is ν(B)/ν(A).

Proof. From the last theorem, P(X ∈ A) = ν(A)/ν(B). This is just how probability measures induced by finite measures are defined. Given this probability of success, the number of draws of X is a geometrically distributed random variable with parameter ν(A)/ν(B). The mean of a geometric is just the inverse of the parameter.
2.1.1 Example: drawing from the unit circle
Consider the problem of drawing uniformly from the unit circle. Because the distribution of the angle and radius in the unit circle can be found analytically, it is possible to give a direct method for doing so. Let U_1 and U_2 be iid uniforms over [0,1]. Then let θ = 2πU_1 and r = √U_2. Then (r sin(θ), r cos(θ)) will be uniform over the unit circle. AR gives a means for generating from the circle without the need to compute functions other than additions and multiplications.
Begin with a bounding region for the unit circle that is easy to sample from. The square [−1,1] × [−1,1] does the trick. Using this bounding region makes the algorithm as follows.
AR unit circle    Output: (X,Y) ∼ Unif(unit circle)
1) Repeat
2)     Draw U_1 ← Unif([0,1]), U_2 ← Unif([0,1])
3)     X ← 2U_1 − 1, Y ← 2U_2 − 1
4) Until X² + Y² ≤ 1
This method does not need to evaluate the trigonometric functions sine and cosine that the exact method did, so it can be useful on processors of limited capability.
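A minimal Python sketch of AR unit circle (the function name unit_circle_point is illustrative):

```python
import random

def unit_circle_point():
    """Draw (X, Y) uniformly over the unit circle by rejection from the
    bounding square [-1, 1] x [-1, 1]; no trigonometric functions needed."""
    while True:
        x = 2.0 * random.random() - 1.0
        y = 2.0 * random.random() - 1.0
        if x * x + y * y <= 1.0:
            return x, y
```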
Figure 2.2 To draw X with density f, draw (X,Y) uniformly from the region under the density, then throw away the Y value. This procedure works whether or not the density is normalized.
The area of [−1,1] × [−1,1] is 4, while the unit circle has area π. Therefore the expected number of uniforms drawn is (2)(4/π) ≈ 2 · 1.273 ≈ 2.546, compared to the just two uniforms that the exact method uses.
One more note: for (X,Y ) uniform over the unit circle, X/Y has a standard
Cauchy distribution. Therefore this gives a perfect simulation method for Cauchy
random variables that does not utilize trigonometric functions.
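A sketch of this trig-free Cauchy generator (the name cauchy_draw and the guard against the measure-zero event Y = 0 are implementation details added here):

```python
import random

def cauchy_draw():
    """Standard Cauchy without trigonometric functions: for (X, Y)
    uniform over the unit circle, X/Y is standard Cauchy."""
    while True:
        x = 2.0 * random.random() - 1.0
        y = 2.0 * random.random() - 1.0
        # reject points outside the circle, and the measure-zero line y = 0
        if x * x + y * y <= 1.0 and y != 0.0:
            return x / y
```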
2.2 Dealing with densities
The basic AR method presented does not handle densities; however, densities are easily added to the mix using auxiliary random variables. Consider the following theorem from measure theory, known also as the fundamental theorem of Monte Carlo simulation (see [115], p. 47).
Theorem 2.2 (Fundamental theorem of Monte Carlo simulation). Suppose that X has (possibly unnormalized) density f_X over measure ν on Ω. If [Y | X] ∼ Unif([0, f_X(X)]), then (X,Y) is a draw from the product measure ν × Unif over the set {(x,y) : x ∈ Ω, 0 ≤ y ≤ f_X(x)}.
Why is this fundamental to Monte Carlo methods? Because this theorem in
essence says that all densities are illusions, that every problem in probability can
be reduced to sampling over a uniform distribution on an augmented space. Since
pseudorandom sequences are iid uniform, this is an essential tool in building Monte
Carlo methods. Moreover, the density of X can be unnormalized, which is essential
to the applications of Monte Carlo.
A simple example of this theorem is in one dimension, where ν is Lebesgue measure. This theorem says that to draw a continuous random variable X with density f, choose a point (X,Y) uniformly from the area between the x-axis and the density, and then throw away the Y value. Here it is easy to see that for any a ∈ ℝ, P(X ≤ a) = ∫_{−∞}^{a} f(x) dx as desired. (See Figure 2.2.)
To see how this operates in practice, let μ be a measure over Ω, and suppose for density g it is possible to draw from the product measure μ × Unif over Ω_g = {(x,y) : x ∈ Ω, 0 ≤ y ≤ g(x)}. Once this draw is made and the Y component is thrown away, the remaining X component has density g. But suppose instead that the goal is to draw X with density f.
Well, if (X,Y) has measure μ × Unif over Ω_g, then (X,2Y) has measure μ × Unif over Ω_{2g} = {(x,y) : x ∈ Ω, 0 ≤ y ≤ 2g(x)}. Of course, there is nothing special about the use of 2 there: for any constant c > 0, (X,cY) has measure μ × Unif over Ω_{cg}.

Suppose c ≥ sup_{x∈Ω} f(x)/g(x). Then Ω_f ⊆ Ω_{cg}.
Therefore, AR can be used to draw from Ω_f: simply draw X from density g, draw Y uniformly from [0, c·g(X)], and if the resulting Y lands in [0, f(X)] (so that (X,Y) ∈ Ω_f), then accept (X,Y) as a draw from Ω_f, which means X has density f.

Of course, to test if a uniform from [0, c·g(X)] falls in [0, f(X)], it is not necessary to actually generate the uniform. Simply draw a Bernoulli random variable with mean f(X)/[c·g(X)]. If successful, consider X a draw from f. Note that this method applies even when f and g are unnormalized densities. It is given in the following pseudocode.
AR densities
Input: f and g densities (possibly unnormalized) over ν
Output: X with density f over ν
Let c be any number greater than f(x)/g(x) for all x ∈ Ω
Repeat
    Draw X from density g over ν
    Draw C as Bern(f(X)/[cg(X)])
Until C = 1
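A direct Python rendering of AR densities; the function ar_densities and its arguments are illustrative names, and the triangular-density usage example is an added illustration, not from the text:

```python
import random

def ar_densities(f, g, draw_g, c):
    """AR with densities: f and g may be unnormalized, and c must satisfy
    c >= f(x)/g(x) for all x.  Returns X with density proportional to f."""
    while True:
        x = draw_g()
        # accept with probability f(x)/[c g(x)], i.e. the Bernoulli draw C
        if random.random() < f(x) / (c * g(x)):
            return x

# Illustration: draw from the (unnormalized) triangular density f(x) = x
# on [0, 1] using the uniform proposal g(x) = 1 with bound c = 1.
x = ar_densities(lambda t: t, lambda t: 1.0, random.random, 1.0)
```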
Since the probability of accepting varies inversely with c, the best possible choice of c is sup_{x∈Ω} f(x)/g(x). It is not always possible to calculate this value exactly. In this case, an upper bound of the supremum can be used; it just makes for a slower algorithm.
In AR densities, let Z_f = ∫_{x∈Ω} f(x) ν(dx) and Z_g = ∫_{x∈Ω} g(x) ν(dx) be the normalizing constants for the possibly unnormalized densities f and g.
Lemma 2.2. The probability that AR densities accepts a draw X is Z_f/[cZ_g]. The expected number of draws of X used by the algorithm is cZ_g/Z_f.
Proof.

P(C = 1) = E[E[1(C = 1) | X]]
         = ∫_{x∈Ω} P(C = 1 | X = x) P(X ∈ dx)
         = ∫_{x∈Ω} [f(x)/(cg(x))] · (g(x)/Z_g) ν(dx)
         = [cZ_g]^{−1} ∫_{x∈Ω} f(x) ν(dx)
         = Z_f/[cZ_g].
2.2.1 Example: from Cauchy to normal
To see h ow this works in p ractice, suppose that a user has the ability to generate from
the standard Cauchy distribu tion with density g(x)=[
π
(1 + x
2
)]
1
,andwishesto
generate a standard normal Z with density f (x)=(2
π
)
1/2
exp(x
2
/2).
First, the normalization is unnecessary, so let g_1(x) = (1 + x²)^{−1} and f_1(x) = exp(−x²/2). Now f_1(x)/g_1(x) ≤ 2 for all x, so a valid algorithm is as follows.
Normal from Cauchy
Repeat
    Draw X as a standard Cauchy
    Draw C as Bern((1/2)(1 + X²) exp(−X²/2))
Until C = 1
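A Python sketch of Normal from Cauchy; the text assumes a Cauchy generator is already available, and the tangent-inversion std_cauchy used here is one common stand-in (the circle-based trig-free generator from Section 2.1.1 would also work):

```python
import math
import random

def std_cauchy():
    """Standard Cauchy via inversion: the tangent of a uniform angle."""
    return math.tan(math.pi * (random.random() - 0.5))

def normal_from_cauchy():
    """Standard normal by AR from a Cauchy proposal, using the bound
    f_1(x)/g_1(x) = (1 + x^2) exp(-x^2/2) <= 2 on unnormalized densities."""
    while True:
        x = std_cauchy()
        # accept with probability (1/2)(1 + x^2) exp(-x^2/2) <= 1
        if random.random() < 0.5 * (1.0 + x * x) * math.exp(-x * x / 2.0):
            return x
```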
The number of X draws used by the algorithm has a geometric distribution with parameter equal to the chance of accepting, which is

P(C = 1) = ∫_{x∈ℝ} P(X ∈ dx, C = 1)
         = ∫_{x∈ℝ} g(x) · (1/2)(1 + x²) exp(−x²/2) dx = (2π)^{−1/2} = 0.3989...

The average number of X draws used by the algorithm is equal to √(2π) = 2.506....
A faster algorithm can be created by using a better bound on (1 + x²) exp(−x²/2). In fact, sup_x (1 + x²) exp(−x²/2) = 2/√e. This makes the algorithm faster by a factor of the square root of e, which means the algorithm requires √(2π/e) = 1.520... draws from X on average.
As another example, suppose U ∼ Unif([0,1]), and X = ⌊1/U⌋, where ⌊x⌋ is the floor function giving the greatest integer at most x. Then P(X = i) = P(i ≤ 1/U < i + 1) = P(1/i ≥ U > 1/(i + 1)) = 1/(i² + i) for i ∈ {1, 2, ...}. Suppose the goal is to generate W so that P(W = i) ∝ 1/i² (this is a Zipf distribution).
Then X has density f_X(i) = 1/(i² + i) and the goal is to produce W with density f_W(i) = 1/i² for counting measure over {1, 2, 3, ...}. To use AR for this task, note

sup_{i∈{1,2,...}} [1/i²] / [1/(i² + i)] = sup_{i∈{1,2,...}} (1 + 1/i) = 2.

Hence the AR approach is as follows.
Hence the AR approach is as follows.
Zipf from Uniform
1) Repeat
2) Draw U Unif([0,1])
3) X ←1/U
4) Draw C as Bern((1/2)(1 + 1/X ))
5) Until C = 1
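A Python sketch of Zipf from Uniform; the guard against the measure-zero event U = 0 is an implementation detail added here:

```python
import math
import random

def zipf_from_uniform():
    """W with P(W = i) proportional to 1/i^2: propose X = floor(1/U),
    which has P(X = i) = 1/(i^2 + i), and accept with probability
    (1/2)(1 + 1/X)."""
    while True:
        u = random.random()
        if u == 0.0:  # guard the measure-zero event U = 0
            continue
        x = math.floor(1.0 / u)
        if random.random() < 0.5 * (1.0 + 1.0 / x):
            return x
```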
The chance that C = 1 is ∑_{i=1}^∞ P(C = 1, X = i) = ∑_{i=1}^∞ 1/(2i²) = π²/12. Hence the expected number of draws of X used is 12/π² < 1.216.