30 ACCEPTANCE/REJECTION
2.3 Union of sets
Another way to employ acceptance/rejection is to account for multiple ways of reaching a sample. For instance, consider the problem of sampling from measure ν over S_1 ∪ ··· ∪ S_n, where set S_i has finite measure ν(S_i).
Consider the following algorithm: draw I from {1, 2, ..., n}, where P(I = i) ∝ ν(S_i). Then conditioned on I, draw X from S_I using measure ν.
Of course, X does not come from ν over S_1 ∪ ··· ∪ S_n because there are multiple ways that X could have been selected. For instance, if X ∈ S_1 and X ∈ S_2, but X ∉ S_3, ..., S_n, then I could have been either 1 or 2 in selecting X.
Lemma 2.3. For the procedure above, X is a draw from ν over S_1 ∪ ··· ∪ S_n with density f(x) = #{i : x ∈ S_i}/[ν(S_1) + ··· + ν(S_n)].
Proof. Let x ∈ S_1 ∪ ··· ∪ S_n, and C = ν(S_1) + ··· + ν(S_n). Then

P(X ∈ dx) = Σ_{i=1}^n P(X ∈ dx, I = i) = Σ_{i=1}^n P(I = i) P(X ∈ dx | I = i)
          = Σ_{i=1}^n [C^{-1} ν(S_i)] [1(x ∈ S_i) ν(dx) ν(S_i)^{-1}]
          = C^{-1} Σ_{i=1}^n 1(x ∈ S_i) ν(dx)
          = C^{-1} #{i : x ∈ S_i} ν(dx).
The next step is to use AR to obtain samples from the target density, which is just proportional to 1. Again it is not necessary to learn the value of ν(S_1 ∪ ··· ∪ S_n) before using AR, as knowledge of the normalizing constant of the density is not necessary.
AR union sets
1) Repeat
2) Draw I so that P(I = i) ∝ ν(S_i)
3) Draw X from ν over S_I
4) Draw C as Bern(1/#{i : X ∈ S_i})
5) Until C = 1
For instance, suppose that the goal is to sample uniformly from the union of the unit circles centered at (0,0), (1/2,0), and (1/2,1/2). Then draw I ∼ Unif({1,2,3}). If I = 1, draw X from the unit circle centered at (0,0). For I = 2 or 3, draw X from the unit circle centered at (1/2,0) and (1/2,1/2) respectively. Accept X as a draw from the union with probability equal to the inverse of the number of unit circles that X lies in (see Figure 2.3).
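The circle example can be sketched in Python. This is a minimal illustration, not code from the text; the function name and the polar-coordinate method for drawing uniformly from a disk are my own choices. Since the three disks have equal area, drawing I with P(I = i) ∝ ν(S_i) reduces to a uniform choice.

```python
import math
import random

def sample_union_of_disks(centers, radius=1.0, rng=random):
    """Draw one point uniformly from a union of equal-radius disks
    using the AR union-of-sets procedure."""
    while True:
        # Step 2: equal areas, so I is uniform over the disk indices.
        cx, cy = rng.choice(centers)
        # Step 3: draw X uniformly from disk I via polar coordinates.
        r = radius * math.sqrt(rng.random())
        theta = 2 * math.pi * rng.random()
        x, y = cx + r * math.cos(theta), cy + r * math.sin(theta)
        # Step 4: count how many disks contain X.
        count = sum(1 for (ax, ay) in centers
                    if (x - ax) ** 2 + (y - ay) ** 2 <= radius ** 2)
        # Step 5: accept with probability 1/count.
        if rng.random() < 1.0 / count:
            return (x, y)

centers = [(0.0, 0.0), (0.5, 0.0), (0.5, 0.5)]
pt = sample_union_of_disks(centers)
```

Each loop iteration is one pass through steps 2–5 of AR union sets; the accepted point is uniform over the union by Lemma 2.3 together with the AR correction.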
2.3.1 Example: AR for disjunctive normal form
An example of the union of sets method in action comes from Boolean logic, which deals with formulas that are either true or false. Begin with some definitions.
Definition 2.1. A propositional variable is a variable that can be either true or false.
Figure 2.3 Drawing uniformly from the union of three circles of equal area. First draw I uniformly from {1, 2, 3}, then the point X uniformly from circle I. Finally, accept X with probability 1 over the number of circles that X falls into. In this example, circle 2 in the lower right was chosen, then the point X that was uniform over this circle falls into two of the circles, so it would be accepted as a draw with probability 1/2.
Definition 2.2. The negation of a propositional variable is false if the variable is true, and true if the variable is false. The negation of x will be denoted here as ¬x. A formula that is of the form x or ¬x is called a literal.
Definition 2.3. The logical operator or is a binary operation on literals, written x ∨ y, and defined as true if either x or y (or both) are true, and false when both x and y are false.
Definition 2.4. The logical operator and is a binary operation on literals, written x ∧ y, and defined as true if both x and y are true, and false otherwise.
Definition 2.5. A formula is in disjunctive normal form (DNF) if it is written as clauses of literals connected by and operators, and each clause is connected by or operators.
For example, (x_1 ∧ x_3) ∨ (x_2 ∧ x_4) is in DNF, but (x_1 ∨ x_3) ∧ (x_2 ∨ x_4) is not.
Definition 2.6. A satisfying assignment for a formula is a configuration where each propositional variable is assigned either true or false in such a way that the formula is true.
For a formula in DNF, at least one clause must be true. Consider the problem of drawing uniformly from the set of satisfying assignments. The basic acceptance/rejection technique would randomly assign to each variable either true or false, and then accept the result if the formula is satisfied.
Unfortunately, the chance that a single clause is satisfied goes down exponentially in the length of the clause. To be precise, for a clause with k literals, there is exactly a 1/2^k chance that the clause evaluates to true. For a single clause that contains all n variables, the chance of evaluating to true is 1/2^n.
Let S_i denote the set of assignments that satisfy clause i. If there are m clauses in the formula, the goal is to sample uniformly from S_1 ∪ ··· ∪ S_m.
The size of each S_i can be calculated as follows. Suppose clause i contains ℓ(i) literals. In a DNF formula all the literals in the clause must be set to true, so there is only one way to set each of the ℓ(i) variables to make the literals true. The other variables can be set in any fashion desired. The result is that #S_i = 2^(n−ℓ(i)). Now simply apply the AR union sets procedure.
Since an assignment can satisfy at most m clauses, the chance of accepting at each step is at least 1/m. Therefore this method draws at most (on average) m assignments before accepting, making this a polynomial time algorithm.
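The DNF sampler can be sketched as follows. This is my own Python illustration, not code from the text; in particular, the representation of a clause as a dict from variable index to required truth value is an assumption made for the sake of a runnable example.

```python
import random

def sample_dnf_assignment(n, clauses, rng=random):
    """Uniform satisfying assignment of a DNF formula via AR union of sets.
    Each clause is a dict {variable index: required truth value}; the
    assignments satisfying clause i form S_i with #S_i = 2^(n - len(clause))."""
    weights = [2 ** (n - len(c)) for c in clauses]
    while True:
        # Draw clause I with P(I = i) proportional to #S_i.
        i = rng.choices(range(len(clauses)), weights=weights)[0]
        # Draw X uniformly from S_I: fix the clause's literals, randomize the rest.
        x = [rng.random() < 0.5 for _ in range(n)]
        for var, val in clauses[i].items():
            x[var] = val
        # Accept with probability 1 / #{clauses satisfied by X}.
        satisfied = sum(1 for c in clauses
                        if all(x[v] == val for v, val in c.items()))
        if rng.random() < 1.0 / satisfied:
            return x

# (x_1 and x_3) or (x_2 and not x_4), with variables indexed from 0
clauses = [{0: True, 2: True}, {1: True, 3: False}]
x = sample_dnf_assignment(4, clauses)
```

Since every returned assignment satisfies its chosen clause, `satisfied` is always at least 1, and by the analysis above the expected number of proposals per accepted sample is at most the number of clauses.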
2.4 Randomized approximation for #P complete problems
The problem of whether a satisfying assignment to a DNF formula exists is an example of the problem class called NP. Intuitively speaking, a yes-no problem is in NP if whenever the answer is yes, there is a proof of that answer that can be checked in polynomial time.
Definition 2.7. A decision problem is a question in a formal system (such as a Turing machine) such that given the values of input parameters, the answer is either yes or no.
Definition 2.8. A decision problem is in the class nondeterministic polynomial (NP) if whenever the answer to the problem is yes, there always exists a certificate that can be used to prove the answer in a number of steps that is polynomial in the input to the problem.
Definition 2.9. A decision problem is in the class polynomial (P) if there is a way to determine if the answer is yes or no in a polynomial number of steps in the input.
The DNF problem is: given a DNF formula, does there exist an assignment for which the formula evaluates to true? This problem is in NP, since when the answer is yes, it is possible to give an assignment that can be checked in polynomial time.
This problem is also trivially in P, since every such formula has a satisfying assignment if and only if there exists a clause that does not contain both a variable and its negation. This condition is easily checked in polynomial time.
One of the most interesting questions in mathematics is: does P = NP? That is, can every problem whose answer can be checked in polynomial time have a solution algorithm that runs in polynomial time?
Problems in Number-P are slightly different. These problems are related to NP, but are not yes-no; instead they count the number of solutions to a problem.
Definition 2.10. A problem is in the class Number-P (abbreviated #P) if it consists of counting the elements of a discrete set, where it is possible to test membership in the set in time polynomial in the input to the problem.
An example of a problem in #P is counting the number of satisfying assignments to a DNF problem. Each individual satisfying assignment can be checked in polynomial time.
In order to understand these complexity classes further, the notion of reducing
problems is needed.
Definition 2.11. Say that decision problem A is polynomial time reducible to decision problem B if given a problem input I_A to problem A, it is possible to build in polynomial time an input I_B of size polynomial in I_A where the answer to problem B with input I_B is the same as the answer to problem A with input I_A.
In other words, A reduces to B if knowing how to solve B gives a method for
solving A.
Definition 2.12. A problem is NP complete if all problems in NP are polynomial time reducible to it. A problem is #P complete if all problems in #P are polynomial time reducible to it.
In an NP complete problem, the goal is to determine if there is at least one solution. In #P complete problems, the goal is to count exactly the number of solutions. Therefore #P complete problems are in one sense more difficult than NP complete problems.
So why bring all this up now? Because counting the number of satisfying assignments to a DNF formula is a #P complete problem.
This means that it is unlikely that an efficient algorithm for counting the number of such solutions exactly will be found. It turns out that having an AR algorithm means that there is an efficient method for estimating the number, though!
The number of draws used by an AR algorithm is geometric with parameter equal to the probability of success. So in basic AR where draws from ν over B are made until the result falls in A, the chance of success is ν(A)/ν(B). Say that T draws are used by the algorithm. If ν(B) is known, then 1/T is an estimate of ν(A)/ν(B), so ν(B)/T is an estimate of ν(A).
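As an illustration (my own, not from the text), the following Python sketch applies this idea to estimate the area of the unit disk A inside the square B = [−1,1]², where ν(B) = 4 is known. It pools the draw counts from many AR runs, so with n successes and T_total draws in all, n·ν(B)/T_total estimates ν(A).

```python
import random

def estimate_measure_AR(n_runs=20000, rng=random):
    """Estimate nu(A) for A the unit disk inside B = [-1,1]^2.
    Each run draws uniformly from B until the point lands in A; over
    n_runs successes using T_total draws, n_runs * nu(B) / T_total
    estimates nu(A)."""
    nu_B = 4.0                      # area of [-1, 1]^2, assumed known
    total_draws = 0
    for _ in range(n_runs):
        while True:
            total_draws += 1
            x, y = 2 * rng.random() - 1, 2 * rng.random() - 1
            if x * x + y * y <= 1.0:
                break
    return n_runs * nu_B / total_draws

est = estimate_measure_AR()   # should be close to the disk area, pi
```

Averaging over many runs rather than using a single 1/T reduces the variance of the estimate; either way the estimator needs only ν(B), never the normalizing constant ν(A) itself.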
Definition 2.13. Let P be a problem with input in I and output in R. Call a computable function f : I × [0,1] → R an (ε,δ)-randomized approximation scheme (ras) if for U ∼ Unif([0,1]),

(∀I ∈ I)  P(|f(I,U)/P(I) − 1| > ε) ≤ δ.

For an (ε,δ)-ras, the chance that the relative error is greater than ε is at most δ.
Consider the problem of estimating the probability of heads for a coin that can be flipped as often as needed. This is equivalent to estimating the mean of Bernoulli random variables X_1, X_2, ... which are identically distributed with mean p. Dagum, Karp, Luby, and Ross gave an (ε,δ)-ras for this problem that uses O(p^{-1} ε^{-2} ln(δ^{-1})) samples [26]. Rather than describe this algorithm here, a more advanced version is given in Section 2.5.
This is exactly what is needed when AR is being used to estimate the solution to a problem in #P. The value of p is ν(A)/ν(B); therefore, AR gives an (ε,δ)-ras that uses O(ν(B) ν(A)^{-1} ε^{-2} ln(δ^{-1})) samples.
Fact 2.1. An (ε,δ)-ras for counting DNF satisfying assignments, where the formula contains n variables and m clauses, exists that uses O((n + ln(m)) m ε^{-2} ln(δ^{-1})) Bernoulli random variables on average.
Proof. The underlying measure ν here is just counting measure on the 2^n possible true-false values for the variables. The target measure is f(x) = 1(x is satisfying), and so Z_f is the number of satisfying assignments. Now g(x) = #{satisfied clauses}, and so Z_g ≤ m Z_f. Finally, c = 1. Hence Z_f/[cZ_g] ≥ 1/m.
Each draw requires choosing I ∼ Unif({1,...,m}) (which requires ln(m) Bernoullis) and then an assignment of up to n variables (which requires n Bernoullis).
2.5 Gamma Bernoulli approximation scheme
As seen in the last section, using AR to approximate integrals or sums reduces to the question of estimating the probability p of heads on a coin that can be flipped as many times as necessary. The Gamma Bernoulli approximation scheme (GBAS) [56] is a simple way of building an estimate p̂ for p with the unique property that the relative error in the estimate does not depend in any way upon the value of p.
First, the estimate. In pseudocode, it looks like this.
GBAS Input: k
1) S ← 0, R ← 0.
2) Repeat
3) X ← Bern(p), A ← Exp(1)
4) S ← S + X, R ← R + A
5) Until S = k
6) p̂ ← (k − 1)/R
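A direct Python translation of the pseudocode might look like this; the function and helper names and the seeded coin are my own choices for the sake of a runnable example, not part of the text.

```python
import random

def gbas(k, flip, rng=random):
    """Gamma Bernoulli approximation scheme: estimate p for a Bern(p)
    coin `flip`, following the pseudocode above. Flips the coin until
    k heads are seen, adding an independent Exp(1) draw to R for every
    flip, then returns (k - 1)/R."""
    S, R = 0, 0.0
    while S < k:
        S += flip()                   # X ~ Bern(p), counted as 0 or 1
        R += rng.expovariate(1.0)     # A ~ Exp(1)
    return (k - 1) / R

# Hypothetical coin with p = 0.3, seeded for reproducibility.
rng = random.Random(12345)
p = 0.3
p_hat = gbas(2000, lambda: rng.random() < p, rng=rng)
```

By the Poisson process facts developed next, R is Gamma-distributed given k, which is what makes the relative error of p̂ free of p.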
To understand what is happening in this algorithm, it helps to introduce the notion
of a Poisson process.
Definition 2.14. Suppose that A_1, A_2, ... are iid Exp(λ). Then the points A_1, A_1 + A_2, A_1 + A_2 + A_3, ... form a Poisson point process of rate λ on [0,∞).
Poisson point processes are often used as models of customer arrivals. For instance, suppose that customers are arriving at a rate of 6 per hour. Since this is a Poisson point process, this does not tell us that exactly 6 arrive every hour; instead it only tells us that on average 6 arrive in the first hour. In the first half-hour, on average 6(1/2) = 3 customers arrive. In the first quarter-hour, on average 6(1/4) = 1.5 customers arrive, and so on.
It could be that in the first hour, no customers actually arrive, or it could be that ten arrive. Any nonnegative integer number of arrivals is possible during this time period, but of course some are more likely than others.
Now suppose that for every customer that arrives, only one in six actually buys something when they are in the store. Then intuitively, if overall customers are arriving at a rate of 6 per hour, and each has a 1/6 chance of making a purchase, then the rate at which customers that buy something arrive is only one per hour. The following fact shows that this intuition is correct. It says that if each point in the Poisson point process is marked with a Bernoulli random variable (a coin flip), then the time from 0 until the first point that gets a 1 is itself an exponential random variable, albeit one with a rate proportional to the old rate times the probability of a 1.
Fact 2.2. Let A_1, A_2, ... be iid Exp(λ), and G ∼ Geo(p). Then A_1 + ··· + A_G ∼ Exp(λp).
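Fact 2.2 can be checked empirically. The following simulation (my own illustration, not from the text) generates A_1 + ··· + A_G directly as the time until the first marked arrival, and compares the empirical mean to the Exp(λp) mean 1/(λp).

```python
import random

def thinned_first_arrival(lam, p, rng):
    """A_1 + ... + A_G for A_i iid Exp(lam) and G ~ Geo(p): the time
    until the first arrival whose coin flip comes up heads."""
    t = 0.0
    while True:
        t += rng.expovariate(lam)   # next interarrival time
        if rng.random() < p:        # mark this arrival with a Bernoulli(p)
            return t

rng = random.Random(7)
lam, p, n = 2.0, 0.25, 40000
mean = sum(thinned_first_arrival(lam, p, rng) for _ in range(n)) / n
# Fact 2.2 predicts Exp(lam * p), whose mean is 1/(lam * p) = 2.0.
```

This is exactly the customer story above: arrivals at rate λ, each kept with probability p, leave kept arrivals at rate λp.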
The easiest way to show this result is with moment-generating functions.
Definition 2.15. The moment-generating function of a random variable X is

mgf_X(t) = E[exp(tX)].    (2.2)