Moment-generating functions are unique when they are defined over a nontrivial interval of $t$ values.
Lemma 2.4 (Moment-generating function uniqueness). Suppose $\operatorname{mgf}_X(t) = \operatorname{mgf}_Y(t)$ for all $t \in [a,b]$ where $a < 0 < b$. Then $X \sim Y$.
The important idea needed here is that the sum of a random number of copies of a random variable has moment-generating function equal to a composition of moment-generating functions.
Lemma 2.5. If $Y \in \{0,1,2,\ldots\}$ and $X_1, X_2, \ldots$ are iid, independent of $Y$, and have finite moment-generating functions for $t \in [a,b]$, then $X_1 + \cdots + X_Y$ has moment-generating function
\[
\operatorname{mgf}_{X_1+\cdots+X_Y}(t) = \operatorname{mgf}_Y\!\left(\ln \operatorname{mgf}_{X_i}(t)\right). \tag{2.3}
\]
Now to the proof of our fact.
Proof. A geometric random variable $G$ with mean $1/p$ has $\operatorname{mgf}_G(t) = pe^t/(1 - (1-p)e^t)$, at least when $t < -\ln(1-p)$. The mgf of an exponential random variable with rate $\lambda$ is $\operatorname{mgf}_A(t) = \lambda(\lambda - t)^{-1}$ for $t < \lambda$. So the mgf of $A_1 + \cdots + A_G$ is
\[
\operatorname{mgf}_{A_1+\cdots+A_G}(t)
  = \frac{p\lambda(\lambda - t)^{-1}}{1 - (1-p)\lambda(\lambda - t)^{-1}}
  = \frac{p\lambda}{p\lambda - t}.
\]
But this is just the mgf of an exponential of rate $p\lambda$, which completes the proof.
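As a quick numerical sanity check of this fact (my example, not from the text; the parameter values are arbitrary), the following compares a Geometric($p$) sum of Exp($\lambda$) variables with direct Exp($p\lambda$) draws:

import numpy as np

rng = np.random.default_rng(0)
p, lam = 0.3, 2.0

# G ~ Geometric(p) counts flips up to and including the first heads, E[G] = 1/p.
G = rng.geometric(p, size=50_000)
sums = np.array([rng.exponential(1 / lam, size=g).sum() for g in G])

# Direct draws from Exp(p * lam) for comparison.
direct = rng.exponential(1 / (p * lam), size=50_000)

print(sums.mean(), direct.mean())   # both near 1/(p*lam) = 1.667
print(sums.std(), direct.std())     # also near 1.667 (mean = std for exponentials)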
This procedure of marking points in the Poisson point process with independent coin flips is called thinning. This concept is discussed in greater detail in Chapter 7.
Now consider the GBAS algorithm from earlier. It does this thinning procedure
starting from a rate of 1, and thinning using the p-coin flips. Therefore the new rate of
the thinned process is p. This is done k times, giving k iid Exp(p) random variables
that are then added together.
When you add independent exponentials together, you get a gamma distribution
with shape parameter equal to the number of exponentials added and a rate parameter
equal to the rate of the exponentials. This establishes the following fact.
Fact 2.3. In GBAS, $R \sim \operatorname{Gamma}(k, p)$.
When a gamma distributed random variable is scaled by a constant, the rate changes by the inverse of the constant. So $R/(k-1) \sim \operatorname{Gamma}(k, p(k-1))$. The actual output of GBAS is $(k-1)/R$, which is the multiplicative inverse of a gamma distributed random variable. This gives a distribution called the inverse gamma.
Fact 2.4. The output $\hat{p}$ of GBAS is inverse gamma with shape parameter $k$ and scale parameter $p(k-1)$. The estimate is unbiased: $\mathbb{E}[\hat{p}] = p$. Also, $\hat{p}/p \sim \operatorname{InvGamma}(k, k-1)$, so the relative error $\hat{p}/p - 1$ has a shifted inverse gamma distribution.
Proof. The mean of $X \sim \operatorname{InvGamma}(\alpha, \beta)$ is $\beta/(\alpha - 1)$, so $\mathbb{E}[\hat{p}] = p(k-1)/(k-1) = p$. Dividing $\hat{p}$ by $p$ is the same as scaling it by $1/p$, so the scale parameter is multiplied by the same amount.
The key point of the fact is that $\hat{p}/p - 1$ has a known distribution: the value of $p$ does not affect it in any way! The $\operatorname{InvGamma}(k, k-1)$ distribution is concentrated around 1, and is more tightly concentrated as $k$ increases. So GBAS can be used to obtain an $(\varepsilon, \delta)$-ras as follows.
GBAS ras   Input: $\varepsilon$, $\delta$
1) Let $k \leftarrow \min\{k : \mathbb{P}(|X - 1| > \varepsilon) < \delta$ for $X \sim \operatorname{InvGamma}(k, k-1)\}$
2) $S \leftarrow 0$, $R \leftarrow 0$
3) Repeat
4)   $X \leftarrow \operatorname{Bern}(p)$, $A \leftarrow \operatorname{Exp}(1)$
5)   $S \leftarrow S + X$, $R \leftarrow R + A$
6) Until $S = k$
7) $\hat{p} \leftarrow (k-1)/R$
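A runnable sketch of this procedure (my own illustration, not code from the text; the names gbas_ras and choose_k are hypothetical) using scipy's invgamma for step 1:

import numpy as np
from scipy.stats import invgamma

def choose_k(eps, delta):
    """Smallest k with P(|X - 1| > eps) < delta for X ~ InvGamma(k, scale=k-1)."""
    k = 3
    while True:
        dist = invgamma(k, scale=k - 1)
        # P(|X - 1| > eps) = P(X < 1 - eps) + P(X > 1 + eps)
        if dist.cdf(1 - eps) + dist.sf(1 + eps) < delta:
            return k
        k += 1

def gbas_ras(coin, eps, delta, rng=None):
    """GBAS: flip the p-coin until k heads, clocking each flip with an Exp(1)."""
    rng = rng or np.random.default_rng()
    k = choose_k(eps, delta)
    s, r = 0, 0.0
    while s < k:
        s += coin(rng)                # X ~ Bern(p)
        r += rng.exponential(1.0)     # A ~ Exp(1)
    return (k - 1) / r

# Example: estimate p = 0.3 with 10% relative error, 95% confidence.
rng = np.random.default_rng(1)
print(gbas_ras(lambda g: g.random() < 0.3, eps=0.1, delta=0.05, rng=rng))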
2.5.1 Application: exact p-values
As an application of this method, consider the problem of estimating exact p-values.
Recall that a statistical model for data is a probability distribution such that the data
is believed to be a draw from the distribution. For instance, suppose an observer
believes that the number of taxicabs in a town is 200, and that the number of a ran-
domly spotted taxi is equally likely to be any integer from 1 to 200. Suppose that the
observer sees one taxi, and it is numbered 14.
The observer wishes a principled method for determining if their statistical model
should be rejected based upon the data. One such method for doing so is called the
p-value approach. Let S be any statistic, that is, a function of the data. Then if D
is the data, and the hypothesis to be tested (often called the null hypothesis) is that $D \sim \pi$, then the p-value is just the probability that $S(X)$ is at least as extreme as $S(D)$ where $X \sim \pi$.
For instance, in the taxicab example, it would be unusual to see a low-numbered
taxi if the number of cabs in the town is very large. Therefore, let $S(x) = x$, and consider $S(X)$ to be at least as extreme as $S(D)$ if $S(X) \le S(D)$.
Then $S(D) = 14$, and so the p-value for the hypothesis that there are 200 cabs and that the likelihood of seeing each cab number is the same is $\mathbb{P}(S(X) \le 14) = 14/200 = 7\%$.
In general, if $X_1, X_2, \ldots \sim \pi$ can be drawn using a perfect simulation algorithm, then the p-value is exactly the chance that $S(X)$ is at least as extreme as $S$ applied to the data, and so constitutes the exact type of experiment where GBAS can be applied.
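For instance (my illustration, reusing the hypothetical gbas_ras sketch from above), the taxicab p-value $\mathbb{P}(S(X) \le 14)$ could be estimated by giving GBAS a coin that draws a fresh $X \sim \pi$ on each flip:

# Coin for the taxicab p-value P(S(X) <= 14) under pi = Unif{1,...,200}.
def taxi_coin(rng):
    x = rng.integers(1, 201)    # a perfect draw X ~ pi
    return x <= 14              # heads iff S(X) is at least as extreme as S(D)

print(gbas_ras(taxi_coin, eps=0.1, delta=0.05))   # close to 0.07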
2.5.1.1 Testing for Hardy-Weinberg equilibrium
An allele is a variant of a gene in an organism. For instance, one allele might cause fur to become brown, while a different allele of a gene causes fur to be black in color.
If a gene is in Hardy-Weinberg equilibrium, the fitness (propensity to reproduce) of the organism is independent of which allele the organism carries. Such alleles will
not be selected either for or against in the population.
A diploid organism carries two copies of each gene, each of which could come
from different alleles. The order of the alleles does not matter. Hence if the two alleles for a gene are denoted A and a, the genotype of the organism is either AA, Aa, or aa. In Hardy-Weinberg equilibrium, the frequency of genotype AA will be the square of the frequency of allele A in the population. The frequency of Aa will be twice the frequency of A times the frequency of a, and that of aa will be the square of the frequency of a in the population.
Consider the problem of testing whether a population is in Hardy-Weinberg equilibrium. For $m$ alleles $A_1, \ldots, A_m$, the data consists of the number of sampled individuals $D_{ij}$ with genotype $A_iA_j$ for $1 \le i \le j \le m$.
Given data table $D$, Guo and Thompson [43] suggested using the statistic $S(D) = \mathbb{P}(X = D)$, where $X$ is a random draw from tables of genetic data from populations in Hardy-Weinberg equilibrium. When there are $n$ individuals, this makes the statistic
\[
S(D) = \frac{n!\,\prod_{i=1}^{m} n_i!}{(2n)!\,\prod_{1 \le i \le j \le m} D_{ij}!}\, 2^{\sum_{i<j} D_{ij}}, \tag{2.4}
\]
where $n_i = 2D_{ii} + \sum_{j \ne i} D_{ij}$ is the number of copies of allele $A_i$ among the $2n$ sampled alleles.
The exact p-value is then $\mathbb{P}(S(Y) \le S(D))$, where $Y$ is a perfect draw from tables under the Hardy-Weinberg equilibrium model. The Guo and Thompson method for directly sampling from such tables was later improved in [59].
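A direct transcription of (2.4) as reconstructed above (my sketch, not code from the text; exact arithmetic via Python's fractions module):

from fractions import Fraction
from math import factorial

def guo_thompson_stat(D, m):
    """S(D) = P(X = D) under Hardy-Weinberg equilibrium.

    D maps genotype pairs (i, j) with i <= j to counts; m is the number of alleles.
    """
    n = sum(D.values())                    # number of individuals
    # n_i = copies of allele i among the 2n sampled alleles
    allele = [0] * m
    for (i, j), c in D.items():
        allele[i] += c
        allele[j] += c                     # homozygotes (i == j) add 2c total
    hetero = sum(c for (i, j), c in D.items() if i != j)
    num = factorial(n) * 2**hetero
    for a in allele:
        num *= factorial(a)
    den = factorial(2 * n)
    for c in D.values():
        den *= factorial(c)
    return Fraction(num, den)

# Two alleles, 10 individuals: 3 AA, 4 Aa, 3 aa.
print(guo_thompson_stat({(0, 0): 3, (0, 1): 4, (1, 1): 3}, m=2))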
2.5.1.2 Testing for differential gene expression
A related task is to test for how gene expression changes under environmental con-
ditions. Each gene encodes proteins that are produced by the cell: the amount of
protein produced can be affected by the environment of the cell. Consider an experi-
ment (such as in [122]) where the expression rate of a gene is measured conditioned
on different environmental factors.
The data is a matrix where $K_{ij}$ is the number of experiments under condition $j$ that resulted in expression of gene $i$. If the gene is equally likely to be expressed under all conditions, then (as in the Hardy-Weinberg example), it is easy to generate data tables in linear time.
The test statistic is once again the probability of the table being generated, and the p-value is the probability that a randomly drawn table has test statistic at most equal to the test statistic applied to the data. Once again, GBAS can be used to estimate this p-value with exact confidence intervals.
2.6 Approximate densities for perfect simulation
The key step in AR for sampling from density $f$ using density $g$ is when, for $X \sim g$, the $C \sim \operatorname{Bern}(f(X)/[cg(X)])$ coin is flipped. However, it is not necessary to know $f(X)/[cg(X)]$ exactly in order to flip the $C$ coin.
To draw $C \sim \operatorname{Bern}(p)$, draw $U \sim \operatorname{Unif}([0,1])$, and let $C = \mathbb{1}(U \le p)$. Now suppose that for all $n$, $a_n \le p \le b_n$. Then it is not necessary to know $p$ exactly in order to determine if $U \le p$.
To be precise, suppose that $a_1 \le a_2 \le a_3 \le \cdots$ and $b_1 \ge b_2 \ge b_3 \ge \cdots$ have $\lim a_i = \lim b_i = p$. Then draw $U \sim \operatorname{Unif}([0,1])$. If $U \le \lim a_i = p$, then $C = 1$; if
$U > \lim b_i = p$, then $C = 0$. Fortunately, it is not necessary to compute the entire infinite sequences $\{a_i\}$ and $\{b_i\}$ in order to determine where $U$ falls with respect to $p$. If $U \le a_n$ for some $n$, then certainly $U \le p$. On the other hand, if $U > b_n$, then definitely $U > p$, and the algorithm can stop.
Bounded mean Bernoulli   Input: $p$, $\{a_n\}$, $\{b_n\}$   Output: $C \sim \operatorname{Bern}(p)$
1) $n \leftarrow 1$, $U \leftarrow \operatorname{Unif}([0,1])$
2) While $U \in (a_n, b_n]$ do
3)   $n \leftarrow n + 1$
4) $C \leftarrow \mathbb{1}(U \le a_n)$
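A direct Python translation of this pseudocode (my sketch; passing the bound sequences as functions of $n$ is a hypothetical interface):

import random

def bounded_mean_bernoulli(a, b, rng=random):
    """Draw C ~ Bern(p), where a(n) <= p <= b(n) for all n and both tend to p.

    Only finitely many bound evaluations are needed with probability 1.
    """
    u = rng.random()          # U ~ Unif([0,1])
    n = 1
    while a(n) < u <= b(n):   # bounds still too loose to classify U
        n += 1
    return int(u <= a(n))     # exited with U <= a_n (C = 1) or U > b_n (C = 0)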
While the algorithm only needs to generate one uniform random variable $U$, it might take quite a few floating point comparisons to find $C$.
Lemma 2.6. The expected value of $n$ is $1 + \sum_{i=1}^{\infty}(b_i - a_i)$.
Proof. The value of $n$ is 1 initially. For fixed $i$, the probability that $n > i$ is just $\mathbb{P}(U \in (a_i, b_i]) = b_i - a_i$. The result then follows from the tail sum formula for expectation of nonnegative integer valued random variables.
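As a quick check of the lemma (my example, using the bounded_mean_bernoulli sketch above): with $p = 1/2$ and bounds $a_n = 1/2 - 2^{-n}$ and $b_n = 1/2 + 2^{-n}$, the lemma gives $\mathbb{E}[n] = 1 + \sum_{i \ge 1} 2^{1-i} = 3$.

p = 0.5
draws = [bounded_mean_bernoulli(lambda n: p - 2.0**-n, lambda n: p + 2.0**-n)
         for _ in range(100_000)]
print(sum(draws) / len(draws))   # close to 0.5; about 3 bound checks per draw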
Therefore, in order for this approach to be efficient, it is necessary to have bounds $\{a_i\}$ and $\{b_i\}$ that approach each other rapidly.
2.6.1 Example: First passage time of Brownian motion
Standard Brownian motion $\{B_t\}_{t \ge 0}$ is a stochastic process that starts at 0 (so $B_0 = 0$), is continuous with probability 1, has independent increments (so for any $a < b < c < d$, $B_d - B_c$ and $B_b - B_a$ are independent), and normal increments (so for any $a < b$, $B_b - B_a \sim \mathsf{N}(0, b-a)$). See Chapter 10 for a formal definition and a more detailed description of the properties of Brownian motion.
Because the map t → B
t
is continuous with probability 1, it makes sense to de-
ne stopping times such as T
1
= inf{t : B
t
∈{1 , 1}}. Stopping times of this form
are known as a rst passage time. Simulation of T
1
arises in mathematical nance
applications when pricing barrier-options.
Lemma 2.7. The density of $T_1$ is given by the following alternating infinite series:
\[
g_{T_1}(t) = \frac{1}{\sqrt{2\pi t^3}}\, e^{-1/(2t)} + \sum_{j=1}^{\infty} \frac{(-1)^j}{\sqrt{2\pi t^3}}\, c_j,
\]
\[
c_j = (2j+1)\exp\!\left(-\frac{(2j+1)^2}{2t}\right) - (2j-1)\exp\!\left(-\frac{(2j-1)^2}{2t}\right).
\]
Burq and Jones [19] showed that
\[
g_{T_1}(t)/[c\, g_{\lambda,b}(t)] \le 1 \tag{2.5}
\]
for a gamma density $g_{\lambda,b}(x) = \lambda^b x^{b-1} e^{-\lambda x}/\Gamma(b)$ with $c = 1.25$, $b = 1.09$, $\lambda = 1.24$.
For $t \in [0,1]$, let $a_1(t) = (2\pi t^3)^{-1/2}\exp(-1/(2t))$. For $i \ge 1$, let
\[
b_i(t) = a_i(t) + (2\pi t^3)^{-1/2}(2i-1)\exp\!\left(-\frac{(2i-1)^2}{2t}\right)
\]
and
\[
a_{i+1}(t) = b_i(t) - (2\pi t^3)^{-1/2}(2i+1)\exp\!\left(-\frac{(2i+1)^2}{2t}\right).
\]
Since $\max_{x \ge 0} x\exp(-x^2/(2t))$ occurs at $x = \sqrt{t}$ and the function is strictly decreasing thereafter, for $2i - 1 > \sqrt{t}$ the sequences $a_i$ and $b_i$ become monotone and oscillate around the target value. Moreover, the difference $b_i - a_i$ goes down faster than exponentially, therefore only a constant number of steps (on average) are necessary to generate one draw from the distribution using Bounded mean Bernoulli.
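Putting the pieces together, here is a sketch of the resulting sampler (my illustration, not the Burq-Jones implementation): draw a candidate $t$ from the gamma proposal, then decide the $\operatorname{Bern}(g_{T_1}(t)/[c\,g_{\lambda,b}(t)])$ coin by refining partial sums of the series. Using consecutive partial sums as brackets for $g_{T_1}(t)$ once the terms are shrinking (in the spirit of the $2i-1 > \sqrt{t}$ condition above) is a simplifying assumption of this sketch, standing in for the $a_i, b_i$ schedule.

import math
import random

C, B, LAM = 1.25, 1.09, 1.24   # c, b, lambda from Burq and Jones [19]

def g_partial(t, J):
    """Partial sum of the series for g_{T_1}(t) through the c_J term."""
    d = (2 * math.pi * t**3) ** -0.5
    s = d * math.exp(-1 / (2 * t))
    for j in range(1, J + 1):
        c_j = ((2*j + 1) * math.exp(-(2*j + 1)**2 / (2*t))
               - (2*j - 1) * math.exp(-(2*j - 1)**2 / (2*t)))
        s += (-1)**j * d * c_j
    return s

def g_proposal(t):
    """Gamma density g_{lambda,b}(t) with shape B and rate LAM."""
    return LAM**B * t**(B - 1) * math.exp(-LAM * t) / math.gamma(B)

def sample_T1(rng=random):
    """AR for the first passage time T_1, refining series bounds as needed."""
    while True:
        t = rng.gammavariate(B, 1 / LAM)       # candidate from the proposal
        u = rng.random() * C * g_proposal(t)   # accept iff u <= g_{T_1}(t)
        j = int(math.sqrt(t)) + 2              # terms shrink from here on
        while True:
            lo, hi = sorted((g_partial(t, j), g_partial(t, j + 1)))
            if u <= lo:
                return t                       # certainly below the density
            if u > hi:
                break                          # certainly above: reject
            j += 1                             # bounds too loose: refine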
2.7 Simulation using Markov and Chernoff inequalities
In full generality, any upper bound on the size or number of a set of objects is equiv-
alent to some AR-style algorithm.
As an example of this, Bucklew [18] showed how Markov's inequality and Chernoff inequalities [23] could actually be viewed as AR algorithms.
2.7.1 Markov's inequality as an AR algorithm
Begin with Markov’s inequality.
Lemma 2.8 (Markov's Inequality). Consider a random variable $X$ that is nonnegative with probability 1 and has finite expectation $\mathbb{E}[X]$. Then for all $a > 0$,
\[
\mathbb{P}(X \ge a) \le \mathbb{E}[X]/a. \tag{2.6}
\]
From a simulation perspective, Markov's inequality allows us to sample from $[X \mid X \ge a]$. In other words, if $X$ has density $f_X(x)$, then the goal is to sample from the unnormalized density $f_X(x)\mathbb{1}(x \ge a)$. To use AR, use the unnormalized density $x f_X(x)$. Then $f_X(x)\mathbb{1}(x \ge a)/[x f_X(x)]$ is 0 when $x < a$, and $1/x$ when $x \ge a$, and so is never greater than $1/a$. This gives rise to the following algorithm.
AR Markovs Inequality   Input: density $f$, $a$   Output: $X \sim f$ given $X \ge a$
1) Repeat
2)   Draw $X$ from the unnormalized density $x f(x)$
3)   Draw $C \sim \operatorname{Bern}(a\mathbb{1}(X \ge a)/X)$
4) Until $C = 1$
Note the normalizing constant for $x f(x)$ is just $\int x f(x)\, \nu(dx) = \mathbb{E}[X]$. So the chance of accepting is just
\[
\int_{x \ge 0} \frac{x f_X(x)}{\mathbb{E}[X]} \cdot \frac{a \mathbb{1}(x \ge a)}{x}\, \nu(dx)
  = \int_{x \ge a} \frac{a f_X(x)}{\mathbb{E}[X]}\, \nu(dx)
  = \frac{a\,\mathbb{P}(X \ge a)}{\mathbb{E}[X]}. \tag{2.7}
\]
Since this chance of accepting must be at most 1, this fraction is at most 1, which proves Markov's inequality.
Using the ideas of Section 2.4, this can also then be used to build an $(\varepsilon, \delta)$-ras for $\mathbb{P}(X \ge a)$ in $O([\mathbb{E}[X]/\mathbb{P}(X \ge a)]\varepsilon^{-2}\ln(\delta^{-1}))$ samples.
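As a concrete instance (my sketch, not from the text): take $f$ to be the Exp(1) density, so $x f(x)$ is the Gamma(2,1) density and $\mathbb{E}[X] = 1$. The sampler below draws from $[X \mid X \ge a]$, and by (2.7) its acceptance chance is $a\mathbb{P}(X \ge a) = a e^{-a}$.

import random

def ar_markov_exponential(a, rng=random):
    """Sample [X | X >= a] for X ~ Exp(1) via the Markov-inequality AR scheme."""
    while True:
        x = rng.gammavariate(2.0, 1.0)       # density x f(x) = x e^{-x} = Gamma(2,1)
        if rng.random() < a * (x >= a) / x:  # C ~ Bern(a 1(x >= a)/x)
            return x

# By memorylessness, E[X | X >= a] = a + 1 for Exp(1); check with a = 2.
rng = random.Random(3)
draws = [ar_markov_exponential(2.0, rng) for _ in range(20_000)]
print(sum(draws) / len(draws))   # close to 3.0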