204 APPLICATIONS AND LIMITATIONS OF PERFECT SIMULATION
an estimate of Z
β
for many values of the temperature simultaneously is known as an
omnithermal approximation.
A single run of TPA generates a sequence of points that form a Poisson process
of rate 1 in the interval [ln(
ν
(B)),ln(
ν
(A))]. A nice fact about Poisson processes
is that the union of two PPP (called a superposition) forms a third PPP whose rate is
the sum of the rst two rates.
Theorem 11.1. Let P
1
and P
2
be two Poisson point processes on Ω with rates
λ
1
and
λ
2
respectively. Then for processes such that for a ll x Ω, P(x P)=0,P
1
P
2
is a
PPP on Ω of rate
λ
1
+
λ
2
.
See [105, p. 23] for a proof.
By running TPA k times from
β
A
to
β
B
, and taking the union of the resulting
output, it is possible to create a set of points P that form a Poisson point process of
rate k over [ln(
ν
(B)),ln(
ν
(A))]. Each of these points corresponds to a specic
β
value. Call these
β
values
β
1
>
β
2
> ···>
β
#P
.
For
β
[
β
A
,
β
B
], retaining only the points that have
β
i
β
gives a Poisson point
process of rate k over [ln(
ν
(B
β
)),ln(
ν
(A))]. (See the discussion of thinning from
Section 7.2 to see why.) Let n(
β
)=sup{i :
β
i
β
} count these points. Then
ˆa(
β
)=
ν
(A)exp
n(
β
)
k
(11.6)
gives an estimate for
ν
(B
β
) for every value of
β
.
It should be noted that TPA is not the only way to construct an o mnithermal
approximation. Bridge sampling [40] and nested sampling [117] also allow for om-
nithermal sampling. The difference is with TPA it is possible to explicitly bound the
probability that th e error is large for all values of
β
simultaneously.
Lemma 11.3. Let
ε
(0,1/2) and I =[ln(
ν
(B)),ln(
ν
(A))]. Then,
P
!
exp(
ε
) sup
β
I
ˆa(
β
)
ν
(B
β
)
exp(
ε
)
"
1 2exp
k
ε
2
2[ln(
ν
(B)/
ν
(A)) +
ε
]
.
(11.7)
For a proof see [57].
There are many applications for this type of estimate.
11.3.1 Maximum likelihood estimators for spatial processes
Suppose that the statistical model for data is that X is a draw from the Ising model
for some
β
.Tond the maximum likelihood estimator (MLE) for
β
over some range
[a,b], it is necessary to compute
arg max
exp(
β
H(X ))
Z
β
. (11.8)
One way to obtain a
β
where exp(
β
(H(X )))/Z
β
is close to the maximum value is
to use an omnithermal approximation to obtain Z
β
for all
β
> 0.
OMNITHERMAL APPROXIMATION 205
a
b
c
a
c
Figure 11.2 Matern type III process where R = 0.35,dist(a,b)=0.3,dist(a, c)=0.4,
dist(b,c)=0.5,t
a
= 0.63,t
b
= 0.87,t
c
= 0.22. Point c is born rst, then a, but b is not because
it is too close to a.
11.3.2 Doubly intractable distributions with one spatial parameter
The doubly intractable distributions from Section 11.1 can also be dealt with using
TPA and omnithermal sampling when the spatial component only has one varying
parameter
β
. In this case the posterior becomes
f
β
|Y =y
(b) f
β
(b)exp(
β
H(y))/Z
β
. (11.9)
With an omnithermal approximation of Z
β
in hand, it is possible to use any numerical
integration method to nd the normalizing constant for f
β
|Y =y
approximately, as well
as nd the posterior mode and construct credible intervals.
11.3.3 Example: Mat
´
ern type III processes
In the examples looked at previously, the integral (or sum) was the normalizing con-
stant that appeared in the denominator of the density. In this section an example is
given of a model where the normalizing constant is known, and it is the weight in the
numerator that has a difcult high-dimensional integral.
Point processes such as the Strauss model operate by giving a density with re-
spect to the standard Poisson point process. Essentially, they are upweighting or
downweighting point processes that have certain properties.
An alternative way of modeling point processes is constructive—give an algo-
rithm for generating the point process. This can also be used to generate process
with certain properties.
For instance, consider the Mat´ern type I II process [88] illustrated in Figure 11.2
This model (in its simplest fo rm) has two parameters, the in tensity
λ
(0,) and a
radius R [0,), together with a reference measure
ν
(·) on S.
First, generate a Poisson point process P over the original space S using rate
λ
·
ν
(·). Next, generate for each point x
i
a birth time t
i
Unif([0,1]),wherethet
i
are
all independent.
Using a city metaphor, the birth time represents the time that the city is founded.
Once a city is founded, it immediately claims all the area within distance R of the
center of the city. If later on a city attempts to be founded within distance R of the
rst city, it is not allowed and the later city is not added to the process.
This continues until all o f the points have been either added or passed over. This
206 APPLICATIONS AND LIMITATIONS OF PERFECT SIMULATION
1
0
0 10
Figure 11.3 Example of a shadow region for S =[0,10],x
1
= 2.8,x
2
= 5.2,x
3
= 8.0,t
1
= 0.5,
t
2
= 0.65,t
3
= 0.7,R= 1.4. If there are only hidden points in the shadowed region, they will
not be seen at time 1.
is an example of a repulsive point process, since points will be farther apart than they
would be otherwise if close points were not being eliminated.
The d ata that is actually seen is the set of points at time t = 1; earlier points that
were eliminated are not seen. Note that the set of all points (both seen and unseen)
together with their times forms a Poisson point process on S ×[0, 1] with inten sity
equal to the product measure of
λ
·
ν
(·) and Lebesgue measure. See Figure 11.3 for
an illustration.
Denition 11.1. Consider a Poisson point process over S ×[0,1].Thenlet
D
R
(x,t)={(x
i
,t
i
: (x
j
x)(dist(x
j
,x
i
) < R and t
j
< t
i
} (11.10)
be the shadow of the conguration (x,t).If(x
i
,t
i
) D
R
(x,t), call point i hidden,
otherwise it is seen.Let
A
R,
λ
(x,t)=
λ
·(
ν
×m)(D
R
(x,t)) (11.11)
be the expected number of hidden points given the seen points in (x,t).
In general, the data consists of the seen point locations (without time stamps).
The hidden points are completely unknown. It is possible to write down a formula
for the density of the Mat´ern type III process with respect to a standard Poisson point
process.
Lemma 11.4 (Theorem 2.1 of [67]). The d ensity function of the seen points (lo-
cations and times) created using intensity
λ
·
ν
(·) ×m and exclusion radius R with
respect to a PPP with intensity measure
ν
(·) ×m over S ×[0,1] is
f
seen
(x
s
,t
s
)=1

min
{p
1
,p
2
}⊂x
s
dist(p
1
, p
2
)
> R
λ
#x
s
exp(A
R,
λ
(x
s
,t
s
)
ν
(S)(
λ
1)).
(11.12)
OMNITHERMAL APPROXIMATION 207
See [67] for the proof.
Of course, the data only consists of the locations of the seen points x, and not the
time stamps t. Integrating out the possible times gives the marginal density
f
seen
(x|R,
λ
(·)) = 1(min
i= j
dist(x
i
,x
j
) > R)exp(
λ
·
ν
(S))
t[0,1]
#x
exp(A
R,
λ
(x,t)).
(11.13)
This is also (given parameter s R and
λ
) the likelihood of the data.
It is this last factor, the integral with dimension equal to the number of points in
the data, that requires the use of Monte Carlo and perfect simulation to solve.
The rst step is to build a monotonic update function where the distribution with
unnormalized density f
target
(t)=exp(A
R,
λ
(x,t)) is stationary. Usually, Metropolis-
Hastings Markov chains are not monotonic, but in this case they are. It operates
as follows. At each step, choose a point x
i
x uniformly at random. Then draw u
uniformly from [0,1],lett
i
= u,andt
j
= t
j
for all j = i.
Then if u t
i
, then accept and change t to t
. Otherwise, accept t
as the new set of
time stamps with probability exp(A
R,
λ
(x,t
))/exp(A
R,
λ
(x,t)) = exp((A
R,
λ
(x,t)
A
R,
λ
(x,t
)).
Even in one dimension, calculating A
R,
λ
(x,t) A
R,
λ
(x,t
) can be difcult, but in
two or higher dimensions it is a very difcult task. Quantities such as the area of
intersection of three or more circles can show up in the expression. Once again, the
Bernoulli factory of Beskos et al. [11] can help.
Generate a Poisson point process in D
R
(x,t) D
R
(x,t
) by generating a PPP
of constant intensity
λ
over B
R
(x) ×[t
,1] and thinning any points that appear in
D
R
(x,t
) or which have time stamp greater than t. This only involves nding the
distance of each point in the PPP to each point in x, and so it can be done quickly.
The chance that the number of thinned points is 0 is exactly exp((A
R,
λ
(x,t)
A
R,
λ
(x,t
))), and so it is possible to draw a Bernoulli random variable with this mean
without actually calculating the mean explicitly.
Now suppose for any other set of times s t, the same PPP P is used to determ ine
if s
i
u.Ifu > t
i
,thenu is accepted for both. If s
i
< u t
i
,thens
i
= u and t
i
might
accept the move to u or not. Either way, after the step s
i
t
i
.
If u < s
i
,thenift
i
accepts the move down to u, it is because the thinning pro-
cess over D
R
(x,t) D
R
(x,t
) had zero points. But D
R
(x,s) D
R
(x,t),soD
R
(x,s)
D
R
(x,t
) will also have zero points, and so s
i
will also be moved to u. Hence the
update function is monotonic.
In this pseudocode for this update, i ∈{1,...,#x}, u [0, 1],andP is a subset of
points in B
R
×[0,1].
Mat´ern Type III update Input: (x,t), i, u, P Output: (x,t)
1) Let t
i
u and for all j = i, t
j
t
j
2) Let N #{(v
j
,s
j
) P : s
i
t
i
and (v
j
,s
j
) /D
R
(x,t
)}
3) If N = 0thent t
The maximum state is all 1’s, and the minimum is all 0’s. When used with
Monotonic
coupling from the past, this returns samples from the unnormal-
ized density f
target
(t|)=exp(A
R,
(x,t)).
208 APPLICATIONS AND LIMITATIONS OF PERFECT SIMULATION
So now let us see how the ability to TPA can be employed to approximate Z
=
t
f
target
(t) dt. First note that Z
0
= 1, so the value of the normalizing constant is known
for at least one value of the parameter. Since A
R,
(x,t) is an increasing function of
,soisZ
.SotouseTPA, just use a single a uxiliary variable as with the Gibbs
distributions, y [0, exp(A
R,
λ
(x,t))].ThenZ
λ
is just the Lebesgue measure of B
=
{(t,y) : t [0, 1]
#x
,y [0,exp(A
R,
(x,t))]}.
Note this is one of the cases mentioned earlier when we know the measure of
A = B
0
, but are looking for the measure of B = B
λ
.
As mentioned earlier, omnithermal sampling can be used to approximate
f
seen
(x|R,
λ
) over a range of
λ
values. This can then be used to nd the MLE for
the
λ
value.
The description of Mat´ern Type III processes given here was over a set S of nite
size. As with other spatial point processes, both the model and the perfect simulation
procedure can be extended to a nite window looking onto an innite space. See [98]
for details.
11.4 Running time of TPA
Let r =
ν
(B)/
ν
(A). Then from Lemma 11.3, to get an estimate ˆr of r such that
e
ε
< ˆr/r < e
ε
with probability at least 1
δ
can be accomplished by setting k =
2
ε
2
[ln(r)+
ε
]ln(2
δ
1
).
At rst glance, that might appear strange, since this says that to estimate r to
within a factor of e
ε
, it is necessary to set k to a value that depends on r,thevery
thing we are trying to estimate!
To solve this dilemma, run two phases of TPA.Therst phase is quick, and gives
an estimate ˆr
1
such that |ln(ˆr
1
)ln(r)|≤(1/2)(1 +
1 + 4ln(r) with probability at
least
δ
/2. The second phase then draws the estimate ˆr using k derived from ˆr
1
.
Two phase TPA Input:
ε
,
δ
Output: ˆr
1) N 0, k
1
←2ln(4
δ
1
)
2) Repeat k
1
times
3) N N + TPA
4) ˆr
1
N/k
1
, k
2
←2
ε
2
ln(4
δ
1
)(ln(ˆr
1
)+
ln(ˆr
1
)+1 + 1 +
ε
), N 0
5) Repeat k
2
times
6) N N + TPA
7) ˆr N/k
2
Lemma 11.5. The output of Two phase TPA satises
P(e
ε
ˆr/r e
ε
) > 1
δ
. (11.14)
The expected number of samples drawn is bounded above by
(ln(r)+1)
2ln(4
δ
)
1
+ 2
ε
2
ln(4
δ
1
)(ln(r)+
ln(r)+1 + 1 +
ε
)+1
.
(11.15)
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset