102 COALESCENCE ON CONTINUOUS AND UNBOUNDED STATE SPACES
6.3.1 Random walk with normals
Now examine a different random walk, one where φ₂(x,Z) = x + Z, where Z is a standard normal random variable. At first glance it does not appear that the same technique can be applied, but in fact adding Z can be transformed into a process where a uniform is added.
The basic idea is to write the normal distribution as a mixture of uniform random variables. That is, build random variables A and B so that if [X|A,B] ∼ Unif([A,B]), then X ∼ N(0,1).
The easiest way to do this is to begin by generating Z ∼ N(0,1). Then given Z, create the auxiliary variable [Y|Z] ∼ Unif([0, exp(−Z²/2)]). At this point (Z,Y) is uniform over the region under the unnormalized density f_X(x) = exp(−x²/2). Now turn it around and generate [W|Y] ∼ Unif({x : 0 ≤ y ≤ exp(−x²/2)}). Then W ∼ N(0,1). But before that happens, [W|Y] ∼ Unif([−√(−2 ln(Y)), √(−2 ln(Y))]). See Figure 6.2 for an illustration.
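This two-step construction is easy to sanity-check numerically. Below is a minimal Python sketch (the function name is mine, not the book's) that draws W through the uniform mixture; its output should match a standard normal.

```python
import math
import random

def normal_via_uniform_mixture(rng):
    """Draw W ~ N(0,1) by the mixture-of-uniforms construction:
    Z ~ N(0,1), then Y|Z ~ Unif([0, exp(-Z^2/2)]), then
    W|Y ~ Unif([-sqrt(-2 ln Y), sqrt(-2 ln Y)])."""
    z = rng.gauss(0.0, 1.0)
    # (1 - random()) lies in (0, 1], keeping Y strictly positive for the log
    y = (1.0 - rng.random()) * math.exp(-z * z / 2.0)
    half = math.sqrt(-2.0 * math.log(y))
    return rng.uniform(-half, half)
```

Averaging many draws gives a sample mean near 0 and a sample variance near 1, as the construction promises.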
Figure 6.2 First generate Z ∼ f, then [Y|Z] ∼ Unif([0, f(Z)]). Finally, draw W ∼ Unif({x : f(x) ≥ Y}). Then W ∼ Z. In the figure, the state space for the draw W is the dashed line.
So the W that is being added to the current state is uniform, and so multishift
coupling can be used. Pseudocode for this procedure is as follows.
Multishift update normal
Input: current state x, Output: next state y
1) Draw Z ∼ N(0,1)
2) Draw Y ∼ Unif([0, exp(−Z²/2)])
3) w ← 2√(−2 ln(Y))
4) Draw U ∼ Unif([0,1])
5) W ← wU − w/2
6) y ← w⌈x/w − U⌉ + W
Although this was written for a standard normal being added, to deal with a normal with mean μ and variance σ² being added, simply multiply W by σ and add μ at step 5.
When this update is applied to an interval [x_min, x_max], the result is (x_max − x_min)/w different points returned. This can be a problem, as w can be arbitrarily close to 0.
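A runnable sketch of the update may help make this concrete. It assumes the rounding step takes the standard multishift form y = w⌈x/w − U⌉ + W (my reading of step 6 above; the names are mine). Nearby states driven by the same randomness usually land on the same point, while the marginal of y − x remains standard normal.

```python
import math
import random

def multishift_update_normal(x, rng):
    """One multishift step: marginally y - x ~ N(0,1), but states in the
    same cell of a randomly offset width-w grid map to the same output."""
    z = rng.gauss(0.0, 1.0)
    # Y ~ Unif((0, exp(-z^2/2)]); (1 - random()) avoids log(0)
    y_height = (1.0 - rng.random()) * math.exp(-z * z / 2.0)
    w = 2.0 * math.sqrt(-2.0 * math.log(y_height))  # slice width
    u = rng.random()
    shift = w * u - w / 2.0                          # W ~ Unif([-w/2, w/2])
    # Snap x to the randomly offset grid, then add the shared shift
    return w * math.ceil(x / w - u) + shift
```

Running the update from starting points 0 and 0.01 with identical random draws returns the exact same next state in the vast majority of trials, while samples from a single start have standard normal increments.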
Wilson [130] solved this issue by taking the normal distribution density and "flipping over" the right-hand side. That is, instead of generating (X,Y) uniformly over {(x,y) : 0 ≤ y ≤ exp(−x²/2)}, generate (X,Y) uniformly over

{(x,y) : x ≤ 0, 0 ≤ y ≤ exp(−x²/2)} ∪ {(x,y) : x > 0, 1 − exp(−x²/2) ≤ y ≤ 1}.   (6.7)

See Figure 6.3 for an illustration.
Given x, y is still uniform over an interval of length exp(−x²/2), and so X is still normally distributed. This leads to the following algorithm.
Multishift update normal flipped
Input: x, Output: y
1) Draw X ∼ N(0,1)
2) If X ≤ 0 then draw Y ∼ Unif([0, exp(−X²/2)])
3) Else draw Y ∼ Unif([1 − exp(−X²/2), 1])
4) a ← −√(−2 ln(Y)), b ← √(−2 ln(1 − Y)), w ← b − a
5) Draw U ∼ Unif([0,1])
6) X′ ← wU + a
7) y ← w⌈x/w − U⌉ + X′
Unlike the first method, here w ≥ 2√(ln 4) > 2.35. Therefore, when applied to the interval [x_min, x_max], this multishift coupler results in moving to at most (x_max − x_min)/2.35 different points.
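A sketch of the flipped construction (with the same caveat as before: the rounding step is my reading of the algorithm, and the names are mine) makes the width bound easy to verify numerically: across many draws, the slice width w never falls below 2√(ln 4) ≈ 2.3548.

```python
import math
import random

def multishift_update_normal_flipped(x, rng):
    """Multishift step using Wilson's flipped normal density.
    Returns (y, w); the slice [a, b] always has width w >= 2*sqrt(ln 4)."""
    z = rng.gauss(0.0, 1.0)
    height = math.exp(-z * z / 2.0)
    if z <= 0:
        yv = (1.0 - rng.random()) * height          # Unif over (0, e^{-z^2/2}]
    else:
        yv = 1.0 - (1.0 - rng.random()) * height    # Unif over [1 - e^{-z^2/2}, 1)
    a = -math.sqrt(-2.0 * math.log(yv))             # left end of the slice
    b = math.sqrt(-2.0 * math.log(1.0 - yv))        # right end of the slice
    w = b - a
    u = rng.random()
    x_new = w * u + a                               # X' ~ Unif([a, b])
    return w * math.ceil(x / w - u) + x_new, w
```

The returned increment y − x is still standard normal in distribution, but now the minimum observed width stays above 2.3548, so the image of any bounded interval is a bounded number of points.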
Figure 6.3 Inverting the right half of the normal density. Z is still drawn from f; now [Y|Z] is uniform over the area between the two lines with horizontal coordinate Z. Then W is uniform over the shaded area with vertical coordinate Y. Now the state space of W is an interval of length at least 2.35.
6.4 Auto models
Multishift coupling is especially useful in auto models. Recall that an auto model is a Markov random field where the density can be written as a product of functions on the nodes and functions over labels on endpoints of edges.
In particular, in an auto model the distribution of X(v) for a single node v ∈ V only depends on the immediate neighbors, and not on more distant nodes. That is,

[X(v)|X(V \ {v})] ∼ [X(v)|X({w : {v,w} ∈ E})].
When the distribution of X(v) conditioned on its neighbors always comes from the same distribution family, the model is named "auto" followed by the name of the distribution family.
For instance, suppose x ∈ {0,1}^V is a labeling of the nodes with either 0 or 1. Then trivially the distribution of x(v) conditioned on its neighbors is Bernoulli, since it must be either 0 or 1. Hence this is an autobernoulli model.
In the hierarchical Poisson/gamma model from Section 6.2.1, consider the star graph with node v_β at the center, and nodes v_1, ..., v_10 each connected only to v_β. Label node v_β with the value β, and v_i with λ_i.
Then this forms an autogamma model, since conditioned on the neighbors of each node, the distribution of the node will come from the gamma family of distributions.
6.4.1 Example: autonormal model
In [9], Besag considered a statistical model for the output from a camera. Each pixel of the camera returns numerical values based on the color of light that hits it, with larger numbers denoting brighter values. In a standard photo, the color space is three dimensional, with one dimension giving the red, one the blue, and one the green. In a grayscale photo, the color space is one dimensional. For simplicity, only the one-dimensional case will be considered here, but this easily extends to higher dimensions.
The model is that at each pixel of the camera, the returned value is normally distributed with mean equal to the actual entering value, and some fixed variance. Then the prior distribution on the picture entering the camera (the data) assigns a penalty for neighboring pixels that goes down exponentially in the square of the difference between the pixel values.
If the observed picture is y, then the posterior on the true picture X has density

f_X(x) = Z⁻¹ exp(−H(x,y)) 1(x ∈ [0,1]^V)   (6.8)

H(x,y) = (1/(2σ²)) Σ_i (x(i) − y(i))² + (γ²/2) Σ_{{i,j}∈E} (x(i) − x(j))²,   (6.9)

where σ and γ are parameters of the model.
Now suppose a Gibbs sampler is implemented for this problem. A random scan Gibbs sampler chooses a node uniformly at random, then updates the value of the node conditioned on the values at the rest of the nodes.
Fact 6.2. For the autonormal Gibbs sampler,

[X(i)|X(neighbors of i)] ∝ exp(−(x − a(i))²/[2b(i)]),

b(i) = (σ⁻² + deg(i)γ²)⁻¹,

a(i) = (σ⁻² y(i) + γ² Σ_{j:{i,j}∈E} X(j)) · b(i),

where deg(i) is the number of neighbors of i in the graph. That is, X(i) conditioned on the values of its neighbors is normally distributed with mean a(i) and variance b(i).
So the target distribution is a normal whose variance is independent of the neighbors, but whose mean is an increasing function of the neighboring values.
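As a quick concrete check of Fact 6.2, the helper below (the function name is hypothetical, not from the text) computes a(i) and b(i) for one node from σ, γ, the observation y(i), and the neighboring values.

```python
def autonormal_conditional(sigma, gamma, y_i, neighbor_values):
    """Mean a(i) and variance b(i) of X(i) given its neighbors,
    following Fact 6.2: the precision is 1/sigma^2 + deg(i)*gamma^2."""
    deg = len(neighbor_values)
    b = 1.0 / (sigma ** -2 + deg * gamma ** 2)
    a = (y_i / sigma ** 2 + gamma ** 2 * sum(neighbor_values)) * b
    return a, b
```

For example, with σ = γ = 1, y(i) = 0.5, and neighbor values 0.2 and 0.4, this gives b(i) = 1/3 and a(i) = 1.1/3 ≈ 0.367, a mean that indeed increases with the neighboring values.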
This is a situation where multishift coupling could be applied, since the goal is to couple normally distributed random variables with means over an interval from the smallest value of a(i) to the largest value of a(i).
6.5 Metropolis-Hastings for continuous state spaces
Gibbs sampler Markov chains can often be made monotonic, but require special approaches such as multigamma coupling or multishift coupling. Metropolis-Hastings Markov chains are usually not monotonic, but can make continuous spaces coalesce down to a single point. To get the best from these two approaches, create an update function that consists of several steps in a Gibbs chain to bring the upper and lower states close together, and then use a Metropolis-Hastings step to completely reduce the image down to a single state.
6.5.1 Example: truncated autonormal
The multishift coupler was used previously to deal with the unrestricted autonormal model, but in fact for image processing the values are usually conditioned to lie in the real interval from 0 (solid black) to 1 (solid white).
This is easy to deal with for the Gibbs sampler: when generating the normal with mean a(i) and variance b(i), just condition on the normal lying between 0 and 1.
Allison Gibbs [41] showed that this Markov chain was rapidly mixing. In [41] the Wasserstein metric was used to assess convergence because it was easier to work with; in [54] the method was improved to handle total variation distance. Therefore the details of the Wasserstein metric are omitted here.
The truncation complicates matters: the distributions at each node are no longer
shifts from the same family of distributions. For example, a normal with mean 0.5
and variance 1 conditioned to lie in [0,1] is not a shifted version of a normal of mean
0.7 and variance 1 conditioned to lie in [0,1]. So the multishift coupling used in the
previous section no longer works here. On the other hand, using the inverse transform
method to generate the two normals does give a monotonic update function.
A solution lies in the Metropolis-Hastings approach from Section 1.8.2. In this approach a state is proposed, an acceptance ratio is formed, and if a Unif([0,1]) draw falls below the acceptance ratio, the state is moved to the proposed state. Otherwise it stays where it is.
Suppose our proposed state is found by taking each node in turn, then independently adding U_i, which is uniform over [−ε, ε], to the value at that node, where ε > 0 is a small value. When ε is very small, it is likely that such a proposed move will be accepted, regardless of the initial state. Let U = (U_1, ..., U_#V). Then
f_X(x + U)/f_X(x) ≥ exp( −#V(2ε + ε²)/(2σ²) − (γ²/2)·2#E(2ε + ε²) ).
Making ε = Θ(min{σ²/#V, 1/[γ²#E]}) gives f_X(x + U)/f_X(x) ≥ 1/2, and so there is at least a 1/2 chance that the Metropolis-Hastings move will be accepted starting from any state.
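The sketch below checks this bound empirically on a small hypothetical example (a 4-node cycle standing in for a 2×2 grid, with parameters chosen for illustration): with ε a small constant times min{σ²/#V, 1/(γ²#E)}, the ratio of the exp(−H) factors never drops below 1/2.

```python
import math
import random

def H(x, y, edges, sigma, gamma):
    """Energy from equation (6.9)."""
    node = sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / (2 * sigma ** 2)
    edge = (gamma ** 2 / 2) * sum((x[i] - x[j]) ** 2 for i, j in edges)
    return node + edge

rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]   # 2x2 grid of pixels, as a cycle
nV, nE = 4, len(edges)
sigma, gamma = 0.5, 2.0                    # hypothetical model parameters
y = [rng.random() for _ in range(nV)]      # hypothetical observed picture
eps = min(sigma ** 2 / nV, 1.0 / (gamma ** 2 * nE)) / 8.0

worst = 1.0
for _ in range(10000):
    x = [rng.random() for _ in range(nV)]
    u = [rng.uniform(-eps, eps) for _ in range(nV)]
    xu = [xi + ui for xi, ui in zip(x, u)]
    # ratio of the exp(-H) factors in f_X(x+U)/f_X(x)
    ratio = math.exp(H(x, y, edges, sigma, gamma) - H(xu, y, edges, sigma, gamma))
    worst = min(worst, ratio)
```

Here `worst` stays above 1/2 across all random starting states and perturbations, matching the bound; the constant 1/8 is one choice of the Θ(·) constant, not a prescription from the text.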
Since a uniform is being added to each node value, multishift coupling could be used to couple the proposed state for all initial states. However, there is another way of coupling that is simpler to implement.
This method works as follows. Suppose the initial value x_i is in the interval [a,b], and the goal is to couple the draws x_i + U_i together, where U_i ∼ Unif([−ε, ε]). Then from a, the goal is to generate from [a − ε, a + ε]; from b, the goal is to generate from [b − ε, b + ε]; and from x_i ∈ (a,b), the goal is to generate from [x_i − ε, x_i + ε].
Note that all of these intervals lie in the interval [a − ε, b + ε]. So AR can be used to sample as follows: generate A_i ∼ Unif([a − ε, b + ε]). If A_i ∈ [x_i − ε, x_i + ε] then φ(x_i, U) = A_i. Otherwise start over. Repeat until all x_i ∈ [a,b] have been assigned.
Here is why this is good: if the first A_i falls in [b − ε, a + ε], then for all x_i ∈ [a,b], A_i ∈ [x_i − ε, x_i + ε]. Hence the proposed move is the same for all states.
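The accept/reject scheme above can be sketched directly (names are mine): each state scans the same shared stream of A draws and takes the first one valid for it, so all states coalesce whenever the first draw lands in the overlap [b − ε, a + ε].

```python
import random

def coupled_proposal(x_i, a, b, eps, seed):
    """First draw from a shared Unif([a - eps, b + eps]) stream that lands
    in [x_i - eps, x_i + eps]; marginally Unif([x_i - eps, x_i + eps])."""
    rng = random.Random(seed)   # shared stream: same seed => same A sequence
    while True:
        A = rng.uniform(a - eps, b + eps)
        if x_i - eps <= A <= x_i + eps:
            return A
```

With a = 0.4, b = 0.5, and ε = 0.2, for instance, the two extreme states a and b receive the identical proposal whenever the first shared draw falls in [b − ε, a + ε] = [0.3, 0.6], which happens most of the time.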
Inside the Gibbs sampler, it is necessary to generate a normal random variable with mean a(i) and variance b(i) truncated to lie in [0,1]. One way to generate a N(a(i), b(i)) is to take a standard normal, multiply by √(b(i)), and add a(i). In order for the result to lie in [0,1], the initial standard normal must have been in [−a(i)/√(b(i)), (1 − a(i))/√(b(i))]. To obtain a standard normal in [a′, b′], use the inverse transform method.
If Φ is the cdf of a standard normal, and U ∼ Unif([0,1]), then Φ⁻¹(U) will be a standard normal. Furthermore, (Φ(b′) − Φ(a′))U + Φ(a′) will be uniform over [Φ(a′), Φ(b′)], so then Φ⁻¹((Φ(b′) − Φ(a′))U + Φ(a′)) is a standard normal conditioned to lie in [a′, b′].
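In Python this inverse transform step can be written with `statistics.NormalDist`, whose `cdf` and `inv_cdf` methods play the roles of Φ and Φ⁻¹. The wrapper below (a sketch with my own naming) produces the truncated normal needed by the Gibbs update.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal: PHI.cdf is Phi, PHI.inv_cdf is Phi^{-1}

def truncated_normal_01(mean, var, u):
    """N(mean, var) conditioned to lie in [0, 1], driven by u in (0, 1).
    The output is monotone in u, which makes the Gibbs update monotonic."""
    s = math.sqrt(var)
    a_p = (0.0 - mean) / s                 # a' in the text
    b_p = (1.0 - mean) / s                 # b' in the text
    p = (PHI.cdf(b_p) - PHI.cdf(a_p)) * u + PHI.cdf(a_p)
    return mean + s * PHI.inv_cdf(p)
```

Because a single uniform u drives the draw, the output is increasing both in u and in the conditional mean, which is exactly the monotonicity the coupling argument relies on.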
This gives the following Gibbs sampler update. As earlier, let deg(i) denote the
number of neighbors of node i in the graph.
Truncated autonormal Gibbs update
Input: x, U, i, Output: x(i)
1) b(i) ← (σ⁻² + deg(i)γ²)⁻¹, a(i) ← (σ⁻² y(i) + γ² Σ_{j:{i,j}∈E} x(j)) b(i)
2) a′ ← −a(i)/√(b(i)), b′ ← (1 − a(i))/√(b(i))
3) Z ← Φ⁻¹((Φ(b′) − Φ(a′))U + Φ(a′))
4) x(i) ← a(i) + √(b(i)) Z
Next is the Metropolis-Hastings update.
Truncated autonormal MH update
Input: x_min, x_max, x, Output: y, cflag
1) cflag ← TRUE
2) For all i ∈ V
3)   Draw U_i ← Unif([x_min(i) − ε, x_max(i) + ε])
4)   If U_i ∉ [x_max(i) − ε, x_min(i) + ε] then cflag ← FALSE
5)   If U_i ∉ [x(i) − ε, x(i) + ε] then
6)     draw U_i ← Unif([x(i) − ε, x(i) + ε])
7) U ← (U_1, ..., U_#V)
8) r ← f_X(U)/f_X(x)
9) Draw U_r uniformly from [0,1]
10) y ← U · 1(U_r ≤ r) + x · 1(U_r > r)
11) If U_r > 1/2 then cflag ← FALSE