15: Hierarchical Dynamic Generalized Linear Mixed Models for Discrete-Valued Spatio-Temporal Data (1/5)


Hierarchical Dynamic Generalized Linear Mixed Models for Discrete-Valued Spatio-Temporal Data

Scott H. Holan and Christopher K. Wikle

CONTENTS

15.1 Introduction
15.2 Hierarchical Models
15.3 Data Models for Discrete-Valued Spatio-Temporal Data
15.4 Modeling Dynamics
15.5 Example: Forecasting Migratory Bird Settling Patterns
    15.5.1 Breeding Population Survey Data
    15.5.2 Spatio-Temporal Poisson Models
    15.5.3 Forecasting Application: Breeding Population Survey
15.6 Conclusion
References

15.1 Introduction

Discrete-valued spatio-temporal data arise frequently across a diverse range of subject-matter disciplines, including epidemiology, small area estimation in federal surveys, environmental science, and ecology, among others. In general, modeling this type of data can prove challenging due to the complexity of the observed data and underlying dynamical processes (e.g., see Cressie and Wikle, 2011, and the references therein). In this chapter, we focus primarily on modeling count data using spatio-temporal generalized linear models within a Bayesian hierarchical modeling (BHM) framework. In particular, we review some of the common methods in this context and describe some recent advances. For completeness, we provide brief discussion surrounding other types of discrete-valued spatio-temporal data, such as Bernoulli data and others. Finally, we provide a succinct real data illustration outlining the prediction of waterfowl migratory patterns across the north-central United States and Canada.

In the context of modeling count spatio-temporal data, several methods have emerged, including auto-Poisson models (Besag, 1974), generalized linear dynamical (spatio-temporal) mixed models (Wikle, 2002), and Bayesian nonparametric methods based on Dirichlet process mixtures (Kottas et al., 2008), among others. The direction pursued here focuses on generalized linear mixed models (GLMMs) (see McCulloch et al., 2001, for a brief overview of GLMMs). Specifically, we consider generalized linear models (GLMs) with a latent Gaussian process model (e.g., see the overview in Cressie and Wikle, 2011,


328 Handbook of Discrete-Valued Time Series

and the references therein). In this context, we have a non-Gaussian data model along with a latent dynamic Gaussian model for the underlying unobserved process (Section 15.2) and, thus, the latent random effects cannot be integrated out analytically (Verbeke and Molenberghs, 2009). From this perspective, the models we describe are similar to the dynamic linear models framework in the time series (non-spatial) case. For further discussion surrounding Bayesian dynamic linear models see Gamerman et al. (2015; Chapter 8 in this volume) and the references therein.

To date, there have been many methodological contributions in the area of BHMs for count-valued spatio-temporal data. For example, Waller et al. (1997) consider a spatio-temporal count model for mapping disease rates, where the observations are assumed to come from a Poisson distribution. In the context of ecological modeling, Wikle (2003) introduces a Bayesian hierarchical spatio-temporal Poisson model to predict the relative population abundance of house finches over the eastern United States. Wikle and Anderson (2003) propose a spatio-temporal zero-inflated Poisson model that uses exogenous climate processes to model tornado counts.

Other diverse application areas include Wikle and Royle (2005), where the authors propose a dynamic spatio-temporal exponential family (Poisson) model for selecting sampling locations to estimate July brood counts in the Prairie Pothole Region of the United States. In contrast, Schrödle and Held (2011) describe spatio-temporal disease mapping models using integrated nested Laplace approximations (INLA) to facilitate fast computation in the context of space–time count data. Further, Lopes et al. (2011) introduce a class of spatio-temporal latent factor models for observations belonging to the exponential family of distributions, although the models are illustrated using a Bernoulli data example to model rainfall. Finally, Wu et al. (2013) develop a class of Bayesian Conway-Maxwell Poisson (CMP) models with dynamic dispersion and illustrate the approach by estimating migratory waterfowl settling patterns.

The area of discrete-valued spatio-temporal modeling is expansive in terms of both methodological contributions and applications. The previous list of contributions is in no way meant to be exhaustive. Instead, it serves to illustrate the rich literature that exists on the subject. For further discussion, see Cressie and Wikle (2011) and the references therein.

This chapter proceeds as follows. Section 15.2 provides a general description of spatio-temporal modeling from a BHM perspective. Specifically, this section reviews the Bayesian hierarchical framework and details effective partitioning of the model hierarchy in terms of models for the observed data, latent processes, and parameters. Section 15.3 discusses various data models for discrete-valued spatio-temporal data, whereas modeling dynamics is pursued in Section 15.4. Section 15.5 provides an illustration of modeling spatio-temporal count data in the context of an application to forecasting migratory bird settling patterns. Specifically, this section methodologically illustrates the use of a spatio-temporal Poisson model, with a latent dynamic Gaussian process for the Poisson intensity parameter through a real data example. Finally, Section 15.6 provides concluding discussion.

15.2 Hierarchical Models

The hierarchical paradigm has experienced significant growth over the past two decades. The original ideas behind the process-based hierarchical modeling approach, as presented here, emerged largely out of the work of Berliner (1996) and have been further exposited by Wikle et al. (1998), Wikle (2003b), Cressie et al. (2009), and Wikle et al. (2013), among others. This approach is conceptually straightforward and it provides an extremely rich framework for modeling complex dependence structures in the context of discrete-valued spatio-temporal processes. Importantly, in addition to the process-based emphasis, the hierarchical framework presented here also emphasizes modeling parameters, which is often not the case in nested error regression-type hierarchical models.

Although the hierarchical modeling paradigm has become fairly well established (e.g., see Cressie and Wikle, 2011, and the references therein), we provide a brief description here for those readers less familiar with these ideas. The main idea underlying the BHMs presented here is to consider a joint probability model for the data, process, and parameters, which is generally specified through conditionally linked model components; that is, the data conditioned on the process and parameters and the process conditioned on the parameters. Several references focus on this type of hierarchical thinking including Royle and Dorazio (2008) and Cressie and Wikle (2011), among others, whereas more traditional presentations of hierarchical modeling can be found in Banerjee et al. (2003), Carlin and Louis (2011), Gelman et al. (2013), and the references therein.

Synthesis and effective utilization of information, both from direct and from indirect sources, are two paramount objectives in statistical modeling and data analysis. In fact, both direct and indirect sources of information play a key role in statistical modeling and often include expert opinion, physical laws, and previous empirical results. For specificity, consider the case where we have an underlying scientific process of interest, denoted by Y (a spatio-temporal process). Associated with this process we also have observed data, say Z. We assume that we have parameters $\theta_D$ associated with the measurement process Z that might account for differences in the support and representativeness between Z and the underlying true process Y defined at a given resolution of interest. Additionally, we assume that there are some parameters $\theta_P$, typically associated with the evolution operator and innovation covariances, that describe the dynamics of the true underlying process of interest, Y.

Let [Z|Y] and [Y] denote the conditional distribution of Z given Y and the marginal distribution of Y, respectively. Then, assuming conditional independence of the parameters and using the law of total probability, the joint probability distribution of the data and process given the parameters can be decomposed as

$$[Z, Y \mid \theta_D, \theta_P] = [Z \mid Y, \theta_D]\,[Y \mid \theta_P], \qquad (15.1)$$

where $[Z \mid Y, \theta_D]$ is the data distribution (or “data model”, assuming conditional independence) and $[Y \mid \theta_P]$ denotes the process distribution (or “process model”).

In traditional statistics, typically, the data Z is given some specified distributional form along with associated parameters $\theta = (\theta_D, \theta_P)$ corresponding to the spatio-temporal mean, variances, and covariances. Although distributional assumptions for Z can be relaxed in the context of discrete-valued spatio-temporal data (e.g., Kottas et al., 2008), we limit our discussion to parametric models and focus primarily on count-valued spatio-temporal data. Integrating out the random process Y in (15.1) results in $[Z \mid \theta]$, in which case interest resides in estimating the parameters given the data. The disadvantage of such estimation is that it eliminates explicit estimation of the underlying true latent process Y. Instead, the distribution for Y is implicitly included through the first and second moments as a result of the integration.



   


Modeling spatio-temporal count data (or count time series for that matter) can proceed either from an observation-driven perspective or using a process-driven (parameter-driven) approach. By taking a process-based approach (i.e., explicitly modeling Y) several advantages arise. First, in many applications, one is actually interested in predicting the true underlying latent process Y, rather than just accounting for the co-variability. Second, given the complexity and high dimensionality of many real-world observed processes, it is often extremely difficult to specify the dependence structure associated with Z (e.g., due to non-Gaussianity, nonlinearity in time, and/or nonstationarity in space and/or time). Consequently, as a result of needing to specify a realistic dependence structure, likelihood-based inference in this context is challenging. In contrast, by placing emphasis on modeling the process Y instead, one can directly incorporate scientific insight into the model and more easily account for measurement (and/or sampling) and process uncertainty. For example, Markovian approximations and spatially and/or time-varying parameters can be readily incorporated in the model hierarchy. In other words, the hierarchical (conditional) specification allows extremely complicated marginal dependence structures to be replaced by a more scientific specification of the conditional mean as a random process at a lower stage in the model hierarchy. This type of modeling is analogous to the traditional mixed model setting, where the practitioner must choose between the marginal model that arises from integrating out the random effects and the conditional model, where the random effects are predicted and the conditional covariance of the data model is less complicated (Demidenko, 2013). In contrast to the linear mixed model case, the generalized linear mixed model case, which is the focus here, is significantly more complicated. Importantly, for non-Gaussian data models, it is seldom possible to integrate out the random effects analytically; that is, the integration rarely yields a closed-form solution. Consequently, discrete-valued dynamic spatio-temporal generalized linear mixed models are typically quite computationally demanding, even after some form of dimension reduction.
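To make the lack of a closed form concrete, the following minimal sketch approximates the marginal distribution numerically for the simplest possible case. It assumes a scalar Poisson data model with log link and a single Gaussian random effect (a Poisson-lognormal mixture); the function name `poisson_lognormal_pmf`, the parameter values, and the use of Gauss-Hermite quadrature are illustrative choices, not the chapter's method.

```python
import math
import numpy as np

def poisson_lognormal_pmf(k, mu, sigma, n_nodes=60):
    """Approximate P(Z = k) when Z | Y ~ Poisson(exp(Y)) and Y ~ N(mu, sigma^2)."""
    # Gauss-Hermite nodes and weights approximate integrals of exp(-x^2) f(x).
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # Change of variables so the rule integrates against the N(mu, sigma^2) density.
    y = mu + math.sqrt(2.0) * sigma * nodes
    # Poisson log-pmf at each node, with rate exp(y).
    log_pmf = k * y - np.exp(y) - math.lgamma(k + 1)
    return float(np.sum(weights * np.exp(log_pmf)) / math.sqrt(math.pi))

# Hypothetical parameter values, chosen only for illustration.
mu, sigma = 0.5, 0.4
pmf = np.array([poisson_lognormal_pmf(k, mu, sigma) for k in range(60)])
marginal_mean = float(np.sum(np.arange(60) * pmf))
```

Even in this one-dimensional setting the marginal probabilities require numerical integration; with an n-dimensional latent process at each of T times, the corresponding integral is high dimensional, which is why simulation-based or approximate methods are typically needed.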

In general, interest resides in estimating the posterior distribution of the process and parameters given the data. Using Bayes' theorem, the full BHM can be represented as

$$[Y, \theta \mid Z] \propto [Z \mid Y, \theta_D]\,[Y \mid \theta_P]\,[\theta_D, \theta_P], \qquad (15.2)$$

where $\theta = (\theta_D, \theta_P)$ and it is necessary to specify a prior distribution for $[\theta_D, \theta_P]$. Note that in (15.2) the normalizing constant integrates over both the process Y and the parameters $\theta_D$ and $\theta_P$. Importantly, this representation facilitates a conditional way of thinking about complicated applications in a probabilistically consistent manner and naturally provides a means of quantifying uncertainty. In the context of spatio-temporal count-valued data, the data model (i.e., $[Z \mid Y, \theta_D]$) will follow a count distribution such as a Poisson, negative binomial (NegBin), or CMP, among others (see Section 15.3).
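As a rough illustration of how the three factors on the right-hand side of (15.2) combine, the sketch below evaluates an unnormalized log posterior for a toy model. Everything specific here is an assumption made for the example: a Poisson data model with log link and no data-model parameters, independent Gaussian process values with mean `mu` and standard deviation `exp(log_tau)` as the process parameters, and vague Gaussian priors.

```python
import math
import numpy as np

def log_poisson(z, log_rate):
    """Elementwise Poisson log-pmf with rate exp(log_rate)."""
    log_fact = np.array([math.lgamma(v + 1) for v in z])
    return z * log_rate - np.exp(log_rate) - log_fact

def log_normal(x, mean, sd):
    """Elementwise Gaussian log-density."""
    return -0.5 * np.log(2 * np.pi) - np.log(sd) - 0.5 * ((x - mean) / sd) ** 2

def log_posterior(y, mu, log_tau, z):
    """Unnormalized log [Y, theta | Z] = log data + log process + log prior."""
    tau = np.exp(log_tau)
    log_data = np.sum(log_poisson(z, y))             # [Z | Y], Poisson with log link
    log_process = np.sum(log_normal(y, mu, tau))     # [Y | theta_P], toy static process
    log_prior = log_normal(mu, 0.0, 10.0) + log_normal(log_tau, 0.0, 1.0)  # [theta_P]
    return float(log_data + log_process + log_prior)

# Hypothetical counts and two candidate values of the latent process.
z = np.array([3, 0, 5, 2])
y_good = np.log(z + 0.5)
y_bad = y_good + 3.0
lp_good = log_posterior(y_good, 1.0, 0.0, z)
lp_bad = log_posterior(y_bad, 1.0, 0.0, z)
```

A function of this form is what a Markov chain Monte Carlo sampler would evaluate repeatedly; the normalizing constant is never needed because only differences of log posterior values enter the acceptance step.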

An important aspect of (15.2) is that the right-hand side can be further decomposed into several submodels. For example, assuming conditional independence given the true underlying process, multiple data sets with different spatial and/or temporal supports could be accommodated through the following data model specification:

$$[Z^{(1)}, Z^{(2)} \mid Y, \theta_D^{(1)}, \theta_D^{(2)}] = [Z^{(1)} \mid Y, \theta_D^{(1)}]\,[Z^{(2)} \mid Y, \theta_D^{(2)}], \qquad (15.3)$$









 



 












where, for j = 1, 2, $Z^{(j)}$ and $\theta_D^{(j)}$ correspond to the observations and parameters from the jth data set, respectively (e.g., see Wang et al., 2012). In this context, $Z^{(1)}$ and $Z^{(2)}$ need not have the same data distribution. Although, in practice, the assumption of conditional independence is often reasonable across a wide range of applications, when possible, this assumption should be validated.
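The practical consequence of the conditional independence assumption is that the log-likelihoods of the data sets simply add once the process is given. The sketch below assumes, purely for illustration, that the first data set consists of Poisson counts with a log offset and the second of Bernoulli presence/absence indicators with a logit link; the function names and parameters are hypothetical.

```python
import math
import numpy as np

def log_lik_counts(z1, y, log_offset):
    """log [Z1 | Y, theta_D1]: Poisson counts with log-rate y + log_offset."""
    log_rate = y + log_offset
    log_fact = np.array([math.lgamma(v + 1) for v in z1])
    return float(np.sum(z1 * log_rate - np.exp(log_rate) - log_fact))

def log_lik_presence(z2, y, intercept):
    """log [Z2 | Y, theta_D2]: Bernoulli presence/absence with a logit link."""
    eta = intercept + y
    # log p = -log(1 + exp(-eta)) and log(1 - p) = -log(1 + exp(eta)).
    return float(np.sum(-z2 * np.logaddexp(0.0, -eta) - (1 - z2) * np.logaddexp(0.0, eta)))

def joint_log_lik(z1, z2, y, log_offset, intercept):
    """Conditional independence given Y: the two log-likelihoods add."""
    return log_lik_counts(z1, y, log_offset) + log_lik_presence(z2, y, intercept)

# Hypothetical process values and observations at three locations.
y = np.array([0.2, -0.5, 1.0])
z1 = np.array([1, 0, 4])
z2 = np.array([1, 0, 1])
total = joint_log_lik(z1, z2, y, log_offset=0.1, intercept=-0.3)
```

Note that the two data sets share the same latent process but have different distributions and different data-model parameters, as the text allows.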

For many applications, it is also natural to decompose the model for the process into subcomponents. In particular, in the context of discrete-valued spatio-temporal data, it is often natural to assume that the process has a Markov structure in time. Assuming a first-order Markov structure in time yields the following decomposition:

$$[Y] = [\{Y_0, Y_1, \ldots, Y_T\}] = [Y_0] \prod_{t=1}^{T} [Y_t \mid Y_{t-1}].$$
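The first-order Markov decomposition above means the joint log density of the process is a sum of an initial term and one-step transition terms. The following sketch assumes, for illustration only, a linear Gaussian transition with evolution matrix `M` and isotropic innovation variance, and a zero-mean Gaussian initial state; none of these choices come from the chapter.

```python
import numpy as np

def gaussian_logpdf_iso(x, mean, sd):
    """Log density of N(mean, sd^2 I) evaluated at the vector x."""
    resid = (x - mean) / sd
    return float(-0.5 * np.sum(resid ** 2) - x.size * (np.log(sd) + 0.5 * np.log(2 * np.pi)))

def markov_log_density(Y, M, sd_init, sd_innov):
    """log [Y_0] + sum over t of log [Y_t | Y_{t-1}].

    Y has shape (T + 1, n); row t holds the process at time t.
    Assumes Y_0 ~ N(0, sd_init^2 I) and Y_t | Y_{t-1} ~ N(M Y_{t-1}, sd_innov^2 I).
    """
    total = gaussian_logpdf_iso(Y[0], np.zeros_like(Y[0]), sd_init)
    for t in range(1, Y.shape[0]):
        total += gaussian_logpdf_iso(Y[t], M @ Y[t - 1], sd_innov)
    return total

# Hypothetical example with n = 2 locations and T = 3 transitions.
Y_demo = np.array([[0.3, -0.1], [0.1, 0.2], [-0.4, 0.0], [0.2, 0.5]])
M_demo = np.array([[0.7, 0.1], [0.0, 0.6]])
log_density = markov_log_density(Y_demo, M_demo, sd_init=1.5, sd_innov=0.5)
```

Because each term involves only an n-dimensional vector, the cost grows linearly in T, whereas evaluating the same joint density through its full covariance matrix would involve a matrix of dimension n(T + 1).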

Alternatively, the process model could be further decomposed to accommodate multivariate structure. In this case, letting $[Y] = [Y^{(1)}, Y^{(2)}]$, the process model can be expressed as $[Y] = [Y^{(2)} \mid Y^{(1)}]\,[Y^{(1)}]$, where the order of conditioning is usually suggested by the specific application and chosen by the practitioner (Royle and Berliner, 1999).

There is a vast literature on modeling non-Gaussian time series using a state-space approach (e.g., Carlin et al., 1992; Fahrmeir, 1992; Fahrmeir and Kaufmann, 1991; Gamerman, 1998; Kitagawa, 1987; West et al., 1985, among others). One major distinction between models in the time series case and the models described here is that in the spatio-temporal setting we now need to consider spatial dependence in addition to serial correlation, with these two dependence structures typically being nonseparable. Also, in contrast to the pure time series case, the spatio-temporal case often suffers from being extremely high dimensional. Specifically, consider a process that is measured at n locations and T times. Going from the pure time series case to the spatio-temporal setting results in an increase of (n−1)T observations. Consequently, a necessary component of spatio-temporal modeling resides in effective dimension reduction.

In the context of discrete-valued spatio-temporal models, we assume that the data model comes from the exponential family and is non-Gaussian (e.g., Bernoulli, Poisson, NegBin, etc.); see Section 15.3. In particular, using similar notation to Cressie and Wikle (2011) we assume that $Z_t$ denotes an $m_t$-dimensional vector of observations at time t from the exponential family of distributions. That is,

$$[Z_t \mid \gamma_t] \propto \exp\{\gamma_t' Z_t - b_t(\gamma_t) - c_t(Z_t)\},$$

where $\gamma_t$ denotes an $m_t$-dimensional set of natural parameters that depend on the process and $E[Z_t \mid \gamma_t] \equiv \mu_t$. Then, assuming the usual regularity conditions for the exponential family of distributions (McCulloch et al., 2001) we have that

$$g(\mu_t) = X_t \beta + H_t(\theta_h)\, Y_t + \eta_t, \qquad (15.4)$$

where g(·) is a known link function, $Y_t$ is an n-dimensional spatial process vector of interest, $X_t$ is a matrix of covariates (assumed known), $\beta$ are the unknown “regression” coefficients associated with $X_t$, $H_t$ is the $m_t \times n$ observation matrix which is often assumed known but could also be specified in terms of the unknown hyperparameters $\theta_h$, and $\eta_t$ is an independent (across time) additive error term. It is important to note that, depending on the particular application, the additive error term, $\eta_t$, may not be warranted.
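To show how the pieces of (15.4) fit together, the sketch below simulates counts under one particular set of assumptions: a Poisson data model with log link, a known and time-invariant incidence matrix `H` that maps each observation to a single process location, and a first-order linear Gaussian evolution for the latent process. The function name `simulate_counts` and every numerical value are hypothetical and serve only to illustrate the structure.

```python
import numpy as np

def simulate_counts(n=6, T=5, obs_index=(0, 2, 3, 5), seed=0):
    """Simulate counts from log(mu_t) = X_t beta + H Y_t + eta_t with Poisson data."""
    rng = np.random.default_rng(seed)
    m = len(obs_index)
    beta = np.array([0.5, -0.3])          # hypothetical regression coefficients
    M = 0.6 * np.eye(n)                   # hypothetical evolution matrix for Y_t
    H = np.zeros((m, n))                  # incidence matrix: one process location per observation
    H[np.arange(m), list(obs_index)] = 1.0

    Y = np.zeros((T + 1, n))              # Y[0] is the initial state
    Z = np.zeros((T, m), dtype=int)
    for t in range(1, T + 1):
        Y[t] = M @ Y[t - 1] + rng.normal(0.0, 0.3, size=n)       # latent Gaussian dynamics
        X_t = np.column_stack([np.ones(m), rng.normal(size=m)])  # covariates, simulated here
        eta_t = rng.normal(0.0, 0.1, size=m)                     # optional additive error
        log_mu_t = X_t @ beta + H @ Y[t] + eta_t                 # linear predictor, log link
        Z[t - 1] = rng.poisson(np.exp(log_mu_t))                 # Poisson data model
    return Z, Y, H

Z, Y, H = simulate_counts()
```

Here only m = 4 of the n = 6 process locations are observed at each time, which is the kind of mismatch between observations and process that the observation matrix is meant to handle; dropping `eta_t` corresponds to the case where the additive error term is not warranted.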
