18
Spatio-Temporal Modeling for Small Area Health Analysis
Andrew B. Lawson and Ana Corberán-Vallet
CONTENTS
18.1 Introduction
18.2 Some Basic Space-Time Models
18.2.1 Descriptive Models
18.2.2 Mechanistic Models
18.2.3 Kalman Filtering
18.3 Model Fitting Issues
18.3.1 Posterior Sampling
18.3.2 INLA
18.4 Advanced Modeling for Special Topics
18.4.1 Latent Components
18.4.2 Infectious Diseases
18.5 Prospective Analysis and Disease Surveillance
18.6 Conclusions
References
18.1 Introduction
Small area data arise in a variety of contexts. Usually, arbitrary geographic units (small
areas) are the basic observation units in a study carried out within a predefined geographic
study area (W). These could be administrative units such as zip codes, postal zones, or census
tracts, or larger units such as municipalities, counties, parishes, or even states. The study
region W could be a predefined area such as a city, county, state, or country, or an
arbitrarily defined group of units used for the specific study. Health data are commonly
collected within such units, and the resulting counts of disease are the focus
of study. Health data usually consist of a particular disease incidence (new counts of
disease in a fixed time period), or prevalence (counts within a longer time period). Diseases
could range from noninfectious conditions such as diabetes, asthma, or different types of
cancers to infectious diseases such as HIV, influenza C, influenza A/H1N1, SARS, or coronavirus.
In the following, we will confine our attention to disease incidence within small areas and
discrete time periods. Note that at a fine level of spatial and temporal resolution
(residential location and date of diagnosis) the disease occurrence can form a spatio-temporal
point process (Lawson, 2013, ch 12). We do not pursue this form here.
388 Handbook of Discrete-Valued Time Series
Assume that a chosen disease occurs within $m$ spatial units and is also observed within $T$
consecutive fixed time periods. The resulting observed count is $y_{it}$, $i = 1, \ldots, m$;
$t = 1, \ldots, T$. The time evolution of the disease within each spatial unit can be
considered an example of a discrete time series. Hence, the collection of spatial time series
can be considered as an example of multivariate discrete time series.
Usually, for health data we assume that counts are described by a discrete probability
model. For relatively rare diseases, a Poisson model is often assumed for $y_{it}$, so that
$$y_{it} \sim \mathrm{Po}(\lambda_{it}).$$
This assumption is in part justified theoretically from the aggregation of a Poisson process
model for the underlying case events. The specification of the mean level ($\lambda_{it}$)
requires some consideration. First, as disease occurs within a population that is "at risk"
for the disease, the mean must be modulated by a population effect of some form. Usually, this
modulation is considered via a multiplicative link to a modeled component such as
$$\lambda_{it} = e_{it}\,\theta_{it},$$
where $e_{it}$ represents the population at risk and is usually computed as an expected rate or
count. The estimator of $e_{it}$ is usually based on a standard population rate (such as the
whole study region or a larger area). Once estimated, $e_{it}$ is usually assumed fixed. It is
important to note that some inferential sensitivity could arise in relation to the estimation
method and population assumed for $e_{it}$. Second, the model component $\theta_{it}$, which is
known as the relative risk, must be nonnegative. This is usually achieved by modeling
$\theta_{it}$ on the log scale, that is, a linear parameterization is assumed for
$\log(\theta_{it})$. The term $\log(e_{it})$ is then an offset.
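As a rough illustration of how the expected counts and a crude relative risk estimate might be computed, the sketch below applies a single overall reference rate (total cases divided by total population in the study region) to each area-period population. The function names, the population array `pop`, and the toy numbers are hypothetical; the text does not prescribe a particular standardization, so this is only one simple choice.

```python
from typing import List

Matrix = List[List[float]]

def expected_counts(y: Matrix, pop: Matrix) -> Matrix:
    """Expected counts e_it = (overall rate) * (population of area i in period t).

    Assumption: the reference rate is the study-region rate pooled over all
    areas and periods; other reference populations are equally possible.
    """
    total_cases = sum(sum(row) for row in y)
    total_pop = sum(sum(row) for row in pop)
    rate = total_cases / total_pop
    return [[rate * p for p in row] for row in pop]

def crude_relative_risk(y: Matrix, e: Matrix) -> Matrix:
    """Crude estimate of theta_it as the ratio y_it / e_it."""
    return [[y_it / e_it for y_it, e_it in zip(y_row, e_row)]
            for y_row, e_row in zip(y, e)]

# Hypothetical toy data: m = 3 areas (rows), T = 2 periods (columns).
y = [[4, 6], [10, 8], [1, 3]]
pop = [[1000, 1000], [2000, 2000], [500, 500]]

e = expected_counts(y, pop)
theta_hat = crude_relative_risk(y, e)
```

With this choice of reference rate the expected counts sum to the observed total, so the crude ratios are centered around 1 in a population-weighted sense.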
Note that for nite populations within small areas we could assume a binomial likeli-
hood as a variant, instead of the Poisson model. In that case, we assume that a (known)
nite population n
it
is found in the small area and out of this population a set of disease
counts are observed. A classic example of this situation would be yearly births in counties
of South Carolina (n
it
) and births with abnormalities (y
it
). In this case, we would assume a
binomial model of the form
y
it
Bin(n
it
, p
it
)
and the probability of abnormal birth in the ith area would be modeled over time as p
it
.
Often in this situation, the probability will be modeled with a suitable link to linear or
nonlinear predictors and other terms. A logit, probit, or complimentary log–log link are
commonly assumed.
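The three links mentioned above can be compared numerically. The following minimal sketch defines the inverse of each link, mapping a linear predictor `eta` (a hypothetical name) to a probability; it uses only the Python standard library.

```python
import math

def inv_logit(eta: float) -> float:
    """Inverse logit: p = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta: float) -> float:
    """Inverse probit: p = Phi(eta), the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta: float) -> float:
    """Inverse complementary log-log: p = 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))

# Hypothetical linear predictor value, for illustration only.
eta = -2.0
probabilities = {
    "logit": inv_logit(eta),
    "probit": inv_probit(eta),
    "cloglog": inv_cloglog(eta),
}
```

The logit and probit links are symmetric about `eta = 0`, whereas the complementary log-log link is not, which can matter when the modeled probabilities are small.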
18.2 Some Basic Space-Time Models
Space-time models can be roughly classified into two types. First, there are purely
descriptive models that seek to provide a parsimonious description of the disease risk variation
in space and time. Second, there are mechanistic models that seek to include some
mechanism of disease occurrence within the model. These latter models are often assumed
for infectious diseases where transmission from one time period to the next can be directly
modeled (see Section 18.4.2). Descriptive models often use random effects to provide a
parsimonious summary description of the risk variation. These are often most appropriate
for noninfectious diseases. In what follows, we will discuss models for the relative
risk under the Poisson model. The specification can be easily modified for a binomial
likelihood.
18.2.1 Descriptive Models
A basic description of space-time variation would consist of a separate spatial and temporal
effect model with a possible effect for the residual space-time interaction. Assume that
$$\log(\theta_{it}) = \alpha_0 + S_i + T_t + ST_{it}, \tag{18.1}$$
where $S_i$, $T_t$, and $ST_{it}$ represent the spatial, temporal, and space-time interaction
terms, respectively. Here, $\exp(\alpha_0)$ represents the overall rate in space-time.
Some simple spatial models could consist of (1) spatial trend (e.g.,
$S_i = \alpha_1 s_{1i} + \alpha_2 s_{2i}$, where $(s_{1i}, s_{2i})$ is a coordinate pair for
the geographic centroid of the $i$th small area), (2) uncorrelated heterogeneity (e.g.,
$S_i = v_i$, where $v_i$ is an uncorrelated heterogeneity term), or (3) as for (2) but with
correlated heterogeneity added (e.g., $S_i = v_i + u_i$, where $u_i$ is a spatially correlated
heterogeneity term). This latter model is sometimes called a convolution model. The temporal
effect $T_t$ can also take a variety of forms: (1) a simple linear time trend, that is,
$T_t = \beta\gamma_t$, where $\gamma_t$ is the actual time of the $t$th period, and (2) a
random time effect such as an autoregressive lag 1 model (i.e.,
$T_t \sim N(\phi T_{t-1}, \tau_T^{-1})$) or a random walk (when $\phi = 1$). A simpler
uncorrelated time effect could also be considered where $T_t \sim N(0, \tau_T^{-1})$,
$\tau_T$ being the precision of the respective Gaussian distribution. Combinations of
uncorrelated and correlated effects could also be considered for the time component.
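To make the autoregressive temporal effect concrete, the sketch below simulates $T_t \sim N(\phi T_{t-1}, \tau_T^{-1})$ using only the standard library. The function name, the seed handling, and the choice of starting value (drawn from $N(0, \tau_T^{-1})$) are assumptions made for illustration; the text does not specify an initial condition.

```python
import math
import random
from typing import List

def simulate_ar1(T: int, phi: float, tau_T: float, seed: int = 0) -> List[float]:
    """Simulate T_t ~ N(phi * T_{t-1}, 1 / tau_T) for t = 2, ..., T.

    tau_T is a precision, so the innovation standard deviation is
    1 / sqrt(tau_T). Setting phi = 1 gives a random walk.
    Assumption: the first value is drawn from N(0, 1 / tau_T).
    """
    rng = random.Random(seed)
    sd = 1.0 / math.sqrt(tau_T)
    effects = [rng.gauss(0.0, sd)]
    for _ in range(1, T):
        effects.append(rng.gauss(phi * effects[-1], sd))
    return effects

# Hypothetical settings: 10 periods, moderate autocorrelation, precision 4.
ar1_effects = simulate_ar1(T=10, phi=0.5, tau_T=4.0, seed=1)
random_walk = simulate_ar1(T=10, phi=1.0, tau_T=4.0, seed=1)
```

Parameterizing by precision rather than variance mirrors the notation used in the text and in BUGS-style software.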
Finally, as a form of residual interaction, the space-time interaction term ($ST_{it}$) can
also be included. Often, the specification
$$\log(\theta_{it}) = \alpha_0 + v_i + u_i + \gamma_t + \psi_{it}, \tag{18.2}$$
where the interaction is assumed to be defined as $\psi_{it} \sim N(0, \tau_\psi^{-1})$, is
found to be a robust and appropriate model for disease variation (see, e.g., Knorr-Held, 2000;
Lawson, 2013, ch 12). More sophisticated models with nonseparable space-time variation are also
possible (see, e.g., Cai et al., 2012, 2013). These models can sometimes be more effective in
describing the space-time variation but are less immediately interpretable than separable
models. In terms of inferential paradigms, it is commonly found that a Bayesian approach
is adopted to the formulation of the hierarchical model structure and the ensuing estimation
methods focus on posterior sampling via Markov chain Monte Carlo (MCMC). For
the model specification in (18.2), the model hierarchy with suitable prior distributions
could be
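$$
\begin{aligned}
y_{it} \mid \lambda_{it} &\sim \mathrm{Po}(\lambda_{it} = e_{it}\theta_{it}) \\
\log(\theta_{it}) &= \alpha_0 + v_i + u_i + \gamma_t + \psi_{it} \\
\alpha_0 \mid \tau_0^{-1} &\sim N(0, \tau_0^{-1}) \\
v_i \mid \tau_v^{-1} &\sim N(0, \tau_v^{-1}) \\
u_i \mid \tau_u^{-1} &\sim \mathrm{ICAR}(\tau_u^{-1}/n_{\delta_i}) \\
\gamma_t \mid \gamma_{t-1}, \tau_\gamma^{-1} &\sim N(\gamma_{t-1}, \tau_\gamma^{-1}) \\
\psi_{it} \mid \tau_\psi^{-1} &\sim N(0, \tau_\psi^{-1}).
\end{aligned}
\tag{18.3}
$$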
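The data level of this hierarchy can be sketched in a few lines: given values of the components, form $\log(\theta_{it})$ and evaluate the Poisson log-likelihood with mean $e_{it}\theta_{it}$. The function names and the toy inputs below are hypothetical, and the sketch covers only the likelihood, not the priors or any sampling scheme.

```python
import math
from typing import List

def log_relative_risk(alpha0: float, v: List[float], u: List[float],
                      gamma: List[float], psi: List[List[float]]) -> List[List[float]]:
    """log(theta_it) = alpha0 + v_i + u_i + gamma_t + psi_it for all i, t."""
    return [[alpha0 + v[i] + u[i] + gamma[t] + psi[i][t]
             for t in range(len(gamma))]
            for i in range(len(v))]

def poisson_loglik(y: List[List[int]], e: List[List[float]],
                   log_theta: List[List[float]]) -> float:
    """Sum over i, t of log Po(y_it | lambda_it = e_it * theta_it)."""
    total = 0.0
    for y_row, e_row, lt_row in zip(y, e, log_theta):
        for y_it, e_it, lt in zip(y_row, e_row, lt_row):
            lam = e_it * math.exp(lt)
            total += y_it * math.log(lam) - lam - math.lgamma(y_it + 1)
    return total

# Hypothetical toy inputs: m = 2 areas, T = 2 periods.
y_obs = [[3, 5], [0, 2]]
e_exp = [[2.5, 4.0], [1.0, 1.5]]
log_theta = log_relative_risk(0.1, [0.2, -0.2], [0.05, -0.05],
                              [0.0, 0.1], [[0.0, 0.0], [0.0, 0.0]])
loglik = poisson_loglik(y_obs, e_exp, log_theta)
```

In an MCMC scheme this quantity would be combined with the log prior densities of the components to form the unnormalized log posterior.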
The $\mathrm{ICAR}(\tau_u^{-1}/n_{\delta_i})$ denotes an intrinsic conditional autoregressive
spatial prior distribution and implies that the term $u_i$ has a Markov random field
specification: a conditional Gaussian distribution given its neighboring region set $\delta_i$
($u_i \mid \cdots \sim N(\bar{u}_{\delta_i}, \tau_u^{-1}/n_{\delta_i})$), where
$\bar{u}_{\delta_i}$ is the mean of the effects in the neighboring regions and $n_{\delta_i}$
is the number of neighbors. The precisions ($\tau_*$) could be assumed to have a gamma prior
distribution, that is, $\tau_* \sim \mathrm{Ga}(a, b)$,
where a common choice is $a = 0.01$ and $b = 0.005$. Recently, the use of a noninformative
uniform distribution has been recommended as a robust prior for standard deviation
parameters (Gelman, 2006). The joint posterior distribution for the model parameters is
analytically intractable but can be sampled using MCMC simulation techniques.
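The ICAR full conditional described above is simple to compute once a neighborhood structure is available. The sketch below returns the conditional mean (the average of the neighboring effects) and variance ($\tau_u^{-1}$ divided by the number of neighbors) for one area. The adjacency dictionary, function name, and toy values are hypothetical, and every area is assumed to have at least one neighbor.

```python
from typing import Dict, List, Tuple

def icar_conditional(u: List[float], neighbors: Dict[int, List[int]],
                     i: int, tau_u: float) -> Tuple[float, float]:
    """Mean and variance of u_i given the effects in its neighbor set delta_i.

    mean     = average of u_j over j in delta_i
    variance = 1 / (tau_u * n_i), with n_i the number of neighbors
    Assumption: area i has at least one neighbor.
    """
    delta_i = neighbors[i]
    n_i = len(delta_i)
    mean = sum(u[j] for j in delta_i) / n_i
    variance = 1.0 / (tau_u * n_i)
    return mean, variance

# Hypothetical map of three areas: area 0 borders areas 1 and 2.
adjacency = {0: [1, 2], 1: [0], 2: [0]}
u_current = [0.0, 1.0, 3.0]
cond_mean, cond_var = icar_conditional(u_current, adjacency, i=0, tau_u=2.0)
```

Areas with more neighbors have a smaller conditional variance, so their effects are smoothed more strongly toward the local average.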
It has been found that the specification in (18.3) is a parsimonious and robust prescription
for relative risk modeling in space and time (Ugarte et al., 2009). An example fit of this
Bayesian hierarchical model using MCMC (WinBUGS; see Section 18.3) to 10 years
(1979–1988) of respiratory cancer mortality in Ohio (see Figure 18.1) led to a deviance
information criterion (DIC) of 5751.4 with the effective number of parameters pD = 129, whereas
a model without the interaction term yielded DIC = 5759.0 with pD = 80. This suggests
that the space-time interaction model provides an improved fit to these data over a simple
separable model.
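For readers unfamiliar with how DIC and pD relate, the sketch below uses the standard definition: pD is the posterior mean deviance minus the deviance evaluated at the posterior mean of the parameters, and DIC is the posterior mean deviance plus pD. The function name and the small deviance sample are hypothetical; the two DIC values at the end are the ones reported in the text.

```python
from typing import List, Tuple

def dic_from_deviances(deviance_samples: List[float],
                       deviance_at_posterior_mean: float) -> Tuple[float, float]:
    """Return (DIC, pD) from posterior deviance draws.

    d_bar = mean of the sampled deviances
    pD    = d_bar - deviance at the posterior mean of the parameters
    DIC   = d_bar + pD
    """
    d_bar = sum(deviance_samples) / len(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d, p_d

# Hypothetical deviance draws, for illustration only.
dic_toy, pd_toy = dic_from_deviances([10.0, 12.0], 9.0)

# Values reported in the text for the Ohio example.
dic_interaction = 5751.4   # model with space-time interaction, pD = 129
dic_separable = 5759.0     # model without interaction, pD = 80
dic_difference = dic_separable - dic_interaction
```

The interaction model has the lower DIC by about 7.6 even though its effective number of parameters is larger, which is the basis for the comparison made above.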
18.2.2 Mechanistic Models
While descriptive models can perform well in describing space-time variation of
noninfectious diseases, it is often more appropriate to consider transmission mechanisms when
modeling infectious diseases. This is especially true when considering the prediction of the
infection process. Transmission mechanisms usually require the specification of a
transmission rate related to a pool of potential cases. The standard model that is usually
proposed is a compartment model where a reservoir of people (susceptibles: S) can become
infected cases (infected: I) and then be removed from the process (removed: R). A fundamental
feature of these models is that they resolve to linked count models within discrete time periods.
These Susceptible-Infected-Recovered (SIR) models can be formulated (and extended) in a
variety of ways. In Section 18.4.2, we discuss the application of these to infectious diseases
in space and time.
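A minimal sketch of how the three compartments are linked across discrete time periods is given below. It is a deterministic, single-area update in which expected new infections are proportional to the product of susceptibles and infecteds; this functional form, the parameter names `beta` and `gamma`, and the starting values are assumptions for illustration and are not the specific formulation discussed in Section 18.4.2.

```python
from typing import List, Tuple

State = Tuple[float, float, float]

def sir_step(S: float, I: float, R: float, beta: float, gamma: float) -> State:
    """One discrete-time update of a deterministic SIR model.

    Assumed form: new infections = beta * S * I / N, removals = gamma * I,
    both capped so that no compartment becomes negative.
    """
    N = S + I + R
    new_infections = min(S, beta * S * I / N)
    new_removals = min(I, gamma * I)
    return S - new_infections, I + new_infections - new_removals, R + new_removals

def simulate_sir(initial: State, beta: float, gamma: float, steps: int) -> List[State]:
    """Iterate sir_step and return the full trajectory, including the start."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(sir_step(*trajectory[-1], beta, gamma))
    return trajectory

# Hypothetical settings: 1000 people, 10 initially infected, 20 periods.
trajectory = simulate_sir((990.0, 10.0, 0.0), beta=0.5, gamma=0.2, steps=20)
```

In a stochastic count version the expected new infections and removals would be replaced by draws from Poisson or binomial distributions, which is what makes these models "linked count models".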
18.2.3 Kalman Filtering
Another relatively mechanistic modeling approach is to consider a linked two-component
system of equations. These two components represent a system equation and a measurement
[Figure 18.1 panels: ten panels, Ohio SMR for years 12 to 21, each with a five-class SMR legend.]
FIGURE 18.1
County-level Ohio respiratory cancer mortality: 10 years (1979–1988) displayed as standardized mortality ratios with expected rate computed from the state × 21 year
average rate (1968–1988).