397 Spatio-Temporal Modeling for Small Area Health Analysis
Increasingly, surveillance systems are capturing data on both the time and location of
events. The use of spatial information enhances the ability to detect small localized out-
breaks of disease relative to the surveillance of the overall count of disease cases across the
entire study region, where increases in a relatively small number of regional counts may be
diluted by the natural variation associated with overall counts. In addition, spatio-temporal
surveillance facilitates public health interventions once an increased regional count has
been identified. Consequently, practical statistical surveillance usually implies analyzing
simultaneously multiple time series that are spatially correlated.
Unlike testing methods (Kulldorff, 2001; Rogerson, 2005), modeling for spatio-temporal
disease surveillance is relatively recent, and this is a very active area of statistical research
(Robertson et al., 2010). Models describing the behavior of disease in space and time allow
covariate effects to be estimated and provide better insight into etiology, spread, prediction,
and control of disease.
Kleinman et al. (2004) proposed a method based on generalized linear mixed models
to evaluate whether observed counts of disease are larger than would be expected on the
basis of a history of naturally occurring disease. In that model, the number of cases in
area $i$ and time $t$, $y_{it}$, is assumed to follow a binomial distribution with parameters $n_{it}$ and
$p_{it}$, $n_{it}$ being the population and $p_{it}$ the probability of an individual being a case, which
is modeled as a function of covariate and spatial random effects. Once the model is fitted
using historical data observed under endemic conditions, the probability of seeing more
cases than the current observed count of disease is calculated for each small area and time
period to detect unusually high counts of disease.
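The final step can be sketched in a few lines of Python. This is only an illustration, assuming the fitted probability $p_{it}$ is already available from the mixed model; the function names and the numbers for the example area are hypothetical and are not taken from Kleinman et al. (2004).

```python
from math import exp, lgamma, log


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Binomial(n, p) probability of exactly k cases, computed in log space."""
    log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coef + k * log(p) + (n - k) * log(1.0 - p))


def upper_tail_probability(y_obs: int, n: int, p: float) -> float:
    """Pr(Y > y_obs) for Y ~ Binomial(n, p), with 0 < p < 1."""
    lower_tail = sum(binomial_pmf(k, n, p) for k in range(y_obs + 1))
    # Guard against tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - lower_tail)


# Hypothetical area: population 5000, fitted case probability 0.002,
# and 19 cases observed in the current period.
alarm_probability = upper_tail_probability(19, 5000, 0.002)
```

A small value of `alarm_probability` flags the area and period as having an unusually high count relative to the endemic model.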
An alternative approach to prospective disease surveillance is the use of hidden Markov
models. Watkins et al. (2009) provided an extension of a purely temporal hidden Markov
model to incorporate spatially referenced data. More recently, Heaton et al. (2012) have
proposed a spatio-temporal conditional autoregressive hidden Markov model with an
absorbing state. By considering the epidemic state to be absorbing, the authors avoid unde-
sirable behavior such as day-to-day switching between the epidemic and nonepidemic
states. This feature, however, limits the application of the model to a single outbreak of
disease at each location.
Bayesian hierarchical Poisson models, which are extensively used in disease mapping,
have also proved to perform well in the prospective surveillance context. In Vidal Rodeiro
and Lawson (2006), a Poisson distribution with a mean which is a function of the expected
count of disease and the unknown area-specific relative risk was assumed as a data-level
model, that is, $y_{it} \mid \lambda_{it} \sim \mathrm{Po}(\lambda_{it} = e_{it}\theta_{it})$. The logarithm of the relative risk was then modeled as
$$\log(\theta_{it}) = v_i + u_i + \gamma_t + \psi_{it},$$
where $v_i$ and $u_i$ represent, respectively, spatially uncorrelated and correlated heterogeneity;
$\gamma_t$ is a smooth temporal trend, and $\psi_{it}$ is the space-time interaction effect. To detect
changes in the relative risk pattern of disease, the authors proposed to monitor at each time
$t = 2, 3, \ldots, T$ the surveillance residuals defined as
$$r^{s}_{it} = y_{it} - \frac{1}{J}\sum_{j=1}^{J} e_{it}\,\theta^{(j)}_{i,t-1},$$
$\{\theta^{(j)}_{i,t-1}\}_{j=1}^{J}$ being a set of relative risks sampled from the posterior distribution that corresponds
to the previous time period.
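As a rough sketch of how such a residual could be computed from posterior output, the Python function below averages the predicted counts over the sampled relative risks from the previous period. The function name and the sample values are hypothetical; in practice the samples would come from an MCMC fit of the model above.

```python
def surveillance_residual(y_it: float, e_it: float, theta_prev_samples: list) -> float:
    """Observed count minus the posterior-mean predicted count e_it * theta."""
    n_samples = len(theta_prev_samples)
    predicted = sum(e_it * theta for theta in theta_prev_samples) / n_samples
    return y_it - predicted


# Hypothetical area: 12 observed cases, 5 expected cases, and three posterior
# draws of the relative risk from the previous time period.
residual = surveillance_residual(12, 5.0, [1.0, 2.0, 3.0])
```

Large positive residuals indicate that the current count exceeds what the previously fitted risk surface predicts.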
398 Handbook of Discrete-Valued Time Series
In an effort to overcome the estimation problem arising when Bayesian hierarchical
Poisson models are used in a spatio-temporal surveillance context, Zhou and Lawson
(2008) presented an approximated procedure where a spatial convolution model is fitted
to the data observed at each time period $t$. Changes in the risk pattern of disease can then
be detected by comparing the estimated relative risk for each small area, $\hat{\theta}_{it}$, with a baseline
level $\tilde{\theta}_{it}$, $i = 1, 2, \ldots, m$, calculated as an exponentially weighted moving average (EWMA)
of historical estimates
$$\tilde{\theta}_{it} = \kappa\,\hat{\theta}_{i,t-1} + (1 - \kappa)\,\tilde{\theta}_{i,t-1}.$$
In particular, the authors defined a sample-based Monte Carlo p-value as
$$\mathrm{MCP}_{it} = \frac{1}{J}\sum_{j=1}^{J} I\left(\hat{\theta}^{(j)}_{it} < \tilde{\theta}^{(j)}_{it}\right),$$
extremely small p-values indicating that an increase in disease risk might have occurred.
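The estimated percentage of increase in disease relative risk, which is defined as
$$\mathrm{PIR}_{it} = \frac{\frac{1}{J}\sum_{j=1}^{J}\left(\hat{\theta}^{(j)}_{it} - \tilde{\theta}^{(j)}_{it}\right)}{\frac{1}{J}\sum_{j=1}^{J}\tilde{\theta}^{(j)}_{it}} \times 100,$$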
can also be calculated to assess the magnitude of change in disease risk. This is a computationally
quick technique that has been shown to perform well in outbreak detection.
However, a sliding window of length one cannot guarantee accurate model estimates. Also,
an EWMA approach is only justified under Gaussian model assumptions.
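The three quantities in this procedure (the EWMA baseline, the Monte Carlo p-value, and the percentage of increase in risk) can be sketched as follows. This is a minimal illustration, assuming paired posterior samples of the current estimate and of the baseline are available; the function names and the sample values are hypothetical.

```python
def ewma_baseline(theta_hat_prev: float, baseline_prev: float, kappa: float) -> float:
    """EWMA update: kappa * previous estimate + (1 - kappa) * previous baseline."""
    return kappa * theta_hat_prev + (1.0 - kappa) * baseline_prev


def monte_carlo_p_value(theta_hat_samples: list, baseline_samples: list) -> float:
    """Share of paired samples in which the current risk is below the baseline."""
    pairs = list(zip(theta_hat_samples, baseline_samples))
    return sum(1 for current, base in pairs if current < base) / len(pairs)


def percent_increase_in_risk(theta_hat_samples: list, baseline_samples: list) -> float:
    """Mean excess of current risk over baseline, as a percentage of mean baseline."""
    n_samples = len(baseline_samples)
    pairs = zip(theta_hat_samples, baseline_samples)
    mean_excess = sum(current - base for current, base in pairs) / n_samples
    mean_baseline = sum(baseline_samples) / n_samples
    return 100.0 * mean_excess / mean_baseline


# Hypothetical posterior samples for one area and time period.
current = [1.5, 2.0, 0.8, 1.6]
baseline = [1.0, 1.0, 1.0, 1.0]
mcp = monte_carlo_p_value(current, baseline)
pir = percent_increase_in_risk(current, baseline)
```

With these illustrative samples only one of the four draws falls below the baseline, so the p-value is small relative to a stable-risk scenario and the percentage increase is positive.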
More recently, Corberán-Vallet and Lawson (2011) have introduced the surveillance con-
ditional predictive ordinate (SCPO) as a general Bayesian model–based surveillance
technique to detect small areas of unusual disease aggregation. The SCPO is based on
the conditional predictive ordinate (CPO), which was introduced by Geisser (1980) as a
Bayesian diagnostic to detect observations discrepant from a given model. The CPO was
further discussed in Gelfand et al. (1992), where a cross-validation approach based on con-
ditional predictive distributions arising from single observation deletion was proposed to
address model determination. In particular, the SCPO is calculated for each small area and
time period as
$$\mathrm{SCPO}_{it} = f(y_{it} \mid y_{1:t-1}) = \int\!\!\int f(y_{it} \mid \theta_{it})\, p(\theta_{it} \mid \theta_{i,t-1}, y_{1:t-1})\, p(\theta_{i,t-1} \mid y_{1:t-1})\, d\theta_{i,t-1}\, d\theta_{it},$$
where $y_{1:t-1}$ means all the data up to time $t - 1$; $f(y_{it} \mid \theta_{it})$ is Poisson, $p(\theta_{it} \mid \theta_{i,t-1}, y_{1:t-1})$ can be
derived from the model describing the relative risk surface, and $p(\theta_{i,t-1} \mid y_{1:t-1})$ represents
the marginal posterior distribution of parameter $\theta_{i,t-1}$ at time $t - 1$.
In that paper, the convolution model was used to model the behavior of nonseasonal
disease data, since the inclusion of adaptive time components may hinder detection of
changes in risk. In that case, the SCPO simplifies to
$$\mathrm{SCPO}_{it} = f(y_{it} \mid y_{1:t-1}) = \int f(y_{it} \mid \theta_i)\, p(\theta_i \mid y_{1:t-1})\, d\theta_i.$$
A Monte Carlo estimate for the SCPO, which does not have a closed form, can be obtained
from a posterior sampling algorithm as
$$\mathrm{SCPO}_{it} \approx \frac{1}{J}\sum_{j=1}^{J} \mathrm{Po}\left(y_{it} \mid e_{it}\theta^{(j)}_{i}\right),$$
where $\{\theta^{(j)}_{i}\}_{j=1}^{J}$ is a set of relative risks sampled from the posterior distribution that corresponds
to the previous time period. Hence, if there is no change in risk, $y_{it}$ will be
representative of the data expected under the previously fitted model. Otherwise, SCPO
values close to zero will be obtained.
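The Monte Carlo estimate can be sketched directly from its definition. The code below is an illustration only, assuming posterior samples of $\theta_i$ from the previous period are already in hand; the function names and the example values are hypothetical.

```python
from math import exp, lgamma, log


def poisson_pmf(y: int, mean: float) -> float:
    """Poisson probability of y events given a positive mean, in log space."""
    return exp(y * log(mean) - mean - lgamma(y + 1))


def scpo_estimate(y_it: int, e_it: float, theta_samples: list) -> float:
    """Average the Poisson density of y_it over the posterior relative-risk draws."""
    densities = [poisson_pmf(y_it, e_it * theta) for theta in theta_samples]
    return sum(densities) / len(densities)


# Hypothetical posterior draws for one area, with 5 expected cases.
draws = [0.9, 1.0, 1.1]
scpo_typical = scpo_estimate(5, 5.0, draws)    # count in line with the model
scpo_outbreak = scpo_estimate(30, 5.0, draws)  # count far above expectation
```

In this toy setting `scpo_outbreak` is much smaller than `scpo_typical`, which is the behavior used to raise an alarm.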
Corberán-Vallet and Lawson (2011) showed an application of the SCPO to Salmonellosis
cases in South Carolina from January 1995 to December 2003 (see Figure 18.2). In order to
detect occasional outbreaks beyond seasonal patterns, a generalization of the convolution
model allowing for seasonal effects was used to model the regular behavior of disease.
Figure 18.3 displays the spatial distribution of the SCPO for a selection of six months:
September–October 1996, February–March 2001, and October–November 2002. As can be
seen, values of the SCPO close to zero alert us to counties presenting unusually high counts
of disease.
An important feature of the SCPO is that it can be easily extended to incorporate infor-
mation from the spatial neighborhood, which facilitates outbreak detection capability
when changes in risk affect neighboring areas simultaneously. Also, a multivariate SCPO
(MSCPO) integrating information from $K \geq 2$ diseases can be computed to improve both
detection time and recovery of the true outbreak behavior when changes in disease incidence
happen simultaneously for two or more diseases. For each area $i$ and time $t$, let
$y_{it} = (y_{it1}, y_{it2}, \ldots, y_{itK})$ be the vector of observed counts of disease, $e_{it} = (e_{it1}, e_{it2}, \ldots, e_{itK})$
the vector of expected counts, $\hat{\theta}_{i} = (\hat{\theta}_{i1}, \hat{\theta}_{i2}, \ldots, \hat{\theta}_{iK})$ the vector of posterior relative risk
estimates at the previous time point using a spatial-only shared-component model, and
FIGURE 18.2
Monthly counts of reported Salmonellosis cases in South Carolina for the period 1995–2003. (Time series plot; horizontal axis: date, vertical axis: monthly counts.)
FIGURE 18.3
Spatial distribution of the scaled SCPO for the Salmonellosis data at those months undergoing a possible outbreak
of disease. (Map panels: September 1996, October 1996, February 2001, March 2001, October 2002, November 2002. Legend classes: < 0.01, [0.01, 0.08), [0.08, 0.25), [0.25, 0.50), >= 0.50.)
$y^{h}_{it} = (y_{itk_1}, y_{itk_2}, \ldots, y_{itk_n})$ the vector of observed counts higher than expected, that is, $y_{itk} > e_{itk}\hat{\theta}_{ik}$.
Corberán-Vallet (2012) defined the MSCPO for each small area and time period as the
conditional predictive distribution of those counts of disease higher than expected given
the data observed up to the previous time period, that is,
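$$\begin{aligned}
\mathrm{MSCPO}_{it} &= f(y_{itk_1}, y_{itk_2}, \ldots, y_{itk_n} \mid y_{1:t-1}) \\
&= \int \cdots \int f(y_{itk_1}, y_{itk_2}, \ldots, y_{itk_n} \mid \theta_{ik_1}, \theta_{ik_2}, \ldots, \theta_{ik_n}) \\
&\qquad \times\, \pi(\theta_{ik_1}, \theta_{ik_2}, \ldots, \theta_{ik_n} \mid y_{1:t-1})\, d\theta_{ik_1}\, d\theta_{ik_2} \cdots d\theta_{ik_n}.
\end{aligned}$$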
Values of the MSCPO close to zero alert to both small areas of increased disease incidence
and the diseases causing the alarm within each area.
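A Monte Carlo version of the MSCPO can be sketched along the same lines as the univariate case. The code below is only an illustration under an explicit assumption that is not stated in the text: conditional on the relative risks, the counts for different diseases are taken to be independent Poisson variables, so the joint density is a product. The function names and all numbers are hypothetical.

```python
from math import exp, lgamma, log


def poisson_pmf(y: int, mean: float) -> float:
    """Poisson probability of y events given a positive mean, in log space."""
    return exp(y * log(mean) - mean - lgamma(y + 1))


def mscpo_estimate(y, e, theta_hat, theta_samples):
    """Return (MSCPO estimate, indices of diseases with counts above expectation).

    y, e, theta_hat: per-disease observed counts, expected counts, and
    posterior relative-risk estimates from the previous period.
    theta_samples: joint posterior draws, one list of K relative risks per draw.
    """
    # Keep only the diseases whose count exceeds e * theta_hat.
    high = [k for k in range(len(y)) if y[k] > e[k] * theta_hat[k]]
    if not high:
        return None, high
    total = 0.0
    for draw in theta_samples:
        # Assumed conditional independence: multiply per-disease Poisson densities.
        joint = 1.0
        for k in high:
            joint *= poisson_pmf(y[k], e[k] * draw[k])
        total += joint
    return total / len(theta_samples), high


# Hypothetical area with K = 2 diseases and a single posterior draw.
value, flagged = mscpo_estimate([3, 10], [4.0, 4.0], [1.0, 1.0], [[1.0, 1.0]])
```

Here only the second disease exceeds its expected count, so `flagged` identifies it as the one contributing to the alarm.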
This line of research is particularly useful, since surveillance systems are often focused
on more than one disease within a predefined study region. On those occasions when outbreaks
of disease are likely to be correlated, the use of multivariate surveillance techniques
enhances sensitivity and timeliness of outbreak detection for events that are present in more
than one data set. Yet, little work has been conducted within this scenario (Kulldorff et al.,
2007; Banks et al., 2012; Corberán-Vallet, 2012).
Much of the new literature in the area of prospective disease surveillance relates to syndromic
surveillance, which has been introduced as an efficient tool to detect outbreaks
of disease at the earliest possible time, possibly even before a definitive disease diagnosis
is obtained. The idea is to monitor syndromes associated with disease such as school
and work absenteeism, over-the-counter medication sales, medical consultations, etc. For
instance, Kavanagh et al. (2012) adapted the Farrington et al. (1996) algorithm to moni-
tor calls received by the National Health Service telephone helpline in Scotland. As Chan
et al. (2012) emphasized, syndromic surveillance is effective in providing early awareness.
However, alerts based on syndrome aberrations surely contain uncertainty, and so they
should be evaluated with a proper probabilistic measure. Chan et al. (2012) used a space-time
Bayesian hierarchical model incorporating information from meteorological factors
to model influenza-like illness visits. The risk of an outbreak was assessed using
$\Pr(y_{it} > \text{threshold} \mid y_{1:t-1})$. A very fruitful area for further research in the context of syndromic
surveillance would be the development of a multivariate model to describe the disease
of interest and the syndromic data jointly. This approach would allow us to quantify the
effect that changes in the behavior of the syndromes have on the incidence of the disease
under study.
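A posterior predictive exceedance probability of the form $\Pr(y_{it} > \text{threshold} \mid y_{1:t-1})$ can be approximated from posterior draws. The sketch below assumes a Poisson data model and posterior samples of its mean; both are assumptions made for illustration and are not details taken from Chan et al. (2012). The function names and numbers are hypothetical.

```python
from math import exp, lgamma, log


def poisson_cdf(k: int, mean: float) -> float:
    """Pr(Y <= k) for Y ~ Poisson(mean) with a positive mean."""
    return sum(exp(j * log(mean) - mean - lgamma(j + 1)) for j in range(k + 1))


def exceedance_probability(threshold: int, mean_samples: list) -> float:
    """Average Pr(Y > threshold) over posterior draws of the Poisson mean."""
    tails = [1.0 - poisson_cdf(threshold, mean) for mean in mean_samples]
    return sum(tails) / len(tails)


# Hypothetical posterior draws of the expected number of visits in one area.
risk = exceedance_probability(20, [12.0, 14.5, 13.2])
```

Averaging the tail probability over the draws carries parameter uncertainty into the alert, rather than plugging in a single point estimate.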
18.6 Conclusions
In this chapter, we have attempted to survey the varied approaches to spatio-temporal
series found in small area health studies. Our focus has been on descriptive models of
space-time disease incidence data and also, in later sections, on prospective surveillance.
Both areas provide a ready source of challenging problems for the modeler and much work
is still needed to develop flexible and relevant models that can be used more widely by the
Public Health community. While there are many testing approaches available to spatio-temporal
incidence data, we have taken a Bayesian model–based approach. We firmly
believe that this provides much greater flexibility for handling the complexities of spatio-temporal
variation and will provide the greatest insight into spatial health dynamics in
the future.
Acknowledgments
This work was supported by Grant Number R03CA162029 from the National Cancer
Institute. The content is solely the responsibility of the authors and does not necessarily
represent the official views of the National Cancer Institute or the National Institutes of
Health.