19
Models for Multivariate Count Time Series
Dimitris Karlis
CONTENTS
19.1 Introduction
19.2 Models for Multivariate Count Data
19.2.1 Use of Multivariate Extensions of Simple Models
19.2.2 Models Based on Copulas
19.2.3 Other Multivariate Models for Counts
19.3 Models Based on Thinning
19.3.1 The Standard INAR Model
19.3.2 Multivariate INAR Model
19.3.3 Estimation
19.3.4 Other Models in This Category
19.4 Parameter-Driven Models
19.4.1 Latent Factor Model
19.4.2 State Space Model
19.5 Observation-Driven Models
19.6 More Models
19.7 Discussion
References
19.1 Introduction
We have seen a tremendous increase in models for discrete-valued time series over the
past few decades. Although there is a flourishing literature on models and methods for
univariate integer-valued time series, the literature is rather sparse for the multivariate
case, especially for multivariate count time series. Multivariate count data occur in several
different disciplines like epidemiology, marketing, criminology, and engineering, just to
name a few. For example, in syndromic surveillance systems, we record the number of
patients with a given symptom. An abrupt change in this number could indicate a threat
to public health, and our goal would be to discover such a change as early as possible. In
practice, a large number of symptoms are counted creating possibly associated multiple
time series of counts. An adequate analysis of such multiple time series requires models
that can take into account the correlation across time as well as the correlations between
the different symptoms.
408 Handbook of Discrete-Valued Time Series
Another example comes from geophysical research, where data are collected over time
on the number of earthquakes whose magnitudes are above a certain threshold (Boudreault
and Charpentier, 2011). Different series can be generated from adjacent areas, and the
correlation between these areas is then an important scientific question. In criminology, one
counts the number of occurrences of one type of crime in successive time periods (say,
weeks). Analyzing together more than one type of crime generates many count time series
that may be correlated. In finance, an analyst might wish to model the number of bids and
asks for a stock, or the number of trades of different stocks in a portfolio. Similar examples
may be seen for the number of purchases of different but related products in marketing,
the number of claims of different policies in actuarial science, etc.
The underlying similarity in all the earlier mentioned examples is that the collected data
are correlated counts observed at different time points. Hence, we have two sources of
correlation, serial correlation since the data are time series and cross-correlation since the
component time series are correlated at each time point. The need to account for both serial
and cross-correlation complicates model specification, estimation, and inference. The literature
on statistical models for multivariate time series of counts is rather sparse, perhaps
because the analytical and computational issues are complicated. In recent years, new
models have been developed to facilitate the modeling approach, which we discuss in
this chapter.
We start in Section 19.2 with a brief review of some models for multivariate count data
and a discussion of the problems that arise. These models form the basis for the time
series models that will be discussed in the following sections along three main avenues:
models based on thinning (Section 19.3), parameter-driven models (Section 19.4), and
observation-driven models (Section 19.5). Additional models will be mentioned in Section
19.6. Concluding remarks are given in Section 19.7.
19.2 Models for Multivariate Count Data
Even ignoring the time correlation, there are not many models for multivariate counts in the
literature. Inference for multivariate counts is analytically and computationally demanding.
The bivariate case is easier and more developed, but several bivariate models cannot
easily be generalized to the multivariate case. This is a major obstacle
for the development of flexible models to be used also in the time series context. We briefly
explore some of the issues.
19.2.1 Use of Multivariate Extensions of Simple Models
Consider, for example, the simplest extension of the univariate Poisson distribution to
the bivariate case. As in Kocherlakota and Kocherlakota (1992), the bivariate Poisson has
probability mass function (pmf)
\[
P(y_1, y_2) = P(Y_1 = y_1, Y_2 = y_2; \boldsymbol{\theta})
= e^{-(\theta_1 + \theta_2 + \theta_0)} \frac{\theta_1^{y_1}}{y_1!} \frac{\theta_2^{y_2}}{y_2!}
\sum_{s=0}^{\min(y_1, y_2)} \binom{y_1}{s} \binom{y_2}{s} s! \left( \frac{\theta_0}{\theta_1 \theta_2} \right)^{s}, \tag{19.1}
\]
where $\theta_1, \theta_2, \theta_0 \ge 0$, $y_1, y_2 = 0, 1, \ldots$, and $\boldsymbol{\theta} = (\theta_1, \theta_2, \theta_0)$. $\theta_0$ is the covariance while the
marginal means and variances are equal to $\theta_1 + \theta_0$ and $\theta_2 + \theta_0$, respectively. The marginal
distributions are Poisson. One can easily see that this pmf involves a finite summation that
can be computationally intensive for large counts. This bivariate Poisson distribution only
allows positive correlation. We denote this by BP($\theta_1$, $\theta_2$, $\theta_0$). For $\theta_0 = 0$, we get two independent
Poisson distributions. We may generalize this model by considering mixtures of the
bivariate Poisson. Although there are a few schemes, two ways to do this have been studied
in detail. Most of the literature assumes a BP($\alpha\theta_1$, $\alpha\theta_2$, $\alpha\theta_0$) distribution and places a
mixing distribution on $\alpha$. Depending on the choice of the distribution of $\alpha$, such a model
produces overdispersed marginal distributions but with always positive correlation. The
correlation comes from two sources, the first is the intrinsic one from $\theta_0$ and the second is
due to the use of a common $\alpha$.
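To make the finite summation in (19.1) concrete, the following is a minimal Python sketch that evaluates the bivariate Poisson pmf directly. The function name `bivariate_poisson_pmf` is hypothetical and not from the chapter, and the sketch assumes $\theta_1, \theta_2 > 0$ so that the ratio $\theta_0 / (\theta_1 \theta_2)$ is defined. The inner loop runs $\min(y_1, y_2) + 1$ times, which is the cost the text refers to for large counts.

```python
import math


def bivariate_poisson_pmf(y1, y2, theta1, theta2, theta0):
    """P(Y1 = y1, Y2 = y2) for BP(theta1, theta2, theta0), following (19.1).

    Assumes theta1 > 0 and theta2 > 0 (illustrative restriction).
    """
    if y1 < 0 or y2 < 0:
        return 0.0
    # Leading term (everything outside the sum) computed on the log scale.
    log_lead = (-(theta1 + theta2 + theta0)
                + y1 * math.log(theta1) - math.lgamma(y1 + 1)
                + y2 * math.log(theta2) - math.lgamma(y2 + 1))
    ratio = theta0 / (theta1 * theta2)
    # Finite sum over s = 0, ..., min(y1, y2).
    total = 0.0
    for s in range(min(y1, y2) + 1):
        total += (math.comb(y1, s) * math.comb(y2, s)
                  * math.factorial(s) * ratio ** s)
    return math.exp(log_lead) * total


if __name__ == "__main__":
    # Joint probability of (2, 3) under BP(1.0, 2.0, 0.5).
    print(bivariate_poisson_pmf(2, 3, 1.0, 2.0, 0.5))
```

Setting `theta0 = 0.0` reduces every term with $s \ge 1$ to zero, so the function returns the product of two independent Poisson probabilities, in line with the remark above.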
A more refined model can be produced by assuming a BP($\theta_1$, $\theta_2$, 0) and letting $\theta_1$, $\theta_2$
jointly vary according to some bivariate continuous distribution, as, for example, in
Chib and Winkelmann (2001) where a bivariate lognormal distribution is assumed. Here,
the correlation comes from the correlation of the joint mixing distribution, and thus, it can
be negative as well. The obstacle is that we do not have flexible bivariate distributions
to use for the mixing, or some of them may lead to computational problems. The bivariate
Poisson lognormal distribution in Chib and Winkelmann (2001) does not have closed-form
pmf and bivariate integration is needed.
It is interesting to point out that generalization to higher dimensions is not straight-
forward even for simple models. For example, generalizing the bivariate Poisson to the
multivariate Poisson with one correlation parameter for every pair of variables leads to
multiple summation, see the details in Karlis and Meligkotsidou (2005). We will see later
some ideas on how to overcome these problems.
19.2.2 Models Based on Copulas
A different avenue to build multivariate models is to apply the copula approach.
Copulas (see Nelsen, 2006) have found a remarkably large number of applications in
finance, hydrology, biostatistics, etc., since they allow the derivation and application of
flexible multivariate models with given marginal distributions. The key idea is that the
marginal properties can be separated from the association properties, thus leading to a
wealth of potential models. For the case of discrete data, copula-based modeling is less
developed. Genest and Nešlehová (2007) provided an excellent review on the topic. It
is important to keep in mind that some of the desirable properties of copulas are not
valid when dealing with count data. For example, dependence properties cannot be fully
separated from marginal properties. To see this, consider Kendall's tau correlation coefficient.
The probability of a tie is not zero for discrete data and depends on the marginal
distributions; hence the value of Kendall's tau also depends on the marginal distributions.
Furthermore, the pmf cannot be derived through derivatives but via finite differences,
which can be cumbersome in larger dimensions. For a recent review on copulas for discrete
data, see Nikoloulopoulos (2013b).
To help the exposition we first discuss bivariate copulas.
Denition (Nelsen, 2006). A bivariate copula is a function C from [0, 1]
2
to [0, 1] with
the following properties: (a) for every {u, v}∈[0, 1], C(u,0) =0 =C(0, v) and C(u,1) =u,
410 Handbook of Discrete-Valued Time Series
C(1, v) = v and (b) for every {u
1
, u
2
, v
1
, v
2
}∈[0, 1] such that u
1
u
2
and v
1
v
2
,
C(u
2
, v
2
) C(u
2
, v
1
) C(u
1
, v
2
) + C(u
1
, v
1
) 0.
That is, copulas are bivariate distributions with uniform marginals. Recall the inver-
sion theorem, central in simulation, where starting from a uniform random variable and
applying the inverse transform of a distribution function we can generate any desired dis-
tribution. Copulas extend this idea in the sense that we start from two correlated uniforms
and hence we end up with variables from whatever distribution we like which are still
correlated.
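The idea of starting from two correlated uniforms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the correlated uniforms are produced with a Gaussian copula (one possible choice among many), the marginals are taken to be Poisson, and the names `poisson_quantile` and `correlated_poisson_pair` as well as the parameter `rho` are hypothetical and not from the chapter.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used to map normals to uniforms


def poisson_quantile(u, lam, max_k=10_000):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(lam) (inverse transform).

    max_k is a safety cap for u numerically equal to 1.
    """
    k = 0
    term = math.exp(-lam)
    cdf = term
    while cdf < u and k < max_k:
        k += 1
        term *= lam / k
        cdf += term
    return k


def correlated_poisson_pair(lam1, lam2, rho, rng):
    """One draw (Y1, Y2) with Poisson marginals linked by a Gaussian copula."""
    # Two standard normals with correlation rho.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    # Map to correlated uniforms, then invert each marginal cdf.
    u1, u2 = _STD_NORMAL.cdf(z1), _STD_NORMAL.cdf(z2)
    return poisson_quantile(u1, lam1), poisson_quantile(u2, lam2)


if __name__ == "__main__":
    rng = random.Random(2024)
    print([correlated_poisson_pair(3.0, 5.0, 0.7, rng) for _ in range(5)])
```

Because the marginals are discrete, the dependence between the resulting counts is not the same as `rho`; it also depends on the marginal parameters, which echoes the caveat about count data raised earlier in this section.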
If F(x) and G(y) are the cdfs of the univariate random variables X and Y, then
C(F(x), G(y)) is a bivariate distribution for (X, Y) with marginal distributions F and G,
respectively. Conversely, if H is a bivariate cdf with univariate marginal cdfs F and G, then,
according to Sklar's theorem (Sklar, 1959), there exists a bivariate copula C such that for all
(x, y), H(x, y) = C(F(x), G(y)). If F and G are continuous, then C is unique; otherwise,
C is uniquely determined only on range F × range G. This lack of uniqueness is not a problem
in practical applications, as it only means that more than one copula may reproduce the
same joint distribution.
Copulas provide the joint cumulative function. In order to derive the joint density (for
continuous data) or the joint probability function (for discrete data) we need to take the
derivatives or the finite differences of the copula. For bivariate discrete data, the pmf is
obtained by finite differences of the cdf through its copula representation (Genest and
Nešlehová, 2007), that is,
\[
\begin{aligned}
h(x, y; \alpha_1, \alpha_2, \theta) ={}& C(F(x; \alpha_1), G(y; \alpha_2); \theta) - C(F(x - 1; \alpha_1), G(y; \alpha_2); \theta) \\
& - C(F(x; \alpha_1), G(y - 1; \alpha_2); \theta) + C(F(x - 1; \alpha_1), G(y - 1; \alpha_2); \theta),
\end{aligned}
\]
where $F(\cdot)$ and $G(\cdot)$ are the marginal cdfs, $\alpha_1$ and $\alpha_2$ are the parameters associated
with the respective marginal distributions, and $\theta$ denotes the parameter(s) of the
copula. This poses a big problem in higher dimensions. In order to take differences,
we need to evaluate the copula eight times in the trivariate case and $2^d$ times for $d$
dimensions.
Copulas are cdfs and thus in many cases they are given as multidimensional integrals and
not as simple formulas. A simple example is the bivariate Gaussian copula which is defined
as a bivariate integral. In this case, one needs to evaluate multidimensional integrals many
times in order to evaluate the pmf. To avoid this extensive integration, one can switch to
copulas that are given in simple form without the need to integrate (e.g., the Frank copula).
But even in this case, one needs to add and subtract several numbers (which are usually
very close to 0) leading to possible truncation errors.
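The finite-difference construction of the bivariate pmf can be sketched as follows, using the Frank copula because it has a closed form. This is an illustrative sketch only: the Poisson marginals, the function names (`poisson_cdf`, `frank_copula`, `copula_pmf`), and the restriction to a nonzero copula parameter are assumptions made here, not part of the chapter.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); returns 0 for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return min(total, 1.0)


def frank_copula(u, v, theta):
    """Frank copula C(u, v; theta) for theta != 0, in closed form."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def copula_pmf(x, y, lam1, lam2, theta):
    """Joint pmf h(x, y) from four copula evaluations (finite differences)."""
    f_hi, f_lo = poisson_cdf(x, lam1), poisson_cdf(x - 1, lam1)
    g_hi, g_lo = poisson_cdf(y, lam2), poisson_cdf(y - 1, lam2)
    return (frank_copula(f_hi, g_hi, theta)
            - frank_copula(f_lo, g_hi, theta)
            - frank_copula(f_hi, g_lo, theta)
            + frank_copula(f_lo, g_lo, theta))


if __name__ == "__main__":
    # Joint probability of (2, 3) with Poisson(2) and Poisson(3) marginals.
    print(copula_pmf(2, 3, 2.0, 3.0, 4.0))
```

The four terms in `copula_pmf` are each between 0 and 1 while their signed sum can be very small, which is where the loss of precision mentioned above can arise for cells far in the tails.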
Another problem relates to the shortage of copulas that can allow for flexible correlation
structure. For example, the multivariate Archimedean copulas assign the same correlation
to all pairs of variables, which is too restrictive in practice. Also if one needs to specify
both positive and negative correlations, more restrictions apply. To sum up, a big issue in
working with models defined via copulas is the lack of a framework that allows flexible
structure while maintaining computational simplicity.
19.2.3 Other Multivariate Models for Counts
There are other strategies to build flexible models for multivariate counts such as models
based on conditional distributions (Berkhout and Plug, 2004), or finite mixtures (Karlis and
Meligkotsidou, 2007). In most cases, things are more complicated than for continuous models,
where the multivariate normal distribution is a cornerstone allowing for great flexibility
and feasible calculations.
19.3 Models Based on Thinning
19.3.1 The Standard INAR Model
Among the most successful integer-valued time series models proposed in the litera-
ture are the INteger-valued AutoRegressive (INAR) models, introduced by McKenzie
(1985) and Al-Osh and Alzaid (1987). Since then, several articles have been published
to extend and generalize these models. The reader is referred to McKenzie (2003) and
Jung and Tremayne (2006) for a comprehensive review of such models. The extension
of the simple INAR(1) process to the multidimensional case is interesting as it provides
a general framework for multivariate count time series modeling. The model was
considered in Franke and Rao (1995) and Latour (1997), but a long hiatus followed
until the topic was addressed again by Pedeli and Karlis (2011) and
Boudreault and Charpentier (2011).
Denition A sequence of random variables {Y
t
: t = 0, 1, ...} is an INAR(1) process if it
satises a difference equation of the form
Y
t
= α Y
t1
+ R
t
; t = 1, 2, ...,
where α ∈[0, 1], R
t
is a sequence of uncorrelated nonnegative integer-valued random
variables with mean µ and nite variance σ
2
(called hereafter as the innovations), and Y
0
represents an initial value of the process.
The operator $\circ$ is defined as
\[
\alpha \circ Y = \sum_{i=1}^{Y} Z_i = Z, \tag{19.2}
\]
where $Z_i$ are independently and identically distributed Bernoulli random variables with
$P(Z_i = 1) = 1 - P(Z_i = 0) = \alpha$. This operator, known as the binomial thinning operator, is
due to Steutel and van Harn (1979) and mimics the scalar multiplication used for normal
time series models so as to ensure that only integer values will occur.
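The recursion and the thinning operator in (19.2) can be sketched as a short simulation. The sketch assumes Poisson innovations, which is one common choice but is not required by the definition above. The function names (`binomial_thinning`, `poisson_draw`, `simulate_inar1`) and the default arguments are hypothetical.

```python
import math
import random


def binomial_thinning(alpha, y, rng):
    """alpha o y: sum of y i.i.d. Bernoulli(alpha) variables, as in (19.2)."""
    return sum(1 for _ in range(y) if rng.random() < alpha)


def poisson_draw(mu, rng):
    """Poisson(mu) draw via Knuth's multiplication method (suited to small mu)."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_inar1(alpha, mu, n, y0=0, seed=0):
    """Simulate Y_0, ..., Y_n with Y_t = alpha o Y_{t-1} + R_t.

    Assumption for illustration: R_t ~ Poisson(mu), independent over time.
    """
    rng = random.Random(seed)
    path = [y0]
    for _ in range(n):
        survivors = binomial_thinning(alpha, path[-1], rng)
        path.append(survivors + poisson_draw(mu, rng))
    return path


if __name__ == "__main__":
    print(simulate_inar1(alpha=0.5, mu=2.0, n=20, seed=1))
```

Every simulated value is a nonnegative integer by construction, since thinning can only keep or drop each of the previous units and the innovation adds a nonnegative count. For $\alpha < 1$ the long-run average of such a path should settle near $\mu / (1 - \alpha)$, which follows from taking expectations on both sides of the recursion.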