Note that by assuming any distribution other than Bernoulli for the $Z_i$'s in (19.2), we get a generalized Steutel and van Harn operator. Other operators can also be defined similarly. A review of thinning operators can be found in Weiß (2008).
The model can be generalized to have $p$ terms (i.e., INAR($p$)), but there is no unique way to do this. Moving average (MA) terms can be added to the model, leading to INARMA models. Covariates can also be introduced to model the mean of the innovation term, allowing the effect of additional information to be measured and leading to INAR regression models. We next present extensions to the multivariate case by first extending the thinning operator to a matrix-valued form and then presenting the multivariate INAR model.
Let $A$ be an $r \times r$ matrix with elements $\alpha_{ij}$, $i, j = 1, \ldots, r$, and let $Y$ be a nonnegative integer-valued $r$-dimensional vector. The matrix-valued operator is defined as
$$A \circ Y = \begin{pmatrix} \sum_{j=1}^{r} \alpha_{1j} \circ Y_j \\ \vdots \\ \sum_{j=1}^{r} \alpha_{rj} \circ Y_j \end{pmatrix}.$$
The univariate operations $\alpha \circ X$ and $\beta \circ Y$ are independent if and only if the counting processes in their definitions are independent. Hence, the matrix-valued operator implies independence between the univariate operators. Properties of this operator can be found in Latour (1997).
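To fix ideas, the following minimal Python sketch implements binomial thinning and the matrix-valued operator; the names `thin` and `matrix_thin` are purely illustrative and not from the cited references.

```python
import numpy as np

rng = np.random.default_rng(42)

def thin(alpha, x):
    """Binomial thinning alpha o x: a Binomial(x, alpha) draw,
    i.e., the sum of x independent Bernoulli(alpha) counting variables."""
    return rng.binomial(n=x, p=alpha)

def matrix_thin(A, y):
    """Matrix-valued operator A o Y: element i is sum_j alpha_ij o Y_j,
    with all univariate thinning operations performed independently."""
    return np.array([sum(thin(A[i, j], y[j]) for j in range(len(y)))
                     for i in range(A.shape[0])])

A = np.array([[0.4, 0.1],
              [0.2, 0.3]])
y = np.array([5, 7])
print(matrix_thin(A, y))  # one random realization of A o Y
```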
Using this operator, Latour (1997) defined a multivariate generalized INAR process of order $p$ (MGINAR($p$)) by assuming that
$$Y_t = \sum_{j=1}^{p} A_j \circ Y_{t-j} + \boldsymbol{\epsilon}_t,$$
where $Y_t$ and $\boldsymbol{\epsilon}_t$ are $r$-vectors and $A_j$, $j = 1, \ldots, p$, are $r \times r$ matrices, and gave conditions for existence and stationarity. A more focused presentation of the model follows.
19.3.2 Multivariate INAR Model
Let $Y$ and $R$ be nonnegative integer-valued random $r$-vectors and let $A$ be an $r \times r$ matrix with elements $\{\alpha_{ij}\}_{i,j=1,\ldots,r}$. The MINAR(1) process can be defined as
$$Y_t = A \circ Y_{t-1} + R_t = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1r} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{r1} & \alpha_{r2} & \cdots & \alpha_{rr} \end{pmatrix} \circ \begin{pmatrix} Y_{1,t-1} \\ Y_{2,t-1} \\ \vdots \\ Y_{r,t-1} \end{pmatrix} + \begin{pmatrix} R_{1t} \\ R_{2t} \\ \vdots \\ R_{rt} \end{pmatrix}, \quad t = 1, 2, \ldots \tag{19.3}$$
The vector of innovations $R_t$ follows an $r$-variate discrete distribution, which characterizes the marginal distribution of the $Y_t$ as well; more on this will follow. When $A$ is diagonal, we will call the model a diagonal MINAR, which clearly has less structure.
The nonnegative integer-valued random process $\{Y_t\}_{t \in \mathbb{Z}}$ is the unique strictly stationary solution of (19.3) if the largest eigenvalue of the matrix $A$ is less than 1 and $E\|R_t\| < \infty$ (see also Franke and Rao, 1995; Latour, 1997).
To help the exposition, consider the case with $r = 2$. The two series can be written as
$$Y_{1t} = \alpha_{11} \circ Y_{1,t-1} + \alpha_{12} \circ Y_{2,t-1} + R_{1t},$$
$$Y_{2t} = \alpha_{22} \circ Y_{2,t-1} + \alpha_{21} \circ Y_{1,t-1} + R_{2t}.$$
This helps in understanding the dynamics. The cross-correlation between the two series comes both from sharing common elements and from the joint distribution of $(R_{1t}, R_{2t})$. If $A$ is a diagonal matrix in this bivariate example, so that $\alpha_{12} = \alpha_{21} = 0$, then the two series are univariate INAR models but are still correlated due to the joint pmf of $(R_{1t}, R_{2t})$.
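As an illustration, here is a simulation sketch of the bivariate model under the simplifying assumption of independent Poisson innovations; any bivariate discrete distribution could be substituted for the joint pmf of $(R_{1t}, R_{2t})$.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_minar1(A, lam, T, y0=(0, 0)):
    """Simulate a bivariate MINAR(1) path Y_t = A o Y_{t-1} + R_t.
    Assumes independent Poisson(lam_i) innovations for simplicity."""
    Y = np.zeros((T, 2), dtype=int)
    y = np.asarray(y0)
    for t in range(T):
        # Row i of A thins Y_{t-1}: alpha_i1 o Y_{1,t-1} + alpha_i2 o Y_{2,t-1}
        thinned = np.array([rng.binomial(y, A[i]).sum() for i in range(2)])
        y = thinned + rng.poisson(lam)
        Y[t] = y
    return Y

A = np.array([[0.4, 0.1],
              [0.2, 0.3]])   # largest eigenvalue 0.5 < 1, so a stationary solution exists
lam = np.array([1.0, 2.0])
Y = simulate_minar1(A, lam, T=5000)
print(np.corrcoef(Y.T))      # empirical cross-correlation of the two series
```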
Taking expectations on both sides of (19.3), it is straightforward to obtain
$$\boldsymbol{\mu} = E(Y_t) = [I - A]^{-1} E(R_t). \tag{19.4}$$
The variance–covariance matrix $\gamma(0) = E[(Y_t - \boldsymbol{\mu})(Y_t - \boldsymbol{\mu})']$ satisfies a difference equation of the form
$$\gamma(0) = A\gamma(0)A' + \operatorname{diag}(B\boldsymbol{\mu}) + \operatorname{Var}(R_t), \tag{19.5}$$
where, for binomial thinning, $B$ is the matrix with elements $\alpha_{ij}(1 - \alpha_{ij})$, consistent with the bivariate expressions given below.
The innovation series $R_t$ consists of identically distributed sequences $\{R_{it}\}_{i=1}^{r}$ and has mean $E(R_t) = \boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_r)'$ and variance
$$\operatorname{Var}(R_t) = \begin{pmatrix} \upsilon_1 \lambda_1 & \phi_{12} & \cdots & \phi_{1r} \\ \phi_{12} & \upsilon_2 \lambda_2 & \cdots & \phi_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ \phi_{1r} & \phi_{2r} & \cdots & \upsilon_r \lambda_r \end{pmatrix},$$
where $\upsilon_i > 0$, $i = 1, \ldots, r$. Depending on the value of the parameter $\upsilon_i$, the assumptions of equidispersion ($\upsilon_i = 1$), overdispersion ($\upsilon_i > 1$), and underdispersion ($\upsilon_i \in (0, 1)$) can be obtained.
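These moment relations are straightforward to evaluate numerically. The sketch below assumes independent Poisson innovations (so $\upsilon_i = 1$ and $\phi_{ij} = 0$) and takes $B$ to have elements $\alpha_{ij}(1 - \alpha_{ij})$, consistent with the bivariate expressions that follow.

```python
import numpy as np

A = np.array([[0.4, 0.1],
              [0.2, 0.3]])
lam = np.array([1.0, 2.0])   # E(R_t) = lambda
var_R = np.diag(lam)         # independent Poisson innovations: Var(R_t) = diag(lambda)

# Stationary mean, equation (19.4): mu = (I - A)^{-1} E(R_t)
mu = np.linalg.solve(np.eye(2) - A, lam)

# Stationary covariance, equation (19.5): gamma(0) = A gamma(0) A' + diag(B mu) + Var(R_t),
# with B_ij = alpha_ij (1 - alpha_ij); solved by fixed-point iteration, which converges
# because the largest eigenvalue of A is below 1.
B = A * (1 - A)
C = np.diag(B @ mu) + var_R
gamma0 = np.zeros((2, 2))
for _ in range(200):
    gamma0 = A @ gamma0 @ A.T + C

print(mu)       # stationary mean vector
print(gamma0)   # stationary variance-covariance matrix gamma(0)
```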
In the bivariate case, that is, when $r = 2$, it can be proved that the vector of expectations (19.4) has elements
$$\mu_1 = \frac{(1 - \alpha_{22})\lambda_1 + \alpha_{12}\lambda_2}{(1 - \alpha_{11})(1 - \alpha_{22}) - \alpha_{12}\alpha_{21}}, \qquad \mu_2 = \frac{(1 - \alpha_{11})\lambda_2 + \alpha_{21}\lambda_1}{(1 - \alpha_{11})(1 - \alpha_{22}) - \alpha_{12}\alpha_{21}},$$
while the elements of $\gamma(0)$ are
$$\gamma_{11}(0) = \operatorname{Var}(Y_{1t}) = \frac{1}{1 - \alpha_{11}^2}\Big[\alpha_{12}^2 \operatorname{Var}(Y_{2t}) + 2\alpha_{11}\alpha_{12}\operatorname{Cov}(Y_{1t}, Y_{2t}) + \alpha_{11}(1 - \alpha_{11})\mu_1 + \alpha_{12}(1 - \alpha_{12})\mu_2 + \upsilon_1\lambda_1\Big],$$
$$\gamma_{22}(0) = \operatorname{Var}(Y_{2t}) = \frac{1}{1 - \alpha_{22}^2}\Big[\alpha_{21}^2 \operatorname{Var}(Y_{1t}) + 2\alpha_{22}\alpha_{21}\operatorname{Cov}(Y_{1t}, Y_{2t}) + \alpha_{22}(1 - \alpha_{22})\mu_2 + \alpha_{21}(1 - \alpha_{21})\mu_1 + \upsilon_2\lambda_2\Big],$$
$$\gamma_{12}(0) = \gamma_{21}(0) = \operatorname{Cov}(Y_{1t}, Y_{2t}) = \frac{\alpha_{11}\alpha_{21}\operatorname{Var}(Y_{1t}) + \alpha_{22}\alpha_{12}\operatorname{Var}(Y_{2t}) + \phi}{1 - \alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21}},$$
where $\phi$ is the covariance between the innovations.
Note that $\operatorname{Cov}(Y_{it}, R_{jt}) = \operatorname{Cov}(R_{it}, R_{jt})$, $i, j = 1, \ldots, r$, $i \neq j$ (Pedeli and Karlis, 2011). That is, the covariance between the current value of one process and the innovations of the other process at time $t$ equals the covariance of the innovations of the two series at the same time $t$.
Regarding the covariance function $\gamma(h) = E[(Y_{t+h} - \boldsymbol{\mu})(Y_t - \boldsymbol{\mu})']$ for $h > 0$, iterative calculations provide us with an expression of the form
$$\gamma(h) = A\gamma(h - 1) = A^h \gamma(0), \quad h \geq 1, \tag{19.6}$$
where $\gamma(0)$ is given by (19.5).
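Numerically, (19.6) amounts to a matrix power. A small sketch, with $\gamma(0)$ treated as given (in practice it would be obtained by solving (19.5) as above):

```python
import numpy as np

def gamma_h(A, gamma0, h):
    """Autocovariance matrix at lag h >= 1 via equation (19.6): gamma(h) = A^h gamma(0)."""
    return np.linalg.matrix_power(A, h) @ gamma0

A = np.array([[0.4, 0.1],
              [0.2, 0.3]])
gamma0 = np.array([[2.2, 0.9],
                   [0.9, 3.6]])   # illustrative value; in practice solve (19.5) for gamma(0)
for h in range(1, 4):
    print(h, gamma_h(A, gamma0, h))  # decays geometrically since eigenvalues of A are < 1
```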
Applying the well-known Cayley–Hamilton theorem to (19.6), it is straightforward to show that the marginal processes have an ARMA($r$, $r-1$) correlation structure. Since $A$ is an $r \times r$ matrix, the Cayley–Hamilton theorem ensures that there exist constants $\xi_1, \ldots, \xi_r$ such that $A^r - \xi_1 A^{r-1} - \cdots - \xi_r I = 0$. Thus, $\gamma(h)$ satisfies
$$\gamma(h) - \xi_1 \gamma(h - 1) - \cdots - \xi_r \gamma(h - r) = 0, \quad h \geq r. \tag{19.7}$$
Equations (19.6) and (19.7) hold for every element of $\gamma(h)$, and hence the autocorrelation function of $\{Y_{jt}\}$, $j = 1, \ldots, r$, satisfies
$$\rho_{jj}(h) - \sum_{i=1}^{r} \xi_i \rho_{jj}(h - i) = 0, \quad h \geq r.$$
Thus, each component has an ARMA($r$, $r-1$) correlation structure (see also McKenzie, 1988; Dewald et al., 1989). In the simplest case of a BINAR(1) model, the marginal processes have ARMA(2,1) correlations with $\xi_1 = \alpha_{11} + \alpha_{22}$ and $\xi_2 = \alpha_{12}\alpha_{21} - \alpha_{11}\alpha_{22}$. For the diagonal MINAR($p$) case, the marginal process is the simple univariate INAR($p$) process.
Al-Osh and Alzaid (1987) expressed the marginal distribution of the INAR(1) model in terms of the innovation sequence $\{R_t\}$, that is, $Y_t \stackrel{d}{=} \sum_{i=0}^{\infty} \alpha^i \circ R_{t-i}$. This result was easily extended to the case of a diagonal MINAR(1) process (Pedeli and Karlis, 2013c), where
$$Y_{jt} \stackrel{d}{=} \sum_{i=0}^{\infty} \alpha_j^i \circ R_{j,t-i}.$$
For the general MINAR(1) process, the distribution can also be expressed in terms of the multivariate innovation sequence $R_t$ as
$$Y_t \stackrel{d}{=} \sum_{i=0}^{\infty} A^i \circ R_{t-i},$$
where $A^i = P D^i P^{-1}$. Here, $P$ is the matrix of the eigenvectors of $A$ and $D$ is the diagonal matrix of the eigenvalues of $A$. Since all the eigenvalues must be smaller than 1 for stationarity to hold, the matrix $D^i$ tends to a zero matrix as $i \to \infty$, and hence $A^i$ tends to zero as well.
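A quick numerical check of this representation, assuming $A$ is diagonalizable (which the eigendecomposition below requires):

```python
import numpy as np

A = np.array([[0.4, 0.1],
              [0.2, 0.3]])
eigvals, P = np.linalg.eig(A)    # columns of P are eigenvectors of A
D = np.diag(eigvals)

i = 10
Ai = P @ np.linalg.matrix_power(D, i) @ np.linalg.inv(P)
print(np.allclose(Ai, np.linalg.matrix_power(A, i)))   # True: A^i = P D^i P^{-1}
print(Ai)   # entries shrink toward zero as i grows, since all eigenvalues are below 1
```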
The usefulness of such expressions is that they facilitate the derivation of the (joint) probability generating function (pgf) of the (multivariate) process, thus revealing its distribution. Assuming stationarity, the joint pgf $G_Y(s)$ satisfies the difference equation
$$G_Y(s) = G_Y(A^T s)\, G_R(s).$$
More details can be found in Pedeli and Karlis (2013c).
Extensions of the model mentioned earlier are possible. One can add covariates to the mean of the innovations using a log link function. This allows us to fit the effect of covariates to the observed multivariate time series; see Pedeli and Karlis (2013b) for such an application. Also, extensions to higher order are straightforward but lead to rather complicated models.
19.3.3 Estimation
The least squares approach to estimation was discussed in Latour (1997). However, under parametric assumptions for the innovations, other estimation methods become available. Parametric models also offer more flexibility for prediction.
For the estimation of the BINAR(1) model, the method of conditional maximum likelihood can be used. The conditional density of the BINAR(1) model can be constructed as the convolution of
$$f_1(k) = \sum_{j_1=0}^{k} \binom{y_{1,t-1}}{j_1} \binom{y_{2,t-1}}{k - j_1} \alpha_{11}^{j_1}(1 - \alpha_{11})^{y_{1,t-1} - j_1}\, \alpha_{12}^{k - j_1}(1 - \alpha_{12})^{y_{2,t-1} - k + j_1},$$
$$f_2(s) = \sum_{j_2=0}^{s} \binom{y_{2,t-1}}{j_2} \binom{y_{1,t-1}}{s - j_2} \alpha_{22}^{j_2}(1 - \alpha_{22})^{y_{2,t-1} - j_2}\, \alpha_{21}^{s - j_2}(1 - \alpha_{21})^{y_{1,t-1} - s + j_2},$$
and a bivariate distribution of the form $f_3(r_1, r_2) = P(R_{1t} = r_1, R_{2t} = r_2)$. The functions $f_1(\cdot)$ and $f_2(\cdot)$ are each the pmf of a convolution of two binomial variates. Thus, the conditional density takes the form
$$f(y_t \,|\, y_{t-1}, \theta) = \sum_{k=0}^{g_1} \sum_{s=0}^{g_2} f_1(k)\, f_2(s)\, f_3(y_{1t} - k,\, y_{2t} - s),$$
where $g_1 = \min(y_{1t}, y_{1,t-1})$ and $g_2 = \min(y_{2t}, y_{2,t-1})$. Maximum likelihood estimates of the vector of unknown parameters $\theta$ can be obtained by maximization of the conditional likelihood function
$$L(\theta \,|\, y) = \prod_{t=1}^{T} f(y_t \,|\, y_{t-1}, \theta) \tag{19.8}$$
for some initial value $y_0$. The asymptotic normality of the conditional maximum likelihood estimate $\hat{\theta}$ has been shown by Franke and Rao (1995) after imposing a set of regularity conditions and applying the results of Billingsley (1961) for the estimation of Markov processes.
Numerical maximization of (19.8) is straightforward with standard statistical packages. The binomial convolution involves only finite summation and hence is feasible. Note also that, since the pgf of a binomial distribution is a polynomial, the pmf of the convolution can be derived easily via polynomial multiplication using packages in R. Depending on the choice of innovation distribution, the conditional maximum likelihood (CML) approach can be applied. In Pedeli and Karlis (2013c), a bivariate Poisson and a bivariate negative binomial distribution were used, and prediction for these parametric models was discussed. An interesting result is that for bivariate Poisson innovations the univariate series have a Hermite marginal distribution. In Karlis and Pedeli (2013), a copula-based bivariate innovation distribution was used, allowing negative cross-correlation.
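To illustrate the computation, the following is a minimal sketch of the conditional log-likelihood (19.8) under the simplifying assumption of independent Poisson innovations, so that $f_3$ factorizes into a product of two Poisson pmfs; for dependent innovations one would replace `f3` with a bivariate Poisson or copula-based pmf as in the references above.

```python
import numpy as np
from scipy.stats import binom, poisson

def binconv_pmf(k, n1, p1, n2, p2):
    """pmf at k of the convolution Bin(n1, p1) + Bin(n2, p2) (the f1 and f2 in the text)."""
    j = np.arange(max(0, k - n2), min(k, n1) + 1)
    return np.sum(binom.pmf(j, n1, p1) * binom.pmf(k - j, n2, p2))

def cond_loglik(theta, y):
    """Conditional log-likelihood (19.8) of a BINAR(1) model.
    theta = (a11, a12, a21, a22, lam1, lam2); y is a T x 2 array of counts.
    Assumes independent Poisson innovations: f3(r1, r2) = Pois(r1; lam1) Pois(r2; lam2)."""
    a11, a12, a21, a22, lam1, lam2 = theta
    ll = 0.0
    for t in range(1, len(y)):
        p = 0.0
        for k in range(y[t, 0] + 1):
            f1 = binconv_pmf(k, y[t - 1, 0], a11, y[t - 1, 1], a12)
            if f1 == 0.0:
                continue
            for s in range(y[t, 1] + 1):
                f2 = binconv_pmf(s, y[t - 1, 1], a22, y[t - 1, 0], a21)
                p += (f1 * f2 * poisson.pmf(y[t, 0] - k, lam1)
                      * poisson.pmf(y[t, 1] - s, lam2))
        ll += np.log(p)
    return ll
```

Because the binomial convolutions involve only finite sums, each likelihood evaluation is exact; the resulting function can be passed to a general-purpose optimizer (e.g., minimizing its negative with `scipy.optimize.minimize`), with each $\alpha$ constrained to $(0, 1)$ and each $\lambda$ to $(0, \infty)$.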
When moving to the multivariate case, things become more demanding. First of all, a multivariate discrete distribution is needed for the innovations. As discussed in Section 19.2, such models can be complicated. In Pedeli and Karlis (2013a), a multivariate Poisson distribution is assumed with a diagonal matrix $A$. Even in this case, the pmf of the multivariate Poisson distribution is demanding since multiple summation is needed. The conditional likelihood can be derived as in the bivariate case, but now it is a convolution of several binomials and a multivariate discrete distribution. Alternatively, a composite likelihood approach can be used. Composite likelihood methods are based on the idea of constructing lower-dimensional score functions that still contain enough information about the structure considered but are computationally more tractable (Varin, 2008). See also Davis and Yau (2011) for asymptotic properties of composite likelihood methods applied to linear time series models.