(b) Under expectation thinning compounding and a GDSD marginal (with pgf $G_Y(s)$), the stationary time series model is (2.2), where the innovation $\epsilon_t$ has pgf $G_Y(s)/G_Y(G_K(s; \alpha))$.
Definition 2.2 (Self-generalized $\{K(\alpha)\}$). Consider a family of $K(\alpha) \sim F_K(\cdot\,; \alpha)$ with $E[K(\alpha)] = \alpha$ and pgf $G_K(s; \alpha) = E[s^{K(\alpha)}]$, $\alpha \in [0, 1]$. Then $\{F_K(\cdot\,; \alpha)\}$ is self-generalized iff
$$G_K(G_K(s; \alpha); \alpha') = G_K(s; \alpha\alpha'), \quad \forall\, \alpha, \alpha' \in (0, 1).$$
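For example, binomial thinning ($K(\alpha) \sim \mathrm{Bernoulli}(\alpha)$) satisfies this identity, since the composition of two linear pgfs is again linear. A minimal symbolic check (an illustrative sketch, not part of the source):

```python
import sympy as sp

s, a1, a2 = sp.symbols('s alpha1 alpha2', positive=True)

def G_K(s, a):
    # Binomial-thinning pgf: K(alpha) ~ Bernoulli(alpha)
    return (1 - a) + a * s

# Semigroup identity of Definition 2.2: G_K(G_K(s; a1); a2) = G_K(s; a1*a2)
assert sp.simplify(G_K(G_K(s, a1), a2) - G_K(s, a1 * a2)) == 0
```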
For binomial thinning, the class of possible margins is called the discrete self-decomposable (DSD) class. Note that unless $Y$ is Poisson and $\{K(\alpha)\}$ corresponds to binomial thinning, the distribution of the innovation is in a different parametric family than $F_Y$.
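To illustrate the Poisson exception (a sketch assuming a Poisson($\mu$) marginal; the algebra is standard, not specific to this chapter): dividing the Poisson pgf by its binomially thinned version, as in the innovation pgf of part (b), shows the innovation is again Poisson, with mean $\mu(1-\alpha)$.

```python
import sympy as sp

s, alpha, mu = sp.symbols('s alpha mu', positive=True)

G_Y = sp.exp(mu * (s - 1))      # Poisson(mu) marginal pgf
G_K = (1 - alpha) + alpha * s   # binomial-thinning pgf

# Innovation pgf G_Y(s)/G_Y(G_K(s; alpha)) from part (b)
G_eps = sp.simplify(G_Y / G_Y.subs(s, G_K))
print(G_eps)  # exp(mu*(1 - alpha)*(s - 1)), i.e., Poisson(mu*(1 - alpha))
```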
The terminology of self-generalizability is used in Zhu and Joe (2010b), and the concept is called a semigroup operator in Van Harn and Steutel (1993). Zhu and Joe (2010a) show (1) that $\mathrm{Var}[K(\alpha)] = \sigma^2_{K(\alpha)} = a_K\,\alpha(1-\alpha)$, where $a_K \ge 1$ for a self-generalized family $\{K(\alpha)\}$, and (2) that generalized thinning operators without self-generalizability lack some closure properties. Also, self-generalizability is a nice property for embedding into a continuous-time process.
For NB, Zhu and Joe (2010b) show that NB($\theta, \xi$) is GDSD for three self-generalized thinning operators that are given below. For NB($\theta, \xi$), with the parametrization given in Section 2.2 (so that $\pi = 1/(1+\xi)$), the pgf is $G_{\mathrm{NB}}(s; \theta, \xi) = [\pi/\{1 - (1-\pi)s\}]^{\theta}$, for $s > 0$, $\theta > 0$ and $\xi > 0$. Three types of thinning operators based on $\{K(\alpha)\}$ are given below in terms of the pgf, together with $\mathrm{Var}[K(\alpha)]$; the second operator (I2) has been used by various authors in several different parametrizations, and the specification is simplest via pgfs. The different $\{K(\alpha)\}$ families allow different degrees of conditional heteroscedasticity; a symbolic check of their moments follows the list.
(I1) (binomial thinning) $G_K(s; \alpha) = (1-\alpha) + \alpha s$, with $\mathrm{Var}[K(\alpha)] = \alpha(1-\alpha)$.
(I2) $G_K(s; \alpha; \gamma) = \dfrac{(1-\alpha) + (\alpha-\gamma)s}{(1-\alpha\gamma) - (1-\alpha)\gamma s}$, $0 \le \gamma < 1$, with $\mathrm{Var}[K(\alpha)] = \alpha(1-\alpha)(1+\gamma)/(1-\gamma)$. Note that $\gamma = 0$ implies $G_K(s; \alpha) = (1-\alpha) + \alpha s$.
(I3) $G_K(s; \alpha; \gamma) = \gamma^{-1}\left[1 + \gamma - (1 + \gamma - \gamma s)^{\alpha}\right]$, $\gamma \ge 0$, with $\mathrm{Var}[K(\alpha)] = \alpha(1-\alpha)(1+\gamma)$. Note that $\gamma \to 0$ implies $G_K(s; \alpha) = (1-\alpha) + \alpha s$.
For NB($\theta, \xi$), GDSD with respect to I2($\gamma$) holds for $0 \le \gamma \le 1 - \pi = \xi/(1+\xi)$, and GDSD with respect to I3($\gamma$) holds for $0 \le \gamma \le (1-\pi)/\pi = \xi$. For GP($\theta, \eta$), the DSD property is shown in Zhu and Joe (2003), and it can be shown that GP($\theta, \eta$) is GDSD with respect to I2($\gamma(\eta)$), where $\gamma(\eta)$ increases as the overdispersion $\eta$ increases. Note that the GP distribution does not have a closed-form pgf.
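A numerical way to see the I2($\gamma$) bound for NB (an illustrative sketch with parameter values chosen arbitrarily here, using $\pi = 1/(1+\xi)$): GDSD means the innovation pgf $G_{\mathrm{NB}}(s)/G_{\mathrm{NB}}(G_K(s; \alpha))$ must have nonnegative power-series coefficients, which can be spot-checked with a truncated expansion.

```python
import sympy as sp

s = sp.symbols('s')

# NB(theta, xi) pgf with pi = 1/(1 + xi); theta = 2, xi = 2 are arbitrary
theta, xi = 2, sp.Rational(2)
pi = 1 / (1 + xi)
G_NB = (pi / (1 - (1 - pi) * s))**theta

# I2 thinning pgf; the GDSD bound here is gamma <= 1 - pi = 2/3
alpha, gamma = sp.Rational(1, 2), sp.Rational(1, 2)
G_K = ((1 - alpha) + (alpha - gamma) * s) \
      / ((1 - alpha * gamma) - (1 - alpha) * gamma * s)

# Innovation pgf; a valid pgf has nonnegative series coefficients
G_eps = sp.simplify(G_NB / G_NB.subs(s, G_K))
poly = sp.series(G_eps, s, 0, 12).removeO()
print(all(poly.coeff(s, k) >= 0 for k in range(12)))  # expect True
```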
2.3.4 Estimation
For parameter estimation in count time series models, a common approach is conditional least squares (CLS). This involves minimizing $\sum_{i=2}^{n} \bigl(y_i - E[Y_i \mid y_{i-1}, y_{i-2}, \ldots]\bigr)^2$ for a time series of length $n$. For a stationary model, it is straightforward to get point estimators of $\mu_Y$ and some autocorrelation parameters. One problem with CLS is that it cannot distinguish overdispersed Poisson models for $\epsilon_t$ and $Y_t$. For example, if a NB or GP time series is assumed with one of the above generalized thinning operators, then the overdispersion cannot be reliably estimated with an extra moment equation after CLS.
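As a concrete illustration (a minimal sketch for the simplest case, a Poisson INAR(1) with binomial thinning; the parameter values and seed are arbitrary): since $E[Y_i \mid y_{i-1}] = \alpha y_{i-1} + \mu_\epsilon$ is linear, CLS reduces to ordinary least squares of $y_i$ on $y_{i-1}$, giving point estimates of $\alpha$ and $\mu_Y = \mu_\epsilon/(1-\alpha)$. Note that an overdispersion parameter would not enter this conditional mean, which is the identifiability problem just described.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a Poisson INAR(1): Y_i = alpha o Y_{i-1} + eps_i, with binomial
# thinning and Poisson(mu_Y * (1 - alpha)) innovations (arbitrary settings)
alpha_true, mu_Y, n = 0.5, 4.0, 2000
y = np.empty(n, dtype=np.int64)
y[0] = rng.poisson(mu_Y)
for i in range(1, n):
    y[i] = (rng.binomial(y[i - 1], alpha_true)
            + rng.poisson(mu_Y * (1 - alpha_true)))

# CLS: minimize sum_{i=2}^n (y_i - alpha*y_{i-1} - mu_eps)^2, i.e., OLS
X = np.column_stack([np.ones(n - 1), y[:-1]])
mu_eps_hat, alpha_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
print(alpha_hat, mu_eps_hat / (1 - alpha_hat))  # approx alpha_true and mu_Y
```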