Markov Models for Count Time Series
There is a body of theory that covers several count time series models when $f_Y$ is convolution-closed and infinitely divisible (CCID), because these properties allow a joint multivariate distribution to be constructed and lead to thinning operators. This theory provides a bridge between the thinning operator approach of Section 2.3 and the general Markov approach with copulas in Section 2.5.
The operators have been studied in specific discrete cases by McKenzie (1985, 1986, 1988), Al-Osh and Alzaid (1987), and Alzaid and Al-Osh (1993), and in a more general framework in Joe (1996) and Jørgensen and Song (1998).
The general operator is presented first for the Markov order 1 case; it is then mentioned how the construction extends to higher-order Markov and $q$-dependent series, and how covariates can be accommodated. For this construction, the Markov order 1 model has a linear conditional expectation, but the models of Markov order 2 or higher do not.
Let $\{F(\cdot;\theta) : \theta > 0\}$ be a CCID parametric family such that $F(\cdot;\theta_1) * F(\cdot;\theta_2) = F(\cdot;\theta_1+\theta_2)$, where $*$ is the convolution operator; $F(\cdot;0)$ corresponds to the degenerate distribution at 0.
For $X_j \sim F(\cdot;\theta_j)$, $j = 1, 2$, with $X_1, X_2$ independent, let $H(\cdot;\theta_1,\theta_2,y)$ be the distribution of $X_1$ given that $X_1 + X_2 = y$. Let $R(\cdot) = R(\cdot;\alpha,\theta)$ ($0 < \alpha \le 1$) be a random operator such that $R(Y)$ given $Y = y$ has distribution $H(\cdot;\alpha\theta,(1-\alpha)\theta,y)$, and $R(Y) \sim F(\cdot;\alpha\theta)$ when $Y \sim F(\cdot;\theta)$.
A stationary time series with margin $F(\cdot;\theta)$ and autocorrelation $0 < \alpha < 1$ (at lag 1) can be constructed as
$$Y_t = R_t(Y_{t-1}) + \epsilon_t, \qquad R_t(y_{t-1}) \sim H(\cdot;\alpha\theta,(1-\alpha)\theta,\,y_{t-1}), \qquad (2.11)$$
since $F(\cdot;\theta) = F(\cdot;\theta\alpha) * F(\cdot;\theta(1-\alpha))$, when the innovations $\epsilon_t$ are independent and identically distributed with distribution $F(\cdot;(1-\alpha)\theta)$. Note that $\{R_t : t \ge 1\}$ are independent replications of the operator $R$.
The intuitive reasoning is as follows. A consecutive pair $(Y_{t-1}, Y_t)$ has a common latent or unobserved component $X_{12}$ through the stochastic representation
$$Y_{t-1} = X_{12} + X_1, \qquad Y_t = X_{12} + X_2,$$
where $X_{12}, X_1, X_2$ are independent random variables with distributions $F(\cdot;\alpha\theta)$, $F(\cdot;(1-\alpha)\theta)$, $F(\cdot;(1-\alpha)\theta)$, respectively. The operator $R_t(Y_{t-1})$ "recovers" the unobserved common component $X_{12}$; hence the distribution of $R_t(y)$ given $Y_{t-1} = y$ must be the same as the distribution of $X_{12}$ given $X_{12} + X_1 = y$.
Examples of CCID operators for the infinitely divisible Poisson, NB, and GP distributions are given below.
1. If $F(\cdot;\theta)$ is Po($\theta$), then $H(\cdot;\alpha\theta,(1-\alpha)\theta,y)$ is Bin($y,\alpha$). The resulting operator is binomial thinning (a simulation sketch of this case and the next is given below, after the list).
2. If $F(\cdot;\theta) = F_{\rm NB}(\cdot;\theta,\xi)$ with fixed $\xi > 0$, then $H(\cdot;\alpha\theta,(1-\alpha)\theta,y)$, or $\Pr(X_1 = x \mid X_1 + X_2 = y)$ with $X_j$ independently NB$(\theta_j,\xi)$, is Beta-binomial$(y,\alpha\theta,(1-\alpha)\theta)$, independent of $\xi$. The pmf of $H$ is
$$h(x;\theta_1,\theta_2,y) = \binom{y}{x}\,\frac{B(\theta_1+x,\,\theta_2+y-x)}{B(\theta_1,\theta_2)}, \qquad x = 0, 1, \ldots, y.$$
The operator matches the random coefficient thinning in Section 2.3, but not binomial thinning or generalized thinning. This first appeared in McKenzie (1986). For (2.11) based on this operator, $E[Y_t \mid Y_{t-1} = y] = \alpha y + (1-\alpha)\theta\xi$, and
$$\mathrm{Var}(Y_t \mid Y_{t-1} = y) = (1-\alpha)\theta\xi(1+\xi) + \frac{y(\theta+y)\alpha(1-\alpha)}{\theta+1}.$$
The conditional variance is quadratically increasing in $y$ for large $y$, and hence this process has more conditional heteroscedasticity than those based on compounding operators in Section 2.3.
3. If $F(\cdot;\theta) = F_{\rm GP}(\cdot;\theta,\eta)$ with $0 < \eta < 1$ fixed, then $H(\cdot;\alpha\theta,(1-\alpha)\theta,y)$, or $\Pr(X_1 = x \mid X_1 + X_2 = y)$ with $X_j$ independently GP$(\theta_j,\eta)$, is a quasi-binomial distribution with parameters $\pi = \theta_1/(\theta_1+\theta_2)$ and $\zeta = \eta/(\theta_1+\theta_2)$. The quasi-binomial pmf is
$$h(x;\pi,\zeta,y) = \binom{y}{x}\,\frac{\pi(1-\pi)}{1+\zeta y}\,\Big(\frac{\pi+\zeta x}{1+\zeta y}\Big)^{x-1}\Big(\frac{1-\pi+\zeta(y-x)}{1+\zeta y}\Big)^{y-x-1},$$
for $x = 0, 1, \ldots, y$. For (2.11) with this operator, $E[Y_t \mid Y_{t-1} = y] = \alpha y + (1-\alpha)\theta/(1-\eta)$ and
$$\mathrm{Var}[Y_t \mid Y_{t-1} = y] = \alpha(1-\alpha)\Big[y^2 - \sum_{j=0}^{y-2}\frac{y!\,\zeta^j}{(y-j-2)!\,(1+y\zeta)^{j+1}}\Big] + \frac{(1-\alpha)\theta}{(1-\eta)^3}, \qquad \zeta = \eta/\theta;$$
see Alzaid and Al-Osh (1993). Numerically this is superlinear and asymptotically $O(y^2)$ as $y \to \infty$.
These operators can be used for INMA(q) and INARMA(1, q) models in an analogous
manner to the models in (2.8) and (2.9).
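To make construction (2.11) concrete, below is a minimal simulation sketch (assuming NumPy is available; the parameter values are illustrative, not from the text) of the Poisson chain of case 1 and the NB chain of case 2, with empirical checks that the stationary mean and lag 1 autocorrelation match $\theta$ (or $\theta\xi$) and $\alpha$.

```python
import numpy as np

rng = np.random.default_rng(12345)

def sim_poisson_ccid(n, theta, alpha):
    """Markov order 1 CCID chain (2.11) with Po(theta) margin:
    R_t is binomial thinning Bin(y, alpha); innovations are Po((1-alpha)*theta)."""
    y = np.empty(n, dtype=np.int64)
    y[0] = rng.poisson(theta)                      # start at the stationary margin
    for t in range(1, n):
        r = rng.binomial(y[t - 1], alpha)          # H(.; a*th, (1-a)*th, y) = Bin(y, alpha)
        y[t] = r + rng.poisson((1 - alpha) * theta)
    return y

def sim_nb_ccid(n, theta, xi, alpha):
    """Markov order 1 CCID chain with NB(theta, xi) margin (mean theta*xi):
    R_t given y is Beta-binomial(y, alpha*theta, (1-alpha)*theta)."""
    p = 1.0 / (1.0 + xi)                           # NB success probability for mean theta*xi
    y = np.empty(n, dtype=np.int64)
    y[0] = rng.negative_binomial(theta, p)
    for t in range(1, n):
        q = rng.beta(alpha * theta, (1 - alpha) * theta)  # latent beta coefficient
        r = rng.binomial(y[t - 1], q)                     # beta-binomial "thinning"
        y[t] = r + rng.negative_binomial((1 - alpha) * theta, p)
    return y

def lag1_acf(y):
    z = y - y.mean()
    return (z[1:] * z[:-1]).sum() / (z * z).sum()

y_po = sim_poisson_ccid(100_000, theta=4.0, alpha=0.5)
y_nb = sim_nb_ccid(100_000, theta=4.0, xi=1.5, alpha=0.5)
print(y_po.mean(), lag1_acf(y_po))  # ~ 4.0 (= theta) and ~ 0.5 (= alpha)
print(y_nb.mean(), lag1_acf(y_nb))  # ~ 6.0 (= theta*xi) and ~ 0.5 (= alpha)
```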
Next, we present the Markov order 2 extension of Joe (1996) and Jung and Tremayne (2011). Consider the following model for three consecutive observations:
$$Y_{t-2} = X_{123} + X_{12} + X_{13} + X_1,$$
$$Y_{t-1} = X_{123} + X_{12} + X_{23} + X_2, \qquad (2.12)$$
$$Y_t = X_{123} + X_{23} + X_{13} + X_3,$$
where $X_1, X_2, X_3, X_{12}, X_{13}, X_{23}, X_{123}$ are independent random variables with distributions in the family $F(\cdot;\theta)$ with respective convolution parameters $\theta_1^* = \theta - \theta_0 - \theta_1 - \theta_2$, $\theta_2^* = \theta - \theta_0 - 2\theta_1$, $\theta_1^*$, $\theta_1$, $\theta_2$, $\theta_1$, $\theta_0$; that is, $X_{12}$ and $X_{23}$ share the parameter $\theta_1$, $X_{13}$ has $\theta_2$, and $X_{123}$ has $\theta_0$ (the parameters are constrained so that $\theta_1^*, \theta_2^*$ are nonnegative). The conditional probability $\Pr(Y_t = y_{\rm new} \mid Y_{t-1} = y_{\rm prev1}, Y_{t-2} = y_{\rm prev2})$ does not lead to a simple operator as in the Markov order 1 case, so computationally one can just use
$$\Pr(Y_t = w_3 \mid Y_{t-1} = w_2, Y_{t-2} = w_1) = \frac{\Pr(Y_{t-2} = w_1,\, Y_{t-1} = w_2,\, Y_t = w_3)}{\Pr(Y_{t-2} = w_1,\, Y_{t-1} = w_2)}.$$
The numerator involves a quadruple sum:
$$\sum_{x_{123}=0}^{w_1 \wedge w_2 \wedge w_3}\ \sum_{x_{12}=0}^{(w_1-x_{123}) \wedge (w_2-x_{123})}\ \sum_{x_{23}=0}^{(w_3-x_{123}) \wedge (w_2-x_{123}-x_{12})}\ \sum_{x_{13}=0}^{(w_1-x_{123}-x_{12}) \wedge (w_3-x_{123}-x_{23})} f(x_{123};\theta_0)\, f(x_{12};\theta_1)\, f(x_{23};\theta_1)\, f(x_{13};\theta_2)$$
$$\cdot\, f(w_1-x_{123}-x_{12}-x_{13};\theta_1^*)\, f(w_2-x_{123}-x_{12}-x_{23};\theta_2^*)\, f(w_3-x_{123}-x_{23}-x_{13};\theta_1^*).$$
For a model simplification, let $\theta_2 = 0$ (the convolution parameter of $X_{13}$) so that $X_{13} = 0$; then the numerator becomes a triple sum. Setting $X_{13} = 0$ is sufficient to get a one-parameter extension of Markov order 1; this Markov order 2 model becomes the Markov order 1 model when $\theta_0 = \alpha^2\theta$, $\theta_1 = \alpha(1-\alpha)\theta$, $\theta_1^* = (1-\alpha)\theta$, $\theta_2^* = (1-\alpha)^2\theta$.
When $\theta_2 = 0$, and $\alpha_1 \ge \alpha_2 \ge 0$ are the autocorrelations at lags 1 and 2, the time series model has a stochastic representation:
$$Y_t = R_t(Y_{t-1}, Y_{t-2}) + \epsilon_t, \qquad \epsilon_t \sim F(\cdot;(1-\alpha_1)\theta), \qquad (2.13)$$
where, via (2.12), $R_t(y_{t-1}, y_{t-2})$ has the conditional distribution of $X_{123} + X_{12} + X_{23}$ given $Y_{t-1} = y_{t-1}, Y_{t-2} = y_{t-2}$, and the convolution parameters of $X_{123}, X_{12}, X_1, X_2$ are, respectively, $\alpha_2\theta$, $(\alpha_1-\alpha_2)\theta$, $(1-\alpha_1)\theta$, $(1-2\alpha_1+\alpha_2)\theta$, with $\alpha_2 \ge 2\alpha_1 - 1$. If $\theta_2 > 0$, then the convolution parameters of $X_{123}, X_{12}, X_{13}, X_1, X_2$ are, respectively, $\theta_0, \theta_1, \theta_2, \theta_1^*, \theta_2^*$, with $\alpha_1 = (\theta_1 + \theta_0)/\theta$ and $\alpha_2 = (\theta_2 + \theta_0)/\theta$.
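As a computational sketch of the simplified model with $X_{13} = 0$ (the Poisson pmf and the parameter values are illustrative assumptions for this example, not specifications from the text), the transition probability can be evaluated by brute-force summation using the $(\theta, \alpha_1, \alpha_2)$ parameterization of (2.13):

```python
from scipy.stats import poisson

def joint3_pmf(w1, w2, w3, theta, a1, a2):
    """Pr(Y_{t-2}=w1, Y_{t-1}=w2, Y_t=w3) for the order 2 CCID model with X_13 = 0
    and Poisson margins, via the triple sum (requires a1 >= a2 >= 0 and a2 >= 2*a1 - 1)."""
    th0, th1 = a2 * theta, (a1 - a2) * theta                  # parameters of X_123; X_12, X_23
    th1s, th2s = (1 - a1) * theta, (1 - 2 * a1 + a2) * theta  # parameters of X_1, X_3; X_2
    total = 0.0
    for x123 in range(min(w1, w2, w3) + 1):
        for x12 in range(min(w1, w2) - x123 + 1):
            for x23 in range(min(w2 - x12, w3) - x123 + 1):
                total += (poisson.pmf(x123, th0) * poisson.pmf(x12, th1)
                          * poisson.pmf(x23, th1)
                          * poisson.pmf(w1 - x123 - x12, th1s)
                          * poisson.pmf(w2 - x123 - x12 - x23, th2s)
                          * poisson.pmf(w3 - x123 - x23, th1s))
    return total

def joint2_pmf(w1, w2, theta, a1):
    """Pr(Y_{t-1}=w1, Y_t=w2): the common component X_123 + X_12 has parameter a1*theta."""
    return sum(poisson.pmf(x, a1 * theta)
               * poisson.pmf(w1 - x, (1 - a1) * theta)
               * poisson.pmf(w2 - x, (1 - a1) * theta)
               for x in range(min(w1, w2) + 1))

def trans_prob(w3, w2, w1, theta=3.0, a1=0.5, a2=0.3):
    """Pr(Y_t = w3 | Y_{t-1} = w2, Y_{t-2} = w1)."""
    return joint3_pmf(w1, w2, w3, theta, a1, a2) / joint2_pmf(w1, w2, theta, a1)

# sanity check: the conditional pmf sums to ~1
print(sum(trans_prob(w3, 2, 4) for w3 in range(60)))
```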
The pattern extends to higher-order Markov models, but numerically the transition probability becomes too cumbersome, because the most general $p$-dimensional distribution of this type involves $2^p - 1$ independent $X_S$ for $S$ a nonempty subset of $\{1, \ldots, p\}$. As mentioned by Jung and Tremayne (2011), the autocorrelation structure of this Markov model for $p \ge 2$ with Poisson, NB, or GP margins does not mimic the Gaussian counterpart, because of a nonlinear conditional mean function.
Because the distribution of the innovation is in the same family as the stationary marginal distribution, the models can be extended easily so that the convolution parameter of $Y_t$ is $\theta_t$, which depends on time-varying covariates. For example, for the Markov order 1 model, with $Y_t \sim F(\cdot;\theta_t)$, take $R_t(Y_{t-1}) \sim F(\cdot;\alpha\theta_{t-1})$ and $\epsilon_t \sim F(\cdot;\zeta_t)$ with $\zeta_t = \theta_t - \alpha\theta_{t-1} \ge 0$ (Joe 1997, Section 8.4.4). For NB and GP, this means the univariate regression models are NB1 and GP1, respectively.
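Here is a sketch of this covariate extension for the Poisson case; the log link $\theta_t = \exp(\beta_0 + \beta_1 z_t)$, the seasonal covariate, and the coefficient values are illustrative assumptions, not specifications from the text.

```python
import numpy as np

rng = np.random.default_rng(7)

def sim_poisson_covariates(z, beta0, beta1, alpha):
    """Markov order 1 Poisson chain with time-varying convolution parameter
    theta_t = exp(beta0 + beta1 * z_t) and innovation parameter
    zeta_t = theta_t - alpha * theta_{t-1}, which must be nonnegative."""
    theta = np.exp(beta0 + beta1 * np.asarray(z))
    zeta = theta[1:] - alpha * theta[:-1]
    if np.any(zeta < 0):
        raise ValueError("need theta_t >= alpha * theta_{t-1} for all t")
    y = np.empty(len(theta), dtype=np.int64)
    y[0] = rng.poisson(theta[0])
    for t in range(1, len(theta)):
        y[t] = rng.binomial(y[t - 1], alpha) + rng.poisson(zeta[t - 1])
    return y

z = np.sin(np.linspace(0.0, 6 * np.pi, 500))   # hypothetical seasonal covariate
y = sim_poisson_covariates(z, beta0=1.0, beta1=0.3, alpha=0.4)
print(y[:20])
```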
2.5 Copula-Based Transition
The copula modeling approach is a way to get a joint distribution for $(Y_{t-p}, \ldots, Y_t)$ without an assumption of infinite divisibility. Hence the univariate margin can be any distribution in the stationary case. However, the property of linear conditional expectation for the Markov order 1 process is lost. For a $(p+1)$-variate copula $C_{1:(p+1)}$, $F_{1:(p+1)} = C_{1:(p+1)}(F_Y, \ldots, F_Y)$ is a model for the multivariate discrete distribution of $(Y_{t-p}, \ldots, Y_t)$.
For stationarity, the marginal copulas satisfy $C_{1:m} = C_{(1+i):(m+i)}$ for $i = 1, \ldots, p+1-m$ and $m = 2, \ldots, p$. The resulting transition probability $\Pr(Y_t = y_t \mid Y_{t-p} = y_{t-p}, \ldots, Y_{t-1} = y_{t-1})$ can be computed from $F_{1:(p+1)}$. If $Y$ were a continuous random variable, there would be a simple stochastic representation for the copula-based Markov model in terms of U(0,1) random variables, but this is not the case for $Y$ discrete.
If there are time-varying covariates $z_t$ so that $F_{Y_t} = F(\cdot;\beta,z_t)$, then one can use $F_{1:(p+1)} = C_{1:(p+1)}(F_{Y_{t-p}}, \ldots, F_{Y_t})$ for the distribution of $(Y_{t-p}, \ldots, Y_t)$ with Markov dependence and a time-varying parameter in the univariate margin.
For $q$-dependence, one can get a time series model $\{F_Y^{-1}(U_t)\}$ with stationary margin $F_Y$ if $\{U_t\}$ is a $q$-dependent sequence of U(0,1) random variables. For mixed Markov/$q$-dependence, a copula model that combines features of Markov and $q$-dependence can be defined. Chapter 8 of Joe (1997) has copula time series models for Markov dependence and 1-dependence.
More specic details of parametric models are given for Markov order 1, followed by
brief mention of higher-order Markov, q-dependent and mixed Markov/q-dependent.
For a stationary time series model with stationary univariate distribution $F_Y$, let $F_{12} = C(F_Y, F_Y; \delta)$ be the distribution of $(Y_{t-1}, Y_t)$, where $C$ is a bivariate copula family with dependence parameter $\delta$. Then the transition probability $\Pr(Y_t = y_t \mid Y_{t-1} = y_{t-1})$ is
$$f_{2|1}(y_t \mid y_{t-1}) = \frac{F_{12}(y_{t-1}, y_t) - F_{12}(y_{t-1}^-, y_t) - F_{12}(y_{t-1}, y_t^-) + F_{12}(y_{t-1}^-, y_t^-)}{f_Y(y_{t-1})},$$
where $y_i^-$ is shorthand for $y_i - 1$ for $i = t-1$ and $t$.
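As an illustration (a sketch assuming SciPy is available; the NB margin, the bivariate Frank copula of Example 2 below, and the parameter values are demonstration choices), the transition pmf can be computed directly from this rectangle formula:

```python
import numpy as np
from scipy.stats import nbinom

def frank_cdf(u1, u2, delta):
    """Bivariate Frank copula cdf, delta != 0."""
    return -np.log1p(-(1 - np.exp(-delta * u1)) * (1 - np.exp(-delta * u2))
                     / (1 - np.exp(-delta))) / delta

def trans_pmf(y_t, y_prev, n, p, delta):
    """f_{2|1}(y_t | y_prev) for a stationary NB(n, p) margin linked by a Frank copula."""
    F = lambda y: nbinom.cdf(y, n, p) if y >= 0 else 0.0   # F_Y(y), with F_Y(-1) = 0
    rect = (frank_cdf(F(y_prev), F(y_t), delta)
            - frank_cdf(F(y_prev - 1), F(y_t), delta)
            - frank_cdf(F(y_prev), F(y_t - 1), delta)
            + frank_cdf(F(y_prev - 1), F(y_t - 1), delta))
    return rect / nbinom.pmf(y_prev, n, p)

# sanity check: transition probabilities sum to ~1
print(sum(trans_pmf(y, 3, n=4.0, p=0.5, delta=5.0) for y in range(200)))
```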
Below are a few examples of one-parameter copula models that include independence, perfect positive dependence, and possibly an extension to negative dependence. Different tail behavior of the copula leads to different asymptotic tail behavior of the conditional expectation and variance, but the conditional expectation is roughly linear in the middle. If a copula $C$ is the distribution of a bivariate uniform vector $(U_1, U_2)$, then the distribution of the reflection $(1-U_1, 1-U_2)$ is $\widehat{C}(u_1, u_2) := u_1 + u_2 - 1 + C(1-u_1, 1-u_2)$. The copula $C$ is reflection symmetric if $C = \widehat{C}$. Otherwise, for a reflection asymmetric bivariate copula $C$, one can also consider $\widehat{C}$ as a model with the opposite direction of tail asymmetry.
The bivariate Gaussian copula can be considered as a baseline model from which other copula families deviate in tail behavior. Based on Jeffreys' and Kullback–Leibler divergences for $(Y_1, Y_2)$ that are NB or GP, the bivariate distributions $F_{12}$ arising from the binomial thinning operator, from the beta-binomial or quasi-binomial operators, and from the Gaussian copula are very similar, with typically a sample size of over 500 needed to distinguish the models when the (lag 1) correlation is moderate (0.4–0.7).
Below is a summary of bivariate copula families with different tail properties and hence different tail behavior of the conditional mean $E(Y_t \mid Y_{t-1} = y)$ and variance $\mathrm{Var}(Y_t \mid Y_{t-1} = y)$ as $y \to \infty$, when $F_Y = F_{\rm NB}$ or $F_{\rm GP}$ (minimal cdf sketches of these families follow the list).
1. Bivariate Gaussian: reflection symmetric; with $\Phi$, $\Phi_2$ being the univariate and bivariate Gaussian cdfs with mean 0 and variance 1, $C(u_1, u_2; \rho) = \Phi_2(\Phi^{-1}(u_1), \Phi^{-1}(u_2); \rho)$, $-1 < \rho < 1$. The conditional mean is asymptotically slightly sublinear and the conditional variance is asymptotically close to linear.
2. Bivariate Frank: reflection symmetric, $C(u_1, u_2; \delta) = -\delta^{-1}\log[1 - (1-e^{-\delta u_1})(1-e^{-\delta u_2})/(1-e^{-\delta})]$, $-\infty < \delta < \infty$. Because the upper tail behaves like $1 - u_1 - u_2 + C(u_1, u_2) \sim \zeta(1-u_1)(1-u_2)$ for some $\zeta > 0$ as $u_1, u_2 \to 1^-$, the conditional mean and variance are asymptotically flat.
3. Bivariate Gumbel: reflection asymmetric with stronger dependence in the joint upper tail, $C(u_1, u_2; \delta) = \exp\{-[(-\log u_1)^\delta + (-\log u_2)^\delta]^{1/\delta}\}$ for $\delta \ge 1$. The conditional mean is asymptotically linear and the conditional variance is asymptotically sublinear.
4. Reected or survival Gumbel: reection asymmetric with stronger dependence in the
joint lower tail. C(u
1
, u
2
; δ) = u
1
+ u
2
1 + exp{−[( log{1 u
1
})
δ
+ ( log{1
u
2
})
δ
]
1/δ
} for δ 1. The conditional mean and variances are asymptotically
sublinear.
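For reference, minimal cdf sketches of these four copulas are given below (SciPy's bivariate normal cdf is used for the Gaussian case); any of them can be substituted into the transition probability formula given earlier in this section.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def gaussian_cdf(u1, u2, rho):
    """Bivariate Gaussian copula, -1 < rho < 1."""
    z = [norm.ppf(u1), norm.ppf(u2)]
    return multivariate_normal.cdf(z, mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])

def frank_cdf(u1, u2, delta):
    """Bivariate Frank copula, delta != 0 (delta -> 0 gives independence)."""
    return -np.log1p(-(1 - np.exp(-delta * u1)) * (1 - np.exp(-delta * u2))
                     / (1 - np.exp(-delta))) / delta

def gumbel_cdf(u1, u2, delta):
    """Bivariate Gumbel copula, delta >= 1; upper tail dependence."""
    return np.exp(-((-np.log(u1)) ** delta + (-np.log(u2)) ** delta) ** (1 / delta))

def refl_gumbel_cdf(u1, u2, delta):
    """Reflected (survival) Gumbel copula, delta >= 1; lower tail dependence."""
    return u1 + u2 - 1 + gumbel_cdf(1 - u1, 1 - u2, delta)
```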
The Gumbel or reected Gumbel copula can be recommended when there is some tail
asymmetry relative to Gaussian. The Gumbel copula can be recommended when it is
expected that there is some clustering of large values (exceeding a large threshold).
The Frank copula is the simplest copula that is reflection symmetric and can allow negative dependence. However, its bivariate joint upper and lower tails are lighter than those of the Gaussian copula, and this implies that the conditional expectation $E(Y_t \mid Y_{t-1} = y)$ converges to a constant for large $y$. For Gaussian, Gumbel, and reflected Gumbel, $E(Y_t \mid Y_{t-1} = y)$ is asymptotically linear or sublinear for large $y$ for $\{Y_t\}$ with a stationary distribution that is exponentially decreasing (like NB and GP). Some of these results can be proved with the techniques in Hua and Joe (2013).
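A numerical check of these tail statements is sketched below (an illustrative NB(4, 0.5) margin with mean 4, and illustrative copula parameters; the Frank conditional mean should level off while the Gumbel one keeps growing roughly linearly):

```python
import numpy as np
from scipy.stats import nbinom

def frank_cdf(u1, u2, d):
    return -np.log1p(-(1 - np.exp(-d * u1)) * (1 - np.exp(-d * u2)) / (1 - np.exp(-d))) / d

def gumbel_cdf(u1, u2, d):
    return np.exp(-((-np.log(u1)) ** d + (-np.log(u2)) ** d) ** (1 / d))

def cond_mean(y_prev, cop, par, n=4.0, p=0.5, ymax=4000):
    """E(Y_t | Y_{t-1} = y_prev) for a stationary NB(n, p) margin under copula `cop`,
    summing y * f_{2|1}(y | y_prev) over a long grid of y values."""
    ys = np.arange(ymax)
    F = nbinom.cdf(ys, n, p)
    Fm = np.concatenate(([1e-300], F[:-1]))      # F(y - 1); tiny floor avoids log(0)
    u_hi = nbinom.cdf(y_prev, n, p)
    u_lo = max(nbinom.cdf(y_prev - 1, n, p), 1e-300)
    rect = (cop(u_hi, F, par) - cop(u_lo, F, par)
            - cop(u_hi, Fm, par) + cop(u_lo, Fm, par))
    return (ys * rect).sum() / nbinom.pmf(y_prev, n, p)

for y in (10, 20, 40):                            # far in the right tail of the margin
    print(y, cond_mean(y, frank_cdf, 5.0), cond_mean(y, gumbel_cdf, 1.8))
```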
For second-order Markov chains, one just needs a trivariate copula that satisfies the stationarity condition $C_{12} = C_{23}$; a good choice is the trivariate Gaussian copula with lag 1 and lag 2 latent correlations $\rho_1, \rho_2$, respectively. If closed-form copula functions are desired, a class to consider has the form
$$C_{\psi,H}(u_1, u_2, u_3) = \psi\Big(\sum_{j\in\{1,3\}}\Big[-\log H\big(e^{-0.5\,\psi^{-1}(u_j)},\, e^{-0.5\,\psi^{-1}(u_2)}\big) + \tfrac{1}{2}\,\psi^{-1}(u_j)\Big]\Big),$$
where $\psi$ is the Laplace transform of a positive random variable and $H$ is a bivariate permutation-symmetric max-infinitely divisible copula; it has bivariate margins
$$C_{j2} = \psi\Big(-\log H\big(e^{-0.5\,\psi^{-1}(u_j)},\, e^{-0.5\,\psi^{-1}(u_2)}\big) + \tfrac{1}{2}\,\psi^{-1}(u_j) + \tfrac{1}{2}\,\psi^{-1}(u_2)\Big), \qquad j = 1, 3,$$
and $C_{13}(u_1, u_3) = \psi\big(\psi^{-1}(u_1) + \psi^{-1}(u_3)\big)$. This $C_{\psi,H}$ is a suitable copula, with closed-form cdf, for the transition of a stationary time series of Markov order 2 when there is more dependence for measurements at nearer time points. If a model with clustering of large values is desired, then one can take $H$ to be the bivariate Gumbel copula and $\psi$ to be the positive stable Laplace transform; then $C_{\psi,H}$ is a trivariate extreme value copula. Other simple choices, used for the data set in Section 2.6, are the Frank copula for $H$ together with the positive stable or logarithmic series Laplace transform for $\psi$.
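Below is a sketch of $C_{\psi,H}$ with the positive stable Laplace transform $\psi(s) = \exp(-s^{1/\vartheta})$, $\vartheta \ge 1$, and bivariate Gumbel $H$ (the trivariate extreme value case just mentioned; the parameter values are illustrative), together with a check of the stated $C_{13}$ margin:

```python
import numpy as np

def gumbel_cdf(u1, u2, delta):
    """Bivariate Gumbel copula, delta >= 1."""
    return np.exp(-((-np.log(u1)) ** delta + (-np.log(u2)) ** delta) ** (1 / delta))

def psi(s, vt):
    """Positive stable Laplace transform, vt >= 1."""
    return np.exp(-s ** (1.0 / vt))

def psi_inv(t, vt):
    return (-np.log(t)) ** vt

def c_psi_h(u1, u2, u3, vt, delta):
    """Trivariate copula C_{psi,H} with positive stable psi and Gumbel H."""
    s = 0.0
    for uj in (u1, u3):
        s += (-np.log(gumbel_cdf(np.exp(-0.5 * psi_inv(uj, vt)),
                                 np.exp(-0.5 * psi_inv(u2, vt)), delta))
              + 0.5 * psi_inv(uj, vt))
    return psi(s, vt)

# margin check: setting u2 = 1 should recover the Archimedean C_13
u1, u3, vt, d = 0.3, 0.8, 1.5, 2.0
print(c_psi_h(u1, 1.0, u3, vt, d), psi(psi_inv(u1, vt) + psi_inv(u3, vt), vt))
```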
Both the Gaussian copula and $C_{\psi,H}$ can be extended to AR($p$). Other alternatives for copulas for Markov order $p \ge 2$ are based on discrete D-vines (see Panagiotelis et al. 2012).
For copula versions of Gaussian MA($q$) series, analogues of (2.8) can be constructed for dependent U(0,1) sequences, which can then be converted with the inverse probability transform $F_Y^{-1}$. For a $q$-dependent sequence, a $(q+1)$-variate copula $K_{1:(q+1)}$ is needed with margin $K_{1:q}$ being the independence copula. Then $K_{1:(q+1)}$ is the copula of $(\eta_{t-q}, \ldots, \eta_{t-1}, U_t)$, where $\{\eta_t\}$ is a sequence of independent U(0,1) innovation random variables, and
$$U_t = K^{-1}_{q+1\mid 1:q}(\eta_t \mid \eta_{t-q}, \ldots, \eta_{t-1}).$$
Here $K_{q+1\mid 1:q}$ is the conditional distribution of variable $q+1$ given variables $1, \ldots, q$.
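For instance, a 1-dependent U(0,1) sequence ($q = 1$) can be generated by this recursion; the sketch below uses a bivariate Frank copula for $K_{1:2}$ (an illustrative choice, not the chapter's prescription), whose conditional cdf has a closed-form inverse:

```python
import numpy as np

rng = np.random.default_rng(99)

def frank_cond_inv(p, u1, delta):
    """Solve K_{2|1}(u2 | u1) = p for u2, where K is the bivariate Frank copula
    with parameter delta != 0; the conditional cdf inverts in closed form."""
    a1 = 1 - np.exp(-delta * u1)
    b = 1 - np.exp(-delta)
    x = p * b / (np.exp(-delta * u1) + p * a1)
    return -np.log1p(-x) / delta

def one_dependent_uniforms(n, delta):
    """U_t = K^{-1}_{2|1}(eta_t | eta_{t-1}) with iid U(0,1) eta's.
    Each U_t depends only on (eta_{t-1}, eta_t), so {U_t} is 1-dependent."""
    eta = rng.uniform(size=n + 1)
    return frank_cond_inv(eta[1:], eta[:-1], delta)

u = one_dependent_uniforms(100_000, delta=8.0)
# lag 1 correlation is well away from 0; lag 2 is ~0, as 1-dependence requires
print(np.corrcoef(u[:-1], u[1:])[0, 1], np.corrcoef(u[:-2], u[2:])[0, 1])
```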