4
Count Time Series with Observation-Driven
Autoregressive Parameter Dynamics
Dag Tjøstheim
CONTENTS
4.1 Introduction
4.2 Probabilistic Structure
    4.2.1 Random Iteration Approach
    4.2.2 The General Markov Chain Approach
    4.2.3 Coupling Arguments
    4.2.4 Ergodicity
    4.2.5 Weak Dependence
    4.2.6 Markov Theory with φ-Irreducibility
    4.2.7 Perturbation Method
4.3 Statistical Inference
    4.3.1 Asymptotic Estimation Theory without Perturbation
    4.3.2 The Perturbation Approach
4.4 Extensions
    4.4.1 Higher-Order Models
    4.4.2 Vector Models
    4.4.3 Generalizing the Poisson Assumption
    4.4.4 Specification and Goodness-of-Fit Testing
References
4.1 Introduction
A count time series $\{Y_t\}$ is a time series that takes its values in a subset $\mathbb{N}_0$ of the nonnegative integers $\mathbb{N}$. Most often this subset will be all of $\mathbb{N}$, but it can also be the case that, for example, $\mathbb{N}_0 = \{0, 1\}$ or $\mathbb{N}_0 = \{0, 1, 2, \ldots, k\}$, for a binary and a binomial time series, respectively.
In this chapter, we will look at count time series with dynamics driven by an autoregressive mechanism. This means that the distribution of $\{Y_t\}$ is modeled by a parametric distribution, for example, a Poisson distribution, whose parameters are assumed to be stochastic processes. The dynamics of the $\{Y_t\}$ process is created through a recursive autoregressive scheme for the parameter process. More precisely, in the first-order case,

$$P(Y_t = n \mid X_t) = p(n, X_t), \qquad (4.1)$$
where $\sum_{n \in \mathbb{N}_0} p(n, X_t) = 1$, $X_t$ is a parameter-driven process given by a possibly vector-valued and possibly nonlinear AR($p$)-type process

$$X_t = f(X_{t-1}, \ldots, X_{t-p}, \varepsilon_{t-1})$$

and $\{\varepsilon_t\}$ is a series of innovations or random shocks driving the process $\{X_t\}$. The parameter process is a genuine nonlinear autoregressive process if $\{\varepsilon_t\}$ consists of iid (independent identically distributed) random variables such that $\varepsilon_{t-1}$ is independent of $\mathcal{F}^X_{t-1}$, the $\sigma$-algebra generated by $\{X_s,\ s \le t-1\}$. In the terminology of Cox (1981), the process $\{Y_t\}$ is then a parameter-driven process; see, for example, Davis and Dunsmuir (2015; Chapter 6 in this volume). However, this is not the case for the processes we will mainly be concerned with. We will rather look at the class of processes obtained by replacing $\{\varepsilon_t\}$ by $\{Y_t\}$, so that

$$X_t = f(X_{t-1}, \ldots, X_{t-p}, Y_{t-1}) \qquad (4.2)$$

and the more general

$$X_t = f(X_{t-1}, \ldots, X_{t-p}, Y_{t-1}, \ldots, Y_{t-q}) \qquad (4.3)$$
with appropriate initial conditions. Clearly these are not genuine AR or ARMA processes because of the presence of lagged values of $\{Y_t\}$, which themselves depend on lagged values of $\{X_t\}$. In the terminology of Cox (1981), the resulting $\{Y_t\}$ processes are examples of observation-driven processes. We will concentrate our analysis on (4.1) and (4.2) because this pair yields a Markov structure more or less directly, whereas (4.1) and (4.3) require a redefinition of the state space to obtain a Markov structure. Such models have been widely used in applications recently. For specific applications and many references, the reader is referred to Fokianos (2015; Chapter 1 in this volume). In the current chapter, the emphasis will be on theory.
The main mathematical tool that has been used to handle the theory of these models is Markov chain theory. To see why, it is advantageous to rewrite the model slightly and at the same time make it more precise. To this end, let $\{N_t\}$ be a sequence of nonnegative integer-valued random variables that are independent given $\{X_t\}$ and have the probability distribution function $p(\cdot, X_t)$. If $\{X_t\}$ is nonrandom and equal to a constant, then $\{N_t\}$ is an iid sequence. In the general case we can write

$$Y_t = N_t(X_t).$$
This means that as we move from $t-1$ to $t$, first we obtain a value of $X_t$ from (4.2), again with appropriate initial conditions. Then, given $X_t$, there is a separate and independent random mechanism by which $N_t$, and as a result $Y_t$, is drawn from $p(\cdot, X_t)$. This makes $\{X_t\}$ a $p$th-order Markov chain with respect to the $\sigma$-field $\{\mathcal{F}^X_t\} = \{\sigma(X_s,\ s \le t)\}$, and $\{(X_t, Y_t)\}$ is a Markov chain on $\{\mathcal{F}^{X,Y}_t\} = \{\sigma(X_s,\ s \le t;\ Y_u,\ u \le t)\}$.
Perhaps the simplest example of such a process is a first-order Poisson autoregression with $X_t = \lambda_t$, where $\lambda_t$ is a scalar Poisson intensity parameter, and where

$$Y_t = N_t(\lambda_t). \qquad (4.4)$$
Here $\{N_t(\cdot)\}$ could be looked at as a sequence of independent Poisson processes of unit intensity, and where

$$\lambda_t = d + a\lambda_{t-1} + bY_{t-1} \qquad (4.5)$$
with $a$, $b$, $d$ being nonnegative unknown scalars. This kind of model was treated in Fokianos et al. (2009) and other papers referred to in that paper. To start the recursion, initial values of $\lambda_0$ and $Y_0$ are needed. The process defined by (4.4) and (4.5) is often compared to a GARCH process (see, for example, Francq and Zakoïan 2011),
$$Y_t = h_t^{1/2}\,\varepsilon_t \qquad (4.6)$$
with $h_t$ being the conditional variance of $Y_t$ given its past, and where $h_t$ is given by a recursive equation

$$h_t = d + ah_{t-1} + bY_{t-1}^2. \qquad (4.7)$$
Here of course $h_t$ corresponds to $\lambda_t$, and the series of iid random variables $\{\varepsilon_t\}$ with mean zero and variance 1 corresponds to the series of Poisson processes $\{N_t(\cdot)\}$ of unit intensity in (4.4). There are two problems which make the analysis of (4.4), (4.5) more difficult than the analysis of the GARCH system (4.6), (4.7): (1) $\{\lambda_t\}$ is driven by integer-valued innovations $\{Y_t\}$, whereas $\{\lambda_t\}$ itself, as an intensity parameter, is continuous valued; in the GARCH situation, both $Y_t$ and $h_t$ are usually taken to be continuous valued, although there are exceptions (Francq and Zakoïan 2004). (2) The quite innocent-looking relationship (4.4) in fact represents a complex nonlinear structure compared to the multiplicative structure of (4.6). These two problems will be discussed throughout the paper, as they are at the core of more or less everything that concerns these processes.
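To make the linear recursion (4.4), (4.5) concrete, here is a minimal simulation sketch in Python; the parameter values and sample size are illustrative choices, not taken from the text.

```python
import numpy as np

def simulate_linear_poisson_ar(d, a, b, n, lam0=1.0, y0=0, seed=0):
    """Simulate the first-order linear Poisson autoregression:
    lambda_t = d + a*lambda_{t-1} + b*Y_{t-1}   (4.5)
    Y_t ~ Poisson(lambda_t)                     (4.4)
    """
    rng = np.random.default_rng(seed)
    lam = np.empty(n)
    y = np.empty(n, dtype=int)
    lam_prev, y_prev = lam0, y0
    for t in range(n):
        lam[t] = d + a * lam_prev + b * y_prev  # intensity recursion (4.5)
        y[t] = rng.poisson(lam[t])              # conditionally Poisson draw (4.4)
        lam_prev, y_prev = lam[t], y[t]
    return lam, y

# Illustrative parameters; a + b < 1 keeps the mean recursion stable.
lam, y = simulate_linear_poisson_ar(d=0.5, a=0.3, b=0.4, n=1000)
```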
The analogy with the GARCH structure has led to the acronym INGARCH (integer generalized autoregressive conditional heteroscedastic) for these processes, but this is an acronym that I find unfortunate, since (4.4) is not in general a variance property. It may be difficult to change the terminology at the present point in time. I would prefer INGAR (integer generalized autoregressive processes). In an early version of the model, it was called Bin(1) by Rydberg and Shephard (2001).
The Poisson distribution is of course just one example of a distribution $p(\cdot, X_t)$. Other examples are a binary probability, where the success probability process $\{p_t\}$ would serve as the parameter process (Wang and Li 2011), or a binomial or negative binomial distribution. The case of the negative binomial has been treated by Christou and Fokianos (2014), Davis and Wu (2009), and Davis and Liu (2014), and we will return to processes governed by this distribution later.
When it comes to a specification of (4.1) through (4.3), there are two main categories of choices. The first and most obvious one has to do with the choice of the function $f$ in (4.2) and (4.3). One special case is the first-order linear model (4.5). A nonlinear additive model with higher-order lags is the specification

$$f(X_{t-1}, \ldots, X_{t-p}, Y_{t-1}, \ldots, Y_{t-q}) = \sum_{i=1}^{p} g_i(X_{t-i}) + \sum_{i=1}^{q} h_i(Y_{t-i}) \qquad (4.8)$$
for some (usually nonnegative) functions $\{g_i\}$ and $\{h_i\}$. So far, I do not know of any systematic attempts to analyse models such as (4.8), except in the case where the $g_i$'s and $h_i$'s are linear functions (a special case written out below) and in some special nonlinear models treated by Davis and Liu (2014). Clearly there are many other possibilities, and in the course of this paper we will put various restrictions on $f$.
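For orientation, the linear special case of (4.8), with $g_i(x) = a_i x + d_i$ and $h_i(y) = b_i y$, is the higher-order analogue of (4.5); this display is my addition, spelling out a standard specialization:

$$X_t = d + \sum_{i=1}^{p} a_i X_{t-i} + \sum_{i=1}^{q} b_i Y_{t-i}, \qquad d = \sum_{i=1}^{p} d_i,$$

with $d$, $a_i$, $b_i$ nonnegative in the intensity parametrization.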
The other category of specification has to do with the choice of the distribution $p$ in (4.1) and the choice of parametrization of this distribution. The parametrization is not unique and may be tied to the characterization of the exponential family of distributions, where there are canonical choices of parameters. For example, instead of the intensity $\lambda_t$ as the parameter, one could choose the canonical parameter $\nu_t = \ln \lambda_t$ in the Poisson case. For a binary parameter process with probability parameter $p_t$, one could choose the parameter $\alpha_t = \ln\{p_t/(1 - p_t)\}$. One obvious advantage of using $\nu_t$ instead of $\lambda_t$ in the Poisson linear case is that in an equation

$$\nu_t = d + a\nu_{t-1} + b \ln(Y_{t-1} + 1)$$

corresponding to (4.5), the parameters $d$, $a$, $b$ are no longer required to be nonnegative, since $\nu_t$ itself can take negative values, and in a sense it is easier to incorporate explanatory variables. We have treated such processes in Fokianos and Tjøstheim (2011).
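A minimal Python sketch of this log-linear recursion, with illustrative coefficients of my choosing; note that $b$ may be negative here, unlike in (4.5).

```python
import numpy as np

def simulate_loglinear_poisson_ar(d, a, b, n, nu0=0.0, y0=0, seed=0):
    """Simulate the log-linear Poisson autoregression:
    nu_t = d + a*nu_{t-1} + b*ln(Y_{t-1} + 1),  Y_t ~ Poisson(exp(nu_t)).
    """
    rng = np.random.default_rng(seed)
    nu, y_prev = nu0, y0
    y = np.empty(n, dtype=int)
    for t in range(n):
        nu = d + a * nu + b * np.log(y_prev + 1)  # canonical parameter nu_t = ln(lambda_t)
        y[t] = rng.poisson(np.exp(nu))            # intensity recovered as exp(nu_t)
        y_prev = y[t]
    return y

# No sign constraints on d, a, b are needed in this parametrization.
y = simulate_loglinear_poisson_ar(d=0.2, a=0.5, b=-0.3, n=1000)
```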
We will be concerned both with the probabilistic structure of the system (4.1), (4.2) and with the asymptotic theory of inference for the parameters characterizing the autoregressive process $\{X_t\}$. Examples of parameters that have to be estimated are $a$, $b$, $d$ in the linear case (4.5).
Somewhat different techniques have been used to characterize the probabilistic structure, but common to most of them is Markov chain theory. This is of course an integral part of the theory of (nonlinear) AR processes (see Tjøstheim 1990), but it is made more difficult and nonstandard in the present case by the incompatibility of the values of $\{X_t\}$ or $\{\lambda_t\}$ as compared to $\{Y_t\}$.
One technique for obtaining asymptotic results for the parameter estimates uses standard Markov chain theory by perturbing the original recursive relationship (4.2) or (4.5) with a continuously distributed perturbation $\varepsilon_t$, so that in the linear case one obtains the perturbed equation

$$\lambda_t = d + a\lambda_{t-1} + bY_{t-1} + \varepsilon_t \qquad (4.9)$$

and then letting this perturbation tend to zero. This is perhaps not as direct an approach as one could wish for (cf. Doukhan 2012 for a critical view), but it leads relatively efficiently to results (Fokianos et al. 2009). A disadvantage of this approach is that one is concerned not so much with the probabilistic structure of the processes $(\{X_t\}, \{Y_t\})$ themselves as with that of the perturbed versions. To look at the probabilistic structure of (4.1), (4.2), and more specifically (4.4), (4.5), there are again different approaches. We will highlight all of this as we proceed.
As mentioned, we will cover both the probabilistic structure and the theory of inference, but the emphasis will be on the former, because there are recent review papers, Fokianos (2012), Tjøstheim (2012), and Fokianos (2015; Chapter 1 in this volume), with a focus on statistical inference and applications. We will start with a discussion of the existence of a stationary, that is, invariant, probability measure for $(\{X_t\}, \{Y_t\})$. This problem is fundamental for most of what follows, such as ergodicity, irreducibility, and recurrence, and hence for statistical inference. These topics are treated in Section 4.2. We draw the connection to
consistency and asymptotic theory of parameter estimates in Section 4.3 and mention some
extensions in Section 4.4.
4.2 Probabilistic Structure
4.2.1 Random Iteration Approach
The recursive system (4.1), (4.2) can be looked at as a random iteration scheme. General random iteration schemes have been studied, among others, by Diaconis and Freedman (1999). They look at an iterative scheme which in our notation can most conveniently be written as

$$X_0 = x, \qquad X_1 = f_{N_0}(x),$$

or generally as

$$X_{t+1} = f_{N_t}(X_t). \qquad (4.10)$$
In the context of our system (4.1), (4.2), $N_0, N_1, \ldots$ can be thought of as iterative and independent drawings from the distribution $p$, or in the linear Poisson case as independent drawings from the Poisson processes of unit intensity; that is, from the Poisson distribution with intensity parameter 1. The $\{X_t\}$ process of (4.10) can be directly identified with the $\{X_t\}$ process of (4.2) or the $\lambda_t$ of (4.5). The $\{Y_t\}$ process of (4.1) and of (4.4) is implicitly a part of (4.10) through the drawings from the distribution function $p$.
In the setup of Diaconis and Freedman (1999), $\{X_t\}$ has as its state space $S$, a complete metric space with a metric $\rho$. In the bulk of their paper, and in particular in their Theorems 1.1 and 5.1, $p$ is not allowed to depend on $x \in S$, but they state (p. 49) that "Theorem 1.1 can be extended to cover $p$ that depends on $x$, but further conditions are needed." In our setup, most of the time $\{X_t\}$ (or $\{\lambda_t\}$) would have $\mathbb{R}^1_+$ as its state space. Note that in order to use the results of Diaconis and Freedman, the random mechanism $N_t$ should not depend on $X_t$. For the Poisson setup in (4.4), (4.5), this is obtained by letting $N_t$ be the realizations of Poisson processes of unit intensity. In, for example, Davis and Liu (2014), it is obtained by setting $X_t = E(Y_t \mid \mathcal{F}_{t-1})$ with $\mathcal{F}_{t-1} = \sigma(X_0, Y_0, \ldots, Y_{t-1})$, and by considering the inverse of the cumulative distribution function of an exponential family distribution of $Y$.
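To illustrate the device of making the random mechanism independent of $X_t$, here is a sketch in which $Y_t$ is produced by feeding iid uniforms through the inverse Poisson cdf, so that the same innovation sequence drives the chain whatever the current intensity. This illustrates the general idea only; it is not claimed to be the exact construction of Davis and Liu (2014).

```python
import numpy as np
from scipy.stats import poisson

def simulate_via_inverse_cdf(d, a, b, n, lam0=1.0, y0=0, seed=0):
    """Drive the linear Poisson autoregression (4.5) with iid uniforms:
    Y_t = F^{-1}(U_t; lambda_t), F the Poisson(lambda_t) cdf,
    so that the innovations U_t do not depend on lambda_t."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)            # iid innovations, independent of the chain
    lam, y_prev = lam0, y0
    y = np.empty(n, dtype=int)
    for t in range(n):
        lam = d + a * lam + b * y_prev  # intensity recursion (4.5)
        y[t] = poisson.ppf(u[t], lam)   # inverse-cdf draw: Y_t ~ Poisson(lam)
        y_prev = y[t]
    return y

y = simulate_via_inverse_cdf(d=0.5, a=0.3, b=0.4, n=1000)
```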
Theorems 1.1 and 5.1 of Diaconis and Freedman (1999) both give sufficient conditions for the existence of a unique stationary measure for the Markov chain $\{X_t\}$. This in turn is an essential condition for establishing limit results and a theory of inference for parameter estimates. The conditions of their Theorem 1.1 are somewhat more restrictive than those of Theorem 5.1, but they are easier to formulate and understand intuitively:
First, the functions $f_N(\cdot)$ are supposed to be Lipschitz such that

$$\rho(f_N(x), f_N(y)) \le C_N\,\rho(x, y) \qquad (4.11)$$
for some $C_N$ and all $x$ and $y$ in $S$. In fact, $f$ is assumed to be contracting on average, since it is assumed that $\int \ln C_N\, p(dN) < 0$ (which is a sum in our case, since $p$ is discrete). This makes $C_N < 1$ for a typical $N$. The statement in (4.11) is the statement of the stationarity condition
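To indicate how the contraction-on-average idea plays out for the linear model (4.5), here is a related mean-contraction computation (my addition for orientation; it is an $L^1$ analogue of (4.11) rather than the pathwise Lipschitz bound itself). With $f_N(\lambda) = d + a\lambda + bN(\lambda)$, $N$ a unit-intensity Poisson process, and $\lambda \ge \lambda'$,

$$E\,\big|f_N(\lambda) - f_N(\lambda')\big| \le a(\lambda - \lambda') + b\,E\big|N(\lambda) - N(\lambda')\big| = (a + b)(\lambda - \lambda'),$$

since $N(\lambda) - N(\lambda') \sim \mathrm{Poisson}(\lambda - \lambda')$ has mean $\lambda - \lambda'$. The chain thus contracts in mean when $a + b < 1$, the familiar stationarity condition for (4.5).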