7
Estimating Equation Approaches for Integer-Valued
Time Series Models
Aerambamoorthy Thavaneswaran and Nalini Ravishanker
CONTENTS
7.1 Introduction
7.2 A Review of Estimating Functions (EFs)
7.3 Models and Moment Properties for Count Time Series
7.3.1 Models for Nominally Dispersed Counts
7.3.2 Models for Counts with Excess Zeros
7.3.3 Models in the GAS Framework
7.4 Parametric Inference via EFs
7.4.1 Linear EFs
7.4.2 Combined EFs
7.5 Hypothesis Testing and Model Choice
7.6 Discussion and Summary
References
7.1 Introduction
There is considerable current interest in the study of integer-valued time series models,
and of time series of counts in particular. Applications abound in biometrics, ecology,
economics, engineering, nance, public health, etc. Given the increase in stochastic com-
plexity and data sizes, there is a need for developing fast and optimal approaches for model
inference and prediction. Several observation-driven and parameter-driven (Cox, 1981)
modeling frameworks for count time series have been discussed over the past few decades.
Further, although there is a large literature for count time series without zero-inflation,
including both observation-driven and parameter-driven models, very few papers have
been published for modeling time series with excess zeros.
In parameter-driven models, temporal association is modeled indirectly by specifying
the parameters in the conditional distribution of the count random variable to be a function
of a correlated latent stochastic process (West and Harrison, 1997). In observation-driven
models, temporal association is modeled directly via lagged values of the count variable,
adopting strategies such as binomial thinning to preserve the integer nature of the data
(Al-Osh and Alzaid, 1987; McKenzie, 2003). Davis et al. (2003), Jung and Tremayne (2006),
and Neal and Subba Rao (2007), among others, have discussed estimation and inference
for these models. Heinen (2003) and Ghahramani and Thavaneswaran (2009b) described
autoregressive conditional Poisson (ACP) models. Ferland et al. (2006) and Zhu (2011,
2012a,b) dened classes of integer-valued time series models following different con-
ditional distributions, which they called INGARCH models, and studied the rst two
process moments. Although these are called INGARCH models, only the conditional
mean of the count variable is modeled, and not its conditional variance. In a recent paper,
Creal et al. (2013) described generalized autoregressive score (GAS) models to study time-
varying parameters in an observation-driven modeling framework, while MacDonald
and Zucchini (2015; Chapter 12 in this volume) discussed a hidden Markov modeling
framework.
Estimating functions (EFs) have a long history in statistical inference. For instance,
Fisher (1924) showed that maximum likelihood and minimum chi-squared methods are
asymptotically equivalent by comparing the rst order conditions of the two estimation
procedures, that is, by analyzing properties of estimators by focusing on the correspond-
ing EFs rather than on the objective functions or estimators themselves. Godambe (1960)
and Durbin (1960) gave a fundamental optimality result for EFs for the scalar parameter
case. Following Godambe (1985), who rst studied inference based on the EF approach for
discrete-time stochastic processes, Thavaneswaran and Abraham (1988) described estima-
tion for nonlinear time series models using linear EFs. Naik-Nimbalkar and Rajarshi (1995)
and Thavaneswaran and Heyde (1999) studied problems in ltering and prediction using
linear EFs in the Bayesian context. Merkouris (2007), Ghahramani and Thavaneswaran
(2009a, 2012), and Thavaneswaran et al. (2015), among others, studied estimation for time
series via the combined EF approach. Bera et al. (2006) gave an excellent survey on the
historical development of this topic.
Except for a few papers, such as Dean (1991), who discussed estimating equations for mixed
Poisson models given independent observations, application of the EF approach to count
time series is still largely unexplored. In the following sections, we extend this approach
for count time series models. For some recently proposed integer-valued time series models
(such as the Poisson, generalized Poisson (GP), zero-inflated Poisson, or negative
binomial models), the conditional mean and variance are functions of the same param-
eter. This motivates considering more informative quadratic EFs for joint estimation of
the conditional mean and variance parameters, rather than only using linear EFs. It is
also possible to derive closed form expressions for the information gain (Thavaneswaran
et al., 2015).
In this chapter, we describe a framework for optimal estimation of parameters in
integer-valued time series models via martingale EFs and illustrate the approach for some
interesting count time series models. The EF approach only relies on a specification of the
first few moments of the random variable at each time conditional on its history, and does
not require specification of the form of the conditional probability distribution. We start
with a brief review of the general theory of EFs in Section 7.2. In Section 7.3, we describe the
conditional moment properties for some recently proposed classes of generalized integer-
valued models, such as those discussed in Ferland et al. (2006). Specifically, we derive the
first four conditional moments, which are typically required for carrying out inference
on model parameters using the theory of combined martingale EFs (Liang et al., 2011).
Section 7.4 describes the optimal EFs that enable joint parameter estimation for such mod-
els. We also derive fast, recursive, on-line estimation techniques for parameters of interest
and provide examples. In Section 7.5, we describe how hypothesis testing based on opti-
mal estimation facilitates model choice. Section 7.6 concludes with a summary and a brief
discussion of parameter-driven doubly stochastic models for count time series.
7.2 A Review of Estimating Functions (EFs)
Godambe (1985) rst described an EF approach for stochastic process inference. Suppose
that {y
t
, t = 1, ..., n} is a realization of a discrete time stochastic process, and suppose
its conditional distribution depends on a vector parameter θ belonging to an open subset
of the p-dimensional Euclidean space, with p n.Let (, F, P
θ
) denote the under-
lying probability space, and let F
t
be the σ-eld generated by {y
1
, ..., y
t
, t 1}.Let
m
t
= m
t
(y
1
, ..., y
t
, θ),1 t n, be specied q-dimensional martingale difference vec-
tors. Consider the class M of zero-mean, square integrable p-dimensional martingale
EFs, viz.,
n
M =
g
n
(θ) : g
n
(θ) = a
t1
(θ)m
t
, (7.1)
t=1
where a_{t−1}(θ) are p × q matrices that are functions of θ and y_1, ..., y_{t−1}, 1 ≤ t ≤ n. It is
further assumed that g_n(θ) are almost surely differentiable with respect to the components
of θ, and are such that, for each n ≥ 1, E(∂g_n(θ)/∂θ | F_{n−1}) and E(g_n(θ)g_n(θ)′ | F_{n−1}) are
nonsingular for all θ ∈ Θ, where all expectations are taken with respect to P_θ. An estimator of
θ is obtained by solving the estimating equation g_n(θ) = 0. Furthermore, the p × p matrix
E(g_n(θ)g_n(θ)′ | F_{n−1}) is assumed to be positive definite for all θ ∈ Θ. Then, in the class of all
zero-mean and square integrable martingale EFs M, the optimal EF g*_n(θ) that maximizes,
in the partial order of nonnegative definite matrices, the information
\[
\mathcal{I}_{g_n}(\theta) =
\left[ E\!\left( \frac{\partial g_n(\theta)}{\partial \theta} \,\Big|\, \mathcal{F}_{n-1} \right) \right]'
\left[ E\!\left( g_n(\theta)\, g_n(\theta)' \,\Big|\, \mathcal{F}_{n-1} \right) \right]^{-1}
\left[ E\!\left( \frac{\partial g_n(\theta)}{\partial \theta} \,\Big|\, \mathcal{F}_{n-1} \right) \right],
\]
is given by
\[
g_n^{*}(\theta) = \sum_{t=1}^{n} a_{t-1}^{*}(\theta)\, m_t
= \sum_{t=1}^{n} \left[ E\!\left( \frac{\partial m_t}{\partial \theta} \,\Big|\, \mathcal{F}_{t-1} \right) \right]'
\left[ E\!\left( m_t m_t' \,\Big|\, \mathcal{F}_{t-1} \right) \right]^{-1} m_t, \tag{7.2}
\]
and the corresponding optimal information reduces to
\[
\mathcal{I}_{g_n^{*}}(\theta) = E\!\left( g_n^{*}(\theta)\, g_n^{*}(\theta)' \,\big|\, \mathcal{F}_{n-1} \right). \tag{7.3}
\]
The function g*_n(θ) is also called the “quasi-score” and has properties similar to those of a
score function: E(g*_n(θ)) = 0 and E(g*_n(θ)g*_n(θ)′) = −E(∂g*_n(θ)/∂θ′). This is a general result
in that we do not need to assume that the true underlying conditional distribution belongs
to an exponential family of distributions. The maximum correlation between the optimal
EF and the true unknown score justifies the terminology “quasi-score” for g*_n(θ). It is useful
to note that the same procedure for derivation of optimal estimating equations may
be used when the time series is stationary or nonstationary. Moreover, the finite sample
properties of the EFs remain the same, although asymptotic properties will differ. In
Chapter 12 of his book, Heyde (1997) discussed general consistency and asymptotic
distributional results.
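For a scalar parameter θ and the single martingale difference m_t = y_t − μ_t(θ), the optimal EF (7.2) reduces (up to sign) to g*_n(θ) = Σ_t (∂μ_t/∂θ)(y_t − μ_t(θ))/σ²_t(θ). The following Python sketch illustrates solving g*_n(θ) = 0 for a hypothetical conditional Poisson model with μ_t = δ + θ y_{t−1} and σ²_t = μ_t; the model, parameter values, and function names are ours, chosen purely for illustration and not taken from the chapter.

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical illustration: conditional Poisson model with
#   mu_t = delta + theta * y_{t-1}   (delta fixed at 1 for simplicity)
# and sigma_t^2 = mu_t, so the optimal linear EF (7.2) is proportional to
#   g_n(theta) = sum_t (y_{t-1} / mu_t) * (y_t - mu_t).

rng = np.random.default_rng(0)

def simulate(theta, delta=1.0, n=500):
    """Simulate a conditional Poisson series with mean delta + theta*y_{t-1}."""
    y = np.zeros(n, dtype=int)
    for t in range(1, n):
        y[t] = rng.poisson(delta + theta * y[t - 1])
    return y

def quasi_score(theta, y, delta=1.0):
    """Optimal linear EF for theta: sum of weighted martingale differences."""
    mu = delta + theta * y[:-1]            # conditional means mu_t
    return np.sum((y[:-1] / mu) * (y[1:] - mu))

y = simulate(theta=0.5)
# Solve g_n(theta) = 0 on a bracketing interval.
theta_hat = brentq(quasi_score, 0.01, 0.95, args=(y,))
print(round(theta_hat, 3))
```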
Consider an integer-valued discrete-time scalar stochastic process {y_t, t = 1, 2, ...} with
conditional mean, variance, skewness, and kurtosis given by
\[
\begin{aligned}
\mu_t(\theta) &= E\left( y_t \mid \mathcal{F}_{t-1} \right), \\
\sigma_t^2(\theta) &= \operatorname{Var}\left( y_t \mid \mathcal{F}_{t-1} \right), \\
\gamma_t(\theta) &= \frac{1}{\sigma_t^3(\theta)}\, E\!\left[ \left( y_t - \mu_t(\theta) \right)^3 \mid \mathcal{F}_{t-1} \right], \quad \text{and} \\
\kappa_t(\theta) &= \frac{1}{\sigma_t^4(\theta)}\, E\!\left[ \left( y_t - \mu_t(\theta) \right)^4 \mid \mathcal{F}_{t-1} \right].
\end{aligned}
\tag{7.4}
\]
To jointly estimate the conditional mean and variance, which are both functions of θ, Liang
et al. (2011) defined optimal combined EFs. We assume that μ_t(θ) and σ²_t(θ) are differentiable
with respect to θ, and that the skewness and kurtosis of the standardized y_t do not
depend on additional parameters beyond θ. For each data/model combination, our estimation
approach for θ requires (1) computation of the first four moments of y_t conditional
on the process history, (2) selection of suitable linear and/or quadratic martingale differences,
(3) construction of optimal combined EFs, and (4) derivation of recursive estimators
of θ when possible. In Section 7.4, we describe optimal estimating equations for θ for some
of the integer-valued models discussed in Section 7.3.
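As one simple instance of step (1), if the conditional distribution of y_t given F_{t−1} is Poisson with mean μ_t, then σ²_t = μ_t, γ_t = μ_t^{−1/2}, and κ_t = 3 + 1/μ_t. The short Python sketch below (an illustration of ours, not code from the chapter) returns these four conditional moments and checks them by simulation.

```python
import numpy as np

def poisson_conditional_moments(mu):
    """Conditional mean, variance, skewness, and kurtosis in (7.4)
    when y_t | F_{t-1} ~ Poisson(mu_t)."""
    return mu, mu, 1.0 / np.sqrt(mu), 3.0 + 1.0 / mu

# Monte Carlo check for a single conditional mean value.
rng = np.random.default_rng(1)
mu = 2.5
y = rng.poisson(mu, size=1_000_000)
z = y - mu
print(poisson_conditional_moments(mu))
print(y.mean(), y.var(), (z**3).mean() / y.std()**3, (z**4).mean() / y.var()**2)
```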
7.3 Models and Moment Properties for Count Time Series
Several models have been discussed in the literature for count time series, where param-
eter estimation using maximum likelihood or Bayesian approaches have been described.
For the estimating equations framework described in this chapter, we start from the conditional
moments of the process {y_t} given the history F_{t−1}. The conditional moments are
assumed to be functions of an unknown parameter vector θ and form the basis for con-
structing the optimal estimating equation. For simplicity, we suppress θ in the notation for
the conditional moments and other derived quantities in the following examples. Consider
the discrete-time model for μ_t with P + Q + 1 parameters defined by
\[
\mu_t = \delta + \sum_{i=1}^{P} \alpha_i\, y_{t-i} + \sum_{j=1}^{Q} \beta_j\, \mu_{t-j}, \tag{7.5}
\]
where δ > 0, α_i ≥ 0 for i = 1, ..., P, and β_j ≥ 0 for j = 1, ..., Q. Let θ = (δ, α′, β′)′, where
α = (α_1, ..., α_P)′ and β = (β_1, ..., β_Q)′. We assume that the conditional variance σ²_t as well
as μ_t depend on θ, and that the conditional skewness γ_t and conditional kurtosis κ_t are
available and do not depend on any additional parameters. The higher order conditional
moment properties for the models described in Sections 7.3.1 and 7.3.2, especially for the
zero-inated case, are obtained using Mathematica. Section 7.3.3 proposes a model in the
framework of the GAS models of Creal et al. (2013).
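To make the recursion (7.5) concrete, the Python sketch below simulates a count series whose conditional mean follows (7.5) with P = Q = 1, pairing it with a Poisson conditional distribution; the distributional choice, parameter values, and function name are ours and are used only for illustration.

```python
import numpy as np

def simulate_ingarch11(delta, alpha, beta, n=1000, seed=42):
    """Simulate counts whose conditional mean follows (7.5) with P = Q = 1,
    using a Poisson conditional distribution as one illustrative choice."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n, dtype=int)
    mu = np.zeros(n)
    mu[0] = delta / (1.0 - alpha - beta)   # start at the stationary mean (needs alpha + beta < 1)
    y[0] = rng.poisson(mu[0])
    for t in range(1, n):
        mu[t] = delta + alpha * y[t - 1] + beta * mu[t - 1]
        y[t] = rng.poisson(mu[t])
    return y, mu

y, mu = simulate_ingarch11(delta=1.0, alpha=0.3, beta=0.4)
# Compare the sample mean with the marginal mean delta / (1 - alpha - beta).
print(y.mean(), 1.0 / (1.0 - 0.3 - 0.4))
```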
Equation (7.5) posits an ARMA model for {y_t}. This ARMA representation is useful for
obtaining unconditional moments such as skewness and kurtosis under the stationarity
assumption and is often useful in model identification in data analysis. We consider the
martingale difference m_t = y_t − μ_t, with conditional mean 0 and conditional variance σ²_t.
Then (7.5) can be written as
\[
y_t - m_t = \delta + \sum_{i=1}^{P} \alpha_i\, y_{t-i} + \sum_{j=1}^{Q} \beta_j \left( y_{t-j} - m_{t-j} \right).
\]
Rearranging terms and simplifying, we can write
\[
\left( 1 - \sum_{i=1}^{\max(P,Q)} (\alpha_i + \beta_i)\, B^i \right) y_t
= \delta + \left( 1 - \sum_{j=1}^{Q} \beta_j\, B^j \right) m_t,
\quad \text{or} \quad
\phi(B)\, y_t = \delta + \beta(B)\, m_t,
\]
where B denotes the backshift operator. That is, (7.5) can be written as an ARMA model for
{y_t} with φ(B) = 1 − Σ_{i=1}^{max(P,Q)} φ_i B^i, φ_i = α_i + β_i, β(B) = 1 − Σ_{i=1}^{Q} β_i B^i, and
ψ(B)φ(B) = β(B) with ψ(B) = 1 + Σ_{i=1}^{∞} ψ_i B^i. Similar to the continuous-valued case (Gourieroux, 1997),
this model has the same second-order properties as an INARMA(max(P, Q), Q) model.
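For instance, when P = Q = 1 (a case worked out here only to fix ideas; it is not spelled out in the text), the rearrangement above gives an ARMA(1, 1)-type representation:
\[
y_t - (\alpha_1 + \beta_1)\, y_{t-1} = \delta + m_t - \beta_1\, m_{t-1},
\qquad
\psi(B) = \frac{1 - \beta_1 B}{1 - (\alpha_1 + \beta_1) B},
\quad \text{so} \quad
\psi_0 = 1, \;\; \psi_j = \alpha_1 (\alpha_1 + \beta_1)^{\,j-1}, \; j \ge 1.
\]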
When all solutions to φ(z) = 0 lie outside the unit circle, we may write the moving average
representation of the causal process as y_t = μ + ψ(B)m_t, where ψ(B) = β(B)/φ(B)
and μ = δ/(1 − φ_1 − ... − φ_{max(P,Q)}) is the marginal mean of y_t. The lag k autocovariance
and autocorrelation of the process are, respectively,
\[
\gamma_k^{(y)} = E(\sigma_t^2) \sum_{j=0}^{\infty} \psi_j \psi_{j+k}
\quad \text{and} \quad
\rho_k^{(y)} = \frac{\gamma_k^{(y)}}{\gamma_0^{(y)}} = \frac{\sum_{j=0}^{\infty} \psi_j \psi_{j+k}}{\sum_{j=0}^{\infty} \psi_j^2},
\]
where E(σ_t²) = Var(m_t) is the unconditional variance of the martingale difference {m_t}.
Note that the temporal correlation ρ_k^{(y)} depends only on the model parameters in (7.5) and
not on the conditional distribution of the observed process {y_t}. Also, the kurtosis of {y_t} is
given by
\[
K^{(y)} = 3 + \left( K^{(m)} - 3 \right) \frac{\sum_{j=0}^{\infty} \psi_j^4}{\left( \sum_{j=0}^{\infty} \psi_j^2 \right)^2}, \tag{7.6}
\]
where K^{(m)} = E(m_t⁴)/[E(m_t²)]². These results follow directly from properties of stationary
ARMA processes and often provide guidance in model order choice. By substituting suitable
values of ψ_j, we can derive the kurtosis for the integer-valued processes discussed in
the following sections.
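The ψ_j weights needed in these expressions are available in closed form for the P = Q = 1 case shown above, and the autocorrelations ρ_k^{(y)} and the kurtosis (7.6) then follow by truncating the infinite sums. The Python sketch below is ours; the value used for K^{(m)} is an arbitrary placeholder rather than one implied by any particular conditional distribution.

```python
import numpy as np

def psi_weights(alpha1, beta1, n_terms=200):
    """MA(infinity) weights for (7.5) with P = Q = 1:
    psi_0 = 1 and psi_j = alpha1 * (alpha1 + beta1)**(j - 1) for j >= 1."""
    phi1 = alpha1 + beta1
    psi = np.empty(n_terms)
    psi[0] = 1.0
    psi[1:] = alpha1 * phi1 ** np.arange(n_terms - 1)
    return psi

def acf_and_kurtosis(alpha1, beta1, k_max=5, kurt_m=4.0):
    """Lag-k autocorrelations rho_k^(y) and the kurtosis K^(y) in (7.6),
    given the kurtosis K^(m) of the martingale differences
    (kurt_m = 4.0 is a placeholder value for illustration)."""
    psi = psi_weights(alpha1, beta1)
    s2 = np.sum(psi**2)
    rho = [np.sum(psi[:-k] * psi[k:]) / s2 for k in range(1, k_max + 1)]
    kurt_y = 3.0 + (kurt_m - 3.0) * np.sum(psi**4) / s2**2
    return rho, kurt_y

rho, kurt_y = acf_and_kurtosis(alpha1=0.3, beta1=0.4)
print(np.round(rho, 3), round(kurt_y, 3))
```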
7.3.1 Models for Nominally Dispersed Counts
Considerable attention has been paid in the literature to modeling count time series via
observation-driven models (Zeger and Qaqish, 1988; Davis et al., 2003) and parameter-
driven models (Chan and Ledolter, 1995; West and Harrison, 1997). We consider three
examples.