7
Estimating Equation Approaches for Integer-Valued
Time Series Models
Aerambamoorthy Thavaneswaran and Nalini Ravishanker
CONTENTS
7.1 Introduction
7.2 A Review of Estimating Functions (EFs)
7.3 Models and Moment Properties for Count Time Series
7.3.1 Models for Nominally Dispersed Counts
7.3.2 Models for Counts with Excess Zeros
7.3.3 Models in the GAS Framework
7.4 Parametric Inference via EFs
7.4.1 Linear EFs
7.4.2 Combined EFs
7.5 Hypothesis Testing and Model Choice
7.6 Discussion and Summary
References
7.1 Introduction
There is considerable current interest in the study of integer-valued time series models,
and of time series of counts in particular. Applications abound in biometrics, ecology,
economics, engineering, nance, public health, etc. Given the increase in stochastic com-
plexity and data sizes, there is a need for developing fast and optimal approaches for model
inference and prediction. Several observation-driven and parameter-driven (Cox, 1981)
modeling frameworks for count time series have been discussed over the past few decades.
Further, although there is a large literature for count time series without zero-inflation,
including both observation-driven and parameter-driven models, very few papers have
been published for modeling time series with excess zeros.
In parameter-driven models, temporal association is modeled indirectly by specifying
the parameters in the conditional distribution of the count random variable to be a function
of a correlated latent stochastic process (West and Harrison, 1997). In observation-driven
models, temporal association is modeled directly via lagged values of the count variable,
adopting strategies such as binomial thinning to preserve the integer nature of the data
(Al-Osh and Alzaid, 1987; McKenzie, 2003). Davis et al. (2003), Jung and Tremayne (2006),
and Neal and Subba Rao (2007), among others, have discussed estimation and inference
for these models. Heinen (2003) and Ghahramani and Thavaneswaran (2009b) described
autoregressive conditional Poisson (ACP) models. Ferland et al. (2006) and Zhu (2011,
2012a,b) dened classes of integer-valued time series models following different con-
ditional distributions, which they called INGARCH models, and studied the rst two
process moments. Although these are called INGARCH models, only the conditional
mean of the count variable is modeled, and not its conditional variance. In a recent paper,
Creal et al. (2013) described generalized autoregressive score (GAS) models to study time-
varying parameters in an observation-driven modeling framework, while MacDonald
and Zucchini (2015; Chapter 12 in this volume) discussed a hidden Markov modeling
framework.
Estimating functions (EFs) have a long history in statistical inference. For instance,
Fisher (1924) showed that maximum likelihood and minimum chi-squared methods are
asymptotically equivalent by comparing the rst order conditions of the two estimation
procedures, that is, by analyzing properties of estimators by focusing on the correspond-
ing EFs rather than on the objective functions or estimators themselves. Godambe (1960)
and Durbin (1960) gave a fundamental optimality result for EFs for the scalar parameter
case. Following Godambe (1985), who rst studied inference based on the EF approach for
discrete-time stochastic processes, Thavaneswaran and Abraham (1988) described estima-
tion for nonlinear time series models using linear EFs. Naik-Nimbalkar and Rajarshi (1995)
and Thavaneswaran and Heyde (1999) studied problems in ltering and prediction using
linear EFs in the Bayesian context. Merkouris (2007), Ghahramani and Thavaneswaran
(2009a, 2012), and Thavaneswaran et al. (2015), among others, studied estimation for time
series via the combined EF approach. Bera et al. (2006) gave an excellent survey on the
historical development of this topic.
Except for a few papers, such as Dean (1991), who discussed estimating equations for mixed
Poisson models given independent observations, application of the EF approach to count
time series is still largely unexplored. In the following sections, we extend this approach
for count time series models. For some recently proposed integer-valued time series models
(such as the Poisson, generalized Poisson (GP), zero-inflated Poisson, or negative
binomial models), the conditional mean and variance are functions of the same param-
eter. This motivates considering more informative quadratic EFs for joint estimation of
the conditional mean and variance parameters, rather than only using linear EFs. It is
also possible to derive closed form expressions for the information gain (Thavaneswaran
et al., 2015).
In this chapter, we describe a framework for optimal estimation of parameters in
integer-valued time series models via martingale EFs and illustrate the approach for some
interesting count time series models. The EF approach only relies on a specification of the
first few moments of the random variable at each time conditional on its history, and does
not require specification of the form of the conditional probability distribution. We start
with a brief review of the general theory of EFs in Section 7.2. In Section 7.3, we describe the
conditional moment properties for some recently proposed classes of generalized integer-
valued models, such as those discussed in Ferland et al. (2006). Specifically, we derive the
first four conditional moments, which are typically required for carrying out inference
on model parameters using the theory of combined martingale EFs (Liang et al., 2011).
Section 7.4 describes the optimal EFs that enable joint parameter estimation for such mod-
els. We also derive fast, recursive, on-line estimation techniques for parameters of interest
and provide examples. In Section 7.5, we describe how hypothesis testing based on opti-
mal estimation facilitates model choice. Section 7.6 concludes with a summary and a brief
discussion of parameter-driven doubly stochastic models for count time series.
7.2 A Review of Estimating Functions (EFs)
Godambe (1985) rst described an EF approach for stochastic process inference. Suppose
that {y
t
, t = 1, ..., n} is a realization of a discrete time stochastic process, and suppose
its conditional distribution depends on a vector parameter θ belonging to an open subset
of the p-dimensional Euclidean space, with p n.Let (, F, P
θ
) denote the under-
lying probability space, and let F
t
be the σ-eld generated by {y
1
, ..., y
t
, t 1}.Let
m
t
= m
t
(y
1
, ..., y
t
, θ),1 t n, be specied q-dimensional martingale difference vec-
tors. Consider the class M of zero-mean, square integrable p-dimensional martingale
EFs, viz.,
n
M =
g
n
(θ) : g
n
(θ) = a
t1
(θ)m
t
, (7.1)
t=1
where a_{t−1}(θ) are p × q matrices that are functions of θ and y_1, ..., y_{t−1}, 1 ≤ t ≤ n. It is
further assumed that g_n(θ) are almost surely differentiable with respect to the components
of θ, and are such that, for each n ≥ 1, E(∂g_n(θ)/∂θ | F_{n−1}) and E(g_n(θ)g_n(θ)′ | F_{n−1}) are
nonsingular for all θ ∈ Θ, where all expectations are taken with respect to P_θ. An estimator of
θ is obtained by solving the estimating equation g_n(θ) = 0. Furthermore, the p × p matrix
E(g_n(θ)g_n(θ)′ | F_{n−1}) is assumed to be positive definite for all θ ∈ Θ. Then, in the class of all
zero-mean and square integrable martingale EFs M, the optimal EF g*_n(θ) that maximizes,
in the partial order of nonnegative definite matrices, the information
\[
\mathcal{I}_{g_n}(\theta) =
\left[ E\!\left( \frac{\partial g_n(\theta)}{\partial \theta} \,\Big|\, \mathcal{F}_{n-1} \right) \right]'
\left[ E\!\left( g_n(\theta)\, g_n(\theta)' \,\Big|\, \mathcal{F}_{n-1} \right) \right]^{-1}
\left[ E\!\left( \frac{\partial g_n(\theta)}{\partial \theta} \,\Big|\, \mathcal{F}_{n-1} \right) \right],
\]
is given by
\[
g_n^{*}(\theta) = \sum_{t=1}^{n} a_{t-1}^{*}(\theta)\, m_t
= \sum_{t=1}^{n} \left[ E\!\left( \frac{\partial m_t}{\partial \theta} \,\Big|\, \mathcal{F}_{t-1} \right) \right]'
\left[ E\!\left( m_t m_t' \,\Big|\, \mathcal{F}_{t-1} \right) \right]^{-1} m_t, \tag{7.2}
\]
and the corresponding optimal information reduces to
\[
\mathcal{I}_{g_n^{*}}(\theta) = E\!\left( g_n^{*}(\theta)\, g_n^{*}(\theta)' \,\big|\, \mathcal{F}_{n-1} \right). \tag{7.3}
\]
The function g*_n(θ) is also called the “quasi-score” and has properties similar to those of a
score function: E(g*_n(θ)) = 0 and E(g*_n(θ)g*_n(θ)′) = −E(∂g*_n(θ)/∂θ′). This is a general result
in that we do not need to assume that the true underlying conditional distribution belongs
to an exponential family of distributions. The maximum correlation between the optimal
EF and the true unknown score justifies the terminology “quasi-score” for g*_n(θ). It is useful
to note that the same procedure for derivation of optimal estimating equations may
be used when the time series is stationary or nonstationary. Moreover, the finite sample
properties of the EFs remain the same, although asymptotic properties will differ. In
Chapter 12 of his book, Heyde (1997) discussed general consistency and asymptotic
distributional results.
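For a scalar parameter θ and the single martingale difference m_t = y_t − μ_t(θ), the optimal EF (7.2) reduces (up to sign) to g*_n(θ) = Σ_t (∂μ_t/∂θ)(y_t − μ_t(θ))/σ²_t(θ). The following Python sketch illustrates solving g*_n(θ) = 0 for a hypothetical conditional Poisson model with μ_t = δ + θ y_{t−1} and σ²_t = μ_t; the model, parameter values, and function names are ours, chosen purely for illustration and not taken from the chapter.

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical illustration: conditional Poisson model with
#   mu_t = delta + theta * y_{t-1}   (delta fixed at 1 for simplicity)
# and sigma_t^2 = mu_t, so the optimal linear EF (7.2) is proportional to
#   g_n(theta) = sum_t (y_{t-1} / mu_t) * (y_t - mu_t).

rng = np.random.default_rng(0)

def simulate(theta, delta=1.0, n=500):
    """Simulate a conditional Poisson series with mean delta + theta*y_{t-1}."""
    y = np.zeros(n, dtype=int)
    for t in range(1, n):
        y[t] = rng.poisson(delta + theta * y[t - 1])
    return y

def quasi_score(theta, y, delta=1.0):
    """Optimal linear EF for theta: sum of weighted martingale differences."""
    mu = delta + theta * y[:-1]            # conditional means mu_t
    return np.sum((y[:-1] / mu) * (y[1:] - mu))

y = simulate(theta=0.5)
# Solve g_n(theta) = 0 on a bracketing interval.
theta_hat = brentq(quasi_score, 0.01, 0.95, args=(y,))
print(round(theta_hat, 3))
```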
Consider an integer-valued discrete-time scalar stochastic process {y_t, t = 1, 2, ...} with
conditional mean, variance, skewness, and kurtosis given by
\[
\begin{aligned}
\mu_t(\theta) &= E\left( y_t \mid \mathcal{F}_{t-1} \right), \\
\sigma_t^2(\theta) &= \operatorname{Var}\left( y_t \mid \mathcal{F}_{t-1} \right), \\
\gamma_t(\theta) &= \frac{1}{\sigma_t^3(\theta)}\, E\!\left[ \left( y_t - \mu_t(\theta) \right)^3 \mid \mathcal{F}_{t-1} \right], \quad \text{and} \\
\kappa_t(\theta) &= \frac{1}{\sigma_t^4(\theta)}\, E\!\left[ \left( y_t - \mu_t(\theta) \right)^4 \mid \mathcal{F}_{t-1} \right].
\end{aligned}
\tag{7.4}
\]
To jointly estimate the conditional mean and variance, which are both functions of θ, Liang
et al. (2011) defined optimal combined EFs. We assume that μ_t(θ) and σ²_t(θ) are differentiable
with respect to θ, and that the skewness and kurtosis of the standardized y_t do not
depend on additional parameters beyond θ. For each data/model combination, our estimation
approach for θ requires (1) computation of the first four moments of y_t conditional
on the process history, (2) selection of suitable linear and/or quadratic martingale differences,
(3) construction of optimal combined EFs, and (4) derivation of recursive estimators
of θ when possible. In Section 7.4, we describe optimal estimating equations for θ for some
of the integer-valued models discussed in Section 7.3.
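As one simple instance of step (1), if the conditional distribution of y_t given F_{t−1} is Poisson with mean μ_t, then σ²_t = μ_t, γ_t = μ_t^{−1/2}, and κ_t = 3 + 1/μ_t. The short Python sketch below (an illustration of ours, not code from the chapter) returns these four conditional moments and checks them by simulation.

```python
import numpy as np

def poisson_conditional_moments(mu):
    """Conditional mean, variance, skewness, and kurtosis in (7.4)
    when y_t | F_{t-1} ~ Poisson(mu_t)."""
    return mu, mu, 1.0 / np.sqrt(mu), 3.0 + 1.0 / mu

# Monte Carlo check for a single conditional mean value.
rng = np.random.default_rng(1)
mu = 2.5
y = rng.poisson(mu, size=1_000_000)
z = y - mu
print(poisson_conditional_moments(mu))
print(y.mean(), y.var(), (z**3).mean() / y.std()**3, (z**4).mean() / y.var()**2)
```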
7.3 Models and Moment Properties for Count Time Series
Several models have been discussed in the literature for count time series, where param-
eter estimation using maximum likelihood or Bayesian approaches have been described.
For the estimating equations framework described in this chapter, we start from the conditional
moments of the process {y_t} given the history F_{t−1}. The conditional moments are
assumed to be functions of an unknown parameter vector θ and form the basis for con-
structing the optimal estimating equation. For simplicity, we suppress θ in the notation for
the conditional moments and other derived quantities in the following examples. Consider
the discrete-time model for μ_t with P + Q + 1 parameters defined by
\[
\mu_t = \delta + \sum_{i=1}^{P} \alpha_i\, y_{t-i} + \sum_{j=1}^{Q} \beta_j\, \mu_{t-j}, \tag{7.5}
\]
where δ > 0, α_i ≥ 0 for i = 1, ..., P, and β_j ≥ 0 for j = 1, ..., Q. Let θ = (δ, α′, β′)′, where
α = (α_1, ..., α_P)′ and β = (β_1, ..., β_Q)′. We assume that the conditional variance σ²_t as well
as μ_t depend on θ, and that the conditional skewness γ_t and conditional kurtosis κ_t are
available and do not depend on any additional parameters. The higher order conditional
moment properties for the models described in Sections 7.3.1 and 7.3.2, especially for the
zero-inated case, are obtained using Mathematica. Section 7.3.3 proposes a model in the
framework of the GAS models of Creal et al. (2013).
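To make the recursion (7.5) concrete, the Python sketch below simulates a count series whose conditional mean follows (7.5) with P = Q = 1, pairing it with a Poisson conditional distribution; the distributional choice, parameter values, and function name are ours and are used only for illustration.

```python
import numpy as np

def simulate_ingarch11(delta, alpha, beta, n=1000, seed=42):
    """Simulate counts whose conditional mean follows (7.5) with P = Q = 1,
    using a Poisson conditional distribution as one illustrative choice."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n, dtype=int)
    mu = np.zeros(n)
    mu[0] = delta / (1.0 - alpha - beta)   # start at the stationary mean (needs alpha + beta < 1)
    y[0] = rng.poisson(mu[0])
    for t in range(1, n):
        mu[t] = delta + alpha * y[t - 1] + beta * mu[t - 1]
        y[t] = rng.poisson(mu[t])
    return y, mu

y, mu = simulate_ingarch11(delta=1.0, alpha=0.3, beta=0.4)
# Compare the sample mean with the marginal mean delta / (1 - alpha - beta).
print(y.mean(), 1.0 / (1.0 - 0.3 - 0.4))
```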
Equation (7.5) posits an ARMA model for {y_t}. This ARMA representation is useful for
obtaining unconditional moments such as skewness and kurtosis under the stationarity
assumption and is often useful in model identification in data analysis. We consider the
martingale difference m_t = y_t − μ_t, with conditional mean 0 and conditional variance σ²_t.
Then (7.5) can be written as
\[
y_t - m_t = \delta + \sum_{i=1}^{P} \alpha_i\, y_{t-i} + \sum_{j=1}^{Q} \beta_j \left( y_{t-j} - m_{t-j} \right).
\]
Rearranging terms and simplifying, we can write
\[
\left( 1 - \sum_{i=1}^{\max(P,Q)} (\alpha_i + \beta_i)\, B^i \right) y_t
= \delta + \left( 1 - \sum_{j=1}^{Q} \beta_j\, B^j \right) m_t,
\quad \text{or} \quad
\phi(B)\, y_t = \delta + \beta(B)\, m_t,
\]
where B denotes the backshift operator. That is, (7.5) can be written as an ARMA model for
{y_t} with φ(B) = 1 − Σ_{i=1}^{max(P,Q)} φ_i B^i, φ_i = α_i + β_i, β(B) = 1 − Σ_{i=1}^{Q} β_i B^i, and
ψ(B)φ(B) = β(B) with ψ(B) = 1 + Σ_{i=1}^{∞} ψ_i B^i. Similar to the continuous-valued case (Gourieroux, 1997),
this model has the same second-order properties as an INARMA(max(P, Q), Q) model.
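For instance, when P = Q = 1 (a case worked out here only to fix ideas; it is not spelled out in the text), the rearrangement above gives an ARMA(1, 1)-type representation:
\[
y_t - (\alpha_1 + \beta_1)\, y_{t-1} = \delta + m_t - \beta_1\, m_{t-1},
\qquad
\psi(B) = \frac{1 - \beta_1 B}{1 - (\alpha_1 + \beta_1) B},
\quad \text{so} \quad
\psi_0 = 1, \;\; \psi_j = \alpha_1 (\alpha_1 + \beta_1)^{\,j-1}, \; j \ge 1.
\]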
When all solutions to φ(z) = 0 lie outside the unit circle, we may write the moving average
representation of the causal process as y_t = μ + ψ(B)m_t, where ψ(B) = β(B)/φ(B)
and μ = δ/(1 − φ_1 − ... − φ_{max(P,Q)}) is the marginal mean of y_t. The lag k autocovariance
and autocorrelation of the process are, respectively,
\[
\gamma_k^{(y)} = E(\sigma_t^2) \sum_{j=0}^{\infty} \psi_j \psi_{j+k}
\quad \text{and} \quad
\rho_k^{(y)} = \frac{\gamma_k^{(y)}}{\gamma_0^{(y)}} = \frac{\sum_{j=0}^{\infty} \psi_j \psi_{j+k}}{\sum_{j=0}^{\infty} \psi_j^2},
\]
where E(σ_t²) = Var(m_t) is the unconditional variance of the martingale difference {m_t}.
Note that the temporal correlation ρ_k^{(y)} depends only on the model parameters in (7.5) and
not on the conditional distribution of the observed process {y_t}. Also, the kurtosis of {y_t} is
given by
\[
K^{(y)} = 3 + \left( K^{(m)} - 3 \right) \frac{\sum_{j=0}^{\infty} \psi_j^4}{\left( \sum_{j=0}^{\infty} \psi_j^2 \right)^2}, \tag{7.6}
\]
where K^{(m)} = E(m_t⁴)/[E(m_t²)]². These results follow directly from properties of stationary
ARMA processes and often provide guidance in model order choice. By substituting suitable
values of ψ_j, we can derive the kurtosis for the integer-valued processes discussed in
the following sections.
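The ψ_j weights needed in these expressions are available in closed form for the P = Q = 1 case shown above, and the autocorrelations ρ_k^{(y)} and the kurtosis (7.6) then follow by truncating the infinite sums. The Python sketch below is ours; the value used for K^{(m)} is an arbitrary placeholder rather than one implied by any particular conditional distribution.

```python
import numpy as np

def psi_weights(alpha1, beta1, n_terms=200):
    """MA(infinity) weights for (7.5) with P = Q = 1:
    psi_0 = 1 and psi_j = alpha1 * (alpha1 + beta1)**(j - 1) for j >= 1."""
    phi1 = alpha1 + beta1
    psi = np.empty(n_terms)
    psi[0] = 1.0
    psi[1:] = alpha1 * phi1 ** np.arange(n_terms - 1)
    return psi

def acf_and_kurtosis(alpha1, beta1, k_max=5, kurt_m=4.0):
    """Lag-k autocorrelations rho_k^(y) and the kurtosis K^(y) in (7.6),
    given the kurtosis K^(m) of the martingale differences
    (kurt_m = 4.0 is a placeholder value for illustration)."""
    psi = psi_weights(alpha1, beta1)
    s2 = np.sum(psi**2)
    rho = [np.sum(psi[:-k] * psi[k:]) / s2 for k in range(1, k_max + 1)]
    kurt_y = 3.0 + (kurt_m - 3.0) * np.sum(psi**4) / s2**2
    return rho, kurt_y

rho, kurt_y = acf_and_kurtosis(alpha1=0.3, beta1=0.4)
print(np.round(rho, 3), round(kurt_y, 3))
```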
7.3.1 Models for Nominally Dispersed Counts
Considerable attention has been paid in the literature to modeling count time series via
observation-driven models (Zeger and Qaqish, 1988; Davis et al., 2003) and parameter-
driven models (Chan and Ledolter, 1995; West and Harrison, 1997). We consider three
examples.