
8 Handbook of Discrete-Valued Time Series
after repeated substitution. Hence, we obtain again that the hidden process $\{\nu_t\}$ is
determined by past functions of lagged responses. Equivalently, the log-linear model (1.8)
belongs to the class of observation-driven models and possesses similar properties to the
linear model (1.5). For more details, see Fokianos (2012).
1.2.3 Nonlinear Models for Count Time Series
A large class of models for the analysis of count time series is given by the following
nonlinear mean specification:
$$\lambda_t = f(\lambda_{t-1}, Y_{t-1}), \quad t \geq 1, \tag{1.9}$$
where $f(\cdot)$ is known up to an unknown finite-dimensional parameter vector such that
$f : (0, \infty) \times \mathbb{N} \to (0, \infty)$. The function $f(\cdot)$ is assumed to satisfy
the contraction condition
$$|f(\lambda, y) - f(\lambda', y')| \leq \alpha_1 |\lambda - \lambda'| + \gamma_1 |y - y'|, \tag{1.10}$$
for $(\lambda, y)$ and $(\lambda', y')$ in $(0, \infty) \times \mathbb{N}$, where
$\alpha_1 + \gamma_1 < 1$; see Fokianos et al. (2009), Neumann (2011), Doukhan et al. (2012, 2013),
and Fokianos and Tjøstheim (2012).
An interesting example of a nonlinear regression model for count time series is
given by
$$f(\lambda, y) = d + \bigl(a_1 + c_1 \exp(-\gamma \lambda^2)\bigr)\lambda + b_1 y, \tag{1.11}$$
where $d, a_1, c_1, b_1, \gamma$ are positive parameters, which is similar to the exponential
autoregressive model (Haggan and Ozaki 1981). In Fokianos et al. (2009), (1.11) was studied for
the case $d = 0$. The parameter $\gamma$ introduces a perturbation of the linear model (1.5), in
the sense that when $\gamma$ tends either to 0 or to infinity, (1.11) approaches two distinct
linear models. It turns out that the regression coefficients of model (1.11) must satisfy the
condition $a_1 + b_1 + c_1 < 1$ to guarantee ergodicity and stationarity of the joint process
$(Y_t, \lambda_t)$; see Doukhan et al. (2012) for more precise conditions. Model (1.11) shows that
there is a smooth transition between two linear models for count time series in terms of the
unobserved process. This transition might be difficult to estimate because of the nonlinear
parameter $\gamma$ and the fact that $\lambda_t$ is not directly observed. An alternative method
for introducing the smooth transition is by employing the observed data. In other words, instead
of (1.11), we can consider
$$f(\lambda, y) = d + a_1 \lambda + \bigl(b_1 + c_1 \exp(-\gamma y^2)\bigr) y.$$
The previous model is interpreted in a similar manner to (1.11) and must satisfy the
condition $a_1 + b_1 + c_1 < 1$ to obtain ergodicity and stationarity of the joint process
$(Y_t, \lambda_t)$.
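The two smooth-transition specifications can be explored numerically. The following is a minimal sketch, assuming illustrative function names and hypothetical parameter values that do not come from the text; it codes (1.11) and its observation-driven variant and evaluates (1.11) when $\gamma$ is zero and when $\gamma$ is large, which correspond to the two limiting linear models.

```python
import math


def f_latent_transition(lam, y, d, a1, b1, c1, gamma):
    """Model (1.11): the transition weight depends on the latent mean lam."""
    return d + (a1 + c1 * math.exp(-gamma * lam ** 2)) * lam + b1 * y


def f_observed_transition(lam, y, d, a1, b1, c1, gamma):
    """Variant in which the transition weight depends on the observed count y."""
    return d + a1 * lam + (b1 + c1 * math.exp(-gamma * y ** 2)) * y


def satisfies_stationarity_condition(a1, b1, c1):
    """Sufficient condition a1 + b1 + c1 < 1 quoted in the text."""
    return a1 + b1 + c1 < 1


# Hypothetical parameter values, chosen only for illustration.
params = dict(d=0.5, a1=0.3, b1=0.2, c1=0.4)
lam, y = 2.0, 3

# gamma = 0: the slope on lam is a1 + c1 (first limiting linear model).
at_zero = f_latent_transition(lam, y, gamma=0.0, **params)
# Large gamma: the exponential term vanishes and the slope on lam is a1.
at_large = f_latent_transition(lam, y, gamma=50.0, **params)

print(at_zero, at_large, satisfies_stationarity_condition(0.3, 0.2, 0.4))
```

In the observation-driven variant the weight is computed from $y$, so it can be evaluated directly from the data without reconstructing the latent mean first.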
Another nonlinear model studied by Fokianos and Tjøstheim (2012) is given by
$$f(\lambda, y) = \frac{d}{(1 + \lambda_{t-1})^{\gamma}} + a_1 \lambda_{t-1} + b_1 Y_{t-1}, \tag{1.12}$$
written here at $(\lambda, y) = (\lambda_{t-1}, Y_{t-1})$, where all regression parameters are
assumed positive. Here, the inclusion of $\gamma$ introduces a nonlinear perturbation, in the
sense that small values of $\gamma$ cause (1.12) to approach (1.5).
9 Statistical Analysis of Count Time Series Models
Moderate values of $\gamma$ introduce a stronger perturbation. Models of the form (1.9) have also
been studied in the context of the negative binomial distribution by Christou and Fokianos
(2014). The condition $\max\{a_1, d\gamma - a_1\} + b_1 < 1$ guarantees ergodicity and
stationarity of the joint process $(Y_t, \lambda_t)$. Following the arguments made earlier in
connection with model (1.11), we can alternatively consider the following modification of (1.12):
$$f(\lambda, y) = \frac{d}{(1 + Y_{t-1})^{\gamma}} + a_1 \lambda_{t-1} + b_1 Y_{t-1},$$
with the required stationarity condition $\max\{b_1, d\gamma - b_1\} + a_1 < 1$.
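The two stationarity conditions are easy to check for a candidate parameter vector. The sketch below is illustrative only: the function names and the numerical values are assumptions, not part of the text. It codes (1.12), its observation-driven modification, and the corresponding sufficient conditions.

```python
def f_model_1_12(lam_prev, y_prev, d, a1, b1, gamma):
    """Model (1.12): the intercept d is damped by the lagged latent mean."""
    return d / (1.0 + lam_prev) ** gamma + a1 * lam_prev + b1 * y_prev


def f_model_1_12_observed(lam_prev, y_prev, d, a1, b1, gamma):
    """Modification of (1.12): the intercept d is damped by the lagged count."""
    return d / (1.0 + y_prev) ** gamma + a1 * lam_prev + b1 * y_prev


def stationary_1_12(d, a1, b1, gamma):
    """Condition max{a1, d*gamma - a1} + b1 < 1 for model (1.12)."""
    return max(a1, d * gamma - a1) + b1 < 1


def stationary_1_12_observed(d, a1, b1, gamma):
    """Condition max{b1, d*gamma - b1} + a1 < 1 for the modified model."""
    return max(b1, d * gamma - b1) + a1 < 1


# Hypothetical values: the first set satisfies both conditions, while the
# larger intercept in the second set violates the condition for (1.12).
print(stationary_1_12(d=1.0, a1=0.5, b1=0.3, gamma=0.5))
print(stationary_1_12_observed(d=1.0, a1=0.5, b1=0.3, gamma=0.5))
print(stationary_1_12(d=4.0, a1=0.5, b1=0.3, gamma=0.5))
```

Note that both conditions involve the product $d\gamma$, so a large intercept combined with a moderate $\gamma$ can violate them even when $a_1 + b_1 < 1$.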
An obvious generalization of model (1.9) is given by the following specification of the
mean process (see Franke 2010 and Liu 2012):
$$\lambda_t = f(\lambda_{t-1}, \ldots, \lambda_{t-p}, Y_{t-1}, \ldots, Y_{t-q}), \tag{1.13}$$
where $f(\cdot)$ is a function such that $f : (0, \infty)^p \times \mathbb{N}^q \to (0, \infty)$.
It should be clear that models (1.11) and (1.12) can be extended according to (1.13). Such
examples are provided by the class of smooth transition autoregressive models of which the
exponential autoregressive model is a special case (cf. Teräsvirta 1994, Teräsvirta et al. 2010).
Further examples of nonlinear time series models can be found in Tong (1990) and Fan and Yao
(2003). These models have not been considered earlier in the literature in the context of
generalized linear models for count time series, and they provide a flexible framework for
studying dependent count data. For instance, nonlinear models can be quite useful when testing
departures from linearity; this topic is partially addressed in Section 1.6.1. A more general
approach would have been to estimate the function $f$ of (1.13) by employing nonparametric
methods. However, such an approach is missing from the literature.
1.3 Inference
Maximum likelihood inference for the Poisson model (1.3) and the negative binomial
model (1.4) has been developed by Fokianos et al. (2009), Fokianos and Tjøstheim (2012),
and Christou and Fokianos (2014). They develop estimation procedures based on the
Poisson likelihood function, which for the Poisson model (1.3) is obviously the true
likelihood. However, for the negative binomial model (1.4), and more generally for mixed
Poisson models, this method resembles the QMLE method for GARCH models which
employs the Gaussian likelihood function irrespective of the assumed error distribution.
The QMLE method, in the context of GARCH models, has been studied in detail by
Berkes et al. (2003), Francq and Zakoïan (2004), Mikosch and Straumann (2006), Bardet
and Wintenberger (2010), and Meitz and Saikkonen (2011), among others. This approach
yields consistent estimators of regression parameters under a correct mean specification
and it bypasses complicated likelihood functions (Godambe and Heyde 1987, Zeger and
Qaqish 1988, Heyde 1997).
In the case of mixed Poisson models (1.2), it is impossible, in general, to have a readily
available likelihood function since the distribution of the $Z$'s is generally unknown. Hence,
we resort to QMLE methodology and, for defining properly the QMLE, we consider the
Poisson log-likelihood function, as in Fokianos et al. (2009), where $\theta$ denotes the unknown
parameter vector,
$$l(Y; \theta) = \sum_{t=1}^{n} l_t(\theta) = \sum_{t=1}^{n} \bigl(Y_t \log \lambda_t(\theta) - \lambda_t(\theta)\bigr). \tag{1.14}$$
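The quasi-score function is defined by
$$S_n(\theta) = \frac{\partial l(Y; \theta)}{\partial \theta} = \sum_{t=1}^{n} \frac{\partial l_t(\theta)}{\partial \theta} = \sum_{t=1}^{n} \left(\frac{Y_t}{\lambda_t(\theta)} - 1\right) \frac{\partial \lambda_t(\theta)}{\partial \theta}, \tag{1.15}$$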
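where the vector $\partial \lambda_t(\theta)/\partial \theta$ is defined recursively depending upon
the mean specification employed. The solution of the system of nonlinear equations
$S_n(\theta) = 0$, if it exists, yields the QMLE of $\theta$, which we denote by $\hat{\theta}$.
The conditional information matrix is defined by
$$G_n(\theta) = \sum_{t=1}^{n} \operatorname{Var}\left[\frac{\partial l_t(\theta)}{\partial \theta} \,\Big|\, \mathcal{F}_{t-1}^{Y,\lambda}\right] = \sum_{t=1}^{n} \left(\frac{1}{\lambda_t(\theta)} + \sigma_Z^2\right) \left(\frac{\partial \lambda_t(\theta)}{\partial \theta}\right) \left(\frac{\partial \lambda_t(\theta)}{\partial \theta}\right)^{\top}.$$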
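To make the recursive definition of $\partial \lambda_t(\theta)/\partial \theta$ concrete, here is a minimal sketch for the linear model, assuming that (1.5) has the form $\lambda_t = d + a_1 \lambda_{t-1} + b_1 Y_{t-1}$ with $\theta = (d, a_1, b_1)$ (the form implied by (1.18)). The function names, the plain-list data layout, and the choice to sum over $t = 1, \ldots, n-1$ while conditioning on the first observation are assumptions made for illustration.

```python
import math


def linear_recursion(theta, y):
    """Return lambda_t and d(lambda_t)/d(theta) for t = 1, ..., len(y) - 1,
    starting from lambda_0 = 0 and a zero derivative vector."""
    d, a1, b1 = theta
    lam_prev, grad_prev = 0.0, [0.0, 0.0, 0.0]
    lams, grads = [], []
    for t in range(1, len(y)):
        lam = d + a1 * lam_prev + b1 * y[t - 1]
        # Differentiate the recursion term by term with respect to (d, a1, b1).
        grad = [
            1.0 + a1 * grad_prev[0],
            lam_prev + a1 * grad_prev[1],
            y[t - 1] + a1 * grad_prev[2],
        ]
        lams.append(lam)
        grads.append(grad)
        lam_prev, grad_prev = lam, grad
    return lams, grads


def quasi_loglik(theta, y):
    """Poisson quasi-log-likelihood (1.14), up to the term not involving theta."""
    lams, _ = linear_recursion(theta, y)
    return sum(yt * math.log(lam) - lam for yt, lam in zip(y[1:], lams))


def quasi_score(theta, y):
    """Quasi-score (1.15): sum of (Y_t / lambda_t - 1) * d(lambda_t)/d(theta)."""
    lams, grads = linear_recursion(theta, y)
    score = [0.0, 0.0, 0.0]
    for yt, lam, g in zip(y[1:], lams, grads):
        w = yt / lam - 1.0
        for j in range(3):
            score[j] += w * g[j]
    return score


def cond_information(theta, y, sigma2_z=0.0):
    """Conditional information matrix G_n(theta) for a given sigma_Z^2."""
    lams, grads = linear_recursion(theta, y)
    G = [[0.0] * 3 for _ in range(3)]
    for lam, g in zip(lams, grads):
        w = 1.0 / lam + sigma2_z
        for i in range(3):
            for j in range(3):
                G[i][j] += w * g[i] * g[j]
    return G
```

A root of `quasi_score` (found, for example, with a general-purpose numerical optimizer applied to `quasi_loglik`) would give the QMLE under these assumptions; comparing `quasi_score` with a finite-difference gradient of `quasi_loglik` is a convenient sanity check of the derivative recursion.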
The asymptotic properties of $\hat{\theta}$ have been studied in detail by Fokianos et al. (2009)
and Fokianos and Tjøstheim (2012) for the case of the Poisson model (1.3). For the case of the
negative binomial distribution, see Christou and Fokianos (2014). In both cases, consistency
and asymptotic normality of $\hat{\theta}$ are achieved. In fact, it can be shown that the QMLE is
asymptotically normally distributed, i.e.,
$$\sqrt{n}\,(\hat{\theta} - \theta_0) \xrightarrow{D} N\bigl(0,\; G^{-1}(\theta_0)\, G_1(\theta_0)\, G^{-1}(\theta_0)\bigr),$$
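where the matrices $G$ and $G_1$ are given by the following:
$$G(\theta) = E\left[\frac{1}{\lambda_t(\theta)} \left(\frac{\partial \lambda_t}{\partial \theta}\right) \left(\frac{\partial \lambda_t}{\partial \theta}\right)^{\top}\right], \qquad G_1(\theta) = E\left[\left(\frac{1}{\lambda_t(\theta)} + \sigma_Z^2\right) \left(\frac{\partial \lambda_t}{\partial \theta}\right) \left(\frac{\partial \lambda_t}{\partial \theta}\right)^{\top}\right]. \tag{1.16}$$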
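In practice the expectations in (1.16) are replaced by sample averages. The following sketch, which is an illustration under stated assumptions rather than the authors' implementation, takes already computed values of $\lambda_t(\hat{\theta})$ and $\partial \lambda_t/\partial \theta$ and returns the plug-in sandwich covariance $G^{-1} G_1 G^{-1}/n$ for $\hat{\theta}$. The helper names and the small Gauss-Jordan inverse are assumptions introduced to keep the block dependency-free.

```python
def mat_inv(A):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(A)
    M = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            raise ValueError("singular matrix")
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def mat_mul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def weighted_outer_mean(weights, grads):
    """Sample average of w_t * g_t g_t' over t."""
    n, k = len(grads), len(grads[0])
    out = [[0.0] * k for _ in range(k)]
    for w, g in zip(weights, grads):
        for i in range(k):
            for j in range(k):
                out[i][j] += w * g[i] * g[j] / n
    return out


def sandwich_cov(lams, grads, sigma2_z):
    """Plug-in estimate of Cov(theta_hat) = G^{-1} G_1 G^{-1} / n from (1.16)."""
    n = len(lams)
    G = weighted_outer_mean([1.0 / lam for lam in lams], grads)
    G1 = weighted_outer_mean([1.0 / lam + sigma2_z for lam in lams], grads)
    G_inv = mat_inv(G)
    V = mat_mul(mat_mul(G_inv, G1), G_inv)
    return [[v / n for v in row] for row in V]
```

With `sigma2_z = 0` the two matrices coincide and the result reduces to $G^{-1}/n$; standard errors are the square roots of the diagonal entries.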
For the case of the Poisson distribution (1.3), we obtain that
$$\sqrt{n}\,(\hat{\theta} - \theta_0) \xrightarrow{D} N\bigl(0,\; G^{-1}(\theta_0)\bigr),$$
since $\sigma_Z^2 = 0$. If $\sigma_Z^2 > 0$, then we need to estimate this parameter for maximum
likelihood inference. We propose to estimate $\sigma_Z^2$ by solving the equation
$$\sum_{t=1}^{n} \frac{(Y_t - \hat{\lambda}_t)^2}{\hat{\lambda}_t (1 + \hat{\lambda}_t \sigma_Z^2)} = n - m, \tag{1.17}$$
where $m$ denotes the dimension of $\theta$ and $\hat{\lambda}_t = \lambda_t(\hat{\theta})$; see
Lawless (1987). In particular, we recognize that for the negative binomial case,
$\sigma_Z^2 = 1/\nu$. Therefore, the previous estimation method is standard in applications; see
Cameron and Trivedi (1998, Ch. 3) among others.
Although the earlier mentioned formulas are given for the linear model (1.5), they can be
modified suitably for the log-linear model (1.8) and the nonlinear model (1.9).
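Equation (1.17) has no closed-form solution in general, but its left-hand side is decreasing in $\sigma_Z^2$, so a simple bracketing search suffices. The sketch below uses bisection; the function names, the tolerance, and the convention of returning 0 when the data show no overdispersion are assumptions made for illustration.

```python
def pearson_sum(y, lam_hat, sigma2):
    """Left-hand side of (1.17) for a trial value of sigma_Z^2."""
    return sum((yt - lt) ** 2 / (lt * (1.0 + lt * sigma2))
               for yt, lt in zip(y, lam_hat))


def solve_sigma2_z(y, lam_hat, m, tol=1e-10, max_iter=200):
    """Solve (1.17) for sigma_Z^2 by bisection, where m is the dimension of theta."""
    target = len(y) - m
    if target <= 0:
        raise ValueError("need more observations than parameters")
    if pearson_sum(y, lam_hat, 0.0) <= target:
        # Assumed convention: no overdispersion detected, return the boundary value.
        return 0.0
    lo, hi = 0.0, 1.0
    while pearson_sum(y, lam_hat, hi) > target:
        hi *= 2.0  # expand the bracket until the root is enclosed
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if pearson_sum(y, lam_hat, mid) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Toy data with a constant fitted mean (hypothetical values).
y = [0, 10, 2, 15, 1, 12, 4, 9]
lam_hat = [5.0] * len(y)
sigma2_hat = solve_sigma2_z(y, lam_hat, m=3)
nu_hat = 1.0 / sigma2_hat  # negative binomial case: sigma_Z^2 = 1 / nu
```

Because the negative binomial case has $\sigma_Z^2 = 1/\nu$, the reciprocal of a positive solution gives the estimate of the dispersion parameter $\nu$.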
1.3.1 Transactions Data
Recall the transactions data discussed in Section 1.2. To fit model (1.5) to those data we
proceed as follows: we set $\lambda_0 = 0$ and $\partial \lambda_0/\partial \theta = 0$ for
initialization of the recursions; recall (1.15). Starting values for the parameter vector
$\theta$ are obtained after observing that (1.5) can be expressed as an ARMA(1,1) model of the form
$$\left(Y_t - \frac{d}{1 - (a_1 + b_1)}\right) = (a_1 + b_1) \left(Y_{t-1} - \frac{d}{1 - (a_1 + b_1)}\right) + \epsilon_t - a_1 \epsilon_{t-1}, \tag{1.18}$$
where $\epsilon_t = Y_t - \lambda_t$ is a white noise process.
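The ARMA(1,1) form (1.18) is an algebraic rearrangement of the mean recursion, which can be verified numerically on any sequence of counts. The sketch below assumes that (1.5) reads $\lambda_t = d + a_1 \lambda_{t-1} + b_1 Y_{t-1}$; the function name and the toy data are hypothetical.

```python
def arma_identity_gap(y, d, a1, b1):
    """Largest absolute difference between the two sides of (1.18)."""
    phi = a1 + b1
    mu = d / (1.0 - phi)
    lam = [0.0]  # lambda_0 = 0, the initialization used in the text
    for t in range(1, len(y)):
        lam.append(d + a1 * lam[t - 1] + b1 * y[t - 1])
    eps = [yt - lt for yt, lt in zip(y, lam)]  # eps_t = Y_t - lambda_t
    gaps = []
    for t in range(1, len(y)):
        lhs = y[t] - mu
        rhs = phi * (y[t - 1] - mu) + eps[t] - a1 * eps[t - 1]
        gaps.append(abs(lhs - rhs))
    return max(gaps)


# Any count sequence works, because the identity does not rely on the
# distribution of the data (hypothetical values below).
gap = arma_identity_gap([4, 7, 3, 9, 12, 5, 8], d=0.6, a1=0.7, b1=0.2)
```

The identity holds for arbitrary data; what the model adds is that $\epsilon_t$ is white noise, which is what allows standard ARMA(1,1) fitting to supply starting values for $d$, $a_1$, and $b_1$.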
The first three lines of Table 1.1 report estimation results after fitting model (1.5) to these
data. The first line reports estimates of the regression parameters (previously reported in
Fokianos 2012). Regardless of the assumed distribution, the estimators are identical because
we maximize the log-likelihood function (1.14). If the Poisson distribution is assumed to be
the true data generating process, then these estimators are MLE; otherwise they are QMLE.
If the true data generating process is assumed to be the negative binomial distribution,
then we can also estimate the dispersion parameter $\nu$ by means of (1.17) and the fact that
$\sigma_Z^2 = 1/\nu$. The next two rows report the standard errors of the estimators. For the
regression parameters, these are calculated either by using the Poisson assumption and the
matrix $G$ from (1.16) (second row of Table 1.1) or by using the sandwich matrix obtained by $G$
and $G_1$ from (1.16) (third row of Table 1.1). We note that the standard errors obtained from
the sandwich matrix are somewhat larger than those obtained from $G$. These are robust
standard errors in the sense that we are using a working likelihood, namely the Poisson
likelihood, to carry out inference (see White 1982).

TABLE 1.1
QMLEs and their standard errors (in parentheses) for the Linear Model (1.5), the Nonlinear Model
(1.12) with $\gamma = 0.5$, and the Log-linear Model (1.8), for the total number of transactions
per minute for the stock Ericsson B for the time period between July 2 and July 22, 2002

                           Maximum Likelihood Estimates            Estimates of ν
                           $\hat{d}$    $\hat{a}_1$  $\hat{b}_1$   $\hat{\nu}$
Linear model (1.5)         0.581        0.745        0.199         7.158
                           (0.148)      (0.030)      (0.022)       (0.907)
                           (0.236)      (0.047)      (0.035)
Nonlinear model (1.12)     1.327        0.774        0.186         7.229
                           (0.324)      (0.026)      (0.021)       (1.028)
                           (0.506)      (0.041)      (0.034)
Log-linear model (1.8)     0.105        0.746        0.207         6.860
                           (0.032)      (0.028)      (0.022)       (1.067)
                           (0.125)      (0.081)      (0.070)

Note: The total number of observations is 460. For each model, the first row of standard errors
is based on $G$ and the second on the sandwich matrix.

The standard error of $\hat{\nu}$ has been computed by parametric bootstrap. In other words,
given $\hat{\theta}$ and $\hat{\nu}$, generate a large number of count time series by means of
(1.5) and using (1.4). For each of the simulated count time series, carry out QML estimation
and get an estimator of $\nu$ by using (1.17). The standard error of these replicates is
reported in Table 1.1, underneath $\hat{\nu}$.
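The bootstrap loop just described can be sketched as follows. This is a simplified illustration, not the procedure used to produce Table 1.1: it assumes the linear form $\lambda_t = d + a_1 \lambda_{t-1} + b_1 Y_{t-1}$ for (1.5), generates negative binomial counts as a Poisson-gamma mixture with $\sigma_Z^2 = 1/\nu$, and, to stay short, skips the QML re-estimation of $\theta$ in each replicate by reusing the simulated means. All function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def poisson_draw(mean, rng):
    """Knuth's multiplication method; adequate for moderate means."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_nb_linear(theta, nu, n, rng):
    """Simulate counts with Y_t | past ~ Poisson(Z_t * lambda_t), where Z_t is
    gamma distributed with mean 1 and variance 1/nu (a mixed Poisson model)."""
    d, a1, b1 = theta
    lam, y_prev, ys, lams = 0.0, 0, [], []
    for _ in range(n):
        lam = d + a1 * lam + b1 * y_prev
        z = rng.gammavariate(nu, 1.0 / nu)
        y_prev = poisson_draw(z * lam, rng)
        ys.append(y_prev)
        lams.append(lam)
    return ys, lams


def estimate_nu(y, lam, m):
    """Solve (1.17) by bisection and return nu = 1 / sigma_Z^2, or None when
    the replicate shows no overdispersion."""
    target = len(y) - m

    def pearson(s2):
        return sum((yt - lt) ** 2 / (lt * (1.0 + lt * s2)) for yt, lt in zip(y, lam))

    if pearson(0.0) <= target:
        return None
    lo, hi = 0.0, 1.0
    while pearson(hi) > target:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if pearson(mid) > target:
            lo = mid
        else:
            hi = mid
    return 1.0 / (0.5 * (lo + hi))


def bootstrap_se_nu(theta_hat, nu_hat, n, n_boot=200, seed=0):
    """Standard deviation of the bootstrap replicates of nu_hat."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        y, lam = simulate_nb_linear(theta_hat, nu_hat, n, rng)
        # SIMPLIFICATION (assumption): a full implementation would refit theta
        # by QML on each replicate and plug the fitted means into (1.17).
        nu_star = estimate_nu(y, lam, m=len(theta_hat))
        if nu_star is not None:
            reps.append(nu_star)
    return statistics.stdev(reps)


se_nu = bootstrap_se_nu((1.0, 0.3, 0.3), nu_hat=5.0, n=200, n_boot=30, seed=1)
```

Fixing the seed makes the result reproducible; in a real analysis the number of replicates would be much larger and the refitting step would be restored.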
We now fit model (1.12) to these data with $\gamma = 0.5$. Although we have assumed for
simplicity that $\gamma$ is known, it can be estimated along with the regression parameters as
outlined in Fokianos and Tjøstheim (2012). In principle, additional nonlinear parameters can be
estimated by QMLE, but the sample size should be sufficiently large. For fitting this model, we
set again $\lambda_0 = 0$ and $\partial \lambda_0/\partial \theta = 0$. Starting values for the
parameter vector $\theta$ are obtained by initially fitting a linear model (1.5). Table 1.1 shows
the results of this exercise. The estimators of $a_1$ and $b_1$ from model (1.12) are close to
those obtained from fitting model (1.5). The same observation holds for $\hat{\nu}$ and the
standard errors of these coefficients, which are computed in an analogous manner to that of the
linear model. It is worth pointing out that the sum of $\hat{a}_1$ and $\hat{b}_1$ is close to
unity. This fact provides some evidence of nonstationarity when we fit these types of models to
transactions data. These observations are repeated when the log-linear model (1.8) is fitted to
those data; see the last three lines of Table 1.1. Some empirical experience with these models
has shown that when there is positive correlation among the data, then all models will produce
similar output. In general, the log-linear model will be more useful when some covariate
information is available.
1.4 Diagnostics
A detailed discussion concerning diagnostic methods for count time series models has
been given by Kedem and Fokianos (2002, Sec. 1.6); also see the chapter by Jung et al.
(2015; Chapter 9 in this volume). Various quantities, like Pearson and deviance residuals,
for example, have been suggested and it was shown that they can be easily calculated
using standard software. In what follows we focus our attention on the so-called Pearson
residuals and on a new test statistic proposed recently by Fokianos and Neumann (2013)
for testing the goodness of fit of the model under a conditional Poisson model. However,
properties of different types of residuals (raw and deviance residuals for instance) have not
been investigated in the literature. A slightly different definition of the residuals than the
one which we will be using has been given recently by Zhu and Wang (2010). They also
study the large sample behavior of the autocorrelation function of the residuals that they
propose, but for a model of the form (1.5) without feedback.
1.4.1 Residuals
To examine the adequacy of the fit, we consider the so-called Pearson residuals. Set
$$e_t = \frac{Y_t - E\bigl[Y_t \mid \mathcal{F}_{t-1}^{Y,\lambda}\bigr]}{\sqrt{\operatorname{Var}\bigl[Y_t \mid \mathcal{F}_{t-1}^{Y,\lambda}\bigr]}} = \frac{Y_t - \lambda_t}{\sqrt{\lambda_t + \lambda_t^2 \sigma_Z^2}}, \quad t \geq 1, \tag{1.19}$$
where the first equality is the general definition of the Pearson residuals and the second
equality follows from (1.2). Under the true model, the sequence $\{e_t, t \geq 1\}$ is a white noise