the t is based on the Poisson distribution, for the linear and nonlinear models. Clearly, the
plots show deviations from the Poisson distribution, indicating underdispersed predictive
distributions. The right plots indicate no apparent deviations from uniformity; these plots
are based on the negative binomial distribution (1.4). Similar ndings were obtained after
tting the log-linear model (1.8) to the transactions data.
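PIT histograms of this kind are typically computed via the non-randomized PIT of Czado et al. (2009), using only the fitted one-step-ahead predictive c.d.f. The sketch below is a minimal illustration of that calculation; the array names (`y`, `lam_hat`, `nu_hat`) and the choice of ten bins are our own assumptions rather than part of the original analysis.

```python
import numpy as np
from scipy import stats

def pit_histogram(y, cdf, n_bins=10):
    """Non-randomized PIT histogram heights (Czado et al. 2009).

    y   : observed counts, shape (n,)
    cdf : callable; cdf(k) returns the predictive c.d.f. P_t(k) for each t, shape (n,)
    """
    y = np.asarray(y)
    # Predictive c.d.f. evaluated at y_t and at y_t - 1
    upper = cdf(y)
    lower = np.where(y > 0, cdf(y - 1), 0.0)

    u = np.linspace(0.0, 1.0, n_bins + 1)            # bin edges on [0, 1]
    # Conditional PIT F^(t)(u | y_t): 0 below lower, 1 above upper, linear in between
    F = np.clip((u[None, :] - lower[:, None]) /
                np.maximum(upper - lower, 1e-12)[:, None], 0.0, 1.0)
    F_bar = F.mean(axis=0)                            # mean PIT over the series
    return np.diff(F_bar)                             # bin heights; calibration ~ 1/n_bins

# Hypothetical usage with fitted one-step-ahead means lam_hat (and nu_hat for NegBin):
# heights_pois = pit_histogram(y, lambda k: stats.poisson.cdf(k, mu=lam_hat))
# heights_nb   = pit_histogram(y, lambda k: stats.nbinom.cdf(k, nu_hat,
#                                                            nu_hat / (nu_hat + lam_hat)))
```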
1.5.2 Assessment of Marginal Calibration
We now turn to the question of examining marginal calibration. We suppose that the observed time series $\{Y_t,\ t \geq 1\}$ is stationary with marginal c.d.f. $G(\cdot)$. In addition, we assume that we pick a probabilistic forecast in the form of a predictive c.d.f. $P_t(x) = P(Y_t \leq x \mid \mathcal{F}_{t-1}^{Y,\lambda})$. In our case, $P_t(\cdot)$ is either the c.d.f. of a Poisson random variable with mean $\hat{\lambda}_t$, or a negative binomial distribution evaluated at $\hat{\lambda}_t$ and $\hat{\nu}$. We follow Gneiting
et al. (2007) to assess marginal calibration by comparing the average predictive c.d.f.
$$\bar{P}(x) = \frac{1}{n}\sum_{t=1}^{n} P_t(x), \qquad x \in \mathbb{R},$$
to the empirical c.d.f. of the observations given by
$$\hat{G}(x) = \frac{1}{n}\sum_{t=1}^{n} \mathbb{1}(Y_t \leq x), \qquad x \in \mathbb{R}.$$
To display the marginal calibration plot, we plot the difference of the two c.d.f.s,
$$\bar{P}(x) - \hat{G}(x), \qquad x \in \mathbb{R}. \tag{1.23}$$
Figure 1.3 shows that the negative binomial assumption provides a better fit than the Poisson assumption for the transactions data. These figures were drawn by direct calculation of the average c.d.f. $\bar{P}$ and the empirical c.d.f. $\hat{G}$, as explained earlier.
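A minimal sketch of this calculation is given below. It assumes fitted one-step-ahead means `lam_hat` (and a dispersion estimate `nu_hat` for the negative binomial), which are not provided in the text, and simply averages the predictive c.d.f.s over time before subtracting the empirical c.d.f.

```python
import numpy as np
from scipy import stats

def marginal_calibration(y, lam_hat, nu_hat=None, x_max=30):
    """Return x and the difference P_bar(x) - G_hat(x) of equation (1.23).

    If nu_hat is None the Poisson predictive c.d.f. is used, otherwise the
    negative binomial c.d.f. with mean lam_hat and dispersion nu_hat.
    """
    y = np.asarray(y)
    lam_hat = np.asarray(lam_hat, dtype=float)
    x = np.arange(x_max + 1)

    if nu_hat is None:                                # Poisson forecast
        P_tx = stats.poisson.cdf(x[None, :], mu=lam_hat[:, None])
    else:                                             # negative binomial forecast
        prob = nu_hat / (nu_hat + lam_hat)            # success probability
        P_tx = stats.nbinom.cdf(x[None, :], nu_hat, prob[:, None])

    P_bar = P_tx.mean(axis=0)                         # average predictive c.d.f.
    G_hat = (y[:, None] <= x[None, :]).mean(axis=0)   # empirical c.d.f. of the data
    return x, P_bar - G_hat

# Hypothetical usage, producing the two curves of Figure 1.3:
# x, diff_pois = marginal_calibration(y, lam_hat)
# x, diff_nb   = marginal_calibration(y, lam_hat, nu_hat)
```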
1.5.3 Assessment of Sharpness
The assessment of sharpness is accomplished via scoring rules. These rules provide numerical scores and form summary measures for the assessment of the predictive performance. In addition, scoring rules help us to rank the competing forecast models. They are negatively oriented penalties that the forecaster wishes to minimize; see also Czado et al. (2009). Table 1.3 shows a few examples of scoring rules, following Czado et al. (2009). The calculation of all these scores requires an assumption on the conditional distribution of the process. The squared error score is identical for both the Poisson and the negative binomial distributions, since the conditional means are equal. Note that the normalized squared error score is formed by the Pearson residuals defined in (1.19).
FIGURE 1.3
Marginal calibration plot for the transactions data. (a) corresponds to model (1.5) and (b) corresponds to model (1.12). The solid line corresponds to the negative binomial prediction, while the dashed line is for the Poisson forecast. A similar plot is obtained for the case of the log-linear model (1.8) but is not shown.
TABLE 1.3
Definition of scoring rules

Scoring Rule                      Notation   Definition
Logarithmic score                 logs       $-\log p_y$
Quadratic or Brier score          qs         $-2p_y + \|p\|^2$
Spherical score                   sphs       $-p_y / \|p\|$
Ranked probability score          rps        $\sum_{x=0}^{\infty} \{P_t(x) - \mathbb{1}(Y_t \leq x)\}^2$
Dawid–Sebastiani score            dss        $\{(Y_t - \mu_{P_t})/\sigma_{P_t}\}^2 + 2\log \sigma_{P_t}$
Normalized squared error score    nses       $\{(Y_t - \mu_{P_t})/\sigma_{P_t}\}^2$
Squared error score               ses        $(Y_t - \mu_{P_t})^2$

Note: For notational purposes, set $p_y = P(Y_t = y \mid \mathcal{F}_{t-1}^{Y,\lambda})$, $\|p\|^2 = \sum_{y=0}^{\infty} p_y^2$, and $\mu_{P_t}$ and $\sigma_{P_t}$ are the mean and the standard deviation of the predictive distribution $P_t$, respectively.
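As a minimal illustration of how these scores can be computed, the sketch below evaluates the mean scores of Table 1.3 for a Poisson predictive distribution with one-step-ahead means `lam_hat`; the array names and the truncation point `k_max` for the infinite sums are our own assumptions.

```python
import numpy as np
from scipy import stats

def mean_scores(y, lam_hat, k_max=200):
    """Mean scoring rules of Table 1.3 for a Poisson predictive distribution.

    y       : observed counts, shape (n,)
    lam_hat : one-step-ahead predictive means, shape (n,)
    k_max   : truncation point for the sums over the support {0, 1, ...}
    """
    y = np.asarray(y)
    lam_hat = np.asarray(lam_hat, dtype=float)
    k = np.arange(k_max + 1)
    pmf = stats.poisson.pmf(k[None, :], mu=lam_hat[:, None])   # p_x for all x
    cdf = np.cumsum(pmf, axis=1)                                # P_t(x)
    p_y = stats.poisson.pmf(y, mu=lam_hat)                      # p_{y_t}
    norm2 = (pmf ** 2).sum(axis=1)                              # ||p||^2
    mu, sigma = lam_hat, np.sqrt(lam_hat)                       # Poisson mean and s.d.

    scores = {
        "logs": -np.log(p_y),
        "qs":   -2.0 * p_y + norm2,
        "sphs": -p_y / np.sqrt(norm2),
        "rps":  ((cdf - (y[:, None] <= k[None, :])) ** 2).sum(axis=1),
        "dss":  ((y - mu) / sigma) ** 2 + 2.0 * np.log(sigma),
        "nses": ((y - mu) / sigma) ** 2,
        "ses":  (y - mu) ** 2,
    }
    return {name: val.mean() for name, val in scores.items()}
```

Replacing the Poisson probabilities by negative binomial ones (with standard deviation $\sqrt{\hat{\lambda}_t + \hat{\lambda}_t^2/\hat{\nu}}$) gives the corresponding negative binomial rows of Table 1.4.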
Table 1.4 shows all scoring rules applied to the transactions data. It is clear that the negative binomial model fits these data considerably better than the Poisson model, regardless of the assumed model. We note again that all models yield almost identical scoring rules. This further supports our point that, when there exists positive persistent correlation in the data, all models will produce similar output.
For the transactions data, a simple linear model of the form (1.5) under the negative binomial assumption seems to describe the data adequately. This conclusion is a direct consequence of the earlier findings, with additional evidence provided by the values of the goodness-of-fit test (1.21) reported in Table 1.2.
TABLE 1.4
Scoring rules calculated for the transactions data after fitting the Linear Model (1.5), the Nonlinear Model (1.12) for γ = 0.5, and the Log-Linear Model (1.8)
Scoring Rules
Forecaster logs qs sphs rps dss nses ses
Linear model (1.5) Poisson 3.126 0.076 0.276 3.633 4.585 2.326 23.477
NegBin 2.902 0.080 0.292 3.284 4.112 0.993 23.477
Nonlinear model (1.12) Poisson 3.123 0.075 0.274 3.605 4.579 2.318 23.435
NegBin 2.901 0.080 0.289 3.267 4.107 0.985 23.435
Log-linear model (1.8) Poisson 3.144 0.081 0.286 3.764 4.633 2.376 23.894
NegBin 2.910 0.083 0.300 3.334 4.132 0.993 23.894
Note: The two forecasters are compared by the mean logarithmic, quadratic, spherical, ranked probability,
Dawid–Sebastiani, normalized squared error and squared error scores. Bold face numbers in each column
indicate the minimum value obtained between the two forecasters.
1.6 Other Topics
In this section, we will discuss other interesting research topics in the context of count
time series analysis. This list is not exhaustive, and several other interesting topics will be
covered in the following chapters. The list below reflects our personal research interests in the
the framework of generalized linear models.
1.6.1 Testing for Linearity
Consider the nonlinear model (1.11) and suppose that we are interested in testing the hypothesis $H_0\colon c_1 = 0$, which is equivalent to testing linearity of the model. This testing problem is not standard because, under the null hypothesis, the nonlinear parameter $\gamma$ is not identifiable. Furthermore, $c_1 = 0$ implies that the parameter is on the boundary of the parameter space. Similar comments can be made for model (1.12) when testing the hypothesis $H_0\colon \gamma = 0$, but without the additional challenge implied by the nonidentifiability issue. These types of testing problems have been discussed recently by Christou and Fokianos (2013).
Suppose that, in general, the vector of unknown parameters can be decomposed as $\theta = (\theta^{(1)}, \theta^{(2)})$ and let $S_n = (S_n^{(1)}, S_n^{(2)})$ be the corresponding partition of the score function. Consider testing $H_0\colon \theta^{(2)} = 0$ vs. $H_1\colon \theta^{(2)} > 0$, componentwise. This problem is attacked by using the score test statistic, which is given by
$$\mathrm{LM}_n = S_n^{(2)\,\prime}(\tilde{\theta}_n)\,\tilde{\Sigma}^{-1}(\tilde{\theta}_n)\,S_n^{(2)}(\tilde{\theta}_n),$$
where $\tilde{\theta}_n = (\tilde{\theta}_n^{(1)}, 0)$ is the QMLE of $\theta$ under the hypothesis and $\tilde{\Sigma}$ is an appropriate estimator for the covariance matrix $\Sigma = \operatorname{Var}\bigl(n^{-1/2}\, S_n^{(2)}(\tilde{\theta}_n)\bigr)$. If all the parameters are identified
under the null hypothesis, the score statistic $\mathrm{LM}_n$ follows asymptotically a $\chi^2_{m_2}$ distribution under the null, where $m_2 = \dim(\theta^{(2)})$ (Francq and Zakoïan 2010, Ch. 8). Model (1.12) belongs to this class.
When the parameters are not identified under the null, a supremum-type test statistic resolves this problem; see, for instance, Davies (1987). Consider model (1.11), for example, and let $\Gamma$ be a grid of values for the nuisance parameter, denoted by $\gamma$. Then the sup-score test statistic is given by
$$\mathrm{LM}_n = \sup_{\gamma \in \Gamma} \mathrm{LM}_n(\gamma).$$
Critical values of the test statistic can be based either on the asymptotic chi-square approximation or on a parametric bootstrap, as in the case of the test statistic (1.21).
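To make the procedure concrete, here is a minimal sketch of the sup-score test with parametric-bootstrap critical values. The helper functions `lm_stat` (returning $\mathrm{LM}_n(\gamma)$ for a given series) and `simulate_null` (generating a series from the model fitted under $H_0$) are hypothetical placeholders, not functions from the text.

```python
import numpy as np

def sup_score_test(y, lm_stat, simulate_null, gamma_grid, n_boot=499):
    """Sup-score linearity test with parametric-bootstrap critical values.

    lm_stat(y, gamma) : hypothetical user function returning LM_n(gamma)
    simulate_null()   : hypothetical user function simulating a series of the
                        same length from the model fitted under H0
    gamma_grid        : grid Gamma of values for the nuisance parameter gamma
    """
    observed = max(lm_stat(y, g) for g in gamma_grid)     # LM_n = sup over the grid

    boot = np.empty(n_boot)
    for b in range(n_boot):
        y_star = simulate_null()                           # bootstrap series under H0
        boot[b] = max(lm_stat(y_star, g) for g in gamma_grid)

    # Bootstrap p-value; reject H0 (linearity) for small values
    p_value = (1.0 + np.sum(boot >= observed)) / (n_boot + 1.0)
    return observed, p_value
```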
1.6.2 Intervention Analysis
Occasionally, some time series data may show that both the variation and the level of the data change during some specific time interval. Additionally, there might exist outlying values (unusual values) at some time points. This is the case for the campylobacterosis infections data reported from January 1990 to the end of October 2000 in the north of the Province of Québec, Canada; see Fokianos and Fried (2010, Fig. 1). It is natural to ask whether these fluctuations can be explained by (1.5) or whether the inclusion of some interventions will yield better results; see Box and Tiao (1975), Tsay (1986), and Chen and Liu (1993), among others.
Generally speaking, types of intervention effects on time series data are classified according to whether their impact is concentrated on a single or a few data points, or whether they affect the whole process from some specific time $t = \tau$ on. In classical linear time series methodology, an intervention effect is included in the observation equation by employing a sequence of deterministic covariates $\{X_t\}$ of the form
$$X_t = \xi(B) I_t(\tau), \tag{1.24}$$
where $\xi(B)$ is a polynomial operator, $B$ is the shift operator such that $B^i X_t = X_{t-i}$, and $I_t(\tau)$ is an indicator function, with $I_t(\tau) = 1$ if $t = \tau$ and $I_t(\tau) = 0$ if $t \neq \tau$. The choice of the operator $\xi(B)$ determines the kind of intervention effect: additive outlier (AO), transient shift (TS), level shift (LS), or innovational outlier (IO). Since models of the form (1.5) are not defined in terms of innovations, we focus on the first three types of interventions (but see Fried et al. 2015 for a Bayesian point of view).
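For illustration, the first three covariate types can be generated directly from (1.24) with the standard choices $\xi(B) = 1$ (AO), $\xi(B) = 1/(1-\delta B)$ with $0 < \delta < 1$ (TS), and $\xi(B) = 1/(1-B)$ (LS). The following is a minimal sketch; the function name and the default value of $\delta$ are our own choices.

```python
import numpy as np

def intervention_covariate(n, tau, kind="AO", delta=0.8):
    """Deterministic covariate X_t = xi(B) I_t(tau) of (1.24) for t = 1, ..., n.

    kind  : "AO" (additive outlier), "TS" (transient shift), "LS" (level shift)
    delta : decay rate of the transient shift, 0 < delta < 1 (our own default)
    """
    t = np.arange(1, n + 1)
    if kind == "AO":          # xi(B) = 1: a spike at t = tau only
        return (t == tau).astype(float)
    if kind == "TS":          # xi(B) = 1/(1 - delta B): geometric decay after tau
        return np.where(t >= tau, delta ** (t - tau), 0.0)
    if kind == "LS":          # xi(B) = 1/(1 - B): permanent level shift from tau on
        return (t >= tau).astype(float)
    raise ValueError("kind must be 'AO', 'TS', or 'LS'")
```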
However, a model like (1.5) is determined by a latent process. Therefore, a formal lin-
ear structure, as in the case of the Gaussian linear time series model, does not hold any
more and interpretation of the interventions is a more complicated issue. Hence, a method
which allows the detection of interventions and estimation of their size is needed so that
structural changes can be identified successfully. Important steps to achieve this goal are
the following; see Chen and Liu (1993):
1. A suitable model for accommodating interventions in count time series data.
2. Derivation of test procedures for their successful detection.
3. Implementation of joint maximum likelihood estimation of model parameters and
outlier sizes.
4. Correction of the observed series for the detected interventions.
All these issues and possible directions for further developments of the methodology have
been addressed by Fokianos and Fried (2010, 2012) for the linear model (1.5) and the
log-linear model (1.8), under the Poisson assumption.
1.6.3 Robust Estimation
The previous work on intervention analysis is complemented by developing robust estimation procedures for count time series models. The works by El Saied (2012) and El Saied and Fried (2014) address this research topic in the context of the linear model (1.5) when $a_1 = 0$. In the context of the log-linear model (1.8), the work of Kitromilidou and Fokianos (2015) develops robust estimation for count time series by adopting the methods suggested by Künsch et al. (1989) and Cantoni and Ronchetti (2001). In particular, Cantoni and Ronchetti (2001) robustified the quasi-likelihood approach for estimating the regression coefficients of generalized linear models by considering robust deviances which are natural generalizations of the quasi-likelihood functions. The robustification proposed by Cantoni and Ronchetti (2001) is performed by bounding and centering the quasi-score function.
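As a rough, schematic sketch of what "bounding and centering" means, the code below Huberizes the Pearson residual of a Poisson observation and subtracts its expected value so that the resulting estimating function remains Fisher consistent. It omits the regression weights and the derivative $\partial\mu_t/\partial\theta$ that appear in the full Cantoni and Ronchetti (2001) estimating equation, and the tuning constant $c = 1.345$ is our own choice.

```python
import numpy as np
from scipy import stats

def huber_psi(r, c=1.345):
    """Huber psi function: bounds the influence of large residuals."""
    return np.clip(r, -c, c)

def robust_score_contribution(y, mu, c=1.345, k_max=200):
    """Schematic bounded-and-centred quasi-score term for a Poisson mean mu.

    Bounding: the Pearson residual is passed through the Huber psi function.
    Centring: subtracting E[psi_c(R)] under Poisson(mu) keeps the estimating
    function unbiased at the model (Fisher consistency).
    The derivative dmu/dtheta and the weights of the full Cantoni-Ronchetti
    estimating equation are omitted in this sketch.
    """
    r = (y - mu) / np.sqrt(mu)                        # Pearson residual
    k = np.arange(k_max + 1)                          # truncated Poisson support
    pmf = stats.poisson.pmf(k, mu=mu)
    expected = np.sum(huber_psi((k - mu) / np.sqrt(mu), c) * pmf)
    return (huber_psi(r, c) - expected) / np.sqrt(mu)
```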
1.6.4 Multivariate Count Time Series Models
Another interesting topic of research is the analysis of multivariate count time series mod-
els; see Liu (2012), Pedeli and Karlis (2013), and Section V of this volume which contains
many interesting results. The main issue in attacking the problem of multivariate count time series is that multivariate count distributions are quite complex to analyze by maximum likelihood methods.
Assume that $\{Y_t = (Y_{i,t}),\ t = 1, 2, \ldots, n\}$ denotes a $p$-dimensional count time series and suppose further that $\{\lambda_t = (\lambda_{i,t}),\ t = 1, 2, \ldots, n\}$ is a corresponding $p$-dimensional intensity process. Here the notation $p$ denotes dimension but not order as in (1.13). Then, a natural generalization of (1.5) is given by
$$Y_{i,t} = N_{i,t}(0, \lambda_{i,t}], \quad i = 1, 2, \ldots, p, \qquad \lambda_t = d + A\lambda_{t-1} + BY_{t-1}, \tag{1.25}$$
where $d$ is a $p$-dimensional vector and $A$, $B$ are $p \times p$ matrices, all of them unknown and to be estimated. Model (1.25) is a direct extension of the linear autoregressive model (1.5) and assumes that marginally the count process is Poisson distributed. However, the statistical problem of dealing with the joint distribution of the vector process $\{Y_t\}$ requires further research; some preliminary results about ergodicity and stationarity of (1.25) have been obtained by Liu (2012). More on multivariate models for count time series is given in the chapter by Karlis (2015; Chapter 19 in this volume), and an application is discussed by Ravishanker et al. (2015; Chapter 20 in this volume).
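As an illustration only, the following sketch simulates from (1.25) by drawing the components of $Y_t$ as conditionally independent Poisson variables given $\lambda_t$; this deliberately sidesteps the joint-distribution issue mentioned above, and the function name, burn-in length, and example parameter values are our own choices.

```python
import numpy as np

def simulate_mult_linear(d, A, B, n, burn_in=200, seed=0):
    """Simulate the p-dimensional linear model (1.25):
    lambda_t = d + A lambda_{t-1} + B Y_{t-1},
    with Y_{i,t} | lambda_t drawn as independent Poisson(lambda_{i,t}) for
    illustration only (the joint distribution of Y_t is the open question).
    """
    rng = np.random.default_rng(seed)
    d, A, B = map(np.asarray, (d, A, B))
    p = d.shape[0]
    lam = d.astype(float).copy()
    Y = np.zeros(p)
    out = np.empty((n, p), dtype=int)
    for t in range(burn_in + n):
        lam = d + A @ lam + B @ Y                 # intensity recursion of (1.25)
        Y = rng.poisson(lam)                      # componentwise Poisson counts
        if t >= burn_in:
            out[t - burn_in] = Y
    return out

# Hypothetical bivariate example (kept well inside the stability region):
# counts = simulate_mult_linear(d=np.array([0.5, 1.0]),
#                               A=np.array([[0.3, 0.0], [0.0, 0.25]]),
#                               B=np.array([[0.2, 0.1], [0.05, 0.3]]), n=500)
```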
1.6.5 Parameter-Driven Models
So far we have discussed models that fall under the framework of observation-driven mod-
els. This implies that even though the mean process $\{\lambda_t\}$ is not observed directly, it can still