the t is based on the Poisson distribution, for the linear and nonlinear models. Clearly, the
plots show deviations from the Poisson distribution, indicating underdispersed predictive
distributions. The right plots indicate no apparent deviations from uniformity; these plots
are based on the negative binomial distribution (1.4). Similar ndings were obtained after
tting the log-linear model (1.8) to the transactions data.
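PIT histograms of this kind are typically computed via the non-randomized PIT of Czado et al. (2009), using only the fitted one-step-ahead predictive c.d.f. The sketch below is a minimal illustration of that calculation; the array names (`y`, `lam_hat`, `nu_hat`) and the choice of ten bins are our own assumptions rather than part of the original analysis.

```python
import numpy as np
from scipy import stats

def pit_histogram(y, cdf, n_bins=10):
    """Non-randomized PIT histogram heights (Czado et al. 2009).

    y   : observed counts, shape (n,)
    cdf : callable; cdf(k) returns the predictive c.d.f. P_t(k) for each t, shape (n,)
    """
    y = np.asarray(y)
    # Predictive c.d.f. evaluated at y_t and at y_t - 1
    upper = cdf(y)
    lower = np.where(y > 0, cdf(y - 1), 0.0)

    u = np.linspace(0.0, 1.0, n_bins + 1)            # bin edges on [0, 1]
    # Conditional PIT F^(t)(u | y_t): 0 below lower, 1 above upper, linear in between
    F = np.clip((u[None, :] - lower[:, None]) /
                np.maximum(upper - lower, 1e-12)[:, None], 0.0, 1.0)
    F_bar = F.mean(axis=0)                            # mean PIT over the series
    return np.diff(F_bar)                             # bin heights; calibration ~ 1/n_bins

# Hypothetical usage with fitted one-step-ahead means lam_hat (and nu_hat for NegBin):
# heights_pois = pit_histogram(y, lambda k: stats.poisson.cdf(k, mu=lam_hat))
# heights_nb   = pit_histogram(y, lambda k: stats.nbinom.cdf(k, nu_hat,
#                                                            nu_hat / (nu_hat + lam_hat)))
```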
1.5.2 Assessment of Marginal Calibration
We now turn to the question of examining marginal calibration. We suppose that the observed time series $\{Y_t,\ t \geq 1\}$ is stationary with marginal c.d.f. $G(\cdot)$. In addition, we assume that we pick a probabilistic forecast in the form of a predictive c.d.f. $P_t(x) = P(Y_t \leq x \mid \mathcal{F}_{t-1}^{Y,\lambda})$. In our case, $P_t(\cdot)$ is either the c.d.f. of a Poisson random variable with mean $\hat{\lambda}_t$, or a negative binomial distribution evaluated at $\hat{\lambda}_t$ and $\hat{\nu}$. We follow Gneiting
et al. (2007) to assess marginal calibration by comparing the average predictive c.d.f.
$$\bar{P}(x) = \frac{1}{n}\sum_{t=1}^{n} P_t(x), \qquad x \in \mathbb{R},$$
to the empirical c.d.f. of the observations given by
$$\hat{G}(x) = \frac{1}{n}\sum_{t=1}^{n} \mathbb{1}(Y_t \leq x), \qquad x \in \mathbb{R}.$$
To display the marginal calibration plot, we plot the difference of the two c.d.f.s,
$$\bar{P}(x) - \hat{G}(x), \qquad x \in \mathbb{R}. \tag{1.23}$$
Figure 1.3 shows that the negative binomial assumption provides a better fit than the Poisson assumption for the transactions data. These figures were drawn by direct calculation of the average c.d.f. $\bar{P}$ and the empirical c.d.f. $\hat{G}$, as explained earlier.
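A minimal sketch of this calculation is given below. It assumes fitted one-step-ahead means `lam_hat` (and a dispersion estimate `nu_hat` for the negative binomial), which are not provided in the text, and simply averages the predictive c.d.f.s over time before subtracting the empirical c.d.f.

```python
import numpy as np
from scipy import stats

def marginal_calibration(y, lam_hat, nu_hat=None, x_max=30):
    """Return x and the difference P_bar(x) - G_hat(x) of equation (1.23).

    If nu_hat is None the Poisson predictive c.d.f. is used, otherwise the
    negative binomial c.d.f. with mean lam_hat and dispersion nu_hat.
    """
    y = np.asarray(y)
    lam_hat = np.asarray(lam_hat, dtype=float)
    x = np.arange(x_max + 1)

    if nu_hat is None:                                # Poisson forecast
        P_tx = stats.poisson.cdf(x[None, :], mu=lam_hat[:, None])
    else:                                             # negative binomial forecast
        prob = nu_hat / (nu_hat + lam_hat)            # success probability
        P_tx = stats.nbinom.cdf(x[None, :], nu_hat, prob[:, None])

    P_bar = P_tx.mean(axis=0)                         # average predictive c.d.f.
    G_hat = (y[:, None] <= x[None, :]).mean(axis=0)   # empirical c.d.f. of the data
    return x, P_bar - G_hat

# Hypothetical usage, producing the two curves of Figure 1.3:
# x, diff_pois = marginal_calibration(y, lam_hat)
# x, diff_nb   = marginal_calibration(y, lam_hat, nu_hat)
```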
1.5.3 Assessment of Sharpness
The assessment of sharpness is accomplished via scoring rules. These rules provide numerical scores and form summary measures for the assessment of the predictive performance. In addition, scoring rules help us to rank the competing forecast models. They are negatively oriented penalties that the forecaster wishes to minimize; see also Czado et al. (2009). Table 1.3 shows a few examples of scoring rules, following Czado et al. (2009). The calculation of all these scores requires an assumption on the conditional distribution of the process. The squared error score is identical for both the Poisson and the negative binomial distributions, since the conditional means are equal. Note that the normalized squared error score is formed by the Pearson residuals defined in (1.19).
FIGURE 1.3
Marginal calibration plot for the transactions data. (a) corresponds to model (1.5) and (b) corresponds to model (1.12). The solid line corresponds to the negative binomial prediction, while the dashed line is for the Poisson forecast. A similar plot is obtained for the case of the log-linear model (1.8) but is not shown.
TABLE 1.3
Definition of scoring rules

Scoring Rule                      Notation   Definition
Logarithmic score                 logs       $-\log p_y$
Quadratic or Brier score          qs         $-2p_y + \|p\|^2$
Spherical score                   sphs       $-p_y / \|p\|$
Ranked probability score          rps        $\sum_{x=0}^{\infty} \{P_t(x) - \mathbb{1}(Y_t \leq x)\}^2$
Dawid–Sebastiani score            dss        $\{(Y_t - \mu_{P_t})/\sigma_{P_t}\}^2 + 2\log \sigma_{P_t}$
Normalized squared error score    nses       $\{(Y_t - \mu_{P_t})/\sigma_{P_t}\}^2$
Squared error score               ses        $(Y_t - \mu_{P_t})^2$

Note: For notational purposes, set $p_y = P(Y_t = y \mid \mathcal{F}_{t-1}^{Y,\lambda})$, $\|p\|^2 = \sum_{y=0}^{\infty} p_y^2$, and $\mu_{P_t}$ and $\sigma_{P_t}$ are the mean and the standard deviation of the predictive distribution $P_t$, respectively.
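As a minimal illustration of how these scores can be computed, the sketch below evaluates the mean scores of Table 1.3 for a Poisson predictive distribution with one-step-ahead means `lam_hat`; the array names and the truncation point `k_max` for the infinite sums are our own assumptions.

```python
import numpy as np
from scipy import stats

def mean_scores(y, lam_hat, k_max=200):
    """Mean scoring rules of Table 1.3 for a Poisson predictive distribution.

    y       : observed counts, shape (n,)
    lam_hat : one-step-ahead predictive means, shape (n,)
    k_max   : truncation point for the sums over the support {0, 1, ...}
    """
    y = np.asarray(y)
    lam_hat = np.asarray(lam_hat, dtype=float)
    k = np.arange(k_max + 1)
    pmf = stats.poisson.pmf(k[None, :], mu=lam_hat[:, None])   # p_x for all x
    cdf = np.cumsum(pmf, axis=1)                                # P_t(x)
    p_y = stats.poisson.pmf(y, mu=lam_hat)                      # p_{y_t}
    norm2 = (pmf ** 2).sum(axis=1)                              # ||p||^2
    mu, sigma = lam_hat, np.sqrt(lam_hat)                       # Poisson mean and s.d.

    scores = {
        "logs": -np.log(p_y),
        "qs":   -2.0 * p_y + norm2,
        "sphs": -p_y / np.sqrt(norm2),
        "rps":  ((cdf - (y[:, None] <= k[None, :])) ** 2).sum(axis=1),
        "dss":  ((y - mu) / sigma) ** 2 + 2.0 * np.log(sigma),
        "nses": ((y - mu) / sigma) ** 2,
        "ses":  (y - mu) ** 2,
    }
    return {name: val.mean() for name, val in scores.items()}
```

Replacing the Poisson probabilities by negative binomial ones (with standard deviation $\sqrt{\hat{\lambda}_t + \hat{\lambda}_t^2/\hat{\nu}}$) gives the corresponding negative binomial rows of Table 1.4.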
Table 1.4 shows all scoring rules applied to the transactions data. It is clear that the negative binomial model fits these data considerably better than the Poisson model, regardless of the assumed model. We note again that all models yield almost identical scoring rules. This further supports our point that, when there exists positive persistent correlation in the data, all models will produce similar output.
For the transactions data, a simple linear model of the form (1.5) under the negative binomial assumption seems to describe the data adequately. This conclusion is a direct consequence of the earlier findings, with additional evidence provided by the values of the goodness-of-fit test (1.21) reported in Table 1.2.
TABLE 1.4
Scoring rules calculated for the transactions data after fitting the Linear Model (1.5), the Nonlinear Model (1.12) for γ = 0.5, and the Log-Linear Model (1.8)
Scoring Rules
Forecaster logs qs sphs rps dss nses ses
Linear model (1.5) Poisson 3.126 0.076 0.276 3.633 4.585 2.326 23.477
NegBin 2.902 0.080 0.292 3.284 4.112 0.993 23.477
Nonlinear model (1.12) Poisson 3.123 0.075 0.274 3.605 4.579 2.318 23.435
NegBin 2.901 0.080 0.289 3.267 4.107 0.985 23.435
Log-linear model (1.8) Poisson 3.144 0.081 0.286 3.764 4.633 2.376 23.894
NegBin 2.910 0.083 0.300 3.334 4.132 0.993 23.894
Note: The two forecasters are compared by the mean logarithmic, quadratic, spherical, ranked probability,
Dawid–Sebastiani, normalized squared error and squared error scores. Bold face numbers in each column
indicate the minimum value obtained between the two forecasters.
1.6 Other Topics
In this section, we will discuss other interesting research topics in the context of count
time series analysis. This list is not exhaustive, and several other interesting topics will be
covered in the following chapters. The list below reflects our personal research interests in the
the framework of generalized linear models.
1.6.1 Testing for Linearity
Consider the nonlinear model (1.11) and suppose that we are interested in testing the hypothesis $H_0\colon c_1 = 0$, which is equivalent to testing linearity of the model. This testing problem is not standard because, under the null hypothesis, the nonlinear parameter $\gamma$ is not identifiable. Furthermore, $c_1 = 0$ implies that the parameter is on the boundary of the parameter space. Similar comments can be made for model (1.12) when testing the hypothesis $H_0\colon \gamma = 0$, but without the additional challenge implied by the nonidentifiability issue. These types of testing problems have been discussed recently by Christou and Fokianos (2013).
Suppose that, in general, the vector of unknown parameters can be decomposed as $\theta = (\theta^{(1)}, \theta^{(2)})$ and let $S_n = (S_n^{(1)}, S_n^{(2)})$ be the corresponding partition of the score function. Consider testing $H_0\colon \theta^{(2)} = 0$ vs. $H_1\colon \theta^{(2)} > 0$, componentwise. This problem is attacked by using the score test statistic, which is given by
$$\mathrm{LM}_n = S_n^{(2)\,\prime}(\tilde{\theta}_n)\,\tilde{\Sigma}^{-1}(\tilde{\theta}_n)\,S_n^{(2)}(\tilde{\theta}_n),$$
where $\tilde{\theta}_n = (\tilde{\theta}_n^{(1)}, 0)$ is the QMLE of $\theta$ under the hypothesis and $\tilde{\Sigma}$ is an appropriate estimator for the covariance matrix $\Sigma = \operatorname{Var}\bigl(n^{-1/2}\, S_n^{(2)}(\tilde{\theta}_n)\bigr)$. If all the parameters are identified
under the null hypothesis, the score statistic $\mathrm{LM}_n$ follows asymptotically a $\chi^2_{m_2}$ distribution under the null, where $m_2 = \dim(\theta^{(2)})$ (Francq and Zakoïan 2010, Ch. 8). Model (1.12) belongs to this class.
When the parameters are not identified under the null, a supremum-type test statistic resolves this problem; see, for instance, Davies (1987). Consider model (1.11), for example, and let $\Gamma$ be a grid of values for the nuisance parameter, denoted by $\gamma$. Then the sup-score test statistic is given by
$$\mathrm{LM}_n = \sup_{\gamma \in \Gamma} \mathrm{LM}_n(\gamma).$$
Critical values of the test statistic can be based either on the asymptotic chi-square approximation or on a parametric bootstrap, as in the case of the test statistic (1.21).
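To make the procedure concrete, here is a minimal sketch of the sup-score test with parametric-bootstrap critical values. The helper functions `lm_stat` (returning $\mathrm{LM}_n(\gamma)$ for a given series) and `simulate_null` (generating a series from the model fitted under $H_0$) are hypothetical placeholders, not functions from the text.

```python
import numpy as np

def sup_score_test(y, lm_stat, simulate_null, gamma_grid, n_boot=499):
    """Sup-score linearity test with parametric-bootstrap critical values.

    lm_stat(y, gamma) : hypothetical user function returning LM_n(gamma)
    simulate_null()   : hypothetical user function simulating a series of the
                        same length from the model fitted under H0
    gamma_grid        : grid Gamma of values for the nuisance parameter gamma
    """
    observed = max(lm_stat(y, g) for g in gamma_grid)     # LM_n = sup over the grid

    boot = np.empty(n_boot)
    for b in range(n_boot):
        y_star = simulate_null()                           # bootstrap series under H0
        boot[b] = max(lm_stat(y_star, g) for g in gamma_grid)

    # Bootstrap p-value; reject H0 (linearity) for small values
    p_value = (1.0 + np.sum(boot >= observed)) / (n_boot + 1.0)
    return observed, p_value
```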
1.6.2 Intervention Analysis
Occasionally, some time series data may show that both the variation and the level of the data change during some specific time interval. Additionally, there might exist outlying values (unusual values) at some time points. This is the case for the campylobacterosis infections data reported from January 1990 to the end of October 2000 in the north of the Province of Québec, Canada; see Fokianos and Fried (2010, Fig. 1). It is natural to ask whether these fluctuations can be explained by (1.5) or whether the inclusion of some interventions will yield better results; see Box and Tiao (1975), Tsay (1986), and Chen and Liu (1993), among others.
Generally speaking, types of intervention effects on time series data are classified according to whether their impact is concentrated on a single or a few data points, or whether they affect the whole process from some specific time $t = \tau$ on. In classical linear time series methodology, an intervention effect is included in the observation equation by employing a sequence of deterministic covariates $\{X_t\}$ of the form
$$X_t = \xi(B) I_t(\tau), \tag{1.24}$$
where $\xi(B)$ is a polynomial operator, $B$ is the shift operator such that $B^i X_t = X_{t-i}$, and $I_t(\tau)$ is an indicator function, with $I_t(\tau) = 1$ if $t = \tau$ and $I_t(\tau) = 0$ if $t \neq \tau$. The choice of the operator $\xi(B)$ determines the kind of intervention effect: additive outlier (AO), transient shift (TS), level shift (LS), or innovational outlier (IO). Since models of the form (1.5) are not defined in terms of innovations, we focus on the first three types of interventions (but see Fried et al. 2015 for a Bayesian point of view).
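For illustration, the first three covariate types can be generated directly from (1.24) with the standard choices $\xi(B) = 1$ (AO), $\xi(B) = 1/(1-\delta B)$ with $0 < \delta < 1$ (TS), and $\xi(B) = 1/(1-B)$ (LS). The following is a minimal sketch; the function name and the default value of $\delta$ are our own choices.

```python
import numpy as np

def intervention_covariate(n, tau, kind="AO", delta=0.8):
    """Deterministic covariate X_t = xi(B) I_t(tau) of (1.24) for t = 1, ..., n.

    kind  : "AO" (additive outlier), "TS" (transient shift), "LS" (level shift)
    delta : decay rate of the transient shift, 0 < delta < 1 (our own default)
    """
    t = np.arange(1, n + 1)
    if kind == "AO":          # xi(B) = 1: a spike at t = tau only
        return (t == tau).astype(float)
    if kind == "TS":          # xi(B) = 1/(1 - delta B): geometric decay after tau
        return np.where(t >= tau, delta ** (t - tau), 0.0)
    if kind == "LS":          # xi(B) = 1/(1 - B): permanent level shift from tau on
        return (t >= tau).astype(float)
    raise ValueError("kind must be 'AO', 'TS', or 'LS'")
```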
However, a model like (1.5) is determined by a latent process. Therefore, a formal lin-
ear structure, as in the case of the Gaussian linear time series model, does not hold any
more and interpretation of the interventions is a more complicated issue. Hence, a method
which allows the detection of interventions and estimation of their size is needed so that
structural changes can be identified successfully. Important steps to achieve this goal are
the following; see Chen and Liu (1993):
1. A suitable model for accommodating interventions in count time series data.
2. Derivation of test procedures for their successful detection.
3. Implementation of joint maximum likelihood estimation of model parameters and
outlier sizes.
4. Correction of the observed series for the detected interventions.
All these issues and possible directions for further developments of the methodology have
been addressed by Fokianos and Fried (2010, 2012) for the linear model (1.5) and the
log-linear model (1.8), under the Poisson assumption.
1.6.3 Robust Estimation
The previous work on intervention analysis is complemented by developing robust estimation procedures for count time series models. The works by El Saied (2012) and El Saied and Fried (2014) address this research topic in the context of the linear model (1.5) when $a_1 = 0$. In the context of the log-linear model (1.8), the work of Kitromilidou and Fokianos (2015) develops robust estimation for count time series by adopting the methods suggested by Künsch et al. (1989) and Cantoni and Ronchetti (2001). In particular, Cantoni and Ronchetti (2001) robustified the quasi-likelihood approach for estimating the regression coefficients of generalized linear models by considering robust deviances which are natural generalizations of the quasi-likelihood functions. The robustification proposed by Cantoni and Ronchetti (2001) is performed by bounding and centering the quasi-score function.
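As a rough, schematic sketch of what "bounding and centering" means, the code below Huberizes the Pearson residual of a Poisson observation and subtracts its expected value so that the resulting estimating function remains Fisher consistent. It omits the regression weights and the derivative $\partial\mu_t/\partial\theta$ that appear in the full Cantoni and Ronchetti (2001) estimating equation, and the tuning constant $c = 1.345$ is our own choice.

```python
import numpy as np
from scipy import stats

def huber_psi(r, c=1.345):
    """Huber psi function: bounds the influence of large residuals."""
    return np.clip(r, -c, c)

def robust_score_contribution(y, mu, c=1.345, k_max=200):
    """Schematic bounded-and-centred quasi-score term for a Poisson mean mu.

    Bounding: the Pearson residual is passed through the Huber psi function.
    Centring: subtracting E[psi_c(R)] under Poisson(mu) keeps the estimating
    function unbiased at the model (Fisher consistency).
    The derivative dmu/dtheta and the weights of the full Cantoni-Ronchetti
    estimating equation are omitted in this sketch.
    """
    r = (y - mu) / np.sqrt(mu)                        # Pearson residual
    k = np.arange(k_max + 1)                          # truncated Poisson support
    pmf = stats.poisson.pmf(k, mu=mu)
    expected = np.sum(huber_psi((k - mu) / np.sqrt(mu), c) * pmf)
    return (huber_psi(r, c) - expected) / np.sqrt(mu)
```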
1.6.4 Multivariate Count Time Series Models
Another interesting topic of research is the analysis of multivariate count time series mod-
els; see Liu (2012), Pedeli and Karlis (2013), and Section V of this volume which contains
many interesting results. The main issue in attacking the problem of multivariate count time series is that multivariate count distributions are quite complex to analyze by maximum likelihood methods.
Assume that $\{Y_t = (Y_{i,t}),\ t = 1, 2, \ldots, n\}$ denotes a $p$-dimensional count time series and suppose further that $\{\lambda_t = (\lambda_{i,t}),\ t = 1, 2, \ldots, n\}$ is a corresponding $p$-dimensional intensity process. Here the notation $p$ denotes dimension but not order as in (1.13). Then, a natural generalization of (1.5) is given by
$$Y_{i,t} = N_{i,t}(0, \lambda_{i,t}], \quad i = 1, 2, \ldots, p, \qquad \lambda_t = d + A\lambda_{t-1} + BY_{t-1}, \tag{1.25}$$
where $d$ is a $p$-dimensional vector and $A$, $B$ are $p \times p$ matrices, all of them unknown and to be estimated. Model (1.25) is a direct extension of the linear autoregressive model (1.5) and assumes that marginally the count process is Poisson distributed. However, the statistical problem of dealing with the joint distribution of the vector process $\{Y_t\}$ requires further research; some preliminary results about ergodicity and stationarity of (1.25) have been obtained by Liu (2012). More on multivariate models for count time series is given in the chapter by Karlis (2015; Chapter 19 in this volume), and an application is discussed by Ravishanker et al. (2015; Chapter 20 in this volume).
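As an illustration only, the following sketch simulates from (1.25) by drawing the components of $Y_t$ as conditionally independent Poisson variables given $\lambda_t$; this deliberately sidesteps the joint-distribution issue mentioned above, and the function name, burn-in length, and example parameter values are our own choices.

```python
import numpy as np

def simulate_mult_linear(d, A, B, n, burn_in=200, seed=0):
    """Simulate the p-dimensional linear model (1.25):
    lambda_t = d + A lambda_{t-1} + B Y_{t-1},
    with Y_{i,t} | lambda_t drawn as independent Poisson(lambda_{i,t}) for
    illustration only (the joint distribution of Y_t is the open question).
    """
    rng = np.random.default_rng(seed)
    d, A, B = map(np.asarray, (d, A, B))
    p = d.shape[0]
    lam = d.astype(float).copy()
    Y = np.zeros(p)
    out = np.empty((n, p), dtype=int)
    for t in range(burn_in + n):
        lam = d + A @ lam + B @ Y                 # intensity recursion of (1.25)
        Y = rng.poisson(lam)                      # componentwise Poisson counts
        if t >= burn_in:
            out[t - burn_in] = Y
    return out

# Hypothetical bivariate example (kept well inside the stability region):
# counts = simulate_mult_linear(d=np.array([0.5, 1.0]),
#                               A=np.array([[0.3, 0.0], [0.0, 0.25]]),
#                               B=np.array([[0.2, 0.1], [0.05, 0.3]]), n=500)
```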
1.6.5 Parameter-Driven Models
So far we have discussed models that fall under the framework of observation-driven mod-
els. This implies that even though the mean process $\{\lambda_t\}$ is not observed directly, it can still