Handbook of Discrete-Valued Time Series
Model Validation and Diagnostics
FIGURE 9.5
Time series plot (a) and ACF plots (b) of the component residuals from fitting a PINAR(1) model to the cuts data.
9.3.3 Overdispersion and the Information Matrix Test
A very general procedure to check if a model is correctly specified, the Information Matrix (IM) test, was proposed by White (1982). Basically, the IM test checks to see if the log-likelihood equality

$$E\!\left[\frac{\partial^2 \ell}{\partial \theta^2}\right] = -\,\mathrm{Var}\!\left[\frac{\partial \ell}{\partial \theta}\right]$$

holds, as it should in a well-specified model. Here, $\ell$ is the log-likelihood of the model and θ is the parameter (which could be a vector). Another motivation for this test was given by Chesher (1983), who considered the parameter θ, under the misspecified alternative, to be random. Hence, the model is well specified when the variance of θ, Var(θ), is zero, and checking Var[θ] = 0 is essentially the IM test. McCabe and Leybourne (2000) show that such tests, in the multiparameter case, are locally mean most powerful.
Freeland and McCabe (2004) investigated the behavior of the IM test in the context of integer time series with a focus on a random coefficient interpretation. Suppose a PINAR(1) model with fixed coefficients is to be tested against an alternative with random coefficients, where the thinning and arrivals parameters are considered random with beta and gamma distributions, respectively. Then, if the appropriate IM test rejects, a model with greater dispersion would be a natural candidate for consideration as a new specification.
When (as in the example of the previous paragraph) the parameter vector can be partitioned into natural subgroups, it turns out that the IM test can also be decomposed into subtests, each associated with a natural subgrouping of the parameters. Thus, for example, in the case of the INAR(p) model, there is a subtest associated with the parameters
of the thinning operators and another associated with the arrivals process. Unfortunately, the combined IM test is often not very useful in practice as it may be difficult to interpret a rejection constructively and, even when it does not reject, it may obscure behavior in the individual components. Hence, it is more productive to apply the subtests individually.
Specializing to the case in (9.2), consider the parameters to be random and suppose that the sequences $\{\alpha_t\}_{t=1}^{T}$ and $\{\lambda_t\}_{t=1}^{T}$ are i.i.d. with means $E[\alpha_t] = \alpha$ and $E[\lambda_t] = \lambda$. Each of $\alpha_t$ and $\lambda_t$ may be vectors, but we do not complicate the notation by treating them explicitly as such. The IM procedure tests the hypothesis that $\mathrm{Var}[\alpha_t] = \mathrm{Var}[\lambda_t] = 0$, $t = 1, 2, \ldots, T$, against the alternative that at least one of them is not zero. The subtest associated with the thinning parameters is given by

$$U_D = \sum_{t=1}^{T} u_{D,t} = \sum_{t=1}^{T} \left( \dot{\ell}_{\alpha_t}^{2} + \ddot{\ell}_{\alpha_t} \right),$$

where we denote the score of the model with respect to $\alpha_t$ by $\dot{\ell}_{\alpha_t}$ and the second derivatives of the log-likelihood by $\ddot{\ell}_{\alpha_t}$. We evaluate these derivatives at the mean values, that is, using $\dot{\ell}_{\alpha_t}^{2}\big|_{\alpha_t=\alpha,\;\lambda_t=\lambda}$ and $\ddot{\ell}_{\alpha_t}\big|_{\alpha_t=\alpha,\;\lambda_t=\lambda}$. In the vector parameter case, $\dot{\ell}_{\alpha_t}^{2}$ and $\ddot{\ell}_{\alpha_t}$ are matrices and their traces should be used in the computations. Since terms like $\dot{\ell}_{\alpha_t}^{2}$ depend on the parameters, estimates are required to implement these tests. Hence, we construct $\hat{u}_{D,t}$ based on estimated quantities using $\dot{\ell}_{\alpha_t}^{2}\big|_{\alpha_t=\hat{\alpha},\;\lambda_t=\hat{\lambda}}$ and $\ddot{\ell}_{\alpha_t}\big|_{\alpha_t=\hat{\alpha},\;\lambda_t=\hat{\lambda}}$. By means of the martingale properties of the likelihood process, it usually follows, under mild regularity, that

$$\hat{U}_D = s_D^{-1} \sum_{t=1}^{T} \hat{u}_{D,t} \xrightarrow{\;d\;} N(0, 1), \qquad s_D^{2} = \sum_{t=1}^{T} \hat{u}_{D,t}^{2}.$$

Similarly, the subtest associated with the arrivals process is

$$U_A = \sum_{t=1}^{T} u_{A,t} = \sum_{t=1}^{T} \left( \dot{\ell}_{\lambda_t}^{2} + \ddot{\ell}_{\lambda_t} \right),$$

and this may be implemented with $\hat{u}_{A,t}$ estimated using $\dot{\ell}_{\lambda_t}^{2}\big|_{\alpha_t=\hat{\alpha},\;\lambda_t=\hat{\lambda}}$ and $\ddot{\ell}_{\lambda_t}\big|_{\alpha_t=\hat{\alpha},\;\lambda_t=\hat{\lambda}}$. It also follows that

$$\hat{U}_A = s_A^{-1} \sum_{t=1}^{T} \hat{u}_{A,t} \xrightarrow{\;d\;} N(0, 1), \qquad s_A^{2} = \sum_{t=1}^{T} \hat{u}_{A,t}^{2}.$$

See Freeland and McCabe (2004), Sec. 5 for further details.
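As a concrete, if simplified, illustration of such a subtest, the following sketch is ours rather than code from the chapter: it assumes an i.i.d. Poisson model in place of the PINAR(1) conditional likelihood, so the score and second derivative with respect to λ have closed forms, ℓ̇ = x/λ − 1 and ℓ̈ = −x/λ².

```python
import math

def im_subtest(counts):
    """Standardized IM subtest statistic for an i.i.d. Poisson model.

    At the MLE lam = mean(counts), each observation contributes
    u_t = (score)^2 + (second derivative) = (x_t/lam - 1)^2 - x_t/lam^2,
    and U_hat = sum(u_t) / sqrt(sum(u_t^2)) is asymptotically N(0, 1)
    under correct (equi-dispersed) specification.
    """
    T = len(counts)
    lam = sum(counts) / T  # MLE of the Poisson mean
    u = [(x / lam - 1.0) ** 2 - x / lam ** 2 for x in counts]
    s = math.sqrt(sum(ut ** 2 for ut in u))
    return sum(u) / s

# sum(u_t) is proportional to (sample variance - sample mean), so
# overdispersed counts give a positive statistic and underdispersed
# counts a negative one.
print(im_subtest([0, 0, 0, 0, 8, 9, 7, 0, 10, 0, 9, 8]) > 0)  # True
print(im_subtest([3, 3, 3, 3, 3, 4, 3, 3, 4, 3, 3, 3]) < 0)   # True
```

The sign interpretation mirrors the chapter's reading of a rejection: a large positive statistic points toward a specification with greater dispersion.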
The use of the IM test is again illustrated by reference to a PINAR(1) model fitted to the cuts data. Relevant computations show that the p-values of the IM tests, $\hat{U}_D$ and $\hat{U}_A$, for the departure and arrival processes are 0.0250 and 0.0159, respectively. The low p-values indicate that there is more variation in the data than is described by the model. The relatively larger p-value for the departure process may indicate that a greater problem exists for the arrival process. Note again the pervasive suggestion of seasonality in these monthly data, an issue revisited in Section 9.4.3.
9.4 Analyses Based on the Predictive Distributions
The literature on probabilistic forecasting has developed a number of tools to compare and evaluate predictive distributions; see, for example, Diebold et al. (1998). These tools can also be fruitfully employed in diagnostic checking, as proposed by Gneiting et al.
(2007) or Jung and Tremayne (2011a). Following Geweke and Amisano (2010), we first discuss the probability integral transform (PIT) method as a means of assessing the absolute
performance of models. Then, in the second subsection, the relative performance of models within a group of competing ones is addressed using scoring rules and information criteria.
9.4.1 PIT Histograms for Discrete Data
The use of the PIT as a method for assessing the adequacy of distributional assumptions
for a model dates back at least to the work of Dawid (1984) and exploits Rosenblatt’s (1952)
transformation of an absolutely continuous (conditional) distribution into a uniform distribution. To be more specific, define the random variable $u_t$ on the basis of the cumulative predictive distribution $F_c(\cdot)$ corresponding to the true DGP, $u_t = F_c(X_t \mid \mathcal{F}_{t-1})$. Under correct specification of the predictive distribution, the series of PIT random variables $\{u_t\}$ are i.i.d. standard uniform (0, 1). If a specified model does not correspond to the true DGP and has cumulative predictive distribution $F(X_t \mid \mathcal{F}_{t-1})$, departures from this behavior can be expected. Diebold et al. (1998) discussed various ways in which the uniformity of $\{u_t\}$ from any model may be assessed. These are divided into two categories: those that are designed to check the unconditional uniformity of the $\{u_t\}$ and those that check whether the PIT series is i.i.d. The former is typically checked in an informal way by plotting the empirical cumulative distribution function of $\{u_t\}$ and comparing it to the identity function, or by constructing a histogram of the $\{u_t\}$ and checking for uniformity. Diebold et al. (1998, p. 869) also argued that formal tests of whether the $\{u_t\}$ are i.i.d. U(0, 1) using, perhaps, the Kolmogorov–Smirnov or the Cramér–von Mises test are readily available but are nonconstructive and, therefore, of little practical value.
If the $\{u_t\}$ are not uniformly distributed, the nature of the deviation from uniformity can be informative. An obvious tool for analyzing the conformity with the i.i.d. assumption for the $\{u_t\}$ is to examine their autocorrelation structure. Incidentally, Berkowitz (2001) suggested transforming the $\{u_t\}$ to standard normal variables by means of the inverse of the Gaussian distribution function. Then a quantile–quantile (QQ) plot against the standard normal distribution assesses the distributional fit of different models, although we do not pursue this proposal here.
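Berkowitz's transformation can be sketched with the standard library; this is our illustration, and the PIT values below are made-up numbers rather than output from any fitted model:

```python
import math
from statistics import NormalDist

# Berkowitz (2001): map PIT values u_t in (0, 1) to z_t = Phi^{-1}(u_t),
# the inverse of the standard normal distribution function. Under correct
# specification the z_t are i.i.d. N(0, 1), so a QQ plot of the z_t
# against normal quantiles assesses distributional fit.
u = [0.12, 0.55, 0.83, 0.31, 0.97, 0.45]  # hypothetical PIT series
z = [NormalDist().inv_cdf(ut) for ut in u]
print(all(math.isfinite(zt) for zt in z))  # True
```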
In the context of discrete distributions, some modifications to standard methods are required, because predictive cumulative distribution functions are step functions. Two methods have been proposed in the literature. The first of these, suggested by Denuit and Lambert (2005), is the so-called randomized PIT, obtained by perturbing the step-function nature of the distribution function for discrete random variables, rather in the manner that a hypothesis testing procedure might use a randomization device to yield a test with a prespecified significance level in the context of discrete variables. A random draw, υ, from a (standard) uniform distribution is used to construct a randomized PIT based on the distribution function of observed counts $x_t$ from

$$u_t^{+} = F(x_t - 1 \mid \mathcal{F}_{t-1}) + \upsilon \left[ F(x_t \mid \mathcal{F}_{t-1}) - F(x_t - 1 \mid \mathcal{F}_{t-1}) \right]. \tag{9.10}$$

Here, F(·) remains the predictive cumulative distribution of the counts from some model and F(−1) = 0; compare to Czado et al. (2009). As in the case of a continuous cumulative predictive function, if the model is correctly specified, $u_t^{+}$ is a serially independent random variable following a uniform distribution on the interval [0, 1].
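Equation (9.10) can be sketched as follows; in this illustration of ours, a Poisson predictive with a fixed λ stands in for a model's conditional distributions, and the counts are made up:

```python
import math
import random

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); returns 0 for k < 0, so F(-1) = 0."""
    if k < 0:
        return 0.0
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))

def randomized_pit(counts, lam, rng):
    """Randomized PIT of eq. (9.10): u_t^+ = F(x_t - 1) + v * [F(x_t) - F(x_t - 1)]."""
    out = []
    for x in counts:
        lo, hi = poisson_cdf(x - 1, lam), poisson_cdf(x, lam)
        out.append(lo + rng.random() * (hi - lo))  # v ~ U(0, 1)
    return out

rng = random.Random(42)  # seeded so the draw is reproducible
u_plus = randomized_pit([2, 0, 3, 1, 5, 2, 1], lam=2.0, rng=rng)
print(all(0.0 <= u <= 1.0 for u in u_plus))  # True: PIT values lie in [0, 1]
```

Seeding the generator matters here, since (as noted below) the resulting histogram can vary noticeably between sets of random draws in small samples.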
The assessment may be carried out by constructing histograms from the $\{u_t^{+}\}$; see, for example, Jung and Tremayne (2011a), who employ this method in the context of count time series models. They find that, for data sets of small sample size, the shape of the PIT histogram can change appreciably between sets of random draws of υ. One approach to overcoming this is to average out this effect by using N independent sets of random uniform numbers to compute N PIT histograms. The average histogram bin heights over these N replications could then be used. But this method is, effectively, indistinguishable from the nonrandomized PIT, to be discussed next, rendering further discussion of the approach redundant.
An alternative method of computing the PIT for a discrete cumulative predictive distribution has been proposed by Czado et al. (2009). It utilizes the distribution function observed at the count $x_t$ via

$$F^{(t)}(u \mid \mathcal{F}_{t-1}) = \begin{cases} 0, & u \le F(x_t - 1 \mid \mathcal{F}_{t-1}) \\[4pt] \dfrac{u - F(x_t - 1 \mid \mathcal{F}_{t-1})}{F(x_t \mid \mathcal{F}_{t-1}) - F(x_t - 1 \mid \mathcal{F}_{t-1})}, & F(x_t - 1 \mid \mathcal{F}_{t-1}) \le u \le F(x_t \mid \mathcal{F}_{t-1}) \\[4pt] 1, & u \ge F(x_t \mid \mathcal{F}_{t-1}). \end{cases} \tag{9.11}$$

The assessment of the distributional assumption can be carried out by aggregating over the set of T − p predictions for all observed counts and comparing the mean PIT

$$F_m(u) = (T - p)^{-1} \sum_{t=p+1}^{T} F^{(t)}(u \mid \mathcal{F}_{t-1}), \qquad 0 \le u \le 1,$$

to the cumulative distribution function of a standard uniform random variable. The computation of the relevant predictive cumulative distributions used in (9.10) and (9.11) is straightforward for most time series models for count data.
We operationalize the procedure by plotting a (nonrandomized) PIT histogram obtained by allocating the values to J equally spaced bins, j = 1, ..., J, where the height of the jth bin is computed from $f_j = F_m(j/J) - F_m((j-1)/J)$, and checking for uniformity. We supplement such a graph by incorporating approximate 100(1 − α)% confidence intervals (here we use α = 0.05) obtained from a standard χ² goodness-of-fit test of the null hypothesis that the J bins of the histogram are drawn from a uniform distribution. Under the null hypothesis, the statistic, G, is asymptotically χ²(J − 1) distributed; we set J to 10. Alternatively, $F_m(u)$ may be plotted against u and checked for deviations from the identity function; this requires no (arbitrary) choice of the number of bins J.
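The nonrandomized PIT of (9.11) and the bin heights $f_j$ can be sketched as follows; again this is our illustration, with a Poisson predictive with fixed λ and invented counts standing in for a fitted model's conditional distributions:

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); F(-1) = 0."""
    if k < 0:
        return 0.0
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))

def conditional_pit(u, x, lam):
    """F^(t)(u | F_{t-1}) of eq. (9.11) for one observed count x."""
    lo, hi = poisson_cdf(x - 1, lam), poisson_cdf(x, lam)
    if u <= lo:
        return 0.0
    if u >= hi:
        return 1.0
    return (u - lo) / (hi - lo)

def pit_histogram(counts, lam, J=10):
    """Mean PIT F_m on a grid, then bin heights f_j = F_m(j/J) - F_m((j-1)/J)."""
    def F_m(u):
        return sum(conditional_pit(u, x, lam) for x in counts) / len(counts)
    return [F_m(j / J) - F_m((j - 1) / J) for j in range(1, J + 1)]

bins = pit_histogram([2, 0, 3, 1, 5, 2, 1, 0, 4, 2], lam=2.0)
# Since F_m(0) = 0 and F_m(1) = 1, the J bin heights sum to one.
print(round(sum(bins), 10))  # 1.0
```

A U-shaped profile in these bin heights, as in the cuts data below, signals that the predictive distribution is underdispersed relative to the data.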
Figure 9.6 provides a summary of the PIT-based diagnostics resulting from fitting a PINAR(1) model to each of the cuts and the iceberg order data sets. Based on the earlier discussion, we present the nonrandomized variant of the PIT histogram in the figure. Both the PIT histogram and the $F_m(u)$ chart for the cuts data suggest misspecification of the distributional assumption. The pronounced U-shaped pattern of the PIT histogram indicates that not enough dispersion has been allowed for in the conditional distribution. This is mildly supported by the results for the uniformity test statistic for the PIT histogram for the cuts data, which is 14.8053 (p-value: 0.0964). In the case of the iceberg order data, the PIT histogram is closer to a uniform distribution, although a slight U-shaped pattern is evident. The uniformity test statistic for the iceberg orders is 9.4657 (p-value: 0.3954) and, therefore, does not reject the null hypothesis of a uniform PIT histogram at any conventional significance level. For both data sets, the correlograms of the (centered) $\{u_t^{+}\}$ provide similar qualitative information to those of the Pearson residuals of Figure 9.4 and are indicative of some misspecification of the dependence structure.

It is, therefore, evident from all the graphs from both residual and predictive distribution analyses for both data sets that the basic PINAR(1) model does not provide a satisfactory fit for either. In a modeling exercise seeking data-coherent specifications for these data, some model refinement is clearly called for. This may lead a researcher to need to choose between
[Figure 9.6 panels, for each data set: PIT histogram; $F_m(u^*)$ plotted against $u^*$; ACF of $\{u_t^{+}\}$ against lags.]
FIGURE 9.6
PIT-based diagnostics after fitting a PINAR(1) model to data: cuts (a) and iceberg order (b).