Handbook of Discrete-Valued Time Series
Model Validation and Diagnostics
FIGURE 9.5
Time series plot (a) and ACF plots (b) of the component residuals from fitting a PINAR(1) model to the cuts data.
9.3.3 Overdispersion and the Information Matrix Test
A very general procedure to check if a model is correctly specified, the Information Matrix (IM) test, was proposed by White (1982). Basically, the IM test checks to see if the log-likelihood equality

$$E\!\left[\frac{\partial^2 \ell}{\partial \theta^2}\right] = -\,\mathrm{Var}\!\left[\frac{\partial \ell}{\partial \theta}\right]$$

holds, as it should in a well-specified model. Here, $\ell$ is the log-likelihood of the model and θ is the parameter (which could be a vector). Another motivation for this test was given by Chesher (1983), who considered the parameter θ, under the misspecified alternative, to be random. Hence, the model is well specified when the variance of θ, Var(θ), is zero, and checking Var[θ] = 0 is essentially the IM test. McCabe and Leybourne (2000) show that such tests, in the multiparameter case, are locally mean most powerful.
Freeland and McCabe (2004) investigated the behavior of the IM test in the context of integer time series with a focus on a random coefficient interpretation. Suppose a PINAR(1) model with fixed coefficients is to be tested against an alternative with random coefficients, where the thinning and arrivals parameters are considered random with beta and gamma distributions, respectively. Then, if the appropriate IM test rejects, a model with greater dispersion would be a natural candidate for consideration as a new specification.
When (as in the example of the previous paragraph) the parameter vector can be partitioned into natural subgroups, it turns out that the IM test can also be decomposed into subtests, each associated with a natural subgrouping of the parameters. Thus, for example, in the case of the INAR(p) model, there is a subtest associated with the parameters
of the thinning operators and another associated with the arrivals process. Unfortunately, the combined IM test is often not very useful in practice as it may be difficult to interpret a rejection constructively and, even when it does not reject, it may obscure behavior in the individual components. Hence, it is more productive to apply the subtests individually.
Specializing to the case in (9.2), consider the parameters to be random and suppose that the sequences $\{\alpha_t\}_{t=1}^{T}$ and $\{\lambda_t\}_{t=1}^{T}$ are i.i.d. with means $E[\alpha_t] = \alpha$ and $E[\lambda_t] = \lambda$. Each of $\alpha_t$ and $\lambda_t$ may be vectors, but we do not complicate the notation by treating them explicitly as such. The IM procedure tests the hypothesis that $\mathrm{Var}[\alpha_t] = \mathrm{Var}[\lambda_t] = 0$, $t = 1, 2, \ldots, T$, against the alternative that at least one of them is not zero. The subtest associated with the thinning parameters is given by

$$U_D = \sum_{t=1}^{T} u_{D,t} = \sum_{t=1}^{T} \left( \dot{\ell}_{\alpha_t}^{2} + \ddot{\ell}_{\alpha_t} \right),$$

where we denote the score of the model with respect to $\alpha_t$ by $\dot{\ell}_{\alpha_t}$ and the second derivatives of the log-likelihood by $\ddot{\ell}_{\alpha_t}$. We evaluate these derivatives at the mean values, that is, using $\dot{\ell}_{\alpha_t}^{2}\big|_{\alpha_t=\alpha,\;\lambda_t=\lambda}$ and $\ddot{\ell}_{\alpha_t}\big|_{\alpha_t=\alpha,\;\lambda_t=\lambda}$. In the vector parameter case, $\dot{\ell}_{\alpha_t}^{2}$ and $\ddot{\ell}_{\alpha_t}$ are matrices and their traces should be used in the computations. Since terms like $\dot{\ell}_{\alpha_t}^{2}$ depend on the parameters, estimates are required to implement these tests. Hence, we construct $\hat{u}_{D,t}$ based on estimated quantities using $\dot{\ell}_{\alpha_t}^{2}\big|_{\alpha_t=\hat{\alpha},\;\lambda_t=\hat{\lambda}}$ and $\ddot{\ell}_{\alpha_t}\big|_{\alpha_t=\hat{\alpha},\;\lambda_t=\hat{\lambda}}$. By means of the martingale properties of the likelihood process, it usually follows, under mild regularity, that

$$\hat{U}_D = s_D^{-1} \sum_{t=1}^{T} \hat{u}_{D,t} \xrightarrow{\;d\;} N(0, 1), \qquad s_D^{2} = \sum_{t=1}^{T} \hat{u}_{D,t}^{2}.$$

Similarly, the subtest associated with the arrivals process is

$$U_A = \sum_{t=1}^{T} u_{A,t} = \sum_{t=1}^{T} \left( \dot{\ell}_{\lambda_t}^{2} + \ddot{\ell}_{\lambda_t} \right),$$

and this may be implemented with $\hat{u}_{A,t}$ estimated using $\dot{\ell}_{\lambda_t}^{2}\big|_{\alpha_t=\hat{\alpha},\;\lambda_t=\hat{\lambda}}$ and $\ddot{\ell}_{\lambda_t}\big|_{\alpha_t=\hat{\alpha},\;\lambda_t=\hat{\lambda}}$. It also follows that

$$\hat{U}_A = s_A^{-1} \sum_{t=1}^{T} \hat{u}_{A,t} \xrightarrow{\;d\;} N(0, 1), \qquad s_A^{2} = \sum_{t=1}^{T} \hat{u}_{A,t}^{2}.$$

See Freeland and McCabe (2004), Sec. 5 for further details.
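As a concrete, if simplified, illustration of such a subtest, the following sketch is ours rather than code from the chapter: it assumes an i.i.d. Poisson model in place of the PINAR(1) conditional likelihood, so the score and second derivative with respect to λ have closed forms, ℓ̇ = x/λ − 1 and ℓ̈ = −x/λ².

```python
import math

def im_subtest(counts):
    """Standardized IM subtest statistic for an i.i.d. Poisson model.

    At the MLE lam = mean(counts), each observation contributes
    u_t = (score)^2 + (second derivative) = (x_t/lam - 1)^2 - x_t/lam^2,
    and U_hat = sum(u_t) / sqrt(sum(u_t^2)) is asymptotically N(0, 1)
    under correct (equi-dispersed) specification.
    """
    T = len(counts)
    lam = sum(counts) / T  # MLE of the Poisson mean
    u = [(x / lam - 1.0) ** 2 - x / lam ** 2 for x in counts]
    s = math.sqrt(sum(ut ** 2 for ut in u))
    return sum(u) / s

# sum(u_t) is proportional to (sample variance - sample mean), so
# overdispersed counts give a positive statistic and underdispersed
# counts a negative one.
print(im_subtest([0, 0, 0, 0, 8, 9, 7, 0, 10, 0, 9, 8]) > 0)  # True
print(im_subtest([3, 3, 3, 3, 3, 4, 3, 3, 4, 3, 3, 3]) < 0)   # True
```

The sign interpretation mirrors the chapter's reading of a rejection: a large positive statistic points toward a specification with greater dispersion.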
The use of the IM test is again illustrated by reference to a PINAR(1) model fitted to the cuts data. Relevant computations show that the p-values of the IM tests, $\hat{U}_D$ and $\hat{U}_A$, for the departure and arrival processes are 0.0250 and 0.0159, respectively. The low p-values indicate that there is more variation in the data than is described by the model. The relatively larger p-value for the departure process may indicate that a greater problem exists for the arrival process. Note again the pervasive suggestion of seasonality in these monthly data, an issue revisited in Section 9.4.3.
9.4 Analyses Based on the Predictive Distributions
The literature on probabilistic forecasting has developed a number of tools to compare and evaluate predictive distributions; see, for example, Diebold et al. (1998). These tools can also be fruitfully employed in diagnostic checking, as proposed by Gneiting et al.
(2007) or Jung and Tremayne (2011a). Following Geweke and Amisano (2010), we first discuss the probability integral transform (PIT) method as a means of assessing the absolute
performance of models. Then, in the second subsection, the relative performance of models within a group of competing ones is addressed using scoring rules and information criteria.
9.4.1 PIT Histograms for Discrete Data
The use of the PIT as a method for assessing the adequacy of distributional assumptions
for a model dates back at least to the work of Dawid (1984) and exploits Rosenblatt’s (1952)
transformation of an absolutely continuous (conditional) distribution into a uniform distribution. To be more specific, define the random variable $u_t$ on the basis of the cumulative predictive distribution $F_c(\cdot)$ corresponding to the true DGP, $u_t = F_c(X_t \mid \mathcal{F}_{t-1})$. Under correct specification of the predictive distribution, the series of PIT random variables $\{u_t\}$ are i.i.d. standard uniform (0, 1). If a specified model does not correspond to the true DGP and has cumulative predictive distribution $F(X_t \mid \mathcal{F}_{t-1})$, departures from this behavior can be expected. Diebold et al. (1998) discussed various ways in which the uniformity of $\{u_t\}$ from any model may be assessed. These are divided into two categories: those that are designed to check the unconditional uniformity of the $\{u_t\}$ and those that check whether the PIT series is i.i.d. The former is typically checked in an informal way by plotting the empirical cumulative distribution function of $\{u_t\}$ and comparing it to the identity function, or by constructing a histogram of the $\{u_t\}$ and checking for uniformity. Diebold et al. (1998, p. 869) also argued that formal tests of whether the $\{u_t\}$ are i.i.d. U(0, 1) using, perhaps, the Kolmogorov–Smirnov or the Cramér–von Mises test are readily available but are nonconstructive and, therefore, of little practical value.
If the $\{u_t\}$ are not uniformly distributed, the nature of the deviation from uniformity can be informative. An obvious tool for analyzing the conformity with the i.i.d. assumption for the $\{u_t\}$ is to examine their autocorrelation structure. Incidentally, Berkowitz (2001) suggested transforming the $\{u_t\}$ to standard normal variables by means of the inverse of the Gaussian distribution function. Then a quantile–quantile (QQ) plot against the standard normal distribution assesses the distributional fit of different models, although we do not pursue this proposal here.
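Berkowitz's transformation can be sketched with the standard library; this is our illustration, and the PIT values below are made-up numbers rather than output from any fitted model:

```python
import math
from statistics import NormalDist

# Berkowitz (2001): map PIT values u_t in (0, 1) to z_t = Phi^{-1}(u_t),
# the inverse of the standard normal distribution function. Under correct
# specification the z_t are i.i.d. N(0, 1), so a QQ plot of the z_t
# against normal quantiles assesses distributional fit.
u = [0.12, 0.55, 0.83, 0.31, 0.97, 0.45]  # hypothetical PIT series
z = [NormalDist().inv_cdf(ut) for ut in u]
print(all(math.isfinite(zt) for zt in z))  # True
```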
In the context of discrete distributions, some modifications to standard methods are required, because predictive cumulative distribution functions are step functions. Two methods have been proposed in the literature. The first of these, suggested by Denuit and Lambert (2005), is the so-called randomized PIT, obtained by perturbing the step-function nature of the distribution function for discrete random variables, rather in the manner that a hypothesis testing procedure might use a randomization device to yield a test with a prespecified significance level in the context of discrete variables. A random draw, υ, from a (standard) uniform distribution is used to construct a randomized PIT based on the distribution function of observed counts $x_t$ from

$$u_t^{+} = F(x_t - 1 \mid \mathcal{F}_{t-1}) + \upsilon \left[ F(x_t \mid \mathcal{F}_{t-1}) - F(x_t - 1 \mid \mathcal{F}_{t-1}) \right]. \tag{9.10}$$

Here, F(·) remains the predictive cumulative distribution of the counts from some model and F(−1) = 0; compare to Czado et al. (2009). As in the case of a continuous cumulative predictive function, if the model is correctly specified, $u_t^{+}$ is a serially independent random variable following a uniform distribution on the interval [0, 1].
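Equation (9.10) can be sketched as follows; in this illustration of ours, a Poisson predictive with a fixed λ stands in for a model's conditional distributions, and the counts are made up:

```python
import math
import random

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); returns 0 for k < 0, so F(-1) = 0."""
    if k < 0:
        return 0.0
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))

def randomized_pit(counts, lam, rng):
    """Randomized PIT of eq. (9.10): u_t^+ = F(x_t - 1) + v * [F(x_t) - F(x_t - 1)]."""
    out = []
    for x in counts:
        lo, hi = poisson_cdf(x - 1, lam), poisson_cdf(x, lam)
        out.append(lo + rng.random() * (hi - lo))  # v ~ U(0, 1)
    return out

rng = random.Random(42)  # seeded so the draw is reproducible
u_plus = randomized_pit([2, 0, 3, 1, 5, 2, 1], lam=2.0, rng=rng)
print(all(0.0 <= u <= 1.0 for u in u_plus))  # True: PIT values lie in [0, 1]
```

Seeding the generator matters here, since (as noted below) the resulting histogram can vary noticeably between sets of random draws in small samples.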
The assessment may be carried out by constructing histograms from the $\{u_t^{+}\}$; see, for example, Jung and Tremayne (2011a), who employ this method in the context of count time series models. They find that, for data sets of small sample size, the shape of the PIT histogram can change appreciably between sets of random draws of υ. One approach to overcoming this is to average out this effect by using N independent sets of random uniform numbers to compute N PIT histograms. The average histogram bin heights over these N replications could then be used. But this method is, effectively, indistinguishable from the nonrandomized PIT, to be discussed next, rendering further discussion of the approach redundant.
An alternative method of computing the PIT for a discrete cumulative predictive distribution has been proposed by Czado et al. (2009). It utilizes the distribution function observed at the count $x_t$ via

$$F^{(t)}(u \mid \mathcal{F}_{t-1}) = \begin{cases} 0, & u \le F(x_t - 1 \mid \mathcal{F}_{t-1}) \\[4pt] \dfrac{u - F(x_t - 1 \mid \mathcal{F}_{t-1})}{F(x_t \mid \mathcal{F}_{t-1}) - F(x_t - 1 \mid \mathcal{F}_{t-1})}, & F(x_t - 1 \mid \mathcal{F}_{t-1}) \le u \le F(x_t \mid \mathcal{F}_{t-1}) \\[4pt] 1, & u \ge F(x_t \mid \mathcal{F}_{t-1}). \end{cases} \tag{9.11}$$

The assessment of the distributional assumption can be carried out by aggregating over the set of T − p predictions for all observed counts and comparing the mean PIT

$$F_m(u) = (T - p)^{-1} \sum_{t=p+1}^{T} F^{(t)}(u \mid \mathcal{F}_{t-1}), \qquad 0 \le u \le 1,$$

to the cumulative distribution function of a standard uniform random variable. The computation of the relevant predictive cumulative distributions used in (9.10) and (9.11) is straightforward for most time series models for count data.
We operationalize the procedure by plotting a (nonrandomized) PIT histogram obtained by allocating the values to J equally spaced bins, j = 1, ..., J, where the height of the jth bin is computed from $f_j = F_m(j/J) - F_m((j-1)/J)$, and checking for uniformity. We supplement such a graph by incorporating approximate 100(1 − α)% confidence intervals (here we use α = 0.05) obtained from a standard χ² goodness-of-fit test of the null hypothesis that the J bins of the histogram are drawn from a uniform distribution. Under the null hypothesis, the statistic, G, is asymptotically χ²(J − 1) distributed; we set J to 10. Alternatively, $F_m(u)$ may be plotted against u and checked for deviations from the identity function; this requires no (arbitrary) choice of the number of bins J.
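The nonrandomized PIT of (9.11) and the bin heights $f_j$ can be sketched as follows; again this is our illustration, with a Poisson predictive with fixed λ and invented counts standing in for a fitted model's conditional distributions:

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); F(-1) = 0."""
    if k < 0:
        return 0.0
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))

def conditional_pit(u, x, lam):
    """F^(t)(u | F_{t-1}) of eq. (9.11) for one observed count x."""
    lo, hi = poisson_cdf(x - 1, lam), poisson_cdf(x, lam)
    if u <= lo:
        return 0.0
    if u >= hi:
        return 1.0
    return (u - lo) / (hi - lo)

def pit_histogram(counts, lam, J=10):
    """Mean PIT F_m on a grid, then bin heights f_j = F_m(j/J) - F_m((j-1)/J)."""
    def F_m(u):
        return sum(conditional_pit(u, x, lam) for x in counts) / len(counts)
    return [F_m(j / J) - F_m((j - 1) / J) for j in range(1, J + 1)]

bins = pit_histogram([2, 0, 3, 1, 5, 2, 1, 0, 4, 2], lam=2.0)
# Since F_m(0) = 0 and F_m(1) = 1, the J bin heights sum to one.
print(round(sum(bins), 10))  # 1.0
```

A U-shaped profile in these bin heights, as in the cuts data below, signals that the predictive distribution is underdispersed relative to the data.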
Figure 9.6 provides a summary of the PIT-based diagnostics resulting from fitting a PINAR(1) model to each of the cuts and the iceberg order data sets. Based on the earlier discussion, we present the nonrandomized variant of the PIT histogram in the figure. Both the PIT histogram and the $F_m(u)$ chart for the cuts data suggest misspecification of the distributional assumption. The pronounced U-shaped pattern of the PIT histogram indicates that not enough dispersion has been allowed for in the conditional distribution. This is mildly supported by the results for the uniformity test statistic for the PIT histogram for the cuts data, which is 14.8053 (p-value: 0.0964). In the case of the iceberg order data, the PIT histogram is closer to a uniform distribution, although a slight U-shaped pattern is evident. The uniformity test statistic for the iceberg orders is 9.4657 (p-value: 0.3954) and, therefore, does not reject the null hypothesis of a uniform PIT histogram at any conventional significance level. For both data sets, the correlograms of the (centered) $\{u_t^{+}\}$ provide similar qualitative information to those of the Pearson residuals of Figure 9.4 and are indicative of some misspecification of the dependence structure.

It is, therefore, evident from all the graphs from both residual and predictive distribution analyses for both data sets that the basic PINAR(1) model does not provide a satisfactory fit for either. In a modeling exercise seeking data-coherent specifications for these data, some model refinement is clearly called for. This may lead a researcher to need to choose between
[Figure 9.6 panels, for each data set: PIT histogram; $F_m(u^*)$ plotted against $u^*$; ACF of $\{u_t^{+}\}$ against lags.]
FIGURE 9.6
PIT-based diagnostics after fitting a PINAR(1) model to data: cuts (a) and iceberg order (b).