Model Validation and Diagnostics
9.4.1 PIT Histograms for Discrete Data
The use of the PIT as a method for assessing the adequacy of distributional assumptions
for a model dates back at least to the work of Dawid (1984) and exploits Rosenblatt’s (1952)
transformation of an absolutely continuous (conditional) distribution into a uniform
distribution. To be more specific, define the random variable $u_t$ on the basis of the
cumulative predictive distribution $F_c(\cdot)$ corresponding to the true DGP as
$u_t = F_c(X_t \mid \mathcal{F}_{t-1})$. Under correct specification of the predictive
distribution, the series of PIT random variables $\{u_t\}$ is i.i.d. standard uniform on
(0, 1). If a specified model does not correspond to the true DGP and has cumulative
predictive distribution $F(X_t \mid \mathcal{F}_{t-1})$, departures from this behavior can
be expected. Diebold et al. (1998) discussed various ways in which the uniformity of
$\{u_t\}$ from any model may be assessed. These are divided into two categories: those
designed to check the unconditional uniformity of the $\{u_t\}$ and those that check whether
the PIT series is i.i.d. The former is typically checked informally by plotting the
empirical cumulative distribution function of $\{u_t\}$ and comparing it to the identity
function, or by constructing a histogram of the $\{u_t\}$ and checking for uniformity.
Diebold et al. (1998, p. 869) also argued that formal tests of whether the $\{u_t\}$ are
i.i.d. U(0, 1), using perhaps the Kolmogorov–Smirnov or the Cramér–von Mises test, are
readily available but are nonconstructive, in that a rejection does not indicate how the
model fails, and are therefore of little practical value.
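The informal checks just described can be sketched in a few lines. The example below is only an illustration under stated assumptions: the Gaussian AR(1) data generating process, the parameter values, and the helper names (`simulate_ar1`, `pit_ar1`, `pit_histogram`, `max_ecdf_deviation`) are hypothetical and do not come from the text. It computes $u_t = F(x_t \mid \mathcal{F}_{t-1})$ under a correctly specified predictive distribution and under one whose variance is too large, then summarizes each PIT series by its histogram bin heights and by the largest gap between its empirical distribution function and the identity function.

```python
import math
import numpy as np


def normal_cdf(z):
    """Standard normal CDF, applied elementwise to a NumPy array."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))


def simulate_ar1(n, phi, rng):
    """Simulate a Gaussian AR(1) series with unit innovation variance (assumed DGP)."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x


def pit_ar1(x, phi, sigma):
    """PIT values u_t = F(x_t | F_{t-1}) for a Gaussian AR(1) predictive model."""
    return normal_cdf((x[1:] - phi * x[:-1]) / sigma)


def pit_histogram(u, bins=10):
    """Bin heights scaled so that a uniform sample gives heights near 1."""
    counts, _ = np.histogram(u, bins=bins, range=(0.0, 1.0))
    return counts * bins / len(u)


def max_ecdf_deviation(u):
    """Largest gap between the empirical CDF of u and the identity function."""
    u_sorted = np.sort(u)
    n = len(u_sorted)
    upper = np.arange(1, n + 1) / n - u_sorted
    lower = u_sorted - np.arange(0, n) / n
    return float(max(upper.max(), lower.max()))


rng = np.random.default_rng(0)
x = simulate_ar1(2000, phi=0.5, rng=rng)

u_correct = pit_ar1(x, phi=0.5, sigma=1.0)  # predictive model equals the DGP
u_wrong = pit_ar1(x, phi=0.5, sigma=2.0)    # predictive variance too large

print(pit_histogram(u_correct))  # bin heights under correct specification
print(pit_histogram(u_wrong))    # bin heights under the overdispersed model
print(max_ecdf_deviation(u_correct), max_ecdf_deviation(u_wrong))
```

In this sketch the overdispersed model places too few PIT values near 0 and 1, so its histogram is hump-shaped; the shape of the departure, rather than a single test statistic, is what points to the nature of the misspecification.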
If the $\{u_t\}$ are not uniformly distributed, the nature of the deviation from uniformity
can be informative. An obvious tool for analyzing the conformity with the i.i.d. assumption
for the $\{u_t\}$ is to examine their autocorrelation structure. Incidentally, Berkowitz
(2001) suggested transforming the $\{u_t\}$ to standard normal variables by means of the
inverse of the standard normal distribution function. A quantile–quantile (QQ) plot against
the standard normal distribution then assesses the distributional fit of different models,
although we do not pursue this proposal here.
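As a rough illustration of these two diagnostics, the sketch below computes sample autocorrelations of a PIT series and the normal scores $z_t = \Phi^{-1}(u_t)$ that a QQ plot would use. The function names (`sample_acf`, `to_normal_scores`, `qq_points`), the simulated stand-in PIT series, and the clipping constant are assumptions made for the example, not part of the methods cited in the text.

```python
from statistics import NormalDist

import numpy as np


def sample_acf(v, max_lag):
    """Sample autocorrelations of v at lags 1..max_lag."""
    v = np.asarray(v, dtype=float) - np.mean(v)
    denom = np.dot(v, v)
    return np.array([np.dot(v[:-k], v[k:]) / denom for k in range(1, max_lag + 1)])


def to_normal_scores(u, eps=1e-12):
    """Map PIT values to z_t = Phi^{-1}(u_t); u is clipped away from 0 and 1."""
    inv_cdf = NormalDist().inv_cdf
    u = np.clip(np.asarray(u, dtype=float), eps, 1.0 - eps)
    return np.array([inv_cdf(p) for p in u])


def qq_points(z):
    """Pairs (theoretical normal quantile, ordered z) for a QQ plot."""
    n = len(z)
    probs = (np.arange(1, n + 1) - 0.5) / n
    theo = np.array([NormalDist().inv_cdf(p) for p in probs])
    return theo, np.sort(z)


rng = np.random.default_rng(1)
u = rng.uniform(size=1000)      # stand-in for PITs from a well-specified model

acf_u = sample_acf(u, max_lag=5)
band = 1.96 / np.sqrt(len(u))   # approximate 95% band under independence
z = to_normal_scores(u)
theo, emp = qq_points(z)        # points lie near the 45-degree line if z is N(0, 1)
```

Autocorrelations of $u_t$ outside the band suggest dependence that the model has not captured; applying `sample_acf` to powers of the centered series is one way to look for dependence in higher moments.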
In the context of discrete distributions, some modifications to standard methods are
required, because predictive cumulative distribution functions are step functions. Two
methods have been proposed in the literature. The first of these, suggested by Denuit
and Lambert (2005), is the so-called randomized PIT, obtained by perturbing the step
function nature of the distribution function for discrete random variables, much as a
hypothesis testing procedure might use a randomization device to yield a test with a
prespecified significance level in the context of discrete variables. A random draw,
$\upsilon$, from a (standard) uniform distribution is used to construct a randomized PIT
based on the distribution function of observed counts $x_t$ from
$$u_t^+ = F(x_t - 1 \mid \mathcal{F}_{t-1}) + \upsilon \left[ F(x_t \mid \mathcal{F}_{t-1}) - F(x_t - 1 \mid \mathcal{F}_{t-1}) \right]. \tag{9.10}$$
Here, $F(\cdot)$ remains the predictive cumulative distribution of the counts from some
model and $F(-1) = 0$; compare Czado et al. (2009). As in the case of a continuous
cumulative predictive function, if the model is correctly specified, $u_t^+$ is a serially
independent random variable following a uniform distribution on the interval [0, 1].
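A minimal sketch of the randomized PIT in (9.10) follows, assuming for illustration that the one-step-ahead predictive distribution is Poisson with a known conditional mean. The Poisson choice, the constant mean of 3, and the names `poisson_cdf` and `randomized_pit` are hypothetical; any predictive distribution function for counts could be substituted.

```python
import math

import numpy as np


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean); returns 0 for k < 0, so F(-1) = 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)   # P(X = 0)
    total = term
    for j in range(1, int(k) + 1):
        term *= mean / j     # P(X = j) obtained from P(X = j - 1)
        total += term
    return min(total, 1.0)


def randomized_pit(x, means, rng):
    """Randomized PIT of (9.10) for Poisson one-step predictive distributions."""
    upsilon = rng.uniform(size=len(x))  # one uniform draw per observation
    lower = np.array([poisson_cdf(xt - 1, m) for xt, m in zip(x, means)])
    upper = np.array([poisson_cdf(xt, m) for xt, m in zip(x, means)])
    return lower + upsilon * (upper - lower)


rng = np.random.default_rng(2)
n = 500
means = np.full(n, 3.0)   # hypothetical conditional means
x = rng.poisson(means)    # counts generated from the same model

u_plus = randomized_pit(x, means, rng)
counts, _ = np.histogram(u_plus, bins=10, range=(0.0, 1.0))
print(counts * 10 / n)    # bin heights; near 1 when the model is correct
```

Because each $u_t^+$ is drawn uniformly between $F(x_t - 1 \mid \mathcal{F}_{t-1})$ and $F(x_t \mid \mathcal{F}_{t-1})$, rerunning the sketch with a different seed changes the histogram even though the data and the model are unchanged.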
The assessment may be carried out by constructing histograms from the $\{u_t^+\}$; see, for
example, Jung and Tremayne (2011a), who employ this method in the context of count time
series models. They find that, for data sets of small sample size, the shape of the PIT
histogram can change appreciably between sets of random draws of $\upsilon$. One approach to
overcoming this is to average out the effect by using $N$ independent sets of random
uniform numbers to compute $N$ PIT histograms. The average histogram bin heights over these
$N$ replications could then be used. But this method is, effectively, indistinguishable from