209 Model Validation and Diagnostics
In the next two experiments in this section, we discuss the results from a single simulation run
where the sample size is set to 50,000 in order to limit the impact of sampling uncertainty
on the results. In this case, the generated data are analyzed under two different scenarios:
(a) a GP(1) estimated model corresponding to the true DGP and (b) a PINAR(1) estimated
model where the innovation distribution is erroneously taken to be Poisson and binomial
thinning is assumed.
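To make the second scenario concrete, fitting a PINAR(1) by conditional maximum likelihood only requires the one-step transition probability, a convolution of binomial-thinning survivors and Poisson innovations. The following Python sketch illustrates the idea; the function names and optimizer setup are our own, not taken from the chapter:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom, poisson

def pinar1_negloglik(params, x):
    """Negative conditional log-likelihood of a PINAR(1):
    X_t = alpha o X_{t-1} + eps_t, with binomial thinning
    and Poisson(lam) innovations."""
    alpha, lam = params
    nll = 0.0
    for x_prev, x_curr in zip(x[:-1], x[1:]):
        j = np.arange(0, min(x_prev, x_curr) + 1)
        # P(X_t = x_curr | X_{t-1} = x_prev): convolution of
        # Binomial(x_prev, alpha) survivors and Poisson(lam) innovations
        p = np.sum(binom.pmf(j, x_prev, alpha) * poisson.pmf(x_curr - j, lam))
        nll -= np.log(max(p, 1e-300))
    return nll

def fit_pinar1(x):
    """Conditional MLE of (alpha, lambda), with crude starting values."""
    res = minimize(pinar1_negloglik, x0=[0.3, 0.7 * np.mean(x)], args=(x,),
                   bounds=[(1e-4, 1 - 1e-4), (1e-4, None)])
    return res.x  # (alpha_hat, lambda_hat)
```

When the innovations are not in fact Poisson, as in the experiments here, this estimator becomes a quasi-likelihood procedure, which is precisely the kind of misspecification the diagnostics are designed to detect.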
Figure 9.9 provides three graphs associated with relevant diagnostic tools. Graphs for
neither the ACF of the {u_t^+} nor the parametric bootstrap are provided, since they are
qualitatively similar to the ACF of the Pearson residuals and provide no added insight.
None of the graphical diagnostics depicted in the top panels of Figure 9.9 suggest that
the GP(1) model is inadequate for the data. However, this is not the case when attention is
focused on the lower row of panels in Figure 9.9. Here, the results of erroneously assuming
an equidispersed Poisson innovation distribution (and binomial thinning) are clearly evident.
The distributional misspecification is seen in the U-shaped PIT histogram and an F_m(u)
chart (in the bottom row, third column of the figure), which deviates from the 45° line
(compare the corresponding figure in the row above). In addition, the correlogram of the
Pearson residuals indicates misspecification with respect to the dynamics of the generated
data. This result is explained by the fact that the maximum likelihood estimate for the
parameter α_1 in the PINAR(1) model is biased downward. Therefore, the strength of the
dependence in the data is underestimated, resulting in residual serial correlation remain-
ing in the Pearson residuals; this is depicted in the left-hand panel in the lower row of
Figure 9.9.
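A PIT histogram of the kind just discussed can be computed with the nonrandomized PIT for count data of Czado, Gneiting, and Held (2009), which averages a piecewise-linear conditional PIT over the sample. A minimal sketch, assuming the one-step predictive CDF of each observation is available (the helper names are illustrative, not from the chapter):

```python
import numpy as np

def pit_histogram(pred_cdf_pairs, bins=10):
    """Nonrandomized PIT histogram for count data.
    pred_cdf_pairs: iterable of (P_t(x_t - 1), P_t(x_t)), the predictive
    CDF evaluated just below and at each observed count."""
    u = np.linspace(0.0, 1.0, bins + 1)
    F_bar = np.zeros(bins + 1)
    n = 0
    for p_lo, p_hi in pred_cdf_pairs:
        # piecewise-linear conditional PIT F_t(u | x_t)
        F_bar += np.clip((u - p_lo) / max(p_hi - p_lo, 1e-12), 0.0, 1.0)
        n += 1
    F_bar /= n
    # bin heights; each should be near 1/bins under a correct model,
    # while a U-shape indicates an underdispersed predictive distribution
    return np.diff(F_bar)
```

Under the erroneous Poisson assumption the predictive distribution is too narrow, so extreme PIT values occur too often and the histogram takes the U-shape visible in Figure 9.9.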
A summary of numerical results is given in Table 9.3. It can be seen that, in contrast
with that for the correct specification, the sample variance of the Pearson residuals from
the PINAR(1) is considerably larger than one (1.3171), indicating that not all the dispersion
in the generated data has been accounted for in the fitted specification. Note also that the
scoring rules and the information criteria of the two fitted models uniformly indicate a
preference for the true GP(1) model for the data over the PINAR(1) one. For some of these
statistics, including the variance of the Pearson residuals, both information criteria and the
p-values of G, the evidence is emphatic.
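The Pearson residual checks reported above follow directly from the conditional moments of the fitted PINAR(1): conditional mean α x_{t−1} + λ and conditional variance α(1 − α) x_{t−1} + λ. A sketch of the variance and residual-ACF computations (helper names are our own):

```python
import numpy as np

def pearson_residuals_pinar1(x, alpha, lam):
    """Pearson residuals of a fitted PINAR(1) with binomial thinning and
    Poisson innovations."""
    x = np.asarray(x, dtype=float)
    cond_mean = alpha * x[:-1] + lam
    cond_var = alpha * (1 - alpha) * x[:-1] + lam
    return (x[1:] - cond_mean) / np.sqrt(cond_var)

def residual_checks(e, max_lag=10):
    """Sample variance (near 1 under correct dispersion) and residual ACF
    (near 0 at all lags under correctly specified dynamics)."""
    e = e - e.mean()
    acf = [float(np.dot(e[:-k], e[k:]) / np.dot(e, e))
           for k in range(1, max_lag + 1)]
    return float(np.var(e, ddof=1)), acf
```

With the misspecified PINAR(1) of the first experiment, the residual variance exceeds one (unaccounted dispersion) and the downward-biased estimate of α_1 leaves visible serial correlation in the ACF, matching the patterns in Figure 9.9 and Table 9.3.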
The second experiment specifically targets a model misspecification with respect to the
predictive distribution. The data are generated using an INAR(1) of the form (9.2) with
binomial thinning, but with GP innovations in truth. Such a model is discussed by Jung
and Tremayne (2011b), where it is indicated that the resultant counts exhibit marginal
overdispersion, but are not GP, because closure under convolution does not apply.
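This DGP can be sketched as follows, sampling generalized Poisson innovations by inversion over a truncated support. The parameterization P(X = k) = λ(λ + kη)^{k−1} e^{−(λ+kη)}/k! is assumed, and the function names are ours, not the chapter's:

```python
import numpy as np
from scipy.special import gammaln

def gp_pmf(k, lam, eta):
    """Generalized Poisson pmf, computed on the log scale for stability."""
    k = np.asarray(k)
    return np.exp(np.log(lam) + (k - 1) * np.log(lam + k * eta)
                  - (lam + k * eta) - gammaln(k + 1))

def sample_gp(rng, lam, eta, size, kmax=200):
    """Draw GP(lam, eta) innovations by inversion on {0, ..., kmax}."""
    k = np.arange(kmax + 1)
    p = gp_pmf(k, lam, eta)
    p = p / p.sum()  # renormalize the truncated pmf
    return rng.choice(k, size=size, p=p)

def simulate_gp_inar1(rng, n, alpha, lam, eta, x0=0):
    """INAR(1) with binomial thinning and GP innovations, as in form (9.2)."""
    eps = sample_gp(rng, lam, eta, size=n)
    x = np.empty(n, dtype=int)
    x[0] = x0 + eps[0]
    for t in range(1, n):
        x[t] = rng.binomial(x[t - 1], alpha) + eps[t]
    return x
```

Since the GP innovations enter through binomial thinning, the resulting counts are overdispersed relative to Poisson but, as noted above, are not themselves GP.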
For this simulation experiment, we employ the same set of parameter values for α_1,
λ, and η as in the first experiment. Also, we use the same estimated models as men-
tioned earlier, that is, (1) a GP(1) model based on the random operator R_t(·) of Joe
(1996) and (2) a PINAR(1) model based on binomial thinning with the parameters esti-
mated being the dependence parameter and the innovation mean based on a Poisson
innovation assumption. Note that neither fitted model is correct. Figure 9.10 provides
graphs associated with various diagnostic tools. Summary numerical results are given in
Table 9.4.
Note that both fitted models are misspecified, since the first implies a misspecified thin-
ning mechanism (the marginal distribution of the data will be overdispersed but will not be
GP) and the second assumes an incorrect innovation, because the likelihood is based on a
Poisson assumption for innovations. This is reflected at various junctures in the diagnostic
analysis.