9: Model Validation and Diagnostics (5/6)


the next two experiments in this section, we discuss the results from a single simulation run where the sample size is set to 50,000 in order to limit the impact of sampling uncertainty on the results. In this case, the generated data are analyzed under two different scenarios: (a) a GP(1) estimated model corresponding to the true DGP and (b) a PINAR(1) estimated model where the innovation distribution is erroneously taken to be Poisson and binomial thinning is assumed.

Figure 9.9 provides three graphs associated with relevant diagnostic tools. Graphs for the ACF of the {u_t} and for the parametric bootstrap are not provided, since they are qualitatively similar to the ACF of the Pearson residuals and provide no added insight.

None of the graphical diagnostics depicted in the top panels of Figure 9.9 suggests that the GP(1) model is inadequate for the data. However, this is not the case when attention is focused on the lower row of panels in Figure 9.9. Here, the results of erroneously assuming an equidispersed Poisson innovation distribution (and binomial thinning) are clearly evident. The distributional misspecification is seen in the U-shaped PIT histogram and in an F(u*) chart (in the bottom row, third column of the figure) that deviates from the 45° line (compare the corresponding figure in the row above). In addition, the correlogram of the Pearson residuals indicates misspecification with respect to the dynamics of the generated data. This result is explained by the fact that the maximum likelihood estimate for the parameter α in the PINAR(1) model is biased downward. Therefore, the strength of the dependence in the data is underestimated, resulting in residual serial correlation remaining in the Pearson residuals; this is depicted in the left-hand panel in the lower row of Figure 9.9.
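The link between an understated dependence parameter and leftover residual correlation can be illustrated with a minimal sketch. It assumes a Poisson INAR(1) with binomial thinning, for which the conditional mean is αx + λ and the conditional variance is α(1 − α)x + λ. The parameter values, the sample size, and the function names are hypothetical, and the too-small α is imposed by hand rather than obtained from a misspecified likelihood as in the experiment.

```python
import numpy as np

def simulate_pinar1(n, alpha, lam, rng):
    """Simulate a Poisson INAR(1) process with binomial thinning."""
    x = np.empty(n, dtype=np.int64)
    x[0] = rng.poisson(lam / (1.0 - alpha))  # stationary marginal is Poisson
    for t in range(1, n):
        x[t] = rng.binomial(x[t - 1], alpha) + rng.poisson(lam)
    return x

def pearson_residuals(x, alpha, lam):
    """Standardize one-step-ahead errors by the PINAR(1) conditional moments."""
    prev = x[:-1]
    cond_mean = alpha * prev + lam
    cond_var = alpha * (1.0 - alpha) * prev + lam
    return (x[1:] - cond_mean) / np.sqrt(cond_var)

def acf(e, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    e = e - e.mean()
    denom = np.dot(e, e)
    return np.array([np.dot(e[k:], e[:-k]) / denom for k in range(1, max_lag + 1)])

rng = np.random.default_rng(0)
alpha_true, lam_true = 0.5, 1.0          # hypothetical values, not the chapter's
x = simulate_pinar1(20_000, alpha_true, lam_true, rng)

# Deliberately understate alpha, keeping the implied process mean unchanged.
alpha_low = 0.3
lam_low = x.mean() * (1.0 - alpha_low)

acf_true = acf(pearson_residuals(x, alpha_true, lam_true), 5)
acf_low = acf(pearson_residuals(x, alpha_low, lam_low), 5)
print("lag-1 ACF, true alpha:", round(acf_true[0], 3))
print("lag-1 ACF, understated alpha:", round(acf_low[0], 3))
```

With the true α the residual autocorrelations should sit inside the usual ±1.96/√n band, whereas the understated α should leave clearly positive autocorrelation at lag 1.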

A summary of numerical results is given in Table 9.3. It can be seen that, in contrast with that for the correct specification, the sample variance of the Pearson residuals from the PINAR(1) is considerably larger than one (1.3171), indicating that not all the dispersion in the generated data has been accounted for in the fitted specification. Note also that the scoring rules and the information criteria of the two fitted models uniformly indicate a preference for the true GP(1) model for the data over the PINAR(1) one. For some of these statistics, including the variance of the Pearson residuals, both information criteria, and the p-values of G, the evidence is emphatic.

The second experiment specifically targets a model misspecification with respect to the predictive distribution. The data are generated using an INAR(1) of the form (9.2) with binomial thinning, but with innovations that are in truth GP. Such a model is discussed by Jung and Tremayne (2011b), where it is indicated that the resultant counts exhibit marginal overdispersion, but are not GP, because closure under convolution does not apply.
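A data-generating process of this kind can be sketched as follows. The sketch assumes the standard generalized Poisson pmf P(X = x) = λ(λ + ηx)^(x−1) exp(−λ − ηx)/x! with 0 ≤ η < 1 (mean λ/(1 − η)), truncates its support for sampling, and uses hypothetical values for α, λ, η and the sample size, since the excerpt does not state the values used in this experiment. The function names are illustrative only.

```python
import math
import numpy as np

def gp_pmf(x_max, lam, eta):
    """Generalized Poisson pmf on 0..x_max (assumes 0 <= eta < 1)."""
    x = np.arange(x_max + 1)
    log_fact = np.array([math.lgamma(k + 1) for k in x])
    log_p = np.log(lam) + (x - 1) * np.log(lam + eta * x) - (lam + eta * x) - log_fact
    return np.exp(log_p)

def simulate_inar1_gp(n, alpha, lam, eta, rng, x_max=200):
    """INAR(1) with binomial thinning and GP innovations (no burn-in)."""
    p = gp_pmf(x_max, lam, eta)
    p = p / p.sum()                      # renormalize the truncated pmf
    eps = rng.choice(x_max + 1, size=n, p=p)
    x = np.empty(n, dtype=np.int64)
    x[0] = eps[0]
    for t in range(1, n):
        x[t] = rng.binomial(x[t - 1], alpha) + eps[t]
    return x, eps

rng = np.random.default_rng(1)
alpha, lam, eta = 0.5, 0.5, 0.2          # hypothetical values
x, eps = simulate_inar1_gp(20_000, alpha, lam, eta, rng)

# Dispersion index (variance / mean) of innovations and of the counts.
disp_eps = eps.var() / eps.mean()
disp_x = x.var() / x.mean()

# Pearson residuals computed as if the innovations were Poisson
# with the same mean, using the true alpha.
mu_eps = lam / (1.0 - eta)
prev = x[:-1]
resid = (x[1:] - alpha * prev - mu_eps) / np.sqrt(alpha * (1 - alpha) * prev + mu_eps)
resid_var = resid.var()
print(disp_eps, disp_x, resid_var)
```

In this sketch the counts should remain overdispersed, though less so than the innovations, and the Pearson residuals built on a Poisson innovation assumption should have a variance above one.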

For this simulation experiment, we employ the same set of parameter values for α, λ, and η as in the first experiment. Also, we use the same estimated models as mentioned earlier, that is, (1) a GP(1) model based on the random operator R(·) of Joe (1996) and (2) a PINAR(1) model based on binomial thinning with the parameters estimated being the dependence parameter and the innovation mean based on a Poisson innovation assumption. Note that neither fitted model is correct. Figure 9.10 provides graphs associated with various diagnostic tools. Summary numerical results are given in Table 9.4.

Note that both fitted models are misspecified, since the first implies a misspecified thinning mechanism (the marginal distribution of the data will be overdispersed but will not be GP) and the second assumes an incorrect innovation, because the likelihood is based on a Poisson assumption for innovations. This is reflected at various junctures in the diagnostic analysis.


Handbook of Discrete-Valued Time Series

[Figure 9.9: two rows of three panels. Row (a): GP(1) estimated model; row (b): PINAR(1) estimated model. In each row, left to right: ACF of Pearson residuals (plotted against lags), PIT histogram, F(u*) chart.]

FIGURE 9.9
Graphical results for the first Monte Carlo experiment. (a) GP(1) estimated model and (b) PINAR(1) estimated model.


TABLE 9.3
Summary of Numerical Results for the First Monte Carlo Experiment

                      GP(1) Model    PINAR(1) Model
Pearson residual
  Mean                −0.0008        −0.0137
  Variance             1.0169         1.3171
Scoring rules
  logs                 1.7347         1.7689
  qs                   0.7816         0.7896
  rps                  0.8119         0.8227
AIC                    86735.49       88444.71
BIC                    86748.72       88452.53
G                      11.584         777.8477
  p-value              (0.2378)       (< 0.000)
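The rows labelled logs, qs, and rps presumably denote mean logarithmic, quadratic, and ranked probability scores; the chapter's exact definitions are not reproduced in this excerpt. The following minimal sketch uses common textbook forms of these scores (lower is better), with the quadratic score written as the sum of squared differences between the outcome indicator and the predictive pmf, which differs from another common form only by the constant 1. The i.i.d. Poisson setup, the means 2.5 and 3.5, and the function names are assumptions chosen purely for illustration.

```python
import math
import numpy as np

def poisson_pmf(x_max, mu):
    """Poisson pmf on 0..x_max."""
    k = np.arange(x_max + 1)
    log_fact = np.array([math.lgamma(j + 1) for j in k])
    return np.exp(-mu + k * np.log(mu) - log_fact)

def mean_scores(x, pmf):
    """Mean logarithmic, quadratic and ranked probability scores.

    pmf[k] is the predictive probability of count k; every value in x
    must lie within the support covered by pmf.
    """
    x = np.asarray(x)
    cdf = np.cumsum(pmf)
    k = np.arange(len(pmf))
    p_obs = pmf[x]
    logs = -np.log(p_obs)
    qs = 1.0 - 2.0 * p_obs + np.sum(pmf ** 2)
    indicator = (x[:, None] <= k[None, :]).astype(float)   # 1{x <= k}
    rps = np.sum((cdf[None, :] - indicator) ** 2, axis=1)
    return logs.mean(), qs.mean(), rps.mean()

rng = np.random.default_rng(2)
x = rng.poisson(2.5, size=5_000)                 # hypothetical data

good = mean_scores(x, poisson_pmf(60, 2.5))      # predictive matches the data
bad = mean_scores(x, poisson_pmf(60, 3.5))       # predictive with the wrong mean
print("logs, qs, rps (matching):   ", good)
print("logs, qs, rps (wrong mean): ", bad)
```

Because all three scores are proper, the predictive distribution that matches the data should score lower on each of them, which mirrors the way the table is read.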

Starting with the second set of results related to the PINAR(1) estimated model, we see that, due to a downward bias in the ML estimation of the dependence parameter, there are obvious unwanted spikes in the correlogram of the Pearson residuals. Both the (nonrandomized) PIT histogram and the F(u*) chart indicate some misspecification in the distributional assumption. From Table 9.4, it can be seen that the variance of the Pearson residuals (at 1.2547) is considerably larger than unity and the G-statistic decisively rejects uniformity of the PIT histogram.
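A nonrandomized PIT histogram can be sketched as follows, assuming the usual construction for count data: for an observation x with predictive CDF values P(x − 1) and P(x), the conditional PIT function rises linearly from 0 at P(x − 1) to 1 at P(x), and the histogram heights are increments of its sample average over equally spaced bins. The i.i.d. setup, the negative binomial stand-in for overdispersed data, the ten bins, and the function names are assumptions for illustration, not the chapter's design.

```python
import math
import numpy as np

def poisson_pmf(x_max, mu):
    """Poisson pmf on 0..x_max."""
    k = np.arange(x_max + 1)
    log_fact = np.array([math.lgamma(j + 1) for j in k])
    return np.exp(-mu + k * np.log(mu) - log_fact)

def pit_histogram(x, pmf, bins=10):
    """Heights of the nonrandomized PIT histogram (they sum to one)."""
    x = np.asarray(x)
    cdf = np.cumsum(pmf)
    p_x = pmf[x]                      # P(X = x) under the predictive
    lower = cdf[x] - p_x              # P(X <= x - 1)
    u = np.linspace(0.0, 1.0, bins + 1)
    # Conditional PIT: 0 below P(x-1), linear in between, 1 above P(x).
    cond = np.clip((u[:, None] - lower[None, :]) / p_x[None, :], 0.0, 1.0)
    mean_cdf = cond.mean(axis=1)
    mean_cdf[0], mean_cdf[-1] = 0.0, 1.0   # exact by definition
    return np.diff(mean_cdf)

rng = np.random.default_rng(3)
n = 20_000
pmf = poisson_pmf(100, 2.5)           # Poisson predictive, mean 2.5

x_equi = rng.poisson(2.5, size=n)                  # equidispersed data
x_over = rng.negative_binomial(2.5, 0.5, size=n)   # mean 2.5, variance 5

h_ok = pit_histogram(x_equi, pmf)
h_bad = pit_histogram(x_over, pmf)
print(np.round(h_ok, 3))
print(np.round(h_bad, 3))
```

When the predictive distribution matches the data, the heights should all be close to 0.1; when overdispersed data are scored against an equidispersed predictive, the outer bins should be taller than the central ones, giving the U shape described in the text.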

Interpreting the first set of results related to the GP(1) estimated specification, where the misspecification is essentially due to the thinning operator assumed, is less straightforward. We reiterate (Jung and Tremayne 2011b) that the degree of overdispersion in the innovations is attenuated in the true marginal distribution of the observations by the binomial thinning operation used in the data-generating mechanism. The estimated GP(1) model is able to capture the dependence structure in the data, as reflected by a Pearson residual correlogram that shows the dependence structure to be adequately modeled (top row, first column of the figure). Also, the variance of the Pearson residuals from the estimated model is larger than one, but only marginally so. Diagnostic results related to other aspects of the specification do tentatively suggest model misspecification in that the (nonrandomized) PIT histogram and the F(u*) chart exhibit limited unwanted features. In particular, the former shows some departure from uniformity, a conclusion backed up by the goodness-of-fit statistic G and its associated p-value. Overall, the results displayed in Figure 9.10 and Table 9.4 suggest a preference for the GP(1) specification over the PINAR(1) one, but there may be doubt about whether or not the former is fully data coherent.

The third experiment is designed to reflect underspecification of dynamics in the estimated model. Data are generated using a second-order integer autoregressive model with GP innovations and the operator due to Joe (1996); by closure under convolution, the marginal distribution of the generated counts is GP. See the discussion of the previous subsection relating to the preferred model for the iceberg data set for further information on this specification. The following parameter values are used in the experiment: for the dependence parameter vector α, α_1 = 0.4, α_2 = 0.25, and α_3 = 0.1; λ = 0.5; and η = 0.2, leading to a process mean of 2.5 and first- and second-order autocorrelations of 0.45 + 0.1 = 0.55 and 0.25 + 0.1 = 0.35, respectively.
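The stated process mean can be checked with a short calculation. This is a sketch under two assumptions that the excerpt does not spell out: that the three dependence parameters are labelled α_1, α_2, α_3, with the innovation parameter equal to the marginal GP parameter scaled by 1 − α_1 − α_2 − α_3, and that a GP(λ, η) variable has mean λ/(1 − η).

```latex
\begin{align*}
\mu_\varepsilon &= \frac{\lambda}{1-\eta} = \frac{0.5}{1-0.2} = 0.625, \\
\mu_X &= \frac{\mu_\varepsilon}{1-\alpha_1-\alpha_2-\alpha_3}
       = \frac{0.625}{1-0.4-0.25-0.1}
       = \frac{0.625}{0.25} = 2.5 .
\end{align*}
```

Under these assumptions the innovation mean of 0.625 is scaled up by a factor of four, which reproduces the process mean of 2.5 quoted in the text.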


[Figure 9.10: two rows of three panels. Row (a): GP(1) estimated model; row (b): PINAR(1) estimated model. In each row, left to right: ACF of Pearson residuals (plotted against lags), PIT histogram, F(u*) chart.]

FIGURE 9.10
Graphical results for the second Monte Carlo experiment. (a) GP(1) estimated model and (b) PINAR(1) estimated model.


TABLE 9.4
Summary Numerical Results for the Second Monte Carlo Experiment

                      GP(1) Model    PINAR(1) Model
Pearson residual
  Mean                −0.0018        −0.0090
  Variance             1.0534         1.2547
Scoring rules
  logs                 1.6927         1.7111
  qs                   0.7724         0.7768
  rps                  0.7661         0.7704
AIC                    84631.63       85550.68
BIC                    84644.86       85563.91
G                      244.089        611.650
  p-value              (< 0.000)      (< 0.000)

Two different models are fitted to the generated data: (1) a GP(2) (correct) estimated model and (2) a misspecified GP(1) model, so both estimated models utilize the thinning operator of Joe (1996). Figure 9.11 displays graphs associated with some of the diagnostic tools discussed for the latter case only, since the fitted GP(2) model evidences no misspecification. Summary numerical results for both estimated models are given in Table 9.5.

It is evident from the (nonrandomized) PIT histogram and the F(u*) chart that the estimated GP(1) model is able to capture the distributional assumption correctly. However, the correlogram of the Pearson residuals indicates misspecified dynamics in this underspecified fitted model. In particular, it shows Pearson residual autocorrelations of the GP(1) model that decay exponentially (after the third). This arises because the autocorrelations of the data themselves exhibit a more complicated persistence pattern than a first-order model can account for. From the numerical results displayed in Table 9.5, it can be seen, as is to be expected, that all the scoring rules and the information criteria clearly favor the GP(2) estimated model over its first-order counterpart. Note, however, that the summary statistics relating to the Pearson residuals and the G-statistic do not indicate model misspecification.

Finally, we conduct a fourth experiment by generating data from a second-order integer autoregressive model with Poisson innovations. Data are again generated using the Joe (1996) thinning operator and are of the form (9.1) with p = 2. Instead of using a constant innovation rate as in the earlier experiments, we employ a time-varying innovation rate designed to capture (deterministic) seasonality, as is often done in empirical work. Suppose the innovation rate λ_t to be time varying using harmonics given by

$$\lambda_t = \exp\left(\theta_1 + \theta_2 \sin\frac{2\pi t}{200} + \theta_3 \cos\frac{2\pi t}{200}\right) \qquad (9.12)$$

and choose the following set of parameters: α_1 = 0.4, α_2 = 0.25, α_3 = 0.1, θ_1 = 0.05, θ_2 = −0.2, and θ_3 = 0.2. The harmonics introduce additional dynamic effects in the
