9
Model Validation and Diagnostics
Robert C. Jung, Brendan P.M. McCabe, and A.R. Tremayne
CONTENTS
9.1 Introduction
9.2 Parametric Resampling
9.3 Residual Analysis
    9.3.1 Pearson Residuals
    9.3.2 Component Residuals
    9.3.3 Overdispersion and the Information Matrix Test
9.4 Analyses Based on the Predictive Distributions
    9.4.1 PIT Histograms for Discrete Data
    9.4.2 Scoring Rules and Model Selection
    9.4.3 Cuts and Iceberg Data Revisited
9.5 Evidence with Artificial Data
9.6 Conclusions
References
9.1 Introduction
Checking the adequacy of a specified model is an important part of any iterative modeling
exercise in applied time series analysis. For linear Gaussian time series models, or those
based on the framework of generalized linear models, there exist well-developed tools for
this purpose that are readily available and routinely employed in applied work. However,
for nonlinear time series models for discrete data, this is not the case. Nevertheless, the
need to compare two or more competing model specifications, or evaluate the adequacy of
fit of a chosen model, is obvious.
To help address this gap, we suggest a range of diagnostic and model validation methods
designed to lead to data coherent models that achieve good probabilistic forecasting
outcomes. We borrow these methods from the associated literature developed mainly for
continuous variables, adapting them where necessary for the discrete context. This leads
us to advocate a set of graphical tools and other calibration methods of various kinds.
However, achieving the desired aim may not be straightforward, as the following quote
taken from the important paper of Tsay (1992, p. 2) indicates: "However, it is well known that
the best model with respect to one checking criterion may fare badly with respect to another criterion.
... Consequently, there is a need to specify the objective of data analysis before choosing a checking
criterion. ... Without mentioning objectives, reported model checking statistics are meaningless or
could be misleading." As suggested earlier, our standpoint is that the model class to be
introduced next is primarily of use in a probabilistic forecasting sense, where the aim is to
provide not only point forecasts but also good estimates of entire forecast distributions.
Hence, we focus our coverage on methods that may help to achieve good outcomes
in this respect.
We now briey introduce the class of integer autoregressive models that will be used as
a vehicle to demonstrate the application of the diagnostic tools described in the chapter.
When used for other model classes presented in this volume, appropriate adaptations may
be necessary. An integer autoregressive process {X
t
; t = 0, ±1, ±2, ...} of order p dened on
the state space of nonnegative integers is of the form
X
t
= R
t
(F
t1
; α) + ε
t
, (9.1)
where F_{t-1} indicates the relevant past history of X_t to be conditioned on, typically
X_{t-1}, ..., X_{t-p} in a pth-order model, and the ε_t are a sequence of i.i.d. discrete random
variables. The innovation process ε_t and F_{t-1} are presumed to be stochastically independent
for all points in time. This model specification is inspired by the work of Joe (1996), from
which it follows, inter alia, that it is often a Markov chain (of some order).
In (9.1), R_t(·) denotes a random operator to be applied at time t (which may differ from
specification to specification) that carries the dependence structure and preserves the integer
nature of the process. Some practical examples of these random operators are given
in the following. Perhaps unsurprisingly, alternative choices of the operator R_t(·) and the
innovations ε_t lead to a rich class of models; see, for example, the survey by McKenzie
(2003). A variant of (9.1) that is popular in the literature is the following:
    X_t = α_1 ∘ X_{t-1} + α_2 ∘ X_{t-2} + ··· + α_p ∘ X_{t-p} + ε_t,    (9.2)
where, conditional on X_{t-k}, α_k ∘ X_{t-k} is an integer-valued random variable (using the
operator ∘) with parameter α_k (possibly a vector). The conditional variables α_k ∘ X_{t-k},
k ∈ {1, ..., p}, are mutually independent and independent of the i.i.d. innovation sequence ε_t.
The operator ∘ thus delivers an integer value, and dependence in X_t is induced via the
conditioning variables X_{t-k}, k ∈ {1, ..., p}. The operator used in α_k ∘ X_{t-k} may correspond to
binomial thinning and ε_t to a Poisson variable with parameter λ. Then the conditional
variables α_k ∘ X_{t-k}, k ∈ {1, ..., p}, have independent binomial distributions with parameters α_k
and X_{t-k}. Another possibility is that, conditional on X_{t-k}, α_k ∘ X_{t-k} is beta-binomial, while
ε_t is negative binomial. For all these pth-order model variants of the form (9.2), the acronym
INAR(p) has been introduced.
The special case of (9.1) when p = 1 is of importance. Under binomial thinning and
Poisson innovations, X_t has a Poisson marginal distribution because closure under
convolution applies. This is probably the workhorse model of integer time series modeling.
As it can be written in the form of (9.2), it will henceforth be denoted a PINAR(1) model.
If, however, ε_t were to be generalized Poisson (GP) random variables, then, to preserve a
GP marginal distribution for X_t, the random operator R_t(·), conditional on X_{t-1}, would
yield a quasi-binomial distribution; such a model will be denoted GP(1) in the following.
Further, if ε_t is negative binomial and, conditional on X_{t-1}, R_t(·) is beta-binomial,
then X_t also has a negative binomial marginal distribution (see, e.g., McKenzie, 2003,
p. 586).
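To make the thinning mechanism concrete, the following short sketch (our own illustration in Python, not code from the chapter; the function name simulate_inar and its defaults are our choices) simulates an INAR(p) path of the form (9.2) with binomial thinning and Poisson innovations; taking p = 1 gives the PINAR(1) model just described.

import numpy as np

def simulate_inar(T, alphas, lam, seed=None, burn_in=200):
    """Simulate T observations from an INAR(p) process of the form (9.2),
    using binomial thinning operators alpha_k and Poisson(lam) innovations."""
    rng = np.random.default_rng(seed)
    p = len(alphas)
    x = np.zeros(T + burn_in + p, dtype=int)
    for t in range(p, len(x)):
        # Binomial thinning: conditional on X_{t-k}, alpha_k o X_{t-k} ~ Binomial(X_{t-k}, alpha_k).
        thinned = sum(rng.binomial(x[t - k], alphas[k - 1]) for k in range(1, p + 1))
        x[t] = thinned + rng.poisson(lam)  # add the i.i.d. innovation epsilon_t
    return x[burn_in + p:]

# Example: a PINAR(1) path with alpha_1 = 0.8 and lambda = 0.4 (theoretical mean 2).
y = simulate_inar(500, [0.8], 0.4, seed=1)
print(y.mean(), y.var())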
In what follows we use the PINAR(1) as a first specification to fit to two real-life data
sets and consider at various subsequent points in the chapter the evidence that this simple
specification needs to be elaborated. We seek to reveal data coherent models for each data
set in the light of our diagnostic analyses.
Of course, there are a number of avenues that can be used to assess the evidence against
the suitability of a specified model. Issues to be considered include (but would not necessarily
be limited to) the type of random operator R_t(·) chosen, the relevant past history F_{t-1},
the distributional properties of ε_t, and the need to introduce regression effects in some way.
Evidently, the third of these might be obviated by using the semiparametric approach of
McCabe et al. (2011), but this can introduce added complications when looking at the last,
and so we do not consider this approach further in our contribution.
The plan of the chapter is as follows. Sections 9.2 through 9.4 provide a description of the
diagnostic methods surveyed, together with the results of their application to two real data
sets. In Section 9.5, we use simulated data to highlight specific properties of the various
methods discussed. Finally, Section 9.6 contains concluding remarks.
9.2 Parametric Resampling
A very general informal approach to model diagnostics for time series is proposed by Tsay
(1992). He demonstrates the procedure by employing the sample spectral density function
of any process as a functional of interest. This is closely related to the (sample) autocorrelation
function, (S)ACF, to be used here, since it is a cosine transformation of the spectrum.
The flexibility of Tsay's approach stems from the fact that it not only provides an overall
evaluation of the fitted model but also can be tailored to meet certain specific needs of the
analysis. The procedure is widely applicable and rests on a fairly minimal set of requirements.
Although bootstrap methods are ubiquitous, the caveat that they often do depend
on asymptotic theory (and sometimes on distributional assumptions) is in order. In our
context, the approach emphasizes reproducibility in fitted models and is designed to provide
an overall evaluation of fit or to check special characteristics of a process. Moreover, the
approach can be readily applied to time series models of counts, as it is straightforward
to implement the data-generating process (DGP) of most of them in standard software
packages.
Only the following requirements need to be fulfilled for the implementation of Tsay's
proposal: a parametric model of mathematical form with given parameters and a specified
distribution for innovations; and one, or more, characteristics or functionals that encapsulate
special features of interest. No further restrictions, other than that the model can
be used to generate bootstrap samples, apply. Based on artificially generated sample processes,
an empirical distribution of the specified functional (in our case, ordinates of sample
autocorrelation functions) is obtained. The adequacy of a fitted model is then assessed by
comparing this empirical distribution to the corresponding functional quantity of the data
itself. A model may be regarded as adequate if it successfully reproduces the observed
characteristics of the actual data. Specifically, for each fixed lag of the autocorrelation function,
the 100(1 − α/2)% and 100(α/2)% quantiles (we use α = 0.05 for graphical displays in what
follows) can be computed to constitute the bounds of an acceptance envelope. If the sample
autocorrelations of the data predominantly lie within the acceptance envelopes, the fitted
model can be deemed adequate according to the functional chosen. Notice that this is not
an interval estimation procedure as such, so one cannot reason that such an envelope will
contain the true value of any functional 100(1 − α)% of the time in repeated applications;
see Tsay (1992, Sec. 2.2) for related discussion.
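As an illustration of how such an envelope might be computed, the sketch below (again our own Python code, not the authors'; it reuses the simulate_inar helper from the earlier sketch, and alpha_hat and lam_hat stand for whatever fitted parameter values are available) resamples B series from a fitted PINAR(1) and forms per-lag acceptance bounds for the SACF ordinates.

import numpy as np

def sacf(x, max_lag):
    """Sample autocorrelations at lags 1, ..., max_lag."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    return np.array([np.sum(xc[k:] * xc[:-k]) / denom for k in range(1, max_lag + 1)])

def tsay_envelope(data, alpha_hat, lam_hat, B=5000, max_lag=4, level=0.05, seed=None):
    """Parametric resampling: simulate B series from the fitted PINAR(1), compute the
    SACF ordinates for each, and return the per-lag acceptance bounds together with a
    flag indicating whether the data's SACF lies inside them."""
    rng = np.random.default_rng(seed)
    boot = np.empty((B, max_lag))
    for b in range(B):
        # reuse simulate_inar from the earlier sketch to draw one bootstrap path
        xb = simulate_inar(len(data), [alpha_hat], lam_hat, seed=rng.integers(2**31))
        boot[b] = sacf(xb, max_lag)
    lower = np.quantile(boot, level / 2, axis=0)
    upper = np.quantile(boot, 1 - level / 2, axis=0)
    observed = sacf(data, max_lag)
    return lower, upper, (observed >= lower) & (observed <= upper)

With level = 0.05 this mirrors the 95% acceptance envelope used for the graphical displays described above.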
As we shall use this parametric resampling procedure regularly in this chapter as a tool
to assess a fitted model's adequacy, it seems appropriate to first examine how the procedure
operates in a stylized setting. We therefore conduct pilot Monte Carlo experiments
in which the model fitted to artificially generated data is a PINAR(1). The data are
generated in two ways: first, when the true model is fitted, that is, the data themselves follow
a PINAR(1) process in truth; and, second, when the true DGP is an INAR(2) of the
form (9.2) with Poisson innovations. In the former case, the mean and variance of the
marginal distribution of the data are equal and the autocorrelation function is the same as
that of the Gaussian AR(1) continuous counterpart. In the second case, the true marginal
distribution of the data is not Poisson, there is some overdispersion, and the true autocorrelation
function of the process is equivalent to that of a Gaussian AR(2) process. We
anticipate that application of the Tsay procedure under the first scenario will indicate no
model misspecification, and the contrary under the second.
The functionals that we use in this illustrative experiment are as follows: the variance
and the first four ordinates of the autocorrelation function. Artificial data are generated
from the two specifications, and the relevant sample functionals, the sample variance and
the first- through fourth-order sample autocorrelations, denoted SACF(1)–SACF(4) in the
following, are calculated for the generated data. A PINAR(1) model is fitted by maximum
likelihood (ML) and, using the resultant parameter estimates, B bootstrap samples are generated
and the same sample functionals computed. We determine the percentage of times
the functionals of the data are covered by a 100(1 − α)% probability interval (for α = 0.01, 0.05,
and 0.1) constructed from the bootstrap replicates of the resampling procedure. This parallels
the procedure described by Tsay (1992, p. 4), and is repeated R times to provide an
indication of the performance of the procedure.
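As one way to carry out the ML fitting step, the sketch below (our own code, with function names of our own choosing; it implements a conditional ML variant that conditions on the first observation, a common simplification) uses the fact that the one-step conditional distribution of the PINAR(1) is the convolution of a Binomial(X_{t-1}, α_1) thinning term and a Poisson(λ) innovation.

import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom, poisson

def pinar1_cond_pmf(y, x_prev, alpha, lam):
    """P(X_t = y | X_{t-1} = x_prev): convolution of the Binomial(x_prev, alpha)
    survivors and a Poisson(lam) innovation."""
    j = np.arange(min(x_prev, y) + 1)
    return np.sum(binom.pmf(j, x_prev, alpha) * poisson.pmf(y - j, lam))

def pinar1_negloglik(params, x):
    alpha, lam = params
    if not (0.0 < alpha < 1.0) or lam <= 0.0:
        return np.inf  # penalize parameter values outside the admissible region
    return -sum(np.log(pinar1_cond_pmf(x[t], x[t - 1], alpha, lam))
                for t in range(1, len(x)))

def fit_pinar1(x, start=(0.5, 1.0)):
    """Conditional ML estimates (alpha_hat, lambda_hat) for the PINAR(1)."""
    res = minimize(pinar1_negloglik, start, args=(np.asarray(x),), method="Nelder-Mead")
    return res.x

# Example usage: alpha_hat, lam_hat = fit_pinar1(y)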
In the rst experiment, data series of length T = 500 are generated from a PINAR(1)
model with the following parameter values: α
1
= 0.8 and λ = 0.4. This leads to Poisson
distributed count time series with (theoretical) mean and variance of 2 and rst four
autocorrelations 0.8, 0.64, 0.51, and 0.8
4
= 0.41, respectively. The results presented here
are based on R = 1000 replications, and for each generated series, we perform the para-
metric resampling procedure as described in the previous paragraph using B = 5000
replications.
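For reference, the quoted values follow from the standard moment results for the Poisson INAR(1) under binomial thinning:

    E[X_t] = Var(X_t) = λ/(1 − α_1) = 0.4/(1 − 0.8) = 2,    ρ(k) = α_1^k,  so  ρ(1), ..., ρ(4) = 0.8, 0.64, 0.512, 0.4096.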
To provide some information on the sampling variability that can be expected when the
true model is fitted to the data, we present the average quantiles for the sample functionals
over the 1000 replications for the first experiment in the upper panel of Table 9.1. From
this, it is evident that, on average, the sampling distributions of the functionals are centered
quite close to the true values used to generate the data. The lower panel of Table 9.1 shows
what happens if the sample size is reduced from T = 500 to T = 250. Broadly, increased
sampling variability of the anticipated type is seen in the averages of the newly estimated quantiles.
The left panel of Table 9.2 provides the percentages with which the functionals of the
data are covered by the three acceptance bounds used in this experiment when a correct
model is fitted. It is evident that, in all cases, these percentages show that sample functionals
fall outside the envelopes less often than might be expected. We conducted a further
experiment in which the dependence in the generated process was varied (using α_1 = 0.5 and λ = 1) to
see if the results were sensitive to this variation, but they were not. The results indicate that
the Tsay procedure will generally confirm a correctly specified model.
TABLE 9.1
Average Quantiles from R = 1000 Replications for the Monte Carlo Experiment for the Tsay
Resampling Procedure When a True PINAR(1) Model is Fitted for T = 500 (Upper Panel) and
T = 250 (Lower Panel)

Quantile (%)        0.5     2.5     5       50      95      97.5    99.5
T = 500
  Sample variance   1.286   1.417   1.489   1.935   2.537   2.675   2.990
  SACF(1)           0.703   0.726   0.737   0.791   0.836   0.844   0.859
  SACF(2)           0.482   0.518   0.536   0.624   0.702   0.716   0.742
  SACF(3)           0.315   0.359   0.381   0.492   0.549   0.612   0.647
  SACF(4)           0.188   0.237   0.261   0.387   0.506   0.527   0.569
T = 250
  Sample variance   1.074   1.228   1.316   1.895   2.767   2.980   3.451
  SACF(1)           0.652   0.687   0.704   0.782   0.845   0.856   0.874
  SACF(2)           0.403   0.457   0.483   0.610   0.718   0.736   0.770
  SACF(3)           0.221   0.285   0.316   0.474   0.614   0.639   0.684
  SACF(4)           0.088   0.156   0.190   0.367   0.530   0.559   0.613
TABLE 9.2
Inclusion Rates of Sample Functionals from R = 1000 Replications for the Tsay Resampling
Procedure When a PINAR(1) Model Is Fitted and the True DGP Includes PINAR(1) (Left Panel) and
INAR(2) (Right Panel)

                        PINAR(1) DGP               INAR(2) DGP
Acceptance bounds    90%     95%     99%       90%     95%     99%
Sample variance      95.20   98.40   99.80     10.30   15.30   30.00
SACF(1)              93.60   98.60   99.90     23.60   36.60   61.00
SACF(2)              93.80   97.60   99.40      0.00    0.00    0.00
SACF(3)              93.20   96.10   99.30      0.00    0.20    0.50
SACF(4)              92.30   96.30   99.10      0.00    0.10    0.80
On the other hand, when an inadequate model is fitted (refer to the right-hand panel of
Table 9.2), all functionals are able to indicate this on a regular basis. These results are
obtained by using an INAR(2) data-generating mechanism with α_1 = 0.45, α_2 = 0.35, and
λ = 0.4. This generates data that have a true mean of 2, variance 2.95, and first four
autocorrelation ordinates equal to 0.692, 0.662, 0.540, and 0.475, respectively. Again, T = 500,
R = 1000, and B = 5000 are used. Thus, the Tsay procedure does show an ability to detect
an incorrectly fitted model, in this pilot experiment at least. In any instance where it indicates
a model's inadequacy, a search for a more refined (or different) model specification
should be undertaken.
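As a check on the quoted values, the INAR(2) mean and the AR(2)-type Yule–Walker recursions for the autocorrelations (which, as noted above, the INAR(2) shares) give

    E[X_t] = λ/(1 − α_1 − α_2) = 0.4/(1 − 0.45 − 0.35) = 2,
    ρ(1) = α_1/(1 − α_2) = 0.45/0.65 ≈ 0.692,    ρ(k) = α_1 ρ(k−1) + α_2 ρ(k−2) for k ≥ 2,

yielding ρ(2) ≈ 0.662, ρ(3) ≈ 0.540, and ρ(4) ≈ 0.475, as quoted.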
Jung and Tremayne (2011b) have previously applied the method to integer time series
(though without any examination of its empirical performance, limited evidence on which
is provided earlier). Grunwald et al. (1997) report that the procedure is able to discover
some surprising results in the context of Bayesian time series models that would have not ...