3: Generalized Linear Autoregressive Moving Average Models (4/6)


66 Handbook of Discrete-Valued Time Series

TABLE 3.1

Results for testing various fixed effect multiple GLARMA models for the Road Deaths Series in 17 U.S. States

Model                                −2 log L   S    G                                  d.f.   p-val
FE-I: Unrestricted                   5345.69    92   -                                  -      -
FE-II: φ's in 6 groups               5347.87         $G_{\text{II v I}}$ = 2.18         11     0.998
FE-III: BAC, ALR, FS same            5391.76         $G_{\text{III v II}}$ = 43.89      38     0.236
FE-IV: BAC, ALR, FS, log OMVD same   5436.57         $G_{\text{IV v III}}$ = 44.81      16     0.00015
                                                     $G_{\text{IV v II}}$ = 88.61       54     0.0021

We next check whether the regression coefficients in $W_{j,t}$ vary significantly between individual states. We begin with the overall unrestricted fit to all 17 states. We refer to this as Model FE-I, which has −2 log L = 5345.69 with S = 92 parameters. Examination of the individual estimates of φ suggested that they could be simplified as follows: Group 1 (State 11, φ = −0.081 ± 0.045), Group 2 (States 1, 6, 7, 9, 10, 12–17, φ = 0.005 ± 0.013), Group 3 (States 2, 4, φ = 0.066 ± 0.015), Group 4 (State 5, φ = 0.212 ± 0.087), Group 5 (State 8, φ = 0.401 ± 0.096), Group 6 (State 5, φ = 0.545 ± 0.219) in which, at most, 6 coefficients are significant.

The model with the φ restricted to these groups is referred to as Model FE-II in Table 3.1. Using the likelihood ratio test we obtain $G_{\text{II v I}}$ = 2.18 on 11 d.f.; hence, restriction of the φ would not be rejected. From this model, we then examined whether or not some or all of the regression coefficients (other than the intercept, which does vary substantially between states) take common values across all 17 states. Model FE-III restricts the coefficients for BAC, ALR, and Friday–Saturday to be the same across states and (see Table 3.1) gives $G_{\text{III v II}}$ = 43.89 on 38 d.f. with an associated p-value of 0.24, which is not sufficiently strong evidence to suggest that the impact of these variables differs between individual states in a statistically significant way. Next, in Model FE-IV, the coefficient of log OMVD was also restricted to be the same across states. Compared with Model FE-III or Model FE-II, this restriction is strongly rejected, with $G_{\text{IV v III}}$ = 44.81 on 16 d.f. and associated p-value of 0.00015 and $G_{\text{IV v II}}$ = 88.61 on 54 d.f. and associated p-value of 0.0021, so the effect of this risk control variable differs significantly between states.
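As a rough check on the comparisons above, each likelihood ratio statistic can be recomputed as a difference of the tabulated −2 log L values and referred to a chi-square distribution. The following Python sketch is illustrative only: the helper name chi2_sf is hypothetical, it uses only the standard library (an incomplete gamma recurrence valid for integer degrees of freedom), and the degrees of freedom are those quoted in the text.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df, x > 0."""
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)               # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))    # Q(1/2, h)
    # Recurrence: Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q

# -2 log L values as tabulated in Table 3.1
neg2loglik = {"FE-I": 5345.69, "FE-II": 5347.87, "FE-III": 5391.76, "FE-IV": 5436.57}

# (restricted model, fuller model, degrees of freedom quoted in the text)
comparisons = [("FE-II", "FE-I", 11), ("FE-III", "FE-II", 38),
               ("FE-IV", "FE-III", 16), ("FE-IV", "FE-II", 54)]

results = {}
for restricted, full, df in comparisons:
    G = neg2loglik[restricted] - neg2loglik[full]   # likelihood ratio statistic
    p = chi2_sf(G, df)
    results[(restricted, full)] = (G, p)
    print(f"{restricted} v {full}: G = {G:.2f} on {df} d.f., p = {p:.5f}")
```

Differencing the tabulated values for FE-IV and FE-II gives 88.70 rather than the reported 88.61; the associated p-value is of the same order either way.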

Hence, Model FE-III provides a useful summary of the commonality or otherwise of regression variable impacts on single vehicle night time road deaths across the 17 states. The fitted parameters and associated standard errors are reported in Table 3.2. The six groups for φ could be reduced to four by removing the nonsignificant cases of Groups 1 and 2. We did not pursue this here, preferring to move on to the use of a random effects analysis. The impact of lowering the legal BAC level is estimated to be −0.072 ± 0.022, confirming the statistical significance of this association found in Bernat et al. (2004).

The fixed effects GLARMA model analysis provides a good starting point for the random effects GLARMA modeling that we turn to in the next section. In particular, it seems plausible from the results of Table 3.1 that random effects will be needed for the intercept term and the log OMVD term, but not for BAC, ALR, or Friday–Saturday effects. The parameter values reported for Model FE-III in Table 3.2 can provide useful starting values for the random effects model fitting. For fixed effects, we use the point estimates of coefficients for predictors that are common to all series, while for predictors that vary between series, we use the mean values of point estimates of the coefficients.








TABLE 3.2

Parameter estimates for the random effects model for the Road Deaths Series in 17 U.S. States

                     RE-IV                FE-III               RE-III
                     No GLARMA            Multiple GLARMA      Multiple GLARMA
                     Random Effects       Fixed Effects        Random Effects
                     Estimate   s.e.      Estimate   s.e.      Estimate   s.e.
Intercept            −1.649     0.116     −1.801     -         −1.705     0.119
BAC change           −0.054     0.022     −0.072     0.022     −0.060     0.022
ALR term             −0.063     0.035     −0.011     0.039     −0.047     0.037
Frid-Sat             0.032      0.011     0.037      0.011     0.037      0.011
logOMVD              0.395      0.063     0.314      -         0.367      0.061
Intercept RE s.d.    0.241      0.053     0.209      -         0.242      0.054
logOMVD RE s.d.      0.160      0.066     0.140      -         0.145      0.072
φ, Gp1               -          -         −0.066     0.045     -          -
φ, Gp2               -          -         0.010      0.012     -          -
φ, Gp3               -          -         0.067      0.015     0.071      0.015
φ, Gp4               -          -         0.216      0.080     0.201      0.074
φ, Gp5               -          -         0.404      0.091     0.408      0.090
φ, Gp6               -          -         0.531      0.213     -          -
−2 log-likelihood    5552.173             5391.1               5505.082

The results labeled RE-IV are those reported in Bernat et al. (2004) using SAS PROC NLMIXED. The results labeled FE-III are for the final fixed effects multiple GLARMA model discussed in Section 3.5.2.

Note: Values reported against the intercept and the logOMVD term are averages of the 17 individual values obtained, while the values in the rows labeled "Intercept RE" and "logOMVD RE" are the standard deviations of these individual estimates, respectively. The results labeled RE-III are for the final random effects multiple GLARMA model discussed in Section 3.6.3.

3.6 Random Effects Multiple GLARMA Model

3.6.1 Maximum Likelihood Estimation

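Let $W_{j,t}$ be defined as in (3.15), where the $U_j$ are multivariate normal. Let $\theta = (\beta^{(1)}, \ldots, \beta^{(J)}, \tau^{(1)}, \ldots, \tau^{(J)}, \lambda)$ now be the collection of parameters in the GLARMA models and the random effects parameters. The joint log-likelihood is now

$$ l(\theta) = \sum_{j=1}^{J} l_j(\beta^{(j)}, \tau^{(j)}, \lambda), \tag{3.21} $$

where

$$ l_j(\beta^{(j)}, \tau^{(j)}, \lambda) = \log \int_{\mathbb{R}^d} \exp\big(l_j(\beta^{(j)}, \tau^{(j)} \mid u)\big)\, g_U(u; \Sigma(\lambda))\, du, \tag{3.22} $$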


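and $g_U(u; \Sigma(\lambda))$ is the multivariate normal density. To proceed further, we parameterize the covariance matrix as $\Sigma = LL^{\top}$, where $L$ is lower triangular, and let $U_j = L\zeta_j$, where the $\zeta_j$ are independent $N(0, I_d)$. Let $\lambda = \mathrm{vech}(L)$ be the half-vectorisation of $L$. With this parameterization, rewrite $W_{j,t}$ in (3.5) linearly in terms of $\lambda$ as

$$ W_{j,t} = x_{j,t}^{\top}\beta^{(j)} + \mathrm{vech}\big(r_{j,t}\zeta_j^{\top}\big)^{\top}\lambda + Z_{j,t}. \tag{3.23} $$

The log-likelihood (3.22) becomes

$$ l_j(\beta^{(j)}, \tau^{(j)}, \lambda) = \log \int_{\mathbb{R}^d} \exp\big(l_j(\beta^{(j)}, \tau^{(j)}, \lambda \mid \zeta)\big)\, g(\zeta)\, d\zeta, \tag{3.24} $$

where $g(\zeta)$ is the $d$-fold product of the standard normal density and

$$ l_j(\beta^{(j)}, \tau^{(j)}, \lambda \mid \zeta) = \sum_{t=1}^{n_j} \big[ y_{j,t} W_{j,t} - a_{j,t}\, b(W_{j,t}) \big] + \sum_{t=1}^{n_j} c(y_{j,t}). $$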

Note that (3.23) is in the same form as (3.3), but the parameters $\lambda$ are treated as regression parameters for any fixed value of the vector $\zeta$ and the random effects covariates $r_{j,t}$.

The representation of the random effects covariance matrix as $\Sigma = LL^{\top}$ allows the parameter $\lambda$ to enter into the conditional log-likelihood linearly and without bounding constraints. Both properties enable existing GLARMA software to calculate the log-likelihood and derivatives with respect to the parameters. When some elements of $\Sigma$, and hence $L$, are specified as zero to reflect zero covariance between some of the random effects, $\lambda$ is the half vectorization of $L$ with the structural zeros removed. Covariance matrices in which certain combinations of random effects are specified to be zero cannot be represented in this form. However, these can often be accommodated by reordering the random effect variables and setting the appropriate elements of $L$ to zero.
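The linearity in λ rests on the identity $r^{\top} L \zeta = \mathrm{vech}(r\zeta^{\top})^{\top}\mathrm{vech}(L)$ for lower triangular $L$. The short Python sketch below checks this numerically for d = 2. All numerical values and the helper names (vech, matvec, outer) are hypothetical and chosen only for illustration; vech is taken to stack the lower triangle column by column.

```python
def vech(M):
    """Stack the lower triangle (including the diagonal) of a square matrix, column by column."""
    d = len(M)
    return [M[i][k] for k in range(d) for i in range(k, d)]

def matvec(A, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def outer(u, v):
    """Outer product u v' as a nested list."""
    return [[a * b for b in v] for a in u]

# Hypothetical values for illustration (d = 2 random effects)
L = [[0.5, 0.0],
     [0.2, 0.3]]        # lower-triangular factor, Sigma = L L'
zeta = [0.7, -1.1]      # one draw of the standardized random effects
r = [1.0, 2.5]          # random effects covariates r_{j,t}

lam = vech(L)           # lambda = vech(L), free of positivity constraints
U = matvec(L, zeta)     # U_j = L zeta_j

# Random effects contribution computed directly and via the linear-in-lambda form
direct = sum(a * b for a, b in zip(r, U))
linear = sum(a * b for a, b in zip(vech(outer(r, zeta)), lam))

# Implied covariance matrix Sigma = L L'
Sigma = [[sum(L[i][k] * L[m][k] for k in range(2)) for m in range(2)]
         for i in range(2)]

print(direct, linear, Sigma)
```

Because any real values of the elements of λ give a valid covariance matrix $LL^{\top}$, an unconstrained optimizer can work on λ directly, which is the practical point of the parameterization.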

3.6.2 Laplace Approximation and Adaptive Gaussian Quadrature

For any xed θ, computation of the log-likelihood l(θ) requires calculation of the J inte-

grals dened in (3.24). We now outline an approximate method based on the Laplace

approximation and adaptive Gaussian quadrature (AGQ). The integral in (3.24) can be

rewritten as

�

 

(θ) =

(2π)

d/2

exp

(ζ|θ)

dζ

where the exponent is considered as a function of ζ for xed parameters θ and is

dened as







(ζ|θ) =

(ζ; x

, θ)) − a

b(W

(ζ; x

, θ)) + c(y

) −

, (3.25)

t=1

n n





 





 








where

$$ W_{j,t}(\zeta; x_{j,t}, \theta) = r_{j,t}^{\top} L \zeta + x_{j,t}^{\top}\beta^{(j)} + Z_{j,t} \tag{3.26} $$

is treated as a function of $\zeta$ for $x_{j,t}^{\top}\beta^{(j)}$ fixed. To find the Laplace approximation, we expand the exponent $F_j(\zeta)$ around its modal value in a second-order Taylor series, and ignore the remainder. The resulting integral can be obtained in closed form. Note that $Z_{j,t}$ in (3.26) is a function of $\zeta$. Hence, the contributions to the first and second derivatives from the summation term in (3.25) required for the Taylor series expansion of $F_j$ need to be calculated using the GLARMA software with $\zeta$ treated as a regression parameter for covariates $L^{\top} r_{j,t}$ and fixing $x_{j,t}^{\top}\beta^{(j)}$ as the offset term. To find the modal value, we need to find $\zeta_j^{*}$ which solves

$$ \frac{\partial}{\partial \zeta} F_j(\zeta_j^{*}) = 0. $$

The Newton–Raphson method is used to find $\zeta_j^{*}$ and, at convergence, we set

$$ \Sigma_j^{*} = -\left( \frac{\partial^2}{\partial \zeta\, \partial \zeta^{\top}} F_j(\zeta_j^{*}) \right)^{-1}. $$

Since $-\frac{\partial^2}{\partial \zeta\, \partial \zeta^{\top}} F_j(\zeta_j^{*})$ is almost surely positive definite for the canonical link exponential family, the Newton–Raphson method will converge to the modal solution from any starting point; we use $\zeta^{(0)} = 0$ to initiate the recursions.

The Laplace approximation gives the approximate log-likelihood for the $j$th state as $l_j^{(1)}(\theta) = \log \det\big(\Sigma_j^{*}(\theta)\big)^{1/2} + F_j\big(\zeta_j^{*}(\theta)\big)$, which can be combined to give the overall approximate log-likelihood as

$$ l^{(1)}(\theta) = \sum_{j=1}^{J} l_j^{(1)}(\theta). \tag{3.27} $$

AGQ methods can be used to improve the approximation, as has been done successfully for likelihoods in other statistical models such as nonlinear and non-Gaussian mixed effects models. This approach is implemented in a number of widely used software systems as the default method; see Pinheiro and Bates (1995) and Pinheiro and Chao (2006) for examples. Our implementation of AGQ follows that of Pinheiro and Chao (2006). It relies on the mode $\zeta_j^{*}$ and $\Sigma_j^{*}$ used in the Laplace approximation to center and scale $Q$ quadrature points in each of $d$ coordinates, resulting in integrands evaluated at $Q^d$ points. When $Q = 1$, the Laplace approximation is obtained.
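The mechanics can be illustrated with a deliberately simplified Python sketch. It assumes a single Poisson series with one random intercept (d = 1), so that $W_t = \eta_t + \lambda\zeta$ with fixed offsets $\eta_t$, and it omits the ARMA component $Z_{j,t}$ entirely; it is not the GLARMA implementation described here. The data, the value of λ, and all function names are hypothetical. Gauss–Hermite rules for Q = 1, 3, 5 are hard-coded.

```python
import math

SQRT_PI = math.sqrt(math.pi)

# Gauss-Hermite nodes and weights for the weight function exp(-x^2)
GH = {
    1: ([0.0], [SQRT_PI]),
    3: ([-math.sqrt(1.5), 0.0, math.sqrt(1.5)],
        [SQRT_PI / 6.0, 2.0 * SQRT_PI / 3.0, SQRT_PI / 6.0]),
    5: ([-2.0201828704560856, -0.9585724646138185, 0.0,
         0.9585724646138185, 2.0201828704560856],
        [0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
         0.3936193231522412, 0.01995324205904591]),
}

def F(zeta, y, eta, lam):
    """Exponent F(zeta | theta): Poisson log density terms minus zeta^2 / 2."""
    total = -0.5 * zeta * zeta
    for yt, et in zip(y, eta):
        w = et + lam * zeta                       # W_t with Z_t omitted (assumption)
        total += yt * w - math.exp(w) - math.lgamma(yt + 1.0)
    return total

def mode_and_var(y, eta, lam, tol=1e-12, max_iter=100):
    """Newton-Raphson from zeta = 0 for the mode zeta*, and Sigma* = -1 / F''(zeta*)."""
    zeta = 0.0
    for _ in range(max_iter):
        mu_sum = sum(math.exp(et + lam * zeta) for et in eta)
        grad = lam * (sum(y) - mu_sum) - zeta
        hess = -lam * lam * mu_sum - 1.0          # always negative here
        step = grad / hess
        zeta -= step
        if abs(step) < tol:
            break
    mu_sum = sum(math.exp(et + lam * zeta) for et in eta)
    return zeta, 1.0 / (lam * lam * mu_sum + 1.0)

def laplace_loglik(y, eta, lam):
    """log det(Sigma*)^(1/2) + F(zeta*), the d = 1 Laplace approximation."""
    zeta_star, var = mode_and_var(y, eta, lam)
    return 0.5 * math.log(var) + F(zeta_star, y, eta, lam)

def agq_loglik(y, eta, lam, Q):
    """log of the AGQ approximation to (2 pi)^(-1/2) * integral of exp(F)."""
    zeta_star, var = mode_and_var(y, eta, lam)
    s = math.sqrt(2.0 * var)                      # scale for the shifted nodes
    nodes, weights = GH[Q]
    total = sum(w * math.exp(x * x + F(zeta_star + s * x, y, eta, lam))
                for x, w in zip(nodes, weights))
    return math.log(s * total / math.sqrt(2.0 * math.pi))

# Hypothetical data: counts, fixed linear predictors, random intercept s.d.
y = [2, 0, 3, 1, 4]
eta = [0.5, 0.1, 0.9, 0.3, 1.0]
lam = 0.3

print("Laplace :", laplace_loglik(y, eta, lam))
for Q in (1, 3, 5):
    print("AGQ, Q =", Q, ":", agq_loglik(y, eta, lam, Q))
```

With Q = 1 the single node sits at the mode and the quadrature sum reduces algebraically to the Laplace formula, which the sketch reproduces to rounding error.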




The AGQ approximation to the $j$th integral is denoted by $L_j^{(Q)}(\theta)$, with corresponding approximation to the overall log-likelihood as

$$ l^{(Q)}(\theta) = \sum_{j=1}^{J} \log L_j^{(Q)}(\theta). \tag{3.28} $$

Since $\zeta_j^{*}(\theta)$ and $\Sigma_j^{*}(\theta)$ are functions of the unknown parameters $\theta$, it is necessary to recompute the Laplace approximation at each iterate of $\theta$ to maximize (3.28).

Maximizing (3.28) using the optimizer optim in R proved to be very slow and unreliable for our applications. An alternative was to use Fisher scoring or Newton–Raphson updates based on numerical derivatives obtained using the R package numDeriv. This also proved to be very slow. Analytical derivatives require implicit differentiation of $\zeta_j^{*}(\theta)$ and $\Sigma_j^{*}(\theta)$, which results in complex expressions requiring substantial modification to the current GLARMA software. We next describe an alternative approach that avoids all of these issues.

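First derivatives of the log-likelihood (3.19) with respect to unknown parameters are

$$ \dot{l}(\theta) = \frac{\partial}{\partial \theta} l(\theta) = \sum_{j=1}^{J} \frac{1}{L_j(\theta)} \int_{\mathbb{R}^d} \frac{\partial}{\partial \theta} l_j(\theta \mid \zeta)\, \exp\big(l_j(\theta \mid \zeta)\big)\, g(\zeta)\, d\zeta. \tag{3.29} $$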

Second derivatives, $\ddot{l}(\theta)$, are also easy to derive and involve more integrals to be approximated. For any fixed $\zeta$, the integrands in these derivative expressions can be calculated recursively using the unpackaged form of single series GLARMA software. If $S$ denotes the number of parameters in $\theta$, then there are $J \times (1 + S + 2S(S+1)/2) = J(1+S)^2$ $d$-dimensional integrals to calculate in order to implement the Newton–Raphson method. For instance, for the final model for the BAC example (Model RE-III) with two uncorrelated random effects, we have $J = 17$ and $S = 10$, requiring calculation of 2057 integrals of dimension $d = 2$ at each step of the Newton–Raphson iterations. Fisher scoring is not available here because the summation to compute the whole likelihood is over $J$; hence, insufficient outer products of first derivative vectors would result in an ill-conditioned approximation to the second derivative matrix unless $J$ is quite large.
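The count of integrals quoted above is simple arithmetic, reproduced in the following Python sketch. The function name n_integrals is hypothetical; the breakdown per series follows the expression in the text.

```python
def n_integrals(J, S):
    """Number of d-dimensional integrals per Newton-Raphson step.

    Per series: 1 for the likelihood, S for the first derivatives, and
    2 * S * (S + 1) / 2 for the second derivatives, as counted in the text.
    """
    per_series = 1 + S + 2 * (S * (S + 1) // 2)
    assert per_series == (1 + S) ** 2   # the closed form J * (1 + S)^2
    return J * per_series

# BAC example, Model RE-III: J = 17 series, S = 10 parameters
print(n_integrals(17, 10))  # 2057
```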

In our experience, for long longitudinal data applications, the Laplace approximation can provide quite accurate single-point approximations to the integrals required for the likelihood itself. However, the first and second derivatives have integrands that are certainly not positive, nor are they unimodal, and so a single-point integral approximation is inadequate. AGQ, by contrast, can provide multipoint approximations for the integrals required for derivatives. In our experience, surprisingly few quadrature points are required to get approximations to the likelihood and the first and second derivatives which are sufficiently accurate for convergence to the optimum of the likelihood and which provide accurate standard errors for inferential purposes. We denote the estimates of $\dot{l}(\theta)$ and $\ddot{l}(\theta)$ obtained by applying AGQ with $Q$ nodes by $\dot{l}^{(Q)}(\theta)$ and $\ddot{l}^{(Q)}(\theta)$, respectively. The same quadrature points and weights that are used for $l^{(Q)}(\theta)$ are also used to obtain $\dot{l}^{(Q)}(\theta)$ and $\ddot{l}^{(Q)}(\theta)$ using one pass of the GLARMA software.
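The reuse of nodes and weights can be sketched in Python under the same simplifying assumptions as a toy random intercept Poisson model: one series, d = 1, $W_t = \eta_t + \lambda\zeta$, no ARMA component, and a five-point Gauss–Hermite rule. The data and all function names are hypothetical. At each shifted node the conditional log-likelihood and its derivative with respect to λ are evaluated once, and the same weights give both the likelihood and the score estimate as a ratio of weighted sums, in the spirit of (3.29).

```python
import math

# Five-point Gauss-Hermite rule for the weight function exp(-x^2)
X5 = [-2.0201828704560856, -0.9585724646138185, 0.0,
      0.9585724646138185, 2.0201828704560856]
W5 = [0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
      0.3936193231522412, 0.01995324205904591]

def cond_loglik(zeta, y, eta, lam):
    """l(theta | zeta) for Poisson counts with W_t = eta_t + lam * zeta."""
    return sum(yt * (et + lam * zeta) - math.exp(et + lam * zeta)
               - math.lgamma(yt + 1.0) for yt, et in zip(y, eta))

def cond_score_lam(zeta, y, eta, lam):
    """Derivative of l(theta | zeta) with respect to lam: zeta * sum(y_t - mu_t)."""
    return zeta * sum(yt - math.exp(et + lam * zeta) for yt, et in zip(y, eta))

def mode_and_var(y, eta, lam):
    """Newton-Raphson mode of F = l(theta | zeta) - zeta^2 / 2, and -1 / F'' there."""
    zeta = 0.0
    for _ in range(100):
        mu_sum = sum(math.exp(et + lam * zeta) for et in eta)
        grad = lam * (sum(y) - mu_sum) - zeta
        hess = -lam * lam * mu_sum - 1.0
        step = grad / hess
        zeta -= step
        if abs(step) < 1e-12:
            break
    mu_sum = sum(math.exp(et + lam * zeta) for et in eta)
    return zeta, 1.0 / (lam * lam * mu_sum + 1.0)

def agq_loglik_and_score(y, eta, lam):
    """One pass over shared nodes: log-likelihood and score estimates together."""
    zeta_star, var = mode_and_var(y, eta, lam)
    s = math.sqrt(2.0 * var)
    num = den = 0.0
    for x, w in zip(X5, W5):
        z = zeta_star + s * x
        # Shared weight: exp(l(theta | z)) * g(z), rescaled for the shifted rule
        v = (s * w * math.exp(x * x + cond_loglik(z, y, eta, lam) - 0.5 * z * z)
             / math.sqrt(2.0 * math.pi))
        den += v                                   # contributes to L^(Q)
        num += v * cond_score_lam(z, y, eta, lam)  # contributes to the score integral
    return math.log(den), num / den

# Hypothetical data: counts, fixed linear predictors, random intercept s.d.
y = [2, 0, 3, 1, 4]
eta = [0.5, 0.1, 0.9, 0.3, 1.0]
lam = 0.3

loglik_Q, score_Q = agq_loglik_and_score(y, eta, lam)
print(loglik_Q, score_Q)
```

Second derivative integrands could be accumulated in the same loop with further weighted sums, so the number of conditional likelihood evaluations stays at the number of nodes.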
