71 Generalized Linear Autoregressive Moving Average Models
Summary of algorithm to approximate likelihood and derivatives:
1. Initialize parameter value $\theta^{(k)}$.
2. For each j = 1, ..., J:
a. Use GLARMA software, treating $\zeta$ as a parameter, to find the derivatives needed
to find the J Laplace approximations, each given by a mode $\zeta_j$ and an associated scale matrix.
b. Select Q quadrature points in each of d directions, then relocate and scale these using
$\zeta_j$ and its associated scale matrix.
c. Apply GLARMA software to calculate the integrands at the $Q^d$ integrating
points in order to estimate the likelihood and first and second derivatives
at $\theta^{(k)}$.
3. Assemble the complete likelihood and derivatives over the J cases.
4. Use Newton–Raphson iteration to update $\theta^{(k)} \to \theta^{(k+1)}$. Repeat from Step 2 until
convergence.
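The relocate-and-scale idea in Steps 2a to 2c can be sketched on a much simpler problem than the one treated in this chapter. The example below is an illustrative assumption, not the GLARMA implementation: it uses a single case with a Poisson response, one normal random intercept (d = 1), no serial dependence, and hypothetical names (`log_integrand`, `laplace_mode_and_scale`, `agq_likelihood`) and data. It approximates only the likelihood contribution, not its derivatives.

```python
import math

# Gauss-Hermite rules for the weight exp(-x^2): (nodes, weights) for Q = 3 and Q = 5.
SQRT_PI = math.sqrt(math.pi)
GH_RULES = {
    3: ([-math.sqrt(1.5), 0.0, math.sqrt(1.5)],
        [SQRT_PI / 6.0, 2.0 * SQRT_PI / 3.0, SQRT_PI / 6.0]),
    5: ([-2.020182870456086, -0.958572464613819, 0.0,
         0.958572464613819, 2.020182870456086],
        [0.019953242059046, 0.393619323152241, 0.945308720482942,
         0.393619323152241, 0.019953242059046]),
}


def log_integrand(zeta, y, beta, sigma):
    """log of p(y | zeta) * N(zeta; 0, sigma^2) for a Poisson random intercept."""
    eta = beta + zeta
    loglik = sum(yt * eta - math.exp(eta) - math.lgamma(yt + 1) for yt in y)
    logprior = -0.5 * zeta ** 2 / sigma ** 2 - 0.5 * math.log(2 * math.pi * sigma ** 2)
    return loglik + logprior


def laplace_mode_and_scale(y, beta, sigma, tol=1e-12, max_iter=100):
    """Newton iterations for the mode of the log integrand, plus a curvature-based scale."""
    zeta = 0.0
    for _ in range(max_iter):
        mu_sum = len(y) * math.exp(beta + zeta)
        grad = sum(y) - mu_sum - zeta / sigma ** 2
        hess = -mu_sum - 1.0 / sigma ** 2  # always negative: the log integrand is concave
        step = grad / hess
        zeta -= step
        if abs(step) < tol:
            break
    hess = -len(y) * math.exp(beta + zeta) - 1.0 / sigma ** 2
    return zeta, 1.0 / math.sqrt(-hess)


def agq_likelihood(y, beta, sigma, Q=3):
    """Adaptive Gauss-Hermite approximation to the integral of exp(log_integrand)."""
    mode, scale = laplace_mode_and_scale(y, beta, sigma)
    nodes, weights = GH_RULES[Q]
    total = 0.0
    for x, w in zip(nodes, weights):
        zeta = mode + math.sqrt(2.0) * scale * x  # relocate and scale the node
        total += w * math.exp(x * x + log_integrand(zeta, y, beta, sigma))
    return math.sqrt(2.0) * scale * total


y_j = [2, 0, 3, 1, 4]  # hypothetical counts for one case
for Q in (3, 5):
    print(Q, agq_likelihood(y_j, beta=0.5, sigma=0.7, Q=Q))
```

Because the nodes are centred at the mode and scaled by the curvature, the rule is exact when the integrand is Gaussian, which is one way to see why a small Q can be adequate when the integrand is close to Gaussian.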
3.6.3 Application to Road Deaths Data
We used Model FE-III (central columns of Table 3.2) as the starting point for the random
effects analysis. The correlation between the 17 pairs of intercepts and coefficients of the
logOMVD variables was $\mathrm{Corr}(\hat{\beta}_{0,j}, \hat{\beta}_{4,j}) = 0.531$, which is significantly different from zero.
As a result, we used a correlated bivariate random effect for the intercept and this regressor.
The results of fitting this model (Model RE-I) are given in Table 3.3. The point estimate
of the parameter controlling the correlation between these two terms was not significant
($\hat{L}_{12} = 0.101 \pm 0.107$) and so we removed the correlation between the random effects to
arrive at Model RE-II. The likelihood ratio test (see Table 3.3) confirmed that these were
not correlated. The random effect variance for the logOMVD regressor is marginally significant
compared to its standard error. Use of a likelihood ratio test does not reject this
simplification, even after adjusting for the fact that this is a test on the boundary using
the standard methodology for mixed effects variance testing as described in Fitzmaurice
et al. (2012). Similar to the fixed effects analysis, Groups 1 and 2 showed no significant
autocorrelation; however, the large autocorrelation for the sixth group was no longer significant.
Since there is clearly potential in these models for serially correlated effects to
interact or trade off with regression random effects, our next step was to refit the model
after removing autocorrelation terms for Groups 1, 2, and 6. This resulted in Model RE-III,
and the likelihood ratio test confirms that autoregressive terms are not required for
these three Groups. The final states for which significant autocorrelation is required are
states 2, 4, 5, and 8 (corresponding to California, Florida, Idaho, and Maine). We next
compared this model with the purely random effects model (the analogue of what was fit in
Bernat et al., 2004), which is labeled Model RE-IV. The likelihood ratio test overwhelmingly
rejects the hypothesis that autocorrelation terms can be removed from the model for these
four states.

TABLE 3.3
Results for testing various random effects GLARMA models for the Road Deaths Series
Model    $\phi_{12}$ Groups   Random Effects    $-2\log L$   S    $G^2$                                     d.f.   p-value
RE-I     6 levels             2 correlated      5500.85      14
RE-II    6 levels             2 uncorrelated    5501.87      13   $G^2_{\mathrm{II\,v\,I}} = 0.98$          1      0.32
RE-III   3 levels             2 uncorrelated    5505.08      10   $G^2_{\mathrm{III\,v\,II}} = 3.21$        3      0.36
RE-IV    No $\phi_{12}$       2 uncorrelated    5552.17      7    $G^2_{\mathrm{IV\,v\,III}} = 47.09$       3      $3 \times 10^{-10}$

72 Handbook of Discrete-Valued Time Series
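The p-values for the likelihood ratio tests in Table 3.3 can be checked from the reported $G^2$ statistics and degrees of freedom. The sketch below is illustrative only: `chi2_sf` is a hypothetical helper that uses closed-form chi-square tail probabilities for 1 and 3 degrees of freedom, and it takes the $G^2$ values as printed in the table rather than recomputing them from the likelihoods.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (closed forms for df = 1 and 3 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 3:
        return (math.erfc(math.sqrt(x / 2.0))
                + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))
    raise ValueError("only df = 1 and df = 3 are implemented in this sketch")


# (comparison, G^2, d.f.) as reported in Table 3.3
tests = [("II v I", 0.98, 1), ("III v II", 3.21, 3), ("IV v III", 47.09, 3)]
for label, g2, df in tests:
    print(f"{label}: G2 = {g2}, d.f. = {df}, p = {chi2_sf(g2, df):.2g}")
```

The first two comparisons give p-values of about 0.32 and 0.36, and the third a value of order $10^{-10}$, in line with the table. No boundary adjustment is applied here.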
Did our inclusion of autoregressive terms impact the original conclusions of Bernat et al.
(2004) concerning the strength and significance of the BAC intervention on single-vehicle
nighttime road deaths? They obtained $\hat{\beta}_1 = -0.052 \pm 0.021$ (p-value = 0.013). In the
analysis presented here, we have removed the seasonal offset terms from the response
distribution and the logOMVD control variable for reasons discussed earlier. The analogous
result is Model RE-IV, and for that model $\hat{\beta} = -0.054 \pm 0.022$ (p-value = 0.013), a very
similar finding to that in Bernat et al. (2004).
We report estimates for all parameters of Model RE-III as the final column pair in
Table 3.2. Inclusion of significant serial dependence terms where needed has actually
increased the size of the point estimate of the BAC effect relative to Model RE-IV (the no serial
dependence model) and left the standard error unchanged, resulting in a reduced p-value
of 0.006 for this term and hence suggesting that the original finding of the significance of
the BAC association may have been conservative.
3.6.4 Computational Speed and Accuracy
3.6.4.1 Impact of Increasing the Number of Quadrature Points
In several applications of the earlier method, we have found that usually Q = 3, 5, or
7 quadrature points in each of the d random effects coordinates are sufficient for use in
optimizing the likelihood and for obtaining accurate standard errors, likelihood values,
and likelihood ratio statistics.
3.6.4.2 Comparison with Numerical Derivatives
For one iteration of the Newton–Raphson procedure, near convergence to the maximum
likelihood, using first and second numerical derivatives calculated using the numDeriv
package takes of the order of 300 times longer than using the AGQ method for calculating
derivatives that we propose. Use of optim for convergence and evaluation of the Hessian
for standard error calculations was similarly slow. There was no substantial loss of
accuracy either for convergence of the maximum likelihood updates or in the standard
errors.
In summary, the AGQ method proposed here to calculate derivatives of the log-likelihood
is two orders of magnitude faster than using optim without derivatives or
numerical derivatives based on numDeriv, with no substantial loss of accuracy even for
Q = 3. Use of the AGQ method makes it feasible to fit combined GLARMA random
effects models for long longitudinal data. We have experienced similar comparisons
of speed and accuracy in more complex settings, such as the analysis of 32 binary time
series of length 393 arising in the musicology study discussed earlier, in which up to four
random effects were included in the models, and another study of 49 time series of length
336 of suicide counts, for which a Poisson response was appropriate and up to three random
effects were included in the model.
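One reason numerical derivatives are costly is that every entry of a finite-difference gradient or Hessian needs several full evaluations of the likelihood. The sketch below illustrates this on a toy Poisson regression with hypothetical data. It uses plain central differences, not the scheme implemented in numDeriv, and it is not the AGQ derivative calculation proposed in this chapter. It only counts likelihood evaluations and checks agreement with analytic derivatives.

```python
import math

# Hypothetical toy data: five Poisson counts with an intercept and one regressor.
X = [(1.0, 0.0), (1.0, 0.5), (1.0, 1.0), (1.0, 1.5), (1.0, 2.0)]
Y = [1, 2, 2, 4, 7]


def loglik(beta):
    """Poisson log-likelihood up to an additive constant."""
    total = 0.0
    for x, y in zip(X, Y):
        eta = sum(b * xi for b, xi in zip(beta, x))
        total += y * eta - math.exp(eta)
    return total


def analytic_derivatives(beta):
    """Exact gradient and Hessian of loglik, accumulated in one pass over the data."""
    p = len(beta)
    grad = [0.0] * p
    hess = [[0.0] * p for _ in range(p)]
    for x, y in zip(X, Y):
        mu = math.exp(sum(b * xi for b, xi in zip(beta, x)))
        for a in range(p):
            grad[a] += (y - mu) * x[a]
            for b in range(p):
                hess[a][b] -= mu * x[a] * x[b]
    return grad, hess


def numeric_derivatives(f, beta, h=1e-4):
    """Central-difference gradient and Hessian; also returns the number of calls to f."""
    p = len(beta)
    calls = 0

    def f_shift(shifts):
        nonlocal calls
        calls += 1
        b = list(beta)
        for i, d in shifts:
            b[i] += d
        return f(b)

    grad = [(f_shift([(a, h)]) - f_shift([(a, -h)])) / (2 * h) for a in range(p)]
    hess = [[0.0] * p for _ in range(p)]
    for a in range(p):
        for b in range(a, p):
            second = (f_shift([(a, h), (b, h)]) - f_shift([(a, h), (b, -h)])
                      - f_shift([(a, -h), (b, h)]) + f_shift([(a, -h), (b, -h)])) / (4 * h * h)
            hess[a][b] = hess[b][a] = second
    return grad, hess, calls


beta0 = [0.2, 0.8]
grad_a, hess_a = analytic_derivatives(beta0)
grad_n, hess_n, n_calls = numeric_derivatives(loglik, beta0)
print("likelihood evaluations for numerical derivatives:", n_calls)
```

With p parameters this simple scheme needs 2p + 2p(p + 1) likelihood evaluations per iteration, and in the random effects setting each evaluation itself involves J integrals, so the cost grows quickly with the number of parameters.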
3.7 Interpretation of the Serially Dependent Random Effects Models
For single series GLARMA models, means, variances, and serial covariances for the state
process $\{W_t\}$ can be readily derived using the definition of $Z_t$ in (3.4). For the Poisson or
negative binomial response GLARMA plus random effects model, the marginal interpretation
of the fixed effects coefficients is approximately equal to the conditional interpretation
since

$$E(Y_{jt}) \approx \exp\left(x_{0,t}^{T}\beta^{(0)} + x_{j,t}^{T}\beta^{(j)}\right)\exp\left(\frac{\sigma_U^2}{2} + \frac{\gamma^2}{2}\right)$$

by simple extension of the argument in Davis et al. (2003).
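The role of the random effect variance in this expression can be seen from the moment generating function of a normal variable. As a simplified illustration (an assumption made here, not the full argument of Davis et al., 2003), take a single normal random intercept $U_j \sim N(0, \sigma_U^2)$ added to an otherwise fixed linear predictor $\eta_{jt}$ with a log link:

$$E\left[\exp(\eta_{jt} + U_j)\right] = \exp(\eta_{jt})\,E\left[\exp(U_j)\right] = \exp(\eta_{jt})\exp\left(\frac{\sigma_U^2}{2}\right)$$

The multiplicative factor does not involve the regressors, so it changes only the effective intercept and leaves the regression coefficients with the same interpretation marginally as conditionally.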
For binomial and Bernoulli responses, calculation of means, variances, and autocovariances
for the response series and interpretation of regression coefficients are not straightforward.
This is a typical issue for interpretation of random effects models and transition models in
the binomial or Bernoulli case; see Diggle et al. (2002), for example.
3.8 Conclusions and Future Directions
This chapter has reviewed the fitting of GLARMA models for single time series of exponential
family response distributions and illustrated this on some binary and binomial series
arising in a study of listener responses to music features. Extensions and utilization of single
series GLARMA modeling ideas and software to the long longitudinal data setting were
explained. Two approaches to providing a combined analysis of all series in a panel of
responses were considered. The first approach was based on a constrained fit across the
panel using single series GLARMA software and allowed testing of parameter similarity in
each series across the panel of series. The second approach modeled between-series parameter
variation using random effects. Again, single series GLARMA software can be used to
compute a modal approximation to the integrals that constitute the likelihood when there
are random effects. This modal approximation is based on the Laplace approximation, which
can also be calculated using the GLARMA single series software. The use of AGQ
to compute the very large number of integrals required for the first and second
derivatives of the combined likelihood was explained and, somewhat surprisingly, these can
be accurately and speedily computed using a small number of quadrature points, hence
making the optimization of the likelihood based on Newton–Raphson iteration feasible in
practice. This method is several hundred times faster per iteration than methods based on
numerical derivatives or standard optimizers. We illustrated the multiple
independent time series approaches on a set of long longitudinal data arising in the
study of the association between lowering the legal blood alcohol level in drivers and road
deaths in 17 U.S. states.
The approach taken here allows considerable flexibility in the specification of the serial
dependence in the individual time series. The examples presented here clearly require that
flexibility. We know of no other current methods that have this flexibility; those that we
have reviewed appear to continue assuming short longitudinal trajectories, in which case
it is difficult to allow flexibility in serial dependence specifications. For long longitudinal
data, this restriction can be avoided, as we have demonstrated in the examples presented
herein.
We have also extended the random effects approach to parameter-driven models for
individual time series serial dependence, again based on the use of a Laplace approxima-
tion as in Davis and Rodriguez-Yam (2005) and AGQ for the random effect integration.
We aim to extend GLARMA random effects models covered in this chapter and in the
parameter-driven version to allow for random effects on serial dependence parameters.
This seems to us a natural extension within the modeling perspective of random effects for
between series parameter variation.
There is an obvious need for rigorous asymptotic theory to properly justify the statisti-
cal inference that we have presented based on these new models. However, this will rely
on similar theory being developed for individual observation-driven time series models,
something that remains underdeveloped.
Finally, we have not discussed forecasting for GLARMA models in this review nor the
somewhat related issue of missing data. GLARMA models, requiring recursive calculation
of the state equation, are computationally intensive to forecast beyond one or two time
points. This has implications for the computation of the likelihood when there are missing
data, as it requires a conditional distribution of the response after gaps.
Acknowledgments
The Music Listener data used in Section 3.3 was provided by Roger Dean of the MARCS
Institute at the University of Western Sydney and the U.S. Road Deaths data used in
Sections 3.5.2 and 3.6.3 was that used in Bernat et al. (2004) and its source is acknowledged
there. I would also like to thank Chris McKendry, my previous Honour’s degree student,
for his assistance in developing the adaptive Gaussian quadrature multiple independent
random effects approach discussed in this chapter. Finally, helpful comments from a referee
improved the clarity of the presentation.
References
Benjamin, M. A., Rigby, R. A., and Stasinopoulos, D. M. (2003). Generalized autoregressive moving
average models. Journal of the American Statistical Association, 98(461):214–223.
Bernat, D. H., Dunsmuir, W., Wagenaar, A. C. et al. (2004). Effects of lowering the legal BAC to 0.08 on
single-vehicle-nighttime fatal traffic crashes in 19 jurisdictions. Accident Analysis and Prevention,
36(6):1089.
Buckley, D. and Bulger, D. (2012). Trends and weekly and seasonal cycles in the rate of errors in the
clinical management of hospitalized patients. Chronobiology International, 29(7):947–954.
Creal, D., Koopman, S. J., and Lucas, A. (2008). A general framework for observation driven time-
varying parameter models. Technical report, Tinbergen Institute Discussion Paper, Amsterdam,
the Netherlands.
Davies, R. B. (1987). Hypothesis testing when a nuisance parameter is present only under the
alternative. Biometrika, 74(1):33–43.
Davis, R. A., Dunsmuir, W., and Wang, Y. (1999). Modeling time series of count data. Statistics
TextBooks and Monographs, 158:63–114.
Davis, R. A., Dunsmuir, W. T., and Streett, S. B. (2003). Observation-driven models for Poisson counts.
Biometrika, 90(4):777–790.
Davis, R. A., Dunsmuir, W. T., and Streett, S. B. (2005). Maximum likelihood estimation for an
observation driven model for Poisson counts. Methodology and Computing in Applied Probability,
7(2):149–159.
Davis, R. A., Dunsmuir, W. T., and Wang, Y. (2000). On autocorrelation in a Poisson regression model.
Biometrika, 87(3):491–505.
Davis, R. A. and Liu, H. (2015). Theory and inference for a class of observation-driven models with
application to time series of counts. Statistica Sinica. doi:10.5705/ss.2014.145t (to appear).
Davis, R. A. and Rodriguez-Yam, G. (2005). Estimation for state-space models based on a likelihood
approximation. Statistica Sinica, 15(2):381–406.
Davis, R. A. and Wu, R. (2009). A negative binomial model for time series of counts. Biometrika,
96(3):735–749.
Dean, R. T., Bailes, F., and Dunsmuir, W. T. (2014a). Shared and distinct mechanisms of individual
and expertise-group perception of expressed arousal in four works. Journal of Mathematics and
Music, 8(3):207–223.
Dean, R. T., Bailes, F., and Dunsmuir, W. T. (2014b). Time series analysis of real-time music perception:
Approaches to the assessment of individual and expertise differences in perception of expressed
affect. Journal of Mathematics and Music, 8(3):183–205.
Diggle, P., Heagerty, P., Liang, K.-Y., and Zeger, S. (2002). Analysis of Longitudinal Data, 2nd edn.
Oxford University Press, Oxford, U.K.
Dunsmuir, W. T., Li, C., and Scott, D. J. (2014). Glarma: Generalized Linear Autoregressive Moving Average
Models. R package version 1.3-0. http://CRAN.Rproject.org/package=glarma
Dunsmuir, W. T. and Scott, D. J. (2015). The glarma package for observation driven time series
regression of counts. Journal of Statistical Software (to appear).
Dunsmuir, W. T. M., Leung, J., and Liu, X. (2004). Extensions of observation driven models for time
series of counts. In Proceedings of the International Sri Lankan Statistical Conference: Visions of
Futuristic Methodologies, eds B. M. de Silva and N. Mukhopadhyay, RMIT University and University
of Peradeniya, Peradeniya, Sri Lanka.
Dunsmuir, W. T. M., Tran, C., and Weatherburn, D. (2008). Assessing the Impact of Mandatory DNA
Testing of Prison Inmates in NSW on Clearance, Charge and Conviction Rates for Selected Crime Cate-
gories. NSW Bureau of Crime Statistics and Research. http://www.bocsar.nsw.gov.au/lawlink/
bocsar/ll_bocsar.nsf/pages/bocsar_pub_legislative.
Etting, S. F. and Isbell, L. A. (2014). Rhesus macaques (macaca mulatta) use posture to assess level of
threat from snakes. Ethology, 120(12):1177–1184.
Fitzmaurice, G. M., Laird, N. M., and Ware, J. H. (2012). Applied Longitudinal Analysis, vol. 998. John
Wiley & Sons, Hoboken, New Jersey.
Hansen, B. E. (1996). Inference when a nuisance parameter is not identified under the null hypothesis.
Econometrica: Journal of the Econometric Society, 64(2):413–430.
Kedem, B. and Fokianos, K. (2002). Regression Models for Time Series Analysis. John Wiley & Sons,
Hoboken, New Jersey.
Liesenfeld, R., Nolte, I., and Pohlmeier, W. (2006). Modelling financial transaction price movements:
a dynamic integer count data model. Empirical Economics, 30(4):795–825.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models (Monographs on Statistics and Applied
Probability 37). Chapman & Hall, London, U.K.
Pinheiro, J. C. and Bates, D. M. (1995). Approximations to the log-likelihood function in
the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics, 4(1):
12–35.