State Space Models for Count Time Series
typically resorts to numerical optimization of an approximation to L(θ). These meth-
ods often do not rely on derivative information. If a gradient or Hessian is required,
then another d or d(d + 1)/2 integrals need to be approximated, where d = dim(θ).
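To make the extra cost of derivative information concrete, the sketch below approximates a gradient by central differences, so each of the d components of θ requires two further evaluations of the (already approximate) likelihood. The model and the `approx_loglik` function are hypothetical stand-ins for illustration, not a method from the text:

```python
import numpy as np

def approx_loglik(theta, y):
    # Hypothetical stand-in for an approximated log-likelihood L(theta);
    # a one-parameter Poisson model with log-mean theta[0] keeps the
    # sketch self-contained and runnable.
    lam = np.exp(theta[0])
    return np.sum(-lam + y * np.log(lam))

def numerical_gradient(loglik, theta, y, h=1e-5):
    # Central differences: each of the d = dim(theta) components costs
    # two additional likelihood evaluations.
    d = len(theta)
    grad = np.zeros(d)
    for i in range(d):
        e = np.zeros(d)
        e[i] = h
        grad[i] = (loglik(theta + e, y) - loglik(theta - e, y)) / (2 * h)
    return grad

y = np.array([2, 0, 3, 1, 4])
g = numerical_gradient(approx_loglik, np.array([0.5]), y)
```

When each likelihood evaluation itself hides an n-dimensional integral, these 2d extra evaluations (or d(d + 1)/2 for a Hessian) dominate the cost of optimization.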
Computing standard errors: Once an approximation θ̂ to the MLE has been produced,
standard errors are required for inference purposes. This is especially important
in the model fitting stage when perhaps a number of covariates are being considered
for inclusion in the model. Because of the large dimensional integrals defining
the likelihood, approaches using approximations to the Hessian or scoring algo-
rithms are problematic. Often one resorts to numerical approximations to derivatives
of the approximating likelihood evaluated at the estimated value. However, these
estimates can be quite variable and numerically sensitive to the choice of tuning
parameters in the numerical algorithms. Bootstrap methods, in which each bootstrap
replicate would require its own n-dimensional integral to be computed, are one
possible workaround for computing more reliable standard errors.
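The sensitivity to tuning parameters is easy to see in a sketch: standard errors from a finite-difference Hessian inherit the choice of step size h. The model below is an assumed one-parameter toy, not one from the text; it is chosen so that the exact observed information is known and the numerical answer can be checked:

```python
import numpy as np

def loglik(theta, y):
    # Hypothetical approximated log-likelihood; a Poisson model with
    # log-mean theta[0] keeps the sketch self-contained.
    lam = np.exp(theta[0])
    return np.sum(-lam + y * np.log(lam))

def numerical_hessian(f, theta, y, h):
    # Second-order central differences; the result (and hence the
    # standard errors) depends on the tuning parameter h.
    d = len(theta)
    H = np.zeros((d, d))
    for i in range(d):
        ei = np.zeros(d)
        ei[i] = h
        for j in range(d):
            ej = np.zeros(d)
            ej[j] = h
            H[i, j] = (f(theta + ei + ej, y) - f(theta + ei - ej, y)
                       - f(theta - ei + ej, y) + f(theta - ei - ej, y)) / (4 * h * h)
    return H

y = np.random.default_rng(1).poisson(2.0, size=200)
theta_hat = np.array([np.log(y.mean())])   # exact MLE in this toy model
ses = {}
for h in (1e-2, 1e-4):
    info = -numerical_hessian(loglik, theta_hat, y, h)
    ses[h] = float(np.sqrt(np.linalg.inv(info))[0, 0])
```

Here both step sizes recover the known standard error 1/√(Σ y_t), but when `loglik` is itself a noisy Monte Carlo approximation, different h values can give visibly different answers.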
Asymptotic theory: There are currently no proofs that the MLE θ̂ is consistent or
asymptotically normal. One would certainly expect these properties to hold, but since the
form of the likelihood is rather intractable, the arguments are not standard adap-
tations of existing proofs. Nonetheless, it is important to have a complete theory
worked out for maximum likelihood estimation in these models in order to ensure
that inferences about the parameters are justifiable.
As a result of these practical concerns, a large variety of methods have been proposed to
approximate the likelihood and, to a lesser extent, derivatives of these approximations. The
main approaches in the literature are approximations to the integrand in (6.9), which can
be integrated in closed form to get an approximation to the likelihood. Improvements to
these approximations are typically based on Monte Carlo methods and importance sam-
pling from approximating importance densities. Quasi Monte Carlo (QMC) methods based
on randomly shifted deterministic lattices of points in R^n have also been applied in recent
years. One recent attempt is given in Sinescu et al. (2014) where a Poisson model with a
constant mean plus AR(1) state process is considered. While QMC methods hold promise,
further development is required before they become competitive with other methods
reviewed here.
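The basic Monte Carlo idea behind these approximations can be sketched for an assumed toy model (a Poisson observation with a stationary Gaussian AR(1) log-intensity; the model, parameter values, and sample size below are illustrative, not from the text). The naive estimator samples latent paths from the state prior; the importance-sampling refinements discussed above replace the prior by an approximating density closer to p(α | y) and reweight accordingly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model: Poisson counts y_t with log-intensity alpha_t
# following a stationary Gaussian AR(1) state process.
phi, sigma, n = 0.7, 0.5, 10
alpha_true = np.zeros(n)
for t in range(1, n):
    alpha_true[t] = phi * alpha_true[t - 1] + sigma * rng.normal()
y = rng.poisson(np.exp(alpha_true))

def simulate_prior(m):
    # Draw m latent paths from the AR(1) state prior in R^n.
    a = np.empty((m, n))
    a[:, 0] = sigma / np.sqrt(1 - phi**2) * rng.normal(size=m)
    for t in range(1, n):
        a[:, t] = phi * a[:, t - 1] + sigma * rng.normal(size=m)
    return a

# Naive Monte Carlo estimate of the n-dimensional likelihood integral
# L = E_prior[p(y | alpha)]; importance sampling improves on this by
# sampling from a density closer to p(alpha | y) and reweighting.
m = 20000
paths = simulate_prior(m)
logw = (y * paths - np.exp(paths)).sum(axis=1)  # log p(y|alpha) up to constants
L_hat = np.exp(logw).mean()
```

Even at n = 10 the prior-based weights are highly variable; as n grows this variance explodes, which is precisely why the approximating importance densities and QMC constructions above were developed.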
6.2.3 Earlier Monte Carlo and Approximate Methods
Nelson and Leroux (2006) review various methods in this class for estimating the likelihood
function, including the Monte Carlo expectation method first used for the Poisson response
AR(1) model in Chan and Ledolter (1995) based on Gibbs sampling for the E-step, a
version due to McCulloch based on Metropolis–Hastings sampling, Monte Carlo Newton–
Raphson (Kuk and Cheng, 1999) and a modified version of the iterative bias correction
method of Kuk (1995). They compare performance of these methods with the original esti-
mating equations approach of Zeger (1988) and the PQL approach (based on a Laplace
approximation to the likelihood) of Breslow and Clayton (1993) using simulations and by
application to the Polio data in Zeger (1988). Davis and Rodriguez-Yam (2005) also review
the Monte Carlo Expectation Maximization and Monte Carlo Newton–Raphson methods.
Apart from the Bayesian method, all the methods reviewed by Nelson and Leroux pro-
vide, as a byproduct of the parameter estimation, estimated covariance matrices under
the assumption that the estimates satisfy a central limit theorem.
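The structure of a Monte Carlo EM iteration can be sketched in a deliberately simplified setting: independent Gaussian random effects rather than the AR(1) state of Chan and Ledolter (1995), so the E-step factorizes over t, and random-walk Metropolis–Hastings rather than Gibbs sampling. All model choices and parameter values below are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simplified, assumed setup: Poisson counts with mean exp(beta + alpha_t)
# and independent alpha_t ~ N(0, sigma2), so the E-step factorizes over t.
n, beta_true, sigma2 = 200, 0.5, 0.25
alpha = rng.normal(0.0, np.sqrt(sigma2), n)
y = rng.poisson(np.exp(beta_true + alpha))

def e_step(beta, m=200, steps=50, scale=0.5):
    # Random-walk Metropolis-Hastings draws from p(alpha_t | y_t, beta),
    # run as m parallel chains per observation.
    def logpost(a):
        return y * (beta + a) - np.exp(beta + a) - a**2 / (2 * sigma2)
    a = np.zeros((m, n))
    for _ in range(steps):
        prop = a + scale * rng.normal(size=(m, n))
        accept = np.log(rng.uniform(size=(m, n))) < logpost(prop) - logpost(a)
        a = np.where(accept, prop, a)
    return a

beta = 0.0
for _ in range(20):
    a = e_step(beta)   # Monte Carlo E-step
    # Closed-form M-step for beta: solve sum(y) = e^beta * sum_t E[e^{alpha_t}]
    beta = np.log(y.sum() / np.exp(a).mean(axis=0).sum())
```

In the serially dependent case the E-step no longer factorizes, which is where the Gibbs and Metropolis–Hastings samplers over the full latent path, and the Newton–Raphson and bias-correction variants reviewed above, come in.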