56 Handbook of Discrete-Valued Time Series
for serial dependence. To illustrate, consider the case where no serial dependence exists
but $p = q > 0$ is specified. Then the likelihood iterations are unlikely to converge because
the likelihood surface will be “ridge-like” on the manifold where $\phi_j = -\theta_j$, an issue that is
encountered for standard ARMA model fitting. Corresponding to this, the second derivative
matrix $D_{NR}(\delta)$ will be singular or the state variable $W_t$ can degenerate or diverge.
Because of this possibility, it is prudent to start with low orders for p and q and avoid
specifying them as equal. Once stability of estimation is reached for a lower-order specification,
increasing the values of p or q could be attempted.
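
The ridge can be seen numerically in a small sketch. It assumes an intercept-only Poisson GLARMA model with p = q = 1, Pearson residuals, and a state recursion of the form Z_t = φ(Z_{t-1} + e_{t-1}) + θ e_{t-1} started from zero; that recursion is restated here as an assumption, and the function name and toy counts are hypothetical. This is plain Python for illustration, not the glarma package's interface.

```python
import math

def glarma_poisson_loglik(y, beta0, phi, theta):
    """Poisson log-likelihood, up to the constant -sum(log(y_t!)), for an
    intercept-only GLARMA(1, 1) sketch with Pearson residuals.

    Assumed recursion (not taken from this passage):
        Z_t  = phi * (Z_{t-1} + e_{t-1}) + theta * e_{t-1}
        W_t  = beta0 + Z_t
        mu_t = exp(W_t)
        e_t  = (y_t - mu_t) / sqrt(mu_t)
    with Z_0 = e_0 = 0.
    """
    z_prev, e_prev = 0.0, 0.0
    loglik = 0.0
    for y_t in y:
        z_t = phi * (z_prev + e_prev) + theta * e_prev
        w_t = beta0 + z_t
        mu_t = math.exp(w_t)
        loglik += y_t * w_t - mu_t
        e_prev = (y_t - mu_t) / math.sqrt(mu_t)
        z_prev = z_t
    return loglik

# Arbitrary toy counts, used only to exercise the function.
y = [2, 0, 3, 1, 4, 2, 1, 0, 2, 3, 5, 1, 2, 2, 0, 1, 3, 2, 4, 1]
beta0 = math.log(sum(y) / len(y))

# Along theta = -phi the state Z_t stays at zero, so the likelihood is flat in phi.
ridge = [glarma_poisson_loglik(y, beta0, phi, -phi)
         for phi in (-0.8, -0.3, 0.0, 0.4, 0.9)]

# Away from the ridge the likelihood does depend on (phi, theta).
off_ridge = glarma_poisson_loglik(y, beta0, 0.2, 0.1)

print("spread along ridge:", max(ridge) - min(ridge))
print("ridge vs off-ridge:", ridge[0], off_ridge)
```

Under these assumptions, every point with θ = −φ gives the same likelihood value as the model with no serial dependence, so an optimizer has no information with which to choose φ. This is the flat direction that makes the second derivative matrix singular.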
The likelihood ratio test that there is no serial dependence versus the alternative
that there is GLARMA-like serial dependence with $p = q > 0$ will not have a standard
chi-squared distribution because the parameters $\phi_j$, for $j = 1, \ldots, p$, are nuisance parameters
which cannot be estimated under the null hypothesis. Testing methods such as those
proposed by Hansen (1996) or Davies (1987) need to be developed for this situation. Further
details on these points can be found in Dunsmuir and Scott (2015).
3.2.3 Distribution Theory for Likelihood Estimation
The consistency and asymptotic distribution of the maximum likelihood estimate $\hat{\delta}$ have been
rigorously established only in a limited number of special cases. In the stationary Poisson
response case in Davis et al. (2003) where $x_t^T \equiv 1$ (intercept only) and $p = 0$ and $q = 1$,
these results have been proved rigorously. Similarly, for simple models in the Bernoulli
stationary case in Streett (2000) these results hold. Simulation results are also reported in
Davis et al. (1999, 2003) for nonstationary Poisson models. Other simulations not reported
in the literature support the supposition that $\hat{\delta}$ has a multivariate normal distribution for
large samples for a range of regression designs and for the various response distributions
considered here.
For inference in the GLARMA model, it is assumed that the central limit theorem holds
so that

$$\hat{\delta} \overset{d}{\approx} N(\delta, \hat{\Omega}), \tag{3.11}$$

where the approximate covariance matrix is estimated by $\hat{\Omega} = -D_{NR}(\hat{\delta})^{-1}$ or
$\hat{\Omega} = -D_{FS}(\hat{\delta})^{-1}$. In the glarma package, this distribution is used to obtain standard errors
and to construct Wald tests of the hypotheses that subsets of $\delta$ are zero. It is also assumed
that Wald tests and equivalent likelihood ratio tests will be asymptotically chi-squared with
the correct degrees of freedom, results which would follow straightforwardly from (3.11)
and its proof when available. Regardless of the technical issues involved in establishing
a general central limit theorem, the earlier approximate result seems plausible since, for
these models, the log-likelihood is a sum of elements in a triangular array of martingale
differences. Conditions under which this result would likely hold include identifiability
conditions as discussed earlier, and conditions on the regressors similar to those used in Davis
et al. (2000, 2003) and Davis and Wu (2009), where the covariates $x_t$ are assumed to be a
realization of a stationary time series or are defined as $x_t = x_{nt} = f(t/n)$, where $f(u)$ is a piecewise
continuous function from $u \in [0, 1]$ to $\mathbb{R}^K$. Additional conditions on the coefficients
are also needed to ensure that $Z_t$ and hence $W_t$ do not degenerate or grow without bound.
Indeed, little is known so far about suitable conditions to ensure this.
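
To make the use of (3.11) concrete, the following sketch turns a second derivative matrix evaluated at the estimate into a covariance matrix, standard errors, and Wald statistics. It is plain Python for illustration and does not reproduce the glarma package's code; the function names and the numerical values of the estimate and Hessian are hypothetical.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is numerically singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def covariance_from_hessian(hessian):
    """Estimated covariance: minus the inverse of the second derivative matrix."""
    return [[-v for v in row] for row in invert(hessian)]

def wald_summary(delta_hat, omega):
    """Per-coefficient estimate, standard error, z statistic, two-sided normal p-value."""
    rows = []
    for k, est in enumerate(delta_hat):
        se = math.sqrt(omega[k][k])
        z = est / se
        p_value = math.erfc(abs(z) / math.sqrt(2.0))
        rows.append((est, se, z, p_value))
    return rows

def wald_statistic(delta_hat, omega, subset):
    """Joint Wald statistic for the hypothesis that delta_k = 0 for every k in subset."""
    d = [delta_hat[k] for k in subset]
    sub_inv = invert([[omega[i][j] for j in subset] for i in subset])
    m = len(d)
    return sum(d[i] * sub_inv[i][j] * d[j] for i in range(m) for j in range(m))

# Hypothetical estimate and Hessian, chosen only to exercise the functions.
delta_hat = [0.5, 0.2]
hessian = [[-100.0, 0.0], [0.0, -25.0]]

omega = covariance_from_hessian(hessian)
rows = wald_summary(delta_hat, omega)
joint = wald_statistic(delta_hat, omega, [0, 1])

for est, se, z, p_value in rows:
    print(f"estimate={est:.3f} se={se:.3f} z={z:.3f} p={p_value:.4f}")
print("joint Wald statistic:", joint)
```

The joint statistic would be compared with a chi-squared distribution whose degrees of freedom equal the size of the subset, provided that (3.11) holds. The explicit singularity check in `invert` is a design choice: a singular or nearly singular second derivative matrix means the covariance estimate is not meaningful and should not be silently reported.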