64 Handbook of Discrete-Valued Time Series
A primary focus is on testing whether the parameterization across series can be simplified
so that regression coefficients or serial dependence parameters can be constrained to be
the same. We consider only linear constraints of the form $\theta = A\psi$, where $A$ has fewer
columns than rows and $\psi$ denotes the lower-dimensional vector of parameters in the
constrained model. Typically $\psi^T = (\psi_\beta^T, \psi_\tau^T)$, since we will generally not be interested in
relating the regression coefficients to the serial dependence parameters. In that case $A$ will
be block diagonal, appropriately partitioned. Denote the log-likelihood with respect to the
constrained parameters as $l(\psi) = l(A\psi)$.
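As a small illustration of such a constraint, the following sketch builds a block-diagonal matrix A for the case where the regression coefficients are common to all series while the serial dependence parameters remain series-specific. The dimensions (J, p, q), the ordering of the stacked parameter vector, and the numerical values are assumptions made only for this example.

```python
import numpy as np

# Hypothetical sizes: 3 series, each with 2 regression coefficients
# and 1 serial dependence parameter.
J, p, q = 3, 2, 1

# Assumed ordering: theta = (beta^(1), ..., beta^(J), tau^(1), ..., tau^(J)).
# Regression block: all series share one beta, so stack J identity matrices.
A_beta = np.kron(np.ones((J, 1)), np.eye(p))   # shape (J*p, p)

# Serial dependence block: left unconstrained, one tau per series.
A_tau = np.eye(J * q)                          # shape (J*q, J*q)

# Block-diagonal A: no constraint links beta to tau.
A = np.block([
    [A_beta, np.zeros((J * p, J * q))],
    [np.zeros((J * q, p)), A_tau],
])

# psi = (psi_beta, psi_tau); the values are illustrative only.
psi = np.array([0.5, -1.0, 0.2, 0.3, 0.4])
theta = A @ psi   # full parameter vector implied by the constraint

print(A.shape)
print(theta)
```

Here A has 9 rows and 5 columns, so the constrained model has four fewer free parameters than the unconstrained one.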
Maximization of (3.19) with respect to the constrained parameters can be over a high-dimensional
parameter space. Initial estimates $\hat\theta$ are obtained by maximizing (3.19)
without constraints, which is the same as maximizing all individual likelihoods separately
and combining the resulting $l_j(\hat\beta^{(j)}, \hat\tau^{(j)})$. These unconstrained estimates are used to
initialize the constrained parameters via $\hat\psi^{(0)} = (A^T A)^{-1} A^T \hat\theta$. Next, using the appropriate
components of $\hat\theta^{(0)} = A\hat\psi^{(0)}$, each component log-likelihood $l_j(\hat\theta^{(0)}_{(j)})$ and its derivatives
are calculated using the standard GLARMA software. Finally, these are combined to get
the overall $l(\hat\psi^{(0)})$. Derivatives with respect to $\psi$ can be obtained using the identities
$\partial l(\psi)/\partial\psi = A^T\,\partial l(A\psi)/\partial\theta$ and
$\partial^2 l(\psi)/\partial\psi\,\partial\psi^T = A^T\,[\partial^2 l(A\psi)/\partial\theta\,\partial\theta^T]\,A$. This procedure
is then iterated to convergence using the Newton–Raphson or Fisher scoring algorithm.
Fisher scoring was found to be more stable in the initial stages. Once the derivative
$\partial l(\psi)/\partial\psi$ settles down, the iterative search for the optimum can be switched to
Newton–Raphson updates, which typically give speedier convergence.
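The fitting procedure just described can be sketched as follows. This is a minimal illustration, not the GLARMA software: each component log-likelihood is replaced by a plain Poisson regression log-likelihood with no serial dependence terms, the data are simulated, and the function names (`poisson_loglik`, `newton`, `constrained_loglik`) are hypothetical. For this stand-in model with the canonical log link, the observed and expected information coincide, so Newton–Raphson and Fisher scoring give the same updates.

```python
import numpy as np

def poisson_loglik(beta, X, y):
    """Stand-in for one component l_j: value (up to a constant), gradient, Hessian."""
    eta = X @ beta
    mu = np.exp(eta)
    ll = np.sum(y * eta - mu)
    grad = X.T @ (y - mu)
    hess = -(X.T * mu) @ X
    return ll, grad, hess

def newton(fun, start, tol=1e-8, max_iter=50):
    """Newton-Raphson maximization; fun returns (value, gradient, Hessian)."""
    x = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        _, g, H = fun(x)
        step = np.linalg.solve(H, g)
        x = x - step
        if np.max(np.abs(step)) < tol:
            break
    return x

def constrained_loglik(psi, A, series, p):
    """l(psi) = sum_j l_j(theta_j) with theta = A psi, plus derivatives in psi."""
    theta = A @ psi
    J = len(series)
    ll = 0.0
    grad_theta = np.zeros(J * p)
    hess_theta = np.zeros((J * p, J * p))
    for j, (X, y) in enumerate(series):
        sl = slice(j * p, (j + 1) * p)
        ll_j, g_j, H_j = poisson_loglik(theta[sl], X, y)
        ll += ll_j
        grad_theta[sl] = g_j          # independent series: gradients stack
        hess_theta[sl, sl] = H_j      # and the Hessian is block diagonal
    # Chain rule: gradient is A^T g, Hessian is A^T H A.
    return ll, A.T @ grad_theta, A.T @ hess_theta @ A

# Simulated data (assumption): J series sharing the same regression coefficients.
rng = np.random.default_rng(0)
J, p, n = 3, 2, 200
beta_true = np.array([0.5, 0.3])
series = []
for _ in range(J):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = rng.poisson(np.exp(X @ beta_true))
    series.append((X, y))

# Step 1: unconstrained fit, one series at a time.
theta_hat = np.concatenate([
    newton(lambda b, X=X, y=y: poisson_loglik(b, X, y), np.zeros(p))
    for X, y in series
])

# Step 2: least squares initialization, psi0 = (A^T A)^{-1} A^T theta_hat.
A = np.kron(np.ones((J, 1)), np.eye(p))   # common regression coefficients
psi0, *_ = np.linalg.lstsq(A, theta_hat, rcond=None)

# Step 3: iterate to convergence in the constrained parameter space.
psi_hat = newton(lambda s: constrained_loglik(s, A, series, p), psi0)
print(psi0, psi_hat)
```

With A formed from stacked identity matrices, the least squares starting value is simply the average of the per-series estimates, which is usually close enough to the constrained optimum for the iteration to converge in a few steps.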
Similar to the single series case, the asymptotic properties of the MLEs $\hat\psi$ have not been
established for the general model previously described. Asymptotic results for longitudinal
data typically let the number of cases $J$ tend to infinity while the lengths $n_j$ of the individual
series are held fixed. For this scenario, asymptotic theory is typically straightforward
since it relies on large numbers of independent trajectories. For our applications,
$J$ is often bounded and we regard all $n_j$ as tending to infinity, in which case asymptotic
results rely on those for individual time series which, as previously noted, are rather
underdeveloped. We assume, however, that asymptotic results hold and perform inference
in the usual way. For example, the matrices of second derivatives $\ddot{l}(\hat\psi)$, computed in
the course of the Newton–Raphson or Fisher scoring maximization procedures, are used to
estimate standard errors for individual parameters. Also, to test the null hypothesis of common
regression effects, we use the likelihood ratio statistic $G^2 = -2[l(\hat\psi_0) - l(\hat\psi_1)]$, where
$\hat\psi_0$ is the estimate obtained under the null hypothesis and $\hat\psi_1$ the estimate obtained under
the alternative. Degrees of freedom for the approximating chi-squared reference distribution
for $G^2$ are calculated in the usual way. These ideas were first illustrated in Dunsmuir
et al. (2004) for three series of daily asthma counts, which are assessed for common seasonal
patterns, day of the week, weather, and pollution effects.
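The likelihood ratio comparison can be sketched in a few lines. The helper names (`chi2_sf`, `likelihood_ratio_test`) are hypothetical, the log-likelihood values and parameter counts are invented purely for illustration, and the degrees of freedom are taken to be the difference in the number of free parameters between the two models. The chi-squared tail probability uses the standard closed form for integer degrees of freedom so that only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-squared with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite correction sum.
    total = 0.0
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for i in range((df - 1) // 2):
        if i > 0:
            term *= x / (2 * i + 1)
        total += term
    return math.erfc(math.sqrt(half)) + total

def likelihood_ratio_test(loglik_null, loglik_alt, dim_null, dim_alt):
    """Return (G2, df, p-value) for nested models fitted by maximum likelihood."""
    g2 = -2.0 * (loglik_null - loglik_alt)
    df = dim_alt - dim_null
    return g2, df, chi2_sf(g2, df)

# Illustrative numbers only (not taken from any fitted model):
# the null model has 5 free parameters, the alternative has 9.
g2, df, p_value = likelihood_ratio_test(-512.3, -508.1, dim_null=5, dim_alt=9)
print(g2, df, p_value)
```

A small p-value would count against the null hypothesis of common effects, subject to the caveat above that the chi-squared approximation is assumed rather than proven for this setting.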
3.5.2 Application of Multiple Fixed Effects GLARMA to Road Deaths
Bernat et al. (2004) assessed the impact of lowering the legal allowable blood alcohol
concentration (BAC) in motor vehicle drivers from 0.10 to 0.08 on monthly counts of
single-vehicle nighttime fatalities in 17 U.S. states for which at least 12 months of
post-intervention data were available. The study
design selected 72 consecutive months of data with 36 months prior to the decrease in