14
Coherence Consideration in Binary Time Series Analysis
Benjamin Kedem
CONTENTS
14.1 Introduction ............................................................... 311
14.2 Coherence for Quadratic Systems ............................................ 311
     14.2.1 Residual Coherence .................................................. 314
            14.2.1.1 Examples: Residual Coherence Applied to Clipped Binary Series . 315
14.3 Logistic Regression for Binary Time Series ................................. 317
     14.3.1 Interactions in Logistic Regression ................................. 319
     14.3.2 Application to LA Mortality ......................................... 320
14.4 Discussion ................................................................. 323
References ...................................................................... 323
14.1 Introduction
In a recent study of mortality forecasting in the United States, it has been found that quite often mortality patterns in a given state are influenced by mortality trends in neighboring states, and the inclusion of interaction terms from the latter in log-linear models can substantially improve mortality forecasting in the given state (Khan et al., 2004). This motivates the problem of identifying interaction terms expressed as products of covariates of the form $x_t x_{t-k}$ in other time series regression models, including logistic regression for binary time series. This chapter discusses a spectral measure for interaction identification and its application in binary time series regression. The spectral measure for interaction identification, called residual coherence, depends on a certain nonlinear extension of the well-known measure of (squared) coherence. It is helpful, therefore, to first provide some background leading to the definition of residual coherence and illustrate its use. This is followed by an application to logistic regression for binary time series.
14.2 Coherence for Quadratic Systems
Let $x_t$, $t = \ldots, -1, 0, 1, \ldots$, be a zero mean stationary time series admitting a spectral representation in terms of a process of orthogonal increments $\xi_x(\lambda)$, $\lambda \in (-\pi, \pi]$. Define a system of degree $n$ with input $x_t$ and output $y_t$ by the $n$th degree polynomial functional
$$
y_t = \int \cdots \int \exp[it(\lambda_1 + \cdots + \lambda_n)]\, H_n(\lambda_1, \ldots, \lambda_n)\, \xi_x(d\lambda_1) \cdots \xi_x(d\lambda_n)
+ \cdots + \int \exp(it\lambda_1)\, H_1(\lambda_1)\, \xi_x(d\lambda_1) + H_0 \qquad (14.1)
$$
where $H_0$ is a constant and $H_j$, $j = 1, \ldots, n$, are complex, continuous, and bounded kernels. $H_n$ is said to be the leading kernel. Functionals $x$, $y$ of the form (14.1) are said to be orthogonal if $E(x\bar{y}) = 0$. In addition, we shall assume that all relevant spectra and cross spectra are well defined.
Nonlinear systems that admit a representation of the form (14.1) have been studied in
Tick (1961), Kimelfeld (1972, 1974), Nelson and Van Ness (1973a,b), and Priestley (1988)
among others.
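For intuition, it may help to look at the time-domain counterpart of (14.1): a degree-2 (Volterra-type) system adds to a constant and a linear filter of the input a term built from lagged products of the input. The following minimal sketch simulates such a system; the AR(1) input, the kernel arrays h1 and h2, and all numerical values are illustrative choices, not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative zero-mean stationary input: an AR(1) series.
n = 500
x = np.zeros(n)
eps = rng.standard_normal(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + eps[t]

# Illustrative kernels of a degree-2 (quadratic) system.
H0 = 0.0                                   # constant term
h1 = np.array([1.0, 0.4])                  # linear kernel over lags 0..1
h2 = np.array([[0.3, 0.2],                 # quadratic kernel over lags 0..1 x 0..1
               [0.2, 0.1]])

p = len(h1)
y = np.full(n, H0)
for t in range(p, n):
    lags = x[t - np.arange(p)]             # (x_t, x_{t-1}, ...)
    y[t] += h1 @ lags                      # first-order (linear) part
    y[t] += lags @ h2 @ lags               # second-order (quadratic) part
```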
Consider now a quadratic system, that is, $n = 2$, with input $x_t$ and output $y_t$. Then, following a procedure described in Lectures 1 through 4 of Wiener (1958), $y_t$ can be expressed as a sum of orthogonal functionals $G_j(K_j, y_t)$ with leading kernels $K_j$,

$$
y_t = \sum_{j=0}^{2} G_j(K_j, y_t) \qquad (14.2)
$$
where $G_0(K_0, y_t) = K_0$ is a constant, $G_1(K_1, y_t)$ is a first-order (linear) functional, and $G_2(K_2, y_t)$ is a quadratic functional with leading kernel $K_2$. Assume that $E(y_t) = 0$ for all $t$. Then $K_0 = 0$ with probability one, and by the orthogonality of $G_1$ and $G_2$, $K_1(\lambda) = f_{xy}(\lambda)/f_{xx}(\lambda)$, where $f_{xx}$ is the spectral density of $x_t$ and $f_{xy}$ is the cross-spectral density of $x_t$ and $y_t$.
In general, it is difficult to determine $K_2$ without imposing conditions on $x_t$ such as the Gaussian assumption (Priestley, 1988, Section 3.3). Tick (1961) determined the kernels of a quadratic functional when $x_t$ is Gaussian, employing the cross bi-spectrum between $x_t$ and $y_t$. In Kimelfeld (1972, 1974), by the use of lag processes, it is shown how to bypass the Gaussian assumption by approximating $G_2$ itself without determining $K_2$, using a class of approximating functionals for which we can determine all the kernels as follows.
For integers $u_k$, $k = 1, \ldots, n$, define the lag processes $U_k(t)$ by the centered product,

$$
U_k(t) = x_t x_{t+u_k} - R_{xx}(u_k) \qquad (14.3)
$$
where $R_{xx}$ is the autocovariance of $x_t$. Under the assumption that the $U_k(t)$ are stationary, it can be shown that for sufficiently large $n$, $y_t$ in (14.2) admits the mean-square representation,
$$
y_t = G_1\!\left(\frac{f_{xy}(\lambda)}{f_{xx}(\lambda)},\, y_t\right)
+ \sum_{k=1}^{n} \left[ \int e^{it\lambda} B_k(\lambda)\, \xi_{U_k}(d\lambda)
+ \int e^{it\lambda} A_k(\lambda)\, \xi_x(d\lambda) \right], \qquad (14.4)
$$
where

$$
A_k(\lambda) = -\frac{B_k(\lambda)\, f_{x u_k}(\lambda)}{f_{xx}(\lambda)}. \qquad (14.5)
$$
To get the $B_k(\lambda)$, define

$$
f_{uu}(\lambda) = \big(f_{u_i u_j}(\lambda)\big), \quad
f_{ux}(\lambda) = \big(f_{u_1 x}(\lambda), \ldots, f_{u_n x}(\lambda)\big)', \quad
f_{uy}(\lambda) = \big(f_{u_1 y}(\lambda), \ldots, f_{u_n y}(\lambda)\big)', \quad
B(\lambda) = \big(B_1(\lambda), \ldots, B_n(\lambda)\big)'.
$$
Then, observing that $f_{ux}(\lambda)$ is the conjugate transpose of $f_{xu}(\lambda)$, we have

$$
B(\lambda) = \left[ f_{uu}(\lambda) - \frac{1}{f_{xx}(\lambda)}\, f_{ux}(\lambda) f_{xu}(\lambda) \right]^{-1}
\left[ f_{uy}(\lambda) - \frac{f_{xy}(\lambda)}{f_{xx}(\lambda)}\, f_{ux}(\lambda) \right]. \qquad (14.6)
$$
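At each frequency, (14.6) is simply an $n$-dimensional linear system once the spectral quantities have been estimated. A minimal sketch of that per-frequency solve follows; the function name and the assumption that the spectral quantities at the given frequency are already available are mine.

```python
import numpy as np

def quadratic_weights(f_uu, f_ux, f_uy, f_xy, f_xx):
    """Evaluate B(lambda) from (14.6) at one frequency.

    f_uu : (n, n) complex spectral matrix of the lag processes
    f_ux : (n,)   cross spectra between the lag processes and x_t
    f_uy : (n,)   cross spectra between the lag processes and y_t
    f_xy : complex cross spectrum of x_t and y_t
    f_xx : real spectral density of x_t
    """
    f_xu = np.conj(f_ux)                          # f_ux is the conjugate transpose of f_xu
    lhs = f_uu - np.outer(f_ux, f_xu) / f_xx      # f_uu - f_ux f_xu / f_xx
    rhs = f_uy - (f_xy / f_xx) * f_ux             # f_uy - (f_xy / f_xx) f_ux
    return np.linalg.solve(lhs, rhs)              # B(lambda)
```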
It is important to note that the definition of $A_k$ in (14.5) guarantees the orthogonality of the approximating sum in (14.4) and $G_1$. Thus, if we define
$$
y_t = G(t) + \epsilon_t \equiv G_1\!\left(\frac{f_{xy}(\lambda)}{f_{xx}(\lambda)},\, y_t\right)
+ \sum_{k=1}^{n} \left[ \int e^{it\lambda} B_k(\lambda)\, \xi_{U_k}(d\lambda)
+ \int e^{it\lambda} A_k(\lambda)\, \xi_x(d\lambda) \right] + \epsilon_t \qquad (14.7)
$$
where $\epsilon_t$ is orthogonal to $G_1$ and to the quadratic sum, then clearly $A_k$ and $B_k$ do not change, and in addition,

$$
0 \le \frac{f_{GG}(\lambda)}{f_{yy}(\lambda)} \equiv S^2(\lambda; u_1, \ldots, u_n) \le 1.
$$
It is easy to see then that

$$
S^2(\lambda; u_1, \ldots, u_n) = \frac{|f_{xy}(\lambda)|^2}{f_{xx}(\lambda) f_{yy}(\lambda)}
+ \frac{1}{f_{yy}(\lambda)}\, B^{*}(\lambda) \left[ f_{uu}(\lambda) - \frac{1}{f_{xx}(\lambda)}\, f_{ux}(\lambda) f_{xu}(\lambda) \right] B(\lambda) \qquad (14.8)
$$
where we recognize that the first term on the right-hand side of (14.8) is the well-known (squared) coherence that measures the degree of linear relationship between $x_t$ and $y_t$ in (14.7); see Koopmans (1974, p. 137). The other term is due to the quadratic term in (14.7) corresponding to the lag processes $U_k(t)$. We shall refer to $S^2(\lambda; u_1, \ldots, u_n)$ as lagged coherence (Kimelfeld, 1972). It measures the validity of models of the form (14.7) by observing that when $S^2(\lambda; u_1, \ldots, u_n)$ is close to 1 for all $\lambda \in [0, \pi]$, the signal-to-noise ratio is high.
Clearly, $S^2(\lambda; u_1, \ldots, u_n)$ may be close to one on all or part of $[0, \pi]$ due to the quadratic term in (14.7) represented by the sum and not as a result of the linear component $G_1$, in which case the system is substantially quadratic. Similarly, $S^2(\lambda; u_1, \ldots, u_n)$ could be large due to the linear component, in which case the system is substantially linear.
It is interesting to compare the lagged coherence (14.8) with the “quadratic coherency” of Tick (1961), which assumes that $x_t$ is Gaussian,

$$
\text{quad. coh}(\omega) = \frac{|f_{xy}(\omega)|^2}{f_{xx}(\omega) f_{yy}(\omega)}
+ \frac{1}{2 f_{yy}(\omega)} \int \frac{|\mathrm{C.B.S.}(\omega - \lambda, \lambda)|^2}{f_{xx}(\omega - \lambda) f_{xx}(\lambda)}\, d\lambda \qquad (14.9)
$$

where C.B.S. stands for the cross bi-spectrum, the Fourier transform of $E[x_{t+t_2} x_{t+t_1} (y_t - E(y_t))]$ as a function of $t_1, t_2$. In both (14.8) and (14.9), the quadratic contribution is measured as an augmentation of the linear coherence, and both are between 0 and 1; for proofs see Kimelfeld (1972).
14.2.1 Residual Coherence
The lagged coherence $S^2(\lambda; u_1, \ldots, u_n)$ can also help in the selection of the lags themselves, where $u_1, \ldots, u_n$ are preferable to $u_1', \ldots, u_n'$ if

$$
S^2(\lambda; u_1, \ldots, u_n) \ge S^2(\lambda; u_1', \ldots, u_n'), \quad \lambda \in [0, \pi].
$$
In this chapter, we make use of this idea in the case of a single lag $u$ as follows. First, it is more convenient to define a lag process using the notation

$$
X_u(t) = x_t x_{t-u} - R_{xx}(u), \quad u = 0, 1, 2, \ldots. \qquad (14.10)
$$
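Given an observed series, the lag process of (14.10) is just the lagged product of the centered series, recentered by the sample autocovariance at lag $u$. A minimal sketch follows; the function name and the use of the usual biased autocovariance estimate are my choices.

```python
import numpy as np

def lag_process(x, u):
    """Form X_u(t) = x_t x_{t-u} - R_xx(u) of (14.10) for t = u, ..., n-1."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()                       # work with the centered series
    prod = xc[u:] * xc[: len(xc) - u]       # lagged products x_t x_{t-u}
    r_u = prod.sum() / len(xc)              # usual (biased) sample autocovariance at lag u
    return prod - r_u
```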
Consider the model

$$
y_t = \sum_{k=-\infty}^{\infty} l_k x_{t-k} + \sum_{k=-\infty}^{\infty} b_k X_u(t-k) + \epsilon_t \qquad (14.11)
$$
where $\epsilon_t$ is independent noise. By adding and subtracting an appropriate linear term of the form $\sum_{k=-\infty}^{\infty} a_k x_{t-k}$, we can rewrite (14.11) as in (14.4), and in that case the lagged coherence reduces to something more palatable (Kedem-Kimelfeld, 1975),
$$
S^2(\lambda; u) = S_1(\lambda) + \frac{|B(\lambda)|^2}{f_{yy}(\lambda)}
\left[ f_{x_u x_u}(\lambda) - \frac{|f_{x x_u}(\lambda)|^2}{f_{xx}(\lambda)} \right],
\quad -\pi < \lambda \le \pi \qquad (14.12)
$$
where $u = 0, 1, 2, \ldots$, $S_1(\lambda)$ is the linear coherence as in (14.8),

$$
S_1(\lambda) = \frac{|f_{xy}(\lambda)|^2}{f_{xx}(\lambda) f_{yy}(\lambda)}, \qquad (14.13)
$$
and

$$
B(\lambda) = \frac{f_{xx}(\lambda) f_{x_u y}(\lambda) - f_{x_u x}(\lambda) f_{xy}(\lambda)}
{f_{xx}(\lambda) f_{x_u x_u}(\lambda) - |f_{x x_u}(\lambda)|^2},
\quad -\pi < \lambda \le \pi.
$$
Clearly, $0 \le S_1(\lambda) \le 1$ for all $\lambda \in (-\pi, \pi]$, and similarly

$$
0 \le S^2(\lambda; u) \le 1, \quad -\pi < \lambda \le \pi, \quad u = 0, 1, 2, \ldots
$$
For a given lag $u$, the influence of $X_u(t)$ on $y_t$ can be measured by noting a significant increase in $S^2(\lambda; u)$ relative to the linear coherence $S_1(\lambda)$, for some or all $\lambda \in [0, \pi]$. Alternatively, as suggested recently in Khan et al. (2004), we can use the maximum residual coherence defined as

$$
RS(u) = \max_{\lambda}\, \{ S^2(\lambda; u) - S_1(\lambda) \}, \quad u = 0, 1, 2, \ldots \qquad (14.14)
$$

to measure the influence of the “interaction” $X_u(t)$ on $y_t$. This can be done graphically. As a graphical display, $RS(u)$ resembles the periodogram where the “lag is replaced by frequency,” language related to me years ago by the late Melvin Hinich. In both measures, one tries to discern graphically conspicuous ordinates and thereby identify important lags in the case of residual coherence and important frequencies in the case of the periodogram. And as in periodogram analysis, the residual coherence is elevated at secondary conspicuous lags provided the corresponding lag processes have significant coefficients in models of the form (14.4). This is illustrated in the analysis of model (14.16).
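As a concrete illustration of how (14.12) through (14.14) might be estimated from data, the following sketch replaces the theoretical spectra by Welch-type estimates from scipy.signal.csd. The function name, the smoothing choice (nperseg), the use of Welch estimates, and the grid of candidate lags in the usage comment are my own illustrative choices, not prescribed by the chapter.

```python
import numpy as np
from scipy.signal import csd

def residual_coherence(x, y, u, nperseg=64):
    """Estimate S1(lam), S2(lam; u) of (14.12)-(14.13) and RS(u) of (14.14)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)

    # Lag process X_u(t) = x_t x_{t-u} - R_xx(u); align every series to t = u, ..., n-1.
    xu = x[u:] * x[: len(x) - u] - np.dot(x[u:], x[: len(x) - u]) / len(x)
    xa, ya = x[u:], y[u:]

    # Welch-type spectral and cross-spectral estimates on a common frequency grid.
    freqs, f_xx = csd(xa, xa, nperseg=nperseg)
    _, f_yy = csd(ya, ya, nperseg=nperseg)
    _, f_xy = csd(xa, ya, nperseg=nperseg)
    _, f_xuxu = csd(xu, xu, nperseg=nperseg)     # f_{x_u x_u}
    _, f_xux = csd(xu, xa, nperseg=nperseg)      # f_{x_u x}
    _, f_xuy = csd(xu, ya, nperseg=nperseg)      # f_{x_u y}
    f_xx, f_yy, f_xuxu = f_xx.real, f_yy.real, f_xuxu.real

    s1 = np.abs(f_xy) ** 2 / (f_xx * f_yy)                                   # (14.13)
    b = (f_xx * f_xuy - f_xux * f_xy) / (f_xx * f_xuxu - np.abs(f_xux) ** 2)
    s2 = s1 + np.abs(b) ** 2 * (f_xuxu - np.abs(f_xux) ** 2 / f_xx) / f_yy   # (14.12)
    return freqs, s1, s2, np.max(s2 - s1)                                    # last entry is RS(u) of (14.14)

# Scan candidate lags and read RS(u) like a periodogram, with lag playing the role of frequency:
# rs = [residual_coherence(x, y, u)[3] for u in range(1, 11)]
```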
In this connection, Hinich (1979), assuming a stationary Gaussian input, presents a procedure for determining the values of multiple lags when there is a finite number of lag processes, taking advantage of the relationship between the weights of the lag processes in a quadratic system and the sample cross bi-spectrum between the input and output series. To determine the true lags, he too presents a graphical device, called the lagstrum, in which lag plays the role of frequency.
14.2.1.1 Examples: Residual Coherence Applied to Clipped Binary Series
Suppose $x_t$ is a first-order autoregressive process $x_t = 0.3 x_{t-1} + \epsilon_t$, where $\epsilon_t$ is standard logistic noise, and consider an autoregression plus a past interaction covariate $x_{t-1} x_{t-2}$,

$$
z_t = 0.8 z_{t-1} + 1.5 x_{t-1} x_{t-2} + \eta_t, \quad t = 1, \ldots, 156 \qquad (14.15)
$$

where $\eta_t$ is again standard logistic noise. Except for a constant, $x_{t-1} x_{t-2}$ is a lag process with $u = 1$, and we would expect the residual coherence obtained from $(x_t, z_t)$ to peak at $u = 1$.
This can be seen clearly in the bar plot at the top of Figure 14.1. Clipping $z_t$ at level 5 we obtain a binary time series

$$
y_t = \begin{cases} 1, & z_t \ge 5 \\ 0, & z_t < 5 \end{cases}, \quad t = 1, \ldots, 156.
$$
The bar plot at the bottom of Figure 14.1, obtained from $(x_t, y_t)$, again is maximized at $u = 1$, as expected, since in general clipping operations retain to a degree useful spectral information from the original baseline series, which in the present case is $z_t$; see Kedem (1980). Very similar bar plots are obtained when $\epsilon_t$ and $\eta_t$ are both Gaussian. Thus, the residual coherence $RS(u)$ points to a possible association between $y_t$ and $x_{t-1} x_{t-2}$.
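A minimal sketch of the simulation just described follows: a logistic-noise AR(1) input, the output model (14.15) with the lag-1 interaction, and clipping at level 5. The random seed and the burn-in details of the loops are illustrative choices of mine.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 156

# Input: first-order autoregression with standard logistic noise.
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.3 * x[t - 1] + rng.logistic()

# Output (14.15): autoregression plus the interaction x_{t-1} x_{t-2}.
z = np.zeros(n)
for t in range(2, n):
    z[t] = 0.8 * z[t - 1] + 1.5 * x[t - 1] * x[t - 2] + rng.logistic()

# Clip at level 5 to obtain the binary series y_t.
y = (z >= 5).astype(int)
```

Feeding $(x_t, z_t)$, and then $(x_t, y_t)$, to a residual-coherence estimator such as the one sketched in Section 14.2.1 should produce bar plots of $RS(u)$ that peak at $u = 1$, as in Figure 14.1.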