14
Coherence Consideration in Binary Time Series Analysis
Benjamin Kedem
CONTENTS
14.1 Introduction ............................................................... 311
14.2 Coherence for Quadratic Systems ............................................ 311
     14.2.1 Residual Coherence .................................................. 314
            14.2.1.1 Examples: Residual Coherence Applied to Clipped Binary Series . 315
14.3 Logistic Regression for Binary Time Series ................................. 317
     14.3.1 Interactions in Logistic Regression ................................. 319
     14.3.2 Application to LA Mortality ......................................... 320
14.4 Discussion ................................................................. 323
References ...................................................................... 323
14.1 Introduction
In a recent study of mortality forecasting in the United States, it has been found that quite often mortality patterns in a given state are influenced by mortality trends in neighboring states, and the inclusion of interaction terms from the latter in log-linear models can substantially improve mortality forecasting in the given state (Khan et al., 2004). This motivates the problem of identifying interaction terms expressed as products of covariates of the form $x_t x_{t-k}$ in other time series regression models, including logistic regression for binary time series. This chapter discusses a spectral measure for interaction identification and its application in binary time series regression. The spectral measure for interaction identification, called residual coherence, depends on a certain nonlinear extension of the well-known measure of (squared) coherence. It is helpful, therefore, to first provide some background leading to the definition of residual coherence and illustrate its use. This is followed by an application to logistic regression for binary time series.
14.2 Coherence for Quadratic Systems
Let $x_t$, $t = \ldots, -1, 0, 1, \ldots$, be a zero mean stationary time series admitting a spectral representation in terms of a process of orthogonal increments $\xi_x(\lambda)$, $\lambda \in (-\pi, \pi]$. Define a system of degree $n$ with input $x_t$ and output $y_t$ by the $n$th degree polynomial functional
$$
y_t = \int \cdots \int \exp[it(\lambda_1 + \cdots + \lambda_n)]\, H_n(\lambda_1, \ldots, \lambda_n)\, \xi_x(d\lambda_1) \cdots \xi_x(d\lambda_n)
+ \cdots + \int \exp(it\lambda_1)\, H_1(\lambda_1)\, \xi_x(d\lambda_1) + H_0 \qquad (14.1)
$$
where $H_0$ is a constant and $H_j$, $j = 1, \ldots, n$, are complex, continuous, and bounded kernels. $H_n$ is said to be the leading kernel. Functionals $x$, $y$ of the form (14.1) are said to be orthogonal if $E(x\bar{y}) = 0$. In addition, we shall assume that all relevant spectra and cross spectra are well defined.
Nonlinear systems that admit a representation of the form (14.1) have been studied in
Tick (1961), Kimelfeld (1972, 1974), Nelson and Van Ness (1973a,b), and Priestley (1988)
among others.
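For intuition, it may help to look at the time-domain counterpart of (14.1): a degree-2 (Volterra-type) system adds to a constant and a linear filter of the input a term built from lagged products of the input. The following minimal sketch simulates such a system; the AR(1) input, the kernel arrays h1 and h2, and all numerical values are illustrative choices, not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative zero-mean stationary input: an AR(1) series.
n = 500
x = np.zeros(n)
eps = rng.standard_normal(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + eps[t]

# Illustrative kernels of a degree-2 (quadratic) system.
H0 = 0.0                                   # constant term
h1 = np.array([1.0, 0.4])                  # linear kernel over lags 0..1
h2 = np.array([[0.3, 0.2],                 # quadratic kernel over lags 0..1 x 0..1
               [0.2, 0.1]])

p = len(h1)
y = np.full(n, H0)
for t in range(p, n):
    lags = x[t - np.arange(p)]             # (x_t, x_{t-1}, ...)
    y[t] += h1 @ lags                      # first-order (linear) part
    y[t] += lags @ h2 @ lags               # second-order (quadratic) part
```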
Consider now a quadratic system, that is, $n = 2$, with input $x_t$ and output $y_t$. Then, following a procedure described in Lectures 1 through 4 of Wiener (1958), $y_t$ can be expressed as a sum of orthogonal functionals $G_j(K_j, y_t)$ with leading kernels $K_j$,

$$
y_t = \sum_{j=0}^{2} G_j(K_j, y_t) \qquad (14.2)
$$
where $G_0(K_0, y_t) = K_0$ is a constant, $G_1(K_1, y_t)$ is a first-order (linear) functional, and $G_2(K_2, y_t)$ is a quadratic functional with leading kernel $K_2$. Assume that $E(y_t) = 0$ for all $t$. Then $K_0 = 0$ with probability one, and by the orthogonality of $G_1$ and $G_2$, $K_1(\lambda) = f_{xy}(\lambda)/f_{xx}(\lambda)$, where $f_{xx}$ is the spectral density of $x_t$ and $f_{xy}$ is the cross-spectral density of $x_t$ and $y_t$.
In general, it is difficult to determine $K_2$ without imposing conditions on $x_t$ such as the Gaussian assumption (Priestley, 1988, Section 3.3). Tick (1961) determined the kernels of a quadratic functional when $x_t$ is Gaussian, employing the cross bi-spectrum between $x_t$ and $y_t$. In Kimelfeld (1972, 1974), by the use of lag processes, it is shown how to bypass the Gaussian assumption by approximating $G_2$ itself without determining $K_2$, using a class of approximating functionals for which we can determine all the kernels as follows.
For integers $u_k$, $k = 1, \ldots, n$, define the lag processes $U_k(t)$ by the centered product,

$$
U_k(t) = x_t x_{t+u_k} - R_{xx}(u_k) \qquad (14.3)
$$
where $R_{xx}$ is the autocovariance of $x_t$. Under the assumption that the $U_k(t)$ are stationary, it can be shown that for sufficiently large $n$, $y_t$ in (14.2) admits the mean-square representation,
$$
y_t = G_1\!\left(\frac{f_{xy}(\lambda)}{f_{xx}(\lambda)},\, y_t\right)
+ \sum_{k=1}^{n} \left[ \int e^{it\lambda} B_k(\lambda)\, \xi_{U_k}(d\lambda)
+ \int e^{it\lambda} A_k(\lambda)\, \xi_x(d\lambda) \right], \qquad (14.4)
$$
where

$$
A_k(\lambda) = -\frac{B_k(\lambda)\, f_{x u_k}(\lambda)}{f_{xx}(\lambda)}. \qquad (14.5)
$$
To get the $B_k(\lambda)$, define

$$
f_{uu}(\lambda) = \big(f_{u_i u_j}(\lambda)\big), \quad
f_{ux}(\lambda) = \big(f_{u_1 x}(\lambda), \ldots, f_{u_n x}(\lambda)\big)', \quad
f_{uy}(\lambda) = \big(f_{u_1 y}(\lambda), \ldots, f_{u_n y}(\lambda)\big)', \quad
B(\lambda) = \big(B_1(\lambda), \ldots, B_n(\lambda)\big)'.
$$
Then, observing that $f_{ux}(\lambda)$ is the conjugate transpose of $f_{xu}(\lambda)$, we have

$$
B(\lambda) = \left[ f_{uu}(\lambda) - \frac{1}{f_{xx}(\lambda)}\, f_{ux}(\lambda) f_{xu}(\lambda) \right]^{-1}
\left[ f_{uy}(\lambda) - \frac{f_{xy}(\lambda)}{f_{xx}(\lambda)}\, f_{ux}(\lambda) \right]. \qquad (14.6)
$$
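At each frequency, (14.6) is simply an $n$-dimensional linear system once the spectral quantities have been estimated. A minimal sketch of that per-frequency solve follows; the function name and the assumption that the spectral quantities at the given frequency are already available are mine.

```python
import numpy as np

def quadratic_weights(f_uu, f_ux, f_uy, f_xy, f_xx):
    """Evaluate B(lambda) from (14.6) at one frequency.

    f_uu : (n, n) complex spectral matrix of the lag processes
    f_ux : (n,)   cross spectra between the lag processes and x_t
    f_uy : (n,)   cross spectra between the lag processes and y_t
    f_xy : complex cross spectrum of x_t and y_t
    f_xx : real spectral density of x_t
    """
    f_xu = np.conj(f_ux)                          # f_ux is the conjugate transpose of f_xu
    lhs = f_uu - np.outer(f_ux, f_xu) / f_xx      # f_uu - f_ux f_xu / f_xx
    rhs = f_uy - (f_xy / f_xx) * f_ux             # f_uy - (f_xy / f_xx) f_ux
    return np.linalg.solve(lhs, rhs)              # B(lambda)
```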
It is important to note that the definition of $A_k$ in (14.5) guarantees the orthogonality of the approximating sum in (14.4) and $G_1$. Thus, if we define
$$
y_t = G(t) + \epsilon_t \equiv G_1\!\left(\frac{f_{xy}(\lambda)}{f_{xx}(\lambda)},\, y_t\right)
+ \sum_{k=1}^{n} \left[ \int e^{it\lambda} B_k(\lambda)\, \xi_{U_k}(d\lambda)
+ \int e^{it\lambda} A_k(\lambda)\, \xi_x(d\lambda) \right] + \epsilon_t \qquad (14.7)
$$
where $\epsilon_t$ is orthogonal to $G_1$ and to the quadratic sum, then clearly $A_k$ and $B_k$ do not change, and in addition,

$$
0 \le \frac{f_{GG}(\lambda)}{f_{yy}(\lambda)} \equiv S^2(\lambda; u_1, \ldots, u_n) \le 1.
$$
It is easy to see then that

$$
S^2(\lambda; u_1, \ldots, u_n) = \frac{|f_{xy}(\lambda)|^2}{f_{xx}(\lambda) f_{yy}(\lambda)}
+ \frac{1}{f_{yy}(\lambda)}\, B^{*}(\lambda) \left[ f_{uu}(\lambda) - \frac{1}{f_{xx}(\lambda)}\, f_{ux}(\lambda) f_{xu}(\lambda) \right] B(\lambda) \qquad (14.8)
$$
where we recognize that the first term on the right-hand side of (14.8) is the well-known (squared) coherence that measures the degree of linear relationship between $x_t$ and $y_t$ in (14.7); see Koopmans (1974, p. 137). The other term is due to the quadratic term in (14.7) corresponding to the lag processes $U_k(t)$. We shall refer to $S^2(\lambda; u_1, \ldots, u_n)$ as lagged coherence (Kimelfeld, 1972). It measures the validity of models of the form (14.7) by observing that when $S^2(\lambda; u_1, \ldots, u_n)$ is close to 1 for all $\lambda \in [0, \pi]$, the signal-to-noise ratio is high.
Clearly, $S^2(\lambda; u_1, \ldots, u_n)$ may be close to one on all or part of $[0, \pi]$ due to the quadratic term in (14.7) represented by the sum and not as a result of the linear component $G_1$, in which case the system is substantially quadratic. Similarly, $S^2(\lambda; u_1, \ldots, u_n)$ could be large due to the linear component, in which case the system is substantially linear.
It is interesting to compare the lagged coherence (14.8) with the “quadratic coherency” of Tick (1961), which assumes that $x_t$ is Gaussian,

$$
\text{quad. coh}(\omega) = \frac{|f_{xy}(\omega)|^2}{f_{xx}(\omega) f_{yy}(\omega)}
+ \frac{1}{2 f_{yy}(\omega)} \int \frac{|\mathrm{C.B.S.}(\omega - \lambda, \lambda)|^2}{f_{xx}(\omega - \lambda) f_{xx}(\lambda)}\, d\lambda \qquad (14.9)
$$

where C.B.S. stands for the cross bi-spectrum, the Fourier transform of $E[x_{t+t_2} x_{t+t_1} (y_t - E(y_t))]$ as a function of $t_1, t_2$. In both (14.8) and (14.9), the quadratic contribution is measured as an augmentation of the linear coherence, and both are between 0 and 1; for proofs see Kimelfeld (1972).
14.2.1 Residual Coherence
The lagged coherence $S^2(\lambda; u_1, \ldots, u_n)$ can also help in the selection of the lags themselves, where $u_1, \ldots, u_n$ are preferable to $u_1', \ldots, u_n'$ if

$$
S^2(\lambda; u_1, \ldots, u_n) \ge S^2(\lambda; u_1', \ldots, u_n'), \quad \lambda \in [0, \pi].
$$
In this chapter, we make use of this idea in the case of a single lag $u$ as follows. First, it is more convenient to define a lag process using the notation

$$
X_u(t) = x_t x_{t-u} - R_{xx}(u), \quad u = 0, 1, 2, \ldots. \qquad (14.10)
$$
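Given an observed series, the lag process of (14.10) is just the lagged product of the centered series, recentered by the sample autocovariance at lag $u$. A minimal sketch follows; the function name and the use of the usual biased autocovariance estimate are my choices.

```python
import numpy as np

def lag_process(x, u):
    """Form X_u(t) = x_t x_{t-u} - R_xx(u) of (14.10) for t = u, ..., n-1."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()                       # work with the centered series
    prod = xc[u:] * xc[: len(xc) - u]       # lagged products x_t x_{t-u}
    r_u = prod.sum() / len(xc)              # usual (biased) sample autocovariance at lag u
    return prod - r_u
```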
Consider the model

$$
y_t = \sum_{k=-\infty}^{\infty} l_k x_{t-k} + \sum_{k=-\infty}^{\infty} b_k X_u(t-k) + \epsilon_t \qquad (14.11)
$$
where $\epsilon_t$ is independent noise. By adding and subtracting an appropriate linear term of the form $\sum_{k=-\infty}^{\infty} a_k x_{t-k}$, we can rewrite (14.11) as in (14.4), and in that case the lagged coherence reduces to something more palatable (Kedem-Kimelfeld, 1975),
$$
S^2(\lambda; u) = S_1(\lambda) + \frac{|B(\lambda)|^2}{f_{yy}(\lambda)}
\left[ f_{x_u x_u}(\lambda) - \frac{|f_{x x_u}(\lambda)|^2}{f_{xx}(\lambda)} \right],
\quad -\pi < \lambda \le \pi \qquad (14.12)
$$
where $u = 0, 1, 2, \ldots$, $S_1(\lambda)$ is the linear coherence as in (14.8),

$$
S_1(\lambda) = \frac{|f_{xy}(\lambda)|^2}{f_{xx}(\lambda) f_{yy}(\lambda)}, \qquad (14.13)
$$
and

$$
B(\lambda) = \frac{f_{xx}(\lambda) f_{x_u y}(\lambda) - f_{x_u x}(\lambda) f_{xy}(\lambda)}
{f_{xx}(\lambda) f_{x_u x_u}(\lambda) - |f_{x x_u}(\lambda)|^2},
\quad -\pi < \lambda \le \pi.
$$
Clearly, $0 \le S_1(\lambda) \le 1$ for all $\lambda \in (-\pi, \pi]$, and similarly

$$
0 \le S^2(\lambda; u) \le 1, \quad -\pi < \lambda \le \pi, \quad u = 0, 1, 2, \ldots
$$
For a given lag $u$, the influence of $X_u(t)$ on $y_t$ can be measured by noting a significant increase in $S^2(\lambda; u)$ relative to the linear coherence $S_1(\lambda)$, for some or all $\lambda \in [0, \pi]$. Alternatively, as suggested recently in Khan et al. (2004), we can use the maximum residual coherence defined as

$$
RS(u) = \max_{\lambda}\, \{ S^2(\lambda; u) - S_1(\lambda) \}, \quad u = 0, 1, 2, \ldots \qquad (14.14)
$$

to measure the influence of the “interaction” $X_u(t)$ on $y_t$. This can be done graphically. As a graphical display, $RS(u)$ resembles the periodogram where the “lag is replaced by frequency,” language related to me years ago by the late Melvin Hinich. In both measures, one tries to discern graphically conspicuous ordinates and thereby identify important lags in the case of residual coherence and important frequencies in the case of the periodogram. And as in periodogram analysis, the residual coherence is elevated at secondary conspicuous lags provided the corresponding lag processes have significant coefficients in models of the form (14.4). This is illustrated in the analysis of model (14.16).
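As a concrete illustration of how (14.12) through (14.14) might be estimated from data, the following sketch replaces the theoretical spectra by Welch-type estimates from scipy.signal.csd. The function name, the smoothing choice (nperseg), the use of Welch estimates, and the grid of candidate lags in the usage comment are my own illustrative choices, not prescribed by the chapter.

```python
import numpy as np
from scipy.signal import csd

def residual_coherence(x, y, u, nperseg=64):
    """Estimate S1(lam), S2(lam; u) of (14.12)-(14.13) and RS(u) of (14.14)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)

    # Lag process X_u(t) = x_t x_{t-u} - R_xx(u); align every series to t = u, ..., n-1.
    xu = x[u:] * x[: len(x) - u] - np.dot(x[u:], x[: len(x) - u]) / len(x)
    xa, ya = x[u:], y[u:]

    # Welch-type spectral and cross-spectral estimates on a common frequency grid.
    freqs, f_xx = csd(xa, xa, nperseg=nperseg)
    _, f_yy = csd(ya, ya, nperseg=nperseg)
    _, f_xy = csd(xa, ya, nperseg=nperseg)
    _, f_xuxu = csd(xu, xu, nperseg=nperseg)     # f_{x_u x_u}
    _, f_xux = csd(xu, xa, nperseg=nperseg)      # f_{x_u x}
    _, f_xuy = csd(xu, ya, nperseg=nperseg)      # f_{x_u y}
    f_xx, f_yy, f_xuxu = f_xx.real, f_yy.real, f_xuxu.real

    s1 = np.abs(f_xy) ** 2 / (f_xx * f_yy)                                   # (14.13)
    b = (f_xx * f_xuy - f_xux * f_xy) / (f_xx * f_xuxu - np.abs(f_xux) ** 2)
    s2 = s1 + np.abs(b) ** 2 * (f_xuxu - np.abs(f_xux) ** 2 / f_xx) / f_yy   # (14.12)
    return freqs, s1, s2, np.max(s2 - s1)                                    # last entry is RS(u) of (14.14)

# Scan candidate lags and read RS(u) like a periodogram, with lag playing the role of frequency:
# rs = [residual_coherence(x, y, u)[3] for u in range(1, 11)]
```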
In this connection, Hinich (1979), assuming a stationary Gaussian input, presents a procedure for determining the values of multiple lags when there is a finite number of lag processes, taking advantage of the relationship between the weights of the lag processes in a quadratic system and the sample cross bi-spectrum between the input and output series. To determine the true lags, he too presents a graphical device, called the lagstrum, in which lag plays the role of frequency.
14.2.1.1 Examples: Residual Coherence Applied to Clipped Binary Series
Suppose $x_t$ is a first-order autoregressive process $x_t = 0.3 x_{t-1} + \epsilon_t$, where $\epsilon_t$ is standard logistic noise, and consider an autoregression plus a past interaction covariate $x_{t-1} x_{t-2}$,

$$
z_t = 0.8 z_{t-1} + 1.5 x_{t-1} x_{t-2} + \eta_t, \quad t = 1, \ldots, 156 \qquad (14.15)
$$

where $\eta_t$ is again standard logistic noise. Except for a constant, $x_{t-1} x_{t-2}$ is a lag process with $u = 1$, and we would expect the residual coherence obtained from $(x_t, z_t)$ to peak at $u = 1$.
This can be seen clearly in the bar plot at the top of Figure 14.1. Clipping $z_t$ at level 5 we obtain a binary time series

$$
y_t = \begin{cases} 1, & z_t \ge 5 \\ 0, & z_t < 5 \end{cases}, \quad t = 1, \ldots, 156.
$$
The bar plot at the bottom of Figure 14.1, obtained from $(x_t, y_t)$, again is maximized at $u = 1$, as expected, since in general clipping operations retain to a degree useful spectral information from the original baseline series, which in the present case is $z_t$; see Kedem (1980). Very similar bar plots are obtained when $\epsilon_t$ and $\eta_t$ are both Gaussian. Thus, the residual coherence $RS(u)$ points to a possible association between $y_t$ and $x_{t-1} x_{t-2}$.
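A minimal sketch of the simulation just described follows: a logistic-noise AR(1) input, the output model (14.15) with the lag-1 interaction, and clipping at level 5. The random seed and the burn-in details of the loops are illustrative choices of mine.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 156

# Input: first-order autoregression with standard logistic noise.
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.3 * x[t - 1] + rng.logistic()

# Output (14.15): autoregression plus the interaction x_{t-1} x_{t-2}.
z = np.zeros(n)
for t in range(2, n):
    z[t] = 0.8 * z[t - 1] + 1.5 * x[t - 1] * x[t - 2] + rng.logistic()

# Clip at level 5 to obtain the binary series y_t.
y = (z >= 5).astype(int)
```

Feeding $(x_t, z_t)$, and then $(x_t, y_t)$, to a residual-coherence estimator such as the one sketched in Section 14.2.1 should produce bar plots of $RS(u)$ that peak at $u = 1$, as in Figure 14.1.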