10: Detection of Change Points in Discrete-Valued Time Series (1/6)

Detection of Change Points in Discrete-Valued Time Series

Claudia Kirch and Joseph Tadjuidje Kamgaing

CONTENTS

10.1 Introduction
10.2 General Principles of Retrospective Change Point Analysis
10.3 Detection of Changes in Binary Models
10.4 Detection of Changes in Poisson Autoregressive Models
10.5 Simulation and Data Analysis
  10.5.1 Binary Autoregressive Time Series
    10.5.1.1 Data Analysis: U.S. Recession Data
  10.5.2 Poisson Autoregressive Models
    10.5.2.1 Data Analysis: Number of Transactions per Minute for Ericsson B Stock
10.6 Online Procedures
10.7 Technical Appendix
  10.7.1 Regularity Assumptions
  10.7.2 Proofs
References

10.1 Introduction

There has recently been a renewed interest in statistical procedures concerned with the detection of structural breaks in time series; see, for example, the recent review articles by Aue and Horváth [2] and Horváth and Rice [16]. The literature contains statistics to detect simple mean changes, changes in linear regression, and changes in generalized autoregressive conditionally heteroscedastic (GARCH) models, with approaches ranging from likelihood ratio to robust M methods (see, e.g., Berkes et al. [3], Davis et al. [6], Hušková and Marušiaková [26], and Robbins et al. [31]). While at first sight the corresponding statistics appear very different, most of them are derived using the same principles. In this chapter, we shed light on those principles, explaining how corresponding statistics and their respective asymptotic behavior under both the null and alternative hypotheses can be derived. This enables us to give a unified presentation of change point procedures for integer-valued time series. Because the methodology considered in this chapter is by no means limited to these situations, it allows for future extensions in a standardized way.

220 Handbook of Discrete-Valued Time Series

Hudecová [17] and Fokianos et al. [11] propose change point statistics for binary time series models, while Franke et al. [13] and Doukhan and Kengne [7] consider changes in Poisson autoregressive models. Related procedures have also been investigated by Fokianos and Fried [9,10] for integer-valued GARCH and log-linear Poisson autoregressive time series, respectively, but with a focus on outlier detection and intervention effects rather than change points.

Section 10.2 explains how change point statistics can be constructed and derives asymptotic properties under both the null and alternative hypotheses, based on regularity conditions, which are summarized in Appendix 10.7.1 to lighten the presentation. This methodology is then applied to binary time series in Section 10.3 and to Poisson autoregressive models in Section 10.4, generalizing the statistics already discussed in the literature. In Section 10.5, some simulations as well as applications to real data illustrate the performance of these procedures. A short review of sequential (also called online) procedures for count time series is given in Section 10.6. Finally, the proofs are given in Appendix 10.7.2.

10.2 General Principles of Retrospective Change Point Analysis

Assume that data $Y_1, \ldots, Y_n$ are observed with a possible structural break at the (unknown) change point $k_0$. We will first look at likelihood ratio tests for structural breaks before explaining how to generalize these ideas. To this end, we assume that the data before and after the change can be parameterized by the same likelihood function $L$ but with different (unknown) parameters $\theta_0, \theta_1 \in \Theta \subset \mathbb{R}^d$. A likelihood ratio approach yields the following statistic:

$$\max_{1 \le k \le n} \ell(k) := \max_{1 \le k \le n} \left( \ell\big((Y_1, \ldots, Y_k), \hat\theta_k\big) + \ell\big((Y_{k+1}, \ldots, Y_n), \hat\theta_k^{\circ}\big) - \ell\big((Y_1, \ldots, Y_n), \hat\theta_n\big) \right),$$

where $\ell(Y, \theta)$ is the log-likelihood function and $\hat\theta_k$ and $\hat\theta_k^{\circ}$ are the maximum likelihood estimators based on $Y_1, \ldots, Y_k$ and $Y_{k+1}, \ldots, Y_n$, respectively. The maximum over $k$ is due to the fact that the change point is unknown, so the likelihood ratio statistic maximizes over all possible change points. A similar approach based on some asymptotic Bayes statistic leads to a sum-type statistic, where the sum over $\ell(k)$ is considered (see, e.g., Kirch [19]).

Davis et al. [6] proposed this statistic for linear autoregressive processes of order $p$ with standard normal errors:

$$Y_t = \beta_0 + \sum_{j=1}^{p} \beta_j Y_{t-j} + \varepsilon_t, \qquad \varepsilon_t \overset{\text{i.i.d.}}{\sim} N(0, 1). \tag{10.1}$$

In this situation (which includes mean changes as a special case ($p = 0$)), this maximum likelihood statistic does not converge in distribution to a nondegenerate limit but almost surely to infinity (Davis et al. [6]). Nevertheless, asymptotic level $\alpha$ tests based on this maximum likelihood statistic can be constructed using a Darling–Erdős limit theorem as stated in Theorem 10.1b. In small samples, however, the slow convergence of Darling–Erdős limit theorems often leads to some size distortions.
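As a minimal sketch, the likelihood ratio statistic can be computed by brute force in the mean-change special case $p = 0$ of (10.1), where the data are $N(\theta, 1)$ and the maximum likelihood estimator on each segment is the segment mean. In that case $\ell(k)$ reduces to half the drop in the residual sum of squares obtained by splitting the sample at $k$. The helper names `rss` and `max_lr_statistic` and the toy data are illustrative assumptions, not part of the chapter.

```python
def rss(segment):
    """Residual sum of squares around the segment mean.

    Under N(theta, 1) errors the segment mean is the MLE of theta, and the
    maximized log-likelihood equals -rss/2 up to an additive constant.
    """
    if not segment:
        return 0.0
    mean = sum(segment) / len(segment)
    return sum((v - mean) ** 2 for v in segment)


def max_lr_statistic(y):
    """Return (max_k ell(k), argmax k) for a mean change in N(theta, 1) data."""
    n = len(y)
    full = rss(y)
    # ell(k) = 0.5 * (RSS_{1..n} - RSS_{1..k} - RSS_{k+1..n}); ell(n) = 0.
    ell = [0.5 * (full - rss(y[:k]) - rss(y[k:])) for k in range(1, n + 1)]
    k_hat = max(range(1, n + 1), key=lambda k: ell[k - 1])
    return ell[k_hat - 1], k_hat


# Toy data (assumed): mean shifts from about 0 to about 2 after observation 5.
y = [0.1, -0.2, 0.0, 0.3, -0.1, 2.1, 1.9, 2.2, 1.8, 2.0]
stat, k_hat = max_lr_statistic(y)
print(stat, k_hat)
```

The quadratic-time loop keeps the correspondence with the definition visible; cumulative sums would give the same values in linear time.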















 

Similarly, one can construct Wald-type statistics based on maxima or sums of quadratic forms of

$$W(k) := \hat\theta_k - \hat\theta_k^{\circ}, \qquad k = 1, \ldots, n.$$

Wald statistics can be generalized to any other estimation procedure for $\theta$ and are not restricted to maximum likelihood estimators. However, for both maximum likelihood and Wald statistics, the estimators $\hat\theta_k$ and $\hat\theta_k^{\circ}$ need to be calculated, which can be problematic in nonlinear situations. In such situations, which are typical for integer-valued time series, these estimators are usually not analytically tractable, but need to be calculated using numerical optimization methods. This can lead to additional computational effort to calculate the statistics or large numerical errors. The latter problems can be reduced by using score-type statistics based on maxima or sums of quadratic forms of

$$S(k) := S(k, \hat\theta_n) = \frac{\partial}{\partial\theta}\, \ell\big((Y_1, \ldots, Y_k), \theta\big)\Big|_{\theta = \hat\theta_n}, \qquad k = 1, \ldots, n.$$

In this case, only the estimator $\hat\theta_n$ based on the full data set $Y_1, \ldots, Y_n$ needs to be calculated (possibly using numerical methods). The likelihood score statistic for the linear regression model has been investigated in detail by Hušková et al. [27]. Similarly to Wald statistics, score statistics do not need to be likelihood based but can be generalized to different estimators as long as those estimators can be obtained as a solution to

$$S(n, \hat\theta_n) = \sum_{t=1}^{n} F\big((Y_t, \mathbb{Y}_{t-1}), \hat\theta_n\big) = 0$$

for some estimating function $F$. Important estimators of this type are M estimators, which have been used in the context of linear regression models to construct score-type change point tests by Antoch and Hušková [1].

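In the linear autoregressive situation in (10.1), the likelihood-based statistics of the type mentioned earlier are all equivalent. Specifically, some calculations yield

$$2\ell(k) = W(k)^{T} \left( C_k^{-1} + (C_k^{\circ})^{-1} \right)^{-1} W(k) = S(k)^{T} \left( C_k^{-1} + (C_k^{\circ})^{-1} \right) S(k),$$

where

$$C_k = \sum_{t=1}^{k} \mathbb{Y}_{t-1} \mathbb{Y}_{t-1}^{T}, \qquad C_k^{\circ} = \sum_{t=k+1}^{n} \mathbb{Y}_{t-1} \mathbb{Y}_{t-1}^{T}, \qquad \mathbb{Y}_{t-1} = (1, Y_{t-1}, \ldots, Y_{t-p})^{T}.$$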
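The equivalence can be checked numerically in the mean-change case $p = 0$, where the regressor is the constant 1, so that $C_k = k$ and $C_k^{\circ} = n - k$ are scalars. The three statistics then read $2\ell(k) = \frac{k(n-k)}{n} W(k)^2 = \left(\frac{1}{k} + \frac{1}{n-k}\right) S(k)^2$, with $W(k)$ the difference of the two segment means and $S(k)$ the partial sum of residuals from the full-sample mean. The following sketch (the function name `three_statistics` and the data are assumptions for illustration) computes all three forms independently.

```python
def three_statistics(y, k):
    """Return (2*ell(k), Wald form, score form) for a mean change at k, 1 <= k < n."""
    n = len(y)
    left, right = y[:k], y[k:]
    mean_full = sum(y) / n
    mean_left = sum(left) / k
    mean_right = sum(right) / (n - k)

    def rss(segment, mean):
        return sum((v - mean) ** 2 for v in segment)

    # Twice the log-likelihood ratio under N(theta, 1) errors.
    two_ell = rss(y, mean_full) - rss(left, mean_left) - rss(right, mean_right)
    # Wald form: squared difference of segment means, weighted by k (n - k) / n.
    wald = mean_left - mean_right
    wald_form = wald ** 2 * k * (n - k) / n
    # Score form: squared partial sum of full-sample residuals,
    # weighted by 1/k + 1/(n - k).
    score = sum(v - mean_full for v in left)
    score_form = score ** 2 * (1.0 / k + 1.0 / (n - k))
    return two_ell, wald_form, score_form


# Toy data (assumed); the three columns agree up to floating-point rounding.
y = [0.1, -0.2, 0.0, 0.3, -0.1, 2.1, 1.9, 2.2, 1.8, 2.0]
for k in range(1, len(y)):
    print(k, three_statistics(y, k))
```

Only the score form avoids refitting on each segment: it needs the full-sample estimate once and a running sum.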

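As already mentioned, the maximum likelihood statistic (hence the corresponding likelihood Wald and score statistics) does not converge in distribution. Under the null hypothesis, the matrix in the quadratic form of the likelihood score statistic can be approximated asymptotically (as $k \to \infty$, $n - k \to \infty$, $n \to \infty$) by

$$C_k^{-1} + (C_k^{\circ})^{-1} = \frac{1}{n} \left( \frac{k}{n} \left( 1 - \frac{k}{n} \right) \right)^{-1} \left( C^{-1} + o_P(1) \right), \qquad C = E\, \mathbb{Y}_{t-1} \mathbb{Y}_{t-1}^{T}. \tag{10.2}$$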













  



 



Replacing this term by $\frac{w(k/n)}{n}\, C^{-1}$ for a suitable weight function $w(\cdot)$ leads to a statistic that does converge in distribution. More precisely, we consider

$$\max_{1 \le k \le n} \frac{w(k/n)}{n}\, S(k)^{T} C^{-1} S(k),$$

where $w : [0, 1] \to \mathbb{R}_{+}$ is a nonnegative continuous weight function fulfilling

$$\lim_{t \to 0} t^{\alpha} w(t) < \infty, \qquad \lim_{t \to 1} (1 - t)^{\alpha} w(t) < \infty, \qquad \text{for some } 0 \le \alpha < 1,$$

$$\sup_{\eta \le t \le 1 - \eta} w(t) < \infty \qquad \text{for all } 0 < \eta \le \tfrac{1}{2}. \tag{10.3}$$

Theorem 10.1a shows that this class of statistics converges, under regularity conditions, in distribution to a nondegenerate limit. The following choice of weight function, closely related to the choice of the weights in (10.2), has often been proposed in the literature:

$$w(t) = (t(1 - t))^{-\gamma}, \qquad 0 \le \gamma < 1,$$

where $\gamma$ close to 1 detects early or late changes with better power. In the econometrics literature, the following weight functions are often used, which correspond to a truncation of the likelihood ratio statistic and can be viewed as the likelihood ratio statistic under restrictions on the set of admissible change points,

$$w(t) = (t(1 - t))^{-1/2}\, 1_{\{\epsilon \le t \le 1 - \epsilon\}}$$

for some $\epsilon > 0$. Similarly, if a priori knowledge of the location of the change point is available, one can increase the power of the designed test statistic for such alternatives by choosing a weight function that is larger near these points (Kirch et al. [20]). Nevertheless, these statistics have asymptotic power one for other change locations (see Theorem 10.2).
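To illustrate the role of the weight function, the following sketch evaluates the weighted max-type score statistic for a mean change with $w(t) = (t(1-t))^{-\gamma}$. It assumes a scalar score $S(k) = \sum_{t \le k} (Y_t - \bar Y_n)$ and a known variance `sigma2` in place of the matrix in the quadratic form; the function names are hypothetical. The maximum is taken over $k < n$, since $S(n) = 0$ while $w(1)$ is unbounded for $\gamma > 0$.

```python
def weight(t, gamma):
    """w(t) = (t (1 - t))^(-gamma) for 0 <= gamma < 1 and 0 < t < 1."""
    return (t * (1.0 - t)) ** (-gamma)


def weighted_score_statistic(y, gamma, sigma2=1.0):
    """Return (max_k w(k/n) / n * S(k)^2 / sigma2, argmax k) over 1 <= k < n."""
    n = len(y)
    mean = sum(y) / n
    best, k_best, partial = 0.0, None, 0.0
    for k in range(1, n):
        partial += y[k - 1] - mean  # running score S(k)
        value = weight(k / n, gamma) / n * partial ** 2 / sigma2
        if value > best:
            best, k_best = value, k
    return best, k_best


# Toy data (assumed): a mean shift after observation 5.
y = [0.1, -0.2, 0.0, 0.3, -0.1, 2.1, 1.9, 2.2, 1.8, 2.0]
for gamma in (0.0, 0.5, 0.9):
    print(gamma, weighted_score_statistic(y, gamma))
```

With $\gamma = 0$ the weight is constant, while larger $\gamma$ inflates the contribution of $k$ near 1 or $n$, which is what gives better power against early or late changes.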

Additionally, many change point statistics discussed in the literature do not use the full score function but rather a lower-dimensional projection, where $C$ is replaced by a lower rank matrix. For linear autoregressive models as in (10.1), for example, Kulperger [24] and Horváth [25] use a partial sum process based on estimated residuals, which corresponds to the first component of the likelihood score vector in this example.

For this reason, in the following, we do not require $S(k, \theta)$ to be the likelihood score (nor even of the same dimension as $\theta$), nor do we assume that $\hat\theta_n$ is the maximum likelihood estimator. In fact, we allow for general score-type statistics that are based on partial sum processes of the type

$$S(k, \hat\theta_n) = \sum_{j=1}^{k} H(\mathbb{X}_j, \hat\theta_n), \qquad \text{with } S(n, \hat\theta_n) = 0 \text{ and } \hat\theta_n \to \theta_0, \tag{10.4}$$

where $\theta_0$ is typically the correct parameter, $\mathbb{X}_j$ are observations, where, for example, for the autoregressive case of order one, a vector $\mathbb{X}_j = (X_j, X_{j-1})^{T}$ is used, and $H$ is some function usually of the type $AF$ for some (possibly lower rank) matrix $A$ and an estimating function $F$ that defines the estimator $\hat\theta_n$ as the unique solution of $\sum_{j=1}^{n} F(\mathbb{X}_j, \hat\theta_n) = 0$. Furthermore, it is possible to allow for misspecification, in which case $\theta_0$ becomes the best approximating parameter in the sense of $E F(\mathbb{X}_j, \theta_0) = 0$. More details on this framework in a sequential context can be found in Kirch and Kamgaing [22].
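A small sketch may help fix ideas for the partial sum process in (10.4). It assumes the simplest estimating function for count data, $H(x, \theta) = x - \theta$, whose zero over the full sample is the sample mean; this choice, the function names, and the toy counts are illustrative assumptions and not the models treated later in the chapter.

```python
def score_partial_sums(x, H, theta):
    """Return [S(1, theta), ..., S(n, theta)] with S(k, theta) = sum_{j<=k} H(x_j, theta)."""
    sums, total = [], 0.0
    for xj in x:
        total += H(xj, theta)
        sums.append(total)
    return sums


def H(xj, theta):
    """Assumed estimating function: deviation of a count from its mean."""
    return xj - theta


# Toy counts (assumed): the level rises after observation 5.
counts = [2, 3, 1, 2, 4, 8, 9, 7, 10, 8]
theta_hat = sum(counts) / len(counts)  # solves S(n, theta) = 0 for this H
S = score_partial_sums(counts, H, theta_hat)
k_hat = max(range(1, len(counts)), key=lambda k: abs(S[k - 1]))
print(S[-1], k_hat, S[k_hat - 1])
```

Under a stable mean the partial sums fluctuate around zero and return to zero at $k = n$; a level shift shows up as a pronounced excursion whose extreme point suggests the change location.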

We are now able to derive the limit distribution of the corresponding score-type change point tests under the null hypothesis, given the regularity conditions in Section 10.7.1. These regularity conditions are implicitly shown in the proofs for change point tests of the types mentioned earlier. Examples for integer-valued time series are given in Sections 10.3 and 10.4.

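Theorem 10.1 We obtain the following null asymptotics:

(a) Let A.1 and A.2 (i) in Section 10.7.1 hold. Assume that the weight function is either a continuous nonnegative and bounded function $w : [0, 1] \to \mathbb{R}_{+}$, or for unbounded functions fulfilling (10.3), let additionally A.2 (ii) in Section 10.7.1 hold. Then:

(i) $$\max_{1 \le k \le n} \frac{w(k/n)}{n}\, S(k, \hat\theta_n)^{T} \Sigma^{-1} S(k, \hat\theta_n) \xrightarrow{D} \sup_{0 \le t \le 1} w(t) \sum_{j=1}^{d} B_j^2(t),$$

(ii) $$\sum_{1 \le k \le n} \frac{w(k/n)}{n^2}\, S(k, \hat\theta_n)^{T} \Sigma^{-1} S(k, \hat\theta_n) \xrightarrow{D} \int_0^1 w(t) \sum_{j=1}^{d} B_j^2(t)\, dt,$$

where $B_j(\cdot)$, $j = 1, \ldots, d$, are independent Brownian bridges and $\Sigma$ can be replaced by $\hat\Sigma_n$ if $\hat\Sigma_n - \Sigma = o_P(1)$.

(b) Under A.1 and A.3 in Section 10.7.1 it holds

$$P\left( a(\log n) \max_{1 \le k \le n} \sqrt{\frac{n}{k(n - k)}\, S(k, \hat\theta_n)^{T} \Sigma^{-1} S(k, \hat\theta_n)} - b_d(\log n) \le t \right) \to \exp(-2e^{-t}),$$

where $a(x) = \sqrt{2 \log x}$, $b_d(x) = 2 \log x + \frac{d}{2} \log \log x - \log \Gamma(d/2)$, $\Gamma(\cdot)$ is the Gamma function, and $d$ is the dimension of the vector $S(k, \theta)$. Furthermore, $\Sigma$ can be replaced by an estimator $\hat\Sigma_n$ if $\|\hat\Sigma_n^{-1/2} - \Sigma^{-1/2}\| = o_P((\log \log n)^{-1})$.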

The assumption of continuity of the weight function in (b) can be relaxed to allow for a finite number of points of discontinuity, where w is either left or right continuous with existing limits from the other side.
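The Darling–Erdős limit in Theorem 10.1b has an explicit distribution function, so an asymptotic critical value can be written in closed form by solving $\exp(-2e^{-t}) = 1 - \alpha$. The sketch below assumes the maximum is taken of the square root of the normalized quadratic form, as is standard for Darling–Erdős results, and that the caller supplies that maximum; the function names are hypothetical. It requires $n \ge 3$ so that $\log \log \log n$ is defined.

```python
import math


def a(x):
    """Scaling a(x) = sqrt(2 log x)."""
    return math.sqrt(2.0 * math.log(x))


def b(x, d):
    """Centering b_d(x) = 2 log x + (d/2) log log x - log Gamma(d/2)."""
    return 2.0 * math.log(x) + 0.5 * d * math.log(math.log(x)) - math.lgamma(0.5 * d)


def gumbel_critical_value(alpha):
    """Solve exp(-2 exp(-t)) = 1 - alpha for t."""
    return -math.log(-0.5 * math.log(1.0 - alpha))


def darling_erdos_reject(max_root_stat, n, d, alpha=0.05):
    """Reject if the standardized maximum exceeds the asymptotic critical value.

    max_root_stat is assumed to be
    max_k sqrt(n / (k (n - k)) * S(k)^T Sigma^{-1} S(k)), computed by the caller.
    """
    x = math.log(n)
    standardized = a(x) * max_root_stat - b(x, d)
    return standardized > gumbel_critical_value(alpha)


print(gumbel_critical_value(0.05))
```

Because convergence in Darling–Erdős theorems is slow, critical values obtained this way can be inaccurate in small samples, in line with the size distortions noted for the likelihood ratio statistic.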

Similarly, under alternatives, we provide some regularity conditions, which ensure that the tests mentioned earlier have asymptotic power one. Additionally, we propose a consistent estimator of the change point in rescaled time.

Theorem 10.2 Under alternatives with a change point of the form

$$k_0 = \lfloor \lambda n \rfloor, \qquad 0 < \lambda < 1, \tag{10.5}$$
