we get the following assertions:
(a) If Assumptions B.1, B.3, and B.4(i) in Section 10.7.1 hold, then the Darling–Erdős- and max-type statistics for continuous weight functions fulfilling \(w(\lambda) > 0\) have asymptotic power one, that is, for all \(x \in \mathbb{R}\) it holds that
(i) \(P\left(\max_{1\le k\le n} \frac{1}{n}\, w(k/n)\, S(k,\hat{\theta}_n)^T \Sigma^{-1} S(k,\hat{\theta}_n) \ge x\right) \to 1\),
(ii) \(P\left(a(\log n) \max_{1\le k\le n} \sqrt{\frac{1}{n}\, S(k,\hat{\theta}_n)^T \Sigma^{-1} S(k,\hat{\theta}_n)} - b_d(\log n) \ge x\right) \to 1\).
If B.4(i) is replaced by B.4(ii), then the assertion remains true if \(\Sigma\) is replaced by \(\hat{\Sigma}_n\), a consistent estimator of \(\Sigma\).
(b) If additionally B.5 holds, then the sum-type statistic for a continuous weight function \(w(\cdot) \not\equiv 0\) fulfilling (10.3) has power one, that is, for all \(x \in \mathbb{R}\) it holds that
\[
P\left(\frac{1}{n^2} \sum_{1\le k\le n} w(k/n)\, S(k,\hat{\theta}_n)^T \Sigma^{-1} S(k,\hat{\theta}_n) \ge x\right) \to 1.
\]
If B.4(i) is replaced by B.4(ii), then the assertion remains true if \(\Sigma\) is replaced by \(\hat{\Sigma}_n\), a consistent estimator of \(\Sigma\).
(c) Let the change point be of the form \(k_0 = \lfloor \lambda n \rfloor\), \(0 < \lambda < 1\), and consider
\[
\hat{\lambda}_n = \frac{\arg\max_{1\le k\le n} S(k,\hat{\theta}_n)^T \Sigma^{-1} S(k,\hat{\theta}_n)}{n}.
\]
Under Assumptions B.1, B.3, and B.4(i), \(\hat{\lambda}_n\) is a consistent estimator for the change point in rescaled time \(\lambda\), that is,
\[
\hat{\lambda}_n - \lambda = o_P(1).
\]
If B.4(i) is replaced by B.4(iii), then the assertion remains true if \(\Sigma\) is replaced by \(\hat{\Sigma}_n\), a consistent estimator of \(\Sigma\).
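To make the estimator in (c) concrete, the following minimal sketch (Python, illustrative names; the chapter itself contains no code) forms the quadratic-form process \(S(k,\hat{\theta}_n)^T \hat{\Sigma}_n^{-1} S(k,\hat{\theta}_n)/n\) from an array of estimated score summands and returns the argmax in rescaled time; the covariance estimate is assumed to be supplied by the user.

```python
# Minimal sketch (not the authors' code): `summands` is an (n, d) array whose
# t-th row is the estimated score contribution s_t(theta_hat), so that the
# partial sums S(k, theta_hat) are cumulative sums of the rows.
import numpy as np

def quadratic_form_process(summands, sigma=None):
    """Return T_k = S(k)^T Sigma^{-1} S(k) / n for k = 1, ..., n."""
    n, d = summands.shape
    S = np.cumsum(summands, axis=0)                       # S(k, theta_hat)
    sigma_inv = np.linalg.inv(sigma) if sigma is not None else np.eye(d)
    return np.einsum("kd,de,ke->k", S, sigma_inv, S) / n

def change_point_estimate(summands, sigma=None):
    """Rescaled-time estimator lambda_hat = (argmax_k T_k) / n as in Theorem 10.2(c)."""
    T = quadratic_form_process(summands, sigma)
    return (np.argmax(T) + 1) / len(T)
```

The same partial-sum array can be reused for the weighted max- and sum-type statistics by multiplying \(T_k\) with \(w(k/n)\) before taking the maximum or the average.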
Assumption (10.5) is standard in change point analysis but can be weakened for Part (a)
of Theorem 10.2. The continuity assumption on the weight function can be relaxed in this
situation.
While these test statistics were designed for the situation where at most one change is expected, they usually also have power against multiple changes. This fact is the underlying principle of binary segmentation procedures (first proposed by Vostrikova [36]), which work as follows: The data set is tested using an at-most-one-change test as given earlier. If that test is significant, the data set is split at the estimated change point and the procedure is repeated on both data segments until no test is significant anymore. Recently, a randomized version of binary segmentation has been proposed for the simple mean change problem (Fryzlewicz [14]).
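The recursion can be sketched as follows; the single-change test `amoc_test` is a placeholder for any of the at-most-one-change tests above (assumed to return a significance decision and an estimated change point), and the minimum segment length is an illustrative tuning choice rather than part of Vostrikova's procedure.

```python
# Sketch of binary segmentation with a user-supplied at-most-one-change test.
def binary_segmentation(x, amoc_test, min_len=20, offset=0):
    """Recursively test, split at the estimated change point, and collect the
    detected change points (as indices into the original series)."""
    if len(x) < min_len:                     # segment too short to test
        return []
    significant, khat = amoc_test(x)         # khat = first index after the change
    if not significant:
        return []
    left = binary_segmentation(x[:khat], amoc_test, min_len, offset)
    right = binary_segmentation(x[khat:], amoc_test, min_len, offset + khat)
    return left + [offset + khat] + right
```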
The optimal rate of convergence in (c) is usually given by \(\hat{\lambda}_n - \lambda = O_P(1/n)\) but requires a much more involved proof (see Csörgő and Horváth [5], Theorem 2.8.1, for a proof in a mean change model).
10.3 Detection of Changes in Binary Models
Binary time series are important in applications where one observes whether a certain event has or has not occurred. Wilks and Wilby [40], for example, observe whether it has been raining on a specific day, and Kauppi and Saikkonen [18] and Startz [34] observe whether or not a recession has occurred in a given month. A common binary time series model is given by
\[
Y_t \mid Y_{t-1}, Y_{t-2}, \ldots, Z_{t-1}, Z_{t-2}, \ldots \sim \mathrm{Bern}\bigl(\pi_t(\beta)\bigr), \quad \text{with } g\bigl(\pi_t(\beta)\bigr) = \beta^T Z_{t-1},
\]
where \(Z_{t-1} = (Z_{t-1}(1), \ldots, Z_{t-1}(p))^T\) is a regressor, which can be purely exogenous (i.e., \(\{Z_t\}\) is independent of \(\{Y_t\}\)), purely autoregressive (i.e., \(Z_{t-1} = (Y_{t-1}, \ldots, Y_{t-p})\)), or a mixture of both (in particular, the independence assumption does not need to hold). Similar to generalized linear models, the canonical link function \(g(x) = \log(x/(1-x))\) is used, and statistical inference is based on the partial likelihood
\[
L(\beta) = \prod_{t=1}^{n} \pi_t(\beta)^{y_t} \bigl(1 - \pi_t(\beta)\bigr)^{1 - y_t},
\]
with corresponding score vector
\[
S_{\mathrm{BAR}}(k, \beta) = \sum_{t=1}^{k} Z_{t-1}\bigl(Y_t - \pi_t(\beta)\bigr) \qquad (10.6)
\]
for the canonical link function.
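As an illustration, the following minimal sketch fits \(\hat{\beta}_n\) from the partial likelihood under the logit link, that is, it solves \(S_{\mathrm{BAR}}(n, \beta) = 0\), and returns the score process \(S_{\mathrm{BAR}}(k, \hat{\beta}_n)\); the root-finding routine and the variable names are implementation choices, not taken from the chapter.

```python
# Sketch (assumptions: logit link, regressor matrix Z whose t-th row is Z_{t-1}).
import numpy as np
from scipy.optimize import root

def fit_bar(y, Z):
    """y: (n,) array of 0/1 responses; Z: (n, p) regressor matrix.
    Returns beta_hat and the cumulative score process S_BAR(k, beta_hat)."""
    def score(beta):
        pi = 1.0 / (1.0 + np.exp(-Z @ beta))         # pi_t(beta) under the logit link
        return Z.T @ (y - pi)                        # S_BAR(n, beta) as in (10.6)
    beta_hat = root(score, x0=np.zeros(Z.shape[1])).x    # solves (10.7)
    pi_hat = 1.0 / (1.0 + np.exp(-Z @ beta_hat))
    summands = Z * (y - pi_hat)[:, None]             # rows Z_{t-1}(Y_t - pi_t(beta_hat))
    return beta_hat, np.cumsum(summands, axis=0)     # S_BAR(k, beta_hat), k = 1, ..., n
```

The cumulative score array can then be plugged into the quadratic-form statistics sketched after Theorem 10.2, together with an estimate of the corresponding covariance matrix.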
Theorem 10.3 We get the following assertions under the null hypothesis:
(a) Let the covariate process \(\{Z_t\}\) be strictly stationary and ergodic with finite fourth moments. Then, under the null hypothesis, A.1 and A.3(i) in Section 10.7.1 are fulfilled for the partial sum process \(S_{\mathrm{BAR}}(k, \hat{\beta}_n)\) and \(\hat{\beta}_n\) defined by
\[
S_{\mathrm{BAR}}(n, \hat{\beta}_n) = 0. \qquad (10.7)
\]
(b) If \((Y_t, Z_{t-1}, \ldots, Z_{t-p})^T\) is also \(\alpha\)-mixing with exponential rate, then A.3(ii) and (iii) in Section 10.7.1 are fulfilled.
In particular, change point statistics based on \(S_{\mathrm{BAR}}(k, \hat{\beta}_n)\) have the null asymptotics as stated in Theorem 10.1 with \(\Sigma = \operatorname{cov}\bigl(Z_{t-1}(Y_t - \pi_t(\beta_0))\bigr)\), where \(\beta_0\) is the true parameter under the null hypothesis.
Remark 10.1 For \(Z_{t-1} = (Y_{t-1}, \ldots, Y_{t-p})^T\), \(Y_t\) is the standard binary autoregressive model (BAR(p)), for which the assumptions of Theorem 10.3(b) are fulfilled; see, for example, Wang and Li [37]. However, under some regularity assumptions on the exogenous process, one can prove that \((Y_t, \ldots, Y_{t-p+1}, Z_t, \ldots, Z_{t-q})\) is a Feller chain, for which Theorem 1 of Feigin and Tweedie [8] implies geometric ergodicity (see Kirch and Tadjuidje Kamgaing [21] for details), implying that it is \(\beta\)-mixing with exponential rate.
Remark 10.2 The mixing concept can be regarded as an asymptotic measure of independence in time between the observations. The reader can refer to Tadjuidje et al. [35], Remark 4, for \(\alpha\)- and \(\beta\)-mixing definitions, as well as a concise summary of their relationship to geometric ergodicity for a Markov chain.
Instead of using the full vector partial sum process \(S_{\mathrm{BAR}}(k, \hat{\beta}_n)\) to construct the test statistics, often lower-dimensional linear combinations are used, such as
\[
\tilde{S}_{\mathrm{BAR}}(k, \hat{\beta}_n) = \sum_{t=1}^{k} \bigl(Y_t - \pi_t(\hat{\beta}_n)\bigr), \qquad (10.8)
\]
where \(\hat{\beta}_n\) is defined by (10.7). If the assumptions of Theorem 10.3 are fulfilled, then the null asymptotics as in Theorem 10.1 with \(\Sigma = \operatorname{cov}\bigl(Y_t - \pi_t(\beta_0)\bigr)\) hold.
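A hedged sketch of a max-type test based on (10.8) is given below; the constant weight \(w \equiv 1\) and the Bartlett-type long-run variance estimator are illustrative standardization choices made here, not prescriptions from the chapter.

```python
# Sketch: max-type CUSUM statistic from the residuals Y_t - pi_t(beta_hat).
import numpy as np

def max_type_statistic(residuals, bandwidth=None):
    """Return max_k S_tilde(k)^2 / (n * sigma2_hat) with a Bartlett-type
    long-run variance estimate; `bandwidth` is an illustrative tuning choice."""
    e = np.asarray(residuals, dtype=float)
    n = len(e)
    h = bandwidth if bandwidth is not None else int(n ** (1 / 3))
    ec = e - e.mean()                          # mean is ~0 if the model has an intercept
    sigma2 = ec @ ec / n
    for lag in range(1, h + 1):
        w = 1.0 - lag / (h + 1.0)              # Bartlett weights
        sigma2 += 2.0 * w * (ec[lag:] @ ec[:-lag]) / n
    S = np.cumsum(e)                           # S_tilde_BAR(k, beta_hat) as in (10.8)
    return np.max(S ** 2) / (n * sigma2)
```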
The statistic based on \(S_{\mathrm{BAR}}(k, \hat{\beta}_n)\) for \(w \equiv 1\) has been proposed by Fokianos et al. [11]; a statistic based on \(\tilde{S}_{\mathrm{BAR}}(k, \hat{\beta}_n)\) with a somewhat different standardization in a purely autoregressive setup has been considered by Hudecová [17]. Hudecová's statistic is the score statistic based on the partial likelihood and the restricted alternative of a change only in the intercept.
BAR-Alternative: Let the following assumptions hold:

\(H_1\)(i) The change point is of the form \(k_0 = \lfloor \lambda n \rfloor\), \(0 < \lambda < 1\).

\(H_1\)(ii) The binary time series \(\{Y_t\}\) and the covariate process \(\{Z_t\}\) before the change fulfill the assumptions of Theorem 10.3a.

\(H_1\)(iii) The time series after the change as well as the covariate process after the change can be written as \(Y_t = \tilde{Y}_t + R_1(t)\) and \(Z_t = \tilde{Z}_t + R_2(t)\), respectively, for \(t > \lfloor n\lambda \rfloor\), where \(\{\tilde{Y}_t\}\) is bounded, stationary, and ergodic and \(\{\tilde{Z}_t\}\) is square integrable as well as stationary and ergodic, with remainder terms fulfilling
\[
\frac{1}{n} \sum_{t=\lfloor \lambda n \rfloor + 1}^{n} R_1^2(t) = o_P(1), \qquad \frac{1}{n} \sum_{t=\lfloor \lambda n \rfloor + 1}^{n} \|R_2(t)\|^2 = o_P(1).
\]
\(H_1\)(iv) \(\lambda\, E\, Z_0\bigl(Y_1 - \pi_1(\beta)\bigr) + (1 - \lambda)\, E\, \tilde{Z}_{n-1}\bigl(\tilde{Y}_n - \tilde{\pi}_n(\beta)\bigr)\) has a unique zero \(\beta_1\), and the parameter space \(\Theta\) is compact and convex with \(\beta_1 \in \Theta\).
Assumption (i) is standard in change point analysis but could be relaxed, and assumption (ii) states that the time series before the change fulfills the assumptions under the null hypothesis. Assumption (iii) allows for rather general alternatives, including situations where starting values from before the change are used, resulting in a nonstationary time series after the change. Assumption (iv) guarantees that the estimator \(\hat{\beta}_n\) converges to \(\beta_1\). Neither Hudecová [17] nor Fokianos et al. [11] have derived the behavior of their statistics under alternatives.
Theorem 10.4 Let \(H_1\)(i)–\(H_1\)(iv) hold.

(a) For \(S_{\mathrm{BAR}}(k, \beta)\) as in (10.6), B.1 and B.2 are fulfilled, which implies B.3. If \(k_0 = \lfloor \lambda n \rfloor\), then B.5 is fulfilled with \(F_\lambda(\beta) = \lambda\, E\, Z_0\bigl(Y_1 - \pi_1(\beta)\bigr)\).

(b) For \(\tilde{S}_{\mathrm{BAR}}(k, \beta)\) as in (10.8) and if \(k_0 = \lfloor \lambda n \rfloor\), then B.5 is fulfilled with \(F_\lambda(\beta) = \lambda\, E\bigl(Y_1 - \pi_1(\beta)\bigr)\).
B.4 is fulfilled for the full score statistic from Theorem 10.4a if the time series before and after the change are correctly specified binary time series models with different parameters. Otherwise, restrictions apply. Together with Theorem 10.2, this implies that the corresponding change point statistics have power one and the point where the maximum is obtained is a consistent estimator for the change point in rescaled time.
10.4 Detection of Changes in Poisson Autoregressive Models
Another popular model for time series of counts is the Poisson autoregression, where we observe \(Y_{1-p}, \ldots, Y_n\) with
\[
Y_t \mid Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p} \sim \mathrm{Pois}(\lambda_t), \qquad \lambda_t = f_\gamma(Y_{t-1}, \ldots, Y_{t-p}) \qquad (10.9)
\]
for some d-dimensional parameter vector \(\gamma\). If \(f_\gamma(x)\) is Lipschitz continuous in \(x\) for all \(\gamma\) with Lipschitz constant strictly smaller than 1, then there exists a stationary ergodic solution of (10.9) that is \(\beta\)-mixing with exponential rate (Neumann [28]). For a given parametric model \(f_\theta\), this allows us to consider score-type change point statistics based on likelihood equations using the tools of Section 10.2. The mixing condition in connection with some moment conditions typically allows one to derive A.3, while a Taylor expansion in connection with the \(\sqrt{n}\)-consistency of the corresponding maximum likelihood estimator (e.g., derived by Doukhan and Kengne [7], Theorem 3.2) gives A.1 under some additional moment conditions. Related test statistics for independent Poisson data are discussed in Robbins et al. [32]. In this chapter, however, we will concentrate on change point statistics related to those proposed by Franke et al. [13], which are based on least squares scores and, as such, do not make use of the Poisson structure of the process. Nevertheless, the methods described in Section 10.2 can be used to derive change point tests based on the partial likelihood, which can be expected to have higher power if the model is correctly specified.
Consider the least squares estimator \(\hat{\gamma}_n\) defined by
\[
S_{\mathrm{PAR}}(n, \hat{\gamma}_n) = 0, \quad \text{where } S_{\mathrm{PAR}}(k, \gamma) = \sum_{t=1}^{k} \nabla f_\gamma(Y_{t-1}, \ldots, Y_{t-p})\, \bigl(Y_t - f_\gamma(Y_{t-1}, \ldots, Y_{t-p})\bigr), \qquad (10.10)
\]
where \(\nabla\) denotes the gradient with respect to \(\gamma\). Under the additional assumption \(f_\gamma(x) = \gamma_1 + f_{\gamma_2, \ldots, \gamma_d}(x)\), that is, if \(\gamma_1\) is an additive constant in the regression function, this implies in particular that
\[
\sum_{t=1}^{n} \bigl(Y_t - f_{\hat{\gamma}_n}(Y_{t-1}, \ldots, Y_{t-p})\bigr) = 0. \qquad (10.11)
\]
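For the linear specification considered later in Theorem 10.5(b), solving (10.10) amounts to ordinary least squares of \(Y_t\) on \((1, Y_{t-1}, \ldots, Y_{t-p})\). The following minimal sketch (illustrative names, not the authors' code) computes \(\hat{\gamma}_n\) and the score process \(S_{\mathrm{PAR}}(k, \hat{\gamma}_n)\) under that assumption.

```python
# Sketch for the linear regression function f_gamma(x) = gamma_1 + (gamma_2, ..., gamma_d)^T x.
import numpy as np

def fit_linear_par(y, p):
    """y: count series of length n + p including the p starting values.
    Returns gamma_hat (solving (10.10)) and the score process S_PAR(k, gamma_hat)."""
    y = np.asarray(y, dtype=float)
    n = len(y) - p
    # design rows (1, Y_{t-1}, ..., Y_{t-p}) for t = 1, ..., n
    X = np.column_stack([np.ones(n)] + [y[p - j:p - j + n] for j in range(1, p + 1)])
    gamma_hat, *_ = np.linalg.lstsq(X, y[p:], rcond=None)   # OLS normal equations = (10.10)
    resid = y[p:] - X @ gamma_hat                           # Y_t - f_{gamma_hat}(...)
    summands = X * resid[:, None]                           # gradient times residual
    return gamma_hat, np.cumsum(summands, axis=0)           # S_PAR(k, gamma_hat), k = 1, ..., n
```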
Assumptions under \(H_0\):

\(H_0\)(i) \(\{Y_t\}\) is stationary and \(\alpha\)-mixing with exponential rate such that \(E \sup_\gamma f_\gamma^2(Y_{t-1}, \ldots, Y_{t-p}) < \infty\).

\(H_0\)(ii) \(f_\gamma(x) = \gamma_1 + f_{\gamma_2, \ldots, \gamma_d}(x)\) is twice continuously differentiable with respect to \(\gamma\) for all \(x \in \mathbb{N}_0^p\) and
\[
E \sup_\gamma \nabla f_\gamma(Y_{t-1}, \ldots, Y_{t-p})^T\, \nabla f_\gamma(Y_{t-1}, \ldots, Y_{t-p}) < \infty, \qquad E\Bigl( Y_t \sup_\gamma \bigl\|\nabla^2 f_\gamma(Y_{t-1}, \ldots, Y_{t-p})\bigr\| \Bigr) < \infty.
\]

\(H_0\)(iii) \(e(\gamma) = E\bigl(Y_t - f_\gamma(Y_{t-1}, \ldots, Y_{t-p})\bigr)^2\) has a unique minimizer \(\gamma_0\) in the interior of some compact set such that the Hessian of \(e\) at \(\gamma_0\) is positive definite.
As already mentioned, (i) is fulfilled for a large class of Poisson autoregressive processes under mild conditions. Assumption (ii) means that the autoregressive function used to construct the test statistic is linear in the first component, guaranteeing (10.11). Note that this assumption does not need to be fulfilled for the true regression function of \(\{Y_t\}\) (in fact, \(Y_t\) does not even need to be a Poisson autoregressive time series). Assumption (iii) is fulfilled for the true value if \(\{Y_t\}\) is a Poisson autoregressive time series with regression function \(f_\gamma\).
Theorem 10.5 Let, under the null hypothesis, \(H_0\)(i)–(iii) be fulfilled.

(a) If \(E \|\nabla f_{\gamma_0}(Y_p, \ldots, Y_1)\|^\nu < \infty\) for some \(\nu > 2\), then, for \(\tilde{S}_{\mathrm{PAR}}(k, \gamma) = \sum_{t=1}^{k} \bigl(Y_t - f_\gamma(Y_{t-1}, \ldots, Y_{t-p})\bigr)\) together with \(\hat{\gamma}_n\) as in (10.10), A.1 and A.3 in Section 10.7.1 hold with \(\Sigma = \sigma^2\), the long-run variance of \(Y_t - f_{\gamma_0}(Y_{t-1}, \ldots, Y_{t-p})\).

(b) If \(f_\gamma(x) = \gamma_1 + (\gamma_2, \ldots, \gamma_d)^T x\) (\(p = d - 1\)) and \(E |Y_0|^\nu < \infty\) for some \(\nu > 4\), then, for \(S_{\mathrm{PAR}}(k, \gamma) = \sum_{t=1}^{k} \mathbf{Y}_{t-1}\bigl(Y_t - \gamma^T \mathbf{Y}_{t-1}\bigr)\), where \(\mathbf{Y}_{t-1} = (1, Y_{t-1}, \ldots, Y_{t-d+1})^T\), together with \(\hat{\gamma}_n\) as in (10.10), A.1 and A.3 in Section 10.7.1 hold, where \(\Sigma\) is the long-run covariance matrix of \(\{\mathbf{Y}_{t-1}(Y_t - \gamma_0^T \mathbf{Y}_{t-1})\}\).

In particular, the change point statistics based on \(\tilde{S}_{\mathrm{PAR}}(k, \hat{\gamma}_n)\) in (a) and \(S_{\mathrm{PAR}}(k, \hat{\gamma}_n)\) in (b) have the null asymptotics as stated in Theorem 10.1.
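As a usage illustration (simulated data with made-up parameter values, not an example from the chapter), the residual statistic \(\tilde{S}_{\mathrm{PAR}}\) of part (a) is applied below to a linear Poisson autoregression of order one with a change in the intercept only; the argmax of the squared partial sums serves as a rescaled-time change point estimate in the spirit of Theorem 10.2(c).

```python
# Usage sketch: linear Poisson AR(1) with an intercept change at lambda = 0.5.
import numpy as np

rng = np.random.default_rng(0)
n, change, a0, a1 = 1000, 500, 1.0, 0.4
y = np.empty(n + 1)
y[0] = 1.0
for t in range(1, n + 1):
    intercept = a0 if t <= change else a0 + 1.5        # change in gamma_1 only
    y[t] = rng.poisson(intercept + a1 * y[t - 1])

X = np.column_stack([np.ones(n), y[:-1]])              # rows (1, Y_{t-1})
gamma_hat, *_ = np.linalg.lstsq(X, y[1:], rcond=None)  # least squares fit, cf. (10.10)
resid = y[1:] - X @ gamma_hat                          # residuals sum to zero, cf. (10.11)
S = np.cumsum(resid)                                   # S_tilde_PAR(k, gamma_hat)
lambda_hat = (np.argmax(S ** 2) + 1) / n               # rescaled-time change point estimate
print(gamma_hat, lambda_hat)
```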