we get the following assertions:
(a) If Assumptions B.1, B.3, and B.4(i) in Section 10.7.1 hold, then the Darling–Erdős- and max-type statistics for continuous weight functions fulfilling \(w(\lambda) > 0\) have asymptotic power one, that is, for all \(x \in \mathbb{R}\) it holds that
(i) \(P\left(\max_{1\le k\le n} \frac{1}{n}\, w(k/n)\, S(k,\hat{\theta}_n)^T \Sigma^{-1} S(k,\hat{\theta}_n) \ge x\right) \to 1\),
(ii) \(P\left(a(\log n) \max_{1\le k\le n} \sqrt{\frac{1}{n}\, S(k,\hat{\theta}_n)^T \Sigma^{-1} S(k,\hat{\theta}_n)} - b_d(\log n) \ge x\right) \to 1\).
If B.4(i) is replaced by B.4(ii), then the assertion remains true if \(\Sigma\) is replaced by \(\hat{\Sigma}_n\), a consistent estimator of \(\Sigma\).
(b) If additionally B.5 holds, then the sum-type statistic for a continuous weight function \(w(\cdot) \not\equiv 0\) fulfilling (10.3) has power one, that is, for all \(x \in \mathbb{R}\) it holds that
\[
P\left(\frac{1}{n^2} \sum_{1\le k\le n} w(k/n)\, S(k,\hat{\theta}_n)^T \Sigma^{-1} S(k,\hat{\theta}_n) \ge x\right) \to 1.
\]
If B.4(i) is replaced by B.4(ii), then the assertion remains true if \(\Sigma\) is replaced by \(\hat{\Sigma}_n\), a consistent estimator of \(\Sigma\).
(c) Let the change point be of the form \(k_0 = \lfloor \lambda n \rfloor\), \(0 < \lambda < 1\), and consider
\[
\hat{\lambda}_n = \frac{\arg\max_{1\le k\le n} S(k,\hat{\theta}_n)^T \Sigma^{-1} S(k,\hat{\theta}_n)}{n}.
\]
Under Assumptions B.1, B.3, and B.4(i), \(\hat{\lambda}_n\) is a consistent estimator for the change point in rescaled time \(\lambda\), that is,
\[
\hat{\lambda}_n - \lambda = o_P(1).
\]
If B.4(i) is replaced by B.4(iii), then the assertion remains true if \(\Sigma\) is replaced by \(\hat{\Sigma}_n\), a consistent estimator of \(\Sigma\).
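To make the estimator in (c) concrete, the following minimal sketch (Python, illustrative names; the chapter itself contains no code) forms the quadratic-form process \(S(k,\hat{\theta}_n)^T \hat{\Sigma}_n^{-1} S(k,\hat{\theta}_n)/n\) from an array of estimated score summands and returns the argmax in rescaled time; the covariance estimate is assumed to be supplied by the user.

```python
# Minimal sketch (not the authors' code): `summands` is an (n, d) array whose
# t-th row is the estimated score contribution s_t(theta_hat), so that the
# partial sums S(k, theta_hat) are cumulative sums of the rows.
import numpy as np

def quadratic_form_process(summands, sigma=None):
    """Return T_k = S(k)^T Sigma^{-1} S(k) / n for k = 1, ..., n."""
    n, d = summands.shape
    S = np.cumsum(summands, axis=0)                       # S(k, theta_hat)
    sigma_inv = np.linalg.inv(sigma) if sigma is not None else np.eye(d)
    return np.einsum("kd,de,ke->k", S, sigma_inv, S) / n

def change_point_estimate(summands, sigma=None):
    """Rescaled-time estimator lambda_hat = (argmax_k T_k) / n as in Theorem 10.2(c)."""
    T = quadratic_form_process(summands, sigma)
    return (np.argmax(T) + 1) / len(T)
```

The same partial-sum array can be reused for the weighted max- and sum-type statistics by multiplying \(T_k\) with \(w(k/n)\) before taking the maximum or the average.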
Assumption (10.5) is standard in change point analysis but can be weakened for Part (a)
of Theorem 10.2. The continuity assumption on the weight function can be relaxed in this
situation.
While these test statistics were designed for the situation where at most one change is expected, they usually also have power against multiple changes. This fact is the underlying principle of binary segmentation procedures (first proposed by Vostrikova [36]), which work as follows: The data set is tested using an at-most-one-change test as given earlier. If that test is significant, the data set is split at the estimated change point and the procedure is repeated on both data segments until no test is significant anymore. Recently, a randomized version of binary segmentation has been proposed for the simple mean change problem (Fryzlewicz [14]).
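The recursion can be sketched as follows; the single-change test `amoc_test` is a placeholder for any of the at-most-one-change tests above (assumed to return a significance decision and an estimated change point), and the minimum segment length is an illustrative tuning choice rather than part of Vostrikova's procedure.

```python
# Sketch of binary segmentation with a user-supplied at-most-one-change test.
def binary_segmentation(x, amoc_test, min_len=20, offset=0):
    """Recursively test, split at the estimated change point, and collect the
    detected change points (as indices into the original series)."""
    if len(x) < min_len:                     # segment too short to test
        return []
    significant, khat = amoc_test(x)         # khat = first index after the change
    if not significant:
        return []
    left = binary_segmentation(x[:khat], amoc_test, min_len, offset)
    right = binary_segmentation(x[khat:], amoc_test, min_len, offset + khat)
    return left + [offset + khat] + right
```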
The optimal rate of convergence in (c) is usually given by \(\hat{\lambda}_n - \lambda = O_P(1/n)\) but requires a much more involved proof (see Csörgő and Horváth [5], Theorem 2.8.1, for a proof in a mean change model).
10.3 Detection of Changes in Binary Models
Binary time series are important in applications where one observes whether a certain event has or has not occurred. Wilks and Wilby [40], for example, observe whether it has been raining on a specific day, and Kauppi and Saikkonen [18] and Startz [34] observe whether or not a recession has occurred in a given month. A common binary time series model is given by
\[
Y_t \mid Y_{t-1}, Y_{t-2}, \ldots, Z_{t-1}, Z_{t-2}, \ldots \sim \mathrm{Bern}\bigl(\pi_t(\beta)\bigr), \quad \text{with } g\bigl(\pi_t(\beta)\bigr) = \beta^T Z_{t-1},
\]
where \(Z_{t-1} = (Z_{t-1}(1), \ldots, Z_{t-1}(p))^T\) is a regressor, which can be purely exogenous (i.e., \(\{Z_t\}\) is independent of \(\{Y_t\}\)), purely autoregressive (i.e., \(Z_{t-1} = (Y_{t-1}, \ldots, Y_{t-p})\)), or a mixture of both (in particular, the independence assumption does not need to hold). Similar to generalized linear models, the canonical link function \(g(x) = \log(x/(1-x))\) is used, and statistical inference is based on the partial likelihood
\[
L(\beta) = \prod_{t=1}^{n} \pi_t(\beta)^{y_t} \bigl(1 - \pi_t(\beta)\bigr)^{1 - y_t},
\]
with corresponding score vector
\[
S_{\mathrm{BAR}}(k, \beta) = \sum_{t=1}^{k} Z_{t-1}\bigl(Y_t - \pi_t(\beta)\bigr) \qquad (10.6)
\]
for the canonical link function.
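As an illustration, the following minimal sketch fits \(\hat{\beta}_n\) from the partial likelihood under the logit link, that is, it solves \(S_{\mathrm{BAR}}(n, \beta) = 0\), and returns the score process \(S_{\mathrm{BAR}}(k, \hat{\beta}_n)\); the root-finding routine and the variable names are implementation choices, not taken from the chapter.

```python
# Sketch (assumptions: logit link, regressor matrix Z whose t-th row is Z_{t-1}).
import numpy as np
from scipy.optimize import root

def fit_bar(y, Z):
    """y: (n,) array of 0/1 responses; Z: (n, p) regressor matrix.
    Returns beta_hat and the cumulative score process S_BAR(k, beta_hat)."""
    def score(beta):
        pi = 1.0 / (1.0 + np.exp(-Z @ beta))         # pi_t(beta) under the logit link
        return Z.T @ (y - pi)                        # S_BAR(n, beta) as in (10.6)
    beta_hat = root(score, x0=np.zeros(Z.shape[1])).x    # solves (10.7)
    pi_hat = 1.0 / (1.0 + np.exp(-Z @ beta_hat))
    summands = Z * (y - pi_hat)[:, None]             # rows Z_{t-1}(Y_t - pi_t(beta_hat))
    return beta_hat, np.cumsum(summands, axis=0)     # S_BAR(k, beta_hat), k = 1, ..., n
```

The cumulative score array can then be plugged into the quadratic-form statistics sketched after Theorem 10.2, together with an estimate of the corresponding covariance matrix.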
Theorem 10.3 We get the following assertions under the null hypothesis:
(a) Let the covariate process \(\{Z_t\}\) be strictly stationary and ergodic with finite fourth moments. Then, under the null hypothesis, A.1 and A.3(i) in Section 10.7.1 are fulfilled for the partial sum process \(S_{\mathrm{BAR}}(k, \hat{\beta}_n)\) and \(\hat{\beta}_n\) defined by
\[
S_{\mathrm{BAR}}(n, \hat{\beta}_n) = 0. \qquad (10.7)
\]
(b) If \((Y_t, Z_{t-1}, \ldots, Z_{t-p})^T\) is also \(\alpha\)-mixing with exponential rate, then A.3(ii) and (iii) in Section 10.7.1 are fulfilled.
In particular, change point statistics based on \(S_{\mathrm{BAR}}(k, \hat{\beta}_n)\) have the null asymptotics as stated in Theorem 10.1 with \(\Sigma = \operatorname{cov}\bigl(Z_{t-1}(Y_t - \pi_t(\beta_0))\bigr)\), where \(\beta_0\) is the true parameter under the null hypothesis.
Remark 10.1 For \(Z_{t-1} = (Y_{t-1}, \ldots, Y_{t-p})^T\), \(Y_t\) is the standard binary autoregressive model (BAR(p)), for which the assumptions of Theorem 10.3(b) are fulfilled; see, for example, Wang and Li [37]. However, under some regularity assumptions on the exogenous process, one can prove that \((Y_t, \ldots, Y_{t-p+1}, Z_t, \ldots, Z_{t-q})\) is a Feller chain, for which Theorem 1 of Feigin and Tweedie [8] implies geometric ergodicity (see Kirch and Tadjuidje Kamgaing [21] for details), implying that it is \(\beta\)-mixing with exponential rate.
Remark 10.2 The mixing concept can be regarded as an asymptotic measure of independence in time between the observations. The reader can refer to Tadjuidje et al. [35], Remark 4, for \(\alpha\)- and \(\beta\)-mixing definitions, as well as a concise summary of their relationship to geometric ergodicity for a Markov chain.
Instead of using the full vector partial sum process \(S_{\mathrm{BAR}}(k, \hat{\beta}_n)\) to construct the test statistics, often lower-dimensional linear combinations are used, such as
\[
\tilde{S}_{\mathrm{BAR}}(k, \hat{\beta}_n) = \sum_{t=1}^{k} \bigl(Y_t - \pi_t(\hat{\beta}_n)\bigr), \qquad (10.8)
\]
where \(\hat{\beta}_n\) is defined by (10.7). If the assumptions of Theorem 10.3 are fulfilled, then the null asymptotics as in Theorem 10.1 with \(\Sigma = \operatorname{cov}\bigl(Y_t - \pi_t(\beta_0)\bigr)\) hold.
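A hedged sketch of a max-type test based on (10.8) is given below; the constant weight \(w \equiv 1\) and the Bartlett-type long-run variance estimator are illustrative standardization choices made here, not prescriptions from the chapter.

```python
# Sketch: max-type CUSUM statistic from the residuals Y_t - pi_t(beta_hat).
import numpy as np

def max_type_statistic(residuals, bandwidth=None):
    """Return max_k S_tilde(k)^2 / (n * sigma2_hat) with a Bartlett-type
    long-run variance estimate; `bandwidth` is an illustrative tuning choice."""
    e = np.asarray(residuals, dtype=float)
    n = len(e)
    h = bandwidth if bandwidth is not None else int(n ** (1 / 3))
    ec = e - e.mean()                          # mean is ~0 if the model has an intercept
    sigma2 = ec @ ec / n
    for lag in range(1, h + 1):
        w = 1.0 - lag / (h + 1.0)              # Bartlett weights
        sigma2 += 2.0 * w * (ec[lag:] @ ec[:-lag]) / n
    S = np.cumsum(e)                           # S_tilde_BAR(k, beta_hat) as in (10.8)
    return np.max(S ** 2) / (n * sigma2)
```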
The statistic based on \(S_{\mathrm{BAR}}(k, \hat{\beta}_n)\) for \(w \equiv 1\) has been proposed by Fokianos et al. [11]; a statistic based on \(\tilde{S}_{\mathrm{BAR}}(k, \hat{\beta}_n)\) with a somewhat different standardization in a purely autoregressive setup has been considered by Hudecová [17]. Hudecová's statistic is the score statistic based on the partial likelihood and the restricted alternative of a change only in the intercept.
BAR-Alternative: Let the following assumptions hold:

\(H_1\)(i) The change point is of the form \(k_0 = \lfloor \lambda n \rfloor\), \(0 < \lambda < 1\).

\(H_1\)(ii) The binary time series \(\{Y_t\}\) and the covariate process \(\{Z_t\}\) before the change fulfill the assumptions of Theorem 10.3a.

\(H_1\)(iii) The time series after the change as well as the covariate process after the change can be written as \(Y_t = \tilde{Y}_t + R_1(t)\) and \(Z_t = \tilde{Z}_t + R_2(t)\), respectively, for \(t > \lfloor n\lambda \rfloor\), where \(\{\tilde{Y}_t\}\) is bounded, stationary, and ergodic and \(\{\tilde{Z}_t\}\) is square integrable as well as stationary and ergodic, with remainder terms fulfilling
\[
\frac{1}{n} \sum_{t=\lfloor \lambda n \rfloor + 1}^{n} R_1^2(t) = o_P(1), \qquad \frac{1}{n} \sum_{t=\lfloor \lambda n \rfloor + 1}^{n} \|R_2(t)\|^2 = o_P(1).
\]
\(H_1\)(iv) \(\lambda\, E\, Z_0\bigl(Y_1 - \pi_1(\beta)\bigr) + (1 - \lambda)\, E\, \tilde{Z}_{n-1}\bigl(\tilde{Y}_n - \tilde{\pi}_n(\beta)\bigr)\) has a unique zero \(\beta_1\), and the parameter space \(\Theta\) is compact and convex with \(\beta_1 \in \Theta\).
Assumption (i) is standard in change point analysis but could be relaxed, and assumption (ii) states that the time series before the change fulfills the assumptions under the null hypothesis. Assumption (iii) allows for rather general alternatives, including situations where starting values from before the change are used, resulting in a nonstationary time series after the change. Assumption (iv) guarantees that the estimator \(\hat{\beta}_n\) converges to \(\beta_1\). Neither Hudecová [17] nor Fokianos et al. [11] have derived the behavior of their statistics under alternatives.
Theorem 10.4 Let \(H_1\)(i)–\(H_1\)(iv) hold.

(a) For \(S_{\mathrm{BAR}}(k, \beta)\) as in (10.6), B.1 and B.2 are fulfilled, which implies B.3. If \(k_0 = \lfloor \lambda n \rfloor\), then B.5 is fulfilled with \(F_\lambda(\beta) = \lambda\, E\, Z_0\bigl(Y_1 - \pi_1(\beta)\bigr)\).

(b) For \(\tilde{S}_{\mathrm{BAR}}(k, \beta)\) as in (10.8) and if \(k_0 = \lfloor \lambda n \rfloor\), then B.5 is fulfilled with \(F_\lambda(\beta) = \lambda\, E\bigl(Y_1 - \pi_1(\beta)\bigr)\).
B.4 is fulfilled for the full score statistic from Theorem 10.4a if the time series before and after the change are correctly specified binary time series models with different parameters. Otherwise, restrictions apply. Together with Theorem 10.2, this implies that the corresponding change point statistics have power one and the point where the maximum is obtained is a consistent estimator for the change point in rescaled time.
10.4 Detection of Changes in Poisson Autoregressive Models
Another popular model for time series of counts is the Poisson autoregression, where we observe \(Y_{1-p}, \ldots, Y_n\) with
\[
Y_t \mid Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p} \sim \mathrm{Pois}(\lambda_t), \qquad \lambda_t = f_\gamma(Y_{t-1}, \ldots, Y_{t-p}) \qquad (10.9)
\]
for some d-dimensional parameter vector \(\gamma\). If \(f_\gamma(x)\) is Lipschitz continuous in \(x\) for all \(\gamma\) with Lipschitz constant strictly smaller than 1, then there exists a stationary ergodic solution of (10.9) that is \(\beta\)-mixing with exponential rate (Neumann [28]). For a given parametric model \(f_\theta\), this allows us to consider score-type change point statistics based on likelihood equations using the tools of Section 10.2. The mixing condition in connection with some moment conditions typically allows one to derive A.3, while a Taylor expansion in connection with the \(\sqrt{n}\)-consistency of the corresponding maximum likelihood estimator (e.g., derived by Doukhan and Kengne [7], Theorem 3.2) gives A.1 under some additional moment conditions. Related test statistics for independent Poisson data are discussed in Robbins et al. [32]. In this chapter, however, we will concentrate on change point statistics related to those proposed by Franke et al. [13], which are based on least squares scores and, as such, do not make use of the Poisson structure of the process. Nevertheless, the methods described in Section 10.2 can be used to derive change point tests based on the partial likelihood, which can be expected to have higher power if the model is correctly specified.
Consider the least squares estimator \(\hat{\gamma}_n\) defined by
\[
S_{\mathrm{PAR}}(n, \hat{\gamma}_n) = 0, \quad \text{where } S_{\mathrm{PAR}}(k, \gamma) = \sum_{t=1}^{k} \nabla f_\gamma(Y_{t-1}, \ldots, Y_{t-p})\, \bigl(Y_t - f_\gamma(Y_{t-1}, \ldots, Y_{t-p})\bigr), \qquad (10.10)
\]
where \(\nabla\) denotes the gradient with respect to \(\gamma\). Under the additional assumption \(f_\gamma(x) = \gamma_1 + f_{\gamma_2, \ldots, \gamma_d}(x)\), that is, if \(\gamma_1\) is an additive constant in the regression function, this implies in particular that
\[
\sum_{t=1}^{n} \bigl(Y_t - f_{\hat{\gamma}_n}(Y_{t-1}, \ldots, Y_{t-p})\bigr) = 0. \qquad (10.11)
\]
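For the linear specification considered later in Theorem 10.5(b), solving (10.10) amounts to ordinary least squares of \(Y_t\) on \((1, Y_{t-1}, \ldots, Y_{t-p})\). The following minimal sketch (illustrative names, not the authors' code) computes \(\hat{\gamma}_n\) and the score process \(S_{\mathrm{PAR}}(k, \hat{\gamma}_n)\) under that assumption.

```python
# Sketch for the linear regression function f_gamma(x) = gamma_1 + (gamma_2, ..., gamma_d)^T x.
import numpy as np

def fit_linear_par(y, p):
    """y: count series of length n + p including the p starting values.
    Returns gamma_hat (solving (10.10)) and the score process S_PAR(k, gamma_hat)."""
    y = np.asarray(y, dtype=float)
    n = len(y) - p
    # design rows (1, Y_{t-1}, ..., Y_{t-p}) for t = 1, ..., n
    X = np.column_stack([np.ones(n)] + [y[p - j:p - j + n] for j in range(1, p + 1)])
    gamma_hat, *_ = np.linalg.lstsq(X, y[p:], rcond=None)   # OLS normal equations = (10.10)
    resid = y[p:] - X @ gamma_hat                           # Y_t - f_{gamma_hat}(...)
    summands = X * resid[:, None]                           # gradient times residual
    return gamma_hat, np.cumsum(summands, axis=0)           # S_PAR(k, gamma_hat), k = 1, ..., n
```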
Assumptions under \(H_0\):

\(H_0\)(i) \(\{Y_t\}\) is stationary and \(\alpha\)-mixing with exponential rate such that \(E \sup_\gamma f_\gamma^2(Y_{t-1}, \ldots, Y_{t-p}) < \infty\).

\(H_0\)(ii) \(f_\gamma(x) = \gamma_1 + f_{\gamma_2, \ldots, \gamma_d}(x)\) is twice continuously differentiable with respect to \(\gamma\) for all \(x \in \mathbb{N}_0^p\) and
\[
E \sup_\gamma \nabla f_\gamma(Y_{t-1}, \ldots, Y_{t-p})^T\, \nabla f_\gamma(Y_{t-1}, \ldots, Y_{t-p}) < \infty, \qquad E\Bigl( Y_t \sup_\gamma \bigl\|\nabla^2 f_\gamma(Y_{t-1}, \ldots, Y_{t-p})\bigr\| \Bigr) < \infty.
\]

\(H_0\)(iii) \(e(\gamma) = E\bigl(Y_t - f_\gamma(Y_{t-1}, \ldots, Y_{t-p})\bigr)^2\) has a unique minimizer \(\gamma_0\) in the interior of some compact set such that the Hessian of \(e\) at \(\gamma_0\) is positive definite.
As already mentioned, (i) is fulfilled for a large class of Poisson autoregressive processes under mild conditions. Assumption (ii) means that the autoregressive function used to construct the test statistic is linear in the first component, guaranteeing (10.11). Note that this assumption does not need to be fulfilled for the true regression function of \(\{Y_t\}\) (in fact, \(Y_t\) does not even need to be a Poisson autoregressive time series). Assumption (iii) is fulfilled for the true value if \(\{Y_t\}\) is a Poisson autoregressive time series with regression function \(f_\gamma\).
Theorem 10.5 Let, under the null hypothesis, \(H_0\)(i)–(iii) be fulfilled.

(a) If \(E \|\nabla f_{\gamma_0}(Y_p, \ldots, Y_1)\|^\nu < \infty\) for some \(\nu > 2\), then, for \(\tilde{S}_{\mathrm{PAR}}(k, \gamma) = \sum_{t=1}^{k} \bigl(Y_t - f_\gamma(Y_{t-1}, \ldots, Y_{t-p})\bigr)\) together with \(\hat{\gamma}_n\) as in (10.10), A.1 and A.3 in Section 10.7.1 hold with \(\Sigma = \sigma^2\), the long-run variance of \(Y_t - f_{\gamma_0}(Y_{t-1}, \ldots, Y_{t-p})\).

(b) If \(f_\gamma(x) = \gamma_1 + (\gamma_2, \ldots, \gamma_d)^T x\) (\(p = d - 1\)) and \(E |Y_0|^\nu < \infty\) for some \(\nu > 4\), then, for \(S_{\mathrm{PAR}}(k, \gamma) = \sum_{t=1}^{k} \mathbf{Y}_{t-1}\bigl(Y_t - \gamma^T \mathbf{Y}_{t-1}\bigr)\), where \(\mathbf{Y}_{t-1} = (1, Y_{t-1}, \ldots, Y_{t-d+1})^T\), together with \(\hat{\gamma}_n\) as in (10.10), A.1 and A.3 in Section 10.7.1 hold, where \(\Sigma\) is the long-run covariance matrix of \(\{\mathbf{Y}_{t-1}(Y_t - \gamma_0^T \mathbf{Y}_{t-1})\}\).

In particular, the change point statistics based on \(\tilde{S}_{\mathrm{PAR}}(k, \hat{\gamma}_n)\) in (a) and \(S_{\mathrm{PAR}}(k, \hat{\gamma}_n)\) in (b) have the null asymptotics as stated in Theorem 10.1.
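As a usage illustration (simulated data with made-up parameter values, not an example from the chapter), the residual statistic \(\tilde{S}_{\mathrm{PAR}}\) of part (a) is applied below to a linear Poisson autoregression of order one with a change in the intercept only; the argmax of the squared partial sums serves as a rescaled-time change point estimate in the spirit of Theorem 10.2(c).

```python
# Usage sketch: linear Poisson AR(1) with an intercept change at lambda = 0.5.
import numpy as np

rng = np.random.default_rng(0)
n, change, a0, a1 = 1000, 500, 1.0, 0.4
y = np.empty(n + 1)
y[0] = 1.0
for t in range(1, n + 1):
    intercept = a0 if t <= change else a0 + 1.5        # change in gamma_1 only
    y[t] = rng.poisson(intercept + a1 * y[t - 1])

X = np.column_stack([np.ones(n), y[:-1]])              # rows (1, Y_{t-1})
gamma_hat, *_ = np.linalg.lstsq(X, y[1:], rcond=None)  # least squares fit, cf. (10.10)
resid = y[1:] - X @ gamma_hat                          # residuals sum to zero, cf. (10.11)
S = np.cumsum(resid)                                   # S_tilde_PAR(k, gamma_hat)
lambda_hat = (np.argmax(S ** 2) + 1) / n               # rescaled-time change point estimate
print(gamma_hat, lambda_hat)
```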