and C = T = 1 yields the numerical sequence 011101010 ..., which is not very interesting.
However, if we use the strong–weak bonding alphabet, W = {A, T} = 0 and S = {C, G} = 1,
then the sequence becomes 001001001 ..., which is very interesting. It should be clear, then,
that one does not want to focus on only one scaling. Instead, the focus should be on finding
scalings that bring out all of the interesting features in the data. Rather than choose values
arbitrarily, the spectral envelope approach selects scales that help emphasize any periodic
feature that exists in a categorical time series of virtually any length in a quick and
automated fashion. In addition, the technique can help in determining whether a sequence is
merely a random assignment of categories.
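
The effect of the choice of scaling can be sketched in a few lines of Python. The nine-letter sequence below is hypothetical: it was chosen only so that the two scalings reproduce the digit strings quoted above, and the first scaling is assumed to set A = G = 0 alongside C = T = 1. The helper name `apply_scaling` is likewise illustrative.

```python
def apply_scaling(sequence, scaling):
    """Replace each category (base) by the numerical value assigned to it."""
    return "".join(str(scaling[base]) for base in sequence)

# Hypothetical sequence, picked to match the digit strings quoted in the text.
sequence = "ATCTACATG"

# Assumed first scaling: A = G = 0 together with C = T = 1.
first_scaling = {"A": 0, "G": 0, "C": 1, "T": 1}

# Strong-weak bonding alphabet: W = {A, T} = 0 and S = {C, G} = 1.
strong_weak = {"A": 0, "T": 0, "C": 1, "G": 1}

first = apply_scaling(sequence, first_scaling)
second = apply_scaling(sequence, strong_weak)

print(first)
print(second)
```

Under the second scaling the period-3 pattern is visible by eye, whereas under the first it is hidden, which is why committing to a single arbitrary scaling can miss structure.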
13.3 Definition of Spectral Envelope
As a general description, the spectral envelope is a frequency-based, principal component
technique applied to a multivariate time series. In this section, we will focus on the basic
concept and its use in the analysis of categorical time series. Technical details can be found
in Stoffer et al. (1993a).
Briefly, in establishing the spectral envelope for categorical time series, we addressed
the basic question of how to efficiently discover periodic components in categorical
time series. This was accomplished via nonparametric spectral analysis as follows. Let
$\{X_t;\ t = 0, \pm 1, \pm 2, \ldots\}$ be a categorical-valued time series with finite
state-space $C = \{c_1, c_2, \ldots, c_{k+1}\}$. Assume that $X_t$ is stationary and
$p_j = \Pr\{X_t = c_j\} > 0$ for $j = 1, 2, \ldots, k+1$.
For $\beta = (\beta_1, \beta_2, \ldots, \beta_{k+1})' \in \mathbb{R}^{k+1}$, denote by
$X_t(\beta)$ the real-valued stationary time series corresponding to the scaling that
assigns the category $c_j$ the numerical value $\beta_j$, for $j = 1, 2, \ldots, k+1$.
Our goal was to find a scaling $\beta$ so that the spectral density is in some
sense interesting and to summarize the spectral information by what we called the spectral
envelope.
We chose $\beta$ to maximize the power (variance) at each frequency $\omega$, across
frequencies $\omega \in (-1/2, 1/2]$, relative to the total power
$\sigma^2(\beta) = \mathrm{Var}\{X_t(\beta)\}$. That is, we chose $\beta(\omega)$, at
each $\omega$ of interest, so that

$$\lambda(\omega) = \sup_{\beta} \left\{ \frac{f(\omega; \beta)}{\sigma^2(\beta)} \right\}, \tag{13.3}$$

over all $\beta$ not proportional to $1_{k+1}$, the $(k+1) \times 1$ vector of ones. Note
that $\lambda(\omega)$ is not defined if $\beta = a 1_{k+1}$ for $a \in \mathbb{R}$
because such a scaling corresponds to assigning each category the same value $a$; in this
case, $f(\omega; \beta) \equiv 0$ and $\sigma^2(\beta) = 0$. The optimality criterion
$\lambda(\omega)$ possesses the desirable property of being invariant under location and
scale changes of $\beta$.
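
The invariance property can be checked numerically. The sketch below is not the estimator used in the source: it assumes the raw periodogram as a crude stand-in for the spectral density f(ω; β) and the sample variance for σ²(β). The function names, the category coding A, C, G, T = 0, 1, 2, 3, and the test sequence are all illustrative assumptions.

```python
import numpy as np

def scaled_series(x, beta):
    """Map category indices x (values 0..k) to real numbers via the scaling beta."""
    return np.asarray(beta, dtype=float)[np.asarray(x)]

def power_ratio(x, beta, freq_index):
    """Periodogram of the scaled series at frequency freq_index/n over its sample variance.

    Assumption: the raw periodogram stands in for the spectral density.
    """
    z = scaled_series(x, beta)
    n = z.size
    z_centered = z - z.mean()
    variance = z_centered @ z_centered / n
    if np.isclose(variance, 0.0):
        # beta is proportional to the vector of ones: the ratio is undefined.
        raise ValueError("scaling assigns every category the same value")
    dft = np.fft.fft(z_centered)
    periodogram = np.abs(dft[freq_index]) ** 2 / n
    return periodogram / variance

# Hypothetical categorical series: a nine-symbol block repeated 10 times (n = 90),
# coded A, C, G, T = 0, 1, 2, 3.
x = np.tile([0, 3, 1, 3, 0, 1, 0, 3, 2], 10)

beta = np.array([0.0, 1.0, 1.0, 0.0])   # strong-weak scaling
beta_shifted = 5.0 + 2.0 * beta          # location and scale change of beta

k = x.size // 3                          # Fourier index of frequency 1/3
ratio = power_ratio(x, beta, k)
ratio_shifted = power_ratio(x, beta_shifted, k)
print(ratio, ratio_shifted)
```

The two printed ratios agree up to floating-point error, while a constant scaling such as `[1, 1, 1, 1]` raises the error, mirroring the excluded case where β is proportional to the vector of ones.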
As in most scaling problems for categorical data, it was useful to represent the categories
in terms of the vectors $e_1, e_2, \ldots, e_{k+1}$, where $e_j$ represents the
$(k+1) \times 1$ vector with one in the $j$th row and zeros elsewhere. We then defined a
$(k+1)$-dimensional stationary time series $Y_t$ by $Y_t = e_j$ when $X_t = c_j$. The time
series $X_t(\beta)$ can be obtained from the $Y_t$ time series by the relationship
$X_t(\beta) = \beta' Y_t$. Assume that the vector process $Y_t$ has a continuous spectral
density denoted by $f_Y(\omega)$. For each $\omega$, $f_Y(\omega)$ is, of course, a
$(k+1) \times (k+1)$ complex-valued Hermitian matrix. Note that the relationship
$X_t(\beta) = \beta' Y_t$ implies that