21 Long Memory Discrete-Valued Time Series
Robert Lund, Scott H. Holan, and James Livsey
CONTENTS
21.1 Introduction
21.2 Inadequacies of Classical Approaches
21.3 Valid Long Memory Count Approaches
21.4 Binary Series
21.5 Bayesian Long Memory Models
21.6 Conclusion
References
21.1 Introduction
This chapter reviews modeling and inference issues for discrete-valued time series with
long memory, with an emphasis on count series. En route, several recent areas of research
and possible extensions are described, including Bayesian methods and estimation issues.
A covariance stationary time series $\{X_t\}$ with finite second moments is said to have long memory (also called long-range dependence) when
$$\sum_{h=0}^{\infty} \left| \mathrm{Cov}(X_t, X_{t+h}) \right| = \infty.$$
Other denitions of long memory are possible (Guégan 2005). Long memory time series
models and applications are ubiquitous in the modern sciences (Granger and Joyeux 1980,
Hosking 1981, Geweke and Porter-Hudak 1983, Robinson 2003, Beran 1994, Palma 2007).
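As a quick numerical illustration (ours, not from the chapter), the partial sums of an absolutely summable, geometrically decaying autocovariance stabilize, while those of a hyperbolically decaying one of order $h^{2d-1}$ (here with the assumed value $d = 0.3$) grow without bound:

```python
# A minimal numerical illustration: partial sums of |Cov(X_t, X_{t+h})|
# stabilize under geometric decay (short memory) but keep growing under
# hyperbolic decay of order h^(2d - 1) (long memory).
import numpy as np

h = np.arange(1, 10**6)
geometric = 0.9 ** h               # ARMA-type decay; sum converges to 9
hyperbolic = h ** (2 * 0.3 - 1.0)  # h^(2d-1) with d = 0.3; sum diverges

print(geometric.sum())    # ~9.0 no matter how large the horizon
print(hyperbolic.sum())   # grows without bound as the horizon increases
```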
However, time series having both long memory and a discrete (count) marginal distribution have been more difficult to devise/quantify; literature on the topic is scarce (Quoreshi 2014 is an exception). Here, we overview models for long memory count series, discussing what types of methods will and will not produce long memory features.
One appealing approach for modeling non-Gaussian long-range dependence is proposed by Palma and Zevallos (2010). Here, the authors devise a long memory model where the distribution of the current observation is specified conditionally upon past observations. Often, such series are stationary (see MacDonald and Zucchini [2015; Chapter 12 in this volume]).
On Bayesian fronts, Brockwell (2007) constructs a model having a general non-Gaussian
distribution (including Poisson) conditional on a long memory latent Gaussian process.
While Brockwell (2007) does not pursue probabilistic properties of his model, a Markov
chain Monte Carlo (MCMC) sampling algorithm is devised for efficient estimation. Also
worth mentioning are the parameter-based (process-based) approach of Creal et al. (2013)
and the estimating equation approach of Thavaneswaran and Ravishanker (2015; Chapter 7
in this volume).
This chapter proceeds as follows. Section 21.2 shows why some classical approaches
will not generate discrete-valued long memory time series. Methods capable of generating long memory count time series are presented in Section 21.3. The special case of a
binary long memory series is discussed in Section 21.4. Bayesian methods are pursued
in Section 21.5, where some open research questions are suggested. Section 21.6 provides
concluding discussion.
21.2 Inadequacies of Classical Approaches
Integer autoregressive moving-average (INARMA) models (Steutel and van Harn 1979,
McKenzie 1985, 1986, 1988, Al-Osh and Alzaid 1988) and discrete ARMA (DARMA)
methods (Jacobs and Lewis 1978a, 1978b) cannot produce long memory series. A simple
first-order integer autoregression, for example, obeys the recursion
$$X_t = p \circ X_{t-1} + Z_t, \quad t = 0, \pm 1, \pm 2, \ldots, \tag{21.1}$$
where $p \in (0, 1)$ is a parameter, $\circ$ is the thinning operator, and $\{Z_t\}_{t=-\infty}^{\infty}$ is an independent and identically distributed (IID) sequence supported on the nonnegative integers. Clarifying, $p \circ M$ is a binomial random variable with $M$ trials and success probability $p$. The thinning in (21.1) serves to keep the series integer valued. While the solution of (21.1) is stationary, its lag $h$ autocovariance is proportional to $p^h$ and is hence absolutely summable over all lags. Higher-order autoregressions have the same autocovariance summability properties; that is, one cannot construct long memory count models with INARMA methods.
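For concreteness, here is a minimal Python sketch (our illustration, not the authors' code) of recursion (21.1) with Poisson innovations; numpy's binomial sampler realizes the thinning $p \circ X_{t-1}$, and the parameter values are assumptions.

```python
# A sketch of the INAR(1) recursion (21.1) with Poisson innovations.
import numpy as np

def simulate_inar1(n, p=0.5, lam=2.0, seed=0):
    rng = np.random.default_rng(seed)
    x = np.empty(n, dtype=np.int64)
    x[0] = rng.poisson(lam / (1 - p))     # start near the stationary mean
    for t in range(1, n):
        # rng.binomial(x[t-1], p) is the thinning p ∘ X_{t-1}
        x[t] = rng.binomial(x[t - 1], p) + rng.poisson(lam)
    return x

x = simulate_inar1(10_000)
# The sample ACF of x decays like p^h: geometric, hence short memory.
```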
Similarly, DARMA methods also cannot produce long memory sequences. A first-order discrete autoregression with marginal distribution $\pi$ obeys the recursion
$$X_t = A_t X_{t-1} + (1 - A_t) Z_t, \quad t = 1, 2, \ldots,$$
where $\{Z_t\}$ is IID with distribution $\pi$ and $\{A_t\}$ is an IID sequence of Bernoulli trials with $P[A_t = 1] \equiv p$. The recursion commences with a draw from the specified marginal distribution $\pi$: $X_0 \stackrel{D}{=} \pi$. While any marginal distribution can be achieved, the lag $h$ autocovariance is again proportional to $p^h$, which is absolutely summable in lag. Moving to higher-order models does not alter this absolute summability.
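A parallel sketch simulates the DAR(1) recursion; the Poisson(3) choice for $\pi$ and the value $p = 0.5$ are purely illustrative assumptions.

```python
# A sketch of the DAR(1) recursion: X_t = A_t X_{t-1} + (1 - A_t) Z_t.
import numpy as np

def simulate_dar1(n, p=0.5, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.poisson(3.0, size=n)      # IID draws from pi (here Poisson(3))
    a = rng.random(n) < p             # Bernoulli(p) indicators A_t
    x = np.empty(n, dtype=np.int64)
    x[0] = rng.poisson(3.0)           # X_0 drawn from pi
    for t in range(1, n):
        x[t] = x[t - 1] if a[t] else z[t]
    return x
```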
ARMA methods will not produce long memory series, even in noncount settings. The classical ARMA($p, q$) difference equation is
$$X_t - \phi_1 X_{t-1} - \cdots - \phi_p X_{t-p} = Z_t + \theta_1 Z_{t-1} + \cdots + \theta_q Z_{t-q}, \quad t = 0, \pm 1, \ldots, \tag{21.2}$$
where $\{Z_t\}_{t=-\infty}^{\infty}$ is white noise (uncorrelated in time) with variance $\sigma_Z^2$. Here, $\phi_1, \ldots, \phi_p$ are autoregressive coefficients and $\theta_1, \ldots, \theta_q$ are moving-average coefficients. We make the usual assumption that the autoregressive (AR) polynomial
$$\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p \tag{21.3}$$
and the moving-average (MA) polynomial
$$\theta(z) = 1 + \theta_1 z + \cdots + \theta_q z^q \tag{21.4}$$
have no common roots (this is needed for solutions of the difference equation to be unique in mean square). When the AR polynomial has no roots on the complex unit circle $\{z : |z| = 1\}$, solutions to (21.2) can be expressed in the form
$$X_t = \sum_{k=-\infty}^{\infty} \psi_k Z_{t-k}, \quad t = 0, \pm 1, \ldots, \tag{21.5}$$
where the weights are absolutely summable (i.e., $\sum_{k=-\infty}^{\infty} |\psi_k| < \infty$). From (21.5), we have
$$\mathrm{Cov}(X_t, X_{t+h}) = \sigma_Z^2 \sum_{k=-\infty}^{\infty} \psi_k \psi_{k+h}, \quad h = 0, \pm 1, \pm 2, \ldots. \tag{21.6}$$
It now follows that
$$\sum_{h=-\infty}^{\infty} \left| \mathrm{Cov}(X_t, X_{t+h}) \right| \leq \sigma_Z^2 \left( \sum_{k=-\infty}^{\infty} |\psi_k| \right)^2 < \infty.$$
The point is that stationary long memory series cannot be produced by ARMA methods.
Should one permit the AR polynomial (21.3) to have a unit root, then solutions to (21.2)
will not be stationary (this is the result of Problem 4.28 in Brockwell and Davis 1991).
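To see the absolute summability in (21.5) numerically, the following Python sketch (ours, with illustrative ARMA(1,1) parameters) computes the causal weights $\psi_k$ through the standard recursion $\psi_j = \theta_j + \sum_{i=1}^{\min(j,p)} \phi_i \psi_{j-i}$ (with $\theta_j = 0$ for $j > q$) and confirms that their absolute sum is finite.

```python
# A sketch computing causal MA(infinity) weights psi_k for an ARMA model;
# their absolute sum converges, consistent with the bound below (21.6).
import numpy as np

def arma_psi_weights(phi, theta, n):
    psi = np.zeros(n)
    psi[0] = 1.0
    for j in range(1, n):
        psi[j] = theta[j - 1] if j - 1 < len(theta) else 0.0
        for i in range(1, min(j, len(phi)) + 1):
            psi[j] += phi[i - 1] * psi[j - i]
    return psi

psi = arma_psi_weights(phi=[0.7], theta=[0.4], n=200)
print(np.abs(psi).sum())   # ~4.67 = (1 + 0.4)/(1 - 0.7); finite
```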
21.3 Valid Long Memory Count Approaches
One model for a long memory count series fractionally differences and thins as in Quoreshi (2014). Specifically, if $\{\psi_k\}_{k=0}^{\infty}$ is a sequence of real numbers with $\psi_k \in [0, 1]$ for each $k \geq 0$ and $\{Z_t\}_{t=-\infty}^{\infty}$ is an IID sequence of nonnegative-valued counts, then
$$X_t = \sum_{k=0}^{\infty} \psi_k \circ Z_{t-k}, \quad t = 0, \pm 1, \ldots, \tag{21.7}$$
defines a stationary sequence of counts when some conditions are imposed on $\{\psi_k\}_{k=0}^{\infty}$. If $\sum_{k=0}^{\infty} |\psi_k| < \infty$ is required, then $\{X_t\}$ will have short memory. However, if this absolute
summability is relaxed—but the weights $\psi_k$ still converge to zero slowly enough to make probabilistic sense of the summation in (21.7)—then the resulting process could have long memory. The covariance structure of $\{X_t\}$ in (21.7) is identical to that in (21.6).
Quoreshi (2014) employed this strategy with the weights $\psi_0 = 1$ and
$$\psi_k = \frac{\Gamma(k + d)}{\Gamma(k + 1)\Gamma(d)}, \quad k = 1, 2, \ldots,$$
where $d \in (0, 1/2)$ is a fractional differencing parameter arising in the power series expansion of $(1 - B)^{-d}$ (see Hosking 1981). Here, $B$ is the backshift operator, defined by $B^k X_t = X_{t-k}$ for $k \geq 0$. This setup was generalized by Quoreshi (2014) to extract integer-valued
solutions to difference equations of the form
$$\phi(B) X_t = \theta(B)(1 - B)^{-d} Z_t, \quad t = 0, \pm 1, \ldots.$$
Here, the AR and MA polynomials in (21.3) and (21.4) have "probabilistic coefficients": $\phi_\ell \in [0, 1]$ for $\ell = 1, \ldots, p$ and $\theta_\ell \in [0, 1]$ for $\ell = 1, \ldots, q$. This restriction makes the analysis unwieldy in comparison to classical ARMA methods. For example, in an AR(1) setting, $\phi_1 \in [0, 1]$, and it follows from (21.6) that the model cannot have any negative autocorrelations.
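Numerically, these weights are easy to examine on the log scale. The sketch below (ours; the value $d = 0.3$ is an assumption) computes $\psi_k$ with scipy's gammaln and illustrates that the weights fail to be absolutely summable for $d \in (0, 1/2)$ while remaining square summable.

```python
# A sketch of the fractional weights psi_k = Gamma(k+d)/(Gamma(k+1)Gamma(d)),
# computed on the log scale to avoid overflow. They decay like k^(d-1).
import numpy as np
from scipy.special import gammaln

def frac_weights(d, n):
    k = np.arange(n)
    return np.exp(gammaln(k + d) - gammaln(k + 1) - gammaln(d))

psi = frac_weights(d=0.3, n=10**6)
print(psi[:3])         # starts at psi_0 = 1, then d, ...
print(psi.sum())       # partial sums grow without bound as n increases
print((psi**2).sum())  # converges, so the sum in (21.7) makes sense in L^2
```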
A completely different approach involves renewal sequences (Cui and Lund 2009, Lund and Livsey [2015; Chapter 5 in this volume]). In this paradigm, there is a random lifetime $L$ supported in $\{1, 2, \ldots\}$ with mean $\mu$ and IID copies of $L$, which we denote by $\{L_i\}_{i=1}^{\infty}$. There is an initial delay lifetime $L_0$ that may not have the same distribution as the $L_i$ for $i \geq 1$.
Dene a random walk {S
n
}
0
via
n=
S
n
= L
0
+ L
1
+···+L
n
, n = 0, 1, 2, ...,
and say that a renewal occurs at time $t$ if $S_m = t$ for some $m \geq 0$. This is the classical discrete-time renewal sequence popularized in Smith (1958) and Feller (1968). If a renewal occurs at time $t$, set $R_t = 1$; otherwise, set $R_t = 0$. To make $\{R_t\}$ a stationary Bernoulli sequence, a special distribution is needed for $L_0$. Specifically, $L_0$ is posited to have the first tail distribution derived from $L$:
$$P(L_0 = k) = \frac{P(L > k)}{\mu}, \quad k = 0, 1, \ldots. \tag{21.8}$$
Notice that $L_0$ can be zero (in which case, the process is called nondelayed).
For notation, let $u_t = P(R_t = 1 \mid R_0 = 1)$ be the time $t$ renewal probability in a nondelayed setup. These are calculated recursively from the lifetime $L$'s probabilities via
$$u_t = P[L = t] + \sum_{\ell=1}^{t-1} P[L = \ell]\, u_{t-\ell}, \quad t = 1, 2, \ldots, \tag{21.9}$$
where we use the convention $u_0 = 1$. The elementary renewal theorem states that $u_t \longrightarrow E[L]^{-1}$ when $L$ has a finite mean and an aperiodic support set (henceforth assumed).
When $L_0$ has the distribution in (21.8), $E[R_t] \equiv \mu^{-1}$ and the covariance function of $\{R_t\}$ is $\mathrm{Cov}(R_t, R_{t+h}) = \mu^{-1}\left(u_h - \mu^{-1}\right)$ for $h \geq 0$ (Lund and Livsey [2015; Chapter 5 in this volume]).
Renewal theory can be used to extract some process properties. Heathcote (1967) gives the generating function expansion
$$\sum_{h=0}^{\infty} \left( u_h - \mu^{-1} \right) z^h = \frac{1 - \psi_{L_0}(z)}{1 - \psi_{L_1}(z)}, \quad |z| \leq 1, \tag{21.10}$$
where $\psi_{L_1}(z) = E[z^{L_1}]$ and $\psi_{L_0}(z) = E[z^{L_0}]$. Letting $z \to 1$ and using that $E[L_0]$ is finite if and only if $E[L_1^2]$ is finite, we see that $\sum_{h=0}^{\infty} |u_h - \mu^{-1}| < \infty$ if and only if $E[L_1^2]$ is finite. In the case where $E[L_1^2] < \infty$,
$$\sum_{h=0}^{\infty} \left( u_h - \mu^{-1} \right) = \frac{E[L_1^2] - E[L_1]}{2 E[L_1]^2}.$$
Our standing assumption is that $E[L_1] = \mu < \infty$. Since the lag $h$ autocovariance is proportional to $u_h - \mu^{-1}$, $\{R_t\}_{t=0}^{\infty}$ will have long memory if and only if $E[L_1^2] = \infty$. This, perhaps, is our major result.
Lifetimes with a nite mean but innite second moment are plentiful. One such lifetime
involves the Pareto distribution
P(L = k) =
c(α)
, k = 1, 2, ..., (21.11)
k
α
where c(α) = 1/
k=1
k
α
is a normalizing constant and α > 2 is a parameter. A Pareto
lifetime L with α (2, 3] has E[L] < and E[L
2
]=∞.
The tactics mentioned above construct a stationary binary sequence with long memory. A sample path of such a series is shown in Figure 21.1 along with its sample autocorrelations. Lifetimes here were generated using (21.11) with $\alpha = 2.5$.
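For readers wishing to reproduce a series like the one in Figure 21.1, the following sketch (ours) simulates the delayed renewal sequence with the Pareto lifetimes of (21.11). It uses scipy's zipf distribution, whose pmf $k^{-\alpha}/\zeta(\alpha)$ matches (21.11); truncating the delay distribution at delay_cap is our approximation, needed because $E[L_0] = \infty$ when $\alpha \leq 3$.

```python
# A sketch simulating the stationary binary renewal sequence {R_t} with
# Pareto (Zipf) lifetimes as in (21.11).
import numpy as np
from scipy.stats import zipf

def simulate_renewal_binary(n, alpha=2.5, delay_cap=10**6, seed=0):
    rng = np.random.default_rng(seed)
    mu = zipf.mean(alpha)                 # E[L], finite for alpha > 2
    k = np.arange(delay_cap)              # candidate values for L_0
    delay_pmf = zipf.sf(k, alpha) / mu    # P(L_0 = k) = P(L > k)/mu, per (21.8)
    delay_pmf /= delay_pmf.sum()          # renormalize after truncation
    s = rng.choice(k, p=delay_pmf)        # S_0 = L_0
    r = np.zeros(n, dtype=int)
    while s < n:
        r[s] = 1                          # renewal at time s
        s += zipf.rvs(alpha, random_state=rng)
    return r

r = simulate_renewal_binary(5000)         # cf. Figure 21.1 (alpha = 2.5)
```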
To obtain other marginal distributions, we superimpose as in Lund and Livsey (2015; Chapter 5 in this volume). Let $\{R_{t,i}\}$ be IID long memory binary renewal sequences for $i \geq 1$. If $M \geq 1$ is fixed and
$$X_t = \sum_{i=1}^{M} R_{t,i}, \quad t = 1, 2, \ldots,$$
then $X_t$ has a binomial marginal distribution with $M$ trials and success probability $\mu^{-1}$.
Process autocovariances are
$$\mathrm{Cov}(X_t, X_{t+h}) = \frac{M}{\mu} \left( u_h - \frac{1}{\mu} \right), \quad h = 0, 1, \ldots,$$
and will have long memory when $E[L^2] = \infty$.
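A one-line sketch of this superposition, reusing simulate_renewal_binary from the previous sketch (the choice $M = 5$ is illustrative):

```python
# Superimposing M = 5 independent long memory binary copies; each X_t is
# then Binomial(5, 1/mu), with the long memory covariance stated above.
X = sum(simulate_renewal_binary(5000, seed=i) for i in range(5))
```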
Other marginal distributions with long memory can be constructed. For Poisson marginals, suppose that $\{M_t\}_{t=0}^{\infty}$ is a sequence of IID Poisson random variables with mean $\lambda$ and set
$$X_t = \sum_{i=1}^{M_t} R_{t,i}, \quad t = 1, 2, \ldots. \tag{21.12}$$
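One simple simulation route for (21.12), sketched below, keeps a fixed pool of independent binary sequences and sums a Poisson-many of them at each time point; capping $M_t$ at the pool size is our simplification, not part of the construction.

```python
# A sketch of (21.12): sum a Poisson-many independent binary sequences at
# each time t; each X_t is then approximately Poisson(lambda/mu) by
# Poisson thinning.
import numpy as np

rng = np.random.default_rng(1)
pool = np.stack([simulate_renewal_binary(5000, seed=i) for i in range(40)])
m = np.minimum(rng.poisson(3.0, size=5000), 40)   # M_t with lambda = 3
X = np.array([pool[:m[t], t].sum() for t in range(5000)])
```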