[Figure 21.1 appears here: two panels, "Realization of a binary long memory series" (Count versus Time, t = 0 to 1000) and "Sample ACF" (lags 0 to 50).]
FIGURE 21.1
A realization of length 1000 of a long memory stationary time series with Bernoulli marginal distributions. Sample autocorrelations are also shown with pointwise 95% confidence bounds for white noise.
Then {X_t}_{t=0}^∞ is strictly stationary with marginal Poisson distributions with mean λ/μ and

$$\operatorname{Cov}(X_t, X_{t+h}) = \frac{C(\lambda)}{\mu}\left(u_h - \frac{1}{\mu}\right), \qquad h = 1, 2, \ldots.$$
Lund and Livsey (2015; Chapter 5 in this volume) show that C(λ) = λ[1 − e^{−2λ}{I_0(2λ) + I_1(2λ)}], where

$$I_j(\lambda) = \sum_{n=0}^{\infty} \frac{(\lambda/2)^{2n+j}}{n!\,(n+j)!}, \qquad j = 0, 1,$$
is a modied Bessel function. Again, {X
t
}
0
will have long memory when E[L
2
]=∞.
t=
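Numerically, C(λ) is straightforward to evaluate since I_0 and I_1 are standard modified Bessel functions of the first kind. Below is a minimal sketch (Python with SciPy; the function names are ours) that evaluates C(λ) and cross-checks the truncated series above against scipy.special.iv:

```python
import math
from scipy.special import iv  # modified Bessel function of the first kind, I_j

def bessel_series(x, j, terms=100):
    # Truncated version of I_j(x) = sum_{n>=0} (x/2)^(2n+j) / (n! (n+j)!),
    # accumulated in log space to avoid factorial overflow.
    total = 0.0
    for n in range(terms):
        log_term = (2 * n + j) * math.log(x / 2.0) \
                   - math.lgamma(n + 1) - math.lgamma(n + j + 1)
        total += math.exp(log_term)
    return total

def C(lam):
    # C(lambda) = lambda * [1 - e^(-2 lambda) * {I_0(2 lambda) + I_1(2 lambda)}]
    return lam * (1.0 - math.exp(-2.0 * lam) * (iv(0, 2.0 * lam) + iv(1, 2.0 * lam)))

lam = 5.0
# The truncated series should agree with SciPy's iv to high relative accuracy:
assert abs(bessel_series(2 * lam, 0) - iv(0, 2 * lam)) / iv(0, 2 * lam) < 1e-10
print(C(lam))  # covariance scale constant for the Poisson renewal model
```

Working in log space keeps the factorial terms in the series stable for moderate λ.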
Figure 21.2 shows a realization of (21.12) of length 1000 along with sample autocorrelations of the generated series. The sequence {M_t}_{t=1}^n was generated using λ = 5; the Pareto lifetime in (21.11) was utilized with α = 2.1. Since E[L²] = ∞, this model has long memory.
As discussed in Lund and Livsey (2015; Chapter 5 in this volume), generalities are possible; there, it is shown how to construct geometric marginal distributions (and hence also negative binomial marginals) with renewal methods.
[Figure 21.2 appears here: two panels, "Realization of a long memory Poisson series" (Count, 0 to 7, versus Time, t = 0 to 1000) and "Sample ACF" (lags 0 to 60).]
FIGURE 21.2
A realization of length 1000 of a long memory stationary time series with Poisson marginal distributions. Sample autocorrelations are also shown with pointwise 95% confidence bounds for white noise.
21.4 Binary Series
The case of a binary series (binomial with M = 1) is worth additional discussion. Here, X_t = R_{t,1} and the covariance function has the form
$$\operatorname{Cov}(X_t, X_{t+h}) = \frac{1}{\mu}\left(u_h - \frac{1}{\mu}\right), \qquad h = 0, 1, \ldots.$$
For long memory, L needs to have a finite mean but an infinite second moment.
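To build intuition, a stationary binary renewal series can be simulated by drawing the first renewal epoch from the equilibrium (delay) distribution and i.i.d. lifetimes thereafter. The sketch below assumes a discrete Pareto lifetime of the form P(L = k) ∝ k^{−α}, k = 1, 2, ..., which matches the long memory regime α ∈ (2, 3] noted in Table 21.1; the exact parameterization of (21.11) may differ, so treat this form as illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def pareto_pmf(alpha, kmax=10**5):
    # Discrete Pareto lifetime P(L = k) proportional to k^(-alpha), truncated
    # at kmax for numerical work (truncation slightly lightens the tail).
    k = np.arange(1, kmax + 1)
    p = k ** (-float(alpha))
    return p / p.sum()

def simulate_binary_renewal(n, alpha):
    p = pareto_pmf(alpha)
    support = np.arange(1, p.size + 1)
    mu = np.sum(support * p)                                  # E[L]
    tail = 1.0 - np.concatenate(([0.0], np.cumsum(p)[:-1]))   # P(L >= k)
    delay = tail / mu                                         # equilibrium delay distribution
    delay /= delay.sum()                                      # guard against rounding drift
    x = np.zeros(n, dtype=int)
    t = rng.choice(support, p=delay)                          # tau_1 = L_0
    while t <= n:
        x[t - 1] = 1                                          # renewal at (1-indexed) time t
        t += rng.choice(support, p=p)                         # next i.i.d. lifetime
    return x

x = simulate_binary_renewal(1000, alpha=2.1)
print(x.mean())  # should be near 1/E[L]
```

With α = 2.1, the sample ACF of such a series decays slowly, mirroring Figure 21.1.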
Suppose that X_1, ..., X_n constitutes data sampled from a long memory binary process. In this case, likelihood estimators can be constructed. To see this, suppose that x_i ∈ {0, 1} for i = 1, 2, ..., n are fixed. Let τ_1 = inf{t ≥ 1 : x_t = 1} be the first x_t that is unity, and inductively define the kth occurrence of unity as
$$\tau_k = \inf\{t > \tau_{k-1} : x_t = 1\}, \qquad k = 2, 3, \ldots, N(n).$$
Here, N(n) denotes the number of unit x_t's in the first n indices. For notation, let η_k = τ_k − τ_{k−1}. By the construction of the renewal process, we have
$$P\left(\bigcap_{t=1}^{n}\{X_t = x_t\}\right) = P\big(L_0 = \tau_1,\, L_1 = \eta_2,\, \ldots,\, L_{N(n)-1} = \eta_{N(n)},\, L_{N(n)} > n - \tau_{N(n)}\big). \tag{21.13}$$
TABLE 21.1
Likelihood results for a long memory binary process with Pareto lifetimes
True α      n = 100           n = 500           n = 1000          n = 5000
α = 2.05    2.1554 (0.1509)   2.0809 (0.0701)   2.0708 (0.0548)   2.0542 (0.0299)
α = 2.20    2.2627 (0.1939)   2.2085 (0.0962)   2.2048 (0.0710)   2.2014 (0.0331)
α = 2.40    2.4503 (0.2259)   2.4103 (0.1016)   2.4069 (0.0722)   2.4003 (0.0326)
α = 2.60    2.6400 (0.2445)   2.6085 (0.1081)   2.6075 (0.0767)   2.6017 (0.0341)
α = 2.80    2.8433 (0.2671)   2.8114 (0.1167)   2.8053 (0.0822)   2.8014 (0.0367)
α = 3.00    3.0534 (0.2919)   3.0155 (0.1270)   3.0017 (0.0892)   3.0016 (0.0398)
α = 3.50    3.5827 (0.3733)   3.5156 (0.1599)   3.5110 (0.1126)   3.5006 (0.0501)
Note: Cases with α ∈ (2, 3] have long memory.
This relationship allows us to compute likelihood estimators. For example, if L has a Pareto
distribution with parameter α, then the probability in (21.13) is a function of α and can be
used as a likelihood L(α):
$$L(\alpha) = P(L_0 = \tau_1) \times P(L_1 = \eta_2) \times \cdots \times P\big(L_{N(n)-1} = \eta_{N(n)}\big) \times P\big(L_{N(n)} > n - \tau_{N(n)}\big).$$
We recommend maximizing the log-likelihood ℓ(α) = log(L(α)):
$$\ell(\alpha) = \log P(L_0 = \tau_1) + \sum_{i=1}^{N(n)-1} \log P(L_i = \eta_{i+1}) + \log P\big(L_{N(n)} > n - \tau_{N(n)}\big). \tag{21.14}$$
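For concreteness, the log-likelihood (21.14) can be coded and maximized numerically. The sketch below again assumes the discrete Pareto form P(L = k) ∝ k^{−α} and the equilibrium delay distribution P(L_0 = k) = P(L ≥ k)/E[L]; since (21.11) is not reproduced here, these forms are illustrative assumptions, and the function names are ours.

```python
import numpy as np
from scipy.optimize import minimize_scalar

KMAX = 10**5  # truncation point for lifetime computations

def lifetime_quantities(alpha):
    k = np.arange(1, KMAX + 1)
    p = k ** (-alpha)
    p /= p.sum()                                     # P(L = k)
    tail = np.append(np.cumsum(p[::-1])[::-1], 0.0)  # tail[i] = P(L >= i + 1)
    mu = np.sum(k * p)                               # E[L]
    return p, tail, mu

def neg_loglik(alpha, x):
    n = x.size
    tau = np.flatnonzero(x) + 1                      # renewal epochs tau_1, ..., tau_N
    if tau.size == 0:
        return np.inf                                # degenerate case; not handled in this sketch
    p, tail, mu = lifetime_quantities(alpha)
    eta = np.diff(tau)                               # gaps eta_2, ..., eta_N
    ll = np.log(tail[tau[0] - 1] / mu)               # log P(L_0 = tau_1)
    ll += np.sum(np.log(p[eta - 1]))                 # sum of log P(L_i = eta_{i+1})
    ll += np.log(tail[n - tau[-1]])                  # log P(L_N > n - tau_N)
    return -ll

# With x a 0/1 array (e.g., from the renewal simulation above):
# res = minimize_scalar(neg_loglik, args=(x,), bounds=(2.01, 6.0), method="bounded")
# alpha_hat = res.x
```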
Table 21.1 summarizes likelihood estimation results when L has the Pareto lifetime in (21.11). Series of various lengths n were generated and the Pareto parameter α was estimated by maximizing the log-likelihood in (21.14). Each table entry reports the average estimator over 1000 simulations; sample standard deviations are listed in parentheses for intuition (there is no theoretical guarantee that these error estimates are finite). While there appears to be some overestimation (bias) in the estimators, biases decay with increasing series length. Estimation precision seems to decrease with increasing α. Overall, the procedure seems to work well. Extensions of these likelihood calculations to settings where M ≥ 2 constitute an open area of research and appear difficult.
21.5 Bayesian Long Memory Models
Bayesian methods to model general long memory series are an emerging area of research. Recent contributions include Pai and Ravishanker (1996, 1998), Ravishanker and Ray (1997, 2002), Ko and Vannucci (2006), Holan et al. (2009), and Holan and McElroy (2012). The literature on Bayesian methods for modeling count-valued long memory time series is scarce, with Brockwell (2007) being an exception. As such, we briefly outline a conditionally specified (hierarchical) Bayesian approach and suggest possible avenues for further research. In this section, we do not attempt to identify the marginal distribution of the constructed series.
The setup here is similar to Brockwell (2007) and MacDonald and Zucchini (2015; Chapter 12 in this volume) and proceeds via a conditional specification. For simplicity, we restrict attention to a conditional Poisson model where the logarithm of a latent intensity parameter is modeled as a Gaussian autoregressive fractionally integrated moving-average (ARFIMA)(p, d, q) process. It is assumed that the data are conditionally independent given the underlying latent Poisson intensity parameter. Specifically, we posit that the conditional distribution of X_t given λ_t is

$$X_t \mid \lambda_t \overset{\text{ind}}{\sim} \text{Poisson}(\lambda_t), \qquad t = 1, 2, \ldots. \tag{21.15}$$
Let λ*_t = log(λ_t). We model {λ*_t}_{t=1}^∞ with a zero-mean Gaussian ARFIMA(p, d, q) process satisfying

$$\phi(B)(1 - B)^d \lambda^*_t = \theta(B)\varepsilon_t, \qquad t = 1, 2, \ldots,$$
where (1 − B)^d = 1 − dB + d(d − 1)B²/2! − ··· is the general binomial expansion, p, q ∈ ℤ⁺, d ∈ (−1/2, 1/2), {ε_t} is zero-mean white noise, and the AR and MA polynomials are as in (21.3) and (21.4). With λ*_n = (λ*_1, ..., λ*_n)′ and Ψ = (φ, θ, d, σ²), where φ = (φ_1, ..., φ_p)′ and θ = (θ_1, ..., θ_q)′, the Gaussian ARFIMA supposition implies that

$$\boldsymbol{\lambda}^*_n \mid \boldsymbol{\Psi} \sim N(\boldsymbol{0}, \boldsymbol{\Gamma}_n), \tag{21.16}$$

where Γ_n is the autocovariance matrix of λ*_n. As in Brockwell (2007), it is straightforward to specify a nonzero mean in (21.16); that is, deterministic regressors could be added to (21.16).
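As a concrete instance of this hierarchy, the sketch below simulates the simplest case: an ARFIMA(0, d, 0) latent log-intensity generated via a truncated MA(∞) filter, followed by conditionally Poisson counts as in (21.15). The function names and the burn-in scheme are our own choices, not part of the text.

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(1)

def arfima_0_d_0(n, d, sigma=1.0, burn=5000):
    # (1 - B)^(-d) has MA(infinity) weights psi_0 = 1 and
    # psi_j = psi_{j-1} * (j - 1 + d) / j; truncating at n + burn terms
    # gives an approximately stationary ARFIMA(0, d, 0) path.
    m = n + burn
    j = np.arange(1, m)
    psi = np.concatenate(([1.0], np.cumprod((j - 1 + d) / j)))
    eps = rng.normal(0.0, sigma, size=m)
    return fftconvolve(eps, psi)[:m][burn:]   # discard the burn-in segment

n, d = 1000, 0.3
lam_star = arfima_0_d_0(n, d, sigma=0.5)      # latent log-intensity lambda*_t
x = rng.poisson(np.exp(lam_star))             # conditionally Poisson counts, as in (21.15)
```

The truncated filter is only an approximation; exact stationary simulation could instead use the Durbin-Levinson recursion with the ARFIMA autocovariances.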
With xed values of the ARFIMA parameters in , {λ
}
1
is a strictly stationary
t
t=
Gaussian series. It follows that {λ
t
}
1
and {X
t
}
1
are also strictly stationary. However,
t= t=
the marginal distribution of X
t
is unclear. Some computations provide the form
λ
λ
k
exp{−
1
ln(λ)
2
}
P(X
t
= k) =
e
2 γ
(0)
dλ, k = 0, 1, ...,
k!
λ 2πγ
(0)
0
where γ_{λ*}(0) = Var(λ*_t). This is a difficult integral to evaluate explicitly, although numerical approximations can be made; see Asmussen et al. (2014) for the latest. The covariance Cov(X_t, X_{t+h}) also appears intractable. It seems logical that {X_t}_{t=1}^∞ will also have long memory, but this has not been formally verified.
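Numerical evaluation of this mixture integral is routine with adaptive quadrature. A sketch follows (substituting s = ln λ, so the integrand becomes the Poisson(e^s) probability weighted by a N(0, γ_{λ*}(0)) density; the names are ours):

```python
import math
from scipy.integrate import quad

def poisson_lognormal_pmf(k, gamma0):
    # P(X_t = k) with lambda* = ln(lambda) ~ N(0, gamma0), via s = ln(lambda).
    def integrand(s):
        if s > 700.0:          # e^s would overflow; the integrand is numerically 0 there
            return 0.0
        lam = math.exp(s)
        log_pois = -lam + k * s - math.lgamma(k + 1)    # log Poisson(e^s) pmf at k
        log_norm = -0.5 * s * s / gamma0 - 0.5 * math.log(2 * math.pi * gamma0)
        return math.exp(log_pois + log_norm)
    bound = 8.0 * math.sqrt(gamma0) + 8.0               # covers the normal density's support
    val, _ = quad(integrand, -bound, bound)
    return val

pmf = [poisson_lognormal_pmf(k, gamma0=0.25) for k in range(10)]
print(sum(pmf))  # should be close to 1 once enough terms are included
```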
In a Bayesian setting, the time series parameters are typically treated as random. For example, the distributions of φ and θ could be taken as uniform over their respective AR and MA stationarity and invertibility regions, d could be uniform over (−1/2, 1/2), and σ² would have a distribution supported on (0, ∞). One could take these components to be independent, although formulations allowing dependence between these components are also possible.
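If one wants draws guaranteed to fall in the AR stationarity region, a convenient device is the partial autocorrelation reparameterization (Monahan 1984): any point in (−1, 1)^p maps to a causal AR(p). Note that uniform draws of the partial autocorrelations do not induce the uniform distribution over the stationarity region; the sketch below only illustrates the mapping, and the names are ours.

```python
import numpy as np

rng = np.random.default_rng(2)

def ar_from_pacf(r):
    # Map partial autocorrelations r_1, ..., r_p in (-1, 1) to coefficients
    # phi_1, ..., phi_p of a causal (stationary) AR(p) via the Durbin-Levinson
    # recursion: phi_j <- phi_j - r_k * phi_{k-j}, then phi_k = r_k.
    phi = np.array([])
    for rk in r:
        phi = np.append(phi - rk * phi[::-1], rk)
    return phi

p = 3
r = rng.uniform(-1.0, 1.0, size=p)              # a draw interior to the stationarity region
d = rng.uniform(-0.5, 0.5)                      # prior draw for the memory parameter d
sigma2 = 1.0 / rng.gamma(shape=2.0, scale=1.0)  # e.g., an inverse-gamma draw for sigma^2
phi = ar_from_pacf(r)
```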
In practice, it is convenient to work with an autoregressive setup (i.e., q = 0) for λ*_t. Even with this simplifying assumption, several open research questions arise. For estimation, it would be useful to derive efficient MCMC sampling algorithms. One such algorithm is provided by Brockwell (2007). Also, for large n, it might be advantageous to consider approximate Bayesian inference via a Whittle likelihood in lieu of an exact Gaussian likelihood (see Palma 2007, McElroy and Holan 2012). Further computational
efciencies might be obtained from using preconditioned conjugate gradient methods
(Chen et al. 2006).
Long memory need not be driven via ARFIMA(p, d, q) structures. For example, {λ*_t}_{t=1}^∞ could follow a fractionally differenced exponential model (FEXP(q)) as in Holan et al. (2009). Such models could be extended to permit seasonal long memory specifications, including GARMA (Gegenbauer ARMA; Woodward et al. 1998) and GEXP (Gegenbauer exponential; Holan and McElroy 2012, McElroy and Holan 2012) models. This said, seasonal long memory cases pose additional computational challenges (McElroy and Holan 2012).
21.6 Conclusion
Long memory models for discrete-valued time series are a promising area of research; there is little current guidance on how to extend Gaussian long memory methods. Here, several recent advances were presented and future research was suggested. Specifically, we pointed out that most classical discrete-valued approaches will not produce long memory series. This motivated methods that will produce long memory count series. These include the approach of Quoreshi (2014), which thins with a fractional weighting scheme, and the renewal theory approach in Cui and Lund (2009) and Lund and Livsey (2015; Chapter 5 in this volume). Binary long memory series were examined in greater detail. Here, maximum likelihood parameter estimates were obtained for a renewal model. In the binomial case of M ≥ 2 (see Section 21.3), likelihood estimation constitutes an open area of research. Bayesian approaches to discrete-valued long memory time series were also discussed. While we focused on conditional Poisson data with a latent Gaussian intensity parameter, other discrete-valued distributions seem plausible. In the Bayesian context, several avenues for future research were suggested.
In summary, although long memory time series analysis has become a popular topic, little has been previously done for discrete-valued time series. Here, we have detailed the current state of the topic and described several areas for future research. Of particular interest is the exploration of the utility of these models in real-world applications.
Acknowledgments
Robert Lund’s research was partially supported by NSF Award DMS 1407480. Scott Holan’s
research was partially supported by the U.S. National Science Foundation (NSF) and
the U.S. Census Bureau under NSF grant SES-1132031, funded through the NSF-Census
Research Network (NCRN) program.
References
Al-Osh, M. and Alzaid, A.A. (1988). Integer-valued moving averages (INMA), Statistical Papers, 29,
281–300.
Asmussen, S., Jensen, J.L., and Rojas-Nandayapa, L. (2014). On the Laplace transform of the lognormal distribution. Methodology and Computing in Applied Probability. doi: 10.1007/s11009-014-9430-7.