250 Handbook of Discrete-Valued Time Series
where r_t = γα_{t-1} and p_t = γβ_{t-1}/(γβ_{t-1} + 1). The one-step-ahead forecast is obtained as

\[
E(Y_t \mid D_{t-1}, γ) = \frac{α_{t-1}}{β_{t-1}}.
\]
An interesting property of the model is the long-run behavior of its one-step-ahead forecasts. As t gets large, using β_t = γβ_{t-1} + 1, we can show that β_t approaches 1/(1 - γ) and we obtain

\[
E(Y_t \mid D_{t-1}, γ) = (1 - γ)Y_{t-1} + (1 - γ)γY_{t-2} + \cdots + (1 - γ)γ^{t-1}α_0,
\]

which is an exponentially weighted average of the observed counts.
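The recursion and its long-run behavior can be checked numerically. The sketch below assumes the filtering updates α_t = γα_{t-1} + Y_t and β_t = γβ_{t-1} + 1 from the conjugate analysis; the simulated data and hyperparameter values are purely hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, alpha, beta = 0.8, 1.0, 1.0   # discount factor and Gamma(alpha_0, beta_0) prior
y = rng.poisson(5.0, size=200)       # illustrative count series

for yt in y:
    alpha = gamma * alpha + yt       # alpha_t = gamma * alpha_{t-1} + Y_t
    beta = gamma * beta + 1.0        # beta_t  = gamma * beta_{t-1} + 1

# One-step-ahead forecast E(Y_{t+1} | D_t, gamma) = alpha_t / beta_t.
# For large t, beta_t -> 1/(1 - gamma), so the forecast approaches the
# exponentially weighted average (1 - gamma) * sum_k gamma^k * Y_{t-k}.
weights = (1 - gamma) * gamma ** np.arange(len(y))
ewma = np.sum(weights * y[::-1])
print(alpha / beta, ewma)
```

For a discount factor of 0.8, the weight on an observation 10 steps back is already below 0.03, which makes the exponential-smoothing interpretation concrete.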
Although an analytic expression is not available for the k-step-ahead predictive density, the k-step-ahead predictive means can be easily obtained. Using a standard conditional expectation argument, one can obtain E(Y_{t+k} | D_t, γ) as

\[
E(Y_{t+k} \mid D_t, γ) = E_{λ_{t+k} \mid D_t, γ}\left[ E(Y_{t+k} \mid λ_{t+k}, D_t) \right] = E(λ_{t+k} \mid D_t, γ). \tag{11.19}
\]
Furthermore, using the state equation (11.4), we have

\[
E(λ_{t+k} \mid D_t, γ) = E(λ_t \mid D_t, γ) \prod_{n=t+1}^{t+k} \frac{E(ε_n \mid D_t)}{γ} = E(λ_t \mid D_t, γ) = \frac{α_t}{β_t}, \tag{11.20}
\]

where E(ε_n | D_t) = γ for any n. Therefore, combining (11.19) and (11.20), we obtain the k-step-ahead forecasts given data up to time t as

\[
E(Y_{t+k} \mid D_t, γ) = E(λ_{t+k} \mid D_t, γ) = \frac{α_t}{β_t}. \tag{11.21}
\]
Note that in the case of the model with covariates, it can be shown that

\[
E(Y_{t+k} \mid D_t, z_{t+k}, ψ, γ) = E(λ_{t+k} \mid D_t, z_{t+k}, ψ, γ) = \frac{α_t}{β_t}\, e^{ψ' z_{t+k}}. \tag{11.22}
\]
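In code, (11.21) and (11.22) amount to a single ratio, optionally scaled by the covariate term. All values below are hypothetical, chosen only to show the computation.

```python
import numpy as np

# Hypothetical filtered quantities at time t and future covariates
alpha_t, beta_t = 12.5, 4.0
psi = np.array([0.3, -0.1])       # regression coefficients (assumed known)
z_future = np.array([1.0, 2.0])   # covariate vector z_{t+k}

# (11.21): without covariates, the forecast is flat in the horizon k
forecast = alpha_t / beta_t

# (11.22): with covariates, the same ratio is scaled by exp(psi' z_{t+k})
forecast_cov = forecast * np.exp(psi @ z_future)
print(forecast, forecast_cov)
```

Note that without covariates the forecast does not depend on k at all: the martingale-like property of the state equation keeps the predictive mean at the current filtered level α_t/β_t.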
If we treat the discount factor γ as a random variable, we lose the analytical tractability of the model described earlier. However, selecting a prior distribution for γ can be handled fairly easily. Given D_t, the likelihood function of γ is given by

\[
L(γ; D_t) = \prod_{i=1}^{t} p(Y_i \mid D_{i-1}, γ), \tag{11.23}
\]
where p(Y_i | D_{i-1}, γ) is negative binomial as in (11.18). The posterior distribution of γ can then be obtained as

\[
p(γ \mid D_t) \propto \prod_{i=1}^{t} p(Y_i \mid D_{i-1}, γ)\, p(γ). \tag{11.24}
\]

251 Bayesian Modeling of Time Series of Counts with Business Applications
For some priors p(γ) in (11.24), the posterior distribution will not be available in closed form. However, we can always sample from the posterior using an MCMC method such as the Metropolis–Hastings algorithm. Alternatively, a discrete uniform prior for γ can be a reasonable choice.
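A minimal sketch of the discrete-prior route: place a uniform grid prior on γ, accumulate the one-step negative binomial predictive log-likelihoods as in (11.23), and normalize as in (11.24). The filtering updates for α_t and β_t and the simulated data are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import nbinom

rng = np.random.default_rng(1)
y = rng.poisson(5.0, size=100)            # illustrative counts
grid = np.linspace(0.05, 0.95, 19)        # discrete uniform prior on gamma
log_like = np.zeros_like(grid)

for g, gamma in enumerate(grid):
    alpha, beta = 1.0, 1.0
    for yt in y:
        # One-step predictive as in (11.18): negative binomial with
        # r_t = gamma*alpha_{t-1}, p_t = gamma*beta_{t-1}/(gamma*beta_{t-1} + 1)
        r = gamma * alpha
        p = gamma * beta / (gamma * beta + 1.0)
        log_like[g] += nbinom.logpmf(yt, r, p)
        alpha = gamma * alpha + yt
        beta = gamma * beta + 1.0

post = np.exp(log_like - log_like.max())  # flat prior cancels in (11.24)
post /= post.sum()
print(grid[np.argmax(post)])
```

Because the grid is finite, the posterior is exact up to the grid resolution and no MCMC is needed for γ alone.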
11.3 Markov Chain Monte Carlo (MCMC) Estimation of the Model
Since the conditional distributions introduced previously all depend on the parameter vectors ψ and γ, we need to discuss how to obtain the joint posterior density of ψ and γ. This density cannot be obtained in closed form; therefore, we use MCMC methods to generate the required samples. Our objective in this section is to obtain the joint posterior distribution of the model parameters given observed counts up to time t, that is, p(θ_1, ..., θ_t, ψ, γ | D_t). We use a Gibbs sampler to generate samples from the full conditionals p(θ_1, ..., θ_t | ψ, γ, D_t) and p(ψ, γ | θ_1, ..., θ_t, D_t), neither of which is available as a standard density.
For notational convenience, we define ω = {ψ, γ}. The conditional posterior distribution of ω given the latent rates (θ_1, ..., θ_t) is

\[
p(ω \mid θ_1, \ldots, θ_t, D_t) \propto \prod_{i=1}^{t} \exp\left(-θ_i e^{ψ' z_i}\right) \frac{\left(θ_i e^{ψ' z_i}\right)^{Y_i}}{Y_i!}\; p(ω), \tag{11.25}
\]
where p(ω) is the joint prior for ψ and γ. Regardless of the prior selection for ω, (11.25) will not be a standard density. We use an MCMC algorithm such as the Metropolis–Hastings to generate samples from p(ω | θ_1, ..., θ_t, D_t). In our numerical examples, we assume flat but proper priors for ψ and γ with ψ_i ~ Normal(0, 1000) for all i and γ ~ Uniform(0, 1). Following Chib and Greenberg (1995), the steps in the Metropolis–Hastings algorithm can be summarized as follows:
1. Assume the starting point ω^{(0)} at j = 0.
   Repeat for j > 0:
2. Generate ω* from q(ω* | ω^{(j)}) and u from U(0, 1).
3. If u ≤ f(ω^{(j)}, ω*), then set ω^{(j+1)} = ω*; else set ω^{(j+1)} = ω^{(j)}, and set j = j + 1,
where

\[
f(ω^{(j)}, ω^*) = \min\left\{1,\; \frac{π(ω^*)\, q(ω^{(j)} \mid ω^*)}{π(ω^{(j)})\, q(ω^* \mid ω^{(j)})}\right\}. \tag{11.26}
\]
In (11.26), q(·|·) is the multivariate normal proposal density and π(·) is given by (11.25), which is the density we need to generate samples from. If we repeat this a large number of times, we can obtain samples from p(ω | θ_1, ..., θ_t, D_t).
Generation of samples from the full conditional distribution p(θ_1, ..., θ_t | D_t, ω) using the FFBS algorithm, as described in Fruhwirth-Schnatter (1994), requires the smoothing distribution of the θ_t's, which enables retrospective analysis. In other words, given that we have observed the count data D_t at time t, we will be interested in the distribution of (θ_{t-k} | D_t, ω) for all k ≥ 1.
We can write

\[
p(θ_{t-k} \mid D_t, ω) = \int p(θ_{t-k} \mid θ_{t-k+1}, D_t, ω)\, p(θ_{t-k+1} \mid D_t, ω)\, dθ_{t-k+1}, \tag{11.27}
\]
where p(θ_{t-k} | θ_{t-k+1}, D_t, ω) is obtained via Bayes' rule as

\[
p(θ_{t-k} \mid θ_{t-k+1}, D_t, ω) = \frac{p(θ_{t-k} \mid θ_{t-k+1}, D_{t-k}, ω)\, p(Y^{(t,k)} \mid θ_{t-k}, θ_{t-k+1}, D_{t-k}, ω)}{p(Y^{(t,k)} \mid θ_{t-k+1}, D_{t-k}, ω)} = p(θ_{t-k} \mid θ_{t-k+1}, D_{t-k}, ω),
\]

where Y^{(t,k)} = {Y_{t-k+1}, ..., Y_t}. Given θ_{t-k+1}, Y^{(t,k)} is independent of θ_{t-k}; in other words, p(Y^{(t,k)} | θ_{t-k}, θ_{t-k+1}, D_{t-k}, ω) = p(Y^{(t,k)} | θ_{t-k+1}, D_{t-k}, ω). Thus, (11.27) reduces to
\[
p(θ_{t-k} \mid D_t, ω) = \int p(θ_{t-k} \mid θ_{t-k+1}, D_{t-k}, ω)\, p(θ_{t-k+1} \mid D_t, ω)\, dθ_{t-k+1}. \tag{11.28}
\]
Although we cannot obtain (11.28) analytically, we can use Monte Carlo methods to draw samples from p(θ_{t-k} | D_t, ω). Due to the Markovian nature of the state parameters, we can rewrite p(θ_1, ..., θ_t | D_t, ω) as
\[
p(θ_t \mid D_t, ω)\, p(θ_{t-1} \mid θ_t, D_{t-1}, ω) \cdots p(θ_1 \mid θ_2, D_1, ω). \tag{11.29}
\]
We note that p(θ_t | D_t, ω) is available from (11.9), and p(θ_{t-1} | θ_t, D_{t-1}, ω) for any t as

\[
p(θ_{t-1} \mid θ_t, D_{t-1}, ω) \propto p(θ_t \mid θ_{t-1}, D_{t-1}, ω)\, p(θ_{t-1} \mid D_{t-1}, ω), \tag{11.30}
\]
where the rst term is available from (11.4) and the second term from (11.6). It is straight-
forward to show that
θ
t1
|θ
t
, D
t1
, ω
ShGamma[(1 γ)α
t1
, β
t1
; (γθ
t
, )],
which is a shifted gamma density dened over γθ
t
< θ
t1
< .
Therefore, given (11.29) and the posterior samples generated from the full conditional of ω, we can obtain a sample from p(θ_1, ..., θ_t | ω, z_t, D_t) by sequentially simulating the individual latent rates as follows:

1. Assume the starting points θ_1^{(0)}, ..., θ_t^{(0)} at j = 0.
Repeat for j > 0,
2. Using the generated ω^{(j)}, sample θ_t^{(j)} from (θ_t | ω^{(j)}, D_t).
3. Using the generated ω^{(j)}, for each n = t - 1, ..., 1, generate θ_n^{(j)} from (θ_n | θ_{n+1}^{(j)}, ω^{(j)}, D_n), where θ_{n+1}^{(j)} is the value generated in the previous step.
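These steps can be sketched for a covariate-free version of the model with ω held fixed. The forward recursions α_n = γα_{n-1} + Y_n and β_n = γβ_{n-1} + 1 and the shifted gamma backward draw follow the filtering and smoothing results above; the data and hyperparameters are hypothetical.

```python
import numpy as np

def backward_sample(y, gamma, alpha0=1.0, beta0=1.0, seed=0):
    """One backward pass for (theta_1, ..., theta_t) given fixed omega."""
    rng = np.random.default_rng(seed)
    t = len(y)
    alpha = np.empty(t + 1); beta = np.empty(t + 1)
    alpha[0], beta[0] = alpha0, beta0
    for n in range(1, t + 1):                 # forward filtering
        alpha[n] = gamma * alpha[n - 1] + y[n - 1]
        beta[n] = gamma * beta[n - 1] + 1.0
    theta = np.empty(t)
    theta[t - 1] = rng.gamma(alpha[t], 1.0 / beta[t])   # theta_t | D_t, as in (11.9)
    for n in range(t - 1, 0, -1):             # backward sampling, n = t-1, ..., 1
        # ShGamma[(1-gamma)*alpha_n, beta_n; (gamma*theta_{n+1}, inf)]:
        # a gamma increment added to the shift gamma*theta_{n+1}
        theta[n - 1] = gamma * theta[n] + rng.gamma((1 - gamma) * alpha[n], 1.0 / beta[n])
    return theta

y = np.random.default_rng(2).poisson(5.0, size=50)
theta_draw = backward_sample(y, gamma=0.8)
print(theta_draw[:3])
```

Each call produces one joint draw of the latent rate path; inside the Gibbs sampler this pass would alternate with an update of ω.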
If we repeat this a large number of times, we can obtain samples from the full conditional of the latent rates. Consequently, we can obtain samples from the joint density of the model parameters by iteratively sampling from the full conditionals, p(ω | θ_1, ..., θ_t, D_t) and p(θ_1, ..., θ_t | ω, D_t), via the Gibbs sampler. Once we have the posterior samples from p(θ_1, ..., θ_t, ω | D_t), we can also obtain the posterior samples of the λ_t's in a straightforward manner using the identity λ_t = θ_t e^{ψ' z_t}.
11.4 Multivariate Extension
It is possible to consider several extensions of the basic model to analyze multivariate count time series. For instance, the observations of interest can be the number of occurrences of an event during day t of year j. Another possibility is to consider the analysis of J different Poisson time series. For instance, for a given year, the weekly spending habits of J different households, which can exhibit dependence, can be modeled using such a structure. Several extensions have been proposed by Aktekin and Soyer (2011), where multiplicative Poisson rates for (11.3) are considered. An alternative approach for modeling multivariate time series of counts is described by Ravishanker, Venkatesan, and Hu (2015; Chapter 20 in this volume).
In what follows, we present a model for J Poisson time series that are assumed to be affected by the same environment. We assume that

\[
Y_{jt} \sim \mathrm{Pois}(λ_{jt}), \quad \text{for } j = 1, \ldots, J, \tag{11.31}
\]
where λ_{jt} = λ_j θ_t, λ_j is the arrival rate specific to the jth series, and θ_t is the common term modulating λ_j. For example, in the case where Y_{jt} is the number of grocery store trips of household j at time t, λ_j is the household-specific rate and we can think of θ_t as the effect of a common economic environment that the households are exposed to at time t. Values of θ_t > 1 represent a more favorable economic environment than usual, implying higher shopping rates.
This is analogous to the concept of an accelerated environment for operating conditions of components used by Lindley and Singpurwalla (1986) in life testing. Our case can be considered as a dynamic version of their setup, since we have the Markovian evolution of the θ_t's as

\[
θ_t = \frac{θ_{t-1}}{γ}\, ε_t, \tag{11.32}
\]

where, as earlier, ε_t | D_{t-1}, λ_1, ..., λ_J ~ Beta[γα_{t-1}, (1 - γ)α_{t-1}] with α_{t-1} > 0, 0 < γ < 1, and D_{t-1} = {D_{t-2}, Y_{1(t-1)}, ..., Y_{J(t-1)}}. Furthermore, we assume that
\[
λ_j \sim \mathrm{Gamma}(a_j, b_j), \quad \text{for } j = 1, \ldots, J, \tag{11.33}
\]
and, a priori, the λ_j's are independent of each other as well as of θ_0. Given the θ_t's and λ_j's, the Y_{jt}'s are conditionally independent. In other words, all J series are affected by the same common environment and, conditional on the environment, they will be independent.
At time 0, we assume that θ_0 | D_0 ~ Gamma(α_0, β_0), and by induction we can show that

\[
θ_{t-1} \mid D_{t-1}, λ_1, \ldots, λ_J \sim \mathrm{Gamma}(α_{t-1}, β_{t-1}), \tag{11.34}
\]

and

\[
θ_t \mid D_{t-1}, λ_1, \ldots, λ_J \sim \mathrm{Gamma}(γα_{t-1}, γβ_{t-1}). \tag{11.35}
\]
In addition, the filtering density at time t can be obtained as

\[
θ_t \mid D_t, λ_1, \ldots, λ_J \sim \mathrm{Gamma}(α_t, β_t), \tag{11.36}
\]

where α_t = γα_{t-1} + Y_{1t} + ... + Y_{Jt} and β_t = γβ_{t-1} + λ_1 + ... + λ_J. Consequently, the marginal distribution of Y_{jt} for any j can be obtained as
\[
p(Y_{jt} \mid λ_j, D_{t-1}) = \binom{γα_{t-1} + Y_{jt} - 1}{Y_{jt}} \left(\frac{λ_j}{γβ_{t-1} + λ_j}\right)^{Y_{jt}} \left(1 - \frac{λ_j}{γβ_{t-1} + λ_j}\right)^{γα_{t-1}}, \tag{11.37}
\]
which is a negative binomial model as earlier. The multivariate distribution of (Y_{1t}, ..., Y_{Jt}) can be obtained as
\[
p(Y_{1t}, \ldots, Y_{Jt} \mid λ_1, \ldots, λ_J, D_{t-1}) = \frac{Γ\!\left(γα_{t-1} + \sum_j Y_{jt}\right) \prod_j λ_j^{Y_{jt}}}{Γ(γα_{t-1}) \prod_j Γ(Y_{jt} + 1) \left(γβ_{t-1} + \sum_j λ_j\right)^{\sum_j Y_{jt}}} \times \left(\frac{γβ_{t-1}}{γβ_{t-1} + \sum_j λ_j}\right)^{γα_{t-1}}, \tag{11.38}
\]
which is a dynamic multivariate distribution of negative binomial type. The bivariate distribution p(Y_{it}, Y_{jt} | λ_i, λ_j, D_{t-1}) can be obtained as
\[
p(Y_{it}, Y_{jt} \mid λ_i, λ_j, D_{t-1}) = \frac{Γ(γα_{t-1} + Y_{it} + Y_{jt})}{Γ(γα_{t-1})\, Γ(Y_{it} + 1)\, Γ(Y_{jt} + 1)} \left(\frac{γβ_{t-1}}{λ_i + λ_j + γβ_{t-1}}\right)^{γα_{t-1}} \left(\frac{λ_i}{λ_i + λ_j + γβ_{t-1}}\right)^{Y_{it}} \times \left(\frac{λ_j}{λ_i + λ_j + γβ_{t-1}}\right)^{Y_{jt}}, \tag{11.39}
\]

which is a bivariate negative binomial distribution for integer values of γα_{t-1}. This distribution is the dynamic version of the negative binomial distribution proposed by Arbous and Kerrich (1951) for modeling accident numbers.
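A quick way to see the dependence induced by (11.39) is through its mixture representation: a common Gamma(γα_{t-1}, γβ_{t-1}) environment mixed with conditionally independent Poisson counts. All parameter values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
gamma_, alpha, beta = 0.8, 10.0, 5.0
lam_i, lam_j = 3.0, 5.0

# Common environment, then conditionally independent Poisson counts
theta = rng.gamma(gamma_ * alpha, 1.0 / (gamma_ * beta), size=20000)
y_i = rng.poisson(lam_i * theta)
y_j = rng.poisson(lam_j * theta)

# The shared environment makes the two counts positively correlated.
corr = np.corrcoef(y_i, y_j)[0, 1]
print(corr)
```

The correlation is driven entirely by the variance of θ: as γβ_{t-1} grows and the environment becomes better known, the counts approach independence.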