FIGURE 12.5
Soap sales: normal QQ plot of the (mid-)quantile residuals for the one-, two- and three-state stationary models.
The fitted three-state stationary Poisson–HMM has transition probability matrix
$$
\Gamma = \begin{pmatrix}
0.864 & 0.117 & 0.019 \\
0.445 & 0.538 & 0.017 \\
0.000 & 0.298 & 0.702
\end{pmatrix},
$$
and stationary distribution δ = (0.722, 0.220, 0.058).
The mean (5.42) and variance (14.72) implied by the model certainly reflect the observed
overdispersion. The implied ACF (see Section 12.3.3) is given by
$$
\rho_k = 0.5392 \times 0.6823^k + 0.0926 \times 0.4220^k.
$$
(The non-unit eigenvalues of Γ are 0.6823 and 0.4220.) This ACF is close to the sample ACF
for the first four lags; see Table 12.3.
Global decoding and local decoding were carried out, and the results are shown in
Figure 12.6. The 7 weeks (out of 242) in which global decoding and local decoding differ
are indicated there.
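The mechanics of global decoding can be illustrated with a short sketch. The following Python fragment (not part of the original analysis) applies the Viterbi algorithm to a Poisson–HMM, using the fitted Γ and δ quoted above; the state-dependent means lam are hypothetical placeholders rather than the estimates obtained for the soap data.

import numpy as np
from scipy.stats import poisson

Gamma = np.array([[0.864, 0.117, 0.019],
                  [0.445, 0.538, 0.017],
                  [0.000, 0.298, 0.702]])
delta = np.array([0.722, 0.220, 0.058])
lam = np.array([3.0, 8.0, 15.0])   # hypothetical state-dependent Poisson means

def viterbi(x, Gamma, delta, lam):
    """Most likely state sequence (1-based) given observed counts x."""
    x = np.asarray(x)
    n, m = len(x), len(delta)
    logp = poisson.logpmf(x[:, None], lam)          # n x m log state-dependent densities
    with np.errstate(divide="ignore"):              # Gamma may contain exact zeros
        logGamma = np.log(Gamma)
    xi = np.empty((n, m))
    back = np.zeros((n, m), dtype=int)
    xi[0] = np.log(delta) + logp[0]
    for t in range(1, n):
        cand = xi[t - 1][:, None] + logGamma        # cand[i, j]: move from state i to j
        back[t] = cand.argmax(axis=0)
        xi[t] = cand.max(axis=0) + logp[t]
    states = np.empty(n, dtype=int)
    states[-1] = xi[-1].argmax()
    for t in range(n - 2, -1, -1):                  # backtrack the optimal path
        states[t] = back[t + 1, states[t + 1]]
    return states + 1

print(viterbi([2, 1, 0, 9, 7, 12, 20, 3, 2], Gamma, delta, lam))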
TABLE 12.3
Three-State Stationary Poisson–HMM for Soap Sales Data: Comparison of Sample and Model Autocorrelations

Lag          1      2      3      4      5      6
Sample ACF   0.392  0.250  0.178  0.136  0.038  0.044
Model ACF    0.407  0.268  0.178  0.120  0.081  0.055
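The model ACF row of Table 12.3 can be checked directly. A minimal sketch, assuming only numpy, recovers the non-unit eigenvalues of Γ and evaluates the implied ACF at lags 1–6:

import numpy as np

Gamma = np.array([[0.864, 0.117, 0.019],
                  [0.445, 0.538, 0.017],
                  [0.000, 0.298, 0.702]])

eig = np.sort(np.linalg.eigvals(Gamma).real)[::-1]
print(np.round(eig, 3))                     # approx. 1.000, 0.682, 0.422

k = np.arange(1, 7)
rho = 0.5392 * 0.6823**k + 0.0926 * 0.4220**k
print(np.round(rho, 3))                     # 0.407, 0.268, 0.178, 0.120, 0.081, 0.055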
[Plot: decoded state (1, 2, or 3) against week (0–250).]
FIGURE 12.6
Global decoding of soap sales: the sequence of states that is, a posteriori, the most likely. The black dots indicate
the seven occasions on which local decoding led to different states being identified as most likely.
Figure 12.7 displays forecast distributions under this model for weekly sales 1 and 2
weeks into the future, plus the corresponding stationary distribution.
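For readers who wish to reproduce such forecast distributions, a hedged sketch follows: each forecast is a mixture of the state-dependent Poisson distributions, weighted by the h-step-ahead state prediction probabilities. Γ and the week-243 state prediction are taken from the text; the Poisson means lam are hypothetical placeholders, not the reported estimates.

import numpy as np
from scipy.stats import poisson

Gamma = np.array([[0.864, 0.117, 0.019],
                  [0.445, 0.538, 0.017],
                  [0.000, 0.298, 0.702]])
lam = np.array([3.0, 8.0, 15.0])              # hypothetical state-dependent means
p1 = np.array([0.844, 0.138, 0.019])          # one-step-ahead state prediction (from the text)
p2 = p1 @ Gamma                               # two-step-ahead state prediction

counts = np.arange(0, 21)
pois = poisson.pmf(counts[:, None], lam)      # 21 x 3 matrix of state-dependent pmfs
print(np.round(pois @ p1, 3))                 # one-step-ahead forecast distribution
print(np.round(pois @ p2, 3))                 # two-step-ahead forecast distribution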
Finally, under this model the state prediction probabilities for the next 3 weeks, compared to the estimated stationary distribution δ, are given below.
Week   State 1   State 2   State 3
243    0.844     0.138     0.019
244    0.790     0.178     0.031
245    0.763     0.198     0.040
δ      0.722     0.220     0.058
[Plot: probability (0.00–0.20) against count (0–20).]
FIGURE 12.7
Forecast distributions for weekly counts of soap sales: one-step-ahead (left vertical lines), two-step-ahead (middle
vertical lines), stationary distribution (right vertical lines).
In this case, the convergence to the stationary distribution is relatively fast, which is not
surprising as the second largest eigenvalue of Γ (0.6823) is not close to 1.
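The convergence can be verified numerically. A minimal sketch, assuming numpy and taking the week-243 row of the table above as starting point, pushes the state prediction forward by repeated multiplication with Γ:

import numpy as np

Gamma = np.array([[0.864, 0.117, 0.019],
                  [0.445, 0.538, 0.017],
                  [0.000, 0.298, 0.702]])
delta = np.array([0.722, 0.220, 0.058])

p = np.array([0.844, 0.138, 0.019])    # week-243 state prediction (from the text)
for week in range(243, 249):
    print(week, np.round(p, 3))        # first rows approximately reproduce the table
    p = p @ Gamma                      # one further week ahead
print("delta", delta)                  # the predictions approach delta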
12.8 Extensions and Concluding Remarks
One of the principal advantages of the use of HMMs as time series models, in par-
ticular if they are tted by direct numerical maximization of likelihood, is the ease of
extending or adapting the basic models in order to accommodate known or suspected
special features of the data. We have not here dwelt on the many variations that are
possible, such as the modeling of additional dependencies at observation level, at latent
process level, or between these levels. A selection of possibilities is given by Zucchini and
MacDonald (2009, Section 8.6). An example of the last category of additional dependencies
is the model of Zucchini et al. (2008) for a binary time series {X_t} of animal feeding behavior,
which is depicted in Figure 12.8. In that model only the feeding behavior {X_t} is observed,
and the “nutrient levels” {N_t} (an exponentially smoothed version of feeding behavior) are
permitted to influence the transition probabilities governing the latent process {C_t}.
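A simulation sketch may help to fix ideas. The functional forms below (exponential smoothing of X_t into N_t, and a logistic effect of N_t on the switching probability of a two-state chain C_t) are illustrative assumptions only, not the specification of Zucchini et al. (2008):

import numpy as np

rng = np.random.default_rng(1)
alpha, beta0, beta1 = 0.8, -1.0, 2.0     # hypothetical smoothing and link parameters
p_feed = np.array([0.2, 0.8])            # hypothetical P(X_t = 1 | C_t = i)

def simulate(T=200):
    N, C, xs = 0.5, 0, []
    for _ in range(T):
        # the switching probability of the latent chain depends on the current
        # nutrient level N: feedback from the observation level to the state level
        p_switch = 1.0 / (1.0 + np.exp(-(beta0 + beta1 * (N - 0.5))))
        if rng.random() < p_switch:
            C = 1 - C
        x = int(rng.random() < p_feed[C])    # feeding indicator X_t given state C_t
        N = alpha * N + (1 - alpha) * x      # exponentially smoothed feeding history
        xs.append(x)
    return xs

print(simulate(30))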
Other important topics not discussed in this chapter, or described only briefly, are the
use of HMMs as models for longitudinal data, that is, multiple time series; the incorporation
of covariates; the use of Bayesian estimation methods; the structuring of the t.p.m. to
reduce the number of parameters required by an HMM; and the construction of HMMs that
(accurately) approximate less tractable models having a continuous-valued latent Markov
process.
For HMMs as models for longitudinal data, see Altman (2007), Maruotti (2011, 2015),
Schliehe-Diecks et al. (2012), and Bartolucci et al. (2013). For examples of models with
covariates, see Zucchini and MacDonald (2009, Chapter 14). For Bayesian methods, see
the works cited in Section 12.4. For structuring of the transition probability matrix, see, for
example, Cooper and Lipsitch (2004) and Langrock (2011). For discretization of continuous-
valued latent processes and the resulting application of HMM methods, see Langrock
(2011).
To conclude, we suggest that many discrete-valued time series can be usefully modeled
by HMMs or variations thereof, and the models relatively easily fitted by direct numerical
maximization of likelihood; EM and Bayesian methods are obvious alternatives. The
unity—across various types of discrete data—of model structure, and of techniques for
estimation, forecasting, and diagnostic checking, makes HMMs a promising set of models
for a wide variety of discrete-valued time series.

[Figure 12.8: directed graph with nodes N_0–N_3 (nutrient levels), X_1–X_3 (feeding observations), and C_1–C_3 (latent states).]

FIGURE 12.8
Directed graph of animal feeding model of Zucchini et al. (2008).
Acknowledgments
The James M. Kilts Center, University of Chicago Booth School of Business, is thanked for
making available the data analyzed in Section 12.7. The reviewer is thanked for constructive
comments and suggestions.
References
Altman, R. M. (2007). Mixed hidden Markov models: An extension of the hidden Markov model to
the longitudinal data setting. Journal of the American Statistical Association, 102:201–210.
Bartolucci, F., Farcomeni, A., and Pennoni, F. (2013). Latent Markov Models for Longitudinal Data.
Chapman & Hall/CRC Press, Boca Raton, FL.
Baum, L. E., Petrie, T., Soules, G., and Weiss, N. (1970). A maximization technique occurring in the
statistical analysis of probabilistic functions of Markov chains. Annals of Mathematical Statistics,
41:164–171.
Bulla, J. and Berzel, A. (2008). Computational issues in parameter estimation for stationary hidden
Markov models. Computational Statistics, 23:1–18.
Churchill, G. A. (1989). Stochastic models for heterogeneous DNA sequences. Bulletin of Mathematical
Biology, 51:79–94.
Cooper, B. and Lipsitch, M. (2004). The analysis of hospital infection data using hidden Markov
models. Biostatistics, 5:223–237.
Cox, D. R. (1981). Statistical analysis of time series: Some recent developments. Scandinavian Journal
of Statistics, 8:93–115.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data
via the EM algorithm (with discussion). Journal of the Royal Statistical Society Series B, 39:1–38.
Durbin, R., Eddy, S. R., Krogh, A., and Mitchison, G. (1998). Biological Sequence Analysis: Probabilistic
Models of Proteins and Nucleic Acids. Cambridge University Press, Cambridge, U.K.
Frühwirth-Schnatter, S. (2006). Finite Mixture and Markov Switching Models. Springer, New York.
Juang, B. H. and Rabiner, L. R. (1991). Hidden Markov models for speech recognition. Technometrics,
33:251–272.
Langrock, R. (2011). Some applications of nonlinear and non-Gaussian state-space modelling by
means of hidden Markov models. Journal of Applied Statistics, 38(12):2955–2970.
Leroux, B. G. and Puterman, M. L. (1992). Maximum-penalized-likelihood estimation for indepen-
dent and Markov-dependent mixture models. Biometrics, 48(2):545–558.
Little, R. J. A. (2009). Selection and pattern-mixture models. In Fitzmaurice, G., Davidian, M.,
Verbeke, G., and Molenberghs, G., editors, Longitudinal Data Analysis, pp. 409–431. Chapman &
Hall/CRC, Boca Raton, FL.
MacDonald, I. L. (2014). Numerical maximisation of likelihood: A neglected alternative to EM?
International Statistical Review, 82(2):296–308.
Maruotti, A. (2011). Mixed hidden Markov models for longitudinal data: An overview. International
Statistical Review, 79(3):427–454.
Maruotti, A. (2015). Handling non-ignorable dropouts in longitudinal data: A conditional model
based on a latent Markov heterogeneity structure. TEST, 24:84–109.
Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech
recognition. Proceedings of the IEEE, 77(2):257–286.
Rydén, T. (2008). EM versus Markov chain Monte Carlo for estimation of hidden Markov models: A
computational perspective (with discussion). Bayesian Analysis, 3(4):659–688.
Schliehe-Diecks, S., Kappeler, P., and Langrock, R. (2012). On the application of mixed hidden Markov
models to multiple behavioural time series. Interface Focus, 2:180–189.
Scott, S. L. (2002). Bayesian methods for hidden Markov models: Recursive computing in the
21st century. Journal of the American Statistical Association, 97:337–351.
University of Chicago Booth School of Business (2015). http://research.chicagobooth.edu/kilts/
marketing-databases/dominicks. Accessed April 28, 2015.
Viterbi, A. J. (1967). Error bounds for convolutional codes and an asymptotically optimal decoding
algorithm. IEEE Transactions on Information Theory, 13:260–269.
Viterbi, A. J. (2006). A personal history of the Viterbi algorithm. IEEE Signal Processing Magazine,
23:120–142.
Welch, L. R. (2003). Hidden Markov models and the Baum–Welch algorithm. IEEE Information Theory
Society Newsletter, 53(1): 10–13.
Zucchini, W. (2000). An introduction to model selection. Journal of Mathematical Psychology,
44(1):41–61.
Zucchini, W. and MacDonald, I. L. (2009). Hidden Markov Models for Time Series: An Introduction Using
R. Chapman & Hall/CRC, London, U.K./Boca Raton, FL.
Zucchini, W., Raubenheimer, D., and MacDonald, I. L. (2008). Modeling time series of animal
behavior by means of a latent-state model with feedback. Biometrics, 64:807–815.