20
Dynamic Models for Time Series of Counts with a
Marketing Application
Nalini Ravishanker, Rajkumar Venkatesan, and Shan Hu
CONTENTS
20.1 Introduction
20.2 Application to Marketing Actions of a Pharmaceutical Firm
20.3 Dynamic ZIP Models for Univariate Prescription Counts
20.4 Hierarchical Multivariate Dynamic Models for Prescription Counts
20.4.1 Finite Mixtures of Multivariate Poisson Distributions
20.4.2 HMDM Model Description
20.4.3 Bayesian Inference for the HMDM Model
20.5 Multivariate Dynamic Finite Mixture Model
20.5.1 MDFM Model Description
20.5.2 Bayesian Inference for the MDFM Model
20.6 Summary
20.A Appendix
20.A.1 Complete Conditional Distributions for the HMDM Model
20.A.2 Complete Conditional Distributions and Sampling Algorithms for the MDFM Model
References
20.1 Introduction
In many applications, including marketing, we observe at different times and for differ-
ent subjects counts of some event of interest. Accurate modeling of such time series of
counts (responses) for N subjects over T time periods as functions of relevant covariates
(subject-specic and time-varying), and incorporating dependence over time, is becoming
increasingly important in several applications. In situations where we observe a vector of
counts for each subject at each time, we are also interested in incorporating the association
between the components of the count vectors. In this chapter, we describe the modeling of
univariate and multivariate time series of counts in the context of a marketing application
that involves modeling/predicting product sales.
While count data regression is a widely used applied statistical tool today (Kedem and
Fokianos, 2002), models for count time series are less common. The main approaches
include a regression-type approach using quasi-likelihood as discussed in Zeger (1988), a
Poisson–Gamma mixture modeling approach described by Harvey and Fernandes (1989),
the generalized linear autoregressive moving average (GLARMA) model discussed in
426 Handbook of Discrete-Valued Time Series
Davis et al. (2003), and the dynamic generalized linear model (DGLM) discussed in Gamerman (1998) and Landim and Gamerman (2000). We propose to employ hierarchical
dynamic models and illustrate them with a marketing example. For Gaussian dynamic linear
models (DLMs), also often referred to as Gaussian state space models, Kalman (1960) and
Kalman and Bucy (1961) popularized a recursive algorithm for optimal estimation and pre-
diction of the state vector, which then enables the prediction of the observation vector; see
West (1989) for details. Carlin et al. (1992) described the use of Markov chain Monte Carlo
(MCMC) methods for non-Gaussian and nonlinear state space models. Chen et al. (2000) is
an excellent reference text for MCMC methods.
Hierarchical dynamic linear models (HDLMs) combine the stratified parametric linear
models (Lindley and Smith, 1972) and the DLMs into a general framework, and have been
particularly useful in econometric, education, and health care applications (Gamerman and
Migon, 1993). The Gaussian HDLM includes a set of one or more dimension-reducing
structural equations along with the observation equation and state (or system) equation
of the DLM. Landim and Gamerman (2000) further extended the Gaussian HDLM to a
more general class of models where the response vector has a matrix-valued normal dis-
tribution. For situations where the time series of responses consists of counts, DLMs have
been generalized to dynamic generalized linear models (DGLMs) or exponential family
state space models, which assume that the sampling distribution is a member of the expo-
nential family of distributions, such as the Poisson or negative binomial distributions. The
DGLMs may be viewed as dynamic versions of the GLMs (McCullagh and Nelder, 1989).
For univariate time series, Fahrmeir and Kaufmann (1991) discussed Bayesian inference
via an extended Kalman filter approach, while Gamerman (1998) described the use of the
Metropolis–Hastings algorithm combined with the Gibbs sampler in repeated use of an
adjusted version of the Gaussian DLM. Applications of state space models of counts include
Weinberg et al. (2007) in operations management and Aktekin et al. (2014) in finance,
for instance. Wikle and Anderson (2003) described a dynamic zero-inflated Poisson (ZIP)
model framework for tornado report counts, incorporating spatial and temporal effects.
Gamerman et al. (2015; Chapter 8 in this volume) give an excellent discussion of Bayesian
DGLMs, with illustrations.
In many applications, the response consists of a vector-valued time series of counts, and
there is a need to develop statistical modeling approaches for estimation and prediction.
Fahrmeir (1992) described posterior inference via extended Kalman filtering for multivariate DGLMs. In this chapter, we describe hierarchical dynamic models for univariate and
multivariate count time series. Specifically, we discuss a ZIP sampling distribution for the
univariate case and a multivariate Poisson (MVP) sampling distribution for the multivari-
ate case, incorporating covariates that may vary over location and/or time. The use of the
MVP distribution enables us to model associations between the components of the count
response vector, while the dynamic framework allows us to model the temporal behavior.
The hierarchical structure enables us to capture the location (or subject)-specific effects over
time. We also propose a multivariate dynamic finite mixture (MDFM) model framework to
reduce the dimension of the state parameter and also to allow for negative
correlations between the components of the multivariate time series.
The format of the chapter is as follows. Section 20.2 gives a description of the market-
ing application, including a description of the data. Section 20.3 describes a dynamic ZIP
model for univariate count time series. Section 20.4 first reviews the MVP distribution and
finite mixtures of MVP distributions and then describes Bayesian inference for a hierarchical dynamic model fit to multivariate time series of counts, where the coefficients are
customer specic and also vary over time. Section 20.5 shows details and modeling results
from a parsimonious MDFM model. Section 20.6 provides a brief summary.
20.2 Application to Marketing Actions of a Pharmaceutical Firm
We describe statistical analyses pertaining to marketing data from a large multinational
pharmaceutical rm. Analysis of the drivers of new prescriptions written by physicians
is of interest to marketing researchers. However, most of the existing research focuses on
physician-level sales for a single drug within a category and do not consider the association
over time between the sales of a drug and its competitors within a category. We are also
interested in the effect of a rm’s detailing on the sales of its own drug and competitors
and would like to decompose the association in sales among competing drugs between
marketing activities of a drug in the category and coincidence induced by general industry
trends.
We carry out an empirical analysis using monthly data over a three-year period from
a multinational pharmaceutical company, pertaining to physician prescriptions, and sales
calls directed toward the physicians. Sales calls denote visits made to physician offices by
the firm’s representatives. Similar to most other research studies in this context, we treat
physicians as customers of the pharmaceutical firm. The behavioral data collected monthly
by the firm over a period of 3 years consist of the number of new prescriptions from a
physician (sales) and the number of sales calls directed toward the physicians (detailing).
The number of sales is the primary customer behavior on which we focus in this study.
We focus on one of the newer drugs launched by the firm in a large therapeutic drug category (1 of the 10 largest therapeutic categories in the United States) as the own drug. The
therapeutic category contains more than three major drugs, and the own drug possesses
an intermediate market share. Due to confidentiality concerns, we are unable to reveal any
other information about the drug category or the pharmaceutical firm. We are interested in
modeling patterns in the number of prescriptions written by a physician for the own drug, a
leader drug, and a challenger drug. We classify those drugs with the highest market shares in
this specific therapeutic category as leaders and the other competing drugs as challengers.
The sales calls directed toward the customer by the firm constitute customer relationship
management (CRM) actions.
Existing research shows that detailing and sampling (giving the physician drug samples) influence new prescriptions from physicians (Mizik and Jacobson, 2004). Montoya
et al. (2010) state that after accounting for dynamics in physician prescription writing
behavior, detailing seems to be most effective in acquiring new physicians, whereas sam-
pling is most effective in obtaining recurring prescriptions from existing physicians. The
database consists of monthly prescription histories on the own drug for 45 continuous
months within the last decade from a sample of physicians from the American Medical
Association (AMA) database. The time window of our data starts 1 year after introduction
of the own drug. Exploratory data analysis shows that while the firm obtains on average
three new prescriptions per month from a single physician, and salespeople call on a physi-
cian on average about twice a month, there is considerable variation in both the monthly
level of sales per physician and the number of sales calls directed toward the physician
each month.
During these 45 months, the pharmaceutical firm also collected attitudinal data, that
is, monthly information on customer (physician) attitudes regarding all the drugs in the
therapeutic category and their corresponding salespeople. The sampling frame for the survey was obtained by combining the list of physicians available in the firm database and the
AMA database and was expected to cover over 95% of the physicians in the United States.
The information obtained in the survey that is relevant to our study includes (1) physician
ratings (on a seven-point scale) of the salesperson for each drug in terms of overall perfor-
mance, credibility, knowledge of the disease, and knowledge of medications; (2) physician
ratings (on a seven-point scale) of each drug on its overall performance; (3) demographic
information such as the physician zip code and specialty; and (4) estimates of the num-
ber of times a salesperson from the drug company visited the physician in the last month.
While the overall response rate was about 15% of all contacted physicians, the response
rate among physicians who prescribed the own drug at least once before the time frame
of the data was 35%. These statistics are similar to the response rates obtained in other
studies in the pharmaceutical industry (Ahearne et al., 2007). Overall, 6249 physicians had
responded at least once to the survey regarding at least one drug in the therapeutic category. An exploratory analysis of variance (ANOVA) excluded the presence of
selection bias. Although customer attitudes are known to influence customer reactions to
marketing communications of a firm, they are rarely included in models that determine
customer value. Exploratory analysis shows that sales calls and attitudes toward the firm
correlate positively with sales, and attitudes toward competition correlate negatively with
sales of the own drug.
The own drug represents a significantly different chemical formulation and further targets a different function of the human body to cure the disease condition than the drugs
available at the time of introduction in the therapeutic category. It is, therefore, reasonable to expect that physicians will learn about the efficacy of the drug over time, resulting
in variation (either increase or decrease) in sales and attitudes over time. This expectation is supported by multiple exploratory tests of the customer sales histories. We observe
that the average level of sales (across all customers) ranges from about one in the first
month to four in the last month. An ANOVA test rejected the null hypothesis that the mean
level of sales was the same across the months. The variation in sales over time motivated
us to develop a dynamic model framework where the coefficients in the customer-level
sales response model could vary across customers and over time. Venkatesan et al. (2014)
discussed a dynamic ZIP framework that combines sparse survey-based physician attitude data that are not available at regular intervals, with physician-level transaction and
marketing histories that are available at regular time intervals. Univariate (own or com-
petitor) prescription counts are modeled in order to discuss retention and sales; this model
is discussed in Section 20.3 of this chapter.
An important step of the marketing research is to jointly model the number of prescrip-
tions of different drugs written by the physicians over time, taking into account possible
associations between them. Almost all the current research focuses on physician-level sales
for a single drug within a category and does not consider the association between the sales
over time of a drug and its competitors within a category. The effect of a firm’s detailing
on the sales of its own drug and competitors is also of interest. We describe an MDFM
model, which provides a useful framework for studying the evolution of sales of a set
of competing drugs within a category. Using this parsimonious model for multivariate
counts, association in sales among competing drugs may be decomposed between market-
ing activities of a drug in the category and coincidence induced by general industry trends.
The model employs a mixture of multivariate Poisson distributions to model the vector of
counts and induces parsimony by allowing some model coefficients to dynamically evolve
over time and others to be subject specific and static over time. A richer and more general
model formulation involves a hierarchical setup where the sampling distribution is a mixture of multivariate Poisson distributions and all the coefficients in the model formulation are allowed
to be dynamic and subject specic. The general hierarchical multivariate dynamic model
(HMDM) framework is outlined in Section 20.4, followed by a description of the MDFM
model in Section 20.5.
20.3 Dynamic ZIP Models for Univariate Prescription Counts
Let $Y_{i,t}$ denote the observed new prescription counts of the own drug from physician $i$ in
month $t$, for $i = 1, \ldots, N$ and $t = 1, \ldots, T$, and let $D_{i,t}$ be the level of detailing (sales calls)
directed at the $i$th physician in month $t$. We assume that $Y_{i,t}$ follows a ZIP model (Lambert,
1992), that is, the $i$th physician at time $t$ can belong to either of two latent (unobserved)
states, the inactive state corresponding to $B_{i,t} = 1$, or the active state with $B_{i,t} = 0$. The states
have the interpretation that zero new prescriptions will be observed with high probability
from physicians in the inactive state. When the physician is in the active state, the number
of new prescriptions can assume values $k = 0, 1, 2, \ldots$. Due to market forces, marketing, and
other influences, a physician is likely to move from the active to the inactive state and vice
versa. In our context, we interpret a physician in the active state as being retained by a firm
and a physician in the inactive state to be dormant, and we assume that a physician never
quits a relationship, and there is always a finite probability that the physician will return to
prescribing the own drug. This formulation is a special case of the hidden Markov model
with two states, active and inactive (Netzer et al., 2008; Zucchini and MacDonald, 2010).
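Let $\lambda_{i,t} > 0$ be the mean prescription count for physician $i$ at time $t$ in the active state, and let
$0 < \pi_{i,t} < 1$ denote $P(B_{i,t} = 1)$. The ZIP model formulation is
$$
\begin{aligned}
P(Y_{i,t} = 0 \mid \lambda_{i,t}, \pi_{i,t}) &= \pi_{i,t} + (1 - \pi_{i,t}) \exp(-\lambda_{i,t}), \\
P(Y_{i,t} = k \mid \lambda_{i,t}, \pi_{i,t}) &= (1 - \pi_{i,t}) \exp(-\lambda_{i,t}) \frac{\lambda_{i,t}^{k}}{k!}, \quad k = 1, 2, \ldots
\end{aligned}
\tag{20.1}
$$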
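As a minimal numerical sketch of (20.1), the following Python function evaluates the ZIP probabilities for a given active-state mean and a given inactive-state probability $P(B_{i,t} = 1)$. The function name `zip_pmf`, the argument names `lam` and `pi`, and the numerical values are illustrative assumptions and do not come from the chapter's data.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) under the zero-inflated Poisson model in (20.1).

    lam: Poisson mean in the active state (lam > 0).
    pi:  probability of the inactive state, P(B = 1), with 0 < pi < 1.
    """
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises from the inactive state or from a Poisson zero.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# Hypothetical parameter values, chosen only for illustration.
lam, pi = 3.0, 0.25
probs = [zip_pmf(k, lam, pi) for k in range(60)]
total = sum(probs)                               # should be very close to 1
mean = sum(k * p for k, p in enumerate(probs))   # should be close to (1 - pi) * lam
```

The two summaries serve as sanity checks: the probabilities sum to one, and the marginal mean is $(1 - P(B_{i,t} = 1))\lambda_{i,t}$, which is smaller than the active-state mean because of the extra mass at zero.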
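The distribution for $Y_{i,t}$ can be written as a mixture distribution, that is, $Y_{i,t} = V_{i,t}(1 - B_{i,t})$,
where $B_{i,t} \sim \text{Bernoulli}(\pi_{i,t})$, $V_{i,t} \sim \text{Poisson}(\lambda_{i,t})$, and $B_{i,t}$ and $V_{i,t}$ are independent.
The parameters $\lambda_{i,t}$ and $\pi_{i,t}$ are latent physician-specific dynamic parameters that are modeled
as functions of observed covariates. It is reasonable to model the natural logarithm of $\lambda_{i,t}$
and the logit of $\pi_{i,t}$ as
$$
\begin{aligned}
\ln(\lambda_{i,t}) &= \beta^{\lambda}_{0,i,t} + \beta^{\lambda}_{1,i,t} \ln(D_{i,t} + 1) + \beta^{\lambda}_{2,i,t} R_{i,t}, \\
\text{logit}(\pi_{i,t}) &= \beta^{\pi}_{0,i,t} + \beta^{\pi}_{1,i,t} \ln(D_{i,t} + 1) + \beta^{\pi}_{2,i,t} R_{i,t},
\end{aligned}
\tag{20.2}
$$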
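The link functions in (20.2) and the mixture representation $Y_{i,t} = V_{i,t}(1 - B_{i,t})$ can be sketched in a few lines of Python. This is only a forward simulation for one physician-month at fixed coefficients. It does not cover the dynamic evolution of the coefficients or the Bayesian inference used in the chapter. All function names, the coefficient values, and the covariate values are hypothetical.

```python
import math
import random

def zip_parameters(detailing, rfm, beta_lam, beta_pi):
    """Map covariates to (lam, pi) through the log and logit links of (20.2)."""
    x = (1.0, math.log(detailing + 1.0), rfm)   # intercept, ln(D + 1), R
    eta_lam = sum(b * v for b, v in zip(beta_lam, x))
    eta_pi = sum(b * v for b, v in zip(beta_pi, x))
    lam = math.exp(eta_lam)                     # inverse of the log link
    pi = 1.0 / (1.0 + math.exp(-eta_pi))        # inverse of the logit link
    return lam, pi

def poisson_draw(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_count(lam, pi, rng):
    """Draw Y = V * (1 - B) with B ~ Bernoulli(pi), V ~ Poisson(lam), independent."""
    b = 1 if rng.random() < pi else 0
    v = poisson_draw(lam, rng)
    return v * (1 - b)

# Hypothetical coefficients and covariates for a single physician-month.
beta_lam = (0.5, 0.4, 0.1)
beta_pi = (-0.5, -0.6, -0.2)
lam, pi = zip_parameters(detailing=2, rfm=1.0, beta_lam=beta_lam, beta_pi=beta_pi)

rng = random.Random(2024)
draws = [simulate_count(lam, pi, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)           # compare with (1 - pi) * lam
```

With these illustrative signs, more detailing raises the active-state mean and lowers the probability of the inactive state. The signs and magnitudes in an actual analysis are estimated from the data.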
where $R_{i,t}$ denotes the RFM variable (see later), which is widely used to predict customer
response to offers in direct marketing (Blattberg et al., 2008) and captures the behavioral loyalty of customers. In general, recency (R) refers to the time since a customer’s last purchase
(with a ceiling at 3 months), frequency (F) is the number of times a customer made a purchase in the last 3 months, and monetary value (M) is the amount spent in the last 3 months.
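To make the recency and frequency definitions concrete, the sketch below computes both from a physician's monthly prescription history. It rests on assumptions not stated in the text: a "purchase" is taken to be a month with at least one new prescription, recency is measured in whole months, and monetary value is omitted because no spending data are described. How these components are combined into the RFM variable is not shown here, so the function `recency_frequency` is purely illustrative.

```python
def recency_frequency(history, window=3):
    """Return (recency, frequency) from monthly counts, oldest month first.

    history: prescription counts up to and including the previous month.
    recency: months since the last month with a nonzero count, capped at `window`.
    frequency: number of months with a nonzero count in the last `window` months.
    """
    recency = window                      # ceiling when no recent purchase exists
    for months_ago, count in enumerate(reversed(history), start=1):
        if count > 0:
            recency = min(months_ago, window)
            break
    frequency = sum(1 for count in history[-window:] if count > 0)
    return recency, frequency

# Hypothetical histories for two physicians.
active = recency_frequency([2, 0, 1, 0])       # last prescription two months ago
dormant = recency_frequency([1, 0, 0, 0, 0])   # nothing in the last three months
```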