20: Dynamic Models for Time Series of Counts with a Marketing Application (1/5)


Dynamic Models for Time Series of Counts with a Marketing Application

Nalini Ravishanker, Rajkumar Venkatesan, and Shan Hu

CONTENTS

20.1 Introduction

20.2 Application to Marketing Actions of a Pharmaceutical Firm

20.3 Dynamic ZIP Models for Univariate Prescription Counts

20.4 Hierarchical Multivariate Dynamic Models for Prescription Counts

20.4.1 Finite Mixtures of Multivariate Poisson Distributions

20.4.2 HMDM Model Description

20.4.3 Bayesian Inference for the HMDM Model

20.5 Multivariate Dynamic Finite Mixture Model

20.5.1 MDFM Model Description

20.5.2 Bayesian Inference for the MDFM Model

20.6 Summary

20.A Appendix

20.A.1 Complete Conditional Distributions for the HMDM Model

20.A.2 Complete Conditional Distributions and Sampling Algorithms for the MDFM Model

References

20.1 Introduction

In many applications, including marketing, we observe counts of some event of interest at different times and for different subjects. Accurate modeling of such time series of counts (responses) for N subjects over T time periods as functions of relevant covariates (subject-specific and time-varying), and incorporating dependence over time, is becoming increasingly important in several applications. In situations where we observe a vector of counts for each subject at each time, we are also interested in incorporating the association between the components of the count vectors. In this chapter, we describe the modeling of univariate and multivariate time series of counts in the context of a marketing application that involves modeling/predicting product sales.

While count data regression is a widely used applied statistical tool today (Kedem and Fokianos, 2002), models for count time series are less common. The main approaches include a regression-type approach using quasi-likelihood as discussed in Zeger (1988), a Poisson–Gamma mixture modeling approach described by Harvey and Fernandes (1989), the generalized linear autoregressive moving average (GLARMA) model discussed in


426 Handbook of Discrete-Valued Time Series

Davis et al. (2003), and the dynamic generalized linear model (GLM) discussed in Gamerman (1998) and Landim and Gamerman (2000). We propose to employ hierarchical dynamic models and illustrate them with a marketing example. For Gaussian dynamic linear models (DLMs), also often referred to as Gaussian state space models, Kalman (1960) and Kalman and Bucy (1961) popularized a recursive algorithm for optimal estimation and prediction of the state vector, which then enables the prediction of the observation vector; see West (1989) for details. Carlin et al. (1992) described the use of Markov chain Monte Carlo (MCMC) methods for non-Gaussian and nonlinear state space models. Chen et al. (2000) is an excellent reference text for MCMC methods.
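To make the recursive idea concrete, the following Python sketch runs the Kalman recursions for the simplest Gaussian DLM, a local level (random walk plus noise) model. This is only a minimal illustration under stated assumptions: the function name `local_level_filter`, the variance values, the diffuse starting variance `c0`, and the toy series are all hypothetical, and the count models developed in this chapter are not Gaussian.

```python
def local_level_filter(ys, obs_var, state_var, m0=0.0, c0=1e6):
    """Kalman filter for the local level model (an assumed toy example):
        y_t     = theta_t + v_t,      v_t ~ N(0, obs_var)
        theta_t = theta_{t-1} + w_t,  w_t ~ N(0, state_var)
    Returns filtered state means, filtered state variances, and
    one-step-ahead forecasts of the observations."""
    m, c = m0, c0
    filtered_means, filtered_vars, forecasts = [], [], []
    for y in ys:
        # Prediction step: propagate the state through the random walk.
        a, r = m, c + state_var
        # One-step-ahead forecast of the observation and its variance.
        f, q = a, r + obs_var
        # Update step: weight the forecast error by the Kalman gain.
        gain = r / q
        m = a + gain * (y - f)
        c = r * (1.0 - gain)
        forecasts.append(f)
        filtered_means.append(m)
        filtered_vars.append(c)
    return filtered_means, filtered_vars, forecasts


# Toy series (made-up numbers, not the prescription data).
ys = [2.0, 3.0, 2.5, 4.0, 3.5, 5.0, 4.5]
means, variances, forecasts = local_level_filter(ys, obs_var=1.0, state_var=0.1)
```

Each forecast equals the previous filtered mean, which is the sense in which estimating the state enables prediction of the observation.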

Hierarchical dynamic linear models (HDLMs) combine the stratified parametric linear models (Lindley and Smith, 1972) and the DLMs into a general framework, and have been particularly useful in econometric, education, and health care applications (Gamerman and Migon, 1993). The Gaussian HDLM includes a set of one or more dimension-reducing structural equations along with the observation equation and state (or system) equation of the DLM. Landim and Gamerman (2000) further extended the Gaussian HDLM to a more general class of models where the response vector has a matrix-valued normal distribution. For situations where the time series of responses consists of counts, DLMs have been generalized to dynamic generalized linear models (DGLMs) or exponential family state space models, which assume that the sampling distribution is a member of the exponential family of distributions, such as the Poisson or negative binomial distributions. The DGLMs may be viewed as dynamic versions of the GLMs (McCullagh and Nelder, 1989). For univariate time series, Fahrmeir and Kaufmann (1991) discussed Bayesian inference via an extended Kalman filter approach, while Gamerman (1998) described the use of the Metropolis–Hastings algorithm combined with the Gibbs sampler in repeated use of an adjusted version of the Gaussian DLM. Applications of state space models of counts include Weinberg et al. (2007) in operations management and Aktekin et al. (2014) in finance, for instance. Wikle and Anderson (2003) described a dynamic zero-inflated Poisson (ZIP) model framework for tornado report counts, incorporating spatial and temporal effects. Gamerman et al. (2015; Chapter 8 in this volume) give an excellent discussion of Bayesian DGLMs, with illustrations.
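For concreteness, one common way to write a Poisson DGLM with a log link and Gaussian state evolution is sketched below. The symbols $F_t$, $G_t$, $\theta_t$, and $W_t$ are generic notation assumed here for illustration; they are not necessarily the notation of the references cited above.

```latex
\begin{aligned}
Y_t \mid \lambda_t &\sim \text{Poisson}(\lambda_t), \\
\ln \lambda_t &= F_t^{\top} \theta_t, \\
\theta_t &= G_t \theta_{t-1} + w_t, \quad w_t \sim N(0, W_t).
\end{aligned}
```

The first line is the sampling distribution, the second links its mean to the state vector, and the third is the state (or system) equation carried over from the DLM.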

In many applications, the response consists of a vector-valued time series of counts, and there is a need to develop statistical modeling approaches for estimation and prediction. Fahrmeir (1992) described posterior inference via extended Kalman filtering for multivariate DGLMs. In this chapter, we describe hierarchical dynamic models for univariate and multivariate count time series. Specifically, we discuss a ZIP sampling distribution for the univariate case and a multivariate Poisson (MVP) sampling distribution for the multivariate case, incorporating covariates that may vary over location and/or time. The use of the MVP distribution enables us to model associations between the components of the count response vector, while the dynamic framework allows us to model the temporal behavior. The hierarchical structure enables us to capture the location (or subject)-specific effects over time. We also propose a multivariate dynamic finite mixture (MDFM) model framework to reduce the dimension of the state parameter and also to include the possibility of negative correlations between the components of the multivariate time series.
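The following Python sketch illustrates how a multivariate Poisson distribution can induce association between count components. It uses the common-shock construction of a bivariate Poisson, which is assumed here for illustration only; the chapter's own definition of the MVP distribution is given in Section 20.4.1. The function names and rate values are hypothetical.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def bivariate_poisson_draw(lam1, lam2, lam0, rng):
    """Common-shock construction (assumed): Y1 = X1 + X0 and Y2 = X2 + X0,
    with independent X1 ~ Poisson(lam1), X2 ~ Poisson(lam2), X0 ~ Poisson(lam0)."""
    x0 = poisson_draw(lam0, rng)
    return poisson_draw(lam1, rng) + x0, poisson_draw(lam2, rng) + x0


# Made-up rates; under this construction Cov(Y1, Y2) = lam0.
rng = random.Random(1)
draws = [bivariate_poisson_draw(2.0, 3.0, 1.0, rng) for _ in range(20000)]
n = len(draws)
mean1 = sum(y1 for y1, _ in draws) / n
mean2 = sum(y2 for _, y2 in draws) / n
cov12 = sum((y1 - mean1) * (y2 - mean2) for y1, y2 in draws) / (n - 1)
```

In this construction the covariance equals the shared rate `lam0` and therefore cannot be negative, which is consistent with the motivation given above for turning to finite mixtures when negative correlations are plausible.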

The format of the chapter is as follows. Section 20.2 gives a description of the marketing application, including a description of the data. Section 20.3 describes a dynamic ZIP model for univariate count time series. Section 20.4 first reviews the MVP distribution and finite mixtures of MVP distributions and then describes Bayesian inference for a hierarchical dynamic model fit to multivariate time series of counts, where the coefficients are customer specific and also vary over time. Section 20.5 shows details and modeling results from a parsimonious MDFM model. Section 20.6 provides a brief summary.

20.2 Application to Marketing Actions of a Pharmaceutical Firm

We describe statistical analyses pertaining to marketing data from a large multinational pharmaceutical firm. Analysis of the drivers of new prescriptions written by physicians is of interest to marketing researchers. However, most of the existing research focuses on physician-level sales for a single drug within a category and does not consider the association over time between the sales of a drug and its competitors within a category. We are also interested in the effect of a firm’s detailing on the sales of its own drug and competitors and would like to decompose the association in sales among competing drugs between marketing activities of a drug in the category and coincidence induced by general industry trends.

We carry out an empirical analysis using monthly data over a three-year period from a multinational pharmaceutical company, pertaining to physician prescriptions, and sales calls directed toward the physicians. Sales calls denote visits made to physician offices by the firm’s representatives. Similar to most other research studies in this context, we treat physicians as customers of the pharmaceutical firm. The behavioral data collected monthly by the firm over a period of 3 years consist of the number of new prescriptions from a physician (sales) and the number of sales calls directed toward the physicians (detailing). The number of sales is the primary customer behavior on which we focus in this study. We focus on one of the newer drugs launched by the firm in a large therapeutic drug category (1 of the 10 largest therapeutic categories in the United States) as the own drug. The therapeutic category contains more than three major drugs, and the own drug possesses an intermediate market share. Due to confidentiality concerns, we are unable to reveal any other information about the drug category or the pharmaceutical firm. We are interested in modeling patterns in the number of prescriptions written by a physician on the own drug, a leader drug, and a challenger drug. We classify those drugs with the highest market shares in this specific therapeutic category as leaders and the other competing drugs as challengers. The sales calls directed toward the customer by the firm constitute customer relationship management (CRM) actions.

Existing research shows that detailing and sampling (giving the physician drug samples) influence new prescriptions from physicians (Mizik and Jacobson, 2004). Montoya et al. (2010) state that after accounting for dynamics in physician prescription writing behavior, detailing seems to be most effective in acquiring new physicians, whereas sampling is most effective in obtaining recurring prescriptions from existing physicians. The database consists of monthly prescription histories on the own drug for 45 continuous months within the last decade from a sample of physicians from the American Medical Association (AMA) database. The time window of our data starts 1 year after introduction of the own drug. Exploratory data analysis shows that while the firm obtains on average three new prescriptions per month from a single physician, and salespeople call on a physician on average about twice a month, there is considerable variation in both the monthly level of sales per physician and the number of sales calls directed toward the physician each month.


During these 45 months, the pharmaceutical firm also collected attitudinal data, that is, monthly information on customer (physician) attitudes regarding all the drugs in the therapeutic category and their corresponding salespeople. The sampling frame for the survey was obtained by combining the list of physicians available in the firm database and the AMA database and was expected to cover over 95% of the physicians in the United States. The information obtained in the survey that is relevant to our study includes (1) physician ratings (on a seven-point scale) of the salesperson for each drug in terms of overall performance, credibility, knowledge of the disease, and knowledge of medications; (2) physician ratings (on a seven-point scale) of each drug on its overall performance; (3) demographic information such as the physician zip code and specialty; and (4) estimates of the number of times a salesperson from the drug company visited the physician in the last month. While the overall response rate was about 15% of all contacted physicians, the response rate among physicians who prescribed the own drug at least once before the time frame of the data was 35%. These statistics are similar to the response rates obtained in other studies in the pharmaceutical industry (Ahearne et al., 2007). Overall, 6249 physicians had responded at least once to the survey regarding at least one drug in the therapeutic category. An exploratory analysis of variance (ANOVA) ruled out the presence of selection bias. Although customer attitudes are known to influence customer reactions to marketing communications of a firm, they are rarely included in models that determine customer value. Exploratory analysis shows that sales calls and attitudes toward the firm correlate positively with sales, and attitudes toward competition correlate negatively with sales of the own drug.

The own drug represents a significantly different chemical formulation and further targets a different function of the human body to cure the disease condition than the drugs available at the time of introduction in the therapeutic category. It is, therefore, reasonable to expect that physicians will learn about the efficacy of the drug over time, resulting in variation (either increase or decrease) in sales and attitudes over time. This expectation is supported by multiple exploratory tests of the customer sales histories. We observe that the average level of sales (across all customers) ranges from about one in the first month to four in the last month. An ANOVA test rejected the null hypothesis that the mean level of sales was the same across the months. The variation in sales over time motivated us to develop a dynamic model framework where the coefficients in the customer-level sales response model could vary across customers and over time. Venkatesan et al. (2014) discussed a dynamic ZIP framework that combines sparse survey-based physician attitude data that are not available at regular intervals, with physician-level transaction and marketing histories that are available at regular time intervals. Univariate (own or competitor) prescription counts are modeled in order to discuss retention and sales; this model is discussed in Section 20.3 of this chapter.

An important step of the marketing research is to jointly model the number of prescriptions of different drugs written by the physicians over time, taking into account possible associations between them. Almost all the current research focuses on physician-level sales for a single drug within a category and does not consider the association between the sales over time of a drug and its competitors within a category. The effect of a firm’s detailing on the sales of its own drug and competitors is also of interest. We describe an MDFM model, which provides a useful framework for studying the evolution of sales of a set of competing drugs within a category. Using this parsimonious model for multivariate counts, association in sales among competing drugs may be decomposed between marketing activities of a drug in the category and coincidence induced by general industry trends. The model employs a mixture of multivariate Poisson distributions to model the vector of counts and induces parsimony by allowing some model coefficients to dynamically evolve over time and others to be subject specific and static over time. A richer and more general model formulation involves a hierarchical setup where the sampling distribution is a mixture of multivariate Poisson distributions and all the coefficients in the model formulation are allowed to be dynamic and subject specific. The general hierarchical multivariate dynamic model (HMDM) framework is outlined in Section 20.4, followed by a description of the MDFM model in Section 20.5.

20.3 Dynamic ZIP Models for Univariate Prescription Counts

Let $Y_{i,t}$ denote the observed new prescription counts of the own drug from physician $i$ in month $t$, for $i = 1, \ldots, N$ and $t = 1, \ldots, T$, and let $D_{i,t}$ be the level of detailing (sales calls) directed at the $i$th physician in month $t$. We assume that $Y_{i,t}$ follows a ZIP model (Lambert, 1992), that is, the $i$th physician at time $t$ can belong to either of two latent (unobserved) states, the inactive state corresponding to $B_{i,t} = 1$, or the active state with $B_{i,t} = 0$. The states have the interpretation that zero new prescriptions will be observed with high probability from physicians in the inactive state. When the physician is in the active state, the number of new prescriptions can assume values $k = 0, 1, 2, \ldots$. Due to market forces, marketing and other influences, a physician is likely to move from the active to the inactive state and vice versa. In our context, we interpret a physician in the active state as being retained by a firm and a physician in the inactive state to be dormant, and we assume that a physician never quits a relationship, and there is always a finite probability that the physician will return to prescribing the own drug. This formulation is a special case of the hidden Markov model with two states, active and inactive (Netzer et al., 2008; Zucchini and MacDonald, 2010).

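Let $\lambda_{i,t} > 0$ be the mean prescription count for physician $i$ at time $t$, and let $0 < \pi_{i,t} < 1$ denote $P(B_{i,t} = 1)$. The ZIP model formulation is

$$
\begin{aligned}
P(Y_{i,t} = 0 \mid \lambda_{i,t}, \pi_{i,t}) &= \pi_{i,t} + (1 - \pi_{i,t}) \exp(-\lambda_{i,t}), \\
P(Y_{i,t} = k \mid \lambda_{i,t}, \pi_{i,t}) &= (1 - \pi_{i,t}) \exp(-\lambda_{i,t}) \lambda_{i,t}^{k} / k!, \quad k = 1, 2, \ldots
\end{aligned}
\tag{20.1}
$$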
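As a minimal illustration of Equation (20.1), the following Python sketch evaluates the ZIP probabilities for a single physician-month. The function name `zip_pmf` and the numerical values of `lam` and `pi` are assumptions chosen for illustration; they are not estimates from the prescription data. Here `lam` stands for the mean prescription count in the active state and `pi` for the inactive-state probability $P(B_{i,t} = 1)$.

```python
import math


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """P(Y = k) for a zero-inflated Poisson, following Equation (20.1)."""
    if k < 0:
        return 0.0
    poisson_k = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero comes either from the inactive state or from an
        # active-state Poisson draw that happens to be zero.
        return pi + (1.0 - pi) * poisson_k
    return (1.0 - pi) * poisson_k


# Illustrative values only (hypothetical, not fitted to any data).
lam, pi = 3.0, 0.25
probs = [zip_pmf(k, lam, pi) for k in range(60)]
total = sum(probs)                                # close to 1 (truncated sum)
mean = sum(k * p for k, p in enumerate(probs))    # close to (1 - pi) * lam
```

Under these assumed values, the truncated probabilities sum to one up to a negligible tail, and the mean is $(1 - \pi)\lambda$, the usual ZIP mean.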

The distribution for $Y_{i,t}$ can be written as a mixture distribution, that is, $Y_{i,t} = V_{i,t}(1 - B_{i,t})$, where $B_{i,t} \sim \text{Bernoulli}(\pi_{i,t})$, $V_{i,t} \sim \text{Poisson}(\lambda_{i,t})$, and $B_{i,t}$ and $V_{i,t}$ are independent, latent physician-specific dynamic parameters that are modeled as functions of observed covariates. It is reasonable to model the natural logarithm of $\lambda_{i,t}$ and the logit of $\pi_{i,t}$ as

$$
\begin{aligned}
\ln(\lambda_{i,t}) &= \beta_{0,i,t} + \beta_{1,i,t} \ln(D_{i,t} + 1) + \beta_{2,i,t} R_{i,t}, \\
\text{logit}(\pi_{i,t}) &= \beta^{*}_{0,i,t} + \beta^{*}_{1,i,t} \ln(D_{i,t} + 1) + \beta^{*}_{2,i,t} R_{i,t},
\end{aligned}
\tag{20.2}
$$

where $R_{i,t}$ denotes the RFM variable (see later), which is widely used to predict customer response to offers in direct marketing (Blattberg et al., 2008) and captures behavioral loyalty of customers. In general, recency (R) refers to the time since a customer’s last purchase (with a ceiling at 3 months), frequency (F) is the number of times a customer made a purchase in the last 3 months, and monetary value (M) is the amount spent in the last 3 months.
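The link functions in Equation (20.2) and the mixture representation of the ZIP model can be sketched in Python as follows. This is a hedged illustration only: the coefficient values, the detailing level, and the RFM value are hypothetical, the RFM variable is taken as a given number rather than computed, and the names `zip_parameters`, `zip_draw`, `beta`, and `beta_star` (the second coefficient set, for the logit equation) are assumptions rather than the chapter's notation.

```python
import math
import random


def zip_parameters(detailing, rfm, beta, beta_star):
    """Return (lam, pi) from the log and logit links of Equation (20.2).

    beta and beta_star each hold (intercept, detailing slope, RFM slope)."""
    x = (1.0, math.log(detailing + 1.0), rfm)
    lam = math.exp(sum(b * v for b, v in zip(beta, x)))
    eta = sum(b * v for b, v in zip(beta_star, x))
    pi = 1.0 / (1.0 + math.exp(-eta))  # inverse logit
    return lam, pi


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def zip_draw(lam, pi, rng):
    """Draw Y = V * (1 - B) with B ~ Bernoulli(pi) and V ~ Poisson(lam)."""
    b = 1 if rng.random() < pi else 0
    v = poisson_draw(lam, rng)
    return v * (1 - b)


# Hypothetical coefficients and covariates for one physician-month.
beta = (0.5, 0.4, 0.1)
beta_star = (-1.0, -0.8, -0.2)
lam, pi = zip_parameters(detailing=2, rfm=1.0, beta=beta, beta_star=beta_star)

rng = random.Random(0)
draws = [zip_draw(lam, pi, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
zero_share = sum(1 for y in draws if y == 0) / len(draws)
```

With these made-up inputs, the simulated mean should be close to $(1 - \pi)\lambda$ and the share of zeros close to $\pi + (1 - \pi)\exp(-\lambda)$, in line with Equation (20.1). In the chapter's model the coefficients vary by physician and over time; the sketch fixes them at a single $(i, t)$.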
