20: Dynamic Models for Time Series of Counts with a Marketing Application (2/5)

430 Handbook of Discrete-Valued Time Series

RFM is then a weighted average of these variables, and a common rule of thumb for the weights is 60%, 30%, and 10% for R, F, and M, respectively. The three component variables (R, F, and M) are calculated at time t for each physician i as moving averages over the 3 months prior to time t, with respective weights 0.6, 0.3, and 0.1. To verify the validity of these weights in our study, we ran a logistic regression on the calibration data, where the response variable is a binary variable taking the value 1 if physician i wrote a new prescription at time t and the value 0 otherwise, and R, F, and M are predictors in this model.

Suppose the estimated regression coefficients are denoted by $\hat{b}_R$, $\hat{b}_F$, and $\hat{b}_M$; the weight for recency (R) was computed as $\hat{b}_R/(\hat{b}_R + \hat{b}_F + \hat{b}_M)$. Weights for F and M may be obtained similarly.
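As a rough illustration of the weighting step, the Python sketch below turns three regression coefficients into weights that sum to one and then forms a weighted RFM score. The function names, the coefficient values, and the choice to normalize by the sum of the coefficients are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np

def rfm_weights(coef_r, coef_f, coef_m):
    """Scale three coefficients so they sum to 1 (assumed normalization rule)."""
    coefs = np.array([coef_r, coef_f, coef_m], dtype=float)
    return coefs / coefs.sum()

def rfm_score(r, f, m, weights=(0.6, 0.3, 0.1)):
    """Weighted average of the R, F, and M components for each physician."""
    w = np.asarray(weights, dtype=float)
    return w[0] * np.asarray(r) + w[1] * np.asarray(f) + w[2] * np.asarray(m)

# Hypothetical coefficients, chosen so the weights match the rule of thumb.
w = rfm_weights(1.2, 0.6, 0.2)
# Two hypothetical physicians with made-up R, F, and M values.
score = rfm_score(r=[1.0, 0.0], f=[2.0, 1.0], m=[3.0, 0.5], weights=w)
```

If the fitted coefficients give weights close to 0.6, 0.3, and 0.1, the rule of thumb is supported by the calibration data.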















Let $\beta_{1,i,t}$ and $\beta_{2,i,t}$ denote the two blocks of dynamic coefficients, so that $\beta_{i,t} = (\beta_{1,i,t}', \beta_{2,i,t}')'$ is a $p = 6$-dimensional vector. We assume the hierarchical (or structural) equation

$$\beta_{i,t} = \gamma_t + AO_{i,t}\,\Phi + AC_{i,t}\,\Psi + CD_{i,t}\,\Omega + Z_i\,\Delta + v_{i,t}, \tag{20.3}$$

where $AO_{i,t} = \mathrm{diag}(ao_{i,t}, \ldots, ao_{i,t})$ denotes attitudes towards the own drug, $AC_{i,t} = \mathrm{diag}(ac_{i,t}, \ldots, ac_{i,t})$ denotes attitudes towards the competitive drug, $CD_{i,t}$ denotes the estimates made by physician i of the competitive detailing at time t, $Z_i$ represents physician demographics, $\Phi$, $\Psi$, $\Omega$, and $\Delta$ are the corresponding coefficient vectors, $v_{i,t} \sim N(0, V)$ denote the errors, and $\gamma_t$ is the p-dimensional state vector whose dynamic evolution is described by the state (or system) equation

$$\gamma_t = G\gamma_{t-1} + w_t, \tag{20.4}$$

where G is an identity matrix since a random walk evolution is assumed, and $w_t \sim N(0, W)$ are the state errors. The model structure assumes that customer attitudes form in all time periods, but are observed only when customers respond to the survey. If customer attitudes are observed at time t, they affect the dynamic response coefficients in the hierarchical equation, that is, $\beta_{i,t}$, and thus affect $\gamma_\ell$ for $\ell = t, t + 1, \ldots$ as well. Note that in (20.3), the predictor $R_{i,t}$ may be replaced by $\ln(Y_{i,t-1} + 1)$.

Venkatesan et al. (2014) also included a model for handling the endogeneity of sales calls by modeling $D_{i,t}$ as a Poisson random variable conditional on its mean $\eta_{i,t}$, and modeling $\ln(\eta_{i,t})$ as a linear function, with coefficients $\zeta$, of predictors indexed by k. The $\zeta$ coefficients enable us to infer whether the firm considers customer sales potential and responsiveness to sales calls in its detailing plans. In general, endogeneity between sales and sales calls may be handled in two ways. One approach consists of including lagged detailing as well as $D_{i,t}$ in (20.2). We, however, use another approach that accommodates the endogeneity by explicitly modeling the process that generates detailing $D_{i,t}$, so that including only $D_{i,t}$ in (20.2), and not its lagged values, is sufficient. Note the similarity to the incidental parameter issue raised by Lancaster (2000).

Fairly standard, conditionally conjugate prior distributions, as usually adopted in HDLMs (Landim and Gamerman, 2000), are assumed: $\pi(V)$ is an inverse-Wishart, $IW(n_V, S_V)$, and $\pi(W)$ is $IW(n_W, S_W)$, with $n_V = n_W = 2p + 1$ and $S_V = S_W = (2p + 1)I_{2p+1}$; the priors for the coefficient vectors of $AO_{i,t}$, $AC_{i,t}$, and $CD_{i,t}$ in (20.3) are each $MVN(0, 100I_p)$; the prior for the coefficient vector of the demographics is $MVN(0, 100I_K)$, where K denotes the number of customer demographic predictors; $\pi(\zeta)$ is $MVN(\zeta_0, V_\zeta)$; and $\pi(\gamma_0)$ is $MVN(0, 100I_p)$. For details on the choice of hyperparameters, see Venkatesan et al. (2014).

A Gibbs sampling algorithm is employed to estimate the posterior distribution of the model parameters. The regression coefficient vectors are obtained through suitable multivariate normal draws, the variances are routine draws from inverse-Wishart distributions, the Forward-Filtering-Backward-Sampling (FFBS) algorithm enables sampling $\gamma_t$ (see Carter and Kohn 1994; Frühwirth-Schnatter 1994), and the Metropolis–Hastings algorithm is used to generate samples from other parameters. Modeling details as well as detailed results and comparisons with several other models are given in Venkatesan et al. (2014). In particular, the deviance information criterion (DIC) was the smallest for the hierarchical dynamic ZIP model that included attitudes in (20.3), followed by the corresponding model without attitudes. The dynamic models performed better than the corresponding static models. The hierarchical dynamic ZIP model also showed the best in-sample and hold-out predictive performance, giving the smallest mean absolute deviation (MAD) both for 1-month-ahead and 12-month-ahead predictions. Physician attitudes, when available, affected $\beta_{i,t}$ and $\gamma_t$.
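To give a feel for the FFBS step, the following Python sketch draws one state path for a scalar local-level model with a random walk state. It is a simplified analogue and not the sampler used in the study: the state here is one-dimensional rather than p-dimensional, the variances V and W are treated as known rather than drawn from inverse-Wishart conditionals, and the function name and prior values (m0, C0) are hypothetical.

```python
import numpy as np

def ffbs_local_level(y, V, W, m0=0.0, C0=100.0, rng=None):
    """Draw one path gamma_1..gamma_T from p(gamma | y) for the model
    y_t = gamma_t + v_t,  gamma_t = gamma_{t-1} + w_t,
    v_t ~ N(0, V), w_t ~ N(0, W), gamma_0 ~ N(m0, C0)."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    T = len(y)
    m = np.empty(T)  # filtered means
    C = np.empty(T)  # filtered variances
    m_prev, C_prev = m0, C0
    # Forward filtering: a scalar Kalman filter with G = 1.
    for t in range(T):
        R = C_prev + W              # one-step-ahead state variance
        K = R / (R + V)             # Kalman gain
        m[t] = m_prev + K * (y[t] - m_prev)
        C[t] = (1.0 - K) * R
        m_prev, C_prev = m[t], C[t]
    # Backward sampling: draw gamma_T, then gamma_t given gamma_{t+1}.
    gamma = np.empty(T)
    gamma[-1] = rng.normal(m[-1], np.sqrt(C[-1]))
    for t in range(T - 2, -1, -1):
        B = C[t] / (C[t] + W)                    # smoothing gain
        mean = m[t] + B * (gamma[t + 1] - m[t])  # conditional mean
        var = C[t] * (1.0 - B)                   # conditional variance
        gamma[t] = rng.normal(mean, np.sqrt(var))
    return gamma

# Simulated data from the same model (illustrative values only).
rng = np.random.default_rng(0)
truth = np.cumsum(rng.normal(size=50))
y = truth + rng.normal(scale=0.5, size=50)
draw = ffbs_local_level(y, V=0.25, W=1.0, rng=rng)
```

Inside a Gibbs sampler, a draw like this would alternate with draws of the remaining parameters conditional on the sampled states.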

Information provided by posterior and predictive distributions from convergent MCMC samples for the model parameters of the hierarchical dynamic ZIP model enables the firm to make decisions about customer selection and resource allocation by analyzing the customer lifetime value (CLV) metric. CLV was computed over 35 months, because the firm revealed that it did not plan its sales force allocations over 3 years ahead, and is

$$CLV_i = \sum_{t=T^*}^{T^*+36} \frac{(1 - \hat{p}_{i,t})\,\hat{\mu}_{i,t} - c_{i,t}\,\hat{\eta}_{i,t}}{(1 + d^*)^{t-T^*}}, \tag{20.5}$$

where $T^* = 10$, $d^*$ is the discount coefficient, $c_{i,t}$ is the unit cost of a sales call, and $\hat{\mu}_{i,t}$ and $\hat{\eta}_{i,t}$ denote the predicted means of the sales and detailing, respectively.
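The CLV calculation can be sketched in a few lines of Python, assuming (20.5) is a discounted sum of predicted sales, scaled by one minus a predicted probability, net of the cost of predicted sales calls. The function name, the argument names, and the convention that the first month in the horizon is discounted by $(1 + d^*)^0$ are assumptions for illustration.

```python
import numpy as np

def clv(p_zero, sales_mean, call_mean, unit_cost, discount):
    """Discounted sum of (1 - p_zero) * sales_mean - unit_cost * call_mean.

    Each array argument has one entry per month of the planning horizon;
    entry k is discounted by (1 + discount) ** k (assumed convention).
    """
    p_zero = np.asarray(p_zero, dtype=float)
    sales_mean = np.asarray(sales_mean, dtype=float)
    call_mean = np.asarray(call_mean, dtype=float)
    k = np.arange(len(sales_mean))  # plays the role of t - T*
    net = (1.0 - p_zero) * sales_mean - unit_cost * call_mean
    return float(np.sum(net / (1.0 + discount) ** k))

# Hypothetical two-month horizon: 10 units of predicted sales and
# 1 predicted call per month, each call costing 2 units.
value_no_discount = clv([0.0, 0.0], [10.0, 10.0], [1.0, 1.0], 2.0, 0.0)
value_discounted = clv([0.0, 0.0], [10.0, 10.0], [1.0, 1.0], 2.0, 1.0)
```

Ranking physicians by a value of this kind is what allows them to be grouped into quintiles for targeting.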

Ongoing collection of physician attitudes via surveys requires an annual investment of over $1 million from the firm, which would wish to evaluate whether the financial returns from collecting and using these attitudes in modeling exceed the investment. Venkatesan et al. (2014) used customer selection and customer-level resource allocation based on a hold-out sample of 1000 physicians. The objective of the customer selection process is to identify the physicians who would be profitable in the future so that they can be prioritized for targeting. Physician-level sales and retention were predicted from months 10 to 45, and these predictions were used to compute the physician's CLV using (20.5). Missing attitudes in the hold-out sample were imputed using an ordered probit model (Albert and Chib, 1993).

Predictive results from a hierarchical dynamic ZIP model that includes physician attitude information in (20.3) were compared to results from a model that does not include data on attitudes, in order to quantify the implications to the firm and discuss selection of profitable physicians. Physicians can be classified into quintiles based on the actual CLV, the CLV predicted from the hierarchical dynamic ZIP model that includes customer attitudes, and the CLV predicted from a hierarchical dynamic ZIP model that did not include customer attitudes. The incremental profit from including customer attitudes was equivalent to 0.93% of the total CLV obtained from physicians identified to be in the top quintile based on their observed profits. This implies that if the firm were targeting the top quintile of its customer base, the returns from including customer attitudes to select the most likely physicians to target would be 0.93% higher than not including customer attitudes. Similarly, the returns from including customer attitudes would be higher by 3.57%, 29.62%, 79.33%, and 24.12% relative to not including customer attitudes, if the firm targets the second, third, fourth, and the fifth quintiles, respectively. The incremental profits from including attitudes were highest for the mid-tier groups, that is, third and fourth quintiles.

20.4 Hierarchical Multivariate Dynamic Models for Prescription Counts

Let $Y_{i,t} = (Y_{1,it}, \ldots, Y_{m,it})$, for $t = 1, \ldots, T$, denote the m-dimensional time series of new prescription counts from physician i, where $i = 1, \ldots, N$. The components of the vector correspond to counts of the firm's own drug and the competing drugs. We propose a finite mixture of multivariate Poisson distributions as a sampling distribution of the m-dimensional vector, which allows negative as well as positive associations between counts of the own drug and the competing drugs. We start with a review of mixtures of multivariate Poisson distributions in Section 20.4.1 and then show a general hierarchical dynamic modeling framework in Section 20.4.2.

20.4.1 Finite Mixtures of Multivariate Poisson Distributions

Following Mahamunulu (1967) and Johnson et al. (1997), the definition of an m-variate Poisson distribution for a random vector of counts Y is based on a mapping $g : \mathbb{N}^q \to \mathbb{N}^m$, $q \ge m$, such that $Y = g(X) = AX$. Here, $X = (X_1, \ldots, X_q)'$ is a vector of unobserved independent Poisson random variables, that is, $X_r \sim \text{Poisson}(\lambda_r)$ for $r = 1, \ldots, q$; and A is an arbitrary $m \times q$ matrix which determines the properties of the multivariate Poisson distribution. The m-dimensional vector $Y = (Y_1, \ldots, Y_m)' = AX$ follows a multivariate Poisson distribution with parameters $\lambda = (\lambda_1, \ldots, \lambda_q)'$ and pmf $MP_m(y|\lambda)$ given by

$$P(Y = y|\lambda) = \sum_{x \in g^{-1}(y)} P(X = x) = \sum_{x \in g^{-1}(y)} \prod_{r=1}^{q} P(X_r = x_r|\lambda_r), \tag{20.6}$$

where $g^{-1}(y)$ denotes the inverse image of $y \in \mathbb{N}^m$ and for $r = 1, \ldots, q$, the pmf of the univariate Poisson distribution is $P(X_r = x_r|\lambda_r) = \exp(-\lambda_r)\lambda_r^{x_r}/x_r!$. The mean vector and variance–covariance matrix of Y conditional on $\lambda$ are given by

$$E(Y|\lambda) = A\lambda; \quad \mathrm{Cov}(Y|\lambda) = A\Lambda A', \tag{20.7}$$

where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_q)$. When m = 1, $MP_m(y|\lambda)$ in (20.6) reduces to the univariate Poisson pmf $P(Y = y|\lambda) = \exp(-\lambda)\lambda^y/y!$. Use of the multivariate Poisson distribution for modeling applications has been sparse, possibly due to the complicated form of the pmf (20.6) which does not lend itself to easy computation.
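The construction Y = AX and the moments in (20.7) can be checked numerically. The Python sketch below uses a hypothetical 2 × 3 matrix A and hypothetical rates, computes the mean and covariance from the formulas, and compares them with Monte Carlo estimates from simulated latent Poisson variables. The seed and sample size are arbitrary choices.

```python
import numpy as np

# Hypothetical example with m = 2 observed counts and q = 3 latent counts.
A = np.array([[1, 0, 1],
              [0, 1, 1]])
lam = np.array([2.0, 3.0, 0.5])  # illustrative Poisson rates

# Moments from (20.7).
mean_theory = A @ lam                 # E(Y | lambda) = A lambda
cov_theory = A @ np.diag(lam) @ A.T   # Cov(Y | lambda) = A Lambda A'

# Monte Carlo check: draw independent latent Poisson counts, then map to Y.
rng = np.random.default_rng(1)
X = rng.poisson(lam, size=(200_000, 3))  # each row is one draw of X
Y = X @ A.T                              # each row is one draw of Y = A X
mean_mc = Y.mean(axis=0)
cov_mc = np.cov(Y, rowvar=False)
```

Simulating Y this way is easy; it is evaluating the pmf (20.6), which requires summing over the inverse image of y, that is costly.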

Karlis and Meligkotsidou (2005) proposed a two-way covariance structured multivariate Poisson distribution, which permits more realistic modeling of multivariate counts in practical applications. This distribution is constructed by setting $A = [A_1\ A_2]$, where $A_1 = I_m$ captures the main effects; $A_2$ captures the two-way covariance effects; $A_2$ is an $m \times [m(m-1)]/2$ binary matrix; each column of $A_2$ has exactly two ones and $(m-2)$ zeros and no duplicate columns exist; and $q = m + [m(m-1)]/2$. Correspondingly, split the parameter $\lambda$ into two parts, that is, $\lambda^{(1)} = (\lambda_1, \ldots, \lambda_m)'$, which corresponds to the m main effects, and $\lambda^{(2)} = (\lambda_{m+1}, \ldots, \lambda_q)'$, which corresponds to the $m(m-1)/2$ pairwise covariance effects. When m = 2, q = 3, let $Y = (Y_1, Y_2)'$, and let $Y_1 = X_1 + X_3$ and $Y_2 = X_2 + X_3$, where $X_i \sim \text{Poisson}(\lambda_i)$, i = 1, 2, 3. The two-way covariance structured bivariate Poisson pmf is

$$MP_2(y|\lambda) = \exp\{-(\lambda_1 + \lambda_2 + \lambda_3)\} \sum_{i=0}^{s} \frac{\lambda_1^{y_1-i}\,\lambda_2^{y_2-i}\,\lambda_3^{i}}{(y_1-i)!\,(y_2-i)!\,i!}, \tag{20.8}$$

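where $s = \min(y_1, y_2)$. When m = 3, q = 6, let $Y = (Y_1, Y_2, Y_3)'$, and let $Y_1 = X_1 + X_4 + X_5$, $Y_2 = X_2 + X_4 + X_6$, and $Y_3 = X_3 + X_5 + X_6$, where $X_i \sim \text{Poisson}(\lambda_i)$ for $i = 1, \ldots, 6$. The two-way covariance structured trivariate Poisson pmf is

$$MP_3(y|\lambda) = \exp\Bigl(-\sum_{i=1}^{6}\lambda_i\Bigr) \sum_{(X_4, X_5, X_6) \in C} \frac{\lambda_1^{y_1-X_4-X_5}\,\lambda_2^{y_2-X_4-X_6}\,\lambda_3^{y_3-X_5-X_6}\,\lambda_4^{X_4}\,\lambda_5^{X_5}\,\lambda_6^{X_6}}{(y_1-X_4-X_5)!\,(y_2-X_4-X_6)!\,(y_3-X_5-X_6)!\,X_4!\,X_5!\,X_6!}, \tag{20.9}$$

where the summation is over the set C such that $C = \{(X_4, X_5, X_6) \in \mathbb{N}^3 : (X_4 + X_5 \le y_1) \cap (X_4 + X_6 \le y_2) \cap (X_5 + X_6 \le y_3)\} \ne \emptyset$. For m = 2 and m = 3, the matrix A has the respective forms

$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 \end{pmatrix}.$$

Under this structure, the variance–covariance matrix of Y given in (20.7) does not accommodate negative associations among the components of Y (Karlis and Meligkotsidou, 2005).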
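The two-way covariance structure can be built programmatically for any m. The Python sketch below is one possible construction: it places the identity in the first m columns and then adds one column per pair of components, in lexicographic order. The function name and the rates are hypothetical. The last lines illustrate why the implied covariance matrix cannot have negative entries: every off-diagonal element is a single nonnegative Poisson rate.

```python
import numpy as np
from itertools import combinations

def two_way_A(m):
    """Return A = [A1 A2] with A1 = I_m and one column of A2 per pair j < k."""
    pairs = list(combinations(range(m), 2))
    A2 = np.zeros((m, len(pairs)), dtype=int)
    for col, (j, k) in enumerate(pairs):
        A2[j, col] = 1  # the shared latent count enters component j
        A2[k, col] = 1  # and component k
    return np.hstack([np.eye(m, dtype=int), A2])

A2d = two_way_A(2)  # shape (2, 3)
A3d = two_way_A(3)  # shape (3, 6)

# Illustrative rates: three main effects followed by three pairwise effects.
lam = np.array([1.0, 2.0, 3.0, 0.4, 0.5, 0.6])
cov = A3d @ np.diag(lam) @ A3d.T  # Cov(Y | lambda) = A Lambda A'
```

With these rates the off-diagonal covariances are 0.4, 0.5, and 0.6, which are the pairwise rates themselves.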

We proposed an approach for calculating the multivariate Poisson pmf which is faster than the recursive scheme proposed by Tsiamyrtzis and Karlis (2004). When m = 2, let $y_1$ and $y_2$ denote the observed counts, and without loss of generality, assume that $y_1 \le y_2$, so that $\min(y_1, y_2) = y_1$. Since $X_3$ is the common term in the definitions of $Y_1$ and $Y_2$, it is straightforward to obtain the set of possible values that $X_3$ can assume, that is, $x_3 = 0, \ldots, \min(y_1, y_2)$, and obtain the corresponding values assumed by $X_1$ and $X_2$ to be, respectively, $x_1 = y_1 - x_3$ and $x_2 = y_2 - x_3$. We have solved for all possible sets of values for the inverse image of y, that is, $x \in g^{-1}(y)$. The pmf for the bivariate Poisson distribution can be calculated using (20.8). When m = 3, without loss of generality, we assume that $y_1 \le y_2 \le y_3$. The possible values for $x_4$ and $x_5$ are in the set $C_1 = (0, \ldots, y_1)$, and the possible values for $x_6$ are in the set $C_2 = (0, \ldots, y_2)$. We have in total L different combinations for $(x_4, x_5, x_6)$, where $L = (\text{length of set } C_1)^2 \times (\text{length of set } C_2) = (y_1 + 1)^2 (y_2 + 1)$. The corresponding values for $X_1$, $X_2$, $X_3$ can be calculated from (20.9). Let $C^*$ denote the set of L different combinations of possible values for all q = 6 independent Poisson variables. Since it is possible that in the set $C^*$, $X_1$, $X_2$, or $X_3$ may assume negative values, a subset of $C^*$ which only contains nonnegative values of $X_1$, $X_2$, and $X_3$ is the inverse image of y. The pmf of the trivariate Poisson distribution is then obtained using (20.9). Computing times for evaluating the multivariate Poisson pmfs are discussed in Hu (2012).
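The enumeration idea described above can be sketched in Python as follows. This is a direct, unoptimized illustration and not the implementation timed in Hu (2012): it loops over the shared latent counts, recovers the remaining latent counts by subtraction, skips combinations that would make any of them negative, and sums the products of univariate Poisson probabilities. The function names are hypothetical, and the assignment of shared terms ($X_3$ for m = 2; $X_4$, $X_5$, $X_6$ for m = 3) follows the matrix A of the two-way covariance structure.

```python
from math import exp, factorial

def pois(x, lam):
    """Univariate Poisson pmf exp(-lam) * lam**x / x!."""
    return exp(-lam) * lam ** x / factorial(x)

def bivariate_pmf(y1, y2, lam):
    """Sum over the common term x3 = 0, ..., min(y1, y2)."""
    l1, l2, l3 = lam
    return sum(pois(y1 - x3, l1) * pois(y2 - x3, l2) * pois(x3, l3)
               for x3 in range(min(y1, y2) + 1))

def trivariate_pmf(y, lam):
    """Sum over (x4, x5, x6), keeping only points in the inverse image of y."""
    y1, y2, y3 = y
    l1, l2, l3, l4, l5, l6 = lam
    total = 0.0
    for x4 in range(min(y1, y2) + 1):          # shared by Y1 and Y2
        for x5 in range(min(y1, y3) + 1):      # shared by Y1 and Y3
            for x6 in range(min(y2, y3) + 1):  # shared by Y2 and Y3
                x1 = y1 - x4 - x5
                x2 = y2 - x4 - x6
                x3 = y3 - x5 - x6
                if min(x1, x2, x3) < 0:
                    continue  # not a valid latent configuration
                total += (pois(x1, l1) * pois(x2, l2) * pois(x3, l3)
                          * pois(x4, l4) * pois(x5, l5) * pois(x6, l6))
    return total

# Illustrative evaluations with hypothetical rates.
p2 = bivariate_pmf(2, 3, (1.0, 1.5, 0.5))
p3 = trivariate_pmf((1, 2, 0), (0.3, 0.4, 0.5, 0.1, 0.2, 0.1))
```

Two simple sanity checks are that the probabilities sum to one over a large enough grid and that each marginal is univariate Poisson, for example $Y_1 \sim \text{Poisson}(\lambda_1 + \lambda_3)$ in the bivariate case.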

Karlis and Meligkotsidou (2007) proposed finite mixtures of multivariate Poisson distributions, which allow for both overdispersion in the marginal distributions and negative correlations, and thus offer a wide range of models for real data applications. The pmf of a finite mixture of H multivariate Poisson distributions with mixing proportions $\pi_1, \ldots, \pi_H$ is given by

$$p(y|\Theta) = \sum_{h=1}^{H} \pi_h\, MP_m(y|\lambda_h),$$

where $\Theta$ denotes the set of parameters $(\lambda_1, \ldots, \lambda_H, \pi_1, \ldots, \pi_{H-1})$. The expectation and covariance of Y conditional on $\lambda$ are

$$E(Y|\lambda) = \sum_{h=1}^{H} \pi_h A\lambda_h; \quad \mathrm{Cov}(Y) = A\left[\sum_{h=1}^{H} \pi_h(\Lambda_h + \lambda_h\lambda_h') - \Bigl(\sum_{h=1}^{H} \pi_h\lambda_h\Bigr)\Bigl(\sum_{h=1}^{H} \pi_h\lambda_h\Bigr)'\right]A',$$

where $\Lambda_h = \mathrm{diag}(\lambda_{1,h}, \ldots, \lambda_{q,h})$.
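A small numerical example shows how mixing can produce a negative association even though each component has only nonnegative covariances. The Python sketch below evaluates the mixture mean and covariance formulas for a hypothetical two-component bivariate mixture; the function name, the rates, and the mixing proportions are illustrative assumptions.

```python
import numpy as np

def mixture_moments(A, lams, pis):
    """Mean and covariance of a finite mixture of multivariate Poissons Y = A X.

    lams has shape (H, q), one rate vector per component; pis has shape (H,).
    """
    A = np.asarray(A, dtype=float)
    lams = np.asarray(lams, dtype=float)
    pis = np.asarray(pis, dtype=float)
    mean_lam = pis @ lams  # sum_h pi_h lambda_h
    # sum_h pi_h (Lambda_h + lambda_h lambda_h')
    second = sum(p * (np.diag(l) + np.outer(l, l)) for p, l in zip(pis, lams))
    cov_lam = second - np.outer(mean_lam, mean_lam)
    return A @ mean_lam, A @ cov_lam @ A.T

A = np.array([[1, 0, 1],
              [0, 1, 1]])
# Component 1 favors the second count, component 2 favors the first.
lams = np.array([[1.0, 8.0, 0.1],
                 [8.0, 1.0, 0.1]])
mean, cov = mixture_moments(A, lams, [0.5, 0.5])
```

Within each component the covariance between the two counts is 0.1, but the mixture covariance is 0.1 − 12.25 = −12.15, because the components pull the two counts in opposite directions.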

20.4.2 HMDM Model Description

A general framework for an HMDM allowing only for positive associations between components of $Y_{i,t}$ is discussed in Ravishanker et al. (2014), by assuming a multivariate Poisson sampling distribution. Here, we extend this general formulation to a mixture of multivariate Poisson sampling distributions. The observation equation and a model for the latent process $\lambda_{j,i,t,h}$ of the extended HMDM are given in the following:

$$p(y_{i,t}|\lambda_{i,t,h}) = \sum_{h=1}^{H} \pi_h\, MP_m(y_{i,t}|\lambda_{i,t,h}),$$

$$\ln \lambda_{j,i,t,h} = B_{j,i,t}'\,\delta_{j,i,t,h} + S_{j,i,t}'\,\eta_{j,h}, \quad j = 1, \ldots, q, \tag{20.10}$$

where $B_{j,i,t} = (B_{j,i,t,1}, \ldots, B_{j,i,t,a_j})'$ is an $a_j$-dimensional vector of exogenous predictors with location-time-varying (dynamic) coefficients $\delta_{j,i,t,h} = (\delta_{j,i,t,h,1}, \ldots, \delta_{j,i,t,h,a_j})'$ and $S_{j,i,t} = (S_{j,i,t,1}, \ldots, S_{j,i,t,b_j})'$ is a $b_j$-dimensional vector of exogenous predictors with static coefficients $\eta_{j,h} = (\eta_{j,h,1}, \ldots, \eta_{j,h,b_j})'$. We assume that the model either includes $\delta_{j,i,t,h,1}$ which represents the location-time-varying intercept, or includes $\eta_{j,h,1}$ which represents the static intercept, that is, either $B_{j,i,t,1} = 1$ or $S_{j,i,t,1} = 1$. A simple formulation of (20.10) could set $a_j = 1$ for $j = 1, \ldots, q$, set $b_j = b > 1$ for $j = 1, \ldots, m$, and $b_j = 0$ for $j = m + 1, \ldots, q$, which implies using only the location-specific and time-dependent intercept to model the Poisson means corresponding to the association portion, and the location–time intercept together with

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 20: Dynamic Models for Time Series of Counts with a Marketing Application (2/5)

Create new playlist

Sign In

Sign Up

Table of Contents for
20: Dynamic Models for Time Series of Counts with a Marketing Application (2/5)