3.2 Response Probability

The first issue in customer acquisition is to model the probability that a prospect is acquired. Hansotia and Wang [5] argued that prospects' response likelihood varies with their profiles and the promotional materials they receive. The authors used a logistic regression to model prospects' probability of response, with prospect profile variables as predictors. Because we can only observe whether a prospect responds or not, logistic regression assumes a latent response variable $y_i^*$ that represents unobserved utility. Thus, $y_i^*$ is usually defined such that

(3.1) $$y_i^* = x_i'\beta + \varepsilon_i, \qquad y_i = \begin{cases} 1 & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0 \end{cases}$$

where $y_i$ = the acquisition of customer i (1 = acquired, 0 = not acquired), $x_i$ = a vector of covariates affecting the acquisition of customer i, $\beta$ = the corresponding parameter vector, and $\varepsilon_i$ = an error term. The probability that the prospect responds is given by

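(3.2) $$\Pr(y_i = 1 \mid x_i) = \Pr(y_i^* > 0 \mid x_i) = \Pr(\varepsilon_i > -x_i'\beta)$$

For a logistic regression, $\varepsilon_i$ has a logistic distribution, with mean 0 and variance equal to $\pi^2/3$. The cumulative distribution function of $\varepsilon_i$ is expressed as

(3.3) $$\Lambda(\varepsilon_i) = \frac{\exp(\varepsilon_i)}{1 + \exp(\varepsilon_i)}$$

and, because the logistic distribution is symmetric around 0,

(3.4) $$\Pr(y_i = 1 \mid x_i) = 1 - \Lambda(-x_i'\beta) = \Lambda(x_i'\beta) = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}$$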

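Writing the estimated value of $\Pr(y_i = 1 \mid x_i)$ as $\hat{p}_i$ when $\beta$ is estimated by $\hat{\beta}$, we have

(3.5) $$\hat{p}_i = \frac{\exp(x_i'\hat{\beta})}{1 + \exp(x_i'\hat{\beta})}$$

Rearranging Equation 3.5 expresses the estimated probability of response $\hat{p}_i$ through the log-odds function, which gives the well-known logistic regression model

(3.6) $$\log\left(\frac{\hat{p}_i}{1 - \hat{p}_i}\right) = x_i'\hat{\beta}$$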
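The link between the response probability and the log-odds can be checked numerically. The following is a minimal sketch, assuming the standard logistic form in which the probability is exp(index) / (1 + exp(index)) and the index is a linear combination of the covariates. The coefficient values and the prospect profile are hypothetical and chosen only for illustration.

```python
import math


def logistic_cdf(t: float) -> float:
    """Standard logistic CDF, exp(t) / (1 + exp(t)), computed in a numerically stable way."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def response_probability(x, beta):
    """Probability of response for covariates x and coefficients beta.

    x[0] is assumed to be 1.0 so that beta[0] acts as the intercept.
    """
    index = sum(xj * bj for xj, bj in zip(x, beta))
    return logistic_cdf(index)


def log_odds(p: float) -> float:
    """Log-odds (logit) of a probability p, the inverse of logistic_cdf."""
    return math.log(p / (1.0 - p))


# Hypothetical estimates and prospect profile (not taken from the chapter's data).
beta_hat = [-1.0, 0.5, 0.02]
prospect = [1.0, 2.0, 10.0]

p_hat = response_probability(prospect, beta_hat)
print(round(p_hat, 4), round(log_odds(p_hat), 4))
```

Applying `log_odds` to the predicted probability returns the linear index, which is why the coefficients of a logit model are interpreted on the log-odds scale.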

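To show how the logit model is estimated, we follow Franses and Paap's (2001) presentation of maximum likelihood estimation (MLE) for the logit model. A brief introduction to MLE is provided in Appendix A. The likelihood function for the logit model is the product of the choice probabilities over the $N$ individuals, that is

(3.7) $$L(\beta) = \prod_{i=1}^{N} \Lambda(x_i'\beta)^{y_i} \bigl(1 - \Lambda(x_i'\beta)\bigr)^{1 - y_i}$$

where $\Lambda(\cdot)$ is the cumulative distribution function of the standardized logistic distribution, and the log-likelihood is

(3.8) $$l(\beta) = \sum_{i=1}^{N} y_i \log \Lambda(x_i'\beta) + \sum_{i=1}^{N} (1 - y_i) \log\bigl(1 - \Lambda(x_i'\beta)\bigr)$$

Owing to the fact that

(3.9) $$\frac{\partial \Lambda(x_i'\beta)}{\partial \beta} = \Lambda(x_i'\beta)\bigl(1 - \Lambda(x_i'\beta)\bigr) x_i$$

the gradient is given by

(3.10) $$G(\beta) = \frac{\partial l(\beta)}{\partial \beta} = \sum_{i=1}^{N} \bigl(y_i - \Lambda(x_i'\beta)\bigr) x_i$$

and the Hessian matrix is given by

(3.11) $$H(\beta) = \frac{\partial^2 l(\beta)}{\partial \beta\, \partial \beta'} = -\sum_{i=1}^{N} \Lambda(x_i'\beta)\bigl(1 - \Lambda(x_i'\beta)\bigr) x_i x_i'$$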
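A gradient and Hessian of this kind are typically used in a Newton-Raphson iteration. The sketch below is one possible implementation, not the procedure used by any particular package. It assumes the standard logit gradient, the sum of (y_i - p_i) x_i, and the standard negative Hessian, the sum of p_i (1 - p_i) x_i x_i'. The data are simulated, and `true_beta`, `fit_logit`, and `solve` are names introduced here for illustration.

```python
import math
import random


def sigmoid(t):
    """Logistic CDF, computed stably for large |t|."""
    return 1.0 / (1.0 + math.exp(-t)) if t >= 0 else math.exp(t) / (1.0 + math.exp(t))


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_logit(X, y, tol=1e-10, max_iter=50):
    """Maximize the logit log-likelihood by Newton-Raphson, starting from beta = 0."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        p = [sigmoid(sum(xj * bj for xj, bj in zip(xi, beta))) for xi in X]
        # Gradient: sum over i of (y_i - p_i) * x_i
        grad = [sum((yi - pi) * xi[j] for xi, yi, pi in zip(X, y, p)) for j in range(k)]
        # Negative Hessian: sum over i of p_i * (1 - p_i) * x_i x_i'
        neg_hess = [[sum(pi * (1.0 - pi) * xi[a] * xi[b] for xi, pi in zip(X, p))
                     for b in range(k)] for a in range(k)]
        # Newton step: beta_new = beta - H^{-1} G = beta + (-H)^{-1} G
        step = solve(neg_hess, grad)
        beta = [bj + sj for bj, sj in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Simulated data under hypothetical true coefficients (intercept, slope).
random.seed(0)
true_beta = [-0.5, 1.0]
X = [[1.0, random.gauss(0.0, 1.0)] for _ in range(2000)]
y = [1 if random.random() < sigmoid(sum(a * b for a, b in zip(xi, true_beta))) else 0
     for xi in X]

beta_hat = fit_logit(X, y)
print([round(b, 3) for b in beta_hat])
```

Because the logit log-likelihood is globally concave, the iteration usually converges in a handful of steps from a zero starting value, provided the data are not perfectly separated.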

By using prospects' profile variables as explanatory variables in the logistic regression model, the authors estimated an equation to predict each prospect's probability of responding to the acquisition offer. The estimated model can then be used to understand prediction accuracy within the same dataset (i.e., in-sample prediction) or score prospects not in the dataset for prediction (i.e., out-of-sample prediction). Prospects then can be sorted based on the predicted probability for the purpose of selection and optimal resource allocation.
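Scoring and ranking prospects can be sketched in a few lines. The example below assumes a fitted logit model with an intercept and two profile variables. The coefficients, prospect identifiers, profiles, and the budget of two contacts are all hypothetical.

```python
import math


def score(profile, beta):
    """Predicted response probability for one prospect (profile[0] = 1.0 is the intercept term)."""
    index = sum(x * b for x, b in zip(profile, beta))
    return 1.0 / (1.0 + math.exp(-index))


# Hypothetical estimated coefficients: intercept, a 0/1 indicator, a continuous variable.
beta_hat = [-2.0, 0.8, 0.03]

# Hypothetical out-of-sample prospects that were not used in estimation.
prospects = {
    "P001": [1.0, 1.0, 20.0],
    "P002": [1.0, 0.0, 50.0],
    "P003": [1.0, 1.0, 5.0],
    "P004": [1.0, 0.0, 10.0],
}

# Sort prospects from highest to lowest predicted probability of response.
ranked = sorted(prospects, key=lambda pid: score(prospects[pid], beta_hat), reverse=True)

# Select as many prospects as the (hypothetical) budget allows.
budget = 2
selected = ranked[:budget]

for pid in ranked:
    print(pid, round(score(prospects[pid], beta_hat), 3))
```

Ranking by predicted probability is the simplest selection rule. A rule that also weighs expected profit per prospect would need additional inputs that are outside this sketch.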

Wangenheim and Bayón [11] argued that customer satisfaction influences the number of word-of-mouth (WOM) referrals, which in turn affects customer acquisition. They asserted that receiving a WOM referral has a positive marginal effect on a prospect's likelihood of purchasing. They addressed two questions related to customer acquisition: first, whether prospects' purchase likelihood is a function of WOM referrals; and second, whether the characteristics of the WOM referral source, including source expertise and similarity, influence the probability that a received WOM referral induces a purchase. Because the authors conducted their empirical studies in the energy market, where the number of competitors is relatively small, they used customers' switching behavior as the dependent variable in modeling customer acquisition. To answer the first question, they included only the variable indicating receipt of a WOM referral as the independent variable and used a binary logistic regression model to examine its effect on the dependent variable:

(3.12) $$\Pr(y_i = 1 \mid x_i) = \frac{\exp(\beta_0 + \beta_1 x_i)}{1 + \exp(\beta_0 + \beta_1 x_i)}$$

where $y_i$ indicated the switching behavior, and the binary independent variable $x_i$ indicated whether a WOM referral had been received. Thus, the effect of a WOM referral on switching was obtained by subtracting the switching probability of a customer who had not received WOM, $\Pr(y_i = 1 \mid x_i = 0)$, from that of a customer who had received WOM, $\Pr(y_i = 1 \mid x_i = 1)$. To answer the second question, the authors included only the prospects who had received a WOM referral and again used a logit model to estimate whether switching behavior is a function of the characteristics of the WOM referral source, namely source expertise and similarity, which were included as independent variables, conditioning on receiving a WOM referral:

(3.13) $$\Pr(y_i = 1 \mid x_i = 1, s_i) = \frac{\exp(\gamma_0 + s_i'\gamma)}{1 + \exp(\gamma_0 + s_i'\gamma)}$$

where $s_i$ indicated the two independent variables, source expertise and source similarity.
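The effect of a WOM referral described for the first question is a difference between two predicted probabilities. A minimal sketch follows, assuming a logit model with an intercept `b0` and a single coefficient `b1` on the 0/1 WOM indicator. Both values are hypothetical and are not estimates from the study.

```python
import math


def logit_prob(index: float) -> float:
    """Logit probability for a given linear index."""
    return 1.0 / (1.0 + math.exp(-index))


def wom_effect(b0: float, b1: float) -> float:
    """Switching probability with WOM received (x = 1) minus without WOM (x = 0)."""
    p_with_wom = logit_prob(b0 + b1)
    p_without_wom = logit_prob(b0)
    return p_with_wom - p_without_wom


# Hypothetical coefficients, for illustration only.
effect = wom_effect(b0=-2.0, b1=1.0)
print(round(effect, 4))
```

The difference depends on the intercept as well as on `b1`, so the same coefficient implies a different change in probability at different baseline switching rates.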

A binary logit model is not the only model that can be used for response probabilities. The type of model depends on the distribution assumed for the error term. Referring back to Equation 3.1, if the error term $\varepsilon_i$ has a logistic distribution, the corresponding model is a logit model; if the error term has a standard normal distribution, the corresponding model is a probit model. In the latter case, the cumulative distribution function of $\varepsilon_i$ is

(3.14) $$\Phi(x_i'\beta) = \int_{-\infty}^{x_i'\beta} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) dz$$

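The probit model is also estimated by the MLE method. Following Franses and Paap (2001), the relevant likelihood function is given by

(3.15) $$L(\beta) = \prod_{i=1}^{N} \Phi(x_i'\beta)^{y_i} \bigl(1 - \Phi(x_i'\beta)\bigr)^{1 - y_i}$$

and the corresponding log-likelihood function is

(3.16) $$l(\beta) = \sum_{i=1}^{N} y_i \log \Phi(x_i'\beta) + \sum_{i=1}^{N} (1 - y_i) \log\bigl(1 - \Phi(x_i'\beta)\bigr)$$

Differentiating $l(\beta)$ with respect to $\beta$ gives

(3.17) $$G(\beta) = \frac{\partial l(\beta)}{\partial \beta} = \sum_{i=1}^{N} \lambda_i x_i, \qquad \lambda_i = \frac{\bigl(y_i - \Phi(x_i'\beta)\bigr)\,\phi(x_i'\beta)}{\Phi(x_i'\beta)\bigl(1 - \Phi(x_i'\beta)\bigr)}$$

and the Hessian matrix

(3.18) $$H(\beta) = \frac{\partial^2 l(\beta)}{\partial \beta\, \partial \beta'} = -\sum_{i=1}^{N} \lambda_i \bigl(\lambda_i + x_i'\beta\bigr) x_i x_i'$$

where $\phi(\cdot)$ is the standard normal density function.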
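The probit log-likelihood and its gradient can be checked against each other numerically. The sketch below assumes the standard probit gradient, in which each observation contributes (y_i - P_i) * phi_i / (P_i * (1 - P_i)) * x_i, with P_i the normal CDF and phi_i the normal density at the linear index. It compares that analytic gradient with a central finite difference. The tiny dataset and the evaluation point `beta` are hypothetical.

```python
import math


def Phi(t: float) -> float:
    """Standard normal CDF via the error function (no closed form exists)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def phi(t: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def probit_loglik(beta, X, y):
    """Probit log-likelihood: sum of y*log(P) + (1 - y)*log(1 - P)."""
    total = 0.0
    for xi, yi in zip(X, y):
        P = Phi(sum(a * b for a, b in zip(xi, beta)))
        total += yi * math.log(P) + (1 - yi) * math.log(1.0 - P)
    return total


def probit_gradient(beta, X, y):
    """Analytic gradient of the probit log-likelihood."""
    grad = [0.0] * len(beta)
    for xi, yi in zip(X, y):
        t = sum(a * b for a, b in zip(xi, beta))
        P = Phi(t)
        weight = (yi - P) * phi(t) / (P * (1.0 - P))
        for j, xij in enumerate(xi):
            grad[j] += weight * xij
    return grad


def numeric_gradient(beta, X, y, h=1e-6):
    """Central finite-difference approximation to the gradient."""
    grad = []
    for j in range(len(beta)):
        up, dn = beta[:], beta[:]
        up[j] += h
        dn[j] -= h
        grad.append((probit_loglik(up, X, y) - probit_loglik(dn, X, y)) / (2.0 * h))
    return grad


# Hypothetical data: a constant and one covariate, five observations.
X = [[1.0, -1.2], [1.0, 0.3], [1.0, 0.8], [1.0, 2.0], [1.0, -0.5]]
y = [0, 0, 1, 1, 1]
beta = [0.1, 0.4]

analytic = probit_gradient(beta, X, y)
numeric = numeric_gradient(beta, X, y)
print(analytic, numeric)
```

A check of this kind is a common safeguard before handing a hand-coded gradient to an optimizer.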

Like the logit model, the probit model is often used to model binary response variables, especially when a two-stage model is to be estimated. The probit model is more complicated to estimate because the normal cumulative distribution function has no closed-form expression, but it is theoretically appealing in a two-stage modeling framework because its error term is normally distributed. This makes it useful when the second stage is a least squares regression with a normally distributed error term. For example, Reinartz et al. [7] linked customer acquisition and relationship duration using a probit two-stage least squares model. These authors used a probit model to describe the selection, or acquisition, process shown in Equation 3.1 and predicted each prospect's response probability. The estimated probability was then included in the duration model as an independent variable to account for the interaction between acquisition and duration.

Because prospects' response likelihood is not always observed behaviorally, attitudinal propensity scales are also used to measure prospects' response intention. One might argue that attitudinal propensity measures are not perfectly correlated with behavior, but researchers sometimes still use attitudinal dependent variables because attitude is usually positively related to behavior. Lix et al. [1] used two dichotomous dependent variables, one indicating whether the prospect joined the membership (attitudinal) and the other indicating whether the prospect used direct marketing channels (behavioral). The authors also used two interval-scale dependent variables indicating prospects' attitudinal propensity toward pro-environment and US products. They included 150 independent variables containing prospects' demographic and lifestyle information in the regression and log-linear analyses. A brief introduction to log-linear analysis is provided in Appendix B. Like Hansotia and Wang [5], they used the estimated model to score samples in a holdout dataset and measured the effectiveness of their estimated models.

Whether a logit or a probit framework is used to model response probability, the output of the model is useful for determining which prospects the firm is likely to acquire. In addition, the results of the binary choice model identify the drivers of customer acquisition, which can help managers make decisions in future customer acquisition campaigns.

3.2.1 Empirical Example: Response Probability

One of the key questions we want to answer with regard to customer acquisition is whether we can determine which future prospects have the highest likelihood of adoption. To do this we first need to know which past prospects were acquired and which were not. In the dataset provided for this chapter we have a binary variable which identifies whether or not a prospect was acquired by the firm (and hence became a customer) and a set of drivers which are likely to help explain a customer's decision to adopt. At the end of this example you should be able to identify the following:

1. The drivers of customer acquisition likelihood.
2. The parameter estimates from the response probability model.

A B2B firm wants to improve the acquisition rate of customers and reduce the acquisition spending on prospects by better understanding which prospects are most likely to adopt. A random sample of 500 prospects (some of whom became customers) was taken from a recent prospect list. The information we need for our model includes the following list of variables:

Dependent variable
    Acquisition       1 if the prospect was acquired, 0 otherwise
Independent variables
    Acq_Expense       Dollars spent on marketing efforts to try and acquire that prospect
    Acq_Expense_SQ    Square of dollars spent on marketing efforts to try and acquire that prospect
    Industry          1 if the prospect is in the B2B industry, 0 otherwise
    Revenue           Annual sales revenue of the prospect's firm (in millions of dollars)
    Employees         Number of employees in the prospect's firm

In this case, we have a binary dependent variable (Acquisition) which tells us whether the prospect adopted (= 1) or did not adopt (= 0). We also have five independent variables we believe will be drivers of adoption. First, we have how many dollars the firm spent on each prospect (Acq_Expense) and the squared value of that variable (Acq_Expense_SQ). We use both the linear and squared terms because we expect diminishing returns: each additional dollar spent on the acquisition effort for a given prospect should add less than the dollar before it. Second, since the focal firm of this example is a B2B firm, the other three variables are firmographic variables of the prospects. These include whether the prospect sells to B2B (= 1) or B2C (= 0) customers (Industry), how much the prospect firm makes in annual revenue, in millions of dollars (Revenue), and how many employees the prospect firm has (Employees).

First, we need to model the probability that a prospect will adopt. Since our dependent variable (Acquisition) is binary, we select a logistic regression using the modeling framework as described earlier in the chapter (see Equation 3.1). We could also select a probit model and in general achieve the same results. In this case the y variable is Acquisition and the x variables represent the five independent variables in our database. When we run the logistic regression we get the following result:

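[Table: logistic regression output with parameter estimates and significance levels for Acq_Expense, Acq_Expense_SQ, Industry, Revenue, and Employees; table image not recovered]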

As we can see from the results, four of the five independent variables are significant at the 5% level or better, with only Industry being statistically non-significant. First, this means that acquisition expense has a positive but diminishing effect (Acq_Expense > 0 and Acq_Expense_SQ < 0) on acquisition likelihood. Second, it suggests that whether a prospect is B2B (vs. B2C) does not matter for acquisition likelihood, all else being equal. Third, the higher the Revenue of the prospect, the more likely the prospect is to be acquired. And, finally, the more Employees the prospect has, the more likely the prospect is to be acquired.

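It is also important to understand exactly how changes in the drivers of acquisition likelihood lead to increases or decreases in acquisition likelihood. To do this we determine the odds ratio for each of the parameter estimates. Because a logistic regression is linear in the log-odds, exponentiating a parameter estimate gives the odds ratio for a 1-unit increase in the corresponding variable. For example, for Revenue = x,

$$\frac{P(x)}{1 - P(x)} = \exp\bigl(c + \beta_{\text{Revenue}}\, x\bigr)$$

where $P(x)$ is the acquisition probability at Revenue = x and $c$ collects the intercept and the contributions of the other variables, which are held fixed; and, for Revenue = x + 1,

$$\frac{P(x+1)}{1 - P(x+1)} = \exp\bigl(c + \beta_{\text{Revenue}}\,(x + 1)\bigr)$$

By dividing the second equation by the first we get

$$\frac{P(x+1)/\bigl(1 - P(x+1)\bigr)}{P(x)/\bigl(1 - P(x)\bigr)} = \frac{\exp\bigl(c + \beta_{\text{Revenue}}\,(x + 1)\bigr)}{\exp\bigl(c + \beta_{\text{Revenue}}\, x\bigr)}$$

We then simplify the equation to get the following:

$$\frac{P(x+1)/\bigl(1 - P(x+1)\bigr)}{P(x)/\bigl(1 - P(x)\bigr)} = \exp\bigl(\beta_{\text{Revenue}}\bigr)$$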

When we compute the odds ratio for each of the statistically significant variables, we get the following results for a 1-unit increase in the independent variable.

Variable       Odds ratio
Acq_Expense    exp(0.06696 − 0.00008*Acq_Expense)
Revenue        1.033
Employees      1.005

We gain the following insights from the odds ratios. With regard to Acq_Expense, we see that the odds ratio depends on the level of Acq_Expense. This is because we include both the level and squared terms for Acq_Expense. For example, if we usually spend $500 on a given prospect, by spending $501 the odds of acquisition are multiplied by exp(0.06696 − 0.00008*500) = exp(0.02696) = 1.027. This means that by increasing our spending from $500 to $501, we should see the odds of acquisition increase by 2.7%. It is also important to note that this will vary depending on the initial level of Acq_Expense. With regard to Revenue, we see that for each increase in Revenue by $1 million the odds of acquisition should increase by 3.3%. Finally, with regard to Employees, we see that for each increase in Employees by 1 person the odds of acquisition should increase by 0.5%.
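The arithmetic in this paragraph can be reproduced with a short script. The sketch below uses only the two numbers that appear in the Acq_Expense expression (0.06696 and 0.00008) and the Revenue ratio of 1.033. Two parts are illustrations added here and rest on assumptions: the spending level at which the Acq_Expense expression equals 1 is simply what the tabulated expression implies, and the baseline acquisition probability of 0.5 is hypothetical.

```python
import math

# Constants taken from the Acq_Expense expression in the table above.
B_ACQ = 0.06696
B_ACQ_SLOPE = 0.00008


def acq_expense_odds_ratio(spend: float) -> float:
    """Odds ratio for one more dollar of acquisition spending at a given spending level."""
    return math.exp(B_ACQ - B_ACQ_SLOPE * spend)


def apply_odds_ratio(p: float, odds_ratio: float) -> float:
    """Turn a baseline probability p into the probability after multiplying its odds."""
    odds = p / (1.0 - p) * odds_ratio
    return odds / (1.0 + odds)


# Going from $500 to $501: exp(0.06696 - 0.00008 * 500) = exp(0.02696)
or_at_500 = acq_expense_odds_ratio(500.0)

# Spending level at which the expression equals 1, i.e. where the exponent is zero.
break_even_spend = B_ACQ / B_ACQ_SLOPE

# Hypothetical baseline probability, used only to show the change in probability terms.
baseline_p = 0.5
p_after_revenue = apply_odds_ratio(baseline_p, 1.033)

print(round(or_at_500, 3), round(break_even_spend, 1), round(p_after_revenue, 4))
```

The last step shows why the scale matters: multiplying the odds by 1.033 at a baseline probability of 0.5 moves the probability to about 0.508, and the change in probability would differ at other baselines.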

As a result we now know how changes in acquisition expense and changes in prospect characteristics are likely to either increase or decrease our likelihood of acquisition. This information can provide significant insights to managers who are charged with determining the optimal amount of resources to spend on acquisition.

3.2.2 How Do You Implement It?

To implement the logistic regression in this example we used PROC LOGISTIC in SAS. Although we used SAS to estimate the model, many other statistical packages can estimate a logistic regression, including (but not limited to) SPSS, MATLAB, and GAUSS.
