Abstract
We present a Bayesian framework for estimating customer lifetime value (CLV) and customer equity (CE) from the purchasing behavior that can be deduced from market surveys. The framework systematically addresses the challenges that arise when the future value of customers is estimated from survey data. The scarcity of survey data and the sampling variance are countered by using prior information and by quantifying the uncertainty of the CE and CLV estimates with posterior distributions. Furthermore, the information that the survey data contain on the purchase behavior of competitors' customers is integrated into the framework. The approach is directly applicable in domains where a customer relationship can be regarded as monogamous. As an example of the use of the framework, we analyze a consumer survey on mobile phones carried out in Finland in February 2013. The survey data contain consumer-reported information on the current and previous brand of the phone and on the times of the last two purchases.
References
Abe, M. (2009). Counting your customers one by one: A hierarchical Bayes extension to the Pareto/NBD model. Marketing Science, 28(3), 541–553.
Allison, P.D. (1985). Survival analysis of backward recurrence times. Journal of the American Statistical Association, 80(390), 315–322.
Bauer, H., Hammerschmidt, M., Braehler, M. (2003). Customer lifetime value concept and its contribution to corporate valuation. Yearbook of Marketing and Consumer Research, 1(1), 49–67.
Bejou, D., Keiningham, T., Aksoy, L. (Eds.) (2006). Customer lifetime value – reshaping the way we manage to maximize profit: Haworth Press.
Blattberg, R.C., Byung-Do, K., Neslin, S.A. (Eds.) (2008). Database marketing: analyzing and managing customers: Springer.
Borle, S., Singh, S.S., Jain, D.C. (2008). Customer lifetime value measurement. Management Science, 54(1), 100–112.
Brooks, S.P., & Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7, 434–455.
Fader, P., & Hardie, B. (2010). Customer-base valuation in a contractual setting: The perils of ignoring heterogeneity. Marketing Science, 29(1).
Fader, P., Hardie, B., Lee, K.L. (2005a). Counting Your Customers the easy way: An alternative to the Pareto/NBD model. Marketing Science, 24(2), 275–284.
Fader, P., Hardie, B., Lee, K.L. (2005b). RFM and CLV: Using iso-value curves for customer base analysis. Journal of Marketing Research, XLII(November), 415–430.
Finnish Communications Regulatory Authority (FICORA) (2013). Toimialakatsaus 2012 (In Finnish). https://www.viestintavirasto.fi/attachments/Toimialakatsaus2012.pdf.
Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models. Bayesian Analysis, 1(3), 515–533.
Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., Rubin, D.B. (2013). Bayesian data analysis, 3rd edn. Boca Raton, FL: Chapman & Hall/CRC.
Gupta, S., & Lehmann, D. (2005). Managing customers as investments: the strategic value of customers in the long run. Upper Saddle River, NJ: Wharton School Publishing.
Herniter, J. (1971). A probabilistic market model of purchase timing and brand selection. Management Science, 18(4-Part-II), P–102.
Hubbard, D.W. (2010). How to measure anything: finding the value of intangibles in business, 2nd edn. Hoboken, NJ: Wiley.
Jen, L., Chou, C.-H., Allenby, G.M. (2009). The importance of modeling temporal dependence of timing and quantity in direct marketing. Journal of Marketing Research, 46(4), 482–493.
Kumar, V., & George, M. (2007). Measuring and maximizing customer equity: a critical analysis. Journal of the Academy of Marketing Science, 35(4), 157–171.
Kumar, V., & Petersen, J.A. (2005). Using a customer-level marketing strategy to enhance firm’s performance. Journal of the Academy of Marketing Science, 33(4), 505–519.
Kumar, V., Venkatesan, R., Bohling, T., Beckmann, D. (2008). The power of CLV: Managing customer lifetime value at IBM. Marketing Science, 27(4), 585–599.
Lunn, D., Spiegelhalter, D., Thomas, A., Best, N. (2009). The BUGS project: Evolution, critique and future directions (with discussion). Statistics in Medicine, 28, 3049–3082.
Nagano, S., Ichikawa, Y., Takaya, N., Uchiyama, T., Abe, M. (2013). Nonparametric hierarchal Bayesian modeling in non-contractual heterogeneous survival data. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 668–676). ACM.
Pfeifer, P. (2011). On estimating current-customer equity using company summary data. Journal of Interactive Marketing, 25(1), 1–14.
R Core Team (2012). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0.
Rossi, P.E., Allenby, G.M., McCulloch, R. (2005). Bayesian statistics and marketing. Hoboken, NJ: Wiley.
Rust, R.T., Zeithaml, V.A., Lemon, K.N. (2001). Driving customer equity: how customer lifetime value is reshaping corporate strategy. New York: Simon and Schuster.
Schmittlein, D., Morrison, D., Colombo, R. (1987). Counting your customers: Who they are and what will they do next? Management Science, 33(1), 1–24.
Schmittlein, D.C., Bemmaor, A.C., Morrison, D.G. (1985). Technical note – why does the NBD model work? Robustness in representing product purchases, brand purchases and imperfectly recorded purchases. Marketing Science, 4(3), 255–266.
Singh, S.S., Borle, S., Jain, D.C. (2009). A generalized framework for estimating customer lifetime value when customer lifetimes are not observed. Quantitative Marketing and Economics, 7(2), 181–205.
Statistics Finland (2012). Statistical Yearbook of Finland 2012.
Sturtz, S., Ligges, U., Gelman, A. (2005). R2WinBUGS: A package for running WinBUGS from R. Journal of Statistical Software, 12(3), 1–16.
Venkatesan, R., & Kumar, V. (2004). A customer lifetime value framework for customer selection and resource allocation strategy. Journal of Marketing, 68(10), 106–125.
Vilcassim, N.J., & Jain, D.C. (1991). Modeling purchase-timing and brand-switching behavior incorporating explanatory variables and unobserved heterogeneity. Journal of Marketing Research, 29–41.
Appendices
Appendix A: Model for the simulation example
The estimation method is demonstrated with simulated data from a heterogeneous semi-Markov brand switching model which is defined as follows:
1. The number of transactions made by an individual \(i\) follows a Poisson process with the transaction rate \(\lambda_{i}\).
2. The transaction rate \(\lambda_{i}\) follows a Gamma distribution with the probability density function
$$f(\lambda_{i} \mid \gamma, \delta) = \frac{\delta^{\gamma}}{\Gamma(\gamma)} \lambda_{i}^{\gamma-1} e^{-\delta \lambda_{i}}, \quad \lambda_{i}>0,$$
where \(\gamma > 0\) is the shape parameter, \(\delta > 0\) is the rate parameter and \(\Gamma\) stands for the Gamma function.
3. After any transaction, an individual may change the brand with a probability that depends on the current brand. For the focal company, the probability of repurchase for individual \(i\) is \(p_{i}\). The probability of acquisition, i.e. a change from any competitor to the focal company, is \(1-q_{i}\) for individual \(i\).
4. The repurchase probability \(p_{i}\) follows a Beta\((\alpha_{p},\beta_{p})\) distribution and the competitor repurchase probability \(q_{i}\) follows a Beta\((\alpha_{q},\beta_{q})\) distribution.
The model is related to the long modeling tradition in the marketing literature (Herniter 1971; Schmittlein et al. 1985; Vilcassim and Jain 1991; Fader et al. 2005a). An individual is either in a state where she has made the last transaction with the focal company or in a ‘competitor state’, where she has made the last transaction with one of the competitors of the focal company. The transaction rate is assumed to be the same in both states and independent of the transition probabilities.
Full purchase histories are generated for the population, from which small survey samples are drawn. The CE estimated from each sample is compared with the CE of the population. The procedure is repeated for a number of survey samples to obtain information on the sampling variation.
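The generation of survey-type observations from the model can be sketched in Python as follows. This is only an illustrative sketch, not the code used in the paper: the function names, the default parameter values (taken from the simulation setup), the 30-year backward window and the choice to start the chain from its equilibrium distribution are assumptions made for the example.

```python
import random


def simulate_survey_record(rng, gamma=3.0, delta=10.0,
                           a_p=4.0, b_p=6.0, a_q=4.0, b_q=6.0,
                           years_back=30.0):
    """Simulate one individual's past purchases and return the survey variables.

    Returns None when fewer than two purchases fall inside the window.
    """
    # random.gammavariate takes shape and SCALE, so the rate delta is inverted.
    lam = rng.gammavariate(gamma, 1.0 / delta)
    p = rng.betavariate(a_p, b_p)   # repurchase probability, focal company
    q = rng.betavariate(a_q, b_q)   # repurchase probability, competitors

    # Assumption of this sketch: the chain starts from its equilibrium.
    state = 1 if rng.random() < (1.0 - q) / (2.0 - p - q) else 0

    t, times, states = 0.0, [], []
    while True:
        t += rng.expovariate(lam)   # exponential gap between purchases
        if t > years_back:
            break
        stay = p if state == 1 else q
        if rng.random() >= stay:    # switch between focal and competitor
            state = 1 - state
        times.append(t)
        states.append(state)

    if len(times) < 2:
        return None
    return {
        "S0": states[-1],                  # current state
        "S1": states[-2],                  # previous state
        "T": times[-1] - times[-2],        # time between the last two purchases
        "T_star": years_back - times[-1],  # time since the latest purchase
    }


def draw_survey_sample(n, seed=0):
    """Collect n records, skipping individuals with fewer than two purchases."""
    rng = random.Random(seed)
    sample = []
    while len(sample) < n:
        record = simulate_survey_record(rng)
        if record is not None:
            sample.append(record)
    return sample
```

Calling `draw_survey_sample(100, seed=1)` would give a list of 100 dictionaries with the four survey variables, which mirrors the small amount of information available per respondent.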
The survey data collected for the individuals \(i = 1,2,\ldots,n\) are the current state \(S^{(0)}_{i}\) (1 for the focal company and 0 for the competitors), the previous state \(S^{(-1)}_{i}\), the time between the last two purchases \(T_{i}\) and the time from the latest transaction \(T^{*}_{i}\). As the transactions follow a Poisson process, the time between the purchases \(T_{i}\) follows an exponential distribution with rate \(\lambda_{i}\). The time from the latest purchase to the day of the survey \(T^{*}_{i}\) is an observation from the same exponential distribution because the Poisson process is memoryless. For the current state it holds
$$P\left(S^{(0)}_{i}=1 \mid S^{(-1)}_{i}, p_{i}, q_{i}\right) = p_{i} S^{(-1)}_{i} + (1-q_{i})\left(1-S^{(-1)}_{i}\right).$$
For the previous state it holds
$$P\left(S^{(-1)}_{i}=1 \mid S^{(-2)}_{i}, p_{i}, q_{i}\right) = p_{i} S^{(-2)}_{i} + (1-q_{i})\left(1-S^{(-2)}_{i}\right),$$
where \(S^{(-2)}_{i}\) is the state before the previous. For the state \(S^{(-2)}_{i}\) there are no observations but the formula
$$P\left(S^{(-2)}_{i}=1 \mid p_{i}, q_{i}\right) = \frac{1-q_{i}}{2-p_{i}-q_{i}}$$
follows from the equilibrium state of the Markov chain characterizing the brand switching.
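The equilibrium probability of the focal state can be checked numerically. In the sketch below the function names are hypothetical; the only inputs are the repurchase probabilities \(p\) and \(q\) of the two-state chain described above.

```python
def stationary_focal_share(p, q):
    """Equilibrium probability of the focal state: (1 - q) / (2 - p - q)."""
    return (1.0 - q) / (2.0 - p - q)


def step(share, p, q):
    """Probability of the focal state after one more transaction."""
    return share * p + (1.0 - share) * (1.0 - q)


def iterate_share(start, p, q, n_steps):
    """Apply the transition n_steps times starting from the share `start`."""
    share = start
    for _ in range(n_steps):
        share = step(share, p, q)
    return share


if __name__ == "__main__":
    p, q = 0.7, 0.8
    print(stationary_focal_share(p, q))   # equilibrium share
    print(iterate_share(0.0, p, q, 60))   # converges to the same value
```

The equilibrium value is a fixed point of `step`, and repeated transitions from any starting share approach it because the second eigenvalue of the transition matrix, \(p+q-1\), is smaller than one in absolute value.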
A.1 Simulation setup
We first simulate the complete purchase histories of 100,000 individuals who are divided between the focal company and the competitors according to the market shares. Then, using the proposed approach, we estimate the average CLV and the CE from a small ‘survey sample’ of the simulated data and compare the result with the true CLV and CE. The parameters used in the simulation are \(\gamma = 3\), \(\delta = 10\), \(\alpha_{p}=4\), \(\beta_{p}=6\), \(\alpha_{q}=4\) and \(\beta_{q}=6\), and the intensity is defined as the number of transactions per year. The value of a purchase is assumed to be 100 euros. The purchase histories are generated for 40 years forward and 30 years backward from the time of the survey. The true CLVs and CEs are calculated using the whole population and the generated purchase histories for the forthcoming 40 years. With an annual discount rate of 10 %, this leads to a CE of 10.0 million euros for the population of 100,000 individuals and an average CLV of 100 euros. For the current customers of the focal company, the average CLV equals 120 euros and for the current customers of competitors, the average CLV equals 91 euros. These numbers are compared with the estimates from small survey samples drawn from the same population. For each individual selected for the sample, only the variables \(S^{(0)}_{i}\), \(S^{(-1)}_{i}\), \(T_{i}\) and \(T^{*}_{i}\) are recorded at the time of the survey, which means that the amount of data in the sample is very small compared with the full future purchase histories of 100,000 individuals. To illustrate the effect of the sample size on the accuracy of the estimates, the sample sizes are varied from 100 to 1000.
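The calculation of a true CLV from a generated future purchase history can be sketched as below. The function names are hypothetical, and the sketch assumes that a purchase at time \(t\) (in years) is discounted by the factor \((1+r)^{-t}\) with \(r = 0.10\); the text specifies only the annual rate, so the exact discounting convention is an assumption.

```python
def discounted_value(purchase_times, states, value=100.0, annual_rate=0.10):
    """Present value of the purchases made with the focal company.

    purchase_times: times of future purchases in years from the survey
    states: 1 if the purchase is made with the focal company, 0 otherwise
    """
    total = 0.0
    for t, s in zip(purchase_times, states):
        if s == 1:
            # Assumed convention: discount factor (1 + r) ** (-t)
            total += value / (1.0 + annual_rate) ** t
    return total


def customer_equity(histories, value=100.0, annual_rate=0.10):
    """Sum of the individual CLVs over (purchase_times, states) pairs."""
    return sum(discounted_value(times, states, value, annual_rate)
               for times, states in histories)


if __name__ == "__main__":
    # One purchase with the focal company after 1 year, one with a competitor
    # after 2 years: only the first contributes.
    print(discounted_value([1.0, 2.0], [1, 0]))
```

Under this convention the CE of the population is simply the sum of the individual discounted values, and the average CLV is the CE divided by the number of individuals.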
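The model and the chosen prior distributions can be written as follows:
$$\begin{aligned}
T_{i} \mid \lambda_{i} &\sim \text{Exp}(\lambda_{i}), \qquad T^{*}_{i} \mid \lambda_{i} \sim \text{Exp}(\lambda_{i}),\\
S^{(-2)}_{i} \mid p_{i}, q_{i} &\sim \text{Bernoulli}\left(\frac{1-q_{i}}{2-p_{i}-q_{i}}\right),\\
S^{(-1)}_{i} \mid S^{(-2)}_{i}, p_{i}, q_{i} &\sim \text{Bernoulli}\left(p_{i} S^{(-2)}_{i} + (1-q_{i})(1-S^{(-2)}_{i})\right),\\
S^{(0)}_{i} \mid S^{(-1)}_{i}, p_{i}, q_{i} &\sim \text{Bernoulli}\left(p_{i} S^{(-1)}_{i} + (1-q_{i})(1-S^{(-1)}_{i})\right),\\
\lambda_{i} &\sim \text{Gamma}(\gamma, \delta), \qquad p_{i} \sim \text{Beta}(\alpha_{p}, \beta_{p}), \qquad q_{i} \sim \text{Beta}(\alpha_{q}, \beta_{q}),\\
\gamma &= m_{\lambda}^{2}/v_{\lambda}, \qquad \delta = m_{\lambda}/v_{\lambda}, \qquad m_{\lambda} \sim \text{Gamma}(2,1), \qquad v_{\lambda} \sim \text{Gamma}(2,1),\\
\alpha_{p} &= k_{p} m_{p}, \qquad \beta_{p} = k_{p}(1-m_{p}), \qquad m_{p} \sim \text{Uniform}(0,1), \qquad k_{p} \sim \text{Gamma}(10,1),\\
\alpha_{q} &= k_{q} m_{q}, \qquad \beta_{q} = k_{q}(1-m_{q}), \qquad m_{q} \sim \text{Uniform}(0,1), \qquad k_{q} \sim \text{Gamma}(10,1).
\end{aligned}$$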
For the parameters \(\gamma\), \(\delta\), \(\alpha_{p}\), \(\beta_{p}\), \(\alpha_{q}\) and \(\beta_{q}\) we use weakly informative prior distributions (Gelman 2006). The parameters \(\gamma\) and \(\delta\) are the shape and the rate of the Gamma distribution from which the values of the intensity \(\lambda_{i}\) are drawn. We define \(\gamma =m_{\lambda }^{2}/v_{\lambda }\) and \(\delta = m_{\lambda}/v_{\lambda}\) where \(m_{\lambda} \sim \text{Gamma}(2,1)\) is the mean of the intensity distribution and \(v_{\lambda} \sim \text{Gamma}(2,1)\) is the variance of the intensity distribution. In other words, the expected mean of the intensity distribution is 2 transactions per year and the expected standard deviation of the intensity distribution is about 1.4 transactions per year, but there is considerable uncertainty about the intensity distribution. With these priors, the 95 % Bayes interval for the purchase intervals in the population is \((10^{-2}, 8\times 10^{11})\), indicating that the priors are rather uninformative.
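The mapping between the mean and variance of the intensity distribution and the shape and rate of the Gamma distribution can be written out as a small sketch. The function names and the optional `eps` argument are hypothetical; `eps` only mimics the small constant that the BUGS code adds to the variance to avoid division by zero.

```python
def gamma_shape_rate(mean, variance, eps=0.0):
    """Shape and rate of a Gamma distribution with the given mean and variance."""
    shape = mean * mean / (variance + eps)
    rate = mean / (variance + eps)
    return shape, rate


def gamma_mean_variance(shape, rate):
    """Mean and variance of a Gamma(shape, rate) distribution."""
    return shape / rate, shape / rate ** 2


if __name__ == "__main__":
    # The simulation values gamma = 3 and delta = 10 correspond to a mean
    # intensity of 0.3 transactions per year with variance 0.03.
    print(gamma_mean_variance(3.0, 10.0))
    print(gamma_shape_rate(0.3, 0.03))
```

Placing the priors on the mean and the variance rather than directly on the shape and the rate makes the prior statements easier to interpret in terms of purchases per year.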
The parameters \(\alpha_{p}\) and \(\beta_{p}\) describe the Beta distribution from which the individual repurchase probabilities are drawn, and the parameters \(\alpha_{q}\) and \(\beta_{q}\) describe the Beta distribution from which the individual competitor repurchase probabilities are drawn. We define \(\alpha_{p} = k_{p} m_{p}\) and \(\beta_{p} = k_{p}(1-m_{p})\) where \(m_{p} \sim \text{Uniform}(0,1)\) is the expected average repurchase probability and \(k_{p} \sim \text{Gamma}(10,1)\) controls the variation of the repurchase probabilities in the population. Similarly, we define \(\alpha_{q} = k_{q} m_{q}\) and \(\beta_{q} = k_{q}(1-m_{q})\) where \(m_{q} \sim \text{Uniform}(0,1)\) and \(k_{q} \sim \text{Gamma}(10,1)\). These priors for the expected average repurchase probabilities are uninformative, but the Gamma(10,1) prior for \(k_{p}\) and \(k_{q}\) ensures that there is a reasonable variation of repurchase probabilities in the population. With these priors, the 95 % Bayes interval for the repurchase probability \(p\) in the population is (0.001, 0.999), indicating that the prior is otherwise flat but there are peaks near 0 and 1.
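The implied prior for an individual repurchase probability can be explored by simulation. The following sketch draws \(m_{p}\), \(k_{p}\) and then \(p\) from the priors stated above; the function names, the number of draws and the clamping of \(m_{p}\) away from exactly 0 and 1 (needed only because the sampler requires strictly positive Beta parameters) are assumptions of the example.

```python
import random


def beta_parameters(m, k):
    """Beta parameters (alpha, beta) from the mean m and the concentration k."""
    return k * m, k * (1.0 - m)


def prior_predictive_p(n_draws, seed=0):
    """Sorted draws of p: m ~ Uniform(0,1), k ~ Gamma(10,1), p ~ Beta(km, k(1-m))."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Clamp m so that both Beta parameters stay strictly positive.
        m = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        k = rng.gammavariate(10.0, 1.0)   # shape 10, scale 1
        alpha, beta = beta_parameters(m, k)
        draws.append(rng.betavariate(alpha, beta))
    return sorted(draws)


if __name__ == "__main__":
    draws = prior_predictive_p(20000, seed=1)
    lower = draws[int(0.025 * len(draws))]
    upper = draws[int(0.975 * len(draws))]
    print(lower, upper)   # empirical 95 % interval of the prior for p
```

The simulation values \(\alpha_{p}=4\) and \(\beta_{p}=6\) correspond to \(m_{p}=0.4\) and \(k_{p}=10\) in this parameterization.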
A.2 BUGS code
The analysis is carried out using OpenBUGS 3.2.2 (Lunn et al. 2009), R (R Core Team 2012) and the R2OpenBUGS R package (Sturtz et al. 2005). The BUGS code for the model is given as:
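model {
  for(i in 1:N)
  {
    # individual-level intensity and repurchase probabilities
    lambda[i] ~ dgamma(gammal,deltal)
    p[i] ~ dbeta(alphap,betap)
    q[i] ~ dbeta(alphaq,betaq)
    # time between the last two purchases and time since the latest purchase
    tau[i] ~ dexp(lambda[i])
    taustar[i] ~ dexp(lambda[i])
    # unobserved state S2 drawn from the equilibrium of the Markov chain
    m0[i] <- (1-q[i])/(2-q[i]-p[i])
    S2[i] ~ dbern(m0[i])
    S1prob[i] <- p[i]*S2[i]+(1-q[i])*(1-S2[i])
    S1[i] ~ dbern(S1prob[i])
    S0prob[i] <- p[i]*S1[i]+(1-q[i])*(1-S1[i])
    S0[i] ~ dbern(S0prob[i])
  }
  # mean and variance of the intensity distribution
  ml ~ dgamma(2,1)
  vl ~ dgamma(2,1)
  gammal <- ml*ml/(vl+0.00001)
  deltal <- ml/(vl+0.00001)
  # mean and concentration of the repurchase probability distributions
  mp ~ dunif(0,1)
  mq ~ dunif(0,1)
  kp ~ dgamma(10,1)
  kq ~ dgamma(10,1)
  alphap <- kp*mp
  betap <- kp*(1-mp)
  alphaq <- kq*mq
  betaq <- kq*(1-mq)
}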
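In the BUGS code the unobserved state `S2` is sampled as a latent node. As a cross-check of the state part of the model, the joint probability of the two observed states can also be computed analytically by summing over `S2`. The sketch below is an illustration under that reading of the model; the function name is hypothetical.

```python
def state_pair_probability(s1, s0, p, q):
    """P(S1 = s1, S0 = s0 | p, q), summing over the unobserved state S2.

    s1 is the previous state and s0 the current state (1 = focal company).
    """
    m0 = (1.0 - q) / (2.0 - q - p)            # equilibrium P(S2 = 1)
    prob_s1 = 0.0
    for s2, weight in ((1, m0), (0, 1.0 - m0)):
        s1prob = p * s2 + (1.0 - q) * (1 - s2)    # P(S1 = 1 | S2 = s2)
        prob_s1 += weight * (s1prob if s1 == 1 else 1.0 - s1prob)
    s0prob = p * s1 + (1.0 - q) * (1 - s1)        # P(S0 = 1 | S1 = s1)
    return prob_s1 * (s0prob if s0 == 1 else 1.0 - s0prob)


if __name__ == "__main__":
    p, q = 0.7, 0.8
    for s1 in (0, 1):
        for s0 in (0, 1):
            print(s1, s0, state_pair_probability(s1, s0, p, q))
```

Because `S2` is drawn from the equilibrium distribution, the marginal probability of `S1 = 1` equals the equilibrium probability as well, so the four joint probabilities sum to one.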
A.3 Simulation results
The simulation results are shown in Fig. 5 and in Table 6. From Fig. 5, it can be seen that the estimated posterior distributions are concentrated around the true value of the CE and that the systematic bias is small or non-existent. As expected, the variance is smaller for the larger sample sizes. Sample sizes of 800 or more seem to give sufficient accuracy of estimation.
The CLV posterior distributions for the customers of the focal company and the customers of a competitor are presented in Table 6. It can be seen that the posteriors estimated from a sample of size 1000 are very similar to the true CLV distribution of the population.
Appendix B: BUGS code for the mobile phone data
The BUGS code for the mobile phone survey is given as:
model {
  # respondents who repurchased the previous brand
  for(i in 1:Nretained)
  {
    # purchase interval, censored to (it.min, it.max)
    interval[i] ~ dgamma(kappa,lambda[i])C(it.min[i],it.max[i])
    log(lambda[i]) <- betaconst+betaprev[prevbrand[i]]+
      betaagegr[agegr[i]]+
      betagender[gender[i]]+
      betaincomegr[incomegr[i]]+
      betaarea[area[i]]
    repurchase[i] ~ dbern(p[i])
    logit(p[i]) <- alphaconst+alphaprev[prevbrand[i]]+
      alphaagegr[agegr[i]]+
      alphagender[gender[i]]+
      alphaincomegr[incomegr[i]]+
      alphaarea[area[i]]
  }
  # respondents who switched to another brand
  for(i in (Nretained+1):(Nretained+Nchurned))
  {
    interval[i] ~ dgamma(kappa,lambda[i])C(it.min[i],it.max[i])
    log(lambda[i]) <- betaconst+betaprev[prevbrand[i]]+
      betaagegr[agegr[i]]+
      betagender[gender[i]]+
      betaincomegr[incomegr[i]]+
      betaarea[area[i]]
    repurchase[i] ~ dbern(p[i])
    logit(p[i]) <- alphaconst+alphaprev[prevbrand[i]]+
      alphaagegr[agegr[i]]+
      alphagender[gender[i]]+
      alphaincomegr[incomegr[i]]+
      alphaarea[area[i]]
    # probability of choosing each brand other than the previous one
    for(brand in 1:Nbrands)
    {
      q[i,brand] <- (1-equals(prevbrand[i],brand))*
        qq[brand,agegr[i]]/
        (sum(qq[,agegr[i]])-qq[prevbrand[i],agegr[i]])
    }
    aquisition[i] ~ dbern(q[i,newbrand[i]])
  }
  # remaining respondents: only the purchase interval enters the model
  for(i in (Nretained+Nchurned+1):N)
  {
    interval[i] ~ dgamma(kappa,lambda[i])C(it.min[i],it.max[i])
    log(lambda[i]) <- betaconst+betaprev[prevbrand[i]]+
      betaagegr[agegr[i]]+
      betagender[gender[i]]+
      betaincomegr[incomegr[i]]+
      betaarea[area[i]]
  }
  # priors; the first level of each factor is the reference level
  kappa ~ dgamma(1,1)
  betaconst ~ dnorm(0,0.001)
  alphaconst ~ dnorm(0,0.001)
  for(h in 2:Nbrands)
  {
    betaprev[h] ~ dnorm(0,0.001)
    alphaprev[h] ~ dnorm(0,0.001)
  }
  betaprev[1] <- 0
  alphaprev[1] <- 0
  for(h in 2:Nagegr)
  {
    betaagegr[h] ~ dnorm(0,0.001)
    alphaagegr[h] ~ dnorm(0,0.001)
  }
  betaagegr[1] <- 0
  for(h in 2:Nincomegr)
  {
    betaincomegr[h] ~ dnorm(0,0.001)
  }
  betaincomegr[1] <- 0
  for(h in 2:Narea)
  {
    betaarea[h] ~ dnorm(0,0.001)
  }
  betaarea[1] <- 0
  betagender[2] ~ dnorm(0,0.001)
  betagender[1] <- 0
  alphaagegr[1] <- 0
  for(h in 2:Nincomegr)
  {
    alphaincomegr[h] ~ dnorm(0,0.001)
  }
  alphaincomegr[1] <- 0
  for(h in 2:Narea)
  {
    alphaarea[h] ~ dnorm(0,0.001)
  }
  alphaarea[1] <- 0
  alphagender[2] ~ dnorm(0,0.001)
  alphagender[1] <- 0
  # brand attractiveness by age group
  for(h1 in 1:Nbrands)
  {
    for(h2 in 1:Nagegr)
    {
      qq[h1,h2] ~ dbeta(2,2)
    }
  }
}
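The brand choice probabilities `q[i,brand]` for a churned customer are normalized weights over the brands other than the previous one. The sketch below illustrates this normalization for a single respondent; the function name and the example weights are hypothetical, and Python indices are 0-based whereas the BUGS code is 1-based.

```python
def acquisition_probabilities(prev_brand, weights):
    """Probability of choosing each brand given a switch away from prev_brand.

    weights: the qq values of all brands for the respondent's age group
    prev_brand: 0-based index of the previous brand
    """
    denominator = sum(weights) - weights[prev_brand]
    return [0.0 if brand == prev_brand else w / denominator
            for brand, w in enumerate(weights)]


if __name__ == "__main__":
    # Hypothetical attractiveness weights for four brands in one age group
    qq_age_group = [0.5, 0.25, 0.125, 0.125]
    print(acquisition_probabilities(0, qq_age_group))
```

The previous brand receives probability zero and the remaining probabilities sum to one, so that the repurchase decision and the choice of the new brand are modeled separately.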
Cite this article
Karvanen, J., Rantanen, A. & Luoma, L. Survey data and Bayesian analysis: a cost-efficient way to estimate customer equity. Quant Mark Econ 12, 305–329 (2014). https://doi.org/10.1007/s11129-014-9148-4
DOI: https://doi.org/10.1007/s11129-014-9148-4