Improving customer profit predictions with customer mindset metrics through multiple overimputation

Abstract

Research and practice have called for the incorporation of customer mindset metrics (CMMs) to improve the accuracy of models that predict individual customer profits. However, because CMMs are self-reported data collected through customer surveys, they are seldom available for a firm’s entire customer database and are, in addition, always measured with some degree of error. Their use in models for individual-level predictions of customer profit has therefore proven challenging. We offer a solution through a new method called multiple overimputation (MO). MO treats missing data as an extreme form of measurement error and imputes CMMs both for customers whose values are observed, albeit with measurement error, and for customers whose values are missing. The imputed CMMs are then included as predictors in a model of individual customer profits. Through a simulation study, an empirical application in the pharmaceutical industry, and a customer selection exercise, we demonstrate the predictive and economic value of applying MO in the context of CRM.

Notes

  1. Note that the estimation sample in MO does not have to be restricted to customers with observed CMMs, because all CMMs are overimputed.

  2. For confidentiality reasons, we cannot reveal any further information about the drug category or the pharmaceutical firm.

  3. This practice is common in the pharmaceutical industry, although it would be ideal to survey physicians at random points in time. The firm used these surveys to inform its salesforce evaluation and training, but not to determine sales calls levels for individual customers.

  4. Please note that there are no standard items to measure attitudinal CMMs in the literature. In general, related studies measure for instance customers’ product- or service-related satisfaction (e.g., Bowman and Narayandas 2004, Cooil et al. 2007) or performance perceptions (e.g., Petersen et al. 2018) which should also be appropriate in our study of pharmaceutical sales.

  5. For better readability, we use “CMMs” to mean “relative CMMs” throughout.

  6. Our modeling framework similarly applies to predictions of customer lifetime value (CLV) by extending the projection window to three years (Venkatesan and Kumar 2004).

  7. We also evaluated a regular Poisson model, but found the ZIP model to provide better model fit and predictive accuracy.

  8. Predicted sales are obtained by first predicting a customer’s retention status and then sales conditional on retention. The MAD of predicted and observed sales therefore evaluates the accuracy of both the sales and retention models.

  9. Please note that these metrics do not apply to VAR models.

  10. MO is therefore an effective alternative to minimize the threat of the mere measurement effect, because it does not require firms to reach out to a broad sample of customers. As such, it reduces the chances of over-estimating the effects of sales calls that are actually attributable to the mere measurement of CMMs.

  11. Although we have no definite information about the firm’s actual customer selection process, it was not based on CMM information and therefore likely similar to, or even less effective than, Model 1.

  12. In a simulation study, our model specification and estimation algorithm satisfactorily recovered the true parameters.

  13. In the rest of the manuscript, CMMs therefore refer to customer i’s prior CMMs.

  14. Although, in general, customers’ CMMs as well as their spending behavior can vary over time, due to the nature of our data and similar to Petersen et al. (2018), we treat these variables as time-invariant and compute their average value during period 2, prior to making predictions in period 3.

  15. We repeated the estimation by varying the specification of the initialization time period 1. The substantive results remained unchanged.

References

  1. Aaker, D.A., & Jacobson, R. (1994). The financial information content of perceived quality. Journal of Marketing Research, 31(2), 191–201.

  2. Abe, M. (2009). Counting your customers one by one: a hierarchical bayes extension to the pareto/nbd model. Marketing Science, 28(3), 541–553.

  3. Adigüzel, F., & Wedel, M. (2008). Split questionnaire design for massive surveys. Journal of Marketing Research, 45(5), 608–617.

  4. Ahearne, M., Jelinek, R., Jones, E. (2007). Examining the effect of salesperson service behavior in a competitive context. Journal of the Academy of Marketing Science, 35(4), 603–616.

  5. Aksoy, L., Cooil, B., Groening, C., Keiningham, T.L. (2008). The long-term stock market valuation of customer satisfaction. Journal of Marketing, 72(4), 105–122.

  6. Allenby, G.M., & Ginter, J.L. (1995). Using extremes to design products and segment markets. Journal of Marketing Research, 32(4), 392–403.

  7. Alwin, D.F., & Krosnick, J.A. (1991). The reliability of survey attitude measurement: the influence of question and respondent attributes. Sociological Methods & Research, 20(1), 139–181.

  8. Anderson, E.W., Fornell, C., Mazvancheryl, S.K. (2004). Customer satisfaction and shareholder value. Journal of marketing, 68(4), 172–185.

  9. Arora, N. (2006). Estimating joint preference: a sub-sampling approach. International Journal of Research in Marketing, 23(4), 409–418.

  10. Bijmolt, T.H., Leeflang, P.S., Block, F., Eisenbeiss, M., Hardie, B.G., Lemmens, A., Saffert, P. (2010). Analytics for customer engagement. Journal of Service Research, 13(3), 341–356.

  11. Blackwell, M., Honaker, J., King, G. (2017). A unified approach to measurement error and missing data: overview and applications. Sociological Methods & Research, 46(3), 303–341.

  12. Bolton, R.N. (1998). A dynamic model of the duration of the customer’s relationship with a continuous service provider: the role of satisfaction. Marketing Science, 17(1), 45–65.

  13. Bolton, R.N., Kannan, P.K., Bramlett, M.D. (2000). Implications of loyalty program membership and service experiences for customer retention and value. Journal of the Academy of Marketing Science, 28(1), 95–108.

  14. Bolton, R.N., & Lemon, K.N. (1999). A dynamic model of customers’ usage of services: usage as an antecedent and consequence of satisfaction. Journal of Marketing Research, 171–186.

  15. Bolton, R.N., Lemon, K.N., Verhoef, P.C. (2004). The theoretical underpinnings of customer asset management: a framework and propositions for future research. Journal of the Academy of Marketing Science, 32(3), 271–292.

  16. Bowman, D., & Narayandas, D. (2004). Linking customer management effort to customer profitability in business markets. Journal of Marketing Research, 41(4), 433–447.

  17. Bradlow, E.T., Hu, Y., Ho, T. -H. (2004). A learning-based model for imputing missing levels in partial conjoint profiles. Journal of Marketing Research, 41(4), 369–381.

  18. Brown, B., Kanagasabai, K., Serpa Pinto, G. (2017). Capturing value from your customer data. Retrieved December 1, 2018 from https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/capturing-value-from-your-customer-data/.

  19. Cooil, B., Keiningham, T.L., Aksoy, L., Hsu, M. (2007). A longitudinal analysis of customer satisfaction and share of wallet: investigating the moderating effect of customer characteristics. Journal of marketing, 71(1), 67–83.

  20. De Haan, E., Verhoef, P.C., Wiesel, T. (2015). The predictive ability of different customer feedback metrics for retention. International Journal of Research in Marketing, 32(2), 195–206.

  21. Dong, X., Janakiraman, R., Xie, Y. (2014). The effect of survey participation on consumer behavior: the moderating role of marketing communication. Marketing Science, 33(4), 567–585.

  22. Donkers, B., Verhoef, P.C., de Jong, M.G. (2007). Modeling clv: a test of competing models in the insurance industry. Quantitative Marketing and Economics, 5(2), 163–190.

  23. Du, R.Y., Kamakura, W.A., Mela, C.F. (2007). Size and share of customer wallet. Journal of Marketing, 71(2), 94–113.

  24. Ebbes, P., Papies, D., Van Heerde, H.J. (2011). The sense and non-sense of holdout sample validation in the presence of endogeneity. Marketing Science, 30(6), 1115–1122.

  25. European Commission. (2018). Data protection in the EU. Retrieved May 1, 2018 from https://ec.europa.eu/info/law/law-topic/data-protection/data-protection-eu_en/.

  26. Fader, P.S., Hardie, B.G., Lee, K.L. (2005a). Counting your customers the easy way: an alternative to the pareto/nbd model. Marketing Science, 24(2), 275–284.

  27. Fader, P.S., Hardie, B.G., Lee, K.L. (2005b). Rfm and clv: using iso-value curves for customer base analysis. Journal of Marketing Research, 42(4), 415–430.

  28. Fischer, M., & Albers, S. (2010). Patient-or physician-oriented marketing: what drives primary demand for prescription drugs? Journal of Marketing Research, 47(1), 103–121.

  29. Fornell, C., & Larcker, D.F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 39–50.

  30. Fornell, C., Mithas, S., Morgeson III, F.V., Krishnan, M.S. (2006). Customer satisfaction and stock prices: high returns, low risk. Journal of Marketing, 70(1), 3–14.

  31. Ghosh, S.K., Mukhopadhyay, P., Lu, J.-C.J. (2006). Bayesian analysis of zero-inflated regression models. Journal of Statistical planning and Inference, 136(4), 1360–1375.

  32. Gibbons, R.V., Landry, F.J., Blouch, D.L., Jones, D.L., Williams, F.K., Lucey, C.R., Kroenke, K. (1998). A comparison of physicians’ and patients’ attitudes toward pharmaceutical industry gifts. Journal of General Internal Medicine, 13(3), 151–154.

  33. Gilula, Z., & McCulloch, R. (2013). Multi level categorical data fusion using partially fused data. Quantitative Marketing and Economics, 11(3), 353–377.

  34. Gilula, Z., McCulloch, R.E., Rossi, P.E. (2006). A direct approach to data fusion. Journal of Marketing Research, 43(1), 73–83.

  35. Gönül, F.F., Carter, F., Petrova, E., Srinivasan, K. (2001). Promotion of prescription drugs and its impact on physicians’ choice behavior. Journal of Marketing, 65(3), 79–90.

  36. Granville, K. (2018). Facebook and cambridge analytica what you need to know as fallout widens. Retrieved May 1, 2018 from https://www.nytimes.com/2018/03/19/technology/facebook-cambridge-analytica-explained.html/.

  37. Greene, W.H. (2017). Econometric analysis. London: Pearson.

  38. Gruca, T.S., & Rego, L.L. (2005). Customer satisfaction, cash flow, and shareholder value. Journal of Marketing, 69(3), 115–130.

  39. Gupta, S., & Zeithaml, V. (2006). Customer metrics and their impact on financial performance. Marketing Science, 25(6), 718–739.

  40. Gustafsson, A., Johnson, M.D., Roos, I. (2005). The effects of customer satisfaction, relationship commitment dimensions, and triggers on customer retention. Journal of Marketing, 69(4), 210–218.

  41. Horsky, D., Misra, S., Nelson, P. (2006). Observed and unobserved preference heterogeneity in brand-choice models. Marketing Science, 25(4), 322–335.

  42. Ittner, C.D., & Larcker, D.F. (1998). Are nonfinancial measures leading indicators of financial performance? an analysis of customer satisfaction. Journal of Accounting Research, 36, 1–35.

  43. Johansson, J.K., Dimofte, C.V., Mazvancheryl, S.K. (2012). The performance of global brands in the 2008 financial crisis: a test of two brand value measures. International Journal of Research in Marketing, 29(3), 235–245.

  44. Kamakura, W.A., & Wedel, M. (1997). Statistical data fusion for cross-tabulation. Journal of Marketing Research, 485–498.

  45. Kamakura, W.A., & Wedel, M. (2000). Factor analysis and missing data. Journal of Marketing Research, 37(4), 490–498.

  46. Kamakura, W.A., & Wedel, M. (2003). List augmentation with model based multiple imputation: a case study using a mixed-outcome factor model. Statistica Neerlandica, 57(1), 46–57.

  47. Kamakura, W.A., Wedel, M., De Rosa, F., Mazzon, J.A. (2003). Cross-selling through database marketing: a mixed data factor analyzer for data augmentation and prediction. International Journal of Research in marketing, 20(1), 45–65.

  48. Koperwas, A. (2015). Are you sharing the customer journey across your org? Retrieved May 1, 2018 from https://theblog.adobe.com/sharing-customer-journey-across-your-org/.

  49. Kumar, V., Venkatesan, R., Bohling, T., Beckmann, D. (2008). Practice prize report—the power of clv: managing customer lifetime value at ibm. Marketing Science, 27(4), 585–599.

  50. Lambert, D. (1992). Zero-inflated poisson regression, with an application to defects in manufacturing. Technometrics, 34(1), 1–14.

  51. Luo, X., Homburg, C., Wieseke, J. (2010). Customer satisfaction, analyst stock recommendations, and firm value. Journal of Marketing Research, 47(6), 1041–1058.

  52. Malthouse, E.C., & Blattberg, R.C. (2005). Can we predict customer lifetime value? Journal of Interactive Marketing, 19(1), 2–16.

  53. Manchanda, P., Rossi, P.E., Chintagunta, P.K. (2004). Response modeling with nonrandom marketing-mix variables. Journal of Marketing Research, 41(4), 467–478.

  54. Martin, K.D., & Murphy, P.E. (2017). The role of data privacy in marketing. Journal of the Academy of Marketing Science, 45(2), 135–155.

  55. Maynes, J., & Rawson, A. (2016). Linking the customer experience to value. Retrieved May 1, 2018 from https://www.mckinsey.com/business-functions/marketing-and-sales/our-insights/linking-the-customer-experience-to-value/.

  56. McKinney, W.P., Schiedermayer, M., Simpson, D.E., Rich, E.C. (1990). Pharmaceutical sales representatives. Journal of the American Medical Association, 264(13), 1693–1697.

  57. Mittal, V., & Kamakura, W.A. (2001). Satisfaction, repurchase intent, and repurchase behavior: investigating the moderating effect of customer characteristics. Journal of marketing research, 38(1), 131–142.

  58. Mizik, N., & Jacobson, R. (2004). Are physicians easy marks? Quantifying the effects of detailing and sampling on new prescriptions. Management Science, 50(12), 1704–1715.

  59. Mizik, N., & Jacobson, R. (2009). Valuing branded businesses. Journal of Marketing, 73(6), 137–153.

  60. Montoya, R., Netzer, O., Jedidi, K. (2010). Dynamic allocation of pharmaceutical detailing and sampling for long-term profitability. Marketing Science, 29(5), 909–924.

  61. Musalem, A., Bradlow, E.T., Raju, J.S. (2008). Who’s got the coupon? Estimating consumer preferences and coupon usage from aggregate information. Journal of Marketing Research, 45(6), 715–730.

  62. Narayanan, S., Manchanda, P., Chintagunta, P.K. (2005a). Temporal differences in the role of marketing communication in new product categories. Journal of Marketing Research, 42(3), 278–290.

  63. Narayanan, S., Manchanda, P., Chintagunta, P.K. (2005b). Temporal differences in the role of marketing communication in new product categories. Journal of Marketing Research, 42(3), 278–290.

  64. Petersen, J.A., Kumar, V., Polo, Y., Sese, F.J. (2018). Unlocking the power of marketing: understanding the links between customer mindset metrics, behavior, and profitability. Journal of the Academy of Marketing Science, 46(5), 813–836.

  65. Phillips, L.W. (1981). Assessing measurement error in key informant reports: a methodological note on organizational analysis in marketing. Journal of Marketing Research, 395–415.

  66. Qian, Y., & Xie, H. (2011). No customer left behind: a distribution-free bayesian approach to accounting for missing xs in marketing models. Marketing Science, 30(4), 717–736.

  67. Qian, Y., & Xie, H. (2014). Which brand purchasers are lost to counterfeiters? An application of new data fusion approaches. Marketing Science, 33(3), 437–448.

  68. Reinartz, W.J., & Kumar, V. (2000). On the profitability of long-life customers in a noncontractual setting: an empirical investigation and implications for marketing. Journal of Marketing, 64(4), 17–35.

  69. Reinartz, W.J., & Kumar, V. (2003). The impact of customer relationship characteristics on profitable lifetime duration. Journal of marketing, 67(1), 77–99.

  70. Reinartz, W.J., & Venkatesan, R. (2008). Decision models for customer relationship management (crm). In Handbook of marketing decision models (pp. 291–326): Springer.

  71. Rust, R.T., Lemon, K.N., Zeithaml, V.A. (2004). Return on marketing: using customer equity to focus marketing strategy. Journal of Marketing, 68(1), 109–127.

  72. Schmittlein, D.C., Morrison, D.G., Colombo, R. (1987). Counting your customers: who are they and what will they do next? Management Science, 33(1), 1–24.

  73. Seiders, K., Voss, G.B., Grewal, D., Godfrey, A.L. (2005). Do satisfied customers buy more? Examining moderating influences in a retailing context. Journal of Marketing, 69(4), 26–43.

  74. Srinivasan, S., Vanhuele, M., Pauwels, K. (2010). Mind-set metrics in market response models: an integrative approach. Journal of Marketing Research, 47(4), 672–684.

  75. Venkatesan, R., & Kumar, V. (2004). A customer lifetime value framework for customer selection and resource allocation strategy. Journal of marketing, 68(4), 106–125.

  76. Verhoef, P.C. (2003). Understanding the effect of customer relationship management efforts on customer retention and customer share development. Journal of marketing, 67(4), 30–45.

  77. Verhoef, P.C., & Franses, P.H. (2003). Combining revealed and stated preferences to forecast customer behaviour: three case studies. International Journal of Market Research, 45(4), 1–8.

  78. Verhoef, P.C., Franses, P.H., Hoekstra, J.C. (2001). The impact of satisfaction and payment equity on cross-buying: a dynamic model for a multi-service provider. Journal of Retailing, 77(3), 359–378.

  79. Voss, G.B., Godfrey, A., Seiders, K. (2010). How complementarity and substitution alter the customer satisfaction–repurchase link. Journal of Marketing, 74(6), 111–127.

  80. Wedel, M., & Kannan, P. (2016). Marketing analytics for data-rich environments. Journal of Marketing, 80(6), 97–121.

  81. Weijters, B., Cabooter, E., Schillewaert, N. (2010). The effect of rating scale format on response styles: the number of response categories and response category labels. International Journal of Research in Marketing, 27(3), 236–247.

  82. Zheng, Z., & Padmanabhan, B. (2006). Selectively acquiring customer information: a new data acquisition problem and an active learning-based solution. Management Science, 52(5), 697–712.

Author information

Corresponding author

Correspondence to Rajkumar Venkatesan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

J. Andrew Petersen served as Area Editor for this article.

Appendices

Appendix A: Details on model specification for sales and retention

Zero-inflated Poisson (ZIP) model

In each month t during period 3 (i.e., t = 11–45), we observe for each customer i (i = 1 to N) the level of sales (\(y_{it}^{p3}\)) and sales calls (\(Det_{it}^{p3}\)) directed toward that customer. We assume that sales from customer i in month t follow a ZIP model (Lambert 1992), such that at any time t customer i can belong to either of two latent states: dormant, or inactive (\(B_{it} = 1\)), versus active (\(B_{it} = 0\)). Market forces, marketing, and other influences likely affect customer i’s switching between states. In line with extant research (Kumar et al. 2008), we assume that customers never quit a relationship, such that there is always a finite probability (\(1 - \pi_{it}\)) that they will prescribe the firm’s drugs. Under the ZIP model, the probability that sales (\(y_{it}^{p3}\)) from customer i in time t equal k is:

$$ \begin{array}{@{}rcl@{}} p(y_{it}^{p3}&=&0|\lambda_{it},\pi_{it})=\pi_{it}+(1-\pi_{it})\exp(-\lambda_{it})\\ p(y_{it}^{p3}&=&k|\lambda_{it},\pi_{it})=(1-\pi_{it})\frac{\lambda^{k}_{it}\exp(-\lambda_{it})}{k!},\\ k&=&1,2,..., \end{array} $$
(A.1a)

where \(\lambda_{it} > 0\). As per Eq. A.1a, customer i is active (\(B_{it} = 0\)) when sales reach at least one new prescription in time t (i.e., \(y_{it}^{p3} > 0\)). When we do not observe sales in time t, customer i could belong either to the dormant state, with probability \(\pi_{it}\), or to the active state, with probability \(1 - \pi_{it}\), while generating zero sales (\(y_{it}^{p3} = 0\)). We therefore include the term \((1 - \pi_{it})\exp(-\lambda_{it})\) when modeling the probability that sales equal 0, or \(p(y_{it}^{p3} = 0)\). Both \(\lambda_{it}\) and \(\pi_{it}\) are unknown customer-specific parameters, modeled as functions of observed covariates (Ghosh et al. 2006). We rewrite Eq. A.1a as a mixture model of latent random variables \(V_{it}^{p3}\) and \(B_{it}^{p3}\):

$$ \begin{array}{@{}rcl@{}} y_{it}^{p3} &=& V_{it}(1 - B_{it}),\\ V_{it}&\sim&Poisson(\lambda_{it}), {\text{and}}\\ B_{it}&\sim& Bernoulli(\pi_{it}). \end{array} $$
(A.1b)
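
As an illustration of how Eqs. A.1a and A.1b relate, the following minimal Python sketch evaluates the ZIP probability mass function and simulates sales from the mixture representation. The function names and the values of λ and π are illustrative assumptions, not estimates or code from the study.

```python
import math
import numpy as np


def zip_pmf(k, lam, pi):
    """ZIP probability of observing k sales (Eq. A.1a)."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # Zero sales arise from the dormant state or from an active zero count.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def simulate_zip(lam, pi, size, rng):
    """Draw sales as y = V * (1 - B) (Eq. A.1b)."""
    v = rng.poisson(lam, size=size)      # V ~ Poisson(lambda)
    b = rng.binomial(1, pi, size=size)   # B ~ Bernoulli(pi), 1 = dormant
    return v * (1 - b)


rng = np.random.default_rng(0)
lam, pi = 2.0, 0.3  # illustrative values only (assumption)
y = simulate_zip(lam, pi, 200_000, rng)

empirical_zero_share = float(np.mean(y == 0))
analytic_zero_prob = zip_pmf(0, lam, pi)
print(empirical_zero_share, analytic_zero_prob)
```

With enough draws, the simulated share of zero-sales months should be close to the analytic probability from Eq. A.1a, which is one way to check that the two representations agree.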

The expected number of new prescriptions from physician i in time t, which represents the Poisson mean \(\lambda_{it}\), is modeled as:

$$ ln(\lambda_{it})=\beta^{\lambda}_{i}X^{\lambda p3}_{it}, $$
(A.2a)

where \(\beta ^{\lambda }_{i}\) represents the customer-specific coefficients and \(X^{\lambda p3}_{it}\) refers to the corresponding covariates that capture customer past purchase behavior, i.e., lagged sales (\(y_{it-1}^{p3}\)), and firm actions, i.e., sales calls (\(Det_{it}^{p3}\)). We model the parameter \(\pi_{it}\) of the Bernoulli random variable \(B_{it}\) (Eq. A.1b), which represents the probability that customer i is inactive in time t, as:

$$ logit(\pi_{it})=\beta^{\pi}_{i}X^{\pi p3}_{it} $$
(A.2b)

Similar to Eq. A.2a, \(\beta ^{\pi }_{i}\) represents the customer-specific coefficients and \(X^{\pi p3}_{it}\) refers to the covariates that capture firm actions and customer past purchase behavior (Note 12).
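
The two link functions in Eqs. A.2a and A.2b can be sketched in a few lines of Python. The covariate layout (an intercept, lagged sales, and sales calls), the coefficient values, and the function names are assumptions made for illustration only; the source does not state whether an intercept is included.

```python
import numpy as np


def poisson_mean(beta_lambda, x_lambda):
    """Eq. A.2a: ln(lambda_it) = beta_i^lambda * X_it, so lambda_it > 0."""
    return np.exp(x_lambda @ beta_lambda)


def inactive_prob(beta_pi, x_pi):
    """Eq. A.2b: logit(pi_it) = beta_i^pi * X_it, so 0 < pi_it < 1."""
    return 1.0 / (1.0 + np.exp(-(x_pi @ beta_pi)))


# Two months for one hypothetical customer.
# Columns: intercept (assumed), lagged sales, sales calls.
x = np.array([
    [1.0, 3.0, 2.0],
    [1.0, 0.0, 0.0],
])
beta_lambda = np.array([0.2, 0.1, 0.15])  # illustrative coefficients
beta_pi = np.array([-0.5, -0.3, -0.2])    # illustrative coefficients

lam = poisson_mean(beta_lambda, x)
pi = inactive_prob(beta_pi, x)
expected_sales = (1.0 - pi) * lam  # mean of a ZIP variable
print(lam, pi, expected_sales)
```

The log link keeps the Poisson mean positive and the logit link keeps the inactivity probability between 0 and 1, whatever the values of the coefficients.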

Hierarchical model

With the following hierarchical model of customer-specific coefficients, \(\beta _{i} = (\beta ^{\lambda }_{i},\beta ^{\pi }_{i})\), we can assess the influence of CMMs and their behavioral predictors on sales and retention:

$$ \beta_{i}=\gamma Z_{i}^{p2}+\nu_{i} $$
(A.3)

We measure observed customer heterogeneity covariates (\(Z_{i}\)) during period 2, to control for the endogeneity among sales and CMMs (or the reinforcing effect of sales on CMMs). Our model thus captures the influence of customer i’s prior CMMs (during period 2) on his or her future behavior (during period 3) (Note 13). The specific heterogeneity covariates include CMMs (\(\overline {CMM_{i}^{p2}}\)), specialty, i.e., whether the physician is a specialist in a certain medical field (\(SPC_{i}\)), and the logarithm of average period 2 sales, \(\ln(\overline {y^{p2}_{i}})\), as a proxy for the size of the customer wallet. By accounting for these measures, we can evaluate the effect of CMMs over and above commonly available measures of observed customer heterogeneity (Note 14).

Further, \(\nu_{i}\) represents the unobserved heterogeneity component that we assume to follow a multivariate normal distribution with zero mean and a variance-covariance matrix V. Similar to Allenby and Ginter (1995), in the absence of the \(\gamma\) coefficients and covariates (\(Z_{i}\)), Eq. A.3 represents a standard random effects distribution for \(\beta_{i}\). Since CMMs can be considered part of the unobserved heterogeneity, \(\nu_{i}\) allows us to assess the value of including CMM information in the hierarchical model (Eq. A.3) over and above a random effects specification of unobserved heterogeneity.
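
To make the structure of Eq. A.3 concrete, the sketch below draws customer-specific coefficients from the hierarchical model for a handful of simulated customers. All dimensions, the intercept column, and the values of γ and V are hypothetical choices for illustration; they are not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n_customers, n_coef = 5, 4  # hypothetical sizes

# Heterogeneity covariates Z_i measured in period 2.
# Columns: intercept (assumed), average CMM, specialist dummy, log average sales.
z = np.column_stack([
    np.ones(n_customers),
    rng.normal(size=n_customers),
    rng.integers(0, 2, size=n_customers),
    np.log(rng.uniform(1.0, 20.0, size=n_customers)),
])

gamma = rng.normal(scale=0.2, size=(n_coef, z.shape[1]))  # illustrative gamma
v = 0.05 * np.eye(n_coef)                                 # illustrative V

# Unobserved heterogeneity: nu_i ~ MVN(0, V), one row per customer.
nu = rng.multivariate_normal(np.zeros(n_coef), v, size=n_customers)

# Eq. A.3: beta_i = gamma * Z_i + nu_i, stacked over customers.
beta = z @ gamma.T + nu
print(beta.shape)
```

Dropping every column of `z` except the intercept would reduce this sketch to a plain random effects specification, in which customers differ only through the draws of `nu`.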

Appendix B: Model estimation and prediction of twelve-month-ahead customer profits in the holdout sample

We estimate the CMM imputation model based on behavioral predictors from period 1 using the estimation sample of 407 customers. We conduct the estimation of parameters in the ZIP and imputation models as well as the prediction of sales in the holdout sample in a fully Bayesian framework, employing MCMC algorithms to enable posterior inference. We provide the prior specifications for the model parameters, estimation, and imputation algorithms in Web Appendix B in the Supplementary Material. Each MCMC iteration in our model estimation proceeds in three phases. In the first phase, we simulate draws from the posterior distribution of the MO model parameters (Eq. 5) and use them to replace CMMs in the estimation dataset. In the second phase, we simulate draws from the posterior distribution of the ZIP model parameters using the multiple overimputed data from the first phase (Eqs. A.1a, A.2a, A.2b, and A.3 in Appendix A). In the third phase, we simulate the posterior predictive distribution of sales, retention, and overimputed CMMs for customer i in month T + k (where k = 1, 2, ..., 12; T = 10) at the end of every iteration of the MCMC algorithm as follows:

  1. Predict CMMs (\(\hat {CMM_{i}^{p2}}\)) with Equation 5. Use predicted CMM values for all customers to accomplish overimputation.

  2. Predict customer i’s hierarchical coefficients \(\hat {\beta _{i}}\) using Equation A.3. Predicted CMMs (\(\hat {CMM_{i}^{p2}}\)) come from step 1.

  3. Predict \(\hat {\pi _{iT+k}}\) and \(\hat {y_{iT+k}}\) using the predicted coefficients (\(\hat {\beta _{i}}\)), lagged sales (\(\hat {y_{iT+k-1}^{p3}}\)), and sales calls (\(Det_{iT+k}^{p3}\)), predicted in the holdout period.

For each iteration of the MCMC algorithm, the predicted values \(\hat {\pi _{i}}=(\hat {\pi _{T+1}},\hat {\pi _{T+2}}, ...,\hat {\pi _{T+12}})\) and \(\hat {y_{i}}=(\hat {y_{T+1}}, \hat {y_{T+2}}, ..., \hat {y_{T+12}})\) serve to compute the profits for customer i from Eq. 4. The posterior expected profit for customer i is the Monte Carlo average:

$$ E[P_{i}(\hat{\beta_{i}}(\hat{CMM_{i}^{p2}}),\hat{\pi_{i}},\hat{y_{i}})]=\sum\limits_{l=1}^{np}P_{i}(\hat{\beta_{i}}(\hat{CMM_{i}^{p2}}),\hat{\pi_{i}},\hat{y_{i}},l)/np, $$
(B.1)

where np refers to the number of posterior iterations.
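
The forward simulation in step 3 and the Monte Carlo average in Eq. B.1 can be sketched as follows for a single customer. This is a simplified illustration under stated assumptions: the posterior draws of the coefficients are replaced by random perturbations around made-up values, sales calls are held fixed, and the `profit` function (with its `margin` and `cost_per_call` parameters) is a hypothetical stand-in, because Eq. 4 is not reproduced in this appendix.

```python
import numpy as np


def predict_path(beta_lambda, beta_pi, y_last, calls, rng):
    """Simulate one 12-month sales path for one customer and one draw."""
    sales = []
    y_prev = y_last
    for det in calls:
        x = np.array([1.0, y_prev, det])  # intercept (assumed), lagged sales, calls
        lam = np.exp(x @ beta_lambda)                 # Eq. A.2a
        pi = 1.0 / (1.0 + np.exp(-(x @ beta_pi)))     # Eq. A.2b
        active = rng.random() > pi                    # retention status
        y_prev = int(rng.poisson(lam)) if active else 0
        sales.append(y_prev)                          # feeds next month's lag
    return np.array(sales)


def profit(sales, calls, margin=50.0, cost_per_call=20.0):
    """HYPOTHETICAL profit function standing in for Eq. 4."""
    return margin * sales.sum() - cost_per_call * calls.sum()


rng = np.random.default_rng(2)
n_draws = 500                 # plays the role of np in Eq. B.1
calls = np.full(12, 2.0)      # assumed sales calls for T+1, ..., T+12
draw_profits = np.empty(n_draws)

for l in range(n_draws):
    # Stand-ins for posterior draws of beta_i (steps 1 and 2).
    beta_lambda = np.array([0.1, 0.05, 0.1]) + rng.normal(scale=0.02, size=3)
    beta_pi = np.array([-0.5, -0.1, -0.2]) + rng.normal(scale=0.02, size=3)
    sales = predict_path(beta_lambda, beta_pi, y_last=2, calls=calls, rng=rng)
    draw_profits[l] = profit(sales, calls)

expected_profit = draw_profits.mean()  # Monte Carlo average (Eq. B.1)
print(expected_profit)
```

Because each month's predicted sales become the next month's lagged sales, uncertainty accumulates along the path, and averaging over draws integrates over both parameter and outcome uncertainty.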

Of the 50,000 MCMC iterations, we discard the initial 30,000 as burn-in and use the last 20,000 as the posterior sample to make inferences. To assess convergence, we also inspect trace plots and simulate the posterior distribution using five parallel chains. The multivariate potential scale reduction factor (MPSRF), computed using the posterior sample of the five chains and ranging from 0.9 to 1.2 (across all variables), indicates convergence in the posterior sample (Note 15).
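
For readers unfamiliar with this diagnostic, the sketch below computes the simpler univariate potential scale reduction factor (the Gelman-Rubin statistic) for one scalar parameter. It is not the multivariate MPSRF reported above, and the simulated chains are artificial; the example only illustrates how between-chain and within-chain variances are compared.

```python
import numpy as np


def psrf(chains):
    """Univariate potential scale reduction factor.

    chains: array of shape (m, n), m parallel chains with n retained draws each.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    w = chains.var(axis=1, ddof=1).mean()   # within-chain variance
    b = n * chain_means.var(ddof=1)         # between-chain variance
    var_hat = (n - 1) / n * w + b / n       # pooled variance estimate
    return float(np.sqrt(var_hat / w))


rng = np.random.default_rng(3)
# Five well-mixed chains with 20,000 retained draws each (artificial data).
mixed = rng.normal(size=(5, 20_000))
# Five chains stuck at different levels, mimicking non-convergence.
stuck = mixed + np.arange(5)[:, None]

r_mixed = psrf(mixed)
r_stuck = psrf(stuck)
print(r_mixed, r_stuck)
```

Values close to 1 suggest that the chains are sampling from the same distribution, whereas values well above 1 indicate that the chains disagree and more iterations are needed.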

About this article

Cite this article

Venkatesan, R., Bleier, A., Reinartz, W. et al. Improving customer profit predictions with customer mindset metrics through multiple overimputation. J. of the Acad. Mark. Sci. 47, 771–794 (2019). https://doi.org/10.1007/s11747-019-00658-6

Keywords

  • Customer profit prediction
  • Multiple overimputation
  • Imputation
  • Mindset metrics
  • Zero inflated poisson models