Basic Net Scoring Methods: The Uplift Approach

Michel, René; Schnakenburg, Igor; von Martens, Tobias

doi:10.1007/978-3-030-22625-1_3

Abstract

Compared to the classical scoring approach, the difficulty with net scoring is that the target variable, i.e. the uplift, is not defined for an individual observation. Rather, the impact of a treatment is measured by a comparison of structurally identical groups of observations which have (target group) or have not (control group) received the treatment. The underlying problem is that an observation cannot be treated and not treated at the same time. Due to this interaction of the response and the treatment variable, gross scoring methods are not directly applicable, yet they present the basis from which to move on. In this chapter, several statistical methods for net scoring are presented. Firstly, a general and formal description of the net scoring problem is provided. Then, a wide variety of statistical methods for uplift modeling are presented and their respective advantages and disadvantages are described. The two final sections deal with appropriate methods for responses or treatments that are not binary, contrary to what is assumed for most parts of the book.
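
The group comparison described above can be illustrated with a minimal sketch. It estimates the uplift of a segment as the response rate of the target group minus the response rate of the control group. The function names `response_rate` and `estimate_uplift`, the `(treated, response)` record layout, and the toy data are assumptions made for illustration and are not taken from the chapter.

```python
from typing import Iterable, List, Tuple


def response_rate(responses: List[int]) -> float:
    """Share of positive responses (coded 1) in a group of binary responses."""
    if not responses:
        raise ValueError("response rate is undefined for an empty group")
    return sum(responses) / len(responses)


def estimate_uplift(observations: Iterable[Tuple[bool, int]]) -> float:
    """Estimate uplift as target response rate minus control response rate.

    Each observation is a (treated, response) pair with response in {0, 1}.
    The uplift is a group-level quantity: no single observation has one,
    because it is either treated or not treated, never both.
    """
    records = list(observations)  # materialize so we can iterate twice
    target = [response for treated, response in records if treated]
    control = [response for treated, response in records if not treated]
    return response_rate(target) - response_rate(control)


# Hypothetical toy data: 4 treated observations (3 responses)
# and 4 untreated observations (1 response).
observations = [
    (True, 1), (True, 1), (True, 1), (True, 0),
    (False, 1), (False, 0), (False, 0), (False, 0),
]

uplift = estimate_uplift(observations)
print(uplift)  # 0.75 - 0.25 = 0.5
```

The estimate is only meaningful if the two groups are structurally identical, for example through random assignment of the treatment; the sketch does not check this.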

Notes

1. The notion "uplift-uplift" comes to mind: it nicely emphasizes the second-order nature of net modeling and is easy to remember.
2. The trade-off between stability and significance depends on the problem at hand.

Author information

Authors and Affiliations

Deutsche Bank AG, Frankfurt am Main, Germany
René Michel & Tobias von Martens
DeTeCon International GmbH, Berlin, Germany
Igor Schnakenburg

About this chapter

Cite this chapter

Michel, R., Schnakenburg, I., von Martens, T. (2019). Basic Net Scoring Methods: The Uplift Approach. In: Targeting Uplift. Springer, Cham. https://doi.org/10.1007/978-3-030-22625-1_3

DOI: https://doi.org/10.1007/978-3-030-22625-1_3
Published: 10 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22624-4
Online ISBN: 978-3-030-22625-1
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
