Székely Regularization for Uplift Modeling

Part of the Studies in Computational Intelligence book series (SCI, volume 605)


Uplift modeling is a subfield of machine learning concerned with predicting the causal effect of an action at the level of individuals. This is achieved by using two training sets: a treatment set, containing objects which have been subjected to an action, and a control set, containing objects on which the action has not been performed. An uplift model then predicts the difference between the conditional success probabilities in the two groups. Uplift modeling is best applied to training sets obtained from randomized controlled trials, but such experiments are not always possible, in which case treatment assignment is often biased. In this paper we present a modification of Uplift Support Vector Machines which makes them less sensitive to such bias. This is achieved by adding to the model formulation a term which penalizes models that score the treatment and control groups differently. We call the technique Székely regularization since it is based on the energy distance proposed by Székely and Rizzo. An optimization algorithm based on stochastic gradient descent has also been developed. We demonstrate experimentally that the proposed regularization term does indeed produce uplift models which are less sensitive to biased treatment assignment.
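The abstract does not give the exact form of the regularization term, so the following is only an illustrative sketch of the underlying idea. It computes the standard sample energy distance of Székely and Rizzo, 2E|X-Y| - E|X-X'| - E|Y-Y'|, between the one-dimensional scores that a model assigns to the treatment and control groups. The linear scoring vector `w`, the synthetic data, and all function and variable names are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def energy_distance_1d(x, y):
    """Sample energy distance between two 1-D samples (V-statistic form).

    Computes 2*E|X-Y| - E|X-X'| - E|Y-Y'| with all expectations replaced
    by averages over every pair of observations.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cross = np.abs(x[:, None] - y[None, :]).mean()      # E|X - Y|
    within_x = np.abs(x[:, None] - x[None, :]).mean()   # E|X - X'|
    within_y = np.abs(y[:, None] - y[None, :]).mean()   # E|Y - Y'|
    return 2.0 * cross - within_x - within_y


# Hypothetical data: the treatment group is shifted in the first feature,
# mimicking biased (non-randomized) treatment assignment.
rng = np.random.default_rng(0)
X_treat = rng.normal(loc=0.5, size=(200, 3))
X_ctrl = rng.normal(loc=0.0, size=(200, 3))

# Hypothetical linear model: the score of an object is the dot product x @ w.
w = np.array([1.0, 0.0, 0.0])

# Penalty when the two groups receive differently distributed scores.
penalty_biased = energy_distance_1d(X_treat @ w, X_ctrl @ w)

# Penalty when both score samples are identical.
penalty_same = energy_distance_1d(X_ctrl @ w, X_ctrl @ w)
```

In this sketch the quantity is zero when both groups receive identical score samples and grows as the two score distributions move apart, which is the behaviour a penalty on "scoring treatment and control groups differently" would need. How such a term is weighted and combined with the Uplift SVM objective is described in the chapter itself and is not reproduced here.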


Keywords: Propensity Score, Regularization Term, Treatment Assignment, Right Heart Catheterization, Penalty Coefficient



This work was supported by Research Grant no. N N516 414938 of the Polish Ministry of Science and Higher Education (Ministerstwo Nauki i Szkolnictwa Wyższego) from research funds for the period 2010–2014. Ł.Z. was co-funded by the European Union from resources of the European Social Fund. Project POKL ‘Information technologies: Research and their interdisciplinary applications’, Agreement UDA-POKL.04.01.01-00-051/10-00.


  1. Bach F, Moulines E (2011) Non-asymptotic analysis of stochastic approximation algorithms for machine learning. In: Proceedings of advances in neural information processing systems 24 (NIPS 2011)
  2. Guelman L, Guillén M, Pérez-Marín AM (2012) Random forests for uplift modeling: an insurance customer retention case. In: Modeling and simulation in engineering, economics and management. Lecture notes in business information processing (LNBIP), vol 115. Springer, pp 123–133
  3. Holland PW (1986) Statistics and causal inference. J Am Stat Assoc 81(396):945–960
  4. Jaśkowski M, Jaroszewicz S (2012) Uplift modeling for clinical trial data. In: ICML 2012 workshop on machine learning for clinical data analysis, Edinburgh, June 2012
  5. Connors AF Jr, Speroff T, Dawson NV et al (1996) The effectiveness of right heart catheterization in the initial care of critically ill patients. JAMA 276(11):889–897
  6. Koronacki J, Ćwik J (2008) Statystyczne systemy uczące się. Exit, Warsaw (in Polish)
  7. Kushner HJ, Yin GG (2003) Stochastic approximation and recursive algorithms and applications. Springer
  8. Kuusisto F, Costa VS, Nassif H, Burnside E, Page D, Shavlik J (2014) Support vector machines for differential prediction. In: ECML-PKDD
  9. Polyak BT, Juditsky AB (1992) Acceleration of stochastic approximation by averaging. SIAM J Control Optim 30(4):838–855
  10. Radcliffe NJ, Surry PD (1999) Differential response analysis: modeling true response by isolating the effect of a single action. In: Proceedings of credit scoring and credit control VI. Credit Research Centre, University of Edinburgh Management School
  11. Radcliffe NJ, Surry PD (2011) Real-world uplift modelling with significance-based uplift trees. Portrait Technical Report TR-2011-1, Stochastic Solutions
  12. Robins J, Rotnitzky A (2004) Estimation of treatment effects in randomised trials with non-compliance and a dichotomous outcome using structural mean models. Biometrika 91(4):763–783
  13. Rosenbaum PR (1987) Model-based direct adjustment. J Am Stat Assoc 82(398):387–394
  14. Rosenbaum PR, Rubin DB (1983) The central role of the propensity score in observational studies for causal effects. Biometrika 70(1):41–55
  15. Rzepakowski P, Jaroszewicz S (2010) Decision trees for uplift modeling. In: Proceedings of the 10th IEEE international conference on data mining (ICDM), Sydney, Australia, Dec 2010, pp 441–450
  16. Rzepakowski P, Jaroszewicz S (2012) Decision trees for uplift modeling with single and multiple treatments. Knowl Inf Syst 32:303–327
  17. Sołtys M, Jaroszewicz S, Rzepakowski P (2014) Ensemble methods for uplift modeling. Data mining and knowledge discovery, pp 1–29 (online first)
  18. Szekely GJ, Rizzo ML (2004) Testing for equal distributions in high dimension. Interstat, Nov 2004
  19. Szekely GJ, Rizzo ML (2005) Hierarchical clustering via joint between-within distances: extending Ward's minimum variance method. J Classif 22(2):151–183
  20. Szekely GJ, Rizzo ML, Bakirov NK (2007) Measuring and testing dependence by correlation of distances. Ann Stat 35(6):2769–2794
  21. Vansteelandt S, Goetghebeur E (2003) Causal inference with generalized structural mean models. J R Stat Soc B 65(4):817–835
  22. Zaniewicz L, Jaroszewicz S (2013) Support vector machines for uplift modeling. In: The first IEEE ICDM workshop on causal discovery (CD 2013), Dallas, Dec 2013

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
  2. National Institute of Telecommunications, Warsaw, Poland
