Abstract
Many applications need to predict, from data, the effect of an intervention on different individuals. For example, which customers can be persuaded by a product promotion? Which patients should receive a certain treatment? These are typical causal questions: they concern the change in an outcome brought about by an intervention. They cannot be answered with traditional classification methods, which use only associations to predict outcomes. In personalised marketing, these questions are often answered with uplift modelling. The objective of uplift modelling is to estimate causal effect, but the uplift modelling literature does not discuss when the uplift represents a causal effect. Causal heterogeneity modelling can solve the problem, but its assumption of unconfoundedness cannot be tested from data. Practitioners therefore need guidelines for applying these methods. In this paper, we use the term causal classification for a set of personalised decision-making problems, and differentiate it from classification. We discuss the conditions under which causal classification can be resolved by uplift (and causal heterogeneity) modelling methods. We also propose a general framework for causal classification that uses off-the-shelf supervised methods for flexible implementation. Experiments show that two instantiations of the framework work for causal classification and for uplift (causal heterogeneity) modelling, and are competitive with other uplift (causal heterogeneity) modelling methods.
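The distinction between predicting an outcome and predicting the effect of an intervention can be illustrated with a minimal sketch. It assumes data from a randomised experiment (so that unconfoundedness holds by design) and a single discrete covariate. The function names and the toy data are hypothetical, and the estimator is a generic difference of group-wise outcome rates, not the framework proposed in the paper.

```python
from collections import defaultdict


def estimate_uplift(records):
    """Estimate uplift per segment from (segment, treated, outcome) records.

    treated and outcome are 0 or 1. The uplift of a segment is
    P(Y=1 | T=1, segment) - P(Y=1 | T=0, segment).
    """
    # counts[segment][t] = [number of records, number of positive outcomes]
    counts = defaultdict(lambda: [[0, 0], [0, 0]])
    for segment, treated, outcome in records:
        cell = counts[segment][treated]
        cell[0] += 1
        cell[1] += outcome

    uplift = {}
    for segment, (control, treat) in counts.items():
        if control[0] == 0 or treat[0] == 0:
            continue  # both groups are needed to compare outcome rates
        uplift[segment] = treat[1] / treat[0] - control[1] / control[0]
    return uplift


def causal_classify(uplift, threshold=0.0):
    """Label a segment True when its estimated uplift exceeds the threshold."""
    return {segment: u > threshold for segment, u in uplift.items()}


def make(segment, treated, n, positives):
    """Build n toy records for one group, the first `positives` with outcome 1."""
    return [(segment, treated, 1 if i < positives else 0) for i in range(n)]


# Hypothetical toy data: "loyal" customers buy at the same rate with or
# without the promotion; "persuadable" customers buy more when promoted.
records = (
    make("loyal", 1, 10, 8) + make("loyal", 0, 10, 8)
    + make("persuadable", 1, 10, 6) + make("persuadable", 0, 10, 2)
)

uplift = estimate_uplift(records)
decisions = causal_classify(uplift)
print(decisions)  # {'loyal': False, 'persuadable': True}
```

In this toy data the "loyal" segment has the highest purchase rate, so a classifier that predicts the outcome would rank it first, yet its estimated uplift is zero. Only the "persuadable" segment is selected once the target is the change in outcome.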
Acknowledgements
This work has been supported by ARC Discovery Projects Grant DP170101306, ARC Discovery Early Career Researcher Award DE200100200, and National Science Foundation of China (under grant 61876206).
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
Li, J., Zhang, W., Liu, L. et al. A general framework for causal classification. Int J Data Sci Anal 11, 127–139 (2021). https://doi.org/10.1007/s41060-021-00249-1
DOI: https://doi.org/10.1007/s41060-021-00249-1