Frontiers of Computer Science, Volume 12, Issue 5, pp 939–949

Achieving data-driven actionability by combining learning and planning

  • Qiang Lv
  • Yixin Chen
  • Zhaorong Li
  • Zhicheng Cui
  • Ling Chen
  • Xing Zhang
  • Haihua Shen
Research Article

Abstract

A main focus of machine learning research has been improving the generalization accuracy and efficiency of prediction models. However, what is missing in many applications is actionability, i.e., the ability to turn prediction results into actions. Existing efforts to derive such actionable knowledge are few and limited to simple action models, whereas in many real applications the models are more complex and an optimal solution is harder to extract.

In this paper, we propose a novel approach that achieves actionability by combining learning with planning, two core areas of AI. In particular, we propose a framework to extract actionable knowledge from random forests, one of the most widely used and best off-the-shelf classifiers. We formulate the actionability problem as a sub-optimal action planning (SOAP) problem, which is to find a plan that alters certain features of a given input so that the random forest yields a desirable output, while minimizing the total cost of the actions. Technically, the SOAP problem is formulated in the SAS+ planning formalism and solved using a Max-SAT based approach. Our experimental results demonstrate the effectiveness and efficiency of the proposed approach on a personal credit dataset and other benchmarks. Our work represents a new application of automated planning to an emerging and challenging machine learning paradigm.
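
To make the problem statement concrete, the following is a minimal sketch of the underlying search task: given an input and a forest, find the cheapest set of feature changes that flips the majority vote to a desired class. It is not the SAS+ or Max-SAT formulation described above; exhaustive enumeration is swapped in purely for illustration. The toy trees, the feature domains, the per-feature costs, and the names `forest_predict` and `cheapest_plan` are all hypothetical.

```python
from itertools import product

# Hypothetical toy "forest": each tree maps a feature dict to a class label (0 or 1).
def tree_a(x):
    return 1 if x["income"] == "high" else 0

def tree_b(x):
    return 1 if x["debt"] == "low" and x["income"] != "low" else 0

def tree_c(x):
    return 1 if x["debt"] == "low" or x["savings"] == "yes" else 0

FOREST = [tree_a, tree_b, tree_c]

# Assumed discrete feature domains and assumed cost of changing each feature once.
DOMAINS = {
    "income": ["low", "medium", "high"],
    "debt": ["low", "high"],
    "savings": ["no", "yes"],
}
CHANGE_COST = {"income": 5, "debt": 2, "savings": 1}


def forest_predict(x):
    """Majority vote over the toy trees (the forest has an odd number of trees)."""
    votes = sum(tree(x) for tree in FOREST)
    return 1 if 2 * votes > len(FOREST) else 0


def cheapest_plan(x, target):
    """Enumerate every feature assignment and keep the cheapest one that the
    forest classifies as `target`. Returns (cost, changes) or None.

    This brute force is exponential in the number of features; it only stands
    in for a real planner or Max-SAT solver on a tiny example.
    """
    features = list(DOMAINS)
    best = None
    for values in product(*(DOMAINS[f] for f in features)):
        candidate = dict(zip(features, values))
        if forest_predict(candidate) != target:
            continue
        # Only features whose value differs from the input count as actions.
        changes = {f: v for f, v in candidate.items() if v != x[f]}
        cost = sum(CHANGE_COST[f] for f in changes)
        if best is None or cost < best[0]:
            best = (cost, changes)
    return best


applicant = {"income": "low", "debt": "high", "savings": "no"}
print(forest_predict(applicant))
print(cheapest_plan(applicant, target=1))
```

On this toy input the forest initially votes for class 0, and the cheapest plan found is to raise `income` to `high` and set `savings` to `yes`, at a total cost of 6. Cheaper single changes exist, but none of them wins a majority of trees, which is why the search has to reason over the whole forest rather than one tree at a time.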

Keywords

actionable knowledge extraction, machine learning, planning, random forest

Notes

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61502412, 61379066, and 61402395), Natural Science Foundation of the Jiangsu Province (BK20150459, BK20151314, and BK20140492), Natural Science Foundation of the Jiangsu Higher Education Institutions (15KJB520036), United States NSF grants (IIS-0534699, IIS-0713109, CNS-1017701), Microsoft Research New Faculty Fellowship, and the Research Innovation Program for Graduate Student in Jiangsu Province (KYLX16_1390).

Supplementary material

11704_2017_6315_MOESM1_ESM.ppt (322 kb)
Supplementary material, approximately 228 KB.

Copyright information

© Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. College of Information Engineering, Yangzhou University, Yangzhou, China
  2. Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, USA
  3. School of Management, Fudan University, Shanghai, China
  4. School of Computer and Control Engineering, University of Chinese Academy of Science, Beijing, China
