Extracting optimal actionable plans from additive tree models

  • Research Article
  • Frontiers of Computer Science

Abstract

Although machine learning has made substantial progress toward high generalization accuracy and efficiency, there is still very limited work on deriving meaningful decision-making actions from the resulting models. In many applications, such as advertising, recommendation systems, social networks, customer relationship management, and clinical prediction, users need not only accurate predictions but also suggestions on actions to achieve a desirable goal (e.g., high ad hit rates) or avert an undesirable predicted result (e.g., clinical deterioration). Existing work on extracting such actionability is scarce and limited to simple models such as a decision tree. The dilemma is that models with high accuracy are often more complex, which makes it harder to extract actionability from them.

In this paper, we propose an effective method to extract actionable knowledge from additive tree models (ATMs), one of the most widely used and best off-the-shelf classifiers. We rigorously formulate the optimal actionable planning (OAP) problem for a given ATM: extract an actionable plan for a given input so that it achieves a desirable output while maximizing the net profit. Based on a state space graph formulation, we first propose an optimal heuristic search method that aims to find an optimal solution. We then present a sub-optimal heuristic search with an admissible and consistent heuristic function, which can substantially improve the efficiency of the algorithm. Our experimental results demonstrate the effectiveness and efficiency of the proposed algorithms on several real datasets in the application domain of personal credit and banking.
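
To make the state space graph idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a hypothetical two-tree additive model, a hypothetical list of single-feature actions with fixed costs, and a simplified objective (find the cheapest plan whose model output reaches a threshold, rather than the paper's net-profit objective). The search is plain uniform-cost search, which is A* with a zero heuristic; all names and numbers are illustrative assumptions.

```python
import heapq
from itertools import count

# Hypothetical toy additive tree model: each tree scores a state, and the
# model output is the weighted sum of the tree scores.
def tree_1(x):
    if x["income"] >= 2:
        return 0.9
    return 0.6 if x["has_savings"] == 1 else 0.2

def tree_2(x):
    return 0.8 if x["debt"] <= 1 else 0.3

TREES = [(0.5, tree_1), (0.5, tree_2)]

def model_output(x):
    return sum(weight * tree(x) for weight, tree in TREES)

# Hypothetical actions: (feature, new value, cost of making the change).
ACTIONS = [
    ("income", 2, 5.0),
    ("has_savings", 1, 1.0),
    ("debt", 1, 2.0),
]

def state_key(state):
    """Hashable representation of a feature dictionary."""
    return tuple(sorted(state.items()))

def cheapest_plan(start, threshold):
    """Uniform-cost search over feature states.

    Returns (total_cost, plan) for the cheapest sequence of actions whose
    resulting state has model_output >= threshold, or None if none exists.
    """
    tie = count()  # tie-breaker so the heap never compares dicts
    frontier = [(0.0, next(tie), start, [])]
    best = {state_key(start): 0.0}
    while frontier:
        cost, _, state, plan = heapq.heappop(frontier)
        if cost > best.get(state_key(state), float("inf")):
            continue  # stale entry: a cheaper path to this state was found
        if model_output(state) >= threshold:
            return cost, plan
        for feature, value, action_cost in ACTIONS:
            if state[feature] == value:
                continue  # action would not change the state
            nxt = dict(state)
            nxt[feature] = value
            new_cost = cost + action_cost
            if new_cost < best.get(state_key(nxt), float("inf")):
                best[state_key(nxt)] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost, next(tie), nxt, plan + [(feature, value)]),
                )
    return None

if __name__ == "__main__":
    customer = {"income": 1, "has_savings": 0, "debt": 3}
    print(cheapest_plan(customer, threshold=0.65))
    # prints (3.0, [('has_savings', 1), ('debt', 1)])
```

In this toy instance no single action reaches the threshold, so the search returns the two-action plan with total cost 3.0. A nonzero admissible heuristic would plug in by adding its estimate to the priority used in the heap.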

Acknowledgements

This work was supported in part by the China Postdoctoral Science Foundation (2013M531527), the Fundamental Research Funds for the Central Universities (0110000037), the National Natural Science Foundation of China (Grant Nos. 61502412, 61033009, and 61175057), the Natural Science Foundation of Jiangsu Province (BK20150459), the Natural Science Foundation of the Jiangsu Higher Education Institutions (15KJB520036), the National Science Foundation, United States (IIS-0534699, IIS-0713109, CNS-1017701), and a Microsoft Research New Faculty Fellowship.

Author information

Corresponding author

Correspondence to Qiang Lu.

Additional information

Qiang Lu is currently an assistant professor in the College of Information Engineering at Yangzhou University, China. He received the BE and PhD degrees from the School of Computer Science and Technology, University of Science and Technology of China (USTC), China in 2007 and 2012, respectively. He has received support from the National Natural Science Foundation of China, the Natural Science Foundation of Jiangsu Province, the Joint PhD Training Scholarship of the China Scholarship Council, and the China Postdoctoral Science Foundation. He has published more than ten papers in journals and conference proceedings, including the ACM TIST, IEEE TSC, EAAI, AAAI’13, ICAPS’11, CloudCom’11, and IPC’11. He is a member of the ACM and the CCF. His research interests include data mining, automated planning and scheduling, parallel and distributed computing, and cloud computing.

Zhicheng Cui is now a second-year PhD candidate in the Department of Computer Science and Engineering at Washington University in St. Louis (WUSTL), USA, supervised by Prof. Yixin Chen. Prior to joining WUSTL, he received his BE in computer science from the University of Science and Technology of China (USTC), China in 2014. His research interests are data mining and machine learning, in the area of large-scale time series analysis.

Yixin Chen is an associate professor of computer science at Washington University in St. Louis, USA. He received the PhD degree in computer science from the University of Illinois at Urbana-Champaign, USA in 2005. His work on planning has won First-Class Prizes in the International Planning Competitions (2004 and 2006). He has won the Best Paper Award at AAAI (2010) and ICTAI (2005), and a Best Paper nomination at KDD (2009). He has received an Early Career Principal Investigator Award from the Department of Energy (2006) and a Microsoft Research New Faculty Fellowship (2007). Dr. Chen is a senior member of IEEE. He serves as an associate editor for the IEEE Transactions on Knowledge and Data Engineering and the ACM Transactions on Intelligent Systems and Technology. His research interests include nonlinear optimization, constrained search, planning and scheduling, data mining, and data warehousing.

Xiaoping Chen is a full professor with the School of Computer Science and Technology and the Director of the Robotics Lab and the Center for Artificial Intelligence Research at the University of Science and Technology of China (USTC), China. He received his PhD in computer science from USTC in 1997. He established and has led the USTC Robotics Lab and its robot team, WrightEagle, which has won 7 championships and 11 runner-up places in RoboCup world championships. Prof. Chen founded and has led the KeJia Project, which won the Best Autonomous Robotics Award at the IJCAI 2013 Video Competition and the First Prize for General Robot Skills at the IJCAI 2013 Robot Competition. He has published about 130 papers, including some that appeared in AIJ, IJCAI, AAAI, AAMAS, KR, UAI, ICLP, ICAPS, IJHR, IROS, and JHRI. In 2010, he won the USTC President Award for Research Excellence, the highest research award at USTC, presented to 1 or 2 scientists annually. Prof. Chen works in the fields of artificial intelligence and intelligent service robotics. His current research interests include problem-solving with open knowledge, situated NLP, semantic perception, and automated planning.

Cite this article

Lu, Q., Cui, Z., Chen, Y. et al. Extracting optimal actionable plans from additive tree models. Front. Comput. Sci. 11, 160–173 (2017). https://doi.org/10.1007/s11704-016-5273-4
