Bi-criteria optimization problems for decision rules

Abstract

We consider bi-criteria optimization problems for decision rules and rule systems relative to length and coverage. We study both decision tables with many-valued decisions, in which each row is associated with a set of decisions, and decision tables with single-valued decisions, in which each row has a single decision. Short rules are more understandable; rules covering more rows are more general. Both problems, minimization of length and maximization of coverage of rules, are NP-hard. We create dynamic programming algorithms that find the minimum length and the maximum coverage of rules and construct the set of Pareto optimal points for the corresponding bi-criteria optimization problem. This approach is applicable to medium-sized decision tables. However, it also allows us to evaluate the quality of various heuristics for decision rule construction that are applicable to relatively large datasets. We can evaluate these heuristics from the point of view of (i) single-criterion optimization, by comparing the length or coverage of rules constructed by heuristics; and (ii) bi-criteria optimization, by measuring the distance from the point (length, coverage) corresponding to a heuristic to the set of Pareto optimal points. The presented results show that the best heuristics from the point of view of bi-criteria optimization are not always the best ones from the point of view of single-criterion optimization.
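The bi-criteria evaluation described above can be illustrated with a minimal sketch. It is not the dynamic programming algorithm of the paper: it only filters a given finite list of (length, coverage) points down to the Pareto optimal ones (length minimized, coverage maximized) and then measures how far a heuristic's point lies from that set. The function names, the sample points, and the use of an unnormalized Euclidean distance are all assumptions made for illustration; the abstract does not specify the distance measure.

```python
from math import hypot


def pareto_front(points):
    """Return the Pareto optimal (length, coverage) pairs.

    Length is minimized and coverage is maximized. A point is kept
    only if no other point is at least as short and covers at least
    as many rows, with one of the two strictly better.
    """
    front = []
    best_coverage = float("-inf")
    # Sort by increasing length; among equal lengths, larger coverage first.
    for length, coverage in sorted(set(points), key=lambda p: (p[0], -p[1])):
        # Every earlier point is no longer than this one, so this point
        # survives only if it strictly improves on the best coverage so far.
        if coverage > best_coverage:
            front.append((length, coverage))
            best_coverage = coverage
    return front


def distance_to_front(point, front):
    """Euclidean distance from a point to the nearest Pareto optimal point.

    Assumption: plain, unnormalized Euclidean distance is used here
    purely for illustration.
    """
    return min(hypot(point[0] - l, point[1] - c) for l, c in front)


# Hypothetical (length, coverage) pairs for rules describing one row.
rules = [(1, 2), (2, 5), (2, 3), (3, 5), (4, 9)]
front = pareto_front(rules)

# Hypothetical point produced by some heuristic.
heuristic_point = (3, 5)
gap = distance_to_front(heuristic_point, front)
```

In this toy example the rule (3, 5) is dominated by (2, 5), which is shorter and covers the same number of rows, so a heuristic that returns (3, 5) lies at a positive distance from the Pareto set even though its coverage alone looks competitive.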

Fig. 1
Fig. 2

Acknowledgements

Research reported in this publication was supported by King Abdullah University of Science and Technology (KAUST). We are greatly indebted to the anonymous reviewer for useful comments and suggestions.

Author information

Corresponding author

Correspondence to Fawaz Alsolami.

About this article

Cite this article

Alsolami, F., Amin, T., Chikalov, I. et al. Bi-criteria optimization problems for decision rules. Ann Oper Res 271, 279–295 (2018). https://doi.org/10.1007/s10479-018-2905-0

Keywords

  • Decision tables with many-valued decisions
  • Systems of decision rules
  • Dynamic programming
  • Pareto optimal points
  • Greedy heuristics