Logical analysis of multiclass data with relaxed patterns

Annals of Operations Research

Abstract

An efficient and robust algorithm based on mixed integer linear programming is proposed to extend the Logical Analysis of Data (LAD) methodology to multiclass classification problems, where One-vs-Rest learning models are constructed to classify observations into predefined classes. The proposed algorithm uses two control parameters, homogeneity and prevalence, to identify relaxed (fuzzy) patterns in multiclass datasets. The utility of the proposed method is demonstrated through experiments on multiclass benchmark datasets. Numerical experiments show that the efficiency and performance of the proposed multiclass LAD method with relaxed patterns are comparable to, if not better than, those of previously developed LAD-based multiclass classification methods as well as other well-known supervised learning methods.
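As an illustration only, the two control parameters can be sketched on a toy binarized dataset. The abstract does not define them, so the sketch assumes the definitions commonly used in the LAD literature: in a One-vs-Rest setting, the prevalence of a pattern for a target class is the fraction of that class's observations the pattern covers, and its homogeneity is the fraction of covered observations that belong to the target class. The function names, the dictionary encoding of a pattern, and the thresholds are hypothetical, and the sketch only evaluates a given pattern; it does not reproduce the mixed integer linear programming model that generates patterns.

```python
from typing import Dict, Hashable, Sequence, Tuple


def covers(pattern: Dict[int, int], row: Sequence[int]) -> bool:
    """A pattern is a conjunction of literals: feature index -> required 0/1 value."""
    return all(row[j] == value for j, value in pattern.items())


def pattern_stats(
    pattern: Dict[int, int],
    rows: Sequence[Sequence[int]],
    labels: Sequence[Hashable],
    target: Hashable,
) -> Tuple[float, float]:
    """Return (prevalence, homogeneity) of `pattern` for class `target` vs the rest.

    Assumed definitions (not taken from the abstract):
      prevalence  = covered target observations / all target observations
      homogeneity = covered target observations / all covered observations
    """
    covered_labels = [y for x, y in zip(rows, labels) if covers(pattern, x)]
    n_target = sum(1 for y in labels if y == target)
    hits = sum(1 for y in covered_labels if y == target)
    prevalence = hits / n_target if n_target else 0.0
    homogeneity = hits / len(covered_labels) if covered_labels else 0.0
    return prevalence, homogeneity


def is_relaxed_pattern(
    pattern: Dict[int, int],
    rows: Sequence[Sequence[int]],
    labels: Sequence[Hashable],
    target: Hashable,
    min_prevalence: float,
    min_homogeneity: float,
) -> bool:
    """Accept a pattern that meets both (hypothetical) thresholds.

    A pure pattern would need homogeneity 1.0; a relaxed pattern may also
    cover some observations from the other classes.
    """
    prevalence, homogeneity = pattern_stats(pattern, rows, labels, target)
    return prevalence >= min_prevalence and homogeneity >= min_homogeneity


# Toy binarized data with three classes (made up for illustration).
toy_rows = [(1, 1, 0), (1, 1, 1), (1, 0, 1), (0, 1, 0), (1, 1, 0), (0, 0, 1)]
toy_labels = ["A", "A", "A", "B", "B", "C"]
toy_pattern = {0: 1, 1: 1}  # feature 0 is 1 AND feature 1 is 1

prevalence_a, homogeneity_a = pattern_stats(toy_pattern, toy_rows, toy_labels, "A")
```

On this toy data the pattern covers two of the three class "A" observations and one class "B" observation, so it is not a pure pattern for "A" but would pass as a relaxed one under a homogeneity threshold of, say, 0.6.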



Author information

Corresponding author

Correspondence to Munevver Mine Subasi.


About this article

Cite this article

Bain, T.C., Avila-Herrera, J.F., Subasi, E. et al. Logical analysis of multiclass data with relaxed patterns. Ann Oper Res 287, 11–35 (2020). https://doi.org/10.1007/s10479-019-03389-7


  • DOI: https://doi.org/10.1007/s10479-019-03389-7
