Abstract
This paper presents a new rule-based classification method that partitions the data under analysis into spherical patterns. The method has two main strengths. First, it exploits the efficiency of distance metric-based clustering to quickly group similar data into spherical patterns. Second, each spherical pattern is a trait shared by data of one type only, so the patterns can be used directly to classify new data. Numerical studies with public machine learning datasets from Lichman (2013), in comparison with well-established classification methods from Boros et al. (IEEE Transactions on Knowledge and Data Engineering, 12, 292–306, 2000) and the Waikato Environment for Knowledge Analysis (http://www.cs.waikato.ac.nz/ml/weka/), demonstrate both strengths.
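To make the idea of a spherical pattern concrete, the following is a minimal sketch under stated assumptions, not the algorithm of the paper. It assumes Euclidean distance, a simple greedy covering in which each uncovered point becomes a center, and a nearest-surface rule for new data; the names `build_spheres` and `classify` are hypothetical and introduced only for illustration.

```python
import math


def build_spheres(points, labels):
    """Greedily cover the data with spheres that each hold one class only.

    Illustrative assumption: every still-uncovered point becomes a center, and
    its radius grows up to, but not including, the nearest point of another class.
    """
    spheres = []  # each entry is (center, radius, label)
    covered = [False] * len(points)
    for i, (p, y) in enumerate(zip(points, labels)):
        if covered[i]:
            continue
        # Distance from p to the nearest point of a different class.
        other = [math.dist(p, q) for q, z in zip(points, labels) if z != y]
        limit = min(other) if other else math.inf
        # Largest same-class distance strictly below that limit.
        same = [math.dist(p, q) for q, z in zip(points, labels) if z == y]
        radius = max((d for d in same if d < limit), default=0.0)
        spheres.append((p, radius, y))
        # Mark every same-class point inside the new sphere as covered.
        for j, q in enumerate(points):
            if labels[j] == y and math.dist(p, q) <= radius:
                covered[j] = True
    return spheres


def classify(spheres, x):
    """Label x by the sphere whose surface is nearest (negative gap = inside)."""
    _, _, label = min(spheres, key=lambda s: math.dist(s[0], x) - s[1])
    return label


# Toy data: two well-separated groups in the plane.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
labels = ["a", "a", "a", "b", "b", "b"]
spheres = build_spheres(points, labels)
print(classify(spheres, (0.5, 0.5)))  # a
print(classify(spheres, (4.0, 4.0)))  # b
```

On this toy data the sketch yields one sphere per class. Because each radius stops short of the nearest point of another class, no sphere contains training data of more than one type, which is the property the abstract relies on for classification.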
References
Aha, D., Kibler, D., Albert, M. (1991). Instance-based learning algorithms. Machine Learning, 6(1), 37–66.
Alexe, G., & Hammer, P.L. (2006a). Spanned patterns for the logical analysis of data. Discrete Applied Mathematics, 154(7), 1039–1049.
Alexe, S., & Hammer, P.L. (2006b). Accelerated algorithm for pattern detection in logical analysis of data. Discrete Applied Mathematics, 154(7), 1050–1063.
Alexe, G., Alexe, S., Bonates, T., Kogan, A. (2007). Logical analysis of data – the vision of Peter L. Hammer. Annals of Mathematics and Artificial Intelligence, 49, 265–312.
Balcan, M.-F., Blum, A., Vempala, S. (2008). A discriminative framework for clustering via similarity functions. In Proceedings of the Fortieth ACM Symposium on Theory of Computing (pp. 671–680).
Bazaraa, M., Sherali, H., Shetty, C. (2006). Nonlinear programming: theory and algorithms. New York: Wiley.
Beasley, J., & Chu, P. (1996). A genetic algorithm for the set covering problem. European Journal of Operational Research, 94, 392–404.
Bennett, K., & Mangasarian, O. (1992). Robust linear programming discrimination of two linearly inseparable sets. Optimization Methods and Software, 1, 23–34.
Bennett, K., & Mangasarian, O. (1994). Bilinear separation of two sets in n-space. Computational Optimization and Applications, 2, 207–227.
Bonates, T., Hammer, P. L., Kogan, A. (2008). Maximum patterns in datasets. Discrete Applied Mathematics, 156(6), 846–861.
Boros, E., Hammer, P.L., Ibaraki, T., Kogan, A., Mayoraz, E., Muchnik, I. (2000). An implementation of logical analysis of data. IEEE Transactions on Knowledge and Data Engineering, 12, 292–306.
Bradley, P., & Mangasarian, O. (2000). Massive data discrimination via linear support vector machines. Optimization Methods and Software, 13(1), 1–20.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Chvatal, V. (1979). A greedy heuristic for the set covering problem. Mathematics of Operations Research, 4, 233–235.
Cohen, W. W. (1995). Fast effective rule induction. In Proceedings of the Twelfth International Conference on Machine Learning (pp. 115–123).
Cortes, C., & Vapnik, V. (1995). Support vector networks. Machine Learning, 20, 273–297.
Eick, C. F., Zeidat, N., Zhao, Z. (2004). Supervised clustering – algorithms and benefits. In 16th IEEE International Conference on Tools with Artificial Intelligence (pp. 774–776).
Falk, J., & Lopez-Cardona, E. (1997). The surgical separation of sets. Journal of Global Optimization, 11, 433–462.
Frank, E., & Witten, I. H. (1998). Generating accurate rule sets without global optimization. In Proceedings of the Fifteenth International Conference on Machine Learning (pp. 144–151).
Freund, Y., & Schapire, R. E. (1996). Experiments with a new boosting algorithm. In Thirteenth International Conference on Machine Learning (pp. 148–156).
Fung, G., & Mangasarian, O. (2003). Finite Newton method for Lagrangian support vector machine classification. Neurocomputing, 55, 39–55.
Guo, C., & Ryoo, H.S. (2012). Compact MILP models for optimal and Pareto-optimal LAD patterns. Discrete Applied Mathematics, 160, 2339–2348.
Guo, C., & Ryoo, H.S. (2018). On Pareto-optimal Boolean logical patterns for numerical data. Submitted for publication.
Gurobi Optimization Inc. (2017). Gurobi optimizer reference manual. http://www.gurobi.com.
Hammer, P.L., Kogan, A., Simeone, B., Szedmak, S. (2004). Pareto-optimal patterns in logical analysis of data. Discrete Applied Mathematics, 144, 79–102.
Haykin, S. (1999). Neural networks: a comprehensive foundation. Englewood Cliffs: Prentice Hall.
Hoffman, K., & Padberg, M. (1993). Solving airline crew scheduling problems by branch-and-cut. Management Science, 39(6), 657–682.
Jain, A., Murty, M., Flynn, P. (1999). Data clustering: a review. ACM Computing Surveys, 31(3), 264–323.
Jain, A. (2010). Data clustering: 50 years beyond k-means. Pattern Recognition Letters, 31, 651–666.
John, G., & Langley, P. (1995). Estimating continuous distributions in Bayesian classifiers. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (pp. 338–345).
Kim, K., & Ryoo, H.S. (2007a). Data separation via a finite number of discriminant functions: a global optimization approach. Applied Mathematics and Computation, 190(1), 476–489.
Kim, K., & Ryoo, H.S. (2007b). Nonlinear separation of data via mixed 0-1 integer and linear programming. Applied Mathematics and Computation, 193(1), 183–196.
Kim, K., & Ryoo, H.S. (2008). A LAD-based method for selecting short oligo probes for genotyping applications. OR Spectrum, 30(2), 249–268.
Kohavi, R. (1995). The power of decision tables. In Proceedings of the Eighth European Conference on Machine Learning (pp. 179–189).
Kolesar, P., & Walker, W. (1974). An algorithm for the dynamic relocation of fire companies. Operations Research, 22, 249–274.
Lichman, M. (2013). UCI machine learning repository. http://archive.ics.uci.edu/ml.
Lorena, L., & Lopes, F. (1994). A surrogate heuristic for set covering problems. European Journal of Operational Research, 79, 138–150.
Ma, Z., & Ryoo, H.S. (2012). General set covering for feature selection in data mining. Management Science and Financial Engineering, 18(2), 13–17.
Mangasarian, O. (1965). Linear and nonlinear separation of patterns by linear programming. Operations Research, 13(3), 444–452.
Mangasarian, O. (1968). Multisurface method of pattern separation. IEEE Transactions on Information Theory, 14(6), 801–807.
Mangasarian, O. (1993). Mathematical programming in neural networks. ORSA Journal on Computing, 5(4), 349–360.
Platt, J. (1999). Fast training of support vector machines using sequential minimal optimization. In Advances in kernel methods: support vector learning (pp. 185–208). Cambridge: MIT Press.
Quinlan, R. (1993). C4.5: Programs for machine learning. San Mateo: Morgan Kaufmann Publishers.
Ryoo, H.S., & Jang, I. (2009). MILP approach to pattern generation in logical analysis of data. Discrete Applied Mathematics, 157, 749–761.
Ullman, J. (1973). Pattern recognition techniques. London: Crane.
Vapnik, V. (1998). Statistical learning theory. New York: Wiley-Interscience.
Vapnik, V. (2000). The nature of statistical learning theory, 2nd edn. Berlin: Springer.
Wedelin, D. (1995). An algorithm for large scale 0-1 integer programming with application to airline crew scheduling. Annals of Operations Research, 57, 283–301.
Yan, K., & Ryoo, H.S. (2017a). 0-1 multilinear programming as a unifying theory for LAD pattern generation. Discrete Applied Mathematics, 218, 21–39.
Yan, K., & Ryoo, H.S. (2017b). Strong valid inequalities for Boolean logical pattern generation. Journal of Global Optimization, 69(1), 183–230.
Yan, K., & Ryoo, H.S. (2020). Cliques for multi-term linearization of 0-1 multilinear program for Boolean logical pattern generation. In Optimization of Complex Systems: Theory, Models, Algorithms and Applications, Advances in Intelligent Systems and Computing, 991, 376–386.
Funding
This work was supported by a research grant awarded to H.S. Ryoo by the Samsung Science and Technology Foundation under Project Number SSTF-BA1501-03 and by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (Grant Number: 2017R1D1A1A02018729).
Cite this article
Ma, Z., Ryoo, H.S. Spherical Classification of Data, a New Rule-Based Learning Method. J Classif 38, 44–71 (2021). https://doi.org/10.1007/s00357-019-09355-z
DOI: https://doi.org/10.1007/s00357-019-09355-z