
A First Attempt on Monotonic Training Set Selection

  • J.-R. Cano
  • S. García
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10870)

Abstract

Monotonicity constraints frequently appear in real-life problems. Many of the monotonic classifiers used in these cases require that the input data satisfy the monotonicity restrictions. This contribution proposes the use of training set selection to choose the most representative instances, improving the performance of monotonic classifiers while fulfilling the monotonicity constraints. We have carried out an experiment on 30 data sets to demonstrate the benefits of our proposal.
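Since the abstract hinges on the notion of monotonicity constraints between training instances, the following minimal Python sketch may help make it concrete: it checks pairwise monotonicity violations and greedily removes the most conflicting instances until the remaining training set is monotone. This is only an assumed illustration of the constraint, not the training set selection algorithm proposed in the paper; the function names (violates, greedy_monotone_selection) are hypothetical.

    # Illustrative sketch only (assumption): NOT the method of the paper, just a
    # minimal demonstration of the monotonicity constraint it builds on. A pair of
    # instances violates monotonicity when one dominates the other on every feature
    # but carries a strictly smaller class label.
    import numpy as np

    def violates(x_i, y_i, x_j, y_j):
        """True if instances i and j break the monotonicity constraint."""
        if np.all(x_i <= x_j) and y_i > y_j:
            return True
        if np.all(x_j <= x_i) and y_j > y_i:
            return True
        return False

    def greedy_monotone_selection(X, y):
        """Repeatedly drop the instance involved in the most violations until the
        remaining training set is fully monotone (hypothetical selection scheme)."""
        keep = list(range(len(X)))
        while True:
            counts = {i: 0 for i in keep}
            for a in range(len(keep)):
                for b in range(a + 1, len(keep)):
                    i, j = keep[a], keep[b]
                    if violates(X[i], y[i], X[j], y[j]):
                        counts[i] += 1
                        counts[j] += 1
            worst = max(counts, key=counts.get)
            if counts[worst] == 0:   # no violating pairs left: data set is monotone
                return keep
            keep.remove(worst)

    # Usage: instances 1 and 2 form a violating pair (x1 <= x2 feature-wise, y1 > y2).
    X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
    y = np.array([0, 1, 0])
    print(greedy_monotone_selection(X, y))  # prints [0, 2]

The greedy removal above is just one plausible way to restore monotonicity; selecting which instances to keep without discarding useful information is precisely the problem the paper addresses.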

Keywords

Monotonic classification · Ordinal classification · Training set selection · Data preprocessing · Machine learning

Acknowledgement

This work was supported by project TIN2014-57251-P of the Spanish “Ministerio de Economía y Competitividad”, by the “Fondo Europeo de Desarrollo Regional” (FEDER) under project TEC2015-69496-R, and by the BBVA Foundation project 75/2016 BigDaPTOOLS.


Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science, University of Jaén, EPS of Linares, Linares, Spain
  2. Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain
