Prototypes Generation from Multi-label Datasets Based on Granular Computing

  • Marilyn Bello
  • Gonzalo Nápoles
  • Koen Vanhoof
  • Rafael Bello
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11896)

Abstract

Data reduction techniques play a key role in instance-based classification because they lower the amount of data to be processed. Prototype generation aims to build a reduced training set that still yields accurate results with less effort, which translates into a significant reduction in both the space and time requirements of the algorithms. This issue is particularly relevant in multi-label classification, a generalization of multiclass classification that allows objects to belong to several classes simultaneously. Although this field is quite active in terms of learning algorithms, there is a lack of prototype generation methods. In this research, we propose three prototype generation methods for multi-label datasets based on Granular Computing. The experimental results show that these methods condense the examples into a smaller set of prototypes without affecting the overall performance.
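
To make the idea of prototype generation on multi-label data concrete, the following is a minimal sketch and not one of the three methods proposed in the paper. It assumes a deliberately naive strategy: instances that share exactly the same label set are averaged into a single prototype, and a new object receives the label set of its nearest prototype. The function names `generate_prototypes` and `predict`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def generate_prototypes(X, Y):
    """Naive illustration (an assumption, not the paper's method):
    group instances by their exact label set and replace each group
    with its feature-wise mean."""
    groups = defaultdict(list)
    for features, labels in zip(X, Y):
        groups[frozenset(labels)].append(features)

    prototypes = []
    for label_set, members in groups.items():
        # Average every feature column over the members of the group.
        centroid = [sum(column) / len(members) for column in zip(*members)]
        prototypes.append((centroid, set(label_set)))
    return prototypes


def predict(prototypes, x):
    """Return the label set of the prototype closest to x
    (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, x))

    _, labels = min(prototypes, key=lambda p: squared_distance(p[0]))
    return labels


# Toy multi-label dataset: each object may carry several labels at once.
X = [[0.0, 0.1], [0.2, 0.0], [1.0, 1.0], [0.9, 1.1], [0.1, 1.0]]
Y = [{"a"}, {"a"}, {"a", "b"}, {"a", "b"}, {"b"}]

prototypes = generate_prototypes(X, Y)
predicted = predict(prototypes, [1.0, 0.9])
```

Here five training examples are reduced to three prototypes, one per distinct label set. Grouping by exact label set is a simplification that would scale poorly when many label combinations occur; it is used only to show what a reduced multi-label training set looks like.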

Keywords

  • Multi-label classification
  • Prototype generation
  • Granular Computing
  • Rough Set Theory

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Marilyn Bello (1, 2), corresponding author
  • Gonzalo Nápoles (2)
  • Koen Vanhoof (2)
  • Rafael Bello (1)
  1. Computer Science Department, Universidad Central de Las Villas, Santa Clara, Cuba
  2. Faculty of Business Economics, Hasselt University, Hasselt, Belgium
