PreCLAS: An Evolutionary Tool for Unsupervised Feature Selection

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12344)

Abstract

Several research areas face data matrices that are not suitable for traditional clustering, regression, or classification strategies. For example, biological so-called omic problems present data matrices with thousands or millions of rows and fewer than a hundred columns. This matrix structure hinders traditional data analysis methods, so some means of reducing the number of rows is needed. This article presents an unsupervised approach called PreCLAS for preprocessing matrices with dimension problems to obtain data that are apt for clustering and classification strategies. PreCLAS was implemented as an unsupervised strategy that aims at finding a submatrix with a drastically reduced number of rows, preferring those rows that together present some group structure. Experimentation was carried out in two stages. First, to assess its functionality, a benchmark dataset was studied in a clustering context. Then, a microarray dataset with genomic information was analyzed, and PreCLAS was used to select informative genes in the context of classification strategies. The experiments showed that the new method succeeds at drastically reducing the number of rows of a matrix, performing unsupervised feature selection for both classification and clustering problems.
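The idea of searching for a small set of rows that together show group structure can be illustrated with a minimal sketch. This is not the PreCLAS implementation: the abstract does not specify the fitness function or the evolutionary operators, so everything below is an assumption made for illustration. The fitness (`group_structure_score`, the share of sample variance explained by a two-group split), the genetic operators in `select_rows`, and the toy data from `make_toy_matrix` are all hypothetical names and choices.

```python
import random

def group_structure_score(matrix, rows):
    """Assumed fitness: share of sample variance (0..1) explained by a 2-group split."""
    n_samples = len(matrix[0])
    # Each sample (column) becomes a point whose coordinates are the selected rows.
    points = [[matrix[r][c] for r in rows] for c in range(n_samples)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def mean(pts):
        return [sum(col) / len(pts) for col in zip(*pts)]

    # Deterministic start: the two samples that lie farthest apart.
    _, i, j = max((sqdist(points[i], points[j]), i, j)
                  for i in range(n_samples) for j in range(i + 1, n_samples))
    centers = [points[i], points[j]]
    for _ in range(10):  # a few 2-means (Lloyd) iterations
        groups = [[], []]
        for p in points:
            nearest = 1 if sqdist(p, centers[1]) < sqdist(p, centers[0]) else 0
            groups[nearest].append(p)
        if not groups[0] or not groups[1]:
            return 0.0  # no split found, so no group structure
        centers = [mean(g) for g in groups]
    overall = mean(points)
    total = sum(sqdist(p, overall) for p in points)
    within = sum(sqdist(p, centers[k]) for k in (0, 1) for p in groups[k])
    return 0.0 if total == 0 else 1.0 - within / total

def select_rows(matrix, k, pop_size=30, generations=40, mutation_rate=0.2, seed=0):
    """Hypothetical genetic search for k rows whose samples show group structure."""
    rng = random.Random(seed)
    n_rows = len(matrix)

    def fitness(individual):
        return group_structure_score(matrix, individual)

    population = [tuple(sorted(rng.sample(range(n_rows), k))) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection; the best survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = sorted(set(a) | set(b))     # crossover: draw k rows from both parents
            child = set(rng.sample(pool, k))
            if rng.random() < mutation_rate:   # mutation: swap one row for another
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice([r for r in range(n_rows) if r not in child]))
            children.append(tuple(sorted(child)))
        population = parents + children
    return max(population, key=fitness)

def make_toy_matrix(seed=1):
    """Toy data: 3 rows that split 8 samples into two groups, plus 17 noise rows."""
    rng = random.Random(seed)
    informative = [[0.0] * 4 + [5.0] * 4 for _ in range(3)]
    noise = [[rng.random() for _ in range(8)] for _ in range(17)]
    return informative + noise

if __name__ == "__main__":
    toy = make_toy_matrix()
    chosen = select_rows(toy, k=3)
    print(chosen, group_structure_score(toy, chosen))
```

The subset is scored jointly rather than row by row, which mirrors the stated preference for rows that together present some group structure. A real application to omic data would need a far cheaper fitness evaluation than this pairwise loop.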

This work is supported by CONICET (Grant number 112-2017-0100829) and Secretaría de Ciencia y Tecnología (UNS) (Grant number 24/N042).

Author information

Corresponding author

Correspondence to Ignacio Ponzoni.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Carballido, J.A., Ponzoni, I., Cecchini, R.L. (2020). PreCLAS: An Evolutionary Tool for Unsupervised Feature Selection. In: de la Cal, E.A., Villar Flecha, J.R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2020. Lecture Notes in Computer Science, vol 12344. Springer, Cham. https://doi.org/10.1007/978-3-030-61705-9_15

  • DOI: https://doi.org/10.1007/978-3-030-61705-9_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61704-2

  • Online ISBN: 978-3-030-61705-9

  • eBook Packages: Computer Science, Computer Science (R0)