Scalability Analysis of ANN Training Algorithms with Feature Selection

  • Verónica Bolón-Canedo
  • Diego Peteiro-Barral
  • Amparo Alonso-Betanzos
  • Bertha Guijarro-Berdiñas
  • Noelia Sánchez-Maroño
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7023)


The advent of high-dimensionality problems has brought new challenges for machine learning researchers, who are now interested not only in the accuracy of algorithms but also in their scalability. In this context, machine learning can take advantage of feature selection methods to deal with large-scale databases. Feature selection can reduce the temporal and spatial complexity of learning, turning an impracticable algorithm into a practical one. In this work, the influence of feature selection on the scalability of four of the best-known training algorithms for feedforward artificial neural networks (ANNs) is studied. Six different measures are considered to evaluate scalability, which allows a final score to be established for comparing the algorithms. Results show that, when a feature selection step is included, ANN training algorithms perform much better in terms of scalability.
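As a rough illustration of why a feature selection step can reduce the spatial complexity of learning, the following minimal sketch ranks features with a simple correlation filter and compares the number of trainable weights of a one-hidden-layer feedforward network before and after selection. Everything in it is an assumption made for illustration: the toy data, the Pearson-correlation filter, the network size, and the helper names (`select_top_k`, `weight_count`, `make_toy_data`) are hypothetical and are not the filters, training algorithms, or scalability measures used in the paper.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def select_top_k(X, y, k):
    """Filter-style selection: keep the k features most correlated with the target."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def project(X, selected):
    """Keep only the selected feature columns of X."""
    return [[row[j] for j in selected] for row in X]


def weight_count(n_inputs, n_hidden, n_outputs):
    """Trainable parameters of a one-hidden-layer feedforward network, biases included."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs


def make_toy_data(n_samples=200, n_features=50, seed=0):
    """Hypothetical data: only features 0 and 1 determine the class label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [1.0 if row[0] + row[1] > 0 else 0.0 for row in X]
    return X, y


X, y = make_toy_data()
selected = select_top_k(X, y, k=5)
X_reduced = project(X, selected)

full = weight_count(len(X[0]), n_hidden=10, n_outputs=1)
reduced = weight_count(len(X_reduced[0]), n_hidden=10, n_outputs=1)
print("selected features:", selected)
print("weights without selection:", full)
print("weights with selection:", reduced)
```

The weight count is only a crude proxy for memory and per-iteration cost. A real study would measure quantities such as training time and test error directly on the trained networks.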


Keywords: Feature Selection, Training Time, Feature Subset, Test Error, Subset Evaluation





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Verónica Bolón-Canedo (1)
  • Diego Peteiro-Barral (1)
  • Amparo Alonso-Betanzos (1)
  • Bertha Guijarro-Berdiñas (1)
  • Noelia Sánchez-Maroño (1)
  1. Laboratory for Research and Development in Artificial Intelligence (LIDIA), Computer Science Dept., University of A Coruña, A Coruña, Spain
