Cluster Computing, Volume 22, Supplement 2, pp 3953–3960

An ACO–ANN based feature selection algorithm for big data

  • R. Joseph Manoj
  • M. D. Anto Praveena
  • K. Vijayakumar


Feature selection is the process of choosing a subset of a dataset's features according to some criterion. It can reduce the dimensionality of very large datasets, removing unnecessary data from the data source so that big data analytics can produce more accurate predictions or outputs. The proposed work implements a feature selection algorithm for text categorization using ant colony optimization (ACO) and an artificial neural network (ANN). This hybrid approach was simulated on the Reuters data set, and the results showed it to be efficient.
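The general idea of pairing ACO with a neural evaluator can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the update rule, the parameters (`n_ants`, `n_iter`, `k`, `rho`), and all function names here are assumptions made for illustration. A single logistic neuron stands in for the ANN, and a small synthetic dataset stands in for the Reuters corpus.

```python
import math
import random


def train_neuron(X, y, cols, epochs=20, lr=0.1):
    """Train one logistic neuron (a stand-in for the ANN) on the chosen columns."""
    w, b = [0.0] * len(cols), 0.0
    for _ in range(epochs):
        for row, target in zip(X, y):
            z = b + sum(wi * row[c] for wi, c in zip(w, cols))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = target - 1.0 / (1.0 + math.exp(-z))
            for i, c in enumerate(cols):
                w[i] += lr * err * row[c]
            b += lr * err
    return w, b


def accuracy(X, y, cols, w, b):
    """Fraction of rows whose predicted class (z > 0 means class 1) matches the label."""
    hits = 0
    for row, target in zip(X, y):
        z = b + sum(wi * row[c] for wi, c in zip(w, cols))
        hits += int((z > 0) == (target == 1))
    return hits / len(y)


def aco_select(X_tr, y_tr, X_val, y_val, n_ants=8, n_iter=10, k=2, rho=0.2, seed=0):
    """Hypothetical ACO loop: ants sample k features, the neuron scores each subset."""
    rng = random.Random(seed)
    n_features = len(X_tr[0])
    pheromone = [1.0] * n_features
    best_cols, best_score = None, -1.0
    for _ in range(n_iter):
        trails = []
        for _ in range(n_ants):
            # Sample k distinct features with probability proportional to pheromone.
            candidates, cols = list(range(n_features)), []
            for _ in range(k):
                weights = [pheromone[c] for c in candidates]
                pick = rng.choices(candidates, weights=weights)[0]
                candidates.remove(pick)
                cols.append(pick)
            cols.sort()
            w, b = train_neuron(X_tr, y_tr, cols)
            score = accuracy(X_val, y_val, cols, w, b)
            trails.append((cols, score))
            if score > best_score:
                best_cols, best_score = cols, score
        # Evaporate old pheromone, then deposit in proportion to validation accuracy.
        pheromone = [(1.0 - rho) * p for p in pheromone]
        for cols, score in trails:
            for c in cols:
                pheromone[c] += score
    return best_cols, best_score


def make_data(n, rng):
    """Synthetic data (an assumption): features 0 and 2 carry the label, the rest are noise."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        signal = 1.0 if label else -1.0
        X.append([signal + rng.gauss(0, 0.3), rng.gauss(0, 1),
                  signal + rng.gauss(0, 0.3), rng.gauss(0, 1),
                  rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y


data_rng = random.Random(1)
X_train, y_train = make_data(200, data_rng)
X_valid, y_valid = make_data(100, data_rng)
selected, val_acc = aco_select(X_train, y_train, X_valid, y_valid)
print("selected features:", selected, "validation accuracy:", val_acc)
```

In this sketch the classifier's validation accuracy acts as the fitness that guides pheromone deposits, so features that repeatedly appear in accurate subsets become more likely to be sampled again. For text categorization, the columns would be term features such as word weights rather than synthetic values.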


Artificial neural network, Feature selection, Ant colony optimization



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. St. Joseph’s College of Engineering, Chennai, India
  2. Sathyabama Institute of Science & Technology, Chennai, India
  3. St. Joseph’s Institute of Technology, Chennai, India
