Proposed Methodology

  • Laith Mohammad Qasim Abualigah
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 816)

Abstract

This chapter presents the proposed methodology used to achieve the research objectives of the present study: (i) a new weighting scheme, the length feature weight (LFW), in Sect. 4.4.1; (ii) three models for the text feature selection problem (TFSP), compared to find the best algorithm for feature selection (FS), in Sect. 4.5; (iii) a dynamic dimension reduction (DR) technique in Sect. 4.6; (iv) three models of the krill herd algorithm (KHA) for the text document clustering problem (TDCP) in Sect. 4.8; (v) a new multi-objective function for enhancing the clustering decision of the local search algorithm in Sect. 4.8; (vi) experiments and results in Sect. 4.9; and (vii) the conclusion in Sect. 4.10.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Universiti Sains Malaysia, Penang, Malaysia
