Using cloud effectively in concept based text mining using grey wolf self organizing feature map

Cluster Computing

Abstract

Cloud computing has become integral to business and is expected to change the information technology (IT) landscape. It delivers services over the internet on a pay-as-you-go model, which offers advantages such as no up-front cost, a smaller IT staff, and lower operating cost. Text mining is a technology for retrieving data from large databases, and cloud platforms use it to retrieve data efficiently from cloud data centres. Text document clustering plays an important role in providing navigation and intuitive browsing mechanisms, because it organizes large amounts of information into a small number of clusters. Clustering methods commonly represent documents as a bag of words (BoW), but in many cases this representation is unsatisfactory because it ignores relations between terms that do not co-occur. To handle this problem, concepts are integrated at both the document level and the sentence level. This integration enlarges the feature vector space and also reduces the efficiency of the clustering algorithm. To overcome this, a self-organizing feature map (SOFM) based algorithm is combined with a genetic algorithm (GA) and with grey wolf optimization (GWO), both considered popular for use with the SOFM. The goal of SOFM-GA is to find an optimal network topology (the number of neurons and their array dimensions) together with optimal training parameters, such as the learning-rate schedule and the annealing of the neighborhood width. The SOFM-GWO and the GWO-based approach to forming the SOFM are compared with the standard SOM with respect to quality and to the weights and map generated. The experimental results show that the proposed method achieved better results.
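
The following is a minimal illustrative sketch of the idea named in the abstract, not the authors' implementation. It trains a small SOFM with a decaying learning rate and an annealed neighborhood width, then uses the standard grey wolf optimizer position update (Mirjalili et al., 2014) to search for the two initial training parameters. The function names, the linear decay schedules, the quantization-error objective, the parameter bounds, and the toy two-dimensional "document vectors" are all assumptions made for illustration; the network topology is held fixed here for brevity.

```python
import math
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train_sofm(data, rows, cols, lr0, sigma0, epochs, seed=0):
    """Train a rows x cols SOFM and return its weight vectors (assumed schedules)."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() for _ in data[0]] for _ in grid]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # learning-rate schedule
        sigma = max(sigma0 * (1.0 - frac), 0.1)   # neighborhood-width annealing
        for x in data:
            # Best-matching unit: the neuron whose weights are closest to x.
            bmu = min(range(len(grid)), key=lambda i: sq_dist(weights[i], x))
            for i, pos in enumerate(grid):
                h = math.exp(-sq_dist(pos, grid[bmu]) / (2.0 * sigma * sigma))
                weights[i] = [w + lr * h * (xi - w) for w, xi in zip(weights[i], x)]
    return weights


def quantization_error(data, weights):
    """Mean distance from each sample to its closest neuron (lower is better)."""
    return sum(min(math.sqrt(sq_dist(w, x)) for w in weights) for x in data) / len(data)


def gwo_minimize(objective, bounds, n_wolves=6, iters=8, seed=0):
    """Grey wolf optimizer; returns the best evaluated position and its score."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best, best_score = None, float("inf")
    for t in range(iters):
        ranked = sorted((objective(w), w) for w in wolves)
        if ranked[0][0] < best_score:
            best_score, best = ranked[0][0], list(ranked[0][1])
        leaders = [w for _, w in ranked[:3]]      # alpha, beta, delta wolves
        a = 2.0 * (1.0 - t / iters)               # shrinks linearly from 2 towards 0
        moved = []
        for w in wolves:
            pos = []
            for j, (lo, hi) in enumerate(bounds):
                total = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    total += leader[j] - A * abs(C * leader[j] - w[j])
                # Average the three leader-guided estimates and clip to the bounds.
                pos.append(min(max(total / 3.0, lo), hi))
            moved.append(pos)
        wolves = moved
    return best, best_score


# Toy stand-in for concept-based document vectors (hypothetical data).
docs = [[0.10, 0.10], [0.12, 0.08], [0.90, 0.90], [0.88, 0.92]]


def objective(params):
    """Quantization error of a 2 x 2 SOFM trained with the given (lr0, sigma0)."""
    lr0, sigma0 = params
    return quantization_error(docs, train_sofm(docs, 2, 2, lr0, sigma0, epochs=40))


best_params, best_qe = gwo_minimize(objective, bounds=[(0.05, 1.0), (0.3, 2.0)])
print(best_params, best_qe)
```

Quantization error is only one possible fitness measure; a search over the number of neurons and the array dimensions, as the abstract describes for SOFM-GA, would need those quantities added to the search vector.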



Author information

Correspondence to R. Thilagavathy.

Cite this article

Thilagavathy, R., Sabitha, R. Using cloud effectively in concept based text mining using grey wolf self organizing feature map. Cluster Comput 22 (Suppl 5), 10697–10707 (2019). https://doi.org/10.1007/s10586-017-1159-y
