A state-of-art optimization method for analyzing the tweets of earthquake-prone region

  • Original Article
  • Neural Computing and Applications

Abstract

With the growth of accumulated data and Internet usage, social media platforms such as Twitter have become a fundamental tool for accessing all kinds of information. Processing and preparing Twitter data, and eliminating unnecessary information from it, are therefore rapidly gaining importance. In particular, it is very important to analyze this information and make it available in emergencies such as disasters. In the proposed study, the earthquake of magnitude Mw = 6.8 that occurred on January 24, 2020, in Elazig province, Turkey, is analyzed in detail. Tweets under twelve hashtags are clustered separately using a modified Social Spider Optimization (SSO) algorithm. The sum of intra-cluster distances (SICD) is used to measure the performance of the proposed clustering algorithm. In addition, SICD, which assigns each new solution to its nearest node, is formulated as an integer programming model and solved with the GUROBI package on the test data sets. The optimal results obtained are compared with the results of the proposed SSO. In the study, center tweets with optimal results are found using the modified SSO. Moreover, the results of the proposed SSO algorithm are compared with those of K-means, one of the most popular clustering techniques, and the proposed SSO algorithm gives better results. In this way, the general situation of society after an earthquake is deduced so that moral and material support can be provided.
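As an illustration of the SICD measure mentioned in the abstract, the following minimal sketch assigns each point to its nearest center and sums the resulting distances. It is not the authors' implementation: the function name `sicd`, the use of Euclidean distance, and the toy two-dimensional points standing in for vectorized tweets are all assumptions made for this example, since the abstract does not specify them.

```python
import math

def sicd(points, centers):
    """Sum of intra-cluster distances (assumed Euclidean).

    Each point is assigned to its nearest center, and its distance to
    that center is added to the total. Returns (total, labels), where
    labels[i] is the index of the center chosen for points[i].
    """
    total = 0.0
    labels = []
    for p in points:
        # Distance from this point to every candidate center.
        dists = [math.dist(p, c) for c in centers]
        # Index of the nearest center (first one wins on ties).
        nearest = min(range(len(centers)), key=dists.__getitem__)
        labels.append(nearest)
        total += dists[nearest]
    return total, labels

# Toy data: hypothetical 2-D vectors standing in for vectorized tweets.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 4.0)]
centers = [(0.0, 1.0), (10.0, 2.0)]

total, labels = sicd(points, centers)
print(total, labels)
```

Under this measure a lower total indicates tighter clusters, so two candidate sets of centers (for example, one from a metaheuristic and one from K-means) can be compared by evaluating `sicd` on the same points.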

Author information

Corresponding author

Correspondence to Nazmiye Eligüzel.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Eligüzel, N., Çetinkaya, C. & Dereli, T. A state-of-art optimization method for analyzing the tweets of earthquake-prone region. Neural Comput & Applic 33, 14687–14705 (2021). https://doi.org/10.1007/s00521-021-06109-0

  • DOI: https://doi.org/10.1007/s00521-021-06109-0
