
Automatic keyword extraction for localized tweets using fuzzy graph connectivity measures

  • 1135T: Social Multimedia Processing
Multimedia Tools and Applications

Abstract

With the upsurge in social media use, a tremendous amount of textual data is being generated and used for applications such as sentiment analysis, industry trend analysis, and information retrieval. In this context, automatic keyword extraction is a crucial task. Many graph-based methods have been proposed that use co-occurrence as the edge weight, but these methods neglect the semantic relations between words. This paper proposes an automatic keyword extraction method for tweets that represents text as a fuzzy graph and applies fuzzy centrality measures to find relevant keywords (vertices). The proposed method, F-GAKE (fuzzy graph automatic keyword extraction), takes into account the belongingness of two words to the theme of the dataset and assigns a fuzzy edge weight accordingly. It also assigns each node a weight that incorporates the word's position, frequency, importance, strength of neighbours, and distance from the central node. It then uses fuzzy degree centrality, fuzzy betweenness, fuzzy PageRank, and fuzzy Node and Edge (NE) Rank to identify relevant keywords. The method is further extended to extract keywords for localized trending topics from Twitter. Experiments on various Twitter datasets show that F-GAKE performs better than state-of-the-art approaches for automatic keyword extraction from short messages such as tweets.
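
To make the idea of ranking words by a fuzzy centrality measure concrete, the following is a minimal sketch of fuzzy degree centrality only. It is not the F-GAKE implementation: the abstract does not give the membership or node-weight formulas, so the sketch assumes a hypothetical theme membership `sigma` supplied by the caller and sets each edge weight to `min(sigma[u], sigma[v])`, which merely satisfies the usual fuzzy graph constraint that an edge membership cannot exceed the memberships of its endpoints. The function names and the toy data are illustrative assumptions.

```python
from itertools import combinations


def fuzzy_edges(tweets, sigma):
    """Build fuzzy edges between words that co-occur in a tweet.

    sigma maps each word to an ASSUMED theme membership in [0, 1].
    The edge membership min(sigma[u], sigma[v]) is a placeholder,
    not the edge-weight formula used in the paper.
    """
    edges = {}
    for tweet in tweets:
        # Keep only words with a known membership; sort for stable edge keys.
        words = sorted({w for w in tweet.lower().split() if w in sigma})
        for u, v in combinations(words, 2):
            edges[(u, v)] = min(sigma[u], sigma[v])
    return edges


def fuzzy_degree(edges):
    """Fuzzy degree of a vertex: sum of memberships of its incident edges."""
    degree = {}
    for (u, v), mu in edges.items():
        degree[u] = degree.get(u, 0.0) + mu
        degree[v] = degree.get(v, 0.0) + mu
    return degree


def top_keywords(tweets, sigma, k=3):
    """Return the k words with the highest fuzzy degree (ties broken by name)."""
    degree = fuzzy_degree(fuzzy_edges(tweets, sigma))
    return sorted(degree, key=lambda w: (-degree[w], w))[:k]


if __name__ == "__main__":
    # Toy data; the membership values are invented for illustration.
    tweets = ["flood warning city", "flood rescue city", "city traffic"]
    sigma = {"flood": 0.9, "warning": 0.6, "city": 0.8,
             "rescue": 0.7, "traffic": 0.3}
    print(top_keywords(tweets, sigma))
```

The other measures named in the abstract (fuzzy betweenness, fuzzy PageRank, fuzzy NE Rank) would replace `fuzzy_degree` with a different scoring function over the same edge memberships, and the node weight described there would enter as an additional per-word factor.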



Author information

Corresponding author

Correspondence to Amita Jain.


About this article

Cite this article

Jain, M., Bhalla, G., Jain, A. et al. Automatic keyword extraction for localized tweets using fuzzy graph connectivity measures. Multimed Tools Appl 81, 42931–42956 (2022). https://doi.org/10.1007/s11042-021-11893-x

  • DOI: https://doi.org/10.1007/s11042-021-11893-x
