Abstract
With the upsurge in social media use, a tremendous amount of textual data is being generated and used for applications such as sentiment analysis, industry trend analysis, and information retrieval. In this context, automatic keyword extraction is a crucial task. Many graph-based methods have been proposed that use co-occurrence as the edge weight, but these methods neglect the semantic relations between words. This paper proposes an automatic keyword extraction method for tweets that represents text as a fuzzy graph and applies fuzzy centrality measures to find relevant keywords (vertices). The proposed method, F-GAKE (fuzzy graph automatic keyword extraction), assigns each edge a fuzzy weight based on the belongingness of its two words to the theme of the dataset. It also assigns each node a weight that incorporates the word's position, frequency, and importance, the strength of its neighbours, and its distance from the central node. It then applies fuzzy degree centrality, fuzzy betweenness, fuzzy PageRank, and fuzzy Node and Edge (NE) Rank to identify relevant keywords. The method is further extended to extract keywords for localized trending topics on Twitter. Experiments on various Twitter datasets show that F-GAKE performs better than state-of-the-art approaches to automatic keyword extraction for short messages such as tweets.
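To make the idea of ranking words by a fuzzy centrality measure concrete, the following is a minimal sketch and not the F-GAKE implementation. It assumes the standard fuzzy-graph definition of vertex degree (the sum of the membership values of incident edges). The abstract does not give the formula for F-GAKE's theme-based edge weight, so the sketch uses normalized co-occurrence purely as a placeholder membership value. The function names `build_fuzzy_graph`, `fuzzy_degree`, and `top_keywords` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_fuzzy_graph(tweets):
    """Map each co-occurring word pair (u, v) to a membership value in (0, 1]."""
    cooc = defaultdict(int)
    for tweet in tweets:
        # Naive whitespace tokenization; a real pipeline would also remove stop words.
        words = sorted(set(tweet.lower().split()))
        for u, v in combinations(words, 2):
            cooc[(u, v)] += 1
    if not cooc:
        return {}
    peak = max(cooc.values())
    # ASSUMPTION: placeholder membership = co-occurrence count / maximum count.
    # F-GAKE instead derives the edge weight from theme belongingness.
    return {pair: count / peak for pair, count in cooc.items()}


def fuzzy_degree(graph):
    """Fuzzy degree of a vertex: sum of membership values of its incident edges."""
    degree = defaultdict(float)
    for (u, v), mu in graph.items():
        degree[u] += mu
        degree[v] += mu
    return dict(degree)


def top_keywords(tweets, k=3):
    """Return the k words with the highest fuzzy degree (ties broken alphabetically)."""
    degree = fuzzy_degree(build_fuzzy_graph(tweets))
    return sorted(degree, key=lambda w: (-degree[w], w))[:k]


if __name__ == "__main__":
    sample = ["flood warning river", "flood river rising", "flood alert"]
    print(top_keywords(sample, k=2))
```

Swapping the placeholder membership for a semantic similarity score, or adding a per-node weight before ranking, would move the sketch closer to the approach the abstract describes.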
Cite this article
Jain, M., Bhalla, G., Jain, A. et al. Automatic keyword extraction for localized tweets using fuzzy graph connectivity measures. Multimed Tools Appl 81, 42931–42956 (2022). https://doi.org/10.1007/s11042-021-11893-x
DOI: https://doi.org/10.1007/s11042-021-11893-x