Abstract
Semantic similarity between two concepts is widely used in natural language processing. In this article, we propose a method that uses WordNet 3.1 to determine similarity based on feature combinations. The work focuses on overcoming ambiguity in social media text by selecting informative features that improve semantic representation. Social media is the research domain, and the study is limited to a political dataset. A feature-based method is applied to predict the outcome and to improve performance, which depends on factors related to the fidelity, continuity, and balance of the knowledge sources in WordNet 3.1. Existing semantic similarity measures among words suffer from insufficient and unbalanced features. This study therefore presents a feature-based semantic similarity measure in WordNet 3.1 that determines the similarity between two concepts/words from the features selected to measure it; the approach is also known as a “noun” and “is-a” relations-based method. We evaluate the proposed method on the data set of Agirre [1] (AG203) and compare the results of our method, which draws on three kinds of features (taxonomic relations, non-taxonomic relations, and glosses), with those of related studies. Correlation with human judgments is subjective and typically low, and our results improve on it. Experimental results show that our new method significantly outperforms other existing computational methods with the following results: r = 0.73%, p = 0.69%, m = 0.71% and nonzero = 0.95%.
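As a rough illustration of the kind of pipeline the abstract describes, the sketch below scores word pairs by the overlap of their feature sets and then correlates those scores with human ratings. It is not the authors' measure: the Jaccard overlap, the toy feature sets, the word pairs, and the human ratings are all hypothetical stand-ins, and reading r and p as Pearson and Spearman coefficients is an assumption that the abstract does not spell out.

```python
from statistics import mean

# Hypothetical feature sets (e.g. hypernyms and gloss words); not taken from WordNet 3.1.
FEATURES = {
    "car": {"vehicle", "motor_vehicle", "wheel", "engine", "transport"},
    "automobile": {"vehicle", "motor_vehicle", "wheel", "engine", "transport"},
    "bicycle": {"vehicle", "wheel", "pedal", "transport"},
    "banana": {"fruit", "food", "plant", "yellow"},
}

# Hypothetical human similarity ratings for three word pairs.
PAIRS = [("car", "automobile", 3.9), ("car", "bicycle", 2.5), ("car", "banana", 0.2)]


def jaccard_similarity(features_a, features_b):
    """Shared features divided by all features (a simple stand-in measure)."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(xs), ranks(ys))


predicted = [jaccard_similarity(FEATURES[a], FEATURES[b]) for a, b, _ in PAIRS]
human = [score for _, _, score in PAIRS]
print(predicted, pearson(predicted, human), spearman(predicted, human))
```

In an actual evaluation, the feature sets would be extracted from WordNet 3.1 and the ratings would come from a benchmark such as AG203 rather than from the toy values used here.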
https://wordnet.princeton.edu.
https://github.com/alimuttaleb/Ali-Muttaleb/compare/Semantic-Similarity-by-using-theambiguity-AG203.
This work is supported by the University of Malaysia Pahang (UMP) via research grant UMP (RDU1803141, PGRS190398).
References
Agirre E, et al (2009) A study on similarity and relatedness using distributional and wordnet-based approaches. In: Proceedings of Human Language Technologies: the 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics
Rassem TH, et al (2017) Restoring the missing features of the corrupted speech using linear interpolation methods. In: AIP Conference Proceedings, AIP Publishing
Sánchez D, Isern D, Millan M (2011) Content annotation for the semantic web: an automatic web-based approach. Knowl Inf Syst 27(3):393–418
Hasan AM, Rassem TH, Noorhuzaimi M (2018) Combined support vector machine and pattern matching for arabic islamic hadith question classification system. In: International Conference of Reliable Information and Communication Technology, Springer
Hasan AM, Rassem TH, Karimah M (2018) Pattern-matching based for arabic question answering: a challenge perspective. Adv Sci Lett 24(10):7655–7661
Hasan AM, Zakaria LQ (2016) Question classification using support vector machine and pattern matching. J Theor Appl Inf Technol 87(2)
Al-Tashi Q, Hasan AM (2019) Word sense disambiguation: a review. Southern Connecticut State University, Hilton C. Buley Library 1(2):20–458
Pedersen T, Patwardhan S, Michelizzi J (2004) WordNet::Similarity: measuring the relatedness of concepts. In: Demonstration Papers at HLT-NAACL 2004. Association for Computational Linguistics
Al-Tashi Q, et al (2019) Binary optimization using hybrid grey wolf optimization for feature selection. IEEE Access
Wu Z, Palmer M (1994) Verbs semantics and lexical selection. In: Proceedings of the 32nd annual meeting on Association for Computational Linguistics. Association for Computational Linguistics
Leacock C, Chodorow M (1998) Combining local context and WordNet similarity for word sense identification. WordNet: an electronic lexical database 49(2):265–283
Sánchez D, Batet M (2013) A semantic similarity method based on information content exploiting multiple ontologies. Expert Syst Appl 40(4):1393–1399
Harispe S et al (2014) A framework for unifying ontology-based semantic similarity measures: a study in the biomedical domain. J Biomed Inform 48:38–53
Jiang Y et al (2015) Feature-based approaches to semantic similarity assessment of concepts using Wikipedia. Inf Process Manage 51(3):215–234
Saif A et al (2018) Weighting-based semantic similarity measure based on topological parameters in semantic taxonomy. Nat Lang Eng 24(6):861–886
Li P et al (2017) A graph-based semantic relatedness assessment method combining wikipedia features. Eng Appl Artif Intell 65:268–281
Taieb MAH, Aouicha MB, Hamadou AB (2014) Ontology-based approach for measuring semantic similarity. Eng Appl Artif Intell 36:238–261
Aouicha MB, Taieb MAH, Hamadou AB (2016) Taxonomy-based information content and wordnet-wiktionary-wikipedia glosses for semantic relatedness. Appl Intell 45(2):475–511
Rada R et al (1989) Development and application of a metric on semantic nets. IEEE Trans Syst Man Cybern 19(1):17–30
Li Y, Bandar ZA, McLean D (2003) An approach for measuring semantic similarity between words using multiple information sources. IEEE Trans Knowl Data Eng 15(4):871–882
Sebti A, Barfroush AA (2008) A new word sense similarity measure in WordNet. In: Computer Science and Information Technology, IMCSIT 2008. International Multiconference on. IEEE
Sánchez D, Batet M, Isern D (2011) Ontology-based information content computation. Knowl-Based Syst 24(2):297–303
Meng L, Gu J, Zhou Z (2012) A new model of information content based on concept’s topology for measuring semantic similarity in wordnet. Int J Grid Distrib Comput 5(3):81–94
Aouicha MB, Taieb MAH, Ezzeddine M (2016) Derivation of “is a” taxonomy from Wikipedia category graph. Eng Appl Artif Intell 50:265–286
Patwardhan S, Banerjee S, Pedersen T (2007) UMND1: unsupervised word sense disambiguation using contextual semantic relatedness. In: Proceedings of the 4th International Workshop on Semantic Evaluations. Association for Computational Linguistics
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Hasan, A.M., Mohd Noor, N., Rassem, T.H., Mohd Noah, S.A., Hasan, A.M. (2020). A Proposed Method Using the Semantic Similarity of WordNet 3.1 to Handle the Ambiguity to Apply in Social Media Text. In: Kim, K., Kim, HY. (eds) Information Science and Applications. Lecture Notes in Electrical Engineering, vol 621. Springer, Singapore. https://doi.org/10.1007/978-981-15-1465-4_47
DOI: https://doi.org/10.1007/978-981-15-1465-4_47
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1464-7
Online ISBN: 978-981-15-1465-4
eBook Packages: Engineering, Engineering (R0)