Abstract
Semantic similarity between two sentences measures how closely their meanings match or relate. Two methods in the literature for measuring sentence similarity are cosine similarity and overall similarity. In this work we investigate whether the performance of these methods can be improved by integrating different word-level semantic relatedness methods. Four word relatedness methods are compared on four data sets compiled from different domains, providing a testbed that covers a wide range of writing styles to challenge the selected methods. Results show that, within sentence similarity methods, a corpus-based word semantic similarity function significantly outperforms a WordNet-based one. Moreover, we propose a new sentence similarity measure by modifying an existing method, called overall similarity, which incorporates word order and lexical similarity. The results show that the proposed method significantly improves on the performance of the overall similarity method. All the selected methods are tested and compared with other state-of-the-art methods.
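To make the two baseline measures concrete, the following is a minimal sketch, not the authors' implementation, of how a word-level relatedness function can be plugged into a cosine-based sentence similarity and into an overall similarity that also accounts for word order. It follows one common formulation from the sentence similarity literature. The function names, the toy relatedness table, the threshold of 0.4 and the weight `delta = 0.85` are all assumptions chosen for illustration; in practice `toy_word_sim` would be replaced by a corpus-based or WordNet-based measure.

```python
import math
from typing import Callable, Dict, List, Tuple

WordSim = Callable[[str, str], float]

# Hypothetical relatedness scores, invented purely for illustration.
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("car", "automobile"): 0.9,
    ("drives", "steers"): 0.7,
}


def toy_word_sim(a: str, b: str) -> float:
    """Stand-in for a real word relatedness measure (assumption)."""
    if a == b:
        return 1.0
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), 0.0))


def semantic_vector(sentence: List[str], joint: List[str], word_sim: WordSim) -> List[float]:
    # Each entry is the best relatedness between a joint-set word and the sentence.
    return [max((word_sim(w, t) for t in sentence), default=0.0) for w in joint]


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def order_vector(sentence: List[str], joint: List[str], word_sim: WordSim,
                 threshold: float = 0.4) -> List[int]:
    # Each entry is the 1-based position of the most related sentence word,
    # or 0 when nothing in the sentence is related enough.
    vec = []
    for w in joint:
        best_pos, best = 0, 0.0
        for i, t in enumerate(sentence, start=1):
            s = word_sim(w, t)
            if s > best:
                best_pos, best = i, s
        vec.append(best_pos if best >= threshold else 0)
    return vec


def order_similarity(r1: List[int], r2: List[int]) -> float:
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2)))
    total = math.sqrt(sum((a + b) ** 2 for a, b in zip(r1, r2)))
    return 1.0 - diff / total if total else 1.0


def overall_similarity(s1: List[str], s2: List[str], word_sim: WordSim,
                       delta: float = 0.85) -> float:
    # Joint word set keeps first-occurrence order and drops duplicates.
    joint = list(dict.fromkeys(s1 + s2))
    sem = cosine(semantic_vector(s1, joint, word_sim),
                 semantic_vector(s2, joint, word_sim))
    order = order_similarity(order_vector(s1, joint, word_sim),
                             order_vector(s2, joint, word_sim))
    # Weighted mix of lexical (semantic) and word order similarity.
    return delta * sem + (1.0 - delta) * order


if __name__ == "__main__":
    a = "the car drives".split()
    b = "the automobile steers".split()
    c = "a bird sings".split()
    print(overall_similarity(a, b, toy_word_sim))
    print(overall_similarity(a, c, toy_word_sim))
```

Because the word relatedness function is passed in as a parameter, swapping one word-level measure for another leaves the sentence-level computation unchanged, which is the kind of comparison the abstract describes.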
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kairaldeen, A.R., Ercan, G. (2015). Calculation of Textual Similarity Using Semantic Relatedness Functions. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9042. Springer, Cham. https://doi.org/10.1007/978-3-319-18117-2_38
DOI: https://doi.org/10.1007/978-3-319-18117-2_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18116-5
Online ISBN: 978-3-319-18117-2
eBook Packages: Computer Science (R0)