Calculation of Textual Similarity Using Semantic Relatedness Functions

Kairaldeen, Ammar Riadh; Ercan, Gonenc

doi:10.1007/978-3-319-18117-2_38

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9042)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

Semantic similarity between two sentences measures how much the sentences share the same or related meaning. Two methods in the literature for measuring sentence similarity are cosine similarity and overall similarity. In this work we investigate whether the performance of these methods can be improved by integrating different word-level semantic relatedness methods. Four word relatedness methods are compared on four data sets compiled from different domains, providing a testbed that covers a wide range of writing styles to challenge the selected methods. Results show that, in sentence similarity methods, the use of a corpus-based word semantic similarity function significantly outperforms that of a WordNet-based one. Moreover, we propose a new sentence similarity measure by modifying an existing method, called overall similarity, which incorporates word order and lexical similarity. The results show that the proposed method significantly improves on the performance of the overall similarity method. All the selected methods are tested and compared with other state-of-the-art methods.
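
To make the idea of integrating word-level relatedness into cosine similarity concrete, the following is a minimal sketch and not the authors' implementation. It assumes a common formulation in which each sentence is mapped onto the joint vocabulary of the two sentences, and each vector entry is the highest relatedness between that vocabulary word and any word of the sentence. The relatedness table `TOY_RELATEDNESS` and all function names are hypothetical; in practice a corpus-based or WordNet-based measure would take the place of `word_relatedness`.

```python
import math

# Hypothetical toy scores, for illustration only. A real system would
# plug in a corpus-based or WordNet-based relatedness measure here.
TOY_RELATEDNESS = {
    frozenset(("car", "automobile")): 0.9,
    frozenset(("boy", "lad")): 0.8,
}


def word_relatedness(w1, w2):
    """Return a relatedness score in [0, 1] for two words (toy lookup)."""
    if w1 == w2:
        return 1.0
    return TOY_RELATEDNESS.get(frozenset((w1, w2)), 0.0)


def exact_match(w1, w2):
    """Baseline: words are related only if they are identical."""
    return 1.0 if w1 == w2 else 0.0


def semantic_vector(tokens, vocabulary, relatedness):
    """For each vocabulary word, keep its best relatedness to the sentence."""
    return [
        max((relatedness(term, tok) for tok in tokens), default=0.0)
        for term in vocabulary
    ]


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sentence_similarity(s1, s2, relatedness=word_relatedness):
    """Cosine similarity over relatedness-weighted sentence vectors."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    vocabulary = sorted(set(t1) | set(t2))
    v1 = semantic_vector(t1, vocabulary, relatedness)
    v2 = semantic_vector(t2, vocabulary, relatedness)
    return cosine(v1, v2)


if __name__ == "__main__":
    a, b = "the car stopped", "the automobile stopped"
    print("exact match only:", sentence_similarity(a, b, exact_match))
    print("with relatedness:", sentence_similarity(a, b))
```

With the exact-match baseline, "car" and "automobile" contribute nothing to the score; with a relatedness function they contribute partially, so the two sentences score higher. Word order, which the overall similarity method also takes into account, is not modelled in this sketch.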

Author information

Authors and Affiliations

University of Baghdad, Baghdad, Iraq
Ammar Riadh Kairaldeen
Department of Informatics, Hacettepe University, Ankara, Turkey
Gonenc Ercan

Corresponding author

Correspondence to Ammar Riadh Kairaldeen.

Editor information

Editors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Mexico DF, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Kairaldeen, A.R., Ercan, G. (2015). Calculation of Textual Similarity Using Semantic Relatedness Functions. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9042. Springer, Cham. https://doi.org/10.1007/978-3-319-18117-2_38

DOI: https://doi.org/10.1007/978-3-319-18117-2_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18116-5
Online ISBN: 978-3-319-18117-2
eBook Packages: Computer Science, Computer Science (R0)
