Abstract
Digital transformation of information changes people's views and their daily functioning. Social media is a key platform where people express their views on events, and it plays an important role in daily activities. Digital marketing is one example of this digital transformation of information: at present, social channels use the personal information of their users to launch products and tools. Digital education also plays a key role in transforming information in a digital way. In such settings, Natural Language Processing of people's views and blog chats plays an important role. Adding an explanatory layer is important for an Intelligent Tutoring System (ITS), where students interact with an application through natural language. This paper proposes a method that measures the interpretability between two sentences by rating the degree of semantic equivalence on a graded scale from 0 (not aligned) to 5 (semantically equivalent). The goal of the paper is not to add an interpretable layer but to develop a method that can explain the similarities and differences between the two sentences. The task is motivated by SemEval 2016 Task 2. The proposed method has been developed and tested on the headlines dataset. On the gold standard data, an accuracy of 0.64 for alignment type and score is reported.
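The graded 0 to 5 rating described above can be illustrated with a minimal sketch. It is not the authors' system: it assumes a plain bag-of-words cosine similarity and a linear mapping onto the 0 to 5 scale, and the function names `tokenize`, `cosine_similarity`, and `graded_score` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between the bag-of-words count vectors of a and b."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    # Only words shared by both texts contribute to the dot product.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    # An empty text has a zero vector; treat it as having no similarity.
    return dot / norm if norm else 0.0


def graded_score(a, b):
    """Map cosine similarity in [0, 1] onto the 0-5 scale.

    The linear mapping with rounding is an assumption made for illustration.
    """
    return round(5 * cosine_similarity(a, b))
```

Calling `graded_score` on two identical chunks gives the top of the scale, and on two chunks with no shared words gives the bottom. A system for the SemEval task would also have to assign an alignment type to each chunk pair, which this sketch does not attempt.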
Acknowledgement
The work presented here falls under Research Project Grant No. YSS/2015/000988 and is supported by the Department of Science & Technology (DST) and the Science and Engineering Research Board (SERB), Govt. of India. The authors would like to acknowledge the Department of Computer Science & Engineering, National Institute of Technology Mizoram, India, for providing infrastructural facilities and support.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Majumder, G., Pakray, P., Avendaño, D.E.P. (2018). Interpretable Semantic Textual Similarity Using Lexical and Cosine Similarity. In: Mandal, J., Sinha, D. (eds) Social Transformation – Digital Way. CSI 2018. Communications in Computer and Information Science, vol 836. Springer, Singapore. https://doi.org/10.1007/978-981-13-1343-1_59
DOI: https://doi.org/10.1007/978-981-13-1343-1_59
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1342-4
Online ISBN: 978-981-13-1343-1
eBook Packages: Computer Science (R0)