Paraphrase Identification of Marathi Sentences

  • Shruti Srivastava
  • Sharvari Govilkar
Conference paper
Part of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, volume 26)


Paraphrasing is the rewording of an already stated sentence without changing its inherent meaning. Sentences with different structures may therefore carry similar meanings, and paraphrase identification is the task of recognizing such pairs. Paraphrase identification of Marathi sentences is based on structural and semantic analysis of the sentences. Statistical similarity comprises similarity measures based on word set, word order, word vector, and the Sumo metric. Semantic similarity is calculated by comparing the UNL graphs of the two sentences, which yields their semantic equivalence. The overall similarity of Marathi sentences is calculated by combining the scores of the statistical and semantic similarity measures with equal weight. Paraphrase identification contributes to various NLP tasks such as plagiarism detection, text summarization, question answering, information retrieval, text simplification, and paraphrase detection in SMS.
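The scoring scheme described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: word-set similarity is taken as Jaccard overlap, word-vector similarity as cosine similarity of term-frequency vectors, and the statistical score as the plain average of those two. The word-order and Sumo-metric measures are omitted, and the semantic score, which the paper derives from UNL graph comparison, is passed in as a precomputed number. All function names are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Sequence


def word_set_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Jaccard overlap of the two word sets (assumed formula)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty sentences are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def word_vector_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Cosine similarity of term-frequency vectors (assumed formula)."""
    count_a, count_b = Counter(a), Counter(b)
    dot = sum(count_a[w] * count_b[w] for w in count_a.keys() & count_b.keys())
    norm_a = sqrt(sum(v * v for v in count_a.values()))
    norm_b = sqrt(sum(v * v for v in count_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def statistical_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Average of the two measures above (assumption; the paper also uses
    word-order similarity and the Sumo metric, which are not sketched here)."""
    return (word_set_similarity(a, b) + word_vector_similarity(a, b)) / 2


def overall_similarity(statistical: float, semantic: float) -> float:
    """Equal-weight combination of the statistical and semantic scores."""
    return 0.5 * statistical + 0.5 * semantic


if __name__ == "__main__":
    # Whitespace tokenization is a simplification for illustration only.
    s1 = "तो शाळेत जातो".split()
    s2 = "तो शाळेत जात आहे".split()
    stat = statistical_similarity(s1, s2)
    # 0.8 is a placeholder for a UNL-graph-based semantic score.
    print(overall_similarity(stat, semantic=0.8))
```

The equal weighting in `overall_similarity` follows the abstract; how the individual statistical measures are aggregated before that step is not stated there, so the averaging shown is only one plausible choice.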


Keywords: Statistical similarity, Semantic similarity, Sumo metric, Universal Networking Language (UNL)



My special thanks to Dr. Sharvari Govilkar, my project guide, for trusting that I could do this project and live up to her expectations. It was her trust in me that led me to [...]. I would like to thank the Computer Department of Pillai College of Engineering, New Panvel, for giving us the opportunity to conduct the research. I also thank the head of the Computer Department and the principal of Pillai College of Engineering, New Panvel, for extending their support.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Engineering, Pillai College of Engineering, University of Mumbai, Mumbai, India
