Document Similarity Approach Using Grammatical Linkages with Graph Databases

  • V. Priya
  • K. Umamaheswari
Conference paper
Part of the EAI/Springer Innovations in Communication and Computing book series (EAISICC)


Document similarity has become essential in many applications, such as document retrieval, recommendation systems, and plagiarism checking. Many similarity evaluation approaches rely on word-based document representations because they are very fast. However, these approaches are not accurate when the documents use different language and vocabulary. Approaches that represent documents as graphs draw on relational knowledge, but expensive graph operations make them infeasible in many applications. This work develops a novel approach to document similarity computation that utilizes verbal intent, which improves the similarity computation; graph databases are also used for faster performance. The performance of the system is evaluated on various datasets, and the verbal intent-based approach has registered promising results.
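
The limitation of word-based representations noted above can be illustrated with a minimal sketch. It is not the authors' method: it is a generic bag-of-words cosine similarity, and the function names and sample sentences are assumptions chosen for illustration. Two sentences that mean roughly the same thing but share no vocabulary receive a score of zero.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count each alphabetic token."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-frequency vectors."""
    # Counter returns 0 for missing terms, so unshared words contribute nothing.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample sentences (not taken from the paper's datasets).
doc_a = "Physicians examine patients"
doc_b = "Doctors check sick people"   # similar meaning, no shared words
doc_c = "Physicians examine patients"  # identical to doc_a

sim_paraphrase = cosine_similarity(bag_of_words(doc_a), bag_of_words(doc_b))
sim_identical = cosine_similarity(bag_of_words(doc_a), bag_of_words(doc_c))

print(f"paraphrase: {sim_paraphrase:.2f}")
print(f"identical:  {sim_identical:.2f}")
```

The paraphrase pair scores 0 because the measure sees only surface tokens, while the identical pair scores approximately 1. This is the gap that representations using relational or grammatical structure aim to close.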


Graph database, Grammatical linkages, Text similarity



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • V. Priya (1)
  • K. Umamaheswari (2)
  1. Dr. Mahalingam College of Engineering and Technology, Pollachi, India
  2. PSG College of Technology, Coimbatore, India
