Investigating Citation Linkage as a Sentence Similarity Measurement Task Using Deep Learning

Singha Roy, Sudipta; Mercer, Robert E.; Urra, Felipe

doi:10.1007/978-3-030-47358-7_50

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12109)

Included in the following conference series:

Canadian Conference on Artificial Intelligence


Abstract

Research publications reflect advancements in their research domain. In these publications, scientists often use citations to support the presented findings and to show the improvements those findings bring, while also making the content easier to follow by guiding readers through the flow of information. In the science domain, a citation refers to the document from which the information originates, but does not specify the text span that is actually being cited. This paper develops a framework that links citing sentences in a research article to the related cited sentences in the corresponding referenced documents. Such links reduce the burden on readers, who would otherwise have to go through every sentence of a referenced document to acquire the required background information. The citation linkage problem is modelled as a semantic relatedness task: given a citing sentence, the framework pairs it with each sentence from the reference document and then tries to determine which sentence pairs are semantically similar and which are not. Constructing the citation linkage framework involves creating a corpus and using deep-learning models to measure semantic similarity.
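The pairing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the paper uses deep-learning models for the similarity measurement, whereas this example swaps in a plain bag-of-words cosine similarity as a stand-in scorer so that it runs with the standard library alone. The function names (`bag_of_words`, `cosine_similarity`, `rank_candidates`, `link_citation`) and the `threshold` value of 0.3 are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercase the sentence and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two token-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(citing_sentence, reference_sentences):
    """Pair the citing sentence with every reference sentence, best score first.

    The scorer here is a stand-in (assumption); a trained sentence-similarity
    model would take its place in a deep-learning setup.
    """
    citing_vec = bag_of_words(citing_sentence)
    scored = [
        (cosine_similarity(citing_vec, bag_of_words(candidate)), candidate)
        for candidate in reference_sentences
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def link_citation(citing_sentence, reference_sentences, threshold=0.3):
    """Return the reference sentences judged similar to the citing sentence.

    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    return [
        candidate
        for score, candidate in rank_candidates(citing_sentence, reference_sentences)
        if score >= threshold
    ]


if __name__ == "__main__":
    citing = "DNA was extracted from paraffin-embedded tissue."
    reference = [
        "We extracted DNA from paraffin-embedded tissue samples.",
        "The weather was sunny.",
    ]
    for score, sentence in rank_candidates(citing, reference):
        print(f"{score:.2f}  {sentence}")
```

Each reference sentence is scored independently against the citing sentence, so the binary similar/not-similar decision reduces to a cut-off on the score; replacing `cosine_similarity` with a learned model would leave the pairing logic unchanged.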



Author information

Authors and Affiliations

The University of Western Ontario, London, ON, Canada
Sudipta Singha Roy, Robert E. Mercer & Felipe Urra

Authors

Sudipta Singha Roy
Robert E. Mercer
Felipe Urra

Corresponding author

Correspondence to Sudipta Singha Roy.

Editor information

Editors and Affiliations

National Research Council Canada, Ottawa, ON, Canada
Cyril Goutte
Queen’s University, Kingston, ON, Canada
Xiaodan Zhu


About this paper

Cite this paper

Singha Roy, S., Mercer, R.E., Urra, F. (2020). Investigating Citation Linkage as a Sentence Similarity Measurement Task Using Deep Learning. In: Goutte, C., Zhu, X. (eds) Advances in Artificial Intelligence. Canadian AI 2020. Lecture Notes in Computer Science, vol 12109. Springer, Cham. https://doi.org/10.1007/978-3-030-47358-7_50


DOI: https://doi.org/10.1007/978-3-030-47358-7_50
Published: 06 May 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-47357-0
Online ISBN: 978-3-030-47358-7
eBook Packages: Computer Science, Computer Science (R0)
