Modeling citation worthiness by using attention-based bidirectional long short-term memory networks and interpretable models

Zeng, Tong; Acuna, Daniel E.

doi:10.1007/s11192-020-03421-9

Published: 28 March 2020

Scientometrics, Volume 124, pages 399–428 (2020)

Abstract

Scientists learn early on how to cite scientific sources to support their claims. Sometimes, however, scientists have difficulty determining where a citation should be placed or, even worse, fail to cite a source altogether. Automatically detecting sentences that need a citation (i.e., citation worthiness) could solve both of these issues, leading to more robust and well-constructed scientific arguments. Previous researchers have applied machine learning to this task but have used small datasets and models that do not take advantage of recent algorithmic developments such as attention mechanisms in deep learning. We hypothesize that we can develop highly accurate deep learning architectures that learn from large supervised datasets constructed from open access publications. In this work, we propose a bidirectional long short-term memory network with an attention mechanism and contextual information to detect sentences that need citations. We also produce a new, large dataset (PMOA-CITE) based on the PubMed Open Access Subset, which is orders of magnitude larger than previous datasets. Our experiments show that our architecture achieves state-of-the-art performance on the standard ACL-ARC dataset (\(F_{1}=0.507\)) and exhibits high performance (\(F_{1}=0.856\)) on the new PMOA-CITE. Moreover, we show that it can transfer learning across these datasets. We further use interpretable models to illuminate how specific language is used to promote and inhibit citations. We discover that sections and surrounding sentences are crucial for our improved predictions. We further examined purported mispredictions of the model and uncovered systematic human mistakes in citation behavior and source data. This opens the door for our model to check documents during pre-submission and pre-archival procedures. We discuss limitations of our work and make this new dataset, the code, and a web-based tool available to the community.
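
For readers unfamiliar with the \(F_{1}\) scores reported above, the following is a minimal sketch of how such a score can be computed for a binary citation-worthiness task, where label 1 means "sentence needs a citation". It is not the authors' evaluation code; the function name `f1_for_positive_class` and the toy labels are hypothetical and chosen only for illustration.

```python
def f1_for_positive_class(y_true, y_pred):
    """Return (precision, recall, F1) for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # flagged, no citation needed
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed citation-worthy sentence
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold labels and predictions for eight sentences.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]

precision, recall, f1 = f1_for_positive_class(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because \(F_{1}\) combines precision and recall, it is less flattering than plain accuracy when the positive class is rare, as is typically the case when only a minority of sentences carry citations.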

Notes

https://ciir.cs.umass.edu/downloads/sigir18_citation/.

Acknowledgements

Tong Zeng was funded by the China Scholarship Council #201706190067. Daniel E. Acuna was partially funded by the National Science Foundation Award #1800956.

Author information

Authors and Affiliations

School of Information Management, Nanjing University, Nanjing, China
Tong Zeng
School of Information Studies, Syracuse University, Syracuse, NY, USA
Tong Zeng & Daniel E. Acuna

Corresponding author

Correspondence to Daniel E. Acuna.

About this article

Cite this article

Zeng, T., Acuna, D.E. Modeling citation worthiness by using attention-based bidirectional long short-term memory networks and interpretable models. Scientometrics 124, 399–428 (2020). https://doi.org/10.1007/s11192-020-03421-9

Received: 31 October 2019
Published: 28 March 2020
Issue Date: July 2020
DOI: https://doi.org/10.1007/s11192-020-03421-9
