Abstract
With the tremendous growth in the number of scientific papers being published, searching for references while writing a scientific paper has become a time-consuming process. A technique that could add a reference citation at the appropriate place in a sentence would be beneficial. From this perspective, context-aware citation recommendation has been researched for around two decades. Many researchers have used the text surrounding the citation tag, called the context sentence, together with the metadata of the target paper to find the appropriate cited research. However, the lack of well-organized benchmark datasets and of a model that attains high performance has made the research difficult. In this paper, we propose a deep learning-based model and a well-organized dataset for context-aware paper citation recommendation. Our model comprises a document encoder and a context encoder. For this, we use a graph convolutional network (GCN) layer and bidirectional encoder representations from transformers (BERT), a model pre-trained on textual data. By modifying the related PeerRead dataset, we propose a new dataset called FullTextPeerRead, which contains context sentences for cited references together with paper metadata. To the best of our knowledge, this dataset is the first well-organized dataset for context-aware paper recommendation. The results indicate that the proposed model with the proposed datasets can attain state-of-the-art performance and achieve a more than 28% improvement in mean average precision and recall@k.
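The two evaluation measures named in the abstract, mean average precision and recall@k, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the paper identifiers, and the assumption that each ranked list contains no duplicates are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant papers that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for paper in ranked[:k] if paper in relevant)
    return hits / len(relevant)


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@i over every rank i that holds a relevant paper.

    Assumes the ranked list has no duplicate entries (illustrative assumption).
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, paper in enumerate(ranked, start=1):
        if paper in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Sequence[Sequence[str]], relevants: Sequence[Set[str]]
) -> float:
    """Mean of average precision over all citation contexts (queries)."""
    scores = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: two citation contexts with made-up paper ids.
rankings = [["p3", "p1", "p7", "p2"], ["p5", "p9"]]
relevants = [{"p1", "p2"}, {"p5"}]
map_score = mean_average_precision(rankings, relevants)
recall_2 = recall_at_k(rankings[0], relevants[0], k=2)
```

Here each citation context is treated as one query, and the papers it actually cites form the relevant set. Whether the paper averages these measures in exactly this way is not stated in the abstract.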
Acknowledgements
This work was supported by National Research Foundation of Korea (NRF) grants funded by the Korean Government (Nos. NRF-2015R1C1A1A01056185 and NRF-2018R1D1A1B07045825).
Cite this article
Jeong, C., Jang, S., Park, E. et al. A context-aware citation recommendation model with BERT and graph convolutional networks. Scientometrics 124, 1907–1922 (2020). https://doi.org/10.1007/s11192-020-03561-y
DOI: https://doi.org/10.1007/s11192-020-03561-y