SentCite: a sentence-level citation recommender based on the salient similarity among multiple segments

Wang, Hei-Chia; Cheng, Jen-Wei; Yang, Che-Tsung

doi:10.1007/s11192-022-04339-0

Published: 19 March 2022

Volume 127, pages 2521–2546, (2022)

Scientometrics


Abstract

Citing appropriately and efficiently is becoming more challenging as the volume of publications grows rapidly. In practice, finding the right references is time-consuming and demands considerable skill. Accordingly, many studies have tried to help by providing citation-oriented support. Within this field, citation recommendation is a significant research area because it addresses both the need for deep expertise and the problem of information overload. In this paper, we propose a sentence-level citation recommender, SentCite, that identifies the sentences that need links to references and recommends citations for them. SentCite employs a convolutional recurrent neural network to extract the citing sentences and recommends citations based on the salient similarity among sentences in the abstract, full text, and in-link context of the target papers. Unlike in some other research in the big data domain, the quality papers available to recommend in this application are very limited. We therefore propose undersampling with in-link context awareness to avoid overfitting. SentCite recommends the most appropriate papers for the given sentences and outperforms other context-based methods, improving mean reciprocal rank (MRR) by 31.8%, mean average precision (MAP) by 30.1%, and normalized discounted cumulative gain (NDCG) by 33.8%.
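The three ranking metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes binary relevance (a recommended paper either is or is not a correct citation for the sentence) and uses the standard textbook definitions; the abstract does not state the exact evaluation protocol, so the function names, the cutoff parameter `k`, and the sample paper identifiers below are illustrative assumptions rather than the authors' implementation.

```python
import math
from typing import Optional, Sequence, Set


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant paper, or 0.0 if none is retrieved."""
    for position, paper in enumerate(ranked, start=1):
        if paper in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision@k taken at each rank k where a relevant paper appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, paper in enumerate(ranked, start=1):
        if paper in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def ndcg(ranked: Sequence[str], relevant: Set[str], k: Optional[int] = None) -> float:
    """Binary-gain NDCG: DCG of the ranking divided by the DCG of an ideal ranking."""
    cutoff = len(ranked) if k is None else min(k, len(ranked))
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, paper in enumerate(ranked[:cutoff], start=1)
        if paper in relevant
    )
    ideal_hits = min(len(relevant), cutoff)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


def mean(values: Sequence[float]) -> float:
    """Average a per-sentence metric over all query sentences (gives MRR, MAP, mean NDCG)."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical example: one citing sentence, four recommended papers,
# two of which ("p1" and "p2") are the correct citations.
ranked_papers = ["p3", "p1", "p7", "p2"]
correct_papers = {"p1", "p2"}
rr = reciprocal_rank(ranked_papers, correct_papers)
ap = average_precision(ranked_papers, correct_papers)
gain = ndcg(ranked_papers, correct_papers)
```

Averaging `reciprocal_rank` and `average_precision` over all query sentences with `mean` yields MRR and MAP; the percentages reported in the abstract are relative improvements in these averaged scores over the baselines.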



Acknowledgements

This research is based on work supported by the Taiwan Ministry of Science and Technology under Grant Nos. MOST 107-2410-H-006-040-MY3 and MOST 108-2511-H-006-009. It was also partially supported by the “Higher Education SPROUT Project” and the “Center for Innovative FinTech Business Models” of National Cheng Kung University (NCKU), sponsored by the Ministry of Education, Taiwan.

Author information

Authors and Affiliations

Institute of Information Management, National Cheng Kung University, T’ai-nan, 701, Taiwan
Hei-Chia Wang, Jen-Wei Cheng & Che-Tsung Yang
Center for Innovative FinTech Business Models, National Cheng Kung University, T’ai-nan, 701, Taiwan
Hei-Chia Wang & Che-Tsung Yang

Corresponding author

Correspondence to Hei-Chia Wang.

About this article

Cite this article

Wang, HC., Cheng, JW. & Yang, CT. SentCite: a sentence-level citation recommender based on the salient similarity among multiple segments. Scientometrics 127, 2521–2546 (2022). https://doi.org/10.1007/s11192-022-04339-0

Received: 01 July 2021
Accepted: 01 March 2022
Published: 19 March 2022
Issue Date: May 2022
DOI: https://doi.org/10.1007/s11192-022-04339-0
