A scientific citation recommendation model integrating network and text representations

Scientometrics

Abstract

The number of scientific papers is growing rapidly, so efficient paper acquisition and effective citation recommendation are essential for researchers. Although scientific citation recommendation has improved greatly, the in-depth mining and fusion of different types of information has been neglected. In this paper, we propose a scientific citation recommendation model integrating network and text representations (SCR-NTR), which comprises four stages: data acquisition, feature representation, feature fusion, and link prediction. We compare network representation models and text representation models separately, and select the best performer of each kind in a pre-experiment as the sub-models of SCR-NTR. The two kinds of information are fused by vector concatenation, and a logistic regression classifier carries out the link prediction. Extensive experiments show that our model effectively improves citation recommendation performance. In addition, we investigate the effects of different fusion methods and different classifiers, and conduct a qualitative analysis to further verify the effectiveness of SCR-NTR. The experimental results show that leveraging both network and text representations enhances recommendation performance, and that heterogeneous network representation learning captures richer semantic information from a given network than homogeneous network representation learning.
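The fusion and link prediction stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four papers, their two-dimensional network and text embeddings, the element-wise product used to turn two paper vectors into one edge feature vector, and the hand-written gradient descent are all assumptions made for illustration. Only the concatenation of the two representations and the use of logistic regression for link prediction come from the abstract.

```python
import math

def fuse(network_vec, text_vec):
    """Vector concatenation fusion: network part followed by text part."""
    return list(network_vec) + list(text_vec)

def pair_features(citing, cited):
    """Edge features for a candidate citation.
    Assumption: element-wise product of the two fused paper vectors."""
    return [a * b for a, b in zip(citing, cited)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability that the candidate citation link exists."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_logreg(X, y, lr=0.5, epochs=500):
    """Logistic regression fitted with plain batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Hypothetical toy embeddings: p1/p2 share one topic, p3/p4 another.
network_emb = {"p1": [1.0, 0.0], "p2": [0.9, 0.1],
               "p3": [0.0, 1.0], "p4": [0.1, 0.9]}
text_emb = {"p1": [1.0, 0.0], "p2": [0.8, 0.2],
            "p3": [0.0, 1.0], "p4": [0.2, 0.8]}
fused = {p: fuse(network_emb[p], text_emb[p]) for p in network_emb}

# Labeled candidate links: 1 = citation exists, 0 = no citation.
pairs = [("p1", "p2"), ("p3", "p4"), ("p1", "p3"), ("p2", "p4")]
labels = [1, 1, 0, 0]
X = [pair_features(fused[a], fused[b]) for a, b in pairs]
w, b = train_logreg(X, labels)

# Score candidate links with the trained classifier.
score_link = predict_proba(w, b, pair_features(fused["p2"], fused["p1"]))
score_nolink = predict_proba(w, b, pair_features(fused["p4"], fused["p1"]))
print(f"same-topic pair: {score_link:.3f}, cross-topic pair: {score_nolink:.3f}")
```

In this toy setting the same-topic pair scores above 0.5 and the cross-topic pair below it. In practice the embeddings would come from trained network and text representation models, and ranking candidate papers by the predicted probability would yield the recommendation list.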

Acknowledgements

This research was supported by the National Natural Science Foundation of China (Grant Nos. 71974202, 71921002, 71790612, and 72174153) and the project of the Ministry of Education of China (Grant No. 19YJC870029).

Author information

Corresponding author

Correspondence to Chuanming Yu.

About this article

Cite this article

Qiu, T., Yu, C., Zhong, Y. et al. A scientific citation recommendation model integrating network and text representations. Scientometrics 126, 9199–9221 (2021). https://doi.org/10.1007/s11192-021-04161-0

  • DOI: https://doi.org/10.1007/s11192-021-04161-0