A novel hybrid paper recommendation system using deep learning

Scientometrics

Abstract

Every year, researchers in many fields publish thousands of papers in journals and conferences. These papers are an important guide for other researchers. However, the growing volume of digital data that has accompanied the development of information technologies makes it difficult to find the desired information. Recommendation systems play an important role in facilitating researchers' access to studies on their subjects: they provide faster and easier access to papers on the desired subject. Recommendation systems are developed according to the user profile or the subject. In this paper, a novel hybrid paper recommendation system based on deep learning is proposed. The method combines document similarity, hierarchical clustering, and keyword extraction. Our aim is to group papers, whether from different fields such as computer science, economics, and medicine or from within a single field, according to their subjects, and to present the user with papers that have high semantic similarity to the query entered. The study has been applied to a real dataset containing computer science papers from categories such as machine learning, artificial intelligence, and human–computer interaction. The success of each stage of the study has been evaluated separately; taken as a whole, the overall performance of the proposed approach is 80%. Papers with high similarity to the query are recommended to users. Thus, access to studies on the desired subject among a huge number of papers is made faster and easier.
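The abstract names three ingredients (document similarity, hierarchical clustering, and keyword extraction) but does not specify the models behind them. The sketch below is only an illustration of how such a pipeline might fit together and is not the authors' implementation. As assumptions, it substitutes TF-IDF vectors for the deep-learning document embeddings, average-linkage agglomerative clustering for the hierarchical clustering step, and top-weighted TF-IDF terms for keyword extraction. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return sparse TF-IDF vectors (dict: term -> weight) and the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (cnt / len(toks)) * idf[t] for t, cnt in tf.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def agglomerative(vectors, num_clusters):
    """Average-linkage agglomerative clustering; returns lists of document indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(cosine(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > num_clusters:
        # Merge the pair of clusters with the highest average similarity.
        _, i, j = max(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] += clusters.pop(j)
    return clusters


def top_keywords(vector, k=3):
    """Pick the k highest-weighted terms as stand-in keywords."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def recommend(query, docs, num_clusters=2, top_n=2):
    """Find the cluster closest to the query, then rank its papers by similarity."""
    vectors, idf = tfidf_vectors(docs)
    clusters = agglomerative(vectors, num_clusters)
    q_tf = Counter(tokenize(query))
    total = sum(q_tf.values())
    # Terms unseen in the corpus are ignored because they have no IDF weight.
    q_vec = {t: (c / total) * idf[t] for t, c in q_tf.items() if t in idf}
    best = max(
        clusters,
        key=lambda c: sum(cosine(q_vec, vectors[i]) for i in c) / len(c),
    )
    ranked = sorted(best, key=lambda i: -cosine(q_vec, vectors[i]))
    return [(i, top_keywords(vectors[i])) for i in ranked[:top_n]]
```

Calling `recommend("deep neural network", docs)` on a small list of abstracts returns the indices of the most similar papers within the best-matching cluster, each paired with its stand-in keywords. Restricting the ranking to one cluster is a design choice of this sketch: it keeps the similarity search small, at the cost of missing relevant papers that were grouped elsewhere.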

Acknowledgements

This work was supported by the Scientific Research Projects Coordination Unit of Fırat University under Grant No. MF.20.09.

Author information

Corresponding author

Correspondence to Mehmet Kaya.

About this article

Cite this article

Gündoğan, E., Kaya, M. A novel hybrid paper recommendation system using deep learning. Scientometrics 127, 3837–3855 (2022). https://doi.org/10.1007/s11192-022-04420-8

  • DOI: https://doi.org/10.1007/s11192-022-04420-8
