Application of k-means clustering algorithm to improve effectiveness of the results recommended by journal recommender system

Scientometrics

Abstract

This study evaluates the feasibility of using the k-means clustering algorithm to improve the effectiveness of the results recommended by the RICEST Journal Finder System. More than 15,000 papers published in engineering journals during 2013–2017 were collected from the journals' websites. Their titles, abstracts and keywords were extracted, normalized and processed to form the test corpus. Based on the number of papers collected, Cochran's formula was used to determine a sample of 400 papers, each completely relevant to the subject of its journal, which were selected randomly and proportionally. These papers were entered into the system as queries, and the journals recommended before and after applying k-means clustering were recorded. The effectiveness of the system's results at each stage was then determined by leave-one-out cross-validation, based on precision at the top K ranked results. In addition, the opinions of subject reviewers on the relevance of the target journal were collected through a questionnaire. Before clustering, the target journal was recommended within the first 3 ranks in only 40% of searches; after k-means clustering, it was retrieved within the first 3 ranks in more than 80% of searches. According to 210 subject reviewers, more than 80% of the journals recommended after k-means clustering were completely relevant to the given paper. These results indicate that data clustering can significantly increase the effectiveness of the results recommended by journal recommender systems.
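The top-3 figures reported above can be illustrated with a minimal sketch. It assumes one plausible reading of the evaluation: each query paper has exactly one target journal (the journal it was published in), and a search counts as a success when that journal appears among the first K recommendations. The function name `hit_rate_at_k`, the paper and journal identifiers, and the toy rankings are all hypothetical; they are not taken from the study or from the RICEST system.

```python
from typing import Dict, List


def hit_rate_at_k(
    rankings: Dict[str, List[str]],
    targets: Dict[str, str],
    k: int = 3,
) -> float:
    """Share of queries whose target journal is among the top-k recommendations."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not rankings:
        raise ValueError("rankings must not be empty")
    # A query is a hit when its target journal occurs in the first k ranks.
    hits = sum(
        1 for query, ranked in rankings.items() if targets[query] in ranked[:k]
    )
    return hits / len(rankings)


# Hypothetical toy data: four query papers, each with a ranked journal list.
toy_rankings = {
    "paper_a": ["J1", "J2", "J3", "J4"],
    "paper_b": ["J4", "J1", "J2", "J3"],
    "paper_c": ["J2", "J3", "J4", "J1"],
    "paper_d": ["J3", "J4", "J1", "J2"],
}
# Hypothetical target journal (journal of publication) for each query paper.
toy_targets = {"paper_a": "J1", "paper_b": "J3", "paper_c": "J4", "paper_d": "J3"}

print(hit_rate_at_k(toy_rankings, toy_targets, k=3))  # 0.75
print(hit_rate_at_k(toy_rankings, toy_targets, k=1))  # 0.5
```

Under this reading, running the same function on the rankings produced before and after clustering would give the two percentages being compared. The study's exact scoring procedure may differ from this sketch.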

Notes

  1. RICEST: Regional Information Center for Science and Technology.

Corresponding author

Correspondence to Mahdieh Mirzabeigi.

About this article

Cite this article

Vara, N., Mirzabeigi, M., Sotudeh, H. et al. Application of k-means clustering algorithm to improve effectiveness of the results recommended by journal recommender system. Scientometrics 127, 3237–3252 (2022). https://doi.org/10.1007/s11192-022-04397-4

  • DOI: https://doi.org/10.1007/s11192-022-04397-4
