Identification and causal analysis of predatory open access journals based on interpretable machine learning

Wu, Jinhong; Liu, Tianye; Mu, Keliang; Zhou, Lei

doi:10.1007/s11192-024-04969-6

Published: 11 March 2024

Scientometrics

Jinhong Wu¹, Tianye Liu¹, Keliang Mu¹ (ORCID: orcid.org/0000-0002-6264-9732) & Lei Zhou¹

Abstract

Predatory journals are a recent phenomenon that has drawn attention from the academic community over the last decade. As the open access (OA) movement has gained momentum, the indiscriminate growth of predatory journals has had significant negative impacts on academic communication, scholarly publishing, and the effective use of scientific resources. This growth poses a serious threat to the healthy development of the OA movement and undermines the integrity of research and the research ecosystem. Identifying predatory journals among the massive number of OA journals would help scholars avoid negative consequences for their finances, reputation, academic influence, and career advancement. Traditional methods for identifying predatory journals rely heavily on the knowledge of domain experts. However, many predatory journals have latent and covert characteristics, and the number of OA journals is growing extremely rapidly, which makes it difficult for experts to pick out predatory journals from the vast pool of OA journals. This paper proposes an interpretable machine learning model for early warning of predatory OA journals, which identifies predatory journals through an ensemble of multiple machine learning algorithms. Specifically, the proposed methodology first constructs an early warning indicator system for OA journals and integrates multiple machine learning algorithms to compute early warning values for OA journals. Then, in a novel step, the SHAP interpretability framework is introduced to analyze the causal factors behind the early warning risks. To verify the accuracy of the causal factors identified by the model, we conduct a comparative case-study analysis of domestic and foreign medical OA journals. The empirical analysis demonstrates the efficacy of the ensemble algorithm in accurately identifying the risk of predatory OA journals.
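The two-stage idea in the abstract (an ensemble that produces an early warning value, followed by Shapley-based attribution of that value to indicators) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three indicator names, the toy base models, the simple averaging rule, and the baseline journal are hypothetical and are not the paper's indicator system, algorithms, or data. The attribution step computes exact Shapley values by enumerating feature coalitions, which is the game-theoretic quantity that SHAP approximates, rather than calling the SHAP library itself.

```python
from itertools import combinations
from math import factorial

# Hypothetical indicators scaled to [0, 1]; NOT the paper's indicator system.
FEATURES = ["self_citation_rate", "apc_level", "acceptance_speed"]


def rule_model(x):
    """Toy base learner: flags journals with a high self-citation rate."""
    return 1.0 if x["self_citation_rate"] > 0.5 else 0.0


def linear_model(x):
    """Toy base learner: weighted sum of the indicators."""
    return (0.5 * x["self_citation_rate"]
            + 0.3 * x["apc_level"]
            + 0.2 * x["acceptance_speed"])


def interaction_model(x):
    """Toy base learner: high fees combined with very fast acceptance."""
    return x["apc_level"] * x["acceptance_speed"]


BASE_MODELS = [rule_model, linear_model, interaction_model]


def ensemble_score(x):
    """Early warning value as the unweighted mean of the base learners (assumed rule)."""
    return sum(model(x) for model in BASE_MODELS) / len(BASE_MODELS)


def shapley_values(predict, x, baseline):
    """Exact Shapley values; features outside a coalition take the baseline value."""
    n = len(FEATURES)
    phi = {}
    for feature in FEATURES:
        others = [g for g in FEATURES if g != feature]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude `feature`.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = {g: x[g] if (g in subset or g == feature) else baseline[g]
                          for g in FEATURES}
                without_f = {g: x[g] if g in subset else baseline[g]
                             for g in FEATURES}
                total += weight * (predict(with_f) - predict(without_f))
        phi[feature] = total
    return phi


# Hypothetical journal under review and a hypothetical "typical journal" baseline.
journal = {"self_citation_rate": 0.8, "apc_level": 0.6, "acceptance_speed": 0.9}
baseline = {"self_citation_rate": 0.2, "apc_level": 0.3, "acceptance_speed": 0.4}

score = ensemble_score(journal)
base_score = ensemble_score(baseline)
phi = shapley_values(ensemble_score, journal, baseline)

for name, value in sorted(phi.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:+.3f}")
print(f"early warning value: {score:.3f} (baseline {base_score:.3f})")
```

The attributions sum to the difference between the journal's early warning value and the baseline value, so each indicator's share of the risk score can be read off directly. Exact enumeration is only practical for a handful of indicators; with a larger indicator system, an approximation such as the one the SHAP framework provides would be needed.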


Acknowledgements

This work was supported by the China Scholarship Council.

Funding

Funding was provided by the 2020 Hubei Provincial Social Science Foundation Pre-Funded Projects (Grant No. 20ZD053) and the Social Science Foundation of Shaanxi Province (Grant No. 19CTQ030).

Author information

Authors and Affiliations

Wuhan Textile University, Wuhan, China
Jinhong Wu, Tianye Liu, Keliang Mu & Lei Zhou

Contributions

MKL: collected the data, contributed data or analysis tools, performed the analysis, and wrote the manuscript. WJH: commented on the overall framework of the paper, provided article revisions, and offered ideas. LTY: collected experimental data, redid experiments, and wrote revisions. ZL: conceived and designed the analysis, wrote the manuscript, designed the figures, and made other contributions.

Corresponding author

Correspondence to Keliang Mu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, J., Liu, T., Mu, K. et al. Identification and causal analysis of predatory open access journals based on interpretable machine learning. Scientometrics (2024). https://doi.org/10.1007/s11192-024-04969-6

Received: 10 April 2023
Accepted: 13 February 2024
Published: 11 March 2024
DOI: https://doi.org/10.1007/s11192-024-04969-6
