Abstract
Previous studies on scientific literature have rarely considered discrimination, i.e., the extent to which the content of one piece of research differs from that of others. This paper contributes to the quantitative study of article-content discrimination by proposing the article discriminative capacity (ADC). Academic articles indexed in the Chinese Social Sciences Citation Index (CSSCI) in the discipline of Library and Information Science (LIS) are used as research objects. First, the most suitable text representation model is chosen to better represent the content of articles, thereby improving the performance of the ADC. Then, in-depth quantitative analyses and evaluations of the articles are conducted with the ADC from the perspectives of source journals, publication years, authors, themes, and disciplines. The results demonstrate that combining the ADC with the BERT model better identifies individual articles with high discriminative capacity. Articles in the fields of Information Science and Cross-LIS are found to have relatively low average ADC values, whereas articles in the fields of Library Science and Archives Science have high average ADC values. Articles with high ADC values have diverse themes and distinctive keywords, and can reveal new methods and promote interdisciplinarity. In contrast, articles with low ADC values have similar research themes and favor traditional, commentary-style, and conventional research. Moreover, scholars with high discriminative capacity are more willing to explore new fields instead of being confined to traditional LIS research. This work may help promote the diversity of academic research and complement the evaluation system for academic articles. One major limitation of this study is that it uses data only from Chinese databases.
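The abstract does not state the ADC formula, so the following is only a rough illustration of the general idea of content discrimination, not the authors' method. It assumes each article is already represented as a numeric vector (for example, an embedding from a model such as BERT) and scores an article by one minus its mean cosine similarity to all other articles. The function names `cosine` and `discriminative_scores` and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def discriminative_scores(vectors):
    """Illustrative score per article: 1 - mean cosine similarity to every other article.

    This is an assumed stand-in for a discrimination measure, not the ADC
    as defined in the paper.
    """
    if len(vectors) < 2:
        raise ValueError("need at least two articles to compare")
    scores = []
    for i, u in enumerate(vectors):
        sims = [cosine(u, v) for j, v in enumerate(vectors) if j != i]
        scores.append(1.0 - sum(sims) / len(sims))
    return scores


# Toy article vectors (hypothetical): the first two are near-duplicates,
# the third points in a different direction.
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
scores = discriminative_scores(toy_vectors)
```

Under this assumed measure, the third toy article receives the highest score because it shares no direction with the other two, while the two near-duplicates score lower than it.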
Acknowledgements
This study was supported by the Jiangsu Province "Six Talent Peaks" high-level talent project "Measurement and Analysis of Multi-granularity Academic Object Discriminative" (JY-001) and by the Jiangsu Province Graduate Student Practice Innovation Program "The Evaluation Method of Academic Articles Based on Article Discriminative Capacity" (SJCX21_0020).
Funding
Six Talent Peaks Project in Jiangsu Province, JY-001, Hao Wang; Graduate Research and Innovation Projects of Jiangsu Province, SJCX21_0020, Yuetong Chen.
Contributions
YC: conceived and designed the analysis; collected the data; performed the analysis; wrote the paper. HW: conceived and designed the analysis; performed the analysis. BZ: conceived and designed the analysis. WZ: contributed data or analysis tools; revised the paper.
About this article
Cite this article
Chen, Y., Wang, H., Zhang, B. et al. A method of measuring the article discriminative capacity and its distribution. Scientometrics 127, 3317–3341 (2022). https://doi.org/10.1007/s11192-022-04371-0
DOI: https://doi.org/10.1007/s11192-022-04371-0