Abstract
Views and downloads of academic articles have become important supplementary indicators of scholarly impact, and linguistic characteristics are assumed to influence them to some extent. To understand the relationship between linguistic characteristics and article views and downloads, this study selected 63,002 full-text articles published from 2014 to 2015 in seven PLoS (Public Library of Science) journals (PLoS Biology, PLoS Computational Biology, PLoS Genetics, PLoS Medicine, PLoS Neglected Tropical Diseases, PLoS One and PLoS Pathogens). Seven indicators (title length, abstract length, full text length, sentence length, lexical diversity, lexical density and lexical sophistication) were introduced to measure the linguistic characteristics of the articles, which were grouped into three categories: the Top 20% viewed and downloaded (a proxy for highly browsed and downloaded articles), the total set, and the Bottom 20% viewed and downloaded. The results suggest that, in general, most linguistic characteristics played little role in article views and downloads in our data sets. However, some linguistic characteristics (e.g. title length and average sentence length) did play a role in specific PLoS journals and on specific platforms (the PLoS platform or the PubMed Central platform). Journal and platform differences in the linguistic characteristics of highly viewed and downloaded articles also existed.
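As a rough illustration of how indicators of this kind and the Top/Bottom 20% grouping might be computed, the following minimal Python sketch can be considered. It is not the authors' implementation: the abstract does not give the formulas, so the definitions used here are assumptions (lexical diversity as a type-token ratio, lexical density as the share of tokens outside a small hypothetical function-word list), and the names `linguistic_indicators` and `top_bottom_groups` are invented for illustration. Lexical sophistication is omitted because it would require a reference word list that is not specified here.

```python
import re

# Assumption: a tiny illustrative function-word list, not the list used in the study.
FUNCTION_WORDS = {
    "the", "a", "an", "of", "in", "on", "and", "or", "to", "is", "are",
    "was", "were", "that", "this", "with", "for", "by", "as", "at", "it",
}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def linguistic_indicators(title, body):
    """Compute a few simple indicators for one article (assumed definitions)."""
    tokens = tokenize(body)
    sentences = split_sentences(body)
    n = len(tokens)
    content_tokens = sum(1 for t in tokens if t not in FUNCTION_WORDS)
    return {
        "title_length": len(tokenize(title)),           # words in the title
        "text_length": n,                               # words in the body text
        "avg_sentence_length": n / len(sentences) if sentences else 0.0,
        "lexical_diversity": len(set(tokens)) / n if n else 0.0,  # type-token ratio
        "lexical_density": content_tokens / n if n else 0.0,      # content-word share
    }


def top_bottom_groups(counts, share=0.2):
    """Return (top, bottom) article ids, each holding `share` of the ranked articles."""
    ranked = sorted(counts, key=counts.get, reverse=True)
    k = max(1, int(len(ranked) * share))
    return ranked[:k], ranked[-k:]
```

In a sketch like this, the indicators would be computed per article and then compared between the two groups returned by `top_bottom_groups`, separately for each journal and for each platform's view or download counts.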
Acknowledgements
This paper is supported by the Youth Program of the National Social Science Fund in China (15CTQ035), the Social Public Safety S&T Collaborative Innovation Center of Universities in Jiangsu Province, and the China Scholarship Council (ID: 201906845042).
Cite this article
Chen, B., Deng, D., Zhong, Z. et al. Exploring linguistic characteristics of highly browsed and downloaded academic articles. Scientometrics 122, 1769–1790 (2020). https://doi.org/10.1007/s11192-020-03361-4
DOI: https://doi.org/10.1007/s11192-020-03361-4