Abstract
In scientometric trend analysis, past studies have often chosen the parameters for observing trends ad hoc. For example, different year spans have been used to build the time series, and different indices have been chosen to observe trends. However, the effectiveness of these choices has rarely been assessed, either quantitatively or comparatively. This work provides clues for interpreting results obtained under a given choice. Specifically, we sort research topics in decreasing order of interest as predicted by a trend index and evaluate this ordering with information retrieval measures. In this way we compare trend indices (percentage of increase vs. regression slope), trend formulations (simple trend vs. eigen-trend), and options (various year spans and prediction durations) across domains (safety agriculture and information retrieval) with collections of different scale (72500 papers vs. 853 papers) to determine which leads to better trend observation. Our results show that the slope of a linear regression on the time series performs consistently better than the others. More interestingly, this index is robust under different conditions and is hardly affected even when the collection is split into arbitrary periods (e.g., only two). Implications of these results are discussed. Our work not only provides a method for evaluating trend prediction performance in scientometrics, but also offers insights and reflections for past and future trend observation studies.
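The evaluation idea described in the abstract (rank topics by a trend index, then score the ranking with an information retrieval measure) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the paper's exact formulas, so `percentage_increase` uses a simple first-to-last relative change, the IR measure is plain average precision, and the topic names, counts, and the set of "hot" topics are made-up toy values that say nothing about the paper's findings.

```python
from typing import Callable, Dict, List, Sequence, Set


def regression_slope(counts: Sequence[float]) -> float:
    """Least-squares slope of counts regressed on time steps 0..n-1."""
    n = len(counts)
    if n < 2:
        return 0.0
    mean_t = (n - 1) / 2.0
    mean_y = sum(counts) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(counts))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den


def percentage_increase(counts: Sequence[float]) -> float:
    """Relative change from the first to the last period (assumed definition)."""
    first, last = counts[0], counts[-1]
    if first == 0:
        return float("inf") if last > 0 else 0.0
    return (last - first) / first


def rank_topics(series: Dict[str, Sequence[float]],
                index: Callable[[Sequence[float]], float]) -> List[str]:
    """Sort topic names in decreasing order of the given trend index."""
    return sorted(series, key=lambda topic: index(series[topic]), reverse=True)


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of a ranking against a set of truly hot topics."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, topic in enumerate(ranking, start=1):
        if topic in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Hypothetical toy data: paper counts per period for four made-up topics.
toy_series = {
    "topic_a": [1, 2, 4, 7],
    "topic_b": [10, 10, 9, 8],
    "topic_c": [1, 1, 1, 3],
    "topic_d": [20, 24, 29, 35],
}
# Hypothetical ground truth: topics assumed to be hot in the following period.
toy_hot = {"topic_a", "topic_d"}

for name, index in [("slope", regression_slope),
                    ("percentage", percentage_increase)]:
    ranking = rank_topics(toy_series, index)
    print(name, ranking, average_precision(ranking, toy_hot))
```

In a setup like this, changing the year span or the number of periods only changes how the count sequences are built; the ranking and scoring steps stay the same, which is what makes the different choices comparable.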
Cite this article
Tseng, YH., Lin, YI., Lee, YY. et al. A comparison of methods for detecting hot topics. Scientometrics 81, 73–90 (2009). https://doi.org/10.1007/s11192-009-1885-x
DOI: https://doi.org/10.1007/s11192-009-1885-x