Abstract
Bibliometric measurements based on knowledge entities (i.e., keywords) improve the ability to track the structure and dynamic development of scientific domains. Co-word networks (a content analysis technique and a type of knowledge network) are often used to identify relationships among scientific concepts in scholarly publications and thereby reveal how scientific knowledge develops and evolves. In evolutionary network analysis, link prediction methods from network science can help predict missing links and model network dynamics. These traditional methods, which rely on topological similarity scores and on time series approaches to link prediction, can be used to predict future co-occurrence trends among scientific concepts. This study built supervised learning models for link prediction in co-word networks using network topological similarity metrics and their temporal evolutionary information. In addition to exploring the underlying mechanism of temporal co-word network evolution, the study built classification datasets containing links with both positive and negative labels. A set of topological metrics and their temporal evolutionary information was produced to describe the instances in these datasets. Supervised classification methods were then applied to classify the links and predict future associations among keywords. Time-series-based forecasting methods were used to predict the future values of the topological metrics. The results of supervised link prediction with different classifiers showed that both static and dynamic information are valuable in predicting new links between concepts extracted from scientific literature.
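The pipeline summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy keyword data, the function names, the choice of four similarity scores, and the use of a simple moving average as a stand-in for the time series forecasting step are all assumptions made for illustration. The sketch builds cumulative yearly co-word networks, scores each unlinked keyword pair, adds a forecast of each score, and labels the pair by whether the link appears in the following year.

```python
from itertools import combinations
from math import log

# Hypothetical toy data: author keywords of papers, grouped by publication year.
papers_by_year = {
    2010: [["obesity", "diet"], ["diet", "exercise"], ["obesity", "genetics"]],
    2011: [["obesity", "diet", "exercise"], ["genetics", "diet"], ["policy", "obesity"]],
    2012: [["obesity", "exercise"], ["genetics", "exercise", "diet"]],
}


def build_cumulative_networks(papers_by_year):
    """Return {year: adjacency dict}; each network holds all co-occurrences up to that year."""
    adj, snapshots = {}, {}
    for year in sorted(papers_by_year):
        for keywords in papers_by_year[year]:
            for a, b in combinations(sorted(set(keywords)), 2):
                adj.setdefault(a, set()).add(b)
                adj.setdefault(b, set()).add(a)
        snapshots[year] = {k: set(v) for k, v in adj.items()}  # copy the state
    return snapshots


def similarity_scores(adj, u, v):
    """Neighbourhood-based topological similarity scores for the pair (u, v)."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    common, union = nu & nv, nu | nv
    return {
        "common_neighbours": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        "adamic_adar": sum(1.0 / log(len(adj[z])) for z in common if len(adj[z]) > 1),
        "preferential_attachment": len(nu) * len(nv),
    }


def moving_average_forecast(values, window=2):
    """Assumed stand-in for a time series model: mean of the last `window` values."""
    tail = values[-window:]
    return sum(tail) / len(tail)


def build_dataset(snapshots, train_years, test_year):
    """Describe each unlinked pair by static and forecast scores; label it by the test year."""
    last, future = snapshots[train_years[-1]], snapshots[test_year]
    rows = []
    for u, v in combinations(sorted(last), 2):
        if v in last[u]:
            continue  # already linked, so not a candidate for a new link
        series = [similarity_scores(snapshots[y], u, v) for y in train_years]
        features = dict(series[-1])  # static information: latest training network
        for name in series[-1]:  # dynamic information: forecast of each score
            features[name + "_forecast"] = moving_average_forecast([s[name] for s in series])
        label = 1 if v in future.get(u, set()) else 0
        rows.append(((u, v), features, label))
    return rows


snapshots = build_cumulative_networks(papers_by_year)
rows = build_dataset(snapshots, train_years=[2010, 2011], test_year=2012)
for pair, features, label in rows:
    print(pair, label, features)
```

The resulting rows, one feature dictionary and one binary label per candidate pair, are in the form that a supervised classifier could be trained on; a real study would use far larger networks and a proper forecasting model in place of the moving average.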
Cite this article
Choudhury, N., Uddin, S. Time-aware link prediction to explore network effects on temporal knowledge evolution. Scientometrics 108, 745–776 (2016). https://doi.org/10.1007/s11192-016-2003-5
DOI: https://doi.org/10.1007/s11192-016-2003-5