Abstract
This study investigates the transfer of knowledge from university research to industry, moving beyond traditional indicators by using methods from computational linguistics. We introduce a novel empirical use of pattern recognition and text mining tools to compare scientific publications with company documents. The contribution of the paper is twofold: first, we suggest a new method for tracing knowledge transfer and, second, we add a further perspective to the understanding of university–industry knowledge transfer. We find that common text mining tools are suitable for identifying concrete pieces of research knowledge within collaborating companies. The method establishes direct links between published university research and the information that companies disclose on their websites and in their documents. We thereby extend commonly used approaches to assessing university research, which rely either on qualitative case studies or on commercial indicators. Our empirical evidence shows that knowledge exchange can be detected with this approach and that, given some additions in the selection and adaptation of tools, it has the potential to become a supplementary method for the research community.
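To make the idea of comparing publications with company documents concrete, the following is a minimal sketch rather than the study's actual pipeline. It assumes a Jaccard coefficient over token sets as the similarity measure; the stop-word list, the function names, and the three sample texts are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "for", "to", "on",
              "with", "is", "are", "we", "our"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return the set of its non-stop-word tokens."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def jaccard_similarity(doc_a: str, doc_b: str) -> float:
    """Return |A ∩ B| / |A ∪ B| for the token sets of two documents."""
    tokens_a, tokens_b = tokenize(doc_a), tokenize(doc_b)
    union = tokens_a | tokens_b
    if not union:
        # Two empty documents share nothing measurable; define similarity as 0.
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


# Made-up sample texts: one publication title and two company pages.
publication = "Dye-sensitized solar cells with improved electrode coating"
company_page = "Our new electrode coating improves solar cells for rooftop panels"
unrelated_page = "Quarterly financial report and shareholder meeting schedule"

print(jaccard_similarity(publication, company_page))
print(jaccard_similarity(publication, unrelated_page))
```

In this toy setting the company page that reuses the publication's terminology scores higher than the unrelated page. A realistic analysis would need far more careful preprocessing and a validated similarity threshold before any overlap is read as knowledge transfer.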
Acknowledgements
We thank the people who performed the human validation of our results, and we are grateful for the helpful comments we received at several conferences.
Cite this article
Woltmann, S.L., Alkærsig, L. Tracing university–industry knowledge transfer through a text mining approach. Scientometrics 117, 449–472 (2018). https://doi.org/10.1007/s11192-018-2849-9
DOI: https://doi.org/10.1007/s11192-018-2849-9