Volume 82, Issue 2, pp 217–241

Discovery of factors influencing patent value based on machine learning in patents in the field of nanotechnology

  • Scott D. Bass
  • Lukasz A. Kurgan (corresponding author)


Abstract

Patents represent technological or inventive activity and output across different fields, regions, and time. Analysis of patent information could help focus efforts in research and the economy; however, the roles of the factors that can be extracted from patent records are still not entirely understood. To better understand the impact of these factors on patent value, machine learning techniques such as feature selection and classification are used to analyze patents in a sample industry, nanotechnology. Each nanotechnology patent was represented by a comprehensive set of numerical features that describe its inventors, assignees, patent classification, and outgoing references. After a careful design that included selecting the most relevant features, and selecting and optimizing classification models for accuracy in identifying the most valuable (top-performing) patents, we used the resulting models to analyze which factors differentiate the top-performing patents from the remaining nanotechnology patents. Several factors emerge as important, including the past performance of inventors and assignees and the count of referenced patents.
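The workflow summarized in the abstract (represent each patent as numerical features, select the most relevant features, then classify top-performing versus remaining patents) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the feature names are hypothetical stand-ins for the factors named in the abstract, and the mean-difference filter and nearest-centroid classifier are simple substitutes, not the feature selection or classification methods used in the paper.

```python
import random
import statistics

# Hypothetical feature names (assumptions, not the paper's feature set).
FEATURES = ["inventor_past_performance", "assignee_past_performance",
            "referenced_patent_count", "noise_feature"]

def make_synthetic_patents(n, seed=0):
    """Generate toy (features, label) rows; label 1 = top-performing."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.2 else 0
        shift = 2.0 * label  # first three features are shifted for top patents
        x = [rng.gauss(shift, 1.0), rng.gauss(shift, 1.0),
             rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)]
        rows.append((x, label))
    return rows

def rank_features(rows):
    """Filter-style selection: |class mean difference| / overall std dev."""
    scores = []
    for j in range(len(FEATURES)):
        top = [x[j] for x, y in rows if y == 1]
        rest = [x[j] for x, y in rows if y == 0]
        spread = statistics.pstdev(top + rest) or 1.0
        diff = abs(statistics.mean(top) - statistics.mean(rest))
        scores.append((diff / spread, j))
    return sorted(scores, reverse=True)

def fit_centroids(rows, selected):
    """Compute one mean vector per class over the selected features."""
    centroids = {}
    for label in (0, 1):
        members = [x for x, y in rows if y == label]
        centroids[label] = [statistics.mean(x[j] for x in members)
                            for j in selected]
    return centroids

def predict(centroids, x, selected):
    """Assign the class whose centroid is closest in squared distance."""
    def dist(center):
        return sum((x[j] - c) ** 2 for j, c in zip(selected, center))
    return min(centroids, key=lambda label: dist(centroids[label]))

train = make_synthetic_patents(400, seed=1)
test = make_synthetic_patents(200, seed=2)

ranking = rank_features(train)
selected = [j for _, j in ranking[:3]]  # keep the three best-scoring features
centroids = fit_centroids(train, selected)
accuracy = sum(predict(centroids, x, selected) == y for x, y in test) / len(test)

print("selected:", [FEATURES[j] for j in selected])
print("held-out accuracy:", round(accuracy, 3))
```

Inspecting `ranking` after the fit is the toy analogue of asking which factors separate top-performing patents from the rest; on real patent data the scores would depend on how the features and the top-performing label are defined.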


Keywords

Patent, Patent value, Nanotechnology, Machine learning, Classification, Feature selection



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2009

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
