Patent analysis and classification prediction of biomedicine industry: SOM-KPCA-SVM model

  • Bingchun Liu
  • Mingzhao Lai
  • Jheng-Long Wu
  • Chuanchuan Fu
  • Arihant Binaykia


This paper proposes a combined machine learning model for classifying and forecasting patent quality in the biomedical industry. The model, named SOM-KPCA-SVM, integrates three methods: Self-Organizing Map (SOM), Kernel Principal Component Analysis (KPCA), and Support Vector Machine (SVM). It is implemented in two steps. First, the SOM clusters the patent data and defines the patent quality levels. Second, KPCA reduces the dimensionality of the patent data to decrease noise, and an SVM is trained on the KPCA-transformed data to produce the classification results. The study collected 11,251 biopharmaceutical patent records from patent transaction news. After the patent quality model was trained, 2,196 historical patents were used to verify its performance. The match between the predicted results and the actual transaction status reached an accuracy of 84.13%. The proposed method can therefore serve as a preliminary screening solution that automatically and effectively evaluates patent quality. It saves reviewing experts valuable time, facilitates the rapid identification of high-quality patents, and can support commercialization and the mass customization of products.
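The two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins for patent indicator vectors, the SOM is a deliberately small one-dimensional map written in NumPy, and the function names (`train_som`, `som_labels`), the three-unit map size, and all hyperparameters are assumptions chosen for illustration. KPCA and the SVM come from scikit-learn.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def train_som(X, n_units=3, n_iter=500, lr0=0.5, sigma0=1.0, seed=0):
    """Train a tiny 1-D SOM; returns weights of shape (n_units, n_features)."""
    rng = np.random.default_rng(seed)
    # Initialise the units from randomly chosen samples.
    weights = X[rng.choice(len(X), size=n_units, replace=False)].astype(float)
    positions = np.arange(n_units)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                    # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 1e-2)   # shrinking neighbourhood
        x = X[rng.integers(len(X))]
        # Best-matching unit: the unit closest to the sampled vector.
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
        # Gaussian neighbourhood on the 1-D grid around the BMU.
        h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights


def som_labels(X, weights):
    """Assign each sample to its nearest SOM unit (used here as its level)."""
    d = np.linalg.norm(X[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(d, axis=1)


rng = np.random.default_rng(42)
# Synthetic stand-in for patent indicator vectors (NOT the paper's data).
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 6))
               for c in (0.0, 2.0, 4.0)])

# Step 1: the SOM groups the records; each unit index acts as a quality level.
X_std = StandardScaler().fit_transform(X)
weights = train_som(X_std)
levels = som_labels(X_std, weights)

# Step 2: KPCA reduces the features, then an SVM learns the SOM-defined levels.
idx = rng.permutation(len(X))
train, test = idx[:240], idx[240:]
model = make_pipeline(
    StandardScaler(),
    KernelPCA(n_components=3, kernel="rbf"),
    SVC(kernel="rbf"),
)
model.fit(X[train], levels[train])
accuracy = model.score(X[test], levels[test])
print(f"held-out accuracy: {accuracy:.3f}")
```

In this sketch the SOM supplies the class labels and the KPCA-SVM stage learns to reproduce them on held-out records, which mirrors the order of the two steps above. The accuracy printed here reflects only the synthetic data and says nothing about the 84.13% reported in the study.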


Keywords: Machine learning; Patent analysis; Patent quality; Patent quality classification



This work was supported by the National Natural Science Foundation of China [grant number 71503180]; the Major Project of Tianjin Education Committee and Social Science [grant number 2017JWZD16]; and the Tianjin Science and Technology Project [grant number 18ZLZDZF00040].



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Bingchun Liu (1), corresponding author
  • Mingzhao Lai (1)
  • Jheng-Long Wu (2)
  • Chuanchuan Fu (1)
  • Arihant Binaykia (3)

  1. School of Management, Tianjin University of Technology, Tianjin, People’s Republic of China
  2. Innovation of Information Science, Soochow University, Taipei, Taiwan
  3. Department of Industrial and Systems Engineering, Indian Institute of Technology, Kharagpur, India
