Identifying patterns in rare earth element patents based on text and data mining


Rare earth elements (REE) are needed to produce many cutting-edge products, and their depletion is a major concern. In this paper, we identify unique characteristics of REE-related patents granted from 1975 to 2013 in five large patent offices around the world. Through topic detection and clustering of patent text, we found that purification processes related to oxides, nitrogen oxide, and exhaust gas were highlighted in the Korean Intellectual Property Office and Japan Patent Office (JPO). Molecular sieve, dispersion, and preparation methods involving yttrium, cerium, methane, zirconium, and ammonia were prominent in the China Patent and Trademark Office (CPTO) in the areas of performing operation and transporting. Quadratic assignment procedure correlation analysis was performed for IPC co-occurrence among REE patents in different offices, and the United States Patent and Trademark Office showed significantly different patterns than the CPTO and JPO. Furthermore, using betweenness centrality as an indicator of technology transition, the manufacture and treatment of nanostructures, nanotechnology for materials and surface science, and electrodes were identified as important REE technologies to be protected in Korea. In Japan, the technological areas identified as important for protection were the apparatuses and processes of manufacturing or assembling devices, compounds of iron, and materials. Our study results offer insights into national strategies for REE-related technologies in each country.
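The quadratic assignment procedure (QAP) correlation mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation (the reference list points to UCINET for the network analysis); it only shows the general idea under stated assumptions: two patent offices are each represented by a symmetric IPC co-occurrence matrix over the same IPC codes, the observed statistic is the Pearson correlation of the off-diagonal cells, and significance is estimated by permuting the rows and columns of one matrix together. The function names, the toy matrices `office_a` and `office_b`, and the permutation count are all hypothetical.

```python
import random
from statistics import mean


def offdiag(m):
    """Flatten the off-diagonal cells of a square matrix in row-major order."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def qap_correlation(a, b, n_perm=2000, seed=0):
    """Return (observed correlation, permutation p-value) for matrices a and b.

    Rows and columns of b are relabelled with the same random permutation,
    which keeps b's network structure but breaks its alignment with a.
    """
    rng = random.Random(seed)
    n = len(a)
    cells_a = offdiag(a)
    observed = pearson(cells_a, offdiag(b))
    idx = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        permuted = [[b[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if abs(pearson(cells_a, offdiag(permuted))) >= abs(observed):
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical co-occurrence counts among four IPC codes in two offices.
office_a = [[0, 5, 1, 0], [5, 0, 2, 1], [1, 2, 0, 4], [0, 1, 4, 0]]
office_b = [[0, 4, 2, 0], [4, 0, 1, 1], [2, 1, 0, 5], [0, 1, 5, 0]]

r, p = qap_correlation(office_a, office_b)
print(f"QAP correlation: {r:.3f}, permutation p-value: {p:.3f}")
```

Permuting rows and columns jointly, rather than shuffling individual cells, respects the dependence among cells that share an IPC code, which is the reason an ordinary correlation test is not appropriate for network matrices. With only four codes there are just 24 possible relabellings, so the toy p-value is coarse; real IPC matrices would be much larger.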




This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korea government (MSIP) (2013R1A2A1A09004699).

Author information

Correspondence to So Young Sohn.


Appendix 1

See Tables 9, 10, 11, 12, 13 and 14.

Table 9 Descriptive terms of each cluster (IPC C)
Table 10 Distribution of clusters by patent office (IPC C)
Table 11 Descriptive terms of each cluster (IPC G)
Table 12 Distribution of clusters by patent office (IPC G)
Table 13 Descriptive terms of each cluster (IPC H)
Table 14 Distribution of clusters by patent office (IPC H)

Appendix 2

See Table 15.

Table 15 Definition of IPC in association rules (Source: WIPO)

About this article

Cite this article

Ju, Y., Sohn, S.Y. Identifying patterns in rare earth element patents based on text and data mining. Scientometrics 102, 389–410 (2015). https://doi.org/10.1007/s11192-014-1382-8

Keywords


  • Rare earth elements
  • Patent
  • Text mining
  • Topic detection
  • Quadratic assignment procedure (QAP)
  • Patent network analysis
  • Association rule

Mathematics Subject Classification

  • 62-07 (data analysis)
  • 62P30 (statistics, applications in engineering and industry)

JEL Classification

  • C14 (quantitative methods, semiparametric methods)
  • C49 (quantitative methods, other)
  • F00 (international economics, general)