Noise Removal Using TF-IDF Criterion for Extracting Patent Keyword
Governments and enterprises now analyze technology trends as part of their investment strategy and R&D planning. Such analyses rely mainly on qualitative methods carried out by experts, but these methods are costly and slow when applied to large amounts of data. In this study, we quantitatively analyze patent data using text mining with TF-IDF weights, and we use the same weights to separate keywords from noise terms. In addition, we propose new criteria for removing noise more effectively, and we visualize the keywords derived from the patent data using social network analysis (SNA).
Keywords: Keyword extraction, Patent analysis, Text mining, TF-IDF
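To make the idea of classifying keywords and noise by TF-IDF weight concrete, the following is a minimal sketch in Python. It is not the criterion proposed in the paper, which the abstract does not specify. It assumes the common definition tf × log(N/df), a single hand-picked cutoff (`threshold`), and toy tokenized documents; the function names `tf_idf` and `split_keywords_and_noise` are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized, non-empty document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


def split_keywords_and_noise(docs, threshold):
    """Treat a term as a keyword if its best TF-IDF weight in any
    document reaches the threshold; everything else is noise.
    The single-threshold rule is an assumption for illustration."""
    best = {}
    for doc_weights in tf_idf(docs):
        for term, weight in doc_weights.items():
            best[term] = max(best.get(term, 0.0), weight)
    keywords = {term for term, weight in best.items() if weight >= threshold}
    noise = set(best) - keywords
    return keywords, noise


# Toy "patent abstracts", already tokenized (hypothetical data).
docs = [
    ["method", "battery", "electrode", "method"],
    ["method", "antenna", "signal"],
    ["method", "battery", "separator"],
]
keywords, noise = split_keywords_and_noise(docs, threshold=0.1)
```

In this toy corpus, "method" occurs in every document, so its inverse document frequency is log(3/3) = 0 and it falls below any positive threshold. Terms concentrated in fewer documents, such as "electrode" or "antenna", receive higher weights and are kept as keywords.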