Noise Removal Using TF-IDF Criterion for Extracting Patent Keyword

  • Jongchan Kim
  • Dohan Choe
  • Gabjo Kim
  • Sangsung Park
  • Dongsik Jang
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 271)

Abstract

Governments and enterprises now analyze technology trends as part of their investment strategy and R&D planning. These analyses rely mainly on qualitative assessment by experts, which is costly and slow when applied to large amounts of data. In this study, we analyze patent data quantitatively using text mining with TF-IDF weights. The same TF-IDF weights are used to separate keywords from noise terms. In addition, we propose new criteria for removing noise more effectively, and we visualize the keywords extracted from the patent data using social network analysis (SNA).
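To make the TF-IDF weighting concrete, the following is a minimal sketch under stated assumptions, not the procedure used in the paper. It uses the common definition of TF-IDF (relative term frequency multiplied by the log of inverse document frequency) and treats terms below a fixed cutoff as noise. The function names `tf_idf` and `split_keywords_and_noise`, the `threshold` value, and the three toy "patent" texts are all hypothetical; the removal criteria proposed in the paper are not described in this abstract and are not reproduced here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes the common formulation: tf = count / document length,
    idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def split_keywords_and_noise(doc_weights, threshold):
    """Split terms into keywords and noise using a fixed cutoff.

    The cutoff is a hypothetical stand-in, not the paper's criterion.
    """
    keywords = {t for t, w in doc_weights.items() if w >= threshold}
    noise = set(doc_weights) - keywords
    return keywords, noise


# Toy documents standing in for patent text (made up for illustration).
docs = [
    "a battery cell with a lithium anode".split(),
    "a display panel with a touch sensor".split(),
    "a battery charger with a voltage sensor".split(),
]
weights = tf_idf(docs)
keywords, noise = split_keywords_and_noise(weights[0], threshold=0.1)
```

Terms that occur in every document, such as "a" and "with" here, receive a weight of zero because their inverse document frequency is log(1), so a cutoff on the weight filters them out without a hand-built stop-word list.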

Keywords

  • Keyword extraction
  • Patent analysis
  • Text mining
  • TF-IDF

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Jongchan Kim (1, 2)
  • Dohan Choe (1, 2)
  • Gabjo Kim (1, 2)
  • Sangsung Park (2)
  • Dongsik Jang (1)

  1. Division of Industrial Management Engineering, Korea University, Seoul, Korea
  2. Graduate School of Management of Technology, Korea University, Seoul, Korea
